Skip to content

Conversation

@cosmo0920
Copy link
Contributor

@cosmo0920 cosmo0920 commented Oct 24, 2025

Corresponding of fluent/fluent-bit#10691.

Summary by CodeRabbit

  • Documentation
    • Corrected S3 reference link and clarified compression/format options and availability
    • Added comprehensive Parquet guidance: build requirements, enablement steps, testing examples, and usage notes (including upload/encoding considerations)
    • Clarified gzip content-encoding guidance and provided example configurations for validation and testing

✏️ Tip: You can customize this high-level summary in your review settings.

@cosmo0920 cosmo0920 requested review from a team as code owners October 24, 2025 05:37
@cosmo0920 cosmo0920 force-pushed the cosmo0920-add-parquet-c-description-for-out_s3 branch 4 times, most recently from 6e4b47e to b659055 Compare October 24, 2025 05:46
@cosmo0920
Copy link
Contributor Author

cosmo0920 commented Oct 24, 2025

@esmerel Good morning.
Vale warns some of usages that include Parquet is not following sentence style of capitalization. But Parquet is a proper noun. How do we handle on such warnings?

@eschabell eschabell self-assigned this Oct 27, 2025
@eschabell eschabell requested a review from esmerel October 27, 2025 20:44
@eschabell eschabell added the waiting-on-review Waiting on a review from mainteners label Oct 27, 2025
@eschabell
Copy link
Collaborator

@esmerel review needed, see questions above.

esmerel added a commit that referenced this pull request Oct 27, 2025
@esmerel
Copy link
Contributor

esmerel commented Oct 27, 2025

@esmerel Good morning. Vale warns some of usages that include Parquet is not following sentence style of capitalization. But Parquet is a proper noun. How do we handle on such warnings?

We add it to the spelling exceptions file for Vale. I went ahead and did that in #2116 =)

And also the heading file, which I should have realized. #2117

Copy link
Contributor

@esmerel esmerel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one suggestion that looks accidental

@cosmo0920 cosmo0920 force-pushed the cosmo0920-add-parquet-c-description-for-out_s3 branch from 0b5a2a3 to 3b4b0d4 Compare October 28, 2025 03:59
TomlinfreeGit pushed a commit to TomlinfreeGit/fluent-bit-docs that referenced this pull request Oct 28, 2025
Signed-off-by: Lynette Miles <[email protected]>
Signed-off-by: Tom <[email protected]>
Copy link
Contributor

@patrick-stephens patrick-stephens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine other than adding the legacy format config as well.

@patrick-stephens patrick-stephens added waiting-for-user Waiting for user/contributors feedback or requested changes and removed waiting-on-review Waiting on a review from mainteners labels Nov 5, 2025
cosmo0920 and others added 2 commits November 21, 2025 19:10
- Apply suggestion from @esmerel

Co-authored-by: Lynette  Miles <[email protected]>
Signed-off-by: Hiroshi Hatake <[email protected]>
@cosmo0920 cosmo0920 force-pushed the cosmo0920-add-parquet-c-description-for-out_s3 branch from 3b4b0d4 to 0e3ab85 Compare November 21, 2025 10:17
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 21, 2025

Walkthrough

Updates to the S3 output documentation: corrects the S3 guide link, replaces the compression parameter description with a combined "Compression/format" explanation, and adds detailed Parquet support instructions, build/testing examples, usage notes, and a duplicated Parquet block plus a minor tab/blank formatting tweak.

Changes

Cohort / File(s) Summary
S3 Documentation Updates
pipeline/outputs/s3.md
Replaced CloudWatch S3 link with official Amazon S3 user guide; changed compression listing to a combined "Compression/format" section clarifying Arrow/build-time options and gzip/Content-Encoding behavior; added explicit Parquet support requirements and an "Enable Parquet support" section with build steps, testing configurations and examples (YAML and conf), usage notes about PutObject vs multipart uploads; duplicated Parquet enablement block and added a minor tab/blank formatting adjustment.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

  • Focus review on accuracy of build and dependency instructions for Parquet (Arrow, GLib/Parquet GLib) and correctness of example configurations.
  • Verify the duplicated Parquet block is intentional; consider removing duplicate.
  • Check the added note on gzip Content-Encoding for clarity and correctness.
  • Confirm the updated S3 user guide URL is correct.

Poem

🐇 I hopped through docs with nimble feet,
I fixed a link and made guide neat.
Parquet paths I tucked in tight,
Compress and test — the examples light.
A joyful nibble — docs take flight! 🥕✨

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'out_s3: Add a description for pure C parquet' is directly related to the changeset, which adds documentation describing Parquet support (including pure C Parquet options) for the S3 output plugin.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch cosmo0920-add-parquet-c-description-for-out_s3

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2dee9d8 and 0e3ab85.

📒 Files selected for processing (1)
  • pipeline/outputs/s3.md (4 hunks)
🔇 Additional comments (3)
pipeline/outputs/s3.md (3)

700-723: LGTM on build requirements documentation.

The Parquet build requirements section is clear and actionable, with a concrete Ubuntu/Debian example and proper references to external documentation for other distributions.


728-783: LGTM on testing configurations.

The example configurations for Parquet testing are well-structured, providing both YAML and conf formats with appropriate settings for testing (debug logging, dummy input with diverse data types, and use_put_object enabled).


700-783: Verify: Has the legacy format request been addressed?

A previous review comment from patrick-stephens requested: "We should add the legacy format as well." The current Parquet section doesn't explicitly mention legacy compression formats or other format options.

Please clarify:

  1. Does this PR need to document additional compression formats beyond gzip and parquet?
  2. If so, should they be added to this section or elsewhere in the documentation?

Signed-off-by: Hiroshi Hatake <[email protected]>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0e3ab85 and 848e027.

📒 Files selected for processing (1)
  • pipeline/outputs/s3.md (4 hunks)
🔇 Additional comments (4)
pipeline/outputs/s3.md (4)

9-9: ✓ Helpful documentation link update.

The change from the CloudWatch Logs S3 link to the official AWS S3 user guide is more appropriate and direct for users referencing S3 capabilities.


39-39: ✓ Comprehensive compression parameter documentation.

The expanded compression description clearly distinguishes between:

  • Always-available formats (gzip with Content-Encoding header behavior)
  • Build-time conditional formats (parquet requiring -DFLB_ARROW=On and Arrow GLib/Parquet GLib)
  • Typical usage patterns (Parquet with use_put_object On)

This is precise and helpful for users evaluating their build configuration.


700-722: ✓ Clear build requirements and instructions for Parquet support.

The new "Build requirements for Parquet" section provides:

  • Specific package installation steps for Ubuntu/Debian
  • CMake command with the required -DFLB_ARROW=On flag
  • Reference to upstream Apache Parquet installation docs for other distributions

This guidance aligns well with the parameter documentation and enables users to set up Parquet correctly.


724-781: ✓ Thorough testing section with dual-format examples.

The "Testing Parquet support" section includes:

  • Complete service and pipeline configuration in both YAML and text (.conf) formats
  • Realistic input (dummy input with mixed data types)
  • Key Parquet-specific settings (compression: parquet, use_put_object: On)
  • Comments noting where additional parameters can be added

Examples are clear and actionable.

For example:

{% tabs %}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Remove extraneous blank line in tab markup.

Line 642 contains a blank line immediately after the opening {% tabs %} tag, which appears to be a formatting artifact. This extra whitespace can affect the rendered tab display.

  {% tabs %}
-
  {% tab title="fluent-bit.yaml" %}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
{% tabs %}
{% tab title="fluent-bit.yaml" %}
🤖 Prompt for AI Agents
In pipeline/outputs/s3.md around line 642, there is an extraneous blank line
immediately after the opening `{% tabs %}` tag; remove that blank line so the
first tab block/content follows the `{% tabs %}` tag directly (ensure no other
whitespace-only lines exist between the `{% tabs %}` tag and the first `{% tab
%}` or tab content).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

waiting-for-user Waiting for user/contributors feedback or requested changes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants