add changefeed doc #21273

Open
wants to merge 6 commits into
base: release-8.1
shiyuhang0
Member

First-time contributors' checklist

What is changed, added or deleted? (Required)

Which TiDB version(s) do your changes apply to? (Required)

Tips for choosing the affected version(s):

By default, CHOOSE MASTER ONLY so your changes will be applied to the next TiDB major or minor releases. If your PR involves a product feature behavior change or a compatibility change, CHOOSE THE AFFECTED RELEASE BRANCH(ES) AND MASTER.

For details, see tips for choosing the affected versions.

  • master (the latest development version)
  • v9.0 (TiDB 9.0 versions)
  • v8.5 (TiDB 8.5 versions)
  • v8.4 (TiDB 8.4 versions)
  • v8.3 (TiDB 8.3 versions)
  • v8.1 (TiDB 8.1 versions)
  • v7.5 (TiDB 7.5 versions)
  • v7.1 (TiDB 7.1 versions)
  • v6.5 (TiDB 6.5 versions)
  • v6.1 (TiDB 6.1 versions)
  • v5.4 (TiDB 5.4 versions)

What is the related PR or file link(s)?

  • This PR is translated from:
  • Other reference link(s):

Do your changes match any of the following descriptions?

  • Delete files
  • Change aliases
  • Need modification after applied to another branch
  • Might cause conflicts after applied to another branch

ti-chi-bot bot commented Jun 27, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign qiancai for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added the missing-translation-status and size/XL labels Jun 27, 2025
@gemini-code-assist gemini-code-assist bot left a comment

Summary of Changes

Hello @shiyuhang0, I'm Gemini Code Assist [1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces comprehensive documentation for the new TiDB Cloud Changefeed feature. It provides an overview of the changefeed's capabilities, how to manage it using the TiDB Cloud CLI, and detailed instructions for configuring data streaming to Apache Kafka, including various filter and Kafka-specific settings.

Highlights

  • New Feature Documentation: Added a new overview document for the TiDB Cloud Changefeed feature, explaining its purpose (streaming data from TiDB Cloud to other services like Kafka), its current beta status, and general management operations.
  • CLI Command Reference: Documented the TiDB Cloud CLI commands for interacting with changefeeds, covering listing, creating, pausing, resuming, editing, and deleting changefeeds.
  • Apache Kafka Integration Guide: Provided a detailed guide on how to configure a changefeed to stream data to Apache Kafka, including network prerequisites, Kafka ACL authorization, and extensive configuration options for data formats (e.g., Canal-JSON, Avro, Open Protocol, Debezium), authentication, and topic partitioning strategies.
  • Changefeed States and Limitations: Outlined the various states a changefeed can be in (e.g., CREATING, RUNNING, PAUSED, WARNING, RUNNING_FAILED) and noted current limitations such as the number of changefeeds per cluster and table filter rules.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page. Here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | `/gemini review` | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | `/gemini summary` | Provides a summary of the current pull request in its current state. |
| Comment | `@gemini-code-assist` | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | `/gemini help` | Displays a list of available commands. |

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

The code changes add documentation for the Changefeed feature in TiDB Cloud, including an overview and a guide for sinking data to Apache Kafka. The documentation covers restrictions, prerequisites, configuration, and other details. The style guide was followed, with minor suggestions for clarity and grammar.

- `<cluster-id>`: the ID of the TiDB Cloud cluster that you want to create the changefeed for.
- `<changefeed-name>`: the name of the changefeed, it is optional. If you do not specify a name, TiDB Cloud automatically generates a name for the changefeed.
- type: the type of the changefeed, which is `KAFKA` in this case.
- kafka: a JSON string that contains the configurations for the changefeed to stream data to Apache Kafka. See []() for more information about the configurations.

medium

There should be a valid link here. Please provide a valid reference or remove the empty link.

Suggested change
- kafka: a JSON string that contains the configurations for the changefeed to stream data to Apache Kafka. See []() for more information about the configurations.
- kafka: a JSON string that contains the configurations for the changefeed to stream data to Apache Kafka. See the documentation for more information about the configurations.

- `<changefeed-name>`: the name of the changefeed, it is optional. If you do not specify a name, TiDB Cloud automatically generates a name for the changefeed.
- type: the type of the changefeed, which is `KAFKA` in this case.
- kafka: a JSON string that contains the configurations for the changefeed to stream data to Apache Kafka. See []() for more information about the configurations.
- filter: a JSON string that contains the configurations for the changefeed to filter tables and events. See []() for more information about the configurations.

medium

There should be a valid link here. Please provide a valid reference or remove the empty link.

Suggested change
- filter: a JSON string that contains the configurations for the changefeed to filter tables and events. See []() for more information about the configurations.
- filter: a JSON string that contains the configurations for the changefeed to filter tables and events. See the documentation for more information about the configurations.

TiDB Cloud changefeed helps you stream data from TiDB Cloud to other data services. Currently, TiDB Cloud supports streaming data to Apache Kafka.
> **Note:**
>
> - Currently, you can manager changefeeds only with [TiDB Cloud CLI](/tidb-cloud/get-started-with-cli.md).

low

According to the style guide [1], when addressing users, use the second person ("you"). Consider rephrasing to directly address the user.

Style Guide References

Suggested change
> - Currently, you can manager changefeeds only with [TiDB Cloud CLI](/tidb-cloud/get-started-with-cli.md).
> - Currently, you can manage changefeeds only with [TiDB Cloud CLI](/tidb-cloud/get-started-with-cli.md).

Footnotes

  1. Write in second person ("you") when addressing users. (link)

>
> TiDB Cloud currently only allows editing changefeeds in the paused status.

To edit a changefeed sink to kafka, you can pause the changefeed first, and then edit with the TiDB Cloud CLI command:

low

According to the style guide [1], when addressing users, use the second person ("you"). Consider rephrasing to directly address the user.

Style Guide References

Suggested change
To edit a changefeed sink to kafka, you can pause the changefeed first, and then edit with the TiDB Cloud CLI command:
To edit a changefeed sink to kafka, you can pause the changefeed first, and then edit it with the TiDB Cloud CLI command:

Footnotes

  1. Write in second person ("you") when addressing users. (link)
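
The note quoted in this comment says a changefeed can only be edited while it is paused. A minimal Python sketch of that rule follows, assuming the state names listed in the PR summary (which introduces them with "e.g.", so the set may be incomplete). `ChangefeedState`, `can_edit`, and `next_step` are hypothetical names for illustration, not part of the TiDB Cloud CLI.

```python
from enum import Enum


class ChangefeedState(Enum):
    # State names taken from the PR summary; the real set may be larger.
    CREATING = "CREATING"
    RUNNING = "RUNNING"
    PAUSED = "PAUSED"
    WARNING = "WARNING"
    RUNNING_FAILED = "RUNNING_FAILED"


def can_edit(state: ChangefeedState) -> bool:
    """Return True only for the paused state, per the quoted note."""
    return state is ChangefeedState.PAUSED


def next_step(state: ChangefeedState) -> str:
    """Describe what to do before editing (hypothetical helper)."""
    return "edit" if can_edit(state) else "pause first"
```

For example, `next_step(ChangefeedState.RUNNING)` yields `"pause first"`, matching the pause-then-edit order described in the doc line under review.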

```shell
ticloud serverless changefeed template --explain
```

The configurations in the `kafka` JSON string are used to configure how the changefeed streams data to Apache Kafka. Below is an example of a `filter` configuration:

low

The configurations in the kafka JSON string are used to configure how the changefeed streams data to Apache Kafka. Below is an example of a kafka configuration, not a filter configuration.

Suggested change
The configurations in the `kafka` JSON string are used to configure how the changefeed streams data to Apache Kafka. Below is an example of a `filter` configuration:
The configurations in the `kafka` JSON string are used to configure how the changefeed streams data to Apache Kafka. Below is an example of a `kafka` configuration:

- `tls_enable`: Whether to enable TLS for the connection.
- `compression`: The compression type for messages, support `NONE`, `GZIP`, `LZ4`, `SNAPPY`, and `ZSTD`.

"DISABLE", "SASL_PLAIN", "SASL_SCRAM_SHA_256", "SASL_SCRAM_SHA_512"

low

This line seems out of place. It should be integrated into the numbered list or removed if it's not relevant.
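
To make the value sets in the quoted lines concrete, here is a rough Python sketch that checks a `kafka` configuration before serializing it to the JSON string the CLI parameter expects. Only `tls_enable`, `compression`, and the listed values come from this thread. The bare quoted line appears to list authentication types, but the thread does not name the field that holds them, so the `auth_type` key below is a hypothetical placeholder, as are the function names.

```python
import json

# Values quoted in the doc lines under review.
COMPRESSION_TYPES = {"NONE", "GZIP", "LZ4", "SNAPPY", "ZSTD"}
AUTH_TYPES = {"DISABLE", "SASL_PLAIN", "SASL_SCRAM_SHA_256", "SASL_SCRAM_SHA_512"}


def validate_kafka_config(config: dict) -> list:
    """Return a list of problems; an empty list means the checks passed."""
    errors = []
    if not isinstance(config.get("tls_enable"), bool):
        errors.append("tls_enable must be true or false")
    if config.get("compression") not in COMPRESSION_TYPES:
        errors.append("compression must be one of " + ", ".join(sorted(COMPRESSION_TYPES)))
    # "auth_type" is an assumed key name; the thread only shows the values.
    if "auth_type" in config and config["auth_type"] not in AUTH_TYPES:
        errors.append("auth_type must be one of " + ", ".join(sorted(AUTH_TYPES)))
    return errors


def to_json_string(config: dict) -> str:
    """Serialize a validated config to a JSON string."""
    errors = validate_kafka_config(config)
    if errors:
        raise ValueError("; ".join(errors))
    return json.dumps(config, sort_keys=True)
```

The real configuration has more fields than this; `ticloud serverless changefeed template --explain`, quoted earlier in the review, is the place to look for the authoritative list.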


- **Distribute changelogs by timestamp to Kafka partition**

If you want the changefeed to send Kafka messages to different Kafka partitions randomly, set `partition_type` to `TS`.. The commitTs of a row changelog will determine which partition the changelog is sent to. This distribution method provides a better partition balance and ensures orderliness in each partition. However, multiple changes of a data item might be sent to different partitions and the consumer progress of different consumers might be different, which might cause data inconsistency. Therefore, the consumer needs to sort the data from multiple partitions by commitTs before consuming.

low

The phrase "is send" should be "is sent".

Suggested change
If you want the changefeed to send Kafka messages to different Kafka partitions randomly, set `partition_type` to `TS`.. The commitTs of a row changelog will determine which partition the changelog is sent to. This distribution method provides a better partition balance and ensures orderliness in each partition. However, multiple changes of a data item might be sent to different partitions and the consumer progress of different consumers might be different, which might cause data inconsistency. Therefore, the consumer needs to sort the data from multiple partitions by commitTs before consuming.
If you want the changefeed to send Kafka messages of a table to different partitions, set `partition_type` to `COLUMN` and set the `columns`. The specified column values of a row changelog will determine which partition the changelog is sent to. This distribution method ensures orderliness in each partition and guarantees that the changelog with the same column values is sent to the same partition.
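
The two distribution methods quoted in this comment can be illustrated with a small Python sketch. It is not the changefeed's actual dispatcher: the modulo and CRC32 mappings are assumptions chosen only to show the properties the text describes, and all function names are hypothetical. The merge step shows why a consumer of `TS`-partitioned data needs to re-sort by commitTs across partitions.

```python
import heapq
import zlib


def partition_by_ts(commit_ts: int, num_partitions: int) -> int:
    # Assumed mapping: spread changelogs across partitions by commitTs.
    return commit_ts % num_partitions


def partition_by_columns(values: tuple, num_partitions: int) -> int:
    # Assumed mapping: equal column values always hash to the same partition.
    key = "\x00".join(str(v) for v in values).encode("utf-8")
    return zlib.crc32(key) % num_partitions


def merge_by_commit_ts(partitions: list) -> list:
    """Merge per-partition lists of (commit_ts, payload) into commitTs order.

    Each input list must already be ordered by commit_ts, which is the
    per-partition ordering guarantee described in the quoted text.
    """
    return list(heapq.merge(*partitions, key=lambda msg: msg[0]))
```

The merge assumes every partition's messages are already available in full. A streaming consumer would also need some way to know that no earlier commitTs can still arrive on another partition, which this sketch does not cover.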

ti-chi-bot bot commented Jun 27, 2025

@shiyuhang0: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-verify | edbd3df | link | true | `/test pull-verify` |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Labels
missing-translation-status This PR does not have translation status info. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
1 participant