-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Offset-based deduplication for Kafka source. #33596
base: master
Are you sure you want to change the base?
Conversation
28d3ae9
to
9a2d27e
Compare
Checks are failing. Will not request review until checks are succeeding. If you'd like to override that behavior, comment |
829cc76
to
0307c30
Compare
R: @scwhittle |
Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control. If you'd like to restart, comment |
3f19ae6
to
6431783
Compare
sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java
Outdated
Show resolved
Hide resolved
sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIOUtils.java
Outdated
Show resolved
Hide resolved
sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIOUtils.java
Outdated
Show resolved
Hide resolved
sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaUnboundedReader.java
Show resolved
Hide resolved
sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaUnboundedReader.java
Show resolved
Hide resolved
sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaUnboundedSource.java
Show resolved
Hide resolved
949eb3d
to
baca3e2
Compare
84dbc7b
to
9b30e79
Compare
sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java
Outdated
Show resolved
Hide resolved
sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIOUtils.java
Outdated
Show resolved
Hide resolved
sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaUnboundedReader.java
Outdated
Show resolved
Hide resolved
sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaUnboundedReader.java
Show resolved
Hide resolved
...-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/OrderedCode.java
Outdated
Show resolved
Hide resolved
09a16ab
to
8578e89
Compare
a8317b0
to
fc82f02
Compare
sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java
Show resolved
Hide resolved
6bfb8ff
to
f2d55b0
Compare
R: @chamikaramj and @johnjcasey Can one of you help review source/translation changes here? I am working on a change that applies to Kafka Read on Java legacy. I am having trouble understanding why the change breaks |
d4d40e3
to
becfb31
Compare
.../io/kafka/upgrade/src/main/java/org/apache/beam/sdk/io/kafka/upgrade/KafkaIOTranslation.java
Outdated
Show resolved
Hide resolved
becfb31
to
87ca766
Compare
@scwhittle if you have any tips, or know who we can ask:
|
Probably try changing the property definition to
|
Seems like the builder is considering this to be a required parameter and failing since it's not provided through the test schema here: |
87ca766
to
60b687d
Compare
Done, I think this resolves it.
I think this PR had it added already, done. Thanks for the feedback! |
Run Java_Kafka_IO_Direct PreCommit |
154b2d1
to
49ad5e2
Compare
Ok, I think some minor updates to KafkaIO translation allows to pass pre commit (such as no conflicting unbounded vs SDF settings). It was a combination of Cham's suggestion to make nullable, and then not setting a default values. |
I think you need to set a default value for this change to be compatible. May be it just passed randomly due to the retry ? |
Lemme know if you need help with re-triggering a test suite. |
When setting a default value and marking offset dedup as only compatible with legacy, Kafka read jobs with SDF-only arguments will fail saying there is no possible implementation (legacy/unbounded or SDF). Example from
@chamikaramj I don't think we need to specify a non-null default. Am I misunderstanding something? |
49ad5e2
to
58948bc
Compare
Synced offline with Cham. I think we are on the same page. Made a minor change KafkaIOTranslation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. LGTM.
Just some nits.
sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java
Outdated
Show resolved
Hide resolved
sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java
Outdated
Show resolved
Hide resolved
58948bc
to
8b1b35f
Compare
Offset-based deduplication for Kafka source.
See the Contributor Guide for more tips on how to make review process smoother.
To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md
GitHub Actions Tests Status (on master branch)
See CI.md for more information about GitHub Actions CI or the workflows README to see a list of phrases to trigger workflows.