-
Notifications
You must be signed in to change notification settings - Fork 18
Use long windows when catching up #67
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -54,8 +54,16 @@ public Timestamp getWindowEndTimestamp() { | |
| return windowEnd; | ||
| } | ||
|
|
||
| public TaskState moveToNextWindow(long nextWindowSizeMs) { | ||
| return new TaskState(windowEnd, windowEnd.plus(nextWindowSizeMs, ChronoUnit.MILLIS), Optional.empty()); | ||
| public TaskState moveToNextWindow(Timestamp now, long confidenceWindowSizeMs, long newQueryWindowSizeMs) { | ||
| Timestamp newWindowEnd = now.plus(-confidenceWindowSizeMs, ChronoUnit.MILLIS); | ||
|
|
||
| // Make sure that the window is at least newQueryWindowSizeMs long. | ||
| long windowLength = ChronoUnit.MILLIS.between(windowEnd.toDate().toInstant(), newWindowEnd.toDate().toInstant()); | ||
| if (windowLength < newQueryWindowSizeMs) { | ||
| newWindowEnd = windowEnd.plus(newQueryWindowSizeMs, ChronoUnit.MILLIS); | ||
| } | ||
|
|
||
| return new TaskState(windowEnd, newWindowEnd, Optional.empty()); | ||
| } | ||
|
|
||
| public TaskState update(ChangeId seen) { | ||
|
|
@@ -86,12 +94,24 @@ public String toString() { | |
| /* | ||
| * Creates an initial state for tasks in a given |generation|. | ||
| * | ||
| * Such initial state starts at the beginning of the generation and spans for | ||
| * |windowSizeMs| milliseconds. | ||
| * Such initial state starts at the beginning of the generation and spans | ||
| * until now minus confidence window size. | ||
| */ | ||
| public static TaskState createInitialFor(GenerationId generation, long windowSizeMs) { | ||
| Timestamp generationStart = generation.getGenerationStart(); | ||
| return new TaskState(generationStart, generationStart.plus(windowSizeMs, ChronoUnit.MILLIS), Optional.empty()); | ||
| public static TaskState createInitialFor(GenerationId generation, Timestamp now, | ||
| long confidenceWindowSizeMs, long queryTimeWindowSizeMs) { | ||
| // Start reading at generation start: | ||
| Timestamp windowStart = generation.getGenerationStart(); | ||
|
|
||
| // Create a large window up to (now - confidenceWindowSizeMs), except | ||
| // when the window gets too small - in that case create a window | ||
| // queryTimeWindowSizeMs large (the consumer might need to wait a bit | ||
| // for the window to be ready for reading). | ||
| Timestamp windowEnd = now.plus(-confidenceWindowSizeMs, ChronoUnit.MILLIS); | ||
| if (windowEnd.compareTo(windowStart) < 0) { | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is this enough? Wouldn't we want to adjust windowEnd also when
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe, but this is only an initial window so it doesn't really matter.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I could add this at the cost of additional complexity. |
||
| windowEnd = windowStart.plus(queryTimeWindowSizeMs, ChronoUnit.MILLIS); | ||
| } | ||
|
|
||
| return new TaskState(windowStart, windowEnd, Optional.empty()); | ||
| } | ||
|
|
||
| /* If the state is before |minimumWindowStart| then this method returns a state | ||
|
|
@@ -108,6 +128,12 @@ public TaskState trimTaskState(Timestamp minimumWindowStart, long windowSizeMs) | |
| return new TaskState(minimumWindowStart, minimumWindowStart.plus(windowSizeMs, ChronoUnit.MILLIS), Optional.empty()); | ||
| } | ||
|
|
||
| return this; | ||
| // Trim the start of the window with minimumWindowStart. | ||
| Timestamp newWindowStart = windowStart; | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why wasn't this needed before and now is? Or was it and it was a bug we weren't trimming the start?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Previously it only handled one case: Now it also handles this case: Previously it didn't really matter to do the second type of trimming, because the windows were very small (default 30 seconds). |
||
| if (newWindowStart.compareTo(minimumWindowStart) < 0) { | ||
| newWindowStart = minimumWindowStart; | ||
| } | ||
|
|
||
| return new TaskState(newWindowStart, windowEnd, lastConsumedChangeId); | ||
| } | ||
| } | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What was the conclusion of your tests @avelanarius? Does it make sense to take max(generation.getGenerationStart(), now - CDC ttl)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it made sense. The test was the following: create a table with small TTL (3 seconds), insert 10^6 rows, then do two types of selects: select of small window and select of a large window. Both of those queries returned no rows (3 second TTL...), but a select of small window was faster. Sometimes, select of a large window even caused timeouts (even though no rows were returned). We hypothesized that this is due to replicas sending "tombstones" to coordinator.