-
Notifications
You must be signed in to change notification settings - Fork 99
fix(spanner): handle errors during stream restart in snapshot #1471
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Summary of ChangesHello @sinhasubham, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses a critical reliability issue in the Spanner client library's snapshot iteration, where transient errors during stream restarts could lead to unhandled exceptions and premature termination of data retrieval. By refining the retry mechanism to properly re-initialize iterators after such errors, the change significantly improves the robustness and resilience of data streaming operations, preventing data loss or incomplete results due to intermittent service disruptions. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request effectively addresses a bug where errors during a stream restart in a snapshot were not handled correctly, which could break the retry loop. The fix of resetting the iterator to None and letting the main try block handle the re-initialization is a solid approach. The added unit tests properly cover the fixed scenario.
I've included a couple of suggestions to improve code maintainability by reducing duplication in both the implementation and the new tests. These changes should make the code cleaner and easier to manage in the future.
698bf71 to
6659c7d
Compare
|
|
||
| except ServiceUnavailable: | ||
| del item_buffer[:] | ||
| with trace_call( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are you removing the trace?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We aren't removing the observability; we're consolidating it. By setting iterator = None and calling continue, the loop restarts and re-enters the trace_call block at Line 111. This ensures every retry is still traced while removing the need for duplicate tracing logic inside the except blocks. By doing this we handle the problem of error inside except block which is crashing the application without going through retry mechanism.
Handle errors during stream restart in snapshot
Root Cause
When
_restart_on_unavailablecaught aServiceUnavailableor resumableInternalServerError, it attempted to re-initialize the iterator immediately within theexceptblock. If this re-initialization failed (e.g. due to a persistent transient error), the exception would propagate unhandled, breaking the retry loop.Fix
This change modifies the logic to reset the iterator to
Noneandcontinuethe loop, forcing the re-initialization to occur inside thetryblock. This ensures that subsequent errors during restart are properly caught and retried.Testing
Added unit tests to cover this specific behavior