@sinhasubham
Contributor

Handle errors during stream restart in snapshot

Root Cause
When _restart_on_unavailable caught a ServiceUnavailable or a resumable InternalServerError, it re-initialized the iterator immediately, inside the except block. If that re-initialization itself failed (e.g. because the transient error had not yet cleared), the new exception was raised outside any try block, so it propagated unhandled and broke the retry loop.
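
The failure mode can be reproduced in a few lines. This is a minimal sketch, not the library's code: Unavailable, open_stream, and read_with_restart_in_handler are hypothetical names standing in for the retryable exception, the stream factory, and the restart helper.

```python
class Unavailable(Exception):
    """Hypothetical stand-in for a retryable error such as ServiceUnavailable."""


def open_stream(failures):
    """Hypothetical stream factory: raises the next scripted failure, if any."""
    if failures:
        raise failures.pop(0)
    return iter([1, 2, 3])


def read_with_restart_in_handler(failures):
    """Restart pattern that re-creates the iterator inside the except block."""
    iterator = None
    while True:
        try:
            if iterator is None:
                iterator = open_stream(failures)
            return list(iterator)
        except Unavailable:
            # This call is not covered by the try above, so a second
            # failure raised here escapes the loop entirely.
            iterator = open_stream(failures)


def outcome(failures):
    """Report either the rows read or the fact that an error escaped."""
    try:
        return read_with_restart_in_handler(failures)
    except Unavailable:
        return "unhandled"
```

With one scripted failure the restart succeeds; with two in a row, the second is raised from inside the handler and is never retried.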

Fix
This change modifies the logic to reset the iterator to None and continue the loop, forcing the re-initialization to occur inside the try block. This ensures that subsequent errors during restart are properly caught and retried.
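
The pattern described here can be sketched as follows. This is an illustration under stated assumptions rather than the actual implementation: FlakySource, stream_with_restart, the position counter, and the max_restarts cap are all hypothetical and do not come from the pull request.

```python
class Unavailable(Exception):
    """Hypothetical stand-in for a retryable error such as ServiceUnavailable."""


class FlakySource:
    """Hypothetical source whose open() fails a fixed number of times."""

    def __init__(self, rows, open_failures):
        self.rows = rows
        self.open_failures = open_failures
        self.opens = 0

    def open(self, start):
        self.opens += 1
        if self.open_failures > 0:
            self.open_failures -= 1
            raise Unavailable("still unavailable")
        return iter(self.rows[start:])


def stream_with_restart(source, max_restarts=10):
    """Yield rows, re-opening the stream inside the try block after errors."""
    iterator = None
    position = 0  # number of rows already yielded (assumed resume point)
    restarts = 0
    while True:
        try:
            if iterator is None:
                # Re-initialization happens here, so a failure during
                # restart is caught by the same except clause below.
                iterator = source.open(position)
            for row in iterator:
                position += 1
                yield row
            return
        except Unavailable:
            restarts += 1
            if restarts > max_restarts:  # assumed cap, not from the PR
                raise
            iterator = None
            continue
```

Because the handler only resets state, every attempt to open the stream goes through the same try block, no matter how many times in a row it fails.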

Testing
Added unit tests to cover this specific behavior.

@sinhasubham sinhasubham requested review from a team as code owners January 5, 2026 15:36
@product-auto-label product-auto-label bot added the size: m Pull request size is medium. label Jan 5, 2026
@product-auto-label product-auto-label bot added the api: spanner Issues related to the googleapis/python-spanner API. label Jan 5, 2026
@gemini-code-assist

Summary of Changes

Hello @sinhasubham, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical reliability issue in the Spanner client library's snapshot iteration, where transient errors during stream restarts could lead to unhandled exceptions and premature termination of data retrieval. By refining the retry mechanism to properly re-initialize iterators after such errors, the change significantly improves the robustness and resilience of data streaming operations, preventing data loss or incomplete results due to intermittent service disruptions.

Highlights

  • Robust Error Handling: The pull request enhances the error handling mechanism within the _restart_on_unavailable function for Spanner snapshots, specifically addressing issues where re-initialization failures during stream restarts could break the retry loop.
  • Refactored Iterator Re-initialization: The logic for re-initializing iterators after catching ServiceUnavailable or resumable InternalServerError has been refactored. Instead of immediate re-initialization within the except block, the iterator is now explicitly set to None, forcing re-initialization to occur safely within the main try block on the next loop iteration.
  • Improved Test Coverage: New unit tests have been added to specifically cover scenarios where ServiceUnavailable or resumable InternalServerError exceptions occur during the stream restart process, ensuring the fix behaves as expected and the retry mechanism is robust.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request effectively addresses a bug where errors during a stream restart in a snapshot were not handled correctly, which could break the retry loop. The fix of resetting the iterator to None and letting the main try block handle the re-initialization is a solid approach. The added unit tests properly cover the fixed scenario.

I've included a couple of suggestions to improve code maintainability by reducing duplication in both the implementation and the new tests. These changes should make the code cleaner and easier to manage in the future.

@sinhasubham sinhasubham force-pushed the fix/rst-stream-retry branch from 698bf71 to 6659c7d Compare January 6, 2026 14:22

except ServiceUnavailable:
    del item_buffer[:]
    with trace_call(
Contributor

Why are you removing the trace?

Contributor Author

We aren't removing the observability; we're consolidating it. By setting iterator = None and calling continue, the loop restarts and re-enters the trace_call block at Line 111. This ensures every retry is still traced while removing the need for duplicate tracing logic inside the except blocks. It also fixes the underlying problem: an error raised inside an except block was crashing the application without going through the retry mechanism.
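
To illustrate the point about tracing, here is a rough sketch in which a single traced block inside the try is re-entered on every retry. The names trace_span, spans, read_traced, and flaky_open are hypothetical; trace_span only imitates a tracing helper such as trace_call by recording span names in a list.

```python
from contextlib import contextmanager

spans = []  # span names recorded by the hypothetical tracer


@contextmanager
def trace_span(name):
    """Hypothetical stand-in for a tracing helper such as trace_call."""
    spans.append(name)
    yield


class Unavailable(Exception):
    """Hypothetical stand-in for a retryable error such as ServiceUnavailable."""


def read_traced(open_stream):
    """Read all rows, tracing every attempt to open the stream."""
    iterator = None
    while True:
        try:
            if iterator is None:
                # The only traced block: re-entered on each loop iteration
                # after the handler resets iterator to None.
                with trace_span("open_stream"):
                    iterator = open_stream()
            return list(iterator)
        except Unavailable:
            iterator = None
            continue


calls = {"count": 0}


def flaky_open():
    """Hypothetical stream factory that fails on its first two calls."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise Unavailable("still unavailable")
    return iter([1, 2])
```

Calling read_traced(flaky_open) records one span per attempt, including the two failed ones, without any tracing code in the except block.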
