fix: race conditions in parallel agents #530

Open
baptmont wants to merge 11 commits into google:main from baptmont:fix/rc-parallel-agents

Conversation

baptmont (Contributor) commented Feb 3, 2026

Supersedes #473

jaxxjj and others added 5 commits January 12, 2026 12:49
gemini-code-assist (Contributor)

Summary of Changes

Hello @baptmont, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the stability and correctness of concurrent operations within the system by addressing critical race conditions and potential deadlocks. It introduces robust synchronization mechanisms in the ParallelAgent for event processing, ensures thread safety in telemetry initialization, and resolves a deadlock scenario in the InMemorySession's event management. The changes are thoroughly validated with new, targeted integration tests.

Highlights

  • Parallel Agent Race Condition Fix: Introduced an acknowledgment channel (ackChan) between the main ParallelAgent runner and its sub-agents. This ensures that the main agent fully processes an event (including session append) before the sub-agent proceeds to its next iteration, effectively preventing race conditions related to event processing and session state updates.
  • Telemetry Thread Safety: Implemented a read-write mutex (localTracerMu) to protect the localTracer in the telemetry system. This ensures thread-safe initialization and access to the tracer provider, preventing potential race conditions during concurrent telemetry operations.
  • In-Memory Session Deadlock Resolution: Refactored the InMemorySession's Events() and appendEvent() methods to correctly use read and write mutexes. The updateSessionState logic was also adjusted to directly modify the session state map, avoiding a nested lock acquisition that could lead to deadlocks when appending events with state deltas.
  • New Integration Tests: Added comprehensive integration tests, including TestParallelAgentWithTools to validate the parallel agent's behavior with tool calls and responses, and TestInMemorySession_AppendEvent_Deadlock to specifically verify the resolution of the deadlock in the in-memory session's event appending mechanism.
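
The acknowledgment handshake described in the first highlight can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this pull request: the `result` struct layout, the signatures of `runSubAgent` and `run`, and the string-slice "session log" are all hypothetical and only borrow names from the summary above.

```go
package main

import (
	"fmt"
	"sync"
)

// result is a hypothetical stand-in for what a sub-agent sends to the main
// runner. The runner closes ack once the event has been fully processed.
type result struct {
	agent string
	event string
	ack   chan struct{}
}

// runSubAgent emits events one at a time and blocks on each ack, so it never
// starts its next iteration before the runner has recorded the previous event.
func runSubAgent(name string, events []string, out chan<- result, wg *sync.WaitGroup) {
	defer wg.Done()
	for _, ev := range events {
		ack := make(chan struct{})
		out <- result{agent: name, event: ev, ack: ack}
		<-ack // wait until the runner has appended the event
	}
}

// run fans out to the sub-agents and appends every event to a shared log
// (standing in for the session) before acknowledging it.
func run(agents map[string][]string) []string {
	out := make(chan result)
	var wg sync.WaitGroup
	for name, events := range agents {
		wg.Add(1)
		go runSubAgent(name, events, out, &wg)
	}
	// Close the channel once every sub-agent has finished.
	go func() {
		wg.Wait()
		close(out)
	}()

	var log []string
	for r := range out {
		log = append(log, r.agent+":"+r.event) // the "session append" step
		close(r.ack)                            // release the sub-agent
	}
	return log
}

func main() {
	log := run(map[string][]string{
		"agent1": {"call", "response"},
		"agent2": {"call", "response"},
	})
	fmt.Println(len(log), "events recorded")
}
```

In this sketch the interleaving between agents is still nondeterministic, but each agent's own events always reach the log in order, and no agent runs ahead of an event that has not yet been appended.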

Changelog
  • agent/workflowagents/parallelagent/agent.go
    • Added ackChan to result struct and runSubAgent to synchronize event processing between main and sub-agents.
    • Modified run function to wait for ackChan signal before continuing, ensuring event processing completion.
  • agent/workflowagents/parallelagent/agent_test.go
    • Added new imports for llmagent, httprr, testutil, gemini, tool, functiontool.
    • Introduced TestParallelAgentWithTools to verify race condition fixes in parallel agents using mock LLM agents and tools.
    • Added helper functions createAgentWithGemini, newGeminiModelForTest, newGeminiTestTransport for test setup.
  • agent/workflowagents/parallelagent/testdata/TestParallelAgentWithTools_agent1.httprr
    • Added new HTTP recording file for agent1's mock Gemini API interactions.
  • agent/workflowagents/parallelagent/testdata/TestParallelAgentWithTools_agent2.httprr
    • Added new HTTP recording file for agent2's mock Gemini API interactions.
  • internal/telemetry/telemetry.go
    • Added localTracerMu (a sync.RWMutex) to protect localTracer.
    • Applied localTracerMu.Lock()/Unlock() and localTracerMu.RLock()/RUnlock() for thread-safe access to the telemetry tracer.
  • session/inmemory.go
    • Added s.mu.RLock() and s.mu.RUnlock() to Events() method for read-locking.
    • Added s.mu.Lock() and s.mu.Unlock() to appendEvent() method for write-locking.
    • Refactored updateSessionState to directly assign values to session.state to prevent nested lock acquisition and deadlocks.
  • session/inmemory_test.go
    • Added TestInMemorySession_AppendEvent_Deadlock to confirm the resolution of a potential deadlock during event appending with state deltas.
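
The `session/inmemory.go` entries above describe a classic pitfall: Go's `sync.RWMutex` is not reentrant, so a method that already holds the write lock must not call another method that takes the same lock. The sketch below illustrates the pattern with a simplified, hypothetical `session` type; the field names and the `Set`/`Get` helpers are assumptions for illustration and are not taken from the actual file.

```go
package main

import (
	"fmt"
	"sync"
)

// session is a hypothetical, simplified stand-in for the in-memory session.
type session struct {
	mu     sync.RWMutex
	events []string
	state  map[string]any
}

func newSession() *session {
	return &session{state: make(map[string]any)}
}

// Set is a public setter that takes the write lock itself.
func (s *session) Set(key string, val any) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.state[key] = val
}

// Get reads a state value under the read lock.
func (s *session) Get(key string) (any, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	v, ok := s.state[key]
	return v, ok
}

// Events returns a copy of the event list under the read lock.
func (s *session) Events() []string {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return append([]string(nil), s.events...)
}

// appendEvent holds the write lock for the whole update.
func (s *session) appendEvent(ev string, delta map[string]any) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.events = append(s.events, ev)
	for k, v := range delta {
		// Calling s.Set(k, v) here would try to re-acquire s.mu and block
		// forever. Writing to the map directly avoids the nested lock.
		s.state[k] = v
	}
}

func main() {
	s := newSession()
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			s.appendEvent(fmt.Sprintf("event-%d", i), map[string]any{"last": i})
			_ = s.Events()
		}(i)
	}
	wg.Wait()
	fmt.Println(len(s.Events()), "events appended")
}
```

Returning a copy from `Events()` is a design choice of this sketch: it keeps callers from reading the underlying slice after the lock has been released.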

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

baptmont changed the title from "fix: race conditions in parallel agents #473" to "fix: race conditions in parallel agents" on Feb 3, 2026
gemini-code-assist (bot, Contributor) left a comment

Code Review

This pull request introduces several important fixes for race conditions and deadlocks across different parts of the codebase. The use of an acknowledgment channel in parallelagent to synchronize event processing is a solid approach to prevent race conditions between parallel agents. The added locks in the in-memory session implementation correctly address thread-safety issues and prevent potential deadlocks. The data race fix in the telemetry component is also a good catch. The new tests are well-written and effectively verify the fixes. I have one suggestion regarding error propagation in the parallelagent.
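
For readers unfamiliar with the telemetry fix mentioned in this review, the general pattern of guarding a package-level value with a `sync.RWMutex` looks roughly like the sketch below. Only the names `localTracer` and `localTracerMu` come from the changelog; the `tracer` type and the `setLocalTracer`/`getLocalTracer` helpers are hypothetical and do not reflect the real `internal/telemetry` API.

```go
package main

import (
	"fmt"
	"sync"
)

// tracer is a hypothetical placeholder for a tracer provider.
type tracer struct{ name string }

var (
	localTracerMu sync.RWMutex
	localTracer   *tracer
)

// setLocalTracer replaces the tracer under the write lock.
func setLocalTracer(t *tracer) {
	localTracerMu.Lock()
	defer localTracerMu.Unlock()
	localTracer = t
}

// getLocalTracer reads the tracer under the read lock, so many readers can
// proceed concurrently while writers get exclusive access.
func getLocalTracer() *tracer {
	localTracerMu.RLock()
	defer localTracerMu.RUnlock()
	return localTracer
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			if i%2 == 0 {
				setLocalTracer(&tracer{name: fmt.Sprintf("tracer-%d", i)})
			} else {
				_ = getLocalTracer()
			}
		}(i)
	}
	wg.Wait()
	fmt.Println("tracer set:", getLocalTracer() != nil)
}
```

Without the mutex, the concurrent reads and writes of `localTracer` in `main` would be a data race that `go run -race` reports.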

baptmont (Contributor, Author) commented Feb 6, 2026

/gemini review

gemini-code-assist (bot, Contributor) left a comment

Code Review

This pull request effectively addresses several race conditions and a deadlock, significantly improving the concurrency safety of the application. The introduction of an acknowledgment channel in parallelagent to synchronize sub-agent execution is a robust fix for preventing stale session reads. The deadlock in session/inmemory has been correctly identified and resolved, and missing mutexes have been added for thread-safe session access. Additionally, a data race in the telemetry component is fixed. The changes are well-supported by new, targeted tests that validate the fixes and prevent regressions. I have one minor suggestion to improve test robustness.
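
A deadlock regression test generally needs a timeout, because a deadlocked call never returns on its own. The sketch below shows one common way to structure such a check. It is an assumption about the general technique, not a copy of `TestInMemorySession_AppendEvent_Deadlock`; `completesWithin`, `nestedLock`, `flatLock`, and `testTimeout` are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// testTimeout is an arbitrary bound chosen for this sketch.
const testTimeout = 200 * time.Millisecond

// completesWithin runs fn in a goroutine and reports whether it returned
// before the deadline. A deadlocked fn leaves its goroutine blocked.
func completesWithin(d time.Duration, fn func()) bool {
	done := make(chan struct{})
	go func() {
		defer close(done)
		fn()
	}()
	select {
	case <-done:
		return true
	case <-time.After(d):
		return false
	}
}

// nestedLock re-acquires a non-reentrant mutex it already holds, which is
// the shape of the bug described above. It never returns.
func nestedLock() {
	var mu sync.Mutex
	mu.Lock()
	mu.Lock()
}

// flatLock takes the lock once, does its work, and releases it.
func flatLock() {
	var mu sync.Mutex
	counter := 0
	mu.Lock()
	counter++
	mu.Unlock()
	_ = counter
}

func main() {
	fmt.Println("flat lock completes:", completesWithin(testTimeout, flatLock))
	fmt.Println("nested lock completes:", completesWithin(testTimeout, nestedLock))
}
```

In a real test the same helper would wrap the call under test and fail with `t.Fatal` on timeout, ideally alongside `go test -race` to catch the data races as well.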

4 participants