Skip to content

Fix strict model auth and browser fetch cancellation#7

Open
godnight10061 wants to merge 11 commits intomainfrom
pr/stream-first-delta-timeout
Open

Fix strict model auth and browser fetch cancellation#7
godnight10061 wants to merge 11 commits intomainfrom
pr/stream-first-delta-timeout

Conversation

@godnight10061
Copy link
Owner

@godnight10061 godnight10061 commented Mar 3, 2026

Summary

  • Do not treat userscript proxy upstream status=0 as an error during streaming; keep consuming until status is known or timeouts occur.
  • Ensure the userscript proxy prefers the per-request arena-auth token (prevents reusing an anonymous cookie when a logged-in token was selected).
  • Fix strict-model non-streaming requests crashing (NameError) and prefer the ephemeral browser arena-auth cookie when no config token is present.
  • Make Chrome/Camoufox browser fetch transports resilient to asyncio.CancelledError in this runtime (CancelledError inherits BaseException), so Playwright cancellations do not crash requests.
  • Keep non-OpenAI JSON lines from upstream streams for better error hints.

Tests

  • pytest -q

@gemini-code-assist
Copy link

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the robustness and reliability of streaming operations, particularly concerning browser-based transports and proxy mechanisms. It introduces more granular control over streaming timeouts, ensures deterministic behavior for userscript proxy jobs, and improves error handling and fallback strategies for various network and authentication failures. These changes aim to provide a more stable and resilient streaming experience, especially in challenging environments like Windows.

Highlights

  • Streaming Timeout Management: Implemented a stream_first_delta_timeout_seconds configuration to abort upstream stream attempts that fail to produce content deltas within a specified period, preventing indefinite keep-alives and enabling retry/fallback logic.
  • Userscript Proxy Job Timeouts: Made userscript proxy job timeouts deterministic by anchoring them to job creation time, ensuring that proxy workers do not get stuck in long evaluate calls and improving overall stability.
  • Windows Reliability Improvements: Enhanced reliability on Windows by defaulting Camoufox fetch and proxy operations to 'headful' mode, unless explicitly configured otherwise, which often improves reCAPTCHA and Turnstile handling.
  • Transport Fallback Logic: Improved transport fallback mechanisms, including treating FETCH_DONE_WITHOUT_META from browser fetches as a transport failure, and implementing specific fallback strategies for non-reCAPTCHA 403 errors from both userscript proxy and browser fetch transports.
  • API Key Persistence: Updated configuration saving logic to preserve existing API keys by default, preventing accidental overwrites during concurrent config updates, and added explicit control for API key modifications.
  • Script Entrypoint Fix: Addressed an issue preventing src/main.py from being run directly as a script by adjusting sys.path to correctly resolve package-relative imports.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Changelog
  • .gitignore
    • Added new entries to ignore local end-to-end testing artifacts such as log files, process IDs, and models.json.
  • src/main.py
    • Implemented logic to allow src/main.py to be executed directly as a script by adjusting Python's import path.
    • Modified save_config to include a preserve_api_keys parameter, defaulting to True, to prevent accidental overwrites of API keys.
    • Updated startup_event to generate a default API key only if api_keys is not present or is None in the raw configuration.
    • Adjusted create_key and delete_key functions to explicitly set preserve_api_keys=False when calling save_config.
    • Refined debug logging for userscript proxy activation to indicate 'enabling proxy fallback'.
    • Added a debug message when a client disconnects during a stream.
    • Introduced a condition to skip Chrome fetch if no auth token is present and a userscript proxy is available, prioritizing the proxy fallback.
    • Updated the userscript proxy timeout to use stream_total_timeout_seconds for consistency.
    • Implemented error handling for 'camoufox proxy evaluate timeout' errors, triggering a fallback and marking the userscript proxy inactive.
    • Adjusted the maximum wait time for userscript proxy 403 retries to be bounded by the remaining budget.
    • Enhanced reCAPTCHA failure detection in userscript proxy by including a broader check for 'recaptcha' in the body text.
    • Added fallback logic for non-reCAPTCHA 403 errors from userscript proxy to switch to browser fetch transports.
    • Implemented fallback logic for non-reCAPTCHA 403 errors from Chrome/Camoufox fetch, allowing switching between them.
    • Introduced stream_first_delta_timeout_seconds to abort upstream streams that do not produce content deltas within the configured timeout.
    • Set saw_delta = True when actual content deltas are received to indicate active streaming.
    • Improved upstream_hint parsing to prioritize specific upstream error messages over generic proxy status messages.
    • Added explicit handling for userscript proxy proxy_error conditions to trigger fallbacks.
    • Modified retry logic for /nextjs-api/stream/post-to-evaluation/ to regenerate userMessageId, modelAMessageId, and modelBMessageId.
    • Added a debug message for stream task cancellation.
  • src/recaptcha.py
    • Modified cookie handling in get_recaptcha_v3_token_with_chrome to allow seeding Cloudflare/reCAPTCHA cookies if they are missing from the profile, but prevent overwriting existing ones.
    • Removed a trailing newline character at the end of the file.
  • src/transport.py
    • Initialized UserscriptProxyStreamResponse status code to 0 (unknown) and added robust error handling and bounds for timeout seconds.
    • Updated aiter_lines to use an absolute deadline for proxy job timeouts, anchored to job creation, preventing timeout extensions.
    • Corrected Playwright cookie specification handling in _arena_auth_cookie_specs and _provisional_user_id_cookie_specs to use url OR domain+path, but not both url and path.
    • Added created_at_monotonic and timeout_seconds to the proxy job dictionary in fetch_lmarena_stream_via_userscript_proxy.
    • Implemented logic in _on_fetch_task_done to treat FETCH_DONE_WITHOUT_META (status 502) from browser fetch as a transport failure.
    • Defaulted Camoufox fetch headless mode to False (headful) on Windows for improved reliability.
    • Defaulted Camoufox proxy headless mode to False (headful) on Windows for improved reliability.
    • Removed path from cookie specifications when url is used in _attempt_anonymous_signup.
    • Added logic to bound the in-page fetch runtime to the per-job timeout in camoufox_proxy_worker to prevent the worker from getting stuck.
    • Implemented a browser relaunch mechanism in camoufox_proxy_worker if an evaluate timeout occurs, to clear bad execution contexts.
  • tests/test_api_key_persistence.py
    • Added new test cases to verify the persistence behavior of API keys when saving configuration.
  • tests/test_arena_origin_and_cookie_scoping.py
    • Updated assertions in test_arena_auth_cookie_specs_scope_to_both_origins to reflect changes in Playwright cookie spec handling, specifically the absence of 'path' when 'url' is present.
  • tests/test_main_script_entrypoint_as_script.py
    • Added a new test to confirm that src/main.py can be successfully executed as a standalone script.
  • tests/test_stream_browser_fetch_non_recaptcha_403_switches_transport.py
    • Added a new test case to validate the browser fetch fallback mechanism when encountering non-reCAPTCHA 403 errors.
  • tests/test_stream_no_delta_followup_regenerates_ids.py
    • Added a new test to ensure that message IDs are regenerated during retry attempts after an upstream stream produces no content deltas.
  • tests/test_stream_userscript_proxy_evaluate_timeout_fallback.py
    • Added a new test to verify the fallback behavior when a userscript proxy evaluate call times out.
  • tests/test_stream_userscript_proxy_long_delay_respects_stream_deadline.py
    • Added a new test to confirm that the userscript proxy respects the configured stream deadline even with long delays.
  • tests/test_stream_userscript_proxy_non_recaptcha_403_fallback.py
    • Added a new test to validate the userscript proxy's fallback to browser fetch transports upon receiving a non-reCAPTCHA 403 error.
Activity
  • The pull request introduces significant changes to streaming, timeout, and transport fallback logic, indicating a focused effort on improving system resilience.
  • New test files have been added to cover the updated functionality, suggesting a thorough approach to validating the changes.
  • The changes primarily involve code modifications and additions, with no explicit human comments or reviews provided in the context.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces several significant improvements to streaming reliability and transport fallback logic, including making userscript proxy job timeouts deterministic, adding a timeout for initial stream data, and improving transport selection on Windows. However, a security audit identified critical vulnerabilities in the administrative dashboard and API handling. Specifically, there are stored XSS vulnerabilities in the dashboard due to unescaped rendering of model names and API key names, and administrative endpoints lack CSRF protection, potentially allowing unauthorized actions. Additionally, while the PR adds robust error handling for proxy timeouts and non-reCAPTCHA 403s and includes good tests, there is an area of code duplication in the streaming logic that could be refactored for better maintainability.

@godnight10061
Copy link
Owner Author

/gemini review

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces significant improvements to streaming reliability, timeout handling, and transport fallback mechanisms. The changes are well-structured and address important issues like deterministic timeouts for proxy jobs and preventing infinite keep-alives on stuck streams. The addition of comprehensive tests for the new logic is also a great enhancement. I've identified a couple of opportunities to improve maintainability by refactoring duplicated code blocks, which I've detailed in the specific comments.

@godnight10061
Copy link
Owner Author

/gemini review

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This is an excellent pull request that significantly improves the reliability and security of the application. The introduction of deterministic timeouts, fallback logic for streaming transports, and robust CSRF/XSS protection on the dashboard are all critical enhancements. The code is well-structured and the addition of comprehensive tests is commendable.

I've found one high-severity issue related to timeout calculations and one medium-severity suggestion for refactoring duplicated code.

Additionally, while you've added CSRF protection to the API key management endpoints, I noticed that other state-changing endpoints like /add-auth-token and /delete-auth-token are still missing this protection. It would be great to secure them as well to prevent Cross-Site Request Forgery vulnerabilities across the entire dashboard.

@godnight10061
Copy link
Owner Author

/gemini review

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces significant improvements to streaming reliability, transport fallback logic, and dashboard security. Key changes include adding a timeout for streams that don't produce data, making userscript proxy timeouts deterministic, and enhancing fallback mechanisms between different browser transports (Userscript proxy, Chrome, and Camoufox). The dashboard has also been hardened against XSS and CSRF attacks through HTML escaping and CSRF protection for API key management. However, I identified one remaining instance of unescaped output in the dashboard that could lead to XSS if the upstream model data is malicious. Additionally, there is a suggestion to improve the timeout implementation's robustness.

@godnight10061
Copy link
Owner Author

/gemini review

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces significant improvements to streaming timeouts, transport fallback logic, and dashboard security. The changes effectively address potential infinite keep-alive streams by adding a 'first delta' timeout and make proxy job timeouts deterministic. The transport fallback mechanisms are now more robust, handling various error scenarios like non-reCAPTCHA 403s and proxy timeouts. Security is enhanced with CSRF protection on the dashboard and systematic HTML escaping to prevent XSS. The addition of comprehensive unit tests for the new logic is a great practice. My review includes one suggestion for minor code simplification in an error handling path, but overall, the changes are excellent and well-implemented.

@godnight10061
Copy link
Owner Author

/gemini review

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This is an excellent pull request that significantly improves the reliability and security of the application. The introduction of CSRF protection and HTML escaping on the dashboard is a critical security enhancement. The improvements to streaming timeouts, particularly the 'no delta' timeout and the deterministic proxy job timeouts, will greatly enhance reliability and prevent hanging requests. The transport fallback logic is also much more robust now. The addition of a comprehensive test suite to validate these changes is highly appreciated. I've found a couple of minor opportunities for refactoring to reduce code duplication, but overall this is a very high-quality contribution.

@godnight10061 godnight10061 force-pushed the pr/stream-first-delta-timeout branch from fcf6326 to a22ab18 Compare March 4, 2026 06:27
@godnight10061 godnight10061 changed the title Fix streaming timeouts and transport fallback Fix strict model auth and browser fetch cancellation Mar 4, 2026
@godnight10061
Copy link
Owner Author

/gemini review

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a significant number of improvements across security, robustness, and functionality. The addition of CSRF protection and HTML escaping on the dashboard is a critical security enhancement. The streaming logic has been made much more robust with better transport selection, error handling for various 403 scenarios, and timeouts to prevent hangs. The fixes for strict model authentication and browser fetch cancellation are also valuable. The code is well-structured, and the inclusion of a comprehensive test suite for the new features and fixes is excellent. I have one suggestion for refactoring a piece of duplicated logic to improve maintainability.

Note: Security Review did not run due to the size of the PR.

Comment on lines +4674 to +4692
auth_for_browser = str(current_token or "").strip()
try:
cand = str(EPHEMERAL_ARENA_AUTH_TOKEN or "").strip()
except Exception:
cand = ""
if cand:
try:
if (
is_probably_valid_arena_auth_token(cand)
and not is_arena_auth_token_expired(cand, skew_seconds=0)
and (
(not auth_for_browser)
or (not is_probably_valid_arena_auth_token(auth_for_browser))
or is_arena_auth_token_expired(auth_for_browser, skew_seconds=0)
)
):
auth_for_browser = cand
except Exception:
auth_for_browser = cand

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This block of logic for selecting the best authentication token for browser-based transports appears to be duplicated in a few places (here for non-streaming requests, and inside _try_chrome_fetch and _try_camoufox_fetch in the streaming logic). To improve maintainability and reduce code duplication, consider refactoring this logic into a separate helper function.

For example:

def _get_auth_for_browser_transport(current_token: str) -> str:
    """Selects the best auth token for browser transports, preferring a valid ephemeral token."""
    auth_for_browser = str(current_token or "").strip()
    try:
        cand = str(EPHEMERAL_ARENA_AUTH_TOKEN or "").strip()
    except Exception:
        cand = ""
    
    if not cand:
        return auth_for_browser

    try:
        is_cand_valid = (
            is_probably_valid_arena_auth_token(cand)
            and not is_arena_auth_token_expired(cand, skew_seconds=0)
        )
        if not is_cand_valid:
            return auth_for_browser

        is_current_invalid = (
            (not auth_for_browser)
            or (not is_probably_valid_arena_auth_token(auth_for_browser))
            or is_arena_auth_token_expired(auth_for_browser, skew_seconds=0)
        )

        if is_current_invalid:
            return cand
            
    except Exception:
        # Fallback to candidate on parsing error
        return cand

    return auth_for_browser

You could then call this helper function in all three locations where this logic is used.

@godnight10061
Copy link
Owner Author

/gemini review

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This is an excellent pull request that significantly improves the security and robustness of the application. It introduces crucial security measures like CSRF protection and output escaping to prevent XSS, fixes several bugs related to transport selection and timeouts, and improves session management. The changes to the userscript proxy and browser fetch transports make them more resilient to errors and upstream issues. The addition of a comprehensive suite of tests for the new functionality is also highly commendable. I have one minor suggestion for improvement.

Note: Security Review did not run due to the size of the PR.

Comment on lines +3748 to +3749
except Exception:
pass

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

For clarity and to avoid unintentionally catching unrelated errors, it's better to catch the specific asyncio.QueueEmpty exception here instead of the general Exception.

Suggested change
except Exception:
pass
except asyncio.QueueEmpty:
pass

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant