feat(quota): ✨ implement authoritative local counting and retry tracking #82
base: dev
Updates the usage tracking logic to handle providers where local counting is more accurate than API returns (specifically Antigravity).

- Implement `sync_mode` in `UsageManager` (`force`, `if_exhausted`, `none`) to control how API baselines affect local counters.
- Configure Antigravity credential manager to treat local counting as authoritative, skipping background refreshes to prevent valid usage data from being overwritten by stale API data.
- Add `on_retry_attempt` callback in `RotatingClient` and `AntigravityProvider` to track internal retries (bare 429s, empty responses) that consume quota.
- Introduce `measured_max_requests` tracking to dynamically learn the actual request limit of credentials.
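To make the three `sync_mode` values concrete, here is a minimal sketch of how an API baseline might be applied to a local counter. The semantics are an assumption inferred from the mode names (`force` always overwrites, `if_exhausted` overwrites only when the API reports nothing remaining, `none` never overwrites); `LocalUsage` and `apply_api_baseline` are hypothetical names, not the PR's actual code.

```python
from dataclasses import dataclass


@dataclass
class LocalUsage:
    """Hypothetical per-credential counter, not the project's real data model."""
    request_count: int = 0


def apply_api_baseline(usage: LocalUsage, api_used: int, api_remaining: int,
                       sync_mode: str = "force") -> LocalUsage:
    """Apply an API-reported baseline to the local counter (assumed semantics)."""
    if sync_mode == "force":
        # The API value always replaces the local count.
        usage.request_count = api_used
    elif sync_mode == "if_exhausted":
        # The API value is trusted only when it reports no quota left.
        if api_remaining <= 0:
            usage.request_count = api_used
    elif sync_mode == "none":
        # Local counting is authoritative; the API value is ignored.
        pass
    return usage
```

Under these assumptions, a provider that counts locally would pass `sync_mode="none"` for background refreshes, so a stale API value cannot overwrite the local count.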
@b3nw This is my take, though it feels way too bloated to commit without a major rethink. What do you think? The difference here is that Antigravity now counts only locally, except for the first fetch and forced fetches from the API. But the retries inside the provider make incrementing the counts difficult.
Oh, I know that is the case, no need to confirm it for me. It happens because the API reports quota in 20% increments, so API fetches are essentially useless for measuring quota accurately. The value is also rounded a bit, since you can get quota exhausted before you get the error, which suggests the API refresh happened and you got hit with 0 remaining.
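The 20% granularity can be illustrated with a little arithmetic. The sketch below assumes a hypothetical limit of 100 requests and assumes the reported value is rounded down to the nearest step; neither the limit nor the rounding direction is confirmed in this thread.

```python
def reported_remaining_percent(used: int, max_requests: int,
                               step_percent: int = 20) -> int:
    """Remaining quota as the API might report it, quantized to step_percent.

    Assumption: the value is rounded down to a multiple of step_percent.
    """
    remaining = max(0, max_requests - used)
    percent = 100 * remaining // max_requests
    return percent // step_percent * step_percent


# With a hypothetical limit of 100 requests, every count from used=1
# through used=20 is reported as the same value, so a single API reading
# cannot tell those counts apart.
readings = {used: reported_remaining_percent(used, 100) for used in (1, 10, 20, 21)}
```

Under these assumptions, one reading only locates the true count within a window of about 20% of the limit, which is why the local counter is the more precise source.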
- Delay setting `_initial_quota_fetch_done` in `GeminiCredentialManager` until after baselines are successfully stored to prevent premature usage.
- Update `_store_baselines_to_usage_manager` signature to propagate `sync_mode`.
- Add validation in `UsageManager` to warn and default to "force" if an unknown `sync_mode` is provided.
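A rough sketch of the two behaviours in this commit, using simplified synchronous code. `normalize_sync_mode` and `CredentialManagerSketch` are hypothetical stand-ins; only the `_initial_quota_fetch_done` flag name and the "warn and default to force" rule come from the commit message.

```python
import logging

logger = logging.getLogger(__name__)

VALID_SYNC_MODES = ("force", "if_exhausted", "none")


def normalize_sync_mode(sync_mode: str) -> str:
    """Warn and fall back to 'force' when the mode is not recognised."""
    if sync_mode not in VALID_SYNC_MODES:
        logger.warning("Unknown sync_mode %r, defaulting to 'force'", sync_mode)
        return "force"
    return sync_mode


class CredentialManagerSketch:
    """Hypothetical stand-in for a credential manager."""

    def __init__(self, store_baselines):
        # store_baselines(baselines, sync_mode) is an assumed callable.
        self._store_baselines = store_baselines
        self._initial_quota_fetch_done = False

    def initial_fetch(self, baselines, sync_mode: str = "force") -> None:
        self._store_baselines(baselines, normalize_sync_mode(sync_mode))
        # Reached only if storing did not raise, so the flag never claims
        # the initial fetch is done while the baselines are missing.
        self._initial_quota_fetch_done = True
```

Setting the flag last means a failed store leaves the manager in its "not yet fetched" state, so a later attempt can retry instead of serving requests against missing baselines.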
Okay, this seems like a very bad idea. The entire usage manager needs to be refactored, as it has been built upon way too many times. It needs to be easily overridable by the providers. The current logic in the PR is extremely shoddy when combined, and anything short of a rework makes it worse.
Linked to Bug: Quota count resets unexpectedly when background refresh syncs with API #75 and fix(usage_manager): 🐛 prevent stale API responses from resetting quota count (#75) #81
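One way `measured_max_requests` could be learned is by recording the local count at the moment a credential is found to be exhausted. The sketch below is an assumption about that mechanism, not the PR's implementation; `CredentialStats`, `record_request`, and `record_exhausted` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CredentialStats:
    """Hypothetical per-credential stats used only for this illustration."""
    request_count: int = 0
    measured_max_requests: Optional[int] = None

    def record_request(self) -> None:
        self.request_count += 1

    def record_exhausted(self) -> None:
        # Assumption: the count observed at exhaustion is the best estimate
        # of the real limit, and the largest observed value is kept.
        if (self.measured_max_requests is None
                or self.request_count > self.measured_max_requests):
            self.measured_max_requests = self.request_count
```

Keeping the largest observed value is a design choice made for this sketch. Taking the most recent value instead would track a limit that shrinks over time.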
Important

Implement authoritative local counting and retry tracking for the Antigravity provider with `sync_mode` and dynamic request limit learning.

- `sync_mode` in `UsageManager` to control API baseline effects on local counters (`force`, `if_exhausted`, `none`).
- `on_retry_attempt` callback in `RotatingClient` and `AntigravityProvider` to track retries (bare 429s, empty responses) that consume quota.
- `measured_max_requests` tracking to dynamically learn actual request limits.
- `increment_request_count()` in `UsageManager` to track retry attempts.
- `update_quota_baseline()` in `UsageManager` updated to handle `sync_mode` and measured max requests.
- `client.py` and `antigravity_provider.py` updated to integrate retry tracking and sync mode logic.

This description was created for 2015eed. It will automatically update as commits are pushed.
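The retry tracking described in this PR can be sketched as a callback that fires for each internal attempt that failed but still consumed quota. Everything here is illustrative: `UsageCounter`, `call_with_retries`, and the signature of `increment_request_count` are assumptions, and an empty response stands in for both bare 429s and empty replies.

```python
from typing import Callable, Optional


class UsageCounter:
    """Hypothetical stand-in for the usage manager's per-credential counts."""

    def __init__(self) -> None:
        self.counts = {}

    def increment_request_count(self, credential: str) -> None:
        self.counts[credential] = self.counts.get(credential, 0) + 1


def call_with_retries(send: Callable[[], str], credential: str,
                      max_attempts: int = 3,
                      on_retry_attempt: Optional[Callable[[str], None]] = None
                      ) -> Optional[str]:
    """Call send() until it returns a non-empty response or attempts run out."""
    for _ in range(max_attempts):
        response = send()
        if response:
            return response
        # The failed attempt still consumed quota, so report it to the caller.
        if on_retry_attempt is not None:
            on_retry_attempt(credential)
    return None
```

A caller would pass `on_retry_attempt=counter.increment_request_count` and count the final successful request itself, so each attempt is counted exactly once.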