Conversation


@b3nw b3nw commented Jan 20, 2026

Implement quota tracking for NanoGPT provider using a simple standalone mixin pattern:

Core Implementation:

  • NanoGptQuotaTracker: Standalone mixin (no complex base class inheritance)
  • Tracks credential-level quota (usage units = successful operations)
  • Daily/monthly quota with API-provided reset timestamps
  • Tier detection from subscription state (active/grace/inactive)
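A minimal sketch of the standalone mixin pattern described above (the class internals, `QuotaInfo` container, and helper names here are illustrative assumptions, not the PR's actual code):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QuotaInfo:
    """Illustrative container for one credential's quota state."""
    daily_used: int = 0
    monthly_used: int = 0
    daily_reset_at: Optional[float] = None    # epoch seconds, from the API
    monthly_reset_at: Optional[float] = None  # epoch seconds, from the API
    tier: str = "inactive"


class NanoGptQuotaTracker:
    """Standalone mixin: no quota base class, just a small surface a
    provider class mixes in alongside its own bases."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)  # cooperative multiple inheritance
        self._quota_by_credential: dict[str, QuotaInfo] = {}

    def record_success(self, credential: str) -> None:
        # Usage units are successful operations, not tokens.
        info = self._quota_by_credential.setdefault(credential, QuotaInfo())
        info.daily_used += 1
        info.monthly_used += 1

    @staticmethod
    def tier_from_state(state: str) -> str:
        # Map the API's subscription state onto a credential tier.
        return {"active": "active", "grace_period": "grace"}.get(state, "inactive")
```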

Features:

  • fetch_subscription_usage(): Queries NanoGPT API for subscription usage
  • Background refresh job: Periodic quota updates via run_background_job()
  • Integration with UsageManager using virtual model 'nanogpt/_subscription'
  • Configurable refresh interval via NANOGPT_QUOTA_REFRESH_INTERVAL env var
  • Dynamic model discovery from /v1/models endpoint
  • Parallel credential refresh with asyncio.gather and semaphore
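The background-refresh pattern listed above can be sketched roughly as follows (function names, the default interval, and the concurrency limit are assumptions, not the PR's actual API):

```python
import asyncio
import os

# Refresh interval is configurable via the env var named above; the
# 300-second default here is an assumption for illustration.
REFRESH_INTERVAL = int(os.environ.get("NANOGPT_QUOTA_REFRESH_INTERVAL", "300"))
MAX_CONCURRENT = 5


async def refresh_one(credential: str, sem: asyncio.Semaphore) -> str:
    async with sem:
        # Real code would call fetch_subscription_usage() for this credential.
        await asyncio.sleep(0)
        return credential


async def refresh_all(credentials: list[str]) -> list[str]:
    # Fan out with gather while bounding concurrency with a semaphore.
    sem = asyncio.Semaphore(MAX_CONCURRENT)
    return await asyncio.gather(*(refresh_one(c, sem) for c in credentials))


async def run_background_job(credentials: list[str]) -> None:
    while True:
        await refresh_all(credentials)
        await asyncio.sleep(REFRESH_INTERVAL)
```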

Files:

  • NEW: nanogpt_provider.py (321 lines) - Provider with quota tracking + background job
  • NEW: nanogpt_quota_tracker.py (350 lines) - Standalone quota mixin

API Integration:

  • Models endpoint: GET https://nano-gpt.com/api/v1/models
  • Usage endpoint: GET https://nano-gpt.com/api/subscription/v1/usage

Architecture: Simple standalone mixin for credential-level quota tracking.


Important

Adds NanoGptProvider with NanoGptQuotaTracker mixin for credential-level quota tracking and dynamic model discovery for NanoGPT API.

  • Behavior:
    • Adds NanoGptProvider in nanogpt_provider.py for NanoGPT API with subscription-based usage tracking.
    • Implements NanoGptQuotaTracker mixin in nanogpt_quota_tracker.py for credential-level quota tracking.
    • Tracks daily/monthly usage limits and supports tier-based credential prioritization.
  • Features:
    • fetch_subscription_usage(): Queries NanoGPT API for subscription usage.
    • run_background_job(): Periodic quota updates using asyncio.gather.
    • Dynamic model discovery from /v1/models endpoint.
    • Configurable refresh interval via NANOGPT_QUOTA_REFRESH_INTERVAL env var.
  • API Integration:
    • Models endpoint: GET https://nano-gpt.com/api/v1/models.
    • Usage endpoint: GET https://nano-gpt.com/api/subscription/v1/usage.
    • Auth: Bearer token in Authorization header.

This description was created by Ellipsis for 8e70237.

@b3nw b3nw requested a review from Mirrowel as a code owner January 20, 2026 00:55
@mirrobot-agent

I'm diving into the new NanoGPT provider and the quota tracking implementation. The mixin approach for subscription management looks interesting—I'll start my analysis now!


@mirrobot-agent mirrobot-agent bot left a comment


Overall Assessment

The addition of the NanoGPT provider is a solid improvement, and I like the standalone mixin pattern for quota tracking. It fits well with the project's goal of managing shared usage pools. However, there is a critical missing piece regarding the quota group implementation that will prevent usage from syncing correctly between different NanoGPT models.

Architectural Feedback

  • Quota Group Syncing: The most significant issue is the missing get_models_in_quota_group method. Because NanoGPT uses a single global pool, the UsageManager needs this method to know which models to update when a baseline is fetched or a request is recorded.
  • Logic Consolidation: There is significant redundancy between the provider's run_background_job and the mixin's get_all_quota_info. Moving the parallel fetch logic entirely into the mixin would be cleaner.
  • Global State: Mutating os.environ in __init__ leaks process-wide state. If possible, it's safer to handle routing at the request level instead.

Key Suggestions

  • Fix the Quota Sync: Implement get_models_in_quota_group in NanoGptProvider. You'll need to keep track of discovered models (e.g., in a set) to return them here.
  • Consolidate Mixin: Use the mixin's methods in the background job instead of re-implementing the parallel gather logic.
  • Correct Fallback Models: Update the Gemini 2.5 fallbacks to 2.0.
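The suggested quota-group fix could look roughly like this (the method signature and set-tracking details are assumptions inferred from the review, not the PR's actual code):

```python
class NanoGptProvider:
    """Sketch: track models discovered from /v1/models in a set and report
    them all as one quota group, since NanoGPT has a single global pool."""

    VIRTUAL_MODEL = "nanogpt/_subscription"

    def __init__(self) -> None:
        self._discovered_models: set[str] = set()

    def register_discovered_model(self, model_id: str) -> None:
        # Called during model discovery so the group stays current.
        self._discovered_models.add(model_id)

    def get_models_in_quota_group(self, model_id: str) -> list[str]:
        # Every discovered model shares the subscription pool, so the group
        # is the full discovered set plus the virtual subscription model.
        return sorted(self._discovered_models | {self.VIRTUAL_MODEL})
```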

Questions for the Author

  • Is the resetAt timestamp format (milliseconds) confirmed from NanoGPT's API response?
  • Is nanogpt/_subscription the preferred naming convention for virtual models in this system?

This review was generated by an AI assistant.

@Mirrowel Mirrowel added the Agent Monitored Monitored for AI Agent to review PR's and commits label Jan 20, 2026

@mirrobot-agent mirrobot-agent bot left a comment


Follow-up Review

Excellent progress! You've addressed the most critical architectural concerns from the previous review. The implementation of the quota group synchronization and the consolidation of the background job logic significantly improve the robustness and maintainability of the NanoGPT provider.

Assessment of New Changes:

  • Quota Group Sync: The addition of get_models_in_quota_group is a key fix. It ensures that usage metrics are correctly shared across all models in the NanoGPT pool.
  • Logic Consolidation: Refactoring run_background_job to use the mixin's refresh_subscription_usage reduces duplication and makes the provider cleaner.
  • Virtual Model Convention: Thanks for the clarification on the nanogpt/_subscription naming; it makes sense within the UsageManager framework.
  • Environment Safety: Removing the global os.environ setting is a good safety improvement.

Remaining Points:

  • Gemini Versions: The fallback list still refers to Gemini 2.5. Since 2.0 is the current latest, it's safer to update these to avoid potential "model not found" errors if discovery fails.
  • Unused Mixin Code: get_all_quota_info in the mixin is still unused and contains redundant logic. Removing it would help keep the codebase lean.
  • Timestamp Verification: Just a final sanity check on the resetAt division by 1000. If you've verified this from API responses, we're good to go!
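For reference, the division by 1000 being questioned amounts to the following (assuming resetAt really is epoch milliseconds):

```python
from datetime import datetime, timezone


def parse_reset_at(reset_at_ms: int) -> datetime:
    # Assumed: resetAt is epoch milliseconds; /1000 converts to seconds.
    return datetime.fromtimestamp(reset_at_ms / 1000, tz=timezone.utc)
```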

The PR is in great shape now. Once those minor points are addressed or confirmed, this is ready to merge.

This review was generated by an AI assistant.

b3nw added 2 commits January 20, 2026 02:16
Implement NanoGPT provider (https://nano-gpt.com) with simple standalone quota tracking:

Core Implementation:
- NanoGptProvider: OpenAI-compatible provider with subscription monitoring
- NanoGptQuotaTracker: Standalone mixin (no BaseQuotaTracker inheritance)
- Tracks subscription-level quota (not per-model)

Features:
- Dynamic model discovery from /v1/models endpoint
- Environment variable model override (NANOGPT_MODELS)
- Subscription usage tracking via /api/subscription/v1/usage
- Tier-based credential prioritization (active > grace > inactive)
- Background refresh job for quota monitoring

Quota Tracking:
- Daily and monthly usage pools (shared across all models)
- Usage units tracking (successful operations, not tokens)
- Virtual model 'nanogpt/_subscription' for UsageManager integration
- States: active, grace_period, inactive, cancelled

Files:
- NEW: nanogpt_provider.py (280 lines)
- NEW: nanogpt_quota_tracker.py (345 lines)

Pattern: Simple standalone mixin for credential-level quota providers.
- Add get_models_in_quota_group() for UsageManager quota sync
- Track discovered models in _discovered_models set for group sync
- Refactor run_background_job to use mixin's refresh_subscription_usage()
- Make tier refresh conditional (only on cache miss) to reduce network calls
- Remove global os.environ mutation from __init__
- Add documentation for nanogpt/_subscription virtual model convention

Resolves critical feedback: without get_models_in_quota_group, the
UsageManager cannot sync request_count and quota baselines across
models sharing NanoGPT's global subscription pool.
@b3nw b3nw force-pushed the feature/nanogpt-quota-system branch from 7fbad34 to d8664bc Compare January 20, 2026 02:17

@mirrobot-agent mirrobot-agent bot left a comment


Follow-up Review

Excellent progress! The NanoGPT implementation is now architecturally robust and addresses all critical feedback from the previous rounds. The consolidation of logic into the mixin and the implementation of quota group synchronization significantly improve the maintainability and correctness of the provider.

Assessment of New Changes:

  • NanoGPT Quota Sync: The addition of get_models_in_quota_group and the _discovered_models tracking correctly solves the usage synchronization issue.
  • Provider Refactoring: Moving the parallel fetch logic into the mixin and removing global environment state in __init__ are great cleanup steps.
  • Consistency: The naming of the virtual model nanogpt/_subscription aligns well with the existing UsageManager infrastructure.

The PR is in great shape and ready to merge.

This review was generated by an AI assistant.

@b3nw b3nw closed this Jan 20, 2026