feat(nanogpt): add NanoGPT provider with subscription quota tracking #87
Conversation
I'm diving into the new NanoGPT provider and the quota tracking implementation. The mixin approach for subscription management looks interesting — I'll start my analysis now!
Overall Assessment
The addition of the NanoGPT provider is a solid improvement, and I like the standalone mixin pattern for quota tracking. It fits well with the project's goal of managing shared usage pools. However, there is a critical missing piece regarding the quota group implementation that will prevent usage from syncing correctly between different NanoGPT models.
Architectural Feedback
- Quota Group Syncing: The most significant issue is the missing `get_models_in_quota_group` method. Because NanoGPT uses a single global pool, the `UsageManager` needs this method to know which models to update when a baseline is fetched or a request is recorded.
- Logic Consolidation: There is significant redundancy between the provider's `run_background_job` and the mixin's `get_all_quota_info`. Moving the parallel fetch logic entirely into the mixin would be cleaner.
- Global State: Setting `os.environ` in `__init__` is a bit "heavy". If possible, it's safer to handle routing at the request level.
Key Suggestions
- Fix the Quota Sync: Implement `get_models_in_quota_group` in `NanoGptProvider`. You'll need to keep track of discovered models (e.g., in a set) to return them here.
- Consolidate Mixin: Use the mixin's methods in the background job instead of re-implementing the parallel gather logic.
- Correct Fallback Models: Update the Gemini 2.5 fallbacks to 2.0.
Questions for the Author
- Is the `resetAt` timestamp format (milliseconds) confirmed from NanoGPT's API response?
- Is `nanogpt/_subscription` the preferred naming convention for virtual models in this system?
This review was generated by an AI assistant.
Follow-up Review
Excellent progress! You've addressed the most critical architectural concerns from the previous review. The implementation of the quota group synchronization and the consolidation of the background job logic significantly improve the robustness and maintainability of the NanoGPT provider.
Assessment of New Changes:
- Quota Group Sync: The addition of `get_models_in_quota_group` is a key fix. It ensures that usage metrics are correctly shared across all models in the NanoGPT pool.
- Logic Consolidation: Refactoring `run_background_job` to use the mixin's `refresh_subscription_usage` reduces duplication and makes the provider cleaner.
- Virtual Model Convention: Thanks for the clarification on the `nanogpt/_subscription` naming; it makes sense within the `UsageManager` framework.
- Environment Safety: Removing the global `os.environ` setting is a good safety improvement.
Remaining Points:
- Gemini Versions: The fallback list still refers to Gemini 2.5. Since 2.0 is the current latest, it's safer to update these to avoid potential "model not found" errors if discovery fails.
- Unused Mixin Code: `get_all_quota_info` in the mixin is still unused and contains redundant logic. Removing it would help keep the codebase lean.
- Timestamp Verification: Just a final sanity check on the `resetAt` division by 1000. If you've verified this from API responses, we're good to go!
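The timestamp point amounts to a one-line conversion. A minimal sketch, assuming (as the review does, pending confirmation from the API) that `resetAt` is a Unix epoch value in milliseconds; the helper name is illustrative:

```python
from datetime import datetime, timezone

def parse_reset_at(reset_at_ms: int) -> datetime:
    # Assumption under discussion: resetAt is a Unix timestamp in
    # *milliseconds*, so it must be divided by 1000 before being
    # handed to datetime.fromtimestamp (which expects seconds).
    return datetime.fromtimestamp(reset_at_ms / 1000, tz=timezone.utc)
```

If the API actually returns seconds, the division would silently push reset times back toward 1970, so this is worth verifying against a live response.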
The PR is in great shape now. Once those minor points are addressed or confirmed, this is ready to merge.
This review was generated by an AI assistant.
Implement NanoGPT provider (https://nano-gpt.com) with simple standalone quota tracking:

Core Implementation:
- NanoGptProvider: OpenAI-compatible provider with subscription monitoring
- NanoGptQuotaTracker: Standalone mixin (no BaseQuotaTracker inheritance)
- Tracks subscription-level quota (not per-model)

Features:
- Dynamic model discovery from /v1/models endpoint
- Environment variable model override (NANOGPT_MODELS)
- Subscription usage tracking via /api/subscription/v1/usage
- Tier-based credential prioritization (active > grace > inactive)
- Background refresh job for quota monitoring

Quota Tracking:
- Daily and monthly usage pools (shared across all models)
- Usage units tracking (successful operations, not tokens)
- Virtual model 'nanogpt/_subscription' for UsageManager integration
- States: active, grace_period, inactive, cancelled

Files:
- NEW: nanogpt_provider.py (280 lines)
- NEW: nanogpt_quota_tracker.py (345 lines)

Pattern: Simple standalone mixin for credential-level quota providers.
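The tier-based credential prioritization mentioned in the commit message (active > grace > inactive) reduces to a keyed sort. A hypothetical sketch — the tier names come from the commit message, but the dict-based credential shape and function name are assumptions for illustration:

```python
# Illustrative tier ranking: lower rank = tried first.
TIER_PRIORITY = {"active": 0, "grace": 1, "inactive": 2}

def prioritize_credentials(credentials):
    # Sort credentials so active subscriptions are used before
    # grace-period ones, which are used before inactive ones.
    # Unknown or missing tiers sort last.
    return sorted(
        credentials,
        key=lambda cred: TIER_PRIORITY.get(cred.get("tier"), 99),
    )
```

`sorted` is stable, so credentials within the same tier keep their original relative order.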
- Add get_models_in_quota_group() for UsageManager quota sync
- Track discovered models in _discovered_models set for group sync
- Refactor run_background_job to use mixin's refresh_subscription_usage()
- Make tier refresh conditional (only on cache miss) to reduce network calls
- Remove global os.environ mutation from __init__
- Add documentation for nanogpt/_subscription virtual model convention

Resolves critical feedback: without get_models_in_quota_group, the UsageManager cannot sync request_count and quota baselines across models sharing NanoGPT's global subscription pool.
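The consolidation described in this commit — `run_background_job` deferring to the mixin's `refresh_subscription_usage` instead of re-implementing the parallel gather — can be sketched roughly as follows. Method names follow the commit message; the bodies and class wiring are illustrative assumptions, not the PR's actual code:

```python
import asyncio

class NanoGptQuotaTracker:
    async def refresh_subscription_usage(self, credentials):
        # Fetch usage for all credentials in parallel; with
        # return_exceptions=True, one failing credential yields an
        # exception object in the result list instead of aborting
        # the whole refresh.
        return await asyncio.gather(
            *(self.fetch_subscription_usage(c) for c in credentials),
            return_exceptions=True,
        )

    async def fetch_subscription_usage(self, credential):
        # In the real provider this queries /api/subscription/v1/usage.
        raise NotImplementedError

class NanoGptProvider(NanoGptQuotaTracker):
    def __init__(self, credentials):
        self._credentials = credentials

    async def run_background_job(self):
        # Single entry point: delegate to the mixin rather than
        # duplicating the gather logic here.
        return await self.refresh_subscription_usage(self._credentials)
```

This keeps the parallel-fetch policy in one place, which is exactly the redundancy the review flagged.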
7fbad34 to d8664bc
Follow-up Review
Excellent progress! The NanoGPT implementation is now architecturally robust and addresses all critical feedback from the previous rounds. The consolidation of logic into the mixin and the implementation of quota group synchronization significantly improve the maintainability and correctness of the provider.
Assessment of New Changes:
- NanoGPT Quota Sync: The addition of `get_models_in_quota_group` and the `_discovered_models` tracking correctly solves the usage synchronization issue.
- Provider Refactoring: Moving the parallel fetch logic into the mixin and removing global environment state in `__init__` are great cleanup steps.
- Consistency: The naming of the virtual model `nanogpt/_subscription` aligns well with the existing `UsageManager` infrastructure.
The PR is in great shape and ready to merge.
This review was generated by an AI assistant.
Implement quota tracking for NanoGPT provider using a simple standalone mixin pattern:
Core Implementation:
Features:
Files:
API Integration:
Architecture: Simple standalone mixin for credential-level quota tracking.
Important

Adds `NanoGptProvider` with `NanoGptQuotaTracker` mixin for credential-level quota tracking and dynamic model discovery for the NanoGPT API.

- `NanoGptProvider` in `nanogpt_provider.py` for the NanoGPT API with subscription-based usage tracking.
- `NanoGptQuotaTracker` mixin in `nanogpt_quota_tracker.py` for credential-level quota tracking.
- `fetch_subscription_usage()`: queries the NanoGPT API for subscription usage.
- `run_background_job()`: periodic quota updates using `asyncio.gather`.
- Model discovery via the `/v1/models` endpoint.
- Refresh interval configurable via the `NANOGPT_QUOTA_REFRESH_INTERVAL` env var.
- `GET https://nano-gpt.com/api/v1/models`
- `GET https://nano-gpt.com/api/subscription/v1/usage`

This description was created automatically for 8e70237 and will update as commits are pushed.