Gemini Integration w/ changes #250
base: main
Conversation
Walkthrough

This pull request introduces multiple modifications across the project. Updates include adding new environment variables for Gemini integration in the `.env.template`, among other changes summarized below.
Actionable comments posted: 1
🔭 Outside diff range comments (3)
app/modules/key_management/secret_manager.py (3)
188-189: Update provider literal in delete_secret.

The delete_secret method's provider literal should also include "google" for consistency.

```diff
- provider: Literal["openai", "anthropic", "deepseek", "all"],
+ provider: Literal["openai", "anthropic", "deepseek", "google", "all"],
```

🧰 Tools
🪛 Ruff (0.8.2)

189-189: Do not perform function call `Depends` in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable (B008)
196-196: Add Google to provider_list.

Update the provider_list to include "google" when deleting all secrets.

```diff
- provider_list = ["openai", "anthropic", "deepseek"]
+ provider_list = ["openai", "anthropic", "deepseek", "google"]
```
48-60: 🧩 Analysis chain
Verify Gemini integration requirements.
Please confirm the following:
- Is "google" the correct provider name for Gemini integration? Some implementations use "gemini" or "google-ai" instead.
- Are there any specific format requirements for Google API keys that need validation?
Let's check the official documentation:
🌐 Web query:
What is the correct provider name for Google Gemini API integration with LangChain?
💡 Result:
To integrate Google's Gemini API with LangChain, the correct provider name is **`langchain-google-genai`**. This package provides classes like `ChatGoogleGenerativeAI` and `GoogleGenerativeAIEmbeddings` for direct interaction with Gemini models via an API key[1][2][7][8][10].

Key details:

- **Installation**:

```bash
pip install langchain-google-genai
```

- **Usage**:

```python
from langchain_google_genai import ChatGoogleGenerativeAI

llm = ChatGoogleGenerativeAI(model="gemini-pro", temperature=0.9)
response = llm.invoke("Your prompt here")
```

This differs from Google Cloud Vertex AI integrations (`langchain-google-vertexai`), which require cloud credentials instead of a simple API key[3][10]. For most developers starting with the Gemini API, `langchain-google-genai` is recommended[10].

Citations:
- 1: https://www.andela.com/blog-posts/langchain-and-google-gemini-api-for-ai-apps-a-quickstart-guide
- 2: https://python.langchain.com/api_reference/google_genai/
- 3: https://python.langchain.com/docs/integrations/llms/google_vertex_ai_palm/
- 4: https://github.com/shamspias/langchain-gemini-api/blob/main/README.md
- 5: https://www.youtube.com/watch?v=_GEitI9PU28
- 6: https://discuss.ai.google.dev/t/does-google-ai-studio-work-with-langchain/64052
- 7: https://python.langchain.com/docs/integrations/chat/google_generative_ai/
- 8: https://dev.to/timesurgelabs/how-to-use-googles-gemini-pro-with-langchain-1eje
- 9: https://github.com/google-gemini/cookbook/blob/main/examples/langchain/Gemini_LangChain_QA_Chroma_WebLoad.ipynb
- 10: https://python.langchain.com/docs/integrations/providers/google/
Update provider name to match Gemini integration requirements.
- The official documentation indicates that for Gemini integration with LangChain, the correct provider name is "langchain-google-genai" rather than "google".
- There are no specific API key format requirements mentioned; the key is expected to be a standard API key string.
🧰 Tools
🪛 Ruff (0.8.2)

50-50: Use capitalized environment variable `ISDEVELOPMENTMODE` instead of `isDevelopmentMode` (SIM112)
🧹 Nitpick comments (5)
app/celery/celery_app.py (1)
40-42: Consider environment-specific queue naming.

While hardcoding the queue name simplifies development in WSL, consider making this more flexible for different environments.

```diff
-# Hardcoded queue name to "dev" (Allows for easier development in WSL;) - Sujal
-def configure_celery():
-    queue_name = os.getenv("CELERY_QUEUE_NAME", "dev_process_repository")
+def configure_celery():
+    env = os.getenv("ENV", "development")
+    queue_name = os.getenv("CELERY_QUEUE_NAME", f"{env}_process_repository")
```

app/celery/tasks/parsing_tasks.py (1)
48-48: Consider structured logging for user ID.

While logging the user ID improves traceability, consider using structured logging for better log parsing.

```diff
- logger.info(f"Given userID is: {user_id}")
+ logger.info("Processing task for user", extra={"user_id": user_id})
```
app/modules/intelligence/provider/provider_service.py (2)

1-1: Remove unused import.

The `lru_cache` import is not used in the code.

```diff
-from functools import lru_cache
```
🧰 Tools
🪛 Ruff (0.8.2)

1-1: `functools.lru_cache` imported but unused. Remove unused import: `functools.lru_cache` (F401)
238-243: Consider adding max_tokens parameter.

The Google provider initialization might benefit from a max_tokens parameter like other providers.

```diff
 if provider == "google":
     common_params.update(
         {
             "google_api_key": api_key,
+            "max_tokens": 8000,
         }
     )
```
start.sh (1)
36-38: Legacy Resolver Flag in pip Install Command
The pip install command now uses the "--use-deprecated=legacy-resolver" flag to improve startup time. This is acceptable as a temporary measure for performance, but consider reassessing its necessity when a newer pip version or a more permanent solution becomes available.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
- .env.template (1 hunks)
- .gitignore (1 hunks)
- app/celery/celery_app.py (3 hunks)
- app/celery/tasks/parsing_tasks.py (2 hunks)
- app/main.py (1 hunks)
- app/modules/intelligence/provider/provider_service.py (7 hunks)
- app/modules/key_management/secret_manager.py (2 hunks)
- app/modules/key_management/secrets_schema.py (2 hunks)
- requirements.txt (1 hunks)
- start.sh (2 hunks)
✅ Files skipped from review due to trivial changes (1)
- requirements.txt
🧰 Additional context used
🪛 Shellcheck (0.10.0)
start.sh
[warning] 50-50: Celery_QUEUE_NAME is referenced but not assigned.
(SC2154)
🪛 Ruff (0.8.2)
app/modules/intelligence/provider/provider_service.py

1-1: `functools.lru_cache` imported but unused. Remove unused import: `functools.lru_cache` (F401)
🔇 Additional comments (13)
app/modules/key_management/secrets_schema.py (2)
9-9: LGTM! Provider type updated for Google integration.

The Literal type has been correctly extended to include the "google" provider.
42-44: Verify Google API key format validation.

The validation checks for the "AIza" prefix, which is standard for Google API keys. However, let's verify if there are any additional format requirements.
✅ Verification successful
🌐 Web query:
What is the format of Google Gemini API keys?
💡 Result:
Google Gemini API keys are **alphanumeric strings** generated by Google AI Studio, typically resembling a long sequence of characters like `AIzaSyB...` (truncated for security). While the exact character count and internal structure aren't publicly documented in the provided sources[1][2][4][8], they follow standard Google Cloud API key conventions[8], which are designed to be cryptographically secure and globally unique.

Key characteristics:
- **No fixed visual pattern**: Unlike some API keys with segmented formats (e.g., `xxxx-xxxx-xxxx`), Gemini keys appear as continuous strings[2][8].
- **Case-sensitive**: Likely include both uppercase and lowercase letters[8].
- **Generated exclusively via Google AI Studio**: No manual formatting is required or possible[1][4][5].

Example usage in code (Python):

```python
from google import genai
genai.configure(api_key="AIzaSyB...")  # Placeholder for actual key[3]
```

Security considerations:
- Never expose keys in client-side code or public repositories[2][8].
- Store them in environment variables or secure secret managers[2][5].
- Regenerate compromised keys immediately via Google AI Studio[8].

For verification, you can test your key with a curl command:

```bash
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash:generateContent?key=${GEMINI_API_KEY}" ...
```

A valid key returns structured JSON responses, while invalid ones trigger authentication errors[2][6].

Citations:
- 1: https://www.youtube.com/watch?v=I0LMGNAfA1k
- 2: https://ai.google.dev/gemini-api/docs/api-key
- 3: https://ai.google.dev/gemini-api/docs
- 4: https://www.merge.dev/blog/gemini-api-key
- 5: https://docs.aicontentlabs.com/articles/google-gemini-api-key/
- 6: https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/inference
- 7: https://developers.google.com/idx/guides/build-gemini-api-app
- 8: https://cloud.google.com/api-keys/docs/overview?hl=en
- 9: https://zapier.com/blog/gemini-api/
Google API key validation is appropriate as implemented.
Based on the web query, Google Gemini API keys do indeed start with "AIza" and are generally alphanumeric without a fixed segmentation pattern. The current logic (checking that the key begins with "AIza") aligns with these characteristics. While there isn’t a publicly disclosed fixed length or additional static format beyond the prefix, this check serves as a valid initial validation for keys.
- The check for the "AIza" prefix matches the commonly observed format for Gemini API keys.
- No further formatting rules (e.g., length validation) are publicly documented, so the current validation is sufficient for preventing obvious format errors.
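To make the prefix check concrete, here is a minimal Pydantic v2 style sketch of the kind of validation being discussed; the class and field names are hypothetical and not taken from `secrets_schema.py`:

```python
from pydantic import BaseModel, field_validator


class GoogleSecretExample(BaseModel):  # hypothetical schema for illustration
    api_key: str

    @field_validator("api_key")
    @classmethod
    def check_google_key_prefix(cls, value: str) -> str:
        # Gemini / Google Cloud API keys conventionally start with "AIza".
        if not value.startswith("AIza"):
            raise ValueError("Google API keys are expected to start with 'AIza'")
        return value


# GoogleSecretExample(api_key="AIzaSyB-placeholder")  # passes the prefix check
```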
app/celery/celery_app.py (1)
54-56: LGTM! Explicit queue configuration added.

The explicit task_queues configuration using kombu.Queue is a good practice for clear queue definition.
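For reference, a minimal sketch of what an explicit `task_queues` declaration with `kombu.Queue` typically looks like; the app name, broker URL, and default queue name are placeholders rather than the repository's actual values:

```python
import os

from celery import Celery
from kombu import Queue

# Placeholder broker URL and queue name for illustration only.
queue_name = os.getenv("CELERY_QUEUE_NAME", "dev_process_repository")

celery_app = Celery("example_app", broker="redis://localhost:6379/0")
celery_app.conf.task_queues = (Queue(queue_name),)  # declare the queue explicitly
celery_app.conf.task_default_queue = queue_name     # route tasks there by default
```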
app/celery/tasks/parsing_tasks.py (1)
13-14: LGTM! Enhanced logging configuration for async processes.

Setting logger.propagate = True is appropriate for background and async processes, ensuring logs are properly captured in the parent logger hierarchy.
app/main.py (1)
24-24: LGTM! Potpie API router reintroduced with v2 prefix.

The router is correctly configured with appropriate prefix and tags. The v2 prefix suggests this is a new version of the API, which is a good practice for versioning.
Also applies to: 118-119
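As a reference point, a minimal sketch of mounting a versioned router in FastAPI; the router module, route, prefix, and tag values are illustrative rather than copied from `app/main.py`:

```python
from fastapi import APIRouter, FastAPI

# Hypothetical stand-in for the Potpie API router referenced in the review.
potpie_router = APIRouter()


@potpie_router.get("/health")
async def health() -> dict:
    return {"status": "ok"}


app = FastAPI()
# A versioned prefix keeps older clients working while the /api/v2 surface evolves.
app.include_router(potpie_router, prefix="/api/v2", tags=["potpie"])
```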
app/modules/intelligence/provider/provider_service.py (3)
56-60: LGTM! Provider info added correctly.

The Google provider information is properly added with a clear description.
286-288: LGTM! Provider name handling is complete.

The provider name detection is properly implemented for both ChatGoogleGenerativeAI instances and model names containing "gemini".
Also applies to: 295-296
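To make the detection pattern concrete, a small sketch of the kind of check described above; the function name and fallback logic are assumptions for illustration, not the file's actual implementation:

```python
from langchain_google_genai import ChatGoogleGenerativeAI


def detect_provider(llm: object, model_name: str) -> str:
    """Illustrative provider detection mirroring the review's description."""
    # Either the concrete LangChain class or a "gemini" model name maps to Google.
    if isinstance(llm, ChatGoogleGenerativeAI) or "gemini" in model_name.lower():
        return "google"
    return "unknown"  # placeholder fallback; the real code handles other providers
```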
144-159: Verify Gemini model names.

The model configurations use "gemini-1.5-flash" and "gemini-1.5-pro". Let's verify these are the correct model names.
✅ Verification successful
🌐 Web query:
What are the available Gemini model names in Google's Generative AI?
💡 Result:
Google's Generative AI offers multiple Gemini models across different versions and optimization levels. Here's a breakdown of available model names as of February 2025:

## Core Model Families

**Gemini 2.x Series**
- `gemini-2.0-flash` (General Availability)
- `gemini-2.0-pro` (Experimental)
- `gemini-2.0-flash-thinking` (Experimental)
- `gemini-2.0-flash-lite` (Public Preview)

**Gemini 1.x Series**
- `gemini-1.5-flash` (Auto-updated version)
  - Stable: `gemini-1.5-flash-002`[4]
  - Preview: `gemini-1.5-flash-preview-MMDD`[4]
- `gemini-1.5-pro` (Auto-updated version)
  - Stable: `gemini-1.5-pro-002`[4][6]
  - Preview: `gemini-1.5-pro-preview-MMDD`[4]

## Specialized Variants

**Legacy Models**
- `gemini-1.0-pro-vision` (Multimodal text+image processing)[4]
- `gemini-1.0-pro` (Deprecated after Feb 15, 2025)[6]

**Embeddings & QA**
- `text-embedding-004` (Text embeddings)[6]
- `aqa` (Source-grounded answers)[6]

## Versioning System

Three naming conventions exist:

| Type | Example | Characteristics |
|------|---------|-----------------|
| Stable | `gemini-x.x-pro-002` | Fixed capabilities |
| Auto-updated | `gemini-x.x-pro` | Always latest stable |
| Preview | `gemini-x.x-pro-preview-0409` | Cutting-edge features |

For agentic AI workflows requiring multimodal streaming or tool integration, Gemini 2.x models are recommended[2][10]. Developers should use auto-updated versions (`gemini-x.x-pro`) unless requiring specific feature locking via stable versions[4].

Citations:
- 1: https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models
- 2: https://www.techtarget.com/whatis/feature/Google-Gemini-20-explained-Everything-you-need-to-know
- 3: https://ai.google/get-started/our-models/
- 4: https://firebase.google.com/docs/vertex-ai/gemini-models
- 5: https://pypi.org/project/google-generativeai/
- 6: https://ai.google.dev/gemini-api/docs/models/gemini
- 7: https://en.wikipedia.org/wiki/Gemini_(language_model)
- 8: https://zapier.com/blog/google-gemini/
- 9: https://www.ibm.com/think/topics/google-gemini
- 10: https://deepmind.google/technologies/gemini/
Verification of Gemini Model Names
Based on the latest information from Google’s Generative AI documentation, the model names "gemini-1.5-flash" and "gemini-1.5-pro" in your configuration represent the auto-updated versions of the Gemini 1.x series. These are valid and match the available auto-updated naming convention as of February 2025.
- The auto-updated naming (without additional suffixes) is intended to always track the latest stable release.
- If a specific stable version is required, you might consider using the suffixes (e.g., "gemini-1.5-flash-002" or "gemini-1.5-pro-002").
For the current implementation, the Gemini model names used are correct.
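If further confirmation is ever needed, a small sketch of listing the model names available to a given API key via the `google-generativeai` client; the environment variable name is an assumption, so adjust it to the project's configuration:

```python
import os

import google.generativeai as genai

# Assumes a GOOGLE_API_KEY-style environment variable; adjust to the project's naming.
genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

# Print every model that supports text generation, e.g. "models/gemini-1.5-flash".
for model in genai.list_models():
    if "generateContent" in model.supported_generation_methods:
        print(model.name)
```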
app/modules/key_management/secret_manager.py (2)
48-49: LGTM! Google provider support added correctly.

The secret manager properly handles Google provider secrets with consistent naming convention.
Also applies to: 58-60
58-59: LGTM! The Google provider implementation follows the established pattern.

The secret ID format for the Google provider maintains consistency with other providers.
.gitignore (1)
5-5: LGTM! Appropriate files ignored.

Good addition of:
- `potpievenv`: project-specific virtual environment
- `service_account.json`: Google service account credentials (sensitive)

Also applies to: 7-7
.env.template (1)
30-33: LGTM! Google environment variables added with clear documentation.

Good addition of required Google environment variables with helpful comment about service account credentials.
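As an illustration of how such settings are typically consumed, a minimal sketch of reading Google-related configuration from the environment; the variable names are assumptions, since the template's exact keys are not shown in this review:

```python
import os

# Hypothetical variable names; substitute whatever .env.template actually defines.
google_api_key = os.getenv("GOOGLE_API_KEY")
service_account_path = os.getenv("GOOGLE_APPLICATION_CREDENTIALS", "service_account.json")

if google_api_key is None:
    raise RuntimeError("Google API key is not configured; check your .env file")
```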
start.sh (1)
50-50: Celery Queue Environment Variable Check

The variable name has been updated from "CELERY_QUEUE_NAME" to "Celery_QUEUE_NAME". Note that shellcheck warns this variable is referenced but not assigned. Ensure that "Celery_QUEUE_NAME" is defined (likely via the sourced .env file), or add an explicit check in the script to prevent runtime errors.

For instance, you could add the following snippet before its usage:

```diff
+if [ -z "${Celery_QUEUE_NAME}" ]; then
+  echo "Error: Celery_QUEUE_NAME environment variable not set."
+  exit 1
+fi
```

🧰 Tools
🪛 Shellcheck (0.10.0)
[warning] 50-50: Celery_QUEUE_NAME is referenced but not assigned.
(SC2154)
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Actionable comments posted: 1
🔭 Outside diff range comments (1)
app/modules/key_management/secret_manager.py (1)
191-203: Update provider list in delete_secret method.

The provider list for bulk deletion needs to include "google" to maintain consistency with the new provider support.

Apply this diff:

```diff
 if provider == "all":
-    provider_list = ["openai", "anthropic", "deepseek"]
+    provider_list = ["openai", "anthropic", "deepseek", "google"]
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- app/modules/key_management/secret_manager.py (2 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
app/modules/key_management/secret_manager.py
48-48: SyntaxError: Expected an indented block after function definition
🔇 Additional comments (2)
app/modules/key_management/secret_manager.py (2)
138-140: Update provider Literal type in get_secret method.

The provider Literal type needs to be updated to include "google" for consistency.

Apply this diff:

```diff
 @staticmethod
 def get_secret(
-    provider: Literal["openai", "anthropic", "deepseek"], customer_id: str
+    provider: Literal["openai", "anthropic", "deepseek", "google"], customer_id: str
 ):
```
61-62: LGTM! Google provider support is well implemented.

The implementation follows the established pattern for secret ID generation and maintains consistency with other providers.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Added Google Gemini Integration for Potpie.
(Changes requested in the last PR have been addressed)
Summary by CodeRabbit
New Features
• Integrated support for a Google AI provider, giving users additional model options.
• Re-enabled the Potpie API endpoint for streamlined service interaction.
Chores
• Updated environment settings to include new variables for Google integration.
• Updated installation settings to optimize performance and reliability.
• Refined background processing configurations for smoother task handling.
• Added entries to .gitignore to exclude specific files from version control.