Unable to use Gemini model with browser extract command #42

Open
balioglum opened this issue Feb 19, 2025 · 1 comment

Comments

@balioglum

Description

When using the browser extract command with a Gemini model, the command fails with an error about a missing API key, even though the key was supplied in several different ways.

Environment

  • cursor-tools version: 0.5.1-alpha.2
  • OS: macOS (Darwin 24.2.0)
  • Node.js version: v20.18.1

Steps to Reproduce

  1. Install cursor-tools@0.5.1-alpha.2:
npm install -g cursor-tools@0.5.1-alpha.2
  2. Set up configuration in cursor-tools.config.json:
{
  "stagehand": {
    "provider": "google",
    "model": "gemini-2.0-pro-exp-02-05"
  }
}
  3. Try running the command with various API key configurations:
# Attempt 1: With GOOGLE_API_KEY
GOOGLE_API_KEY=<api_key> cursor-tools browser extract "query" --url="https://example.com" --model=gemini-2.0-pro-exp-02-05

# Attempt 2: With both environment variables
STAGEHAND_PROVIDER=google GOOGLE_API_KEY=<api_key> cursor-tools browser extract "query" --url="https://example.com" --model=gemini-2.0-pro-exp-02-05

# Attempt 3: With LLM_API_KEY
LLM_API_KEY=<api_key> cursor-tools browser extract "query" --url="https://example.com" --model=gemini-2.0-pro-exp-02-05
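
To rule out a shell-level problem before blaming the tool, it can help to confirm which of these variables a child process would actually inherit. The following is a minimal sketch, not part of cursor-tools: the helper name report_var and the script name check-env.sh are hypothetical, and the variable names are simply the ones tried in the attempts above. It prints only whether each variable is set, never its value.

```shell
#!/bin/sh
# Hypothetical helper (not part of cursor-tools): report whether a variable
# is exported with a non-empty value, without printing the secret itself.
report_var() {
  # printenv only sees exported variables, i.e. what a child process
  # such as cursor-tools would inherit.
  if [ -n "$(printenv "$1")" ]; then
    echo "$1: set"
  else
    echo "$1: unset or empty"
  fi
}

# Variable names taken from the three attempts in this issue.
for name in GOOGLE_API_KEY LLM_API_KEY STAGEHAND_PROVIDER; do
  report_var "$name"
done
```

Saved as check-env.sh, it would be run with the same prefix as the failing command, for example `GOOGLE_API_KEY=<api_key> sh check-env.sh`, because a variable assigned inline like this is visible only to that one command. If the variable shows as set here and the extract command still fails, the key is reaching the process and the problem lies further down the stack.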

Current Behavior

The command fails with the error:

ExtractionSchemaError: No LLM API key or LLM Client configured. An LLM API key or a custom LLM Client is required to use act, extract, or observe.

This happens even though the model name and an API key both appear in the logged Stagehand config:

Warning: Using unfamiliar model "gemini-2.0-pro-exp-02-05"...
using stagehand config {
  env: "LOCAL",
  headless: true,
  verbose: 0,
  debugDom: false,
  modelName: "gemini-2.0-pro-exp-02-05",
  apiKey: "REDACTED",
  ...
}

Expected Behavior

The command should successfully use the Gemini model with the provided API key to perform the extraction.

Questions

  1. What is the correct way to configure the API key for Gemini model usage?
  2. Are there specific environment variable names that should be used for Gemini integration?
  3. Is there additional configuration needed to enable Gemini model support?

Additional Context

  • The API key is valid and works with other Google services
  • The same setup works with the repo command
  • We've tried multiple combinations of environment variables and configuration file settings
@eastlondoner
Owner

I'm afraid this is because Stagehand does not support Google/Gemini at the moment. We would have to implement our own Gemini client for Stagehand to support this. You can raise this with Stagehand here: https://github.com/browserbase/stagehand

I will keep the issue open, though, in case we decide to implement a Gemini client (I have implemented a Groq client for Stagehand) or someone finds a Gemini client for Stagehand that we can use.
