Gemini Models integration #252

Closed

Conversation

@subashrijal5 commented Dec 2, 2024

Why

  • We need to expand our LLM client support to include Google's Generative AI alongside our existing Anthropic client (see the sketch below)

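In practice this means wrapping Google's official Node SDK the same way the existing client wraps Anthropic's. A minimal sketch of the Google SDK call in isolation, assuming the @google/generative-ai package (illustrative only, not the PR's actual code):

```ts
import { GoogleGenerativeAI } from "@google/generative-ai";

// Hypothetical env var name; use whatever key management the project already has.
const genAI = new GoogleGenerativeAI(process.env.GOOGLE_API_KEY!);

// Model name mirrors the default used in this PR's example.
const model = genAI.getGenerativeModel({ model: "gemini-1.5-flash" });

const result = await model.generateContent("Describe this page in one sentence.");
console.log(result.response.text());
```
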
What Changed

changeset-bot (bot) commented Dec 2, 2024

🦋 Changeset detected

Latest commit: 9dccfdc

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
Name: @browserbasehq/stagehand
Type: Patch

@kamath added this to the Extensibility milestone Dec 2, 2024
@kamath added the enhancement label Dec 2, 2024
@kamath (Member) commented Dec 2, 2024

Thanks so much for taking the time to do this! Curious if you ran any evals already on the different models and how they performed? Transparently, we're trying to limit our scope of LLMs for now to figure out how we want to better incorporate extensibility, so there's a good chance we might refactor a lot of this code. That being said, don't want your contributions and time spent to go to waste, so just want to make sure this has meaningful value-add over OpenAI/Anthropic

@@ -9,7 +9,12 @@ async function example() {
     enableCaching: false,
   });
 
-  await stagehand.init();
+  await stagehand.init({
+    modelName: "gemini-1.5-flash",

Recommend updating this default to the newly released "gemini-2.0-flash-exp"
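
With that suggestion applied, the example's init call would become (a sketch of the proposed change, not committed code):

```ts
await stagehand.init({
  modelName: "gemini-2.0-flash-exp",
});
```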

@vladionescu commented Dec 12, 2024

Gemini has an OpenAI-compatible API endpoint, as well as their own client SDK, which is what this PR implements. I prefer using their own SDK; just flagging this in case it becomes unstable (Google has the genai Python SDK in beta alongside the generativeai SDK...).
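
For comparison, the OpenAI-compatible route needs nothing beyond the openai package; a minimal sketch (the base URL and model name come from Google's compatibility docs, so treat them as assumptions):

```ts
import OpenAI from "openai";

// Gemini's OpenAI-compatible endpoint, authenticated with a Gemini API key.
const client = new OpenAI({
  apiKey: process.env.GEMINI_API_KEY,
  baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/",
});

const completion = await client.chat.completions.create({
  model: "gemini-2.0-flash-exp",
  messages: [{ role: "user", content: "Summarize this page." }],
});
console.log(completion.choices[0].message.content);
```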

@kamath in my unscientific experiments, Gemini 2.0 Flash is capable, excels at following instructions (less fiddling with system prompts needed), and is very fast. Definitely a contender for this use case.

@kamath (Member) commented Dec 12, 2024

Thanks @vladionescu! We'll keep this in mind. We're investing heavily in eval and debugging tools right now, so once we have a good baseline, we're likely to use a more standardized model router. I'm hesitant to add more models just yet until we have a better sense of failure modes, i.e., whether a failure came from Stagehand or from the model.

@kamath (Member) commented Jan 7, 2025

Hey! Closing this since we have #352 that allows you to add your own custom LLMClient.

We also have #382 in flight to add the Vercel AI SDK, which would then support Gemini as well.
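
Once #382 lands, Gemini would come along through the AI SDK's Google provider; roughly (a sketch assuming the @ai-sdk/google package, not Stagehand's final API):

```ts
import { google } from "@ai-sdk/google";
import { generateText } from "ai";

// The provider reads GOOGLE_GENERATIVE_AI_API_KEY from the environment by default.
const { text } = await generateText({
  model: google("gemini-1.5-flash"),
  prompt: "Extract the main heading from this page.",
});
console.log(text);
```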

Thank you so much for the time and effort that went into this PR though!!! Sincerely appreciate it :)

@kamath closed this Jan 7, 2025
Labels: enhancement
Project status: Done

3 participants