
AWS Bedrock support #379


Open
csheaff opened this issue Sep 11, 2024 · 21 comments · May be fixed by #670
Labels: feature request (Request for a new feature)

Comments

@csheaff commented Sep 11, 2024

Hello, it would be grand to be able to use AWS models from Amazon Bedrock, such as Anthropic's Claude 3.5 Sonnet.

@karthink (Owner)

Can you provide a link to their API documentation?

@csheaff (Author) commented Sep 11, 2024

...looking for a way to use just HTTP requests, but I'm not sure it's possible.

@karthink (Owner)

I'm not familiar with AWS Bedrock. How do you access models (or other computation) running there?

@csheaff (Author) commented Sep 11, 2024

The easiest approaches are to use the AWS CLI on the command line or a Python SDK. But I'm guessing what would be most convenient here is being able to send HTTPS requests from Lisp.

Relevant? pokepay/aws-sdk-lisp#35

@csheaff (Author) commented Sep 11, 2024

This is as close as I can find to the payload structure:

https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html

On the command line one would do:

```sh
aws bedrock-runtime converse \
  --model-id amazon.titan-text-express-v1 \
  --messages '[{"role": "user", "content": [{"text": "Describe the purpose of a \"hello world\" program in one line."}]}]' \
  --inference-config '{"maxTokens": 512, "temperature": 0.5, "topP": 0.9}'
```

@karthink (Owner)

> This is as close as I can find to the payload structure:
>
> https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html

This makes it seem like you can make HTTP requests? Sorry, I'm not understanding how this service is structured.

If you can make HTTP requests, I can add support for it to gptel.

@csheaff (Author) commented Sep 11, 2024

Yes, I think so. One could authenticate using environment variables:

AWS_SESSION_TOKEN, AWS_SECRET_ACCESS_KEY, AWS_ACCESS_KEY_ID, AWS_REGION

...or just have the user enter them during configuration. In my case I have to renew my credentials often for security reasons, so being able to update them without closing Emacs and re-exporting the variables in the terminal would be a plus.
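
A minimal sketch of that workflow, assuming nothing about gptel itself; the command name is hypothetical and just calls setenv so that child processes such as curl see the fresh credentials:

```elisp
;; Hypothetical helper, not part of gptel: refresh AWS credentials in a
;; running Emacs so that subprocesses (e.g. curl) pick up the new values.
(defun my/aws-refresh-credentials (key-id secret token)
  "Set the AWS credential environment variables for this Emacs session."
  (interactive "sAWS_ACCESS_KEY_ID: \nsAWS_SECRET_ACCESS_KEY: \nsAWS_SESSION_TOKEN: ")
  (setenv "AWS_ACCESS_KEY_ID" key-id)
  (setenv "AWS_SECRET_ACCESS_KEY" secret)
  (setenv "AWS_SESSION_TOKEN" token))
```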

@karthink (Owner)

Sorry, I don't follow how environment variables are relevant to making an HTTP request.

Is there a curl command you can run to receive model responses from AWS Bedrock?

@swapneils

> Is there a curl command you can run to receive model responses from AWS Bedrock?

Curl has native support for the AWS signing method (see e.g. this article), so this should be possible.
This Reddit discussion seems to have a sample curl command to invoke Bedrock.
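
A sketch of what such an invocation could look like, assuming curl >= 7.75.0 (which added --aws-sigv4) and the publicly documented Converse endpoint; the region, model, and payload below are placeholders:

```sh
curl "https://bedrock-runtime.us-east-1.amazonaws.com/model/amazon.titan-text-express-v1/converse" \
  --aws-sigv4 "aws:amz:us-east-1:bedrock" \
  --user "$AWS_ACCESS_KEY_ID:$AWS_SECRET_ACCESS_KEY" \
  -H "x-amz-security-token: $AWS_SESSION_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": [{"text": "Hello!"}]}]}'
```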

On a separate note, it would be nice to support CLI backends as well as REST HTTP backends, since then the work of actually calling a service can be offloaded to its native CLI tools (q chat, llama-cli, etc).

Note, since I'm employed by AWS: the above is purely my own knowledge and opinions, and not communication on behalf of my employer. This also applies to all future communications in this thread unless explicitly specified otherwise.

@karthink (Owner)

@swapneils Thank you for the pointer -- this should be possible now.

@csheaff So this can be done using curl, but someone will need to write an AWS Bedrock backend for gptel. Unfortunately we can't inherit the OpenAI backend since the payload structure is different. PRs are welcome; you can copy gptel-openai.el or gptel-anthropic.el and modify it.

@csheaff (Author) commented Oct 18, 2024

Thanks @karthink. I'll try to find some time, but it might be tough.

@JGalego commented Oct 29, 2024

@karthink @csheaff Found this issue by accident. Maybe this will be useful for a future implementation: cl-bedrock? It includes support for the InvokeModel, Converse and ApplyGuardrail APIs.

@felipeochoa

I'm working on this and almost have a first version ready. However, I don't fully understand the point of gptel--parse-buffer. Can you explain why that needs to be generic? Am I correct in understanding that it's only used in the chat buffer opened by M-x gptel?

@karthink (Owner)

> I'm working on this and almost have a first version ready. However, I don't fully understand the point of gptel--parse-buffer. Can you explain why that needs to be generic?

@felipeochoa the input to gptel--parse-buffer is a buffer position to scan backwards from, up to the start of the (possibly narrowed) buffer. It collects user text and LLM responses into an array of messages and returns this array. The format of this array is API-specific, so each API uses a different implementation. The Bedrock API must have a specification too; you'll need to create an array of messages following this spec.

To see what the messages array looks like for the active backend, you can run (gptel--parse-buffer gptel-backend (point-max)) in a chat buffer.
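
For a Bedrock backend, a hand-written sketch (not actual gptel output) of what that array might look like, as plists that json-serialize would turn into the documented Converse JSON:

```elisp
;; Illustrative only: one user turn and one assistant turn in the
;; Converse message shape, [{"role": ..., "content": [{"text": ...}]}].
[(:role "user"
  :content [(:text "Describe the purpose of a \"hello world\" program in one line.")])
 (:role "assistant"
  :content [(:text "It verifies that the toolchain can build and run a program.")])]
```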

> Am I correct in understanding that it's only used in the chat buffer opened by M-x gptel?

It's used in all situations except when gptel-request is given an explicit prompt (string or list of strings) to use instead. By default, gptel does not distinguish between chat and non-chat buffers.

@felipeochoa

Ah, thanks for that explanation. That makes sense now. I didn't quite get to finish the stream implementation, and it's totally untested, but I think the main pieces are in place in f6b8f41. The one change to the internal API I had to make was to allow :curl-args to be a function, so that the backend can inject the AWS credentials afresh on each request. That's a small change in a7bd580.
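
Roughly, the idea is something like the sketch below; the function name is illustrative (not gptel's final API) and leans on curl's built-in SigV4 signing:

```elisp
;; Illustrative :curl-args function, evaluated per request so that
;; rotated AWS credentials are read afresh from the environment.
(defun my/gptel-bedrock-curl-args ()
  "Return curl arguments carrying the current AWS SigV4 credentials."
  (list (format "--aws-sigv4=aws:amz:%s:bedrock" (getenv "AWS_REGION"))
        (format "--user=%s:%s"
                (getenv "AWS_ACCESS_KEY_ID")
                (getenv "AWS_SECRET_ACCESS_KEY"))
        (format "--header=x-amz-security-token: %s"
                (getenv "AWS_SESSION_TOKEN"))))
```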

If anyone understands how to handle the ConverseStream response (MIME type application/vnd.amazon.eventstream, it seems?), pointers would be helpful. ChatGPT insists that it's a plain JSON stream...
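
(For reference: application/vnd.amazon.eventstream is a binary framing format, not plain JSON. Each message starts with a 12-byte prelude of big-endian integers: total frame length, headers length, and a prelude CRC, followed by headers, the JSON payload, and a trailing message CRC. A minimal sketch of reading a frame's length, assuming a unibyte buffer holding the raw response bytes:)

```elisp
;; Sketch: read the big-endian 32-bit total length of the event-stream
;; frame that starts at POS in a unibyte buffer of raw response bytes.
(defun my/aws-eventstream-frame-length (pos)
  "Return the total byte length of the frame starting at POS."
  (logior (ash (char-after pos) 24)
          (ash (char-after (+ pos 1)) 16)
          (ash (char-after (+ pos 2)) 8)
          (char-after (+ pos 3))))
```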

@felipeochoa commented Feb 22, 2025

Made a bit of progress today: 8e21474. The main things missing are media handling and e2e testing!

@karthink if you have a chance to look at gptel-curl--parse-stream and how I'm using a marker to keep track of what's been parsed, I'd appreciate that. I think you do something similar using (point) instead of a separate marker. I opted for a new marker as a precaution against issues like #261. The main thing I'm unsure about is whether we can assume that every streaming response gets a fresh buffer, or whether buffers can be reused across requests. (nvm, figured this out)

@felipeochoa

Continuing to chip away at this in 65079f8. The good news is that basic messages and streaming work! I'm debugging tool use, and have half the media handling set up. I did have to add one more change to the internal API, to allow running the curl process with 'binary coding. That's in 2378b96.

@felipeochoa linked a pull request Feb 28, 2025 that will close this issue
@felipeochoa

PR at #670

@karthink (Owner) commented Mar 13, 2025

> @karthink if you have a chance to look at gptel-curl--parse-stream and how I'm using a marker to keep track of what's been parsed, I'd appreciate that. I think you do something similar using (point) instead of a separate marker. I opted for a new marker as a precaution against issues like #261. The main thing I'm unsure about is whether we can assume that every streaming response gets a fresh buffer, or whether buffers can be reused across requests. (nvm, figured this out)

I'm looking at it now. I don't think you need gptel-bedrock--stream-cursor or for it to be a marker, but there's no harm I guess.

You can also use (process-mark).

@pavloo commented Apr 13, 2025

It would be great to have this feature added to gptel. I'm also using this comment as an opportunity to thank everyone involved in the development of gptel, and especially @karthink. You guys are amazing <3
