
AWS Bedrock support #379


Open
csheaff opened this issue Sep 11, 2024 · 21 comments · May be fixed by #670
Labels: feature request (Request for a new feature)

Comments

@csheaff commented Sep 11, 2024

Hello, it would be grand to be able to use AWS models from Amazon Bedrock, such as Anthropic's Claude 3.5 Sonnet.

@karthink (Owner)

Can you provide a link to their API documentation?

@csheaff (Author) commented Sep 11, 2024

...looking for a way to use just HTTP requests, but I'm not sure it's possible.

@karthink (Owner)

I'm not familiar with AWS Bedrock. How do you access models (or other computation) running there?

@csheaff (Author) commented Sep 11, 2024

The easiest approaches are to use the AWS CLI on the command line or a Python SDK. But I'm guessing what would be most convenient here is being able to send HTTPS requests from Lisp.

Relevant? pokepay/aws-sdk-lisp#35

@csheaff (Author) commented Sep 11, 2024

This is as close as I can find to the payload structure:

https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html

On the command line one would do:

```sh
aws bedrock-runtime converse \
  --model-id amazon.titan-text-express-v1 \
  --messages '[{"role": "user", "content": [{"text": "Describe the purpose of a \"hello world\" program in one line."}]}]' \
  --inference-config '{"maxTokens": 512, "temperature": 0.5, "topP": 0.9}'
```

@karthink (Owner)

> This is as close as I can find to the payload structure:
>
> https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html

This makes it seem like you can make HTTP requests? Sorry, I'm not understanding how this service is structured.

If you can make HTTP requests, I can add support for it to gptel.

@csheaff (Author) commented Sep 11, 2024

Yes, I think so. One could authenticate using environment variables:

AWS_SESSION_TOKEN, AWS_SECRET_ACCESS_KEY, AWS_ACCESS_KEY_ID, AWS_REGION

...or just have the user enter them during configuration. In my case I have to renew my credentials often for security reasons, so being able to update them without closing Emacs and re-exporting the variables in the terminal would be a plus.
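
A minimal sketch of that workflow, assuming nothing about gptel itself; the command name is hypothetical and just calls setenv so that child processes such as curl see the fresh credentials:

```elisp
;; Hypothetical helper, not part of gptel: refresh AWS credentials in a
;; running Emacs so that subprocesses (e.g. curl) pick up the new values.
(defun my/aws-refresh-credentials (key-id secret token)
  "Set the AWS credential environment variables for this Emacs session."
  (interactive "sAWS_ACCESS_KEY_ID: \nsAWS_SECRET_ACCESS_KEY: \nsAWS_SESSION_TOKEN: ")
  (setenv "AWS_ACCESS_KEY_ID" key-id)
  (setenv "AWS_SECRET_ACCESS_KEY" secret)
  (setenv "AWS_SESSION_TOKEN" token))
```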

@karthink (Owner)

Sorry, I don't follow how environment variables are relevant to making an HTTP request.

Is there a curl command you can run to receive model responses from AWS Bedrock?

@swapneils

> Is there a curl command you can run to receive model responses from AWS Bedrock?

Curl has native support for the AWS signing method (see e.g. this article), so this should be possible.
This Reddit discussion seems to have a sample curl command to invoke Bedrock.
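
A sketch of what such an invocation could look like, assuming curl >= 7.75.0 (which added --aws-sigv4) and the publicly documented Converse endpoint; the region, model, and payload below are placeholders:

```sh
curl "https://bedrock-runtime.us-east-1.amazonaws.com/model/amazon.titan-text-express-v1/converse" \
  --aws-sigv4 "aws:amz:us-east-1:bedrock" \
  --user "$AWS_ACCESS_KEY_ID:$AWS_SECRET_ACCESS_KEY" \
  -H "x-amz-security-token: $AWS_SESSION_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": [{"text": "Hello!"}]}]}'
```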

On a separate note, it would be nice to support CLI backends as well as REST HTTP backends, since then the work of actually calling a service can be offloaded to its native CLI tools (q chat, llama-cli, etc).

Note, since I'm employed by AWS: the above is purely my own knowledge and opinions, and not communication on behalf of my employer. This also applies to all future communications in this thread unless explicitly specified otherwise.

@karthink (Owner)

@swapneils Thank you for the pointer -- this should be possible now.

@csheaff So this can be done using curl, but someone will need to write an AWS Bedrock backend for gptel. Unfortunately we can't inherit the OpenAI backend since the payload structure is different. PRs are welcome; you can copy gptel-openai.el or gptel-anthropic.el and modify it.

@csheaff (Author) commented Oct 18, 2024

Thanks @karthink. I'll try to find some time, but it might be tough.

@JGalego commented Oct 29, 2024

@karthink @csheaff Found this issue by accident. Maybe this will be useful for a future implementation: cl-bedrock? It includes support for the InvokeModel, Converse and ApplyGuardrail APIs.

@felipeochoa

I'm working on this and almost have a first version ready. However, I don't fully understand the point of gptel--parse-buffer. Can you explain why that needs to be generic? Am I correct in understanding that it's only used in the chat buffer opened by M-x gptel?

@karthink (Owner)

> I'm working on this and almost have a first version ready. However, I don't fully understand the point of gptel--parse-buffer. Can you explain why that needs to be generic?

@felipeochoa the input to gptel--parse-buffer is a buffer position to scan backwards from, up to the start of the (possibly narrowed) buffer. It collects user text and LLM responses into an array of messages and returns this array. The format of this array is API-specific, so each API uses a different implementation. The Bedrock API must have a specification too; you'll need to create an array of messages following this spec.

To see what the messages array looks like for the active backend, you can run (gptel--parse-buffer gptel-backend (point-max)) in a chat buffer.
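
For a Bedrock backend, a hand-written sketch (not actual gptel output) of what that array might look like, as plists that json-serialize would turn into the documented Converse JSON:

```elisp
;; Illustrative only: one user turn and one assistant turn in the
;; Converse message shape, [{"role": ..., "content": [{"text": ...}]}].
[(:role "user"
  :content [(:text "Describe the purpose of a \"hello world\" program in one line.")])
 (:role "assistant"
  :content [(:text "It verifies that the toolchain can build and run a program.")])]
```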

> Am I correct in understanding that it's only used in the chat buffer opened by M-x gptel?

It's used in all situations except when gptel-request is given an explicit prompt (string or list of strings) to use instead. By default, gptel does not distinguish between chat and non-chat buffers.

@felipeochoa

Ah, thanks for that explanation. That makes sense now. I didn't quite get to finish the stream implementation, and it's totally untested, but I think the main pieces are in place in f6b8f41. The one change to the internal API I had to make was to allow :curl-args to be a function, so that the backend can inject the AWS credentials afresh on each request. That's a small change in a7bd580.
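
Roughly, the idea is something like the sketch below; the function name is illustrative (not gptel's final API) and leans on curl's built-in SigV4 signing:

```elisp
;; Illustrative :curl-args function, evaluated per request so that
;; rotated AWS credentials are read afresh from the environment.
(defun my/gptel-bedrock-curl-args ()
  "Return curl arguments carrying the current AWS SigV4 credentials."
  (list (format "--aws-sigv4=aws:amz:%s:bedrock" (getenv "AWS_REGION"))
        (format "--user=%s:%s"
                (getenv "AWS_ACCESS_KEY_ID")
                (getenv "AWS_SECRET_ACCESS_KEY"))
        (format "--header=x-amz-security-token: %s"
                (getenv "AWS_SESSION_TOKEN"))))
```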

If anyone understands how to handle the ConverseStream response (MIME type application/vnd.amazon.eventstream, it seems?), pointers would be helpful. ChatGPT insists that it's a plain JSON stream...
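
(For reference: application/vnd.amazon.eventstream is a binary framing format, not plain JSON. Each message starts with a 12-byte prelude of big-endian integers: total frame length, headers length, and a prelude CRC, followed by headers, the JSON payload, and a trailing message CRC. A minimal sketch of reading a frame's length, assuming a unibyte buffer holding the raw response bytes:)

```elisp
;; Sketch: read the big-endian 32-bit total length of the event-stream
;; frame that starts at POS in a unibyte buffer of raw response bytes.
(defun my/aws-eventstream-frame-length (pos)
  "Return the total byte length of the frame starting at POS."
  (logior (ash (char-after pos) 24)
          (ash (char-after (+ pos 1)) 16)
          (ash (char-after (+ pos 2)) 8)
          (char-after (+ pos 3))))
```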

@felipeochoa commented Feb 22, 2025

Made a bit of progress today: 8e21474. The main things missing are media handling and e2e testing!

@karthink if you have a chance to look at gptel-curl--parse-stream and how I'm using a marker to keep track of what's been parsed, I'd appreciate that. I think you do something similar using (point) instead of a separate marker. I opted for a new marker as a precaution against issues like #261. The main thing I'm unsure about is whether we can assume that every streaming response gets a fresh buffer, or whether buffers can be reused across requests. (nvm, figured this out)

@felipeochoa

Continuing to chip away at this in 65079f8. The good news is that basic messages and streaming work! I'm debugging tool use, and have half the media handling set up. I did have to add one more change to the internal API, to allow running the curl process with 'binary coding. That's in 2378b96.

@felipeochoa linked a pull request Feb 28, 2025 that will close this issue
@felipeochoa

PR at #670

@karthink (Owner) commented Mar 13, 2025

> @karthink if you have a chance to look at gptel-curl--parse-stream and how I'm using a marker to keep track of what's been parsed, I'd appreciate that. I think you do something similar using (point) instead of a separate marker. I opted for a new marker as a precaution against issues like #261. The main thing I'm unsure about is whether we can assume that every streaming response gets a fresh buffer, or whether buffers can be reused across requests. (nvm, figured this out)

I'm looking at it now. I don't think you need gptel-bedrock--stream-cursor or for it to be a marker, but there's no harm I guess.

You can also use (process-mark).

@pavloo commented Apr 13, 2025

It would be great to have this feature added to gptel. I'm also using this comment as an opportunity to thank everyone involved in the development of gptel, and especially @karthink. You guys are amazing <3
