Fix/apigateway timeout (#68)
closes #63 

* refactor:backend directory structure

* fix: readme

* wip

* enable hot reload

* wip

* change: detection of end of streaming

* update: max token

* fix

* fix

* change temperature

---------

Co-authored-by: Yusuke Wada <[email protected]>
statefb and wadabee authored Oct 24, 2023
1 parent bd7cca3 commit ae995cf
Showing 28 changed files with 180 additions and 184 deletions.
12 changes: 6 additions & 6 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,11 +1,13 @@
# Bedrock Claude Chat

![](https://github.com/aws-samples/bedrock-claude-chat/actions/workflows/test.yml/badge.svg)

The Japanese version is [here](./docs/README_ja.md)

> **Warning**
> The current version (`v0.2.x`) is not compatible with the previous version (`v0.1.0`) because the conversation schema changed. Note that conversations stored in DynamoDB by the previous version cannot be rendered.
This repository is a sample chatbot using the Anthropic company's LLM [Claude 2](https://www.anthropic.com/index/claude-2), one of the foundational models provided by [Amazon Bedrock](https://aws.amazon.com/bedrock/) for generative AI. This sample is currently developed for use by Japanese speakers, but it is also possible to speak to the chatbot in English.
This repository is a sample chatbot using the Anthropic company's LLM [Claude 2](https://www.anthropic.com/index/claude-2), one of the foundational models provided by [Amazon Bedrock](https://aws.amazon.com/bedrock/) for generative AI. This sample is currently developed for use by Japanese speakers, but it is also possible to speak to the chatbot in English. **I18n is under development and will be released soon.**

![](./docs/imgs/demo_en.png)
![](./docs/imgs/demo2.gif)
@@ -16,6 +18,7 @@ It's an architecture built on AWS managed services, eliminating the need for inf

- [Amazon DynamoDB](https://aws.amazon.com/dynamodb/): NoSQL database for conversation history storage
- [Amazon API Gateway](https://aws.amazon.com/api-gateway/) + [AWS Lambda](https://aws.amazon.com/lambda/): Backend API endpoint ([AWS Lambda Web Adapter](https://github.com/awslabs/aws-lambda-web-adapter), [FastAPI](https://fastapi.tiangolo.com/))
- [Amazon SNS](https://aws.amazon.com/sns/): Used to decouple streaming calls between API Gateway and Bedrock because streaming responses can take over 30 seconds in total, exceeding the limitations of HTTP integration (See [quota](https://docs.aws.amazon.com/apigateway/latest/developerguide/limits.html)).
- [Amazon CloudFront](https://aws.amazon.com/cloudfront/) + [S3](https://aws.amazon.com/s3/): Frontend application delivery ([React](https://react.dev/), [Tailwind CSS](https://tailwindcss.com/))
- [AWS WAF](https://aws.amazon.com/waf/): IP address restriction
- [Amazon Cognito](https://aws.amazon.com/cognito/): User authentication
@@ -34,8 +37,8 @@ It's an architecture built on AWS managed services, eliminating the need for inf
- [x] Streaming Response
- [x] IP address restriction
- [x] Edit message & re-send
- [ ] Save and re-use prompt template
- [ ] I18n (English / Japanese)
- [ ] Save and re-use prompt template

## Deployment

@@ -128,7 +131,7 @@ BedrockChatStack.FrontendURL = https://xxxxx.cloudfront.net

### Configure text generation parameters

Edit [config.py](./backend/common/config.py) and run `cdk deploy`.
Edit [config.py](./backend/app/config.py) and run `cdk deploy`.

```py
GENERATION_CONFIG = {
@@ -158,13 +161,10 @@ cd frontend && npm run dev
Currently, the environment variable `VITE_APP_USE_STREAMING` is specified on the frontend side. It's recommended to set it to `false` when running the backend locally and `true` when operating on AWS.
When streaming is enabled, generated text is displayed in real time as the results are streamed.


### Local development using docker compose

[docker-compose.yml](./docker-compose.yml) allows you to run and develop frontend/backend APIs/DynamoDB Local in your local environment.

Note: Hot reloading is supported only on the frontend, not on the backend API, because the source code cannot be mounted due to the directory structure.

```bash
# Build containers
docker compose build
11 changes: 5 additions & 6 deletions backend/api/Dockerfile → backend/Dockerfile
@@ -1,16 +1,15 @@
FROM public.ecr.aws/docker/library/python:3.11.4-slim-bullseye
FROM public.ecr.aws/docker/library/python:3.11.6-slim-bullseye

# Install lambda web adapter
COPY --from=public.ecr.aws/awsguru/aws-lambda-adapter:0.7.0 /lambda-adapter /opt/extensions/lambda-adapter

WORKDIR /app
WORKDIR /backend

COPY ./api/requirements.txt ./
COPY ./requirements.txt ./
RUN pip3 install -r requirements.txt --no-cache-dir

COPY ./common .
COPY ./api .
COPY ./app ./app

ENV PORT=8000
EXPOSE ${PORT}
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
10 changes: 5 additions & 5 deletions backend/README.md
@@ -5,10 +5,10 @@ Written in Python with [FastAPI](https://fastapi.tiangolo.com/).
## Unit test on local

```
cd backend/common
cd backend
python3 -m venv .venv
source .venv/bin/activate
pip install -r ../api/requirements.txt
pip install -r ./requirements.txt
```

```
@@ -19,7 +19,7 @@ export BEDROCK_REGION=us-east-1
```

```
python repositories/test_conversation.py TestConversationRepository
python test_bedrock.py
python test_usecase.py
python tests/test_conversation.py TestConversationRepository
python tests/test_bedrock.py
python tests/test_usecase.py
```
File renamed without changes.
6 changes: 2 additions & 4 deletions backend/common/bedrock.py → backend/app/bedrock.py
@@ -1,9 +1,7 @@
import json
import os

import boto3
from config import GENERATION_CONFIG
from utils import get_bedrock_client
from app.config import GENERATION_CONFIG
from app.utils import get_bedrock_client

client = get_bedrock_client()

6 changes: 3 additions & 3 deletions backend/common/config.py → backend/app/config.py
@@ -2,9 +2,9 @@
# Please adjust these values to suit your application.
# Reference: https://docs.anthropic.com/claude/reference/complete_post
GENERATION_CONFIG = {
"max_tokens_to_sample": 500,
"temperature": 0.0,
"max_tokens_to_sample": 2000,
"temperature": 0.6,
"top_k": 250,
"top_p": 0.999,
"stop_sequences": ["Human: ", "Assistant: "],
}
}
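The diff does not show how `GENERATION_CONFIG` reaches the model (the `_create_body` function in `app/bedrock.py` is imported elsewhere but its body is not included here). As a rough sketch only, assuming the parameters are merged into a JSON request body next to a `prompt` field, the new values could be used like this. `build_request_body` is a hypothetical name for illustration, not the repository's function.

```python
import json

# Values copied from the new side of the config.py diff.
GENERATION_CONFIG = {
    "max_tokens_to_sample": 2000,
    "temperature": 0.6,
    "top_k": 250,
    "top_p": 0.999,
    "stop_sequences": ["Human: ", "Assistant: "],
}


def build_request_body(prompt: str, config: dict = GENERATION_CONFIG) -> str:
    """Hypothetical helper: merge a prompt with the generation parameters.

    This is an assumption about how the config is consumed; the real
    `_create_body` is not visible in this commit view.
    """
    body = {"prompt": prompt, **config}
    return json.dumps(body)


if __name__ == "__main__":
    # Print the serialized body for a one-turn prompt.
    print(build_request_body("\n\nHuman: Hello\n\nAssistant:"))
```

Keeping the parameters in one dictionary means that raising `max_tokens_to_sample` or `temperature`, as this commit does, needs no change at the call site.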
10 changes: 5 additions & 5 deletions backend/api/main.py → backend/app/main.py
@@ -3,19 +3,19 @@
import traceback
from typing import Callable

from auth import verify_token
from app.auth import verify_token
from app.repositories.conversation import RecordNotFoundError
from app.route import router
from app.route_schema import User
from app.utils import is_running_on_lambda
from fastapi import Depends, FastAPI, HTTPException, Request, status
from fastapi.exceptions import RequestValidationError
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import JSONResponse
from fastapi.security import HTTPAuthorizationCredentials, HTTPBearer
from jose import JWTError
from repositories.conversation import RecordNotFoundError
from route import router
from route_schema import User
from starlette.routing import Match
from starlette.types import ASGIApp, Message
from utils import is_running_on_lambda

CORS_ALLOW_ORIGINS = os.environ.get("CORS_ALLOW_ORIGINS", "*")

@@ -5,10 +5,9 @@
from decimal import Decimal as decimal

import boto3
from app.repositories.model import ContentModel, ConversationModel, MessageModel
from boto3.dynamodb.conditions import Key

from .model import ContentModel, ConversationModel, MessageModel

DDB_ENDPOINT_URL = os.environ.get("DDB_ENDPOINT_URL")
TABLE_NAME = os.environ.get("TABLE_NAME", "")
ACCOUNT = os.environ.get("ACCOUNT", "")
Expand Down Expand Up @@ -37,12 +36,17 @@ def _get_table_client(user_id: str):
Ref: https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_policies_examples_dynamodb_items.html
"""
if "AWS_EXECUTION_ENV" not in os.environ:
# NOTE: This is for local development using DynamoDB Local
dynamodb = boto3.resource("dynamodb",
endpoint_url=DDB_ENDPOINT_URL,
aws_access_key_id="key",
aws_secret_access_key="key",
region_name="us-east-1")
if DDB_ENDPOINT_URL:
# NOTE: This is for local development using DynamoDB Local
dynamodb = boto3.resource(
"dynamodb",
endpoint_url=DDB_ENDPOINT_URL,
aws_access_key_id="key",
aws_secret_access_key="key",
region_name="us-east-1",
)
else:
dynamodb = boto3.resource("dynamodb")
return dynamodb.Table(TABLE_NAME)

policy_document = {
File renamed without changes.
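The `_get_table_client` change above adds a third case: outside Lambda, the DynamoDB Local endpoint is used only when `DDB_ENDPOINT_URL` is set, and otherwise boto3's defaults apply. The sketch below isolates that branch order as a pure function so it can be checked without AWS access. `resolve_dynamodb_kwargs` is a hypothetical helper, and it reads from a passed-in mapping, whereas the original reads a module-level constant.

```python
from typing import Mapping, Optional


def resolve_dynamodb_kwargs(env: Mapping[str, str]) -> Optional[dict]:
    """Hypothetical helper mirroring the branch order in `_get_table_client`.

    Returns keyword arguments intended for `boto3.resource("dynamodb", **kwargs)`,
    or None when running on Lambda, where the original code continues with
    the policy-document path instead.
    """
    if "AWS_EXECUTION_ENV" in env:
        return None

    endpoint_url = env.get("DDB_ENDPOINT_URL")
    if endpoint_url:
        # Local development: same placeholder credentials as in the diff.
        return {
            "endpoint_url": endpoint_url,
            "aws_access_key_id": "key",
            "aws_secret_access_key": "key",
            "region_name": "us-east-1",
        }

    # No endpoint override: let boto3 resolve region and credentials itself.
    return {}
```

For example, `resolve_dynamodb_kwargs({})` yields an empty dictionary, which corresponds to the new `boto3.resource("dynamodb")` fallback.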
8 changes: 4 additions & 4 deletions backend/api/route.py → backend/app/route.py
@@ -1,12 +1,11 @@
from fastapi import APIRouter, Request
from repositories.conversation import (
from app.repositories.conversation import (
change_conversation_title,
delete_conversation_by_id,
delete_conversation_by_user_id,
find_conversation_by_id,
find_conversation_by_user_id,
)
from route_schema import (
from app.route_schema import (
ChatInput,
ChatOutput,
Content,
@@ -17,7 +16,8 @@
ProposedTitle,
User,
)
from usecase import chat, get_invoke_payload, propose_conversation_title
from app.usecase import chat, get_invoke_payload, propose_conversation_title
from fastapi import APIRouter, Request

router = APIRouter()

File renamed without changes.
10 changes: 5 additions & 5 deletions backend/common/usecase.py → backend/app/usecase.py
@@ -2,16 +2,16 @@
import logging
from datetime import datetime

from bedrock import _create_body, get_model_id, invoke
from repositories.conversation import (
from app.bedrock import _create_body, get_model_id, invoke
from app.repositories.conversation import (
RecordNotFoundError,
find_conversation_by_id,
store_conversation,
)
from repositories.model import ContentModel, ConversationModel, MessageModel
from route_schema import ChatInput, ChatOutput, Content, MessageOutput
from app.repositories.model import ContentModel, ConversationModel, MessageModel
from app.route_schema import ChatInput, ChatOutput, Content, MessageOutput
from app.utils import get_buffer_string
from ulid import ULID
from utils import get_buffer_string

logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)
4 changes: 2 additions & 2 deletions backend/common/utils.py → backend/app/utils.py
@@ -2,7 +2,7 @@
from typing import List

import boto3
from repositories.model import MessageModel
from app.repositories.model import MessageModel

BEDROCK_REGION = os.environ.get("BEDROCK_REGION", "us-east-1")

@@ -22,7 +22,7 @@ def get_buffer_string(conversations: dict[str, MessageModel]) -> str:
prefix = "System: "
else:
raise ValueError(f"Unsupported role: {conversation.role}")

if conversation.role != "system":
# Ignore system messages (currently `system` is dummy)
message = f"{prefix}{conversation.content.body}"
@@ -1,16 +1,15 @@
import json
import logging
import os
from datetime import datetime

import boto3
from auth import verify_token
from repositories.conversation import store_conversation
from repositories.model import ContentModel, MessageModel
from route_schema import ChatInputWithToken
from app.auth import verify_token
from app.repositories.conversation import store_conversation
from app.repositories.model import ContentModel, MessageModel
from app.route_schema import ChatInputWithToken
from app.usecase import get_invoke_payload, prepare_conversation
from app.utils import get_bedrock_client
from ulid import ULID
from usecase import get_invoke_payload, prepare_conversation
from utils import get_bedrock_client

client = get_bedrock_client()

@@ -28,16 +27,19 @@ def generate_chunk(stream) -> bytes:


def handler(event, context):
route_key = event["requestContext"]["routeKey"]

if route_key == "$connect":
# NOTE: Authentication is done at each message
return {"statusCode": 200, "body": "Connected."}

connection_id = event["requestContext"]["connectionId"]
domain_name = event["requestContext"]["domainName"]
stage = event["requestContext"]["stage"]
message = event["body"]
print(f"Received event: {event}")
# Extracting the SNS message and its details
# NOTE: All notification messages will contain a single published message.
# See `Reliability` section of: https://aws.amazon.com/sns/faqs/
sns_message = event["Records"][0]["Sns"]["Message"]
message_content = json.loads(sns_message)

route_key = message_content["requestContext"]["routeKey"]

connection_id = message_content["requestContext"]["connectionId"]
domain_name = message_content["requestContext"]["domainName"]
stage = message_content["requestContext"]["stage"]
message = message_content["body"]
endpoint_url = f"https://{domain_name}/{stage}"
gatewayapi = boto3.client("apigatewaymanagementapi", endpoint_url=endpoint_url)

40 changes: 40 additions & 0 deletions backend/publisher/index.py
@@ -0,0 +1,40 @@
import json
import os

import boto3
from botocore.exceptions import ClientError

TOPIC_ARN = os.environ["WEBSOCKET_TOPIC_ARN"]
sns_client = boto3.client("sns")


def handler(event, context):
    print(f"Received event: {event}")
    route_key = event["requestContext"]["routeKey"]

    if route_key == "$connect":
        # NOTE: Authentication is run at each message
        return {"statusCode": 200, "body": "Connected."}

    message = {
        "requestContext": event["requestContext"],
        "body": event["body"],
    }

    try:
        sns_response = sns_client.publish(
            TopicArn=TOPIC_ARN,
            Message=json.dumps(message),
        )

        response = {
            "statusCode": 200,
        }
    except ClientError as e:
        print(f"ClientError: {e}")
        response = {
            "statusCode": 500,
            "body": json.dumps({"error": str(e)}),
        }

    return response
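The publisher above and the reworked WebSocket handler share a small contract: the publisher serializes `requestContext` and `body` into the SNS message, and the subscriber reads them back from `event["Records"][0]["Sns"]["Message"]`. The following sketch exercises that round trip locally without calling AWS. The function names and the sample event values are made up for illustration, and `wrap_as_sns_event` reproduces only the fields the handler reads, not the full SNS envelope.

```python
import json


def build_sns_message(event: dict) -> str:
    """Publisher side: keep only the fields the downstream handler reads."""
    return json.dumps(
        {"requestContext": event["requestContext"], "body": event["body"]}
    )


def wrap_as_sns_event(message: str) -> dict:
    """Simplified stand-in for the event Lambda receives from SNS."""
    return {"Records": [{"Sns": {"Message": message}}]}


def parse_sns_event(event: dict) -> dict:
    """Subscriber side: decode the single message in the notification."""
    return json.loads(event["Records"][0]["Sns"]["Message"])


if __name__ == "__main__":
    websocket_event = {
        "requestContext": {
            "routeKey": "$default",
            "connectionId": "abc123",  # made-up value
            "domainName": "example.execute-api.us-east-1.amazonaws.com",
            "stage": "dev",
        },
        "body": json.dumps({"hypothetical": "payload"}),
    }
    received = parse_sns_event(wrap_as_sns_event(build_sns_message(websocket_event)))
    ctx = received["requestContext"]
    # The subscriber rebuilds the callback endpoint from these two fields.
    print(f"https://{ctx['domainName']}/{ctx['stage']}")
```

Because the publisher returns as soon as the message is published, the API Gateway integration is no longer held open while Bedrock streams its response.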
File renamed without changes.