AWS CDK Stack: Multi-Service Application with ALB & Fargate for Deployment of Framework-agnostic AI Agent Apps
This document explains the AWS Cloud Development Kit (CDK) stack that deploys two services (frontend and backend) on AWS Fargate behind a single Application Load Balancer (ALB) with path-based routing. Sharing one ALB keeps costs down, each service can be scaled independently, and the tasks run in private subnets so that only the ALB is exposed to the internet.
Before setting up and deploying the AWS CDK infrastructure, ensure that you have the following installed and configured on your system:
- AWS CDK (Node.js Required) – Install Node.js and AWS CDK:
# Install Node.js (if not already installed)
sudo apt install nodejs npm # Ubuntu/Debian
brew install node # macOS
choco install nodejs # Windows
# Install AWS CDK globally
npm install -g aws-cdk
- Docker (for building container images for AWS Fargate)
# Install Docker (ensure the Docker daemon is running)
- Python 3 & Virtual Environment
# Ensure Python is installed (3.8+ recommended)
python3 --version
# Install virtualenv if not installed
pip install virtualenv
Ensure you are authenticated with AWS and have the necessary permissions:
aws configure
- Create a secret in AWS Secrets Manager named agent-app. It must hold your HF_TOKEN and your OPENAI_API_KEY.
- You should have access to an AWS account with IAM permissions for CDK deployment, ECS, ALB, and networking setup.
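The agent-app secret described above can be prepared and sanity-checked with a small helper. This is a minimal sketch that assumes the secret is stored as a JSON object with the keys HF_TOKEN and OPENAI_API_KEY; the source only names the two values, so the JSON layout and both function names are assumptions.

```python
import json
from typing import List

# Keys the text says the agent-app secret must hold.
REQUIRED_KEYS = ("HF_TOKEN", "OPENAI_API_KEY")


def build_secret_string(hf_token: str, openai_api_key: str) -> str:
    """Serialize both credentials as one JSON object (assumed secret layout)."""
    if not hf_token or not openai_api_key:
        raise ValueError("both HF_TOKEN and OPENAI_API_KEY must be non-empty")
    return json.dumps({"HF_TOKEN": hf_token, "OPENAI_API_KEY": openai_api_key})


def missing_keys(secret_string: str) -> List[str]:
    """Return the required keys that are absent from a JSON secret string."""
    payload = json.loads(secret_string)
    return [key for key in REQUIRED_KEYS if key not in payload]
```

The resulting string could then be passed to `aws secretsmanager create-secret --name agent-app --secret-string ...`, and `missing_keys` can be used to check an existing secret value before deploying.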
Since AWS CDK projects require a specific structure, you cannot simply clone this repository and run cdk deploy. Instead, follow these steps:
Navigate to your working directory and initialize a new CDK project:
mkdir my-cdk-project
cd my-cdk-project
cdk init app --language python
This creates the necessary CDK project structure.
# Create a virtual environment
python3 -m venv .venv
# Activate the virtual environment
source .venv/bin/activate # macOS/Linux
.venv\Scripts\activate # Windows
pip install -r requirements.txt
Ensure that the dependencies match those in the requirements.txt of the repository.
Since CDK requires an initialized project, you must manually copy the files from this repository into your initialized CDK project.
AWS CDK requires your AWS account to be bootstrapped before deploying any infrastructure.
cdk bootstrap aws://your_account_id/your_region
This sets up necessary resources.
Once you have copied the required files and bootstrapped your AWS environment, deploy the stack:
cdk deploy
This will:
- Create/update the VPC, subnets, and networking components.
- Deploy the Application Load Balancer (ALB).
- Set up the ECS Fargate services (frontend and backend).
- Apply security configurations.
After the deployment completes, check the AWS console to confirm:
- The ALB is running and accessible.
- ECS Fargate tasks are successfully launched.
- The API endpoint (FastAPI) is responding correctly.
You can also check the deployed stack by running:
aws cloudformation list-stacks --query "StackSummaries[?StackStatus!='DELETE_COMPLETE'].StackName" --output text
Get the ALB DNS name with:
aws cloudformation describe-stacks --stack-name CombinedFrontendBackendStack --query 'Stacks[0].Outputs[?OutputKey==`LoadBalancerDNS`].OutputValue' --output text
To avoid unnecessary AWS costs, delete the stack when no longer needed:
cdk destroy
This removes all AWS resources created by the stack.
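For scripted verification, the LoadBalancerDNS lookup from the describe-stacks step can also be done in Python instead of a JMESPath query. The sketch below assumes the command is run with `--output json` and its output is passed in as a string; the function name is hypothetical.

```python
import json
from typing import Optional


def find_stack_output(describe_stacks_json: str, output_key: str) -> Optional[str]:
    """Return the OutputValue for output_key from describe-stacks JSON, or None."""
    document = json.loads(describe_stacks_json)
    for stack in document.get("Stacks", []):
        # A stack without outputs may omit the "Outputs" key entirely.
        for output in stack.get("Outputs", []):
            if output.get("OutputKey") == output_key:
                return output.get("OutputValue")
    return None
```

Returning None rather than raising lets a calling script decide whether a missing output means the stack is still deploying or was never created.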
- VPC Configuration
- 2 Availability Zones for high availability.
- 1 NAT Gateway (can be increased to 1 per AZ for production).
- Public subnets for ALB.
- Private subnets for Fargate services.
- Security Group Configuration
alb_security_group = ec2.SecurityGroup(
    self, "ALBSecurityGroup", vpc=vpc, allow_all_outbound=True
)
alb_security_group.add_ingress_rule(
    ec2.Peer.any_ipv4(), ec2.Port.tcp(80), "Allow HTTP traffic"
)
- Allows inbound HTTP traffic (port 80) from any IP.
- Allows all outbound traffic.
- Note: In production, consider adding HTTPS (port 443) and restricting IP ranges.
- ALB Configuration
alb = elbv2.ApplicationLoadBalancer(
    self, "SharedALB", vpc=vpc, internet_facing=True,
    security_group=alb_security_group
)
- Internet-facing for public access.
- Placed in public subnets.
- Uses a single security group.
- Default Route (Frontend)
- By default, all traffic routes to the frontend service.
- Uses Streamlit's port 8501.
default_action=elbv2.ListenerAction.forward([frontend_target_group])
- API Route (Backend)
- All /api/* paths route to the backend service.
- Uses FastAPI's port 8000.
conditions=[elbv2.ListenerCondition.path_patterns(["/api/*"])]
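To make the two routes concrete, here is a small pure-Python model of how a request path is resolved to a service. It is only an approximation: it uses fnmatch as a stand-in for the ALB's path-pattern matching (which is likewise case-sensitive and treats * as "zero or more characters"), and the names RULES, DEFAULT_TARGET, and resolve_target are hypothetical.

```python
from fnmatch import fnmatchcase

# (pattern, target) pairs checked in order; mirrors the "/api/*" rule in the text.
RULES = [("/api/*", "backend")]
# Anything that matches no rule falls through to the default action.
DEFAULT_TARGET = "frontend"


def resolve_target(path: str) -> str:
    """Return the service that would receive a request for the given path."""
    for pattern, target in RULES:
        if fnmatchcase(path, pattern):
            return target
    return DEFAULT_TARGET
```

One consequence worth noticing: a request to `/api` without a trailing slash does not match `/api/*` and therefore lands on the frontend, so clients should always call the backend under `/api/...`.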
- Task Definition (backend)
- CPU: 512 units.
- Memory: 1024 MB.
- Container Port: 8000.
- Environment Variables: HF_TOKEN and OPENAI_API_KEY (both from Secrets Manager).
- Health Check
- Path: /api/health
- Success Codes: 200
- Task Definition (frontend)
- CPU: 256 units.
- Memory: 512 MB.
- Container Port: 8501.
- Environment Variables: API_ENDPOINT (points to the ALB DNS name with the /api prefix).
- Health Check
- Path: /_stcore/health
- Success Codes: 200
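The two health-check paths can be probed by hand once the stack is up. The following is a minimal sketch using only the standard library; it assumes both paths are reachable through the ALB's DNS name, and the helper names are hypothetical.

```python
import urllib.request

# Health-check paths as listed for the two services.
HEALTH_PATHS = {"backend": "/api/health", "frontend": "/_stcore/health"}


def health_url(base_url: str, service: str) -> str:
    """Join the ALB base URL with the health path of the given service."""
    return base_url.rstrip("/") + HEALTH_PATHS[service]


def is_healthy(base_url: str, service: str, timeout: float = 5.0) -> bool:
    """Return True only if the health endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(health_url(base_url, service), timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers HTTP error statuses, connection failures, and timeouts.
        return False
```

For example, `is_healthy("http://<alb-dns-name>", "backend")` mirrors what the target group checks, with 200 as the only success code.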
- Both services run on Fargate (serverless).
- Services placed in private subnets.
- No public IP addresses assigned (assign_public_ip=False).
- Uses AWS Log Driver with 1-week retention.
The standard CDK ApplicationLoadBalancedFargateService construct was not used because:
- We need custom routing logic for both services.
- We're sharing a single ALB between services.
- We need fine-grained control over security groups and routing rules.
Instead, the stack:
- Creates the ALB manually and adds path-based routing rules to it.
- Defines a target group for each service manually.
- Attaches each Fargate service to its target group when the service is created.
- Inbound: Internet → Internet Gateway (IGW) → ALB (Public Subnet) → Fargate Services (Private Subnet).
- Outbound: Fargate → NAT Gateway → Internet (for pulling images/updates).
- Exposed Ports
- ALB: Port 80 (HTTP), open to inbound traffic from any IP.
- Backend: Port 8000 (internal only).
- Frontend: Port 8501 (internal only).
- Port Mapping
- External HTTP traffic (80) → ALB.
- ALB → Backend (8000) for /api/* paths.
- ALB → Frontend (8501) for all other paths.
- Both services start with desired_count=1.
- No auto-scaling is configured in this setup.
- Auto-scaling can be added using ECS Service Auto Scaling.
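As a sketch of how ECS Service Auto Scaling could be added, the helper below attaches target-tracking CPU scaling to a service. It assumes the argument is an `ecs.FargateService` from the stack (it only relies on the CDK methods `auto_scale_task_count` and `scale_on_cpu_utilization`); the function name and the limits of 1 to 4 tasks at 60% CPU are illustrative assumptions, not values from the source.

```python
def add_cpu_autoscaling(service, min_tasks=1, max_tasks=4, target_cpu_percent=60):
    """Attach target-tracking CPU scaling to an ecs.FargateService-like object."""
    # Register the service's task count as a scalable target with bounds.
    scaling = service.auto_scale_task_count(
        min_capacity=min_tasks, max_capacity=max_tasks
    )
    # Add or remove tasks to keep average CPU near the target percentage.
    scaling.scale_on_cpu_utilization(
        "CpuScaling", target_utilization_percent=target_cpu_percent
    )
    return scaling
```

Calling it once per service, for instance `add_cpu_autoscaling(backend_service)` inside the stack's constructor (where `backend_service` is a hypothetical variable name), would keep desired_count=1 as the floor while allowing the service to grow under load.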
- Fargate tasks run in private subnets.
- Only the ALB is internet-facing.
- Services communicate through the ALB, not directly.
- Secrets managed through AWS Secrets Manager.
The backend service exposes a FastAPI-based REST API that provides an AI-driven medical appointment scheduling agent.
- Description: POST /api/query processes a user query and returns AI-generated results for medical appointment scheduling.
- Request Body:
{ "user_input": "Find all orthopedic specialists available on Mondays." }
- Response:
{ "answer": "Dr. Smith is available on Mondays from 8 AM - 12 PM." }
- Error Handling: 500 Internal Server Error if query processing fails.
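A client call to the query endpoint can be sketched with the standard library alone. This assumes `api_endpoint` has the same shape as the frontend's API_ENDPOINT value (the ALB DNS name with the /api prefix) and that the response carries an "answer" field as in the example above; the function names are hypothetical.

```python
import json
import urllib.request


def build_query_request(api_endpoint: str, user_input: str) -> urllib.request.Request:
    """Build a POST request for <api_endpoint>/query with a JSON body."""
    url = api_endpoint.rstrip("/") + "/query"
    body = json.dumps({"user_input": user_input}).encode("utf-8")
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )


def ask(api_endpoint: str, user_input: str, timeout: float = 30.0) -> str:
    """Send the query and return the "answer" field of the JSON response."""
    request = build_query_request(api_endpoint, user_input)
    # urlopen raises urllib.error.HTTPError on a 500 response.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))["answer"]
```

Separating request construction from sending keeps the request shape easy to inspect and test without a deployed backend.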
- Description: GET /api/health returns the API health status.
- Response:
{ "status": "healthy", "version": "v1" }
- Description: Provides metadata about the API.
- Response:
{ "message": "Welcome to the Medical Appointment Agent API", "version": "v1", "documentation": "/api/docs", "health_check": "/api/health", "usage": "Send a POST request to /api/query with a JSON body containing a 'user_input' field." }
- CORS Middleware: Allows cross-origin requests.
- Dependency Injection: Uses Depends() for structured API dependencies.
- Pydantic Models: Ensure input validation and response standardization.
- Environment Variables: Uses .env files for configuration during local development; when deployed, the secret from AWS Secrets Manager is used instead.
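The configuration fallback described in the last item can be illustrated as follows. This is a stdlib-only sketch, not the repository's loader (which may well rely on a library such as python-dotenv): real environment variables, as injected from Secrets Manager on Fargate, take precedence, and .env values are used only when a variable is unset. All names are hypothetical.

```python
from typing import Dict, Mapping

# Settings the backend needs, per the task definition.
REQUIRED = ("HF_TOKEN", "OPENAI_API_KEY")


def parse_dotenv(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def load_settings(environ: Mapping[str, str], dotenv_text: str = "") -> Dict[str, str]:
    """Prefer real environment variables; fall back to .env values."""
    fallback = parse_dotenv(dotenv_text)
    settings: Dict[str, str] = {}
    for key in REQUIRED:
        value = environ.get(key) or fallback.get(key)
        if not value:
            raise KeyError(f"missing required setting: {key}")
        settings[key] = value
    return settings
```

In local development this would be called as `load_settings(os.environ, open(".env").read())`; on Fargate the .env text can simply be left empty.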
- Add HTTPS support with an ACM certificate.
- Implement auto-scaling rules. Currently, both ECS Fargate services (frontend and backend) run a fixed number of tasks (desired_count=1) to keep costs minimal, so the number of running containers stays constant unless it is changed manually.
- Enhance security with AWS WAF.
- Configure CloudWatch alarms for monitoring.
- Integrate with Route 53 for custom domain management.
