Merge pull request #7 from Infectious-Disease-Modeling-Hubs/bsweger/add-lambda-support

Add lambda support
bsweger authored May 6, 2024
2 parents d15f761 + 4504f4f commit 6ea45cf
Showing 5 changed files with 121 additions and 96 deletions.
14 changes: 14 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
@@ -142,3 +142,17 @@ The `pdm add` command will install the package, add it to [`pyproject.toml`](pyp

Refer to [PDM's documentation](https://pdm-project.org/latest/usage/dependency/) for complete information about adding dependencies.
## Creating and deploying the AWS Lambda package
**Temporary: next step is to deploy updates to the lambda package via GitHub Actions**
To package the hubverse_transform code for deployment to the `hubverse-transform-model-output` AWS Lambda function:
1. Make sure you have the AWS CLI installed
2. Make sure PDM is installed (see the dev setup instructions above)
3. Make sure you have AWS credentials that allow writes to the `hubverse-assets` S3 bucket
4. From the root of this project, run the deploy script:
```bash
source deploy_lambda.sh
```
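
Before running the deploy script, it can help to confirm that the command-line tools from steps 1 and 2 are actually on your `PATH`. The following is a minimal sketch, not part of this repository: `missing_tools` and `REQUIRED_TOOLS` are hypothetical names, and the tool list is inferred from the commands that `deploy_lambda.sh` calls. It does not verify AWS credentials (step 3).

```python
import shutil

# Commands invoked by deploy_lambda.sh (an assumption based on reading the script)
REQUIRED_TOOLS = ("aws", "pdm", "pip", "zip")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH.

    `which` is injectable so the check can be exercised without the real tools.
    """
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing required tools: " + ", ".join(missing))
    else:
        print("All required command-line tools were found on PATH")
```

If anything is reported missing, install it before sourcing the deploy script.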
55 changes: 55 additions & 0 deletions deploy_lambda.sh
@@ -0,0 +1,55 @@
#!/bin/bash

# allow pip installs outside of a virtual environment
export PIP_REQUIRE_VIRTUALENV=false

build_dir="build"

# create build directory if it doesn't exist, and remove any prior
# artifacts
echo "Removing old build artifacts"
rm -rf $build_dir
mkdir -p $build_dir/hubverse_transform

# output project requirements
pdm export --without dev --format requirements > $build_dir/requirements.txt

# install the hubverse_transform dependencies into the build directory
echo "Installing dependencies into the build directory"
pip install \
    --platform manylinux2014_x86_64 \
    --target=$build_dir \
    --python-version 3.12 \
    --only-binary=:all: --upgrade \
    -r $build_dir/requirements.txt

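# copy the contents of the hubverse_transform package into the build directory
# so it will be included in the lambda deployment package (the trailing "/."
# copies the directory's contents with both GNU and BSD cp, which avoids a
# nested hubverse_transform/hubverse_transform directory)
cp -r src/hubverse_transform/. $build_dir/hubverse_transform/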

# create the zip file that will be deployed to AWS Lambda
# https://docs.aws.amazon.com/lambda/latest/dg/gettingstarted-package.html#gettingstarted-package-zip
echo "Creating AWS Lambda deployment package"

# step 1: zip the files in the build directory (from the above pip install, but exclude the stuff we don't need)
echo "Zipping project dependencies"
cd $build_dir
py_exclude=("*.pyc" "*.ipynb" "*__pycache__*" "*ipynb_checkpoints*" "requirements.txt" "*.egg-info")
zip -r hubverse-transform-model-output.zip . -x "${py_exclude[@]}"

# step 2: add the lambda handler to the .zip package
echo "Adding lambda handler to the deployment .zip package"
cd ..
zip -j $build_dir/hubverse-transform-model-output.zip faas/lambda_function.py

# for reference: the S3 bucket in the comment below is where our IaC (i.e., hubverse-infrastructure repo)
# creates the placeholder lambda function for model-output-transforms
# s3://hubverse-assets/lambda/hubverse-transform-model-output.zip
echo "Uploading deployment package to S3 and performing aws lambda update-function-code"
aws s3 cp build/hubverse-transform-model-output.zip s3://hubverse-assets/lambda/
aws lambda update-function-code \
    --function-name arn:aws:lambda:us-east-1:767397675902:function:hubverse-transform-model-output \
    --s3-bucket hubverse-assets \
    --s3-key lambda/hubverse-transform-model-output.zip > $build_dir/lambda_update.log

echo "Lambda function updated (see $build_dir/lambda_update.log for details)"
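
The two zip steps in the script (archive the build directory minus the excluded patterns, then add the handler at the archive root) can be sketched in Python to make the resulting package layout explicit. This is an illustration under stated assumptions, not a replacement for the script: `build_package` and `is_excluded` are hypothetical names, and `fnmatch` only approximates how `zip -x` matches patterns.

```python
import fnmatch
import zipfile
from pathlib import Path

# Same patterns as py_exclude in deploy_lambda.sh
EXCLUDE = ["*.pyc", "*.ipynb", "*__pycache__*", "*ipynb_checkpoints*",
           "requirements.txt", "*.egg-info"]


def is_excluded(rel_path, patterns=EXCLUDE):
    """True if a path (relative to the build dir) matches any exclude pattern."""
    return any(fnmatch.fnmatchcase(rel_path, pattern) for pattern in patterns)


def build_package(build_dir, handler, zip_path, patterns=EXCLUDE):
    """Zip the build directory, then add the handler file at the archive root."""
    build_dir, handler, zip_path = Path(build_dir), Path(handler), Path(zip_path)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        # step 1: dependencies and the hubverse_transform package
        for path in sorted(build_dir.rglob("*")):
            rel = path.relative_to(build_dir).as_posix()
            if not path.is_file() or path.resolve() == zip_path.resolve():
                continue  # skip directories and the archive itself
            if not is_excluded(rel, patterns):
                zf.write(path, rel)
        # step 2: the handler, stored without its directory (like `zip -j`)
        zf.write(handler, handler.name)
    return zip_path
```

The handler sits at the top level of the archive next to the `hubverse_transform` directory, which is what lets its `from hubverse_transform.model_output import ...` statement resolve inside the package.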
51 changes: 51 additions & 0 deletions faas/lambda_function.py
@@ -0,0 +1,51 @@
"""
This is the handler invoked by the Lambda function that runs whenever there's a change to a hub's uploaded model-output files.
For the Hubverse AWS account, the baseline "hubverse-transform-model-output" Lambda is defined via our IaC (Infrastructure as Code)
repository: hubverse-infrastructure.
This handler, along with the actual transformation module (hubverse_transform), lives outside of the Hubverse's IaC repository:
- To avoid tightly coupling AWS infrastructure to the more general hubverse_transform module that can be used for hubs hosted elsewhere
- To allow faster iteration and testing of the hubverse_transform module without needing to update the IaC repo or redeploy AWS resources
"""
import json
import logging
import urllib.parse

from hubverse_transform.model_output import ModelOutputHandler

logger = logging.getLogger()
logger.setLevel("INFO")


def lambda_handler(event, context):
    logger.info("Received event: " + json.dumps(event, indent=2))

    # info from the S3 event
    event_source = event["Records"][0]["eventSource"]
    event_name = event["Records"][0]["eventName"]
    bucket = event["Records"][0]["s3"]["bucket"]["name"]
    key = urllib.parse.unquote_plus(event["Records"][0]["s3"]["object"]["key"], encoding="utf-8")

    # Below is some old testing code that we were using to ignore all file
    # types that don't have a supported extension. It's commented-out now, to ensure
    # that we don't have to update this handler every time we add support for a new
    # file type in the model-output transforms.
    # extensions = [".csv", ".parquet"]
    # if not any(ext in key.lower() for ext in extensions):
    #     print(f"{key} is not a supported file type, skipping")
    #     return

    # Until we implement ModelOutputHandler functionality to act on deleted model-output files, ignore any
    # ObjectRemoved (delete) events that are triggered by a hub's S3 bucket.
    if "objectcreated" not in event_name.lower():
        logger.info(f"Event type {event_source}:{event_name} is not supported, skipping")
        return

    logger.info("Transforming file: {}/{}".format(bucket, key))
    try:
        mo = ModelOutputHandler.from_s3(bucket, key)
        mo.transform_model_output()
    except Exception:
        # log bucket/key in the same order as the message above, then re-raise
        logger.exception("Error transforming file: {}/{}".format(bucket, key))
        raise
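
To see what the handler extracts from an incoming notification, the parsing and event-type check can be exercised on their own, without importing `hubverse_transform`. The sketch below is a hedged illustration: `parse_s3_record`, `is_supported`, and the sample bucket and key are hypothetical, and the sample event contains only the fields the handler reads rather than a full S3 notification.

```python
import urllib.parse


def parse_s3_record(event):
    """Pull out the same fields lambda_handler reads from the first S3 record."""
    record = event["Records"][0]
    return {
        "event_source": record["eventSource"],
        "event_name": record["eventName"],
        "bucket": record["s3"]["bucket"]["name"],
        # S3 URL-encodes object keys in notifications ("+" stands for a space)
        "key": urllib.parse.unquote_plus(record["s3"]["object"]["key"], encoding="utf-8"),
    }


def is_supported(event_name):
    """Mirror the handler's filter: only ObjectCreated events are transformed."""
    return "objectcreated" in event_name.lower()


# Trimmed-down sample event; bucket and key are made up for illustration
SAMPLE_EVENT = {
    "Records": [
        {
            "eventSource": "aws:s3",
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "example-hub"},
                "object": {"key": "model-output/my+team/2024-05-06-my%2Bteam.csv"},
            },
        }
    ]
}

if __name__ == "__main__":
    info = parse_s3_record(SAMPLE_EVENT)
    print(info["bucket"], info["key"], is_supported(info["event_name"]))
```

Decoding the key before passing it on matters because a file uploaded with a space or a literal `+` in its name arrives encoded, and the undecoded string would not match the object's actual key.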
96 changes: 1 addition & 95 deletions pdm.lock


1 change: 0 additions & 1 deletion pyproject.toml
@@ -11,7 +11,6 @@ classifiers = [
]

dependencies = [
'boto3',
"pyarrow>=16.0.0",
]
authors = [