Add lambda support #7
hubverse_transform doesn't use boto3: it was there for earlier testing
…mbda function This changeset is the first step toward making this code deployable to the hubverse-transform-model-output lambda function that exists in the Hubverse's AWS account. This version of the code supports manually updating the code in the lambda function by running a bash script. The next version will do the deployment via GitHub Actions, as changes are merged into the main branch.
logger.setLevel("INFO")


def lambda_handler(event, context):
The purpose of this function is to receive the AWS S3 events that are emitted whenever a new or updated model-output file lands in a hub's S3 bucket. The function parses the event to get the name of the hub's S3 bucket and the name ("key") of the model-output file.
The bucket name and key are then used to create a ModelOutputHandler object and transform the data.
For reference, this is an example of such an event:
{
  "Records": [
    {
      "eventVersion": "2.1",
      "eventSource": "aws:s3",
      "awsRegion": "us-east-1",
      "eventTime": "2024-05-02T19:06:28.151Z",
      "eventName": "ObjectCreated:Put",
      "userIdentity": {
        "principalId": "howdy"
      },
      "requestParameters": {
        "sourceIPAddress": "redacted"
      },
      "responseElements": {
        "x-amz-request-id": "S7DY1N8KZP1F8J35",
        "x-amz-id-2": "LeRQgYUNlXYiMdY+ibFpvF0XUSjcu5tgyUyhufmsSavl+oPrpKF2L5/J1MYfe+F0wEKHUGnC+BxxOrQKhTiXPKmuAxVFA48R"
      },
      "s3": {
        "s3SchemaVersion": "1.0",
        "configurationId": "howdy",
        "bucket": {
          "name": "hubverse-cloud",
          "ownerIdentity": {
            "principalId": "howdy"
          },
          "arn": "arn:aws:s3:::hubverse-cloud"
        },
        "object": {
          "key": "raw/model-output/UMass-flusion/2023-10-14-UMass-flusion.csv",
          "size": 357048,
          "eTag": "af50008c99f29b39310b57be5f474d28",
          "versionId": "nPvSV7RK1UNFXMGHNY9pDiNVbb5zBTJI",
          "sequencer": "006633E433E70A377B"
        }
      }
    }
  ]
}
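As a rough illustration of the parsing step described above, the sketch below pulls the bucket name and object key out of an event shaped like this example. The helper name `extract_bucket_and_key` is hypothetical and is not taken from this PR; the decoding step assumes the usual S3 behavior of URL-encoding object keys in event notifications.

```python
from urllib.parse import unquote_plus


def extract_bucket_and_key(event: dict) -> list[tuple[str, str]]:
    """Return one (bucket, key) pair per record in an S3 event notification.

    Hypothetical helper for illustration; the PR's handler may be structured differently.
    """
    pairs = []
    for record in event.get("Records", []):
        s3_info = record["s3"]
        bucket = s3_info["bucket"]["name"]
        # S3 URL-encodes object keys in notifications (a space arrives as "+"),
        # so decode before using the key to address the object.
        key = unquote_plus(s3_info["object"]["key"])
        pairs.append((bucket, key))
    return pairs


# Trimmed copy of the example event, keeping only the fields the helper reads.
sample_event = {
    "Records": [
        {
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "hubverse-cloud"},
                "object": {
                    "key": "raw/model-output/UMass-flusion/2023-10-14-UMass-flusion.csv"
                },
            },
        }
    ]
}

print(extract_bucket_and_key(sample_event))
```

Returning a list rather than a single pair is a design choice for the sketch: the event format wraps records in a `Records` array, so the helper handles however many records are present.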
logger.info("Transforming file: {}/{}".format(bucket, key))
try:
    mo = ModelOutputHandler.from_s3(bucket, key)
This is where the handler actually invokes our code that performs the data transformations.
This script looks worse than it is. Ultimately, these steps will be performed in the context of a GitHub action (instead of being run by a human).
mkdir -p $build_dir/hubverse_transform

# output project requirements
pdm export --without dev --format requirements > $build_dir/requirements.txt
Will delete this once pdm is removed.
This will ensure the lambda handler doesn't break when we add s3:ObjectRemoved triggers to hubs' S3 buckets.
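How the handler actually guards against removal events is not shown in this excerpt. One possible approach, sketched here under that caveat, is to check each record's `eventName` and skip anything that is not an object-creation event. The function names `is_object_created` and `records_to_transform` are hypothetical.

```python
def is_object_created(record: dict) -> bool:
    """True if an S3 event record describes a newly created or overwritten object.

    Hypothetical helper: in event records the name looks like "ObjectCreated:Put"
    or "ObjectRemoved:Delete" (no "s3:" prefix, unlike the trigger configuration).
    """
    return record.get("eventName", "").startswith("ObjectCreated:")


def records_to_transform(event: dict) -> list[dict]:
    """Keep only the records the transform should act on, ignoring removals."""
    return [r for r in event.get("Records", []) if is_object_created(r)]


mixed_event = {
    "Records": [
        {"eventName": "ObjectCreated:Put", "s3": {"object": {"key": "raw/a.csv"}}},
        {"eventName": "ObjectRemoved:Delete", "s3": {"object": {"key": "raw/b.csv"}}},
    ]
}

for record in records_to_transform(mixed_event):
    print(record["s3"]["object"]["key"])
```

Filtering on the event name before reading the object keeps a delete notification from triggering a read of a file that no longer exists.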
Approved via joint review session.
Resolves #4
Although the hubverse_transform package in this repo can be installed and run anywhere, our most immediate need is to run it as an AWS Lambda function (Lambda is AWS's function-as-a-service offering).
To get this deployed to the hubverse-transform-model-output Lambda that already exists in the Hubverse's AWS account*, this PR adds two things: