This is a serverless Python application that helps you push data from Amazon Kinesis Data Firehose to your MongoDB Atlas cluster. The project uses an AWS Lambda function as a resolver to insert the data into the Atlas cluster: records flow from a Kinesis Data Firehose stream, through an API Gateway endpoint, into the Lambda function.
Before proceeding, ensure you have the following prerequisites in place:
- Install the AWS CLI
- Create an IAM user for the AWS CLI and generate an access key and secret access key
- Configure the AWS CLI by running `aws configure` with your account ID, access key, secret key, and region
- Install SAM
- This application requires Python 3.9 or later to run. You can install Python 3.9 from Install Python
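If you prefer to set up the AWS CLI by editing files directly, `aws configure` writes the two files sketched below. All values shown are placeholders; substitute your IAM user's keys and your preferred region:

```ini
# ~/.aws/credentials  (placeholder values)
[default]
aws_access_key_id = AKIAEXAMPLEKEYID
aws_secret_access_key = exampleSecretAccessKey

# ~/.aws/config
[default]
region = us-east-1
```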
- Go to the Lambda section in your AWS console
- Click on the **Applications** section on the left navigation bar, then click on **Create application**
- Type MongoDB-Firehose-Ingestion-App in the search bar and check the "Show apps that create custom IAM roles or resource policies" checkbox
- Go to the Outputs section of your stack in the CloudFormation console and check the outputs of the deployed resources. Keep this tab open
- Copy the value of ApiKeyValue from the Value column. Go to Lambda > select the Authorizer Lambda function > Environment Variables > paste the copied API key ID there
- Go to the API Gateway console, click on Resources > ANY > click on Edit under Method request settings. Disable the **API key required** flag. After saving the changes, deploy the API to the **Prod** stage for the changes to take effect
- Copy the API Gateway endpoint URL for the Prod stage from the CloudFormation Outputs section, and copy the API key value from the API Keys section in the API Gateway console
- Go to the Firehose console and follow the instructions mentioned here to create a Firehose stream. After creating the Firehose stream, go to Configuration > Destination settings and click on Edit. Paste the API key value and the API Gateway endpoint URL in the corresponding fields and click on Save changes
- In your Firehose stream, click on **Start sending demo data** under the **Test with demo data** section
- Go to your MongoDB Atlas cluster and check whether the records are being inserted into your collection
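As an alternative to the console's demo-data generator, you can push a test record from your machine with boto3. This is a minimal sketch; the stream name `PUT-MDB-demo` and the sample payload are placeholders for your own values:

```python
import json


def build_record(payload):
    """Wrap a JSON-serializable payload in the Record shape that
    Firehose's put_record API expects (newline-delimited JSON bytes)."""
    return {"Data": (json.dumps(payload) + "\n").encode("utf-8")}


if __name__ == "__main__":
    import boto3  # requires AWS credentials configured as in the prerequisites

    firehose = boto3.client("firehose")
    # "PUT-MDB-demo" is a placeholder; use your Firehose stream's name
    response = firehose.put_record(
        DeliveryStreamName="PUT-MDB-demo",
        Record=build_record({"ticker": "TEST", "price": 1.23}),
    )
    print(response["RecordId"])
```

Within a minute or so (depending on the stream's buffering settings), the record should appear in your Atlas collection.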
Next, create a Kinesis Data Firehose stream to move the data from the configured source to the destination:
- Click here to go to the Kinesis Data Firehose console
- Click on **Create Firehose stream**
- Select your desired data source in the **Source** dropdown
- Select the **MongoDB Atlas** option in the **Destination** dropdown
- Enter the API Gateway URL that we created in the previous step in the **HTTP Endpoint URL** field
- Copy the API key value generated in the previous step and paste it in the **Access Key** field
- Under the **Backup Settings** section, configure an S3 bucket to store a backup of the source records in case the data transformation doesn't produce the desired results
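For reference, when Firehose delivers to an HTTP endpoint it sends a JSON body whose `records` carry base64-encoded data, and it expects the response to echo the request's `requestId` along with a timestamp. The sketch below shows the decoding side of a resolver Lambda under the assumption of an API Gateway proxy integration (the Firehose request arrives in `event["body"]`); the actual MongoDB insert is left as a comment since client setup depends on your cluster:

```python
import base64
import json


def decode_records(body):
    """Decode the base64 'data' field of each record in a Firehose
    HTTP-endpoint delivery request body."""
    payload = json.loads(body)
    return [json.loads(base64.b64decode(r["data"])) for r in payload.get("records", [])]


def lambda_handler(event, context):
    # Assumes API Gateway proxy integration: Firehose's request is in event["body"]
    request = json.loads(event["body"])
    docs = decode_records(event["body"])
    # collection.insert_many(docs)  # pymongo insert into your Atlas collection
    # Firehose requires the response to echo requestId and include a timestamp
    return {
        "statusCode": 200,
        "body": json.dumps({
            "requestId": request.get("requestId"),
            "timestamp": request.get("timestamp"),
        }),
    }
```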
- For demo purposes, we have allowed access from anywhere (0.0.0.0/0) under the Network Access section of the MongoDB Atlas project. We strongly recommend against this for production scenarios; for production usage, establish a Private Endpoint instead.