Commit 0c9abc7

[Agentless] Add some documentation for RDS scanning (#246)
* Add RDS architecture documentation
* Add architecture svg
* Clarify no datadog access
1 parent e72e2b7 commit 0c9abc7

2 files changed: +55 -0 lines changed

examples/rds_scanning/README.md

Lines changed: 54 additions & 0 deletions
@@ -16,3 +16,57 @@ To deploy a Datadog agentless scanner:
1. Run `terraform init`.
1. Run `terraform apply`.
1. Set your Datadog [API key](https://docs.datadoghq.com/account_management/api-app-keys/).

## How RDS Sensitive Data Scanning Works

The Datadog Agentless Scanner supports sensitive data scanning for RDS databases using a **two-stage process** that never directly accesses your live databases. The Datadog backend directs the scanner, which then orchestrates the entire scanning workflow automatically.

### Stage 1: Scanner Initiates RDS Export

The Agentless Scanner, controlled by the Datadog backend, initiates an RDS snapshot export to S3:

1. **Datadog backend identifies** RDS databases to scan based on your configuration and tags
2. **Scanner receives instructions** from the Datadog backend to export a specific RDS snapshot
3. **Scanner calls AWS RDS API** (`rds:StartExportTask`) to export the snapshot to a dedicated S3 bucket
4. **AWS RDS exports the snapshot** to S3 in Parquet format, encrypted with a KMS key managed by the scanner
5. **Export completes** and data is ready for scanning in the S3 bucket

The scanner uses a dedicated RDS service role (`DatadogAgentlessScannerRDSS3ExportRole`) that has permissions to write exports to the Agentless Scanner's S3 bucket.
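
As a rough illustration of step 3, the sketch below assembles the parameters that a `StartExportTask` call takes. It is not the scanner's actual implementation. The account ID, snapshot ARN, bucket name, KMS key ARN, and task identifier are all hypothetical placeholders; only the role name and the bucket name prefix come from this document. The dictionary keys follow the parameter names of boto3's `start_export_task`, and the call itself is left as a comment so the example runs without AWS credentials.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExportRequest:
    """Inputs for one snapshot export. Every value used below is hypothetical."""
    snapshot_arn: str
    bucket_name: str
    export_role_arn: str
    kms_key_id: str


def build_start_export_task_params(req: ExportRequest, task_id: str) -> dict:
    """Build keyword arguments for the RDS StartExportTask API.

    The keys match the parameter names accepted by boto3's
    rds_client.start_export_task(**params).
    """
    return {
        "ExportTaskIdentifier": task_id,
        "SourceArn": req.snapshot_arn,
        "S3BucketName": req.bucket_name,
        "IamRoleArn": req.export_role_arn,
        "KmsKeyId": req.kms_key_id,
    }


request = ExportRequest(
    snapshot_arn="arn:aws:rds:us-east-1:123456789012:snapshot:rds:example-db-2024-01-01-00-00",
    bucket_name="datadog-agentless-scanning-example",
    export_role_arn="arn:aws:iam::123456789012:role/DatadogAgentlessScannerRDSS3ExportRole",
    kms_key_id="arn:aws:kms:us-east-1:123456789012:key/00000000-0000-0000-0000-000000000000",
)
params = build_start_export_task_params(request, task_id="example-export-task")

# With boto3 installed and AWS credentials configured, the request would be sent as:
#   boto3.client("rds").start_export_task(**params)
```

Note that `IamRoleArn` is the role the RDS service itself assumes to write into the bucket, which is why it is separate from the role the scanner uses to read the results.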

### Stage 2: Scanner Scans Exported Snapshot

Once the export completes, the scanner automatically proceeds to scan the exported data:

1. **Scanner assumes the delegate role** to gain read-only access to the S3 bucket
2. **Reads the exported Parquet files** from S3
3. **Performs sensitive data scanning** on the exported database content
4. **Sends scan results** to Datadog for analysis and alerting
5. **Exported files are automatically deleted**
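
To make step 3 more concrete, here is a minimal sketch of pattern-based scanning over rows that have already been decoded from the exported Parquet files. It is an illustration under stated assumptions, not Datadog's scanning engine: the two regular expressions, the `scan_rows` helper, and the sample rows are all hypothetical, and decoding Parquet is left out because it needs a third-party reader. The sketch reports only the column name and the kind of match, never the matched value, which is one way to keep database content inside your account.

```python
import re

# Illustrative patterns only; these are not Datadog's detection rules.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def scan_rows(rows):
    """Return sorted (column, finding_type) pairs; matched values are never returned."""
    findings = set()
    for row in rows:
        for column, value in row.items():
            if not isinstance(value, str):
                continue  # this sketch only inspects text columns
            for name, pattern in PATTERNS.items():
                if pattern.search(value):
                    findings.add((column, name))
    return sorted(findings)


# Hypothetical rows, standing in for data decoded from an exported Parquet file.
rows = [
    {"id": 1, "contact": "alice@example.com", "note": "renewal due"},
    {"id": 2, "contact": "n/a", "note": "SSN on file: 123-45-6789"},
]
findings = scan_rows(rows)
print(findings)  # [('contact', 'email'), ('note', 'us_ssn')]
```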

### Architecture
![RDS Agentless Scanning Architecture](./agentless_rds_scanning.svg)

### Key Components

- **Datadog Backend**: Orchestrates and controls the scanner, determining which RDS databases to scan and when
- **Agentless Scanner**: EC2 instance that initiates RDS exports and performs the actual scanning
- **S3 Bucket** (`datadog-agentless-scanning-*`): Temporary storage for RDS exports with automatic 2-day expiration
- **RDS Service Role**: Allows the AWS RDS service to write exports to the S3 bucket
- **Delegate Role**: Allows the scanner to read from the S3 bucket and initiate RDS exports
- **KMS Key**: Encrypts all exported data at rest
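
The 2-day expiration on the S3 bucket can be pictured as a lifecycle rule. The sketch below builds such a rule in the shape that boto3's `put_bucket_lifecycle_configuration` accepts. This is an assumption about how the expiration could be expressed, since the module in this repository provisions the bucket through Terraform. The rule ID and the `build_export_lifecycle` helper are hypothetical names.

```python
def build_export_lifecycle(expiration_days: int = 2) -> dict:
    """Build an S3 lifecycle configuration that expires every object in the bucket."""
    if expiration_days < 1:
        raise ValueError("expiration_days must be at least 1")
    return {
        "Rules": [
            {
                "ID": "expire-rds-exports",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # an empty prefix applies the rule to all objects
                "Expiration": {"Days": expiration_days},
            }
        ]
    }


lifecycle = build_export_lifecycle()

# With boto3 installed and AWS credentials configured, this could be applied as:
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="datadog-agentless-scanning-example",  # hypothetical bucket name
#       LifecycleConfiguration=lifecycle,
#   )
```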

### Security Features

- **No Direct Database Access**: Scanners never connect to live RDS databases
- **No Datadog Access to Database or Snapshots**: Datadog never has direct access to your RDS databases or the exported snapshots. Everything is processed within the agentless scanner, which runs entirely in your AWS account and infrastructure.
- **Encryption**: All exports are encrypted at rest with KMS and are encrypted in transit
- **Automatic Cleanup**: Exported files are automatically deleted
- **Least Privilege**: Separate IAM roles with minimal required permissions
- **Tag-Based Control**: Only RDS resources without the `DatadogAgentlessScanner:false` tag are eligible for scanning

### Requirements

- RDS databases must have automated backups enabled (which create snapshots)
- RDS databases should not have the `DatadogAgentlessScanner:false` tag if you want them to be scanned
- The S3 bucket must be deployed in the same region as the RDS database to minimize data transfer costs
- The scanner must be deployed in the same region as the RDS databases you want to scan
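
The requirements above can be checked ahead of time against the output of `describe_db_instances`. The sketch below is a hypothetical pre-flight helper, not part of the scanner. It assumes that a `BackupRetentionPeriod` of 0 means automated backups are disabled, that the tag value is compared as the exact string `false`, and that the S3 bucket lives in the scanner's region. The sample instances and account ID are made up.

```python
def check_rds_requirements(db_instance: dict, tags: list, scanner_region: str) -> list:
    """Return a list of reasons why an RDS instance would not be scanned.

    db_instance uses field names from the RDS DescribeDBInstances response;
    tags uses the AWS [{"Key": ..., "Value": ...}] shape.
    """
    problems = []

    # A retention period of 0 disables automated backups, so no snapshots exist.
    if db_instance.get("BackupRetentionPeriod", 0) <= 0:
        problems.append("automated backups are disabled")

    # Assumption: the opt-out tag is matched on the exact value "false".
    if any(t.get("Key") == "DatadogAgentlessScanner" and t.get("Value") == "false" for t in tags):
        problems.append("tagged DatadogAgentlessScanner:false")

    # ARN layout: arn:partition:service:region:account-id:resource
    db_region = db_instance["DBInstanceArn"].split(":")[3]
    if db_region != scanner_region:
        problems.append(f"database is in {db_region}, scanner is in {scanner_region}")

    return problems


# Hypothetical instances for illustration.
eligible = {
    "DBInstanceArn": "arn:aws:rds:us-east-1:123456789012:db:example-db",
    "BackupRetentionPeriod": 7,
}
excluded = {
    "DBInstanceArn": "arn:aws:rds:eu-west-1:123456789012:db:other-db",
    "BackupRetentionPeriod": 0,
}
opt_out = [{"Key": "DatadogAgentlessScanner", "Value": "false"}]

ok_result = check_rds_requirements(eligible, [], "us-east-1")
bad_result = check_rds_requirements(excluded, opt_out, "us-east-1")
print(ok_result)
print(bad_result)
```

An empty list means the instance meets every requirement the helper knows about; anything else lists what to fix first.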
