File: `examples/rds_scanning/README.md` (54 additions, 0 deletions)
@@ -16,3 +16,57 @@ To deploy a Datadog agentless scanner:
1. Run `terraform init`.
1. Run `terraform apply`.
1. Set your Datadog [API key](https://docs.datadoghq.com/account_management/api-app-keys/).

## How RDS Sensitive Data Scanning Works

The Datadog Agentless Scanner supports sensitive data scanning for RDS databases using a **two-stage process** that never directly accesses your live databases. The scanner is piloted by the Datadog backend and orchestrates the entire scanning workflow automatically.

### Stage 1: Scanner Initiates RDS Export

The Agentless Scanner, controlled by the Datadog backend, initiates an RDS snapshot export to S3:

1. **Datadog backend identifies** RDS databases to scan based on your configuration and tags
2. **Scanner receives instructions** from the Datadog backend to export a specific RDS snapshot
3. **Scanner calls AWS RDS API** (`rds:StartExportTask`) to export the snapshot to a dedicated S3 bucket
4. **AWS RDS exports the snapshot** to S3 in Parquet format, encrypted with a KMS key managed by the scanner
5. **Export completes** and data is ready for scanning in the S3 bucket

The scanner uses a dedicated RDS service role (`DatadogAgentlessScannerRDSS3ExportRole`) that has permissions to write exports to the Agentless Scanner's S3 bucket.
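
To make the export step more concrete, here is a minimal sketch of the parameters a `StartExportTask` request takes. It is not the scanner's actual code. The role name comes from this README; the account ID, snapshot ARN, bucket name, KMS key ID, and task identifier are placeholder values, and the sketch assumes the role has no IAM path.

```python
from typing import Any

# Role name taken from this README; every other value below is a placeholder.
EXPORT_ROLE_NAME = "DatadogAgentlessScannerRDSS3ExportRole"


def build_export_request(
    snapshot_arn: str,
    account_id: str,
    bucket: str,
    kms_key_id: str,
    task_id: str,
) -> dict[str, Any]:
    """Build the keyword arguments for the RDS StartExportTask API call."""
    return {
        "ExportTaskIdentifier": task_id,
        "SourceArn": snapshot_arn,
        "S3BucketName": bucket,
        # Assumption: the service role lives at the root IAM path.
        "IamRoleArn": f"arn:aws:iam::{account_id}:role/{EXPORT_ROLE_NAME}",
        "KmsKeyId": kms_key_id,
    }


request = build_export_request(
    snapshot_arn="arn:aws:rds:us-east-1:123456789012:snapshot:rds:orders-2024-01-01",
    account_id="123456789012",
    bucket="datadog-agentless-scanning-example",  # hypothetical bucket name
    kms_key_id="alias/example-export-key",  # hypothetical key alias
    task_id="example-export-task",
)
```

With `boto3` installed and credentials configured, the request could be submitted with `boto3.client("rds").start_export_task(**request)`. In normal operation the scanner issues this call itself, so you do not need to run it.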

### Stage 2: Scanner Scans Exported Snapshot

Once the export completes, the scanner automatically proceeds to scan the exported data:

1. **Scanner assumes the delegate role** to gain read-only access to the S3 bucket
2. **Reads the exported Parquet files** from S3
3. **Performs sensitive data scanning** on the exported database content
4. **Sends scan results** to Datadog for analysis and alerting

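
The scanning step above can be pictured with a toy example. This is an illustration only, not Datadog's scanning engine: the two regular-expression rules are hypothetical, and the `rows` list stands in for records that would be decoded from the exported Parquet files (for instance with a Parquet library, which is not used here). Note that the sketch reports where a match occurred but never the matched value.

```python
import re

# Hypothetical rules for illustration; the real rule set is not described in this README.
RULES = {
    "email_address": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def scan_rows(rows):
    """Return (row_index, column, rule_name) for each match, never the matched value."""
    findings = []
    for index, row in enumerate(rows):
        for column, value in row.items():
            if not isinstance(value, str):
                continue  # only string columns are checked in this sketch
            for rule_name, pattern in RULES.items():
                if pattern.search(value):
                    findings.append((index, column, rule_name))
    return findings


# Stand-in for rows decoded from an exported table.
rows = [
    {"id": 1, "contact": "alice@example.com", "note": "none"},
    {"id": 2, "contact": "n/a", "note": "SSN 123-45-6789 on file"},
]
findings = scan_rows(rows)
```

Here `findings` holds one entry for the email address in row 0 and one for the SSN-like string in row 1.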
- **Datadog Backend**: Orchestrates and controls the scanner, determining which RDS databases to scan and when
- **Agentless Scanner**: EC2 instance that initiates RDS exports and performs the actual scanning
- **S3 Bucket** (`datadog-agentless-scanning-*`): Temporary storage for RDS exports with automatic 2-day expiration
- **RDS Service Role**: Allows the AWS RDS service to write exports to the S3 bucket
- **Delegate Role**: Allows the scanner to read from the S3 bucket and initiate RDS exports
- **KMS Key**: Encrypts all exported data at rest
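
As an illustration of the 2-day expiration on the S3 bucket, the sketch below builds an S3 lifecycle configuration that expires every object after a given number of days. This README does not show how the expiration is actually configured, so the rule ID and the empty-prefix filter are assumptions.

```python
def build_expiration_rule(days: int = 2) -> dict:
    """Build an S3 lifecycle configuration that expires all objects after `days` days."""
    return {
        "Rules": [
            {
                "ID": "expire-rds-exports",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix matches every object
                "Expiration": {"Days": days},
            }
        ]
    }


lifecycle = build_expiration_rule()
```

With `boto3`, a dictionary of this shape could be passed as the `LifecycleConfiguration` argument of `put_bucket_lifecycle_configuration` on an S3 client.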

### Security Features

- **No Direct Database Access**: Scanners never connect to live RDS databases
- **No Datadog Access to Database or Snapshots**: Datadog never has direct access to your RDS databases or the exported snapshots. Everything is processed within the agentless scanner, which runs entirely in your AWS account and infrastructure.
- **Encryption**: All exports are encrypted at rest with KMS and encrypted in transit
- **Automatic Cleanup**: Exported files are automatically deleted after 2 days by the S3 bucket's expiration policy
- **Least Privilege**: Separate IAM roles with minimal required permissions
- **Tag-Based Control**: Only RDS resources without the `DatadogAgentlessScanner:false` tag are eligible for scanning

### Requirements

- RDS databases must have automated backups enabled (which creates snapshots)
- RDS databases should not have the `DatadogAgentlessScanner:false` tag if you want them to be scanned
- The S3 bucket must be deployed in the same region as the RDS database to minimize data transfer costs
- The scanner must be deployed in the same region as the RDS databases you want to scan
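
The requirements above can be checked ahead of time with a small preflight helper. The sketch below is a hedged example, not part of the module: it assumes a database description shaped like an entry returned by the RDS `DescribeDBInstances` API (`DBInstanceArn`, `BackupRetentionPeriod`, `TagList`), treats a retention period of 0 as "automated backups disabled", and assumes the tag key and value are matched exactly.

```python
def region_of(arn: str) -> str:
    """Extract the region field from an ARN (arn:partition:service:region:account:...)."""
    return arn.split(":")[3]


def scan_blockers(db: dict, scanner_region: str) -> list[str]:
    """Return the reasons a database would not be scanned; an empty list means none found."""
    blockers = []
    if db.get("BackupRetentionPeriod", 0) <= 0:
        blockers.append("automated backups are disabled")
    tags = {tag["Key"]: tag["Value"] for tag in db.get("TagList", [])}
    # Assumption: exact, case-sensitive match on the tag key and value.
    if tags.get("DatadogAgentlessScanner") == "false":
        blockers.append("excluded by the DatadogAgentlessScanner:false tag")
    if region_of(db["DBInstanceArn"]) != scanner_region:
        blockers.append("scanner is deployed in a different region")
    return blockers


# Placeholder database descriptions for illustration.
eligible_db = {
    "DBInstanceArn": "arn:aws:rds:us-east-1:123456789012:db:orders",
    "BackupRetentionPeriod": 7,
    "TagList": [{"Key": "team", "Value": "payments"}],
}
excluded_db = {
    "DBInstanceArn": "arn:aws:rds:eu-west-1:123456789012:db:legacy",
    "BackupRetentionPeriod": 0,
    "TagList": [{"Key": "DatadogAgentlessScanner", "Value": "false"}],
}

eligible_result = scan_blockers(eligible_db, "us-east-1")
excluded_result = scan_blockers(excluded_db, "us-east-1")
```

In this example `eligible_result` is empty, while `excluded_result` lists all three blockers.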