AWS AI/ML Security Assessment for Amazon Bedrock, Amazon SageMaker AI, and Amazon Bedrock AgentCore

A serverless framework that scans your AWS accounts for AI/ML security misconfigurations and produces an interactive, shareable report.

Open-source automated security scanner for generative AI and machine learning workloads on AWS. Core checks for Amazon Bedrock, Amazon SageMaker AI, and Amazon Bedrock AgentCore are built on the AWS Well-Architected Framework — Generative AI Lens. An optional Financial Services GenAI risk module adds 64 checks aligned to the AWS User Guide to Governance, Risk, and Compliance for Responsible AI Adoption within Financial Services Industries. See the AWS Security Blog announcement for context on the updated guide.

Run 116 security checks across your AWS accounts and regions in one deployment. The scan surfaces IAM misconfigurations, encryption gaps, network isolation issues, missing guardrails, and governance gaps, and reports them in interactive HTML with severity ratings and AWS documentation links for remediation. It supports single-account scans and full AWS Organizations multi-account scans, and all data stays in your account.

See It In Action

The framework generates professional, interactive security assessment reports with filtering, search, and dark mode support.

Download Sample Reports | Single Account | Multi-Account

Executive Dashboard (Light Mode)	Executive Dashboard (Dark Mode)
Interactive Findings Table with Filtering

Key Features

Executive Summary with severity counts and service breakdown
Priority Recommendations highlighting critical issues requiring immediate attention
116 Security Checks across Amazon Bedrock, Amazon SageMaker AI, Amazon Bedrock AgentCore, and Financial Services GenAI Risk
Multi-Region Support for core Bedrock, SageMaker, and AgentCore checks, with per-region risk breakdown
Interactive Filtering by account, region, service, severity, and status
Light/Dark Mode Toggle with persistent user preference
Text Search across all findings with real-time results
Direct AWS Documentation Links for each finding with remediation guidance
Multi-Account Support with consolidated reporting across your organization
Fully Automated deployment and execution through AWS CloudFormation and AWS CodeBuild

What It Does

This serverless assessment framework automatically evaluates your AI/ML workloads against AWS security best practices. It uses AWS serverless services to gather data from the control plane and generate reports containing the status of various security checks, severity levels, and recommended actions.

Designed for workloads using Amazon Bedrock, Amazon Bedrock AgentCore, Amazon SageMaker AI, or the optional Financial Services GenAI risk assessment.

Why Use This Framework?

Challenge	How This Framework Helps
Manual security audits are time-consuming	Fully automated scanning with one-click CloudFormation deployment
Inconsistent security checks across teams	Standardized 116-check assessment based on AWS Well-Architected Generative AI Lens best practices and AWS Responsible AI governance, risk, and compliance guidance for financial services
Difficulty tracking AI/ML security posture	Interactive HTML dashboards with severity breakdown and per-account visibility
Multi-account complexity	Consolidated reporting across AWS Organizations with cross-account role assumption
Compliance and audit support	Exportable reports to supplement your compliance program, with remediation guidance linked to AWS documentation
Generative AI security gaps	Purpose-built checks for LLM guardrails, model access controls, and prompt injection prevention

Services Covered:

Amazon Bedrock (14 checks) - Guardrails, encryption, Amazon VPC endpoints, AWS IAM permissions, model invocation logging
Amazon SageMaker AI (25 checks) - AWS Security Hub controls (SageMaker.1-5), encryption, network isolation, AWS IAM, MLOps
Amazon Bedrock AgentCore (13 checks) - Amazon VPC configuration, encryption, observability, resource policies
Financial Services GenAI Risk (64 checks) - Unbounded consumption, excessive agency, supply chain, training data poisoning, hallucination, prompt injection, PII disclosure, and 8 more FinServ-specific risk categories derived from the AWS User Guide to Governance, Risk, and Compliance for Responsible AI Adoption within Financial Services Industries. See the AWS Security Blog announcement for context on the updated guide.

Deployment Options:

Single-Account: Assess security in one AWS account
Multi-Account: Scan entire AWS Organizations with consolidated reporting

How It Works:

Deploy through AWS CloudFormation (one-click deployment)
The framework automatically scans your AI/ML resources
Generates interactive HTML reports stored in your Amazon S3 bucket
All data stays in your AWS account - no external dependencies

Scope and Limitations

This tool operates within the AWS Shared Responsibility Model. It assesses your configuration responsibilities (IAM policies, encryption settings, network isolation, logging) for AI/ML services. It does not assess AWS-managed infrastructure, physical security, or the underlying service platform.

Point-in-time assessment. Each run captures your security posture at the moment of execution. Resource configurations can change immediately after an assessment completes. Run assessments regularly and after significant changes to maintain visibility.

No guarantee of security or compliance. This framework identifies common misconfigurations based on AWS best practices and the AWS Well-Architected Framework. It does not cover all possible security risks, does not replace formal compliance audits (SOC 2, HIPAA, and similar), and does not guarantee that your workloads are secure. Use the results as one input into your broader security program.

116 checks across four domains. The assessment covers Amazon Bedrock, Amazon SageMaker AI, Amazon Bedrock AgentCore, and optional Financial Services GenAI risk checks. Other AI/ML services (Amazon Comprehend, Amazon Rekognition, Amazon Textract, and others) are not currently assessed.

Quick Start

Single-Account: Jump to Single-Account Deployment
Multi-Account: Jump to Multi-Account Deployment

Architecture

Prerequisites

Python 3.12+ — Install Python
AWS SAM CLI — Install the AWS SAM CLI
Docker (optional) — Install Docker — Only required for local development

Single-Account Deployment

Download the aiml-security-single-account.yaml CloudFormation template.
Deploy to AWS CloudFormation
Upload the template and provide a stack name.
Optionally specify your email address to receive notifications.
(Optional) Multi-Region: Set TargetRegions to scan multiple regions:
- Leave empty to scan only the deployment region (default)
- Comma- or space-separated list (for example, us-east-1,us-west-2,eu-west-1 or us-east-1 us-west-2 eu-west-1)
- all to scan all regions where the services are available
Acknowledge IAM capabilities and click Submit.
Once stack creation completes, CodeBuild automatically runs the assessment.
View results: go to the stack Outputs tab → copy AssessmentBucket → open the report under the /{account_id}/ prefix in that S3 bucket.
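
The console steps above can also be scripted. The following is a minimal sketch, assuming boto3 is installed and credentials are configured. The helper names and the capability list are illustrative assumptions, not part of this project; only `TargetRegions` and `EnableFinServAssessment` are parameter names taken from this README, so check the template's Parameters section for anything else you want to set.

```python
from pathlib import Path


def build_stack_request(stack_name, template_body, target_regions="", extra_parameters=None):
    """Build keyword arguments for the CloudFormation create_stack call."""
    parameters = {"TargetRegions": target_regions}
    parameters.update(extra_parameters or {})
    return {
        "StackName": stack_name,
        "TemplateBody": template_body,
        "Parameters": [
            {"ParameterKey": key, "ParameterValue": value}
            for key, value in sorted(parameters.items())
        ],
        # Assumption: the template creates IAM roles and may use nested
        # stacks or transforms, so all three capabilities are acknowledged.
        "Capabilities": [
            "CAPABILITY_IAM",
            "CAPABILITY_NAMED_IAM",
            "CAPABILITY_AUTO_EXPAND",
        ],
    }


def deploy_single_account(template_path, stack_name="aiml-security-single-account",
                          target_regions=""):
    """Create the stack and wait for creation to finish (hypothetical helper)."""
    import boto3  # imported lazily so build_stack_request works without boto3

    request = build_stack_request(
        stack_name, Path(template_path).read_text(), target_regions
    )
    cloudformation = boto3.client("cloudformation")
    stack_id = cloudformation.create_stack(**request)["StackId"]
    cloudformation.get_waiter("stack_create_complete").wait(StackName=stack_name)
    return stack_id
```

If the template is too large to pass inline as `TemplateBody`, upload it to S3 and pass `TemplateURL` instead.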

Tip: The deployment creates two stacks. Your results are in the stack you named, not the auto-generated aiml-sec-* stack. See Troubleshooting for details.

Multi-Account Deployment

Step 1: Deploy Member Roles

Deploy 1-aiml-security-member-roles.yaml to all target accounts using CloudFormation StackSets with service-managed permissions.

Navigate to CloudFormation > StackSets in the AWS Organizations management account or delegated administrator account
Upload the template and set ManagementAccountID to the account ID where the central multi-account CodeBuild project runs
Select Service-managed permissions and target your OUs
Select your target region and submit
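
The same StackSet can be created programmatically. This is a sketch only, assuming boto3 and a caller in the management account; the StackSet name, the auto-deployment settings, and the capability are assumptions, while `ManagementAccountID` and service-managed permissions come from the steps above.

```python
def build_stack_set_requests(template_body, management_account_id, ou_ids, region,
                             stack_set_name="aiml-security-member-roles"):
    """Return keyword arguments for create_stack_set and create_stack_instances."""
    create_stack_set = {
        "StackSetName": stack_set_name,  # illustrative name
        "TemplateBody": template_body,
        "Parameters": [
            {"ParameterKey": "ManagementAccountID",
             "ParameterValue": management_account_id},
        ],
        "PermissionModel": "SERVICE_MANAGED",
        # Assumption: deploy automatically to accounts that join the OUs later.
        "AutoDeployment": {"Enabled": True, "RetainStacksOnAccountRemoval": False},
        # Assumption: the template creates a named IAM role.
        "Capabilities": ["CAPABILITY_NAMED_IAM"],
    }
    create_stack_instances = {
        "StackSetName": stack_set_name,
        "DeploymentTargets": {"OrganizationalUnitIds": list(ou_ids)},
        "Regions": [region],
    }
    return create_stack_set, create_stack_instances


def deploy_member_roles(template_body, management_account_id, ou_ids, region):
    """Create the StackSet and its instances (hypothetical helper)."""
    import boto3  # imported lazily so the builder above works without boto3

    stack_set, instances = build_stack_set_requests(
        template_body, management_account_id, ou_ids, region
    )
    cloudformation = boto3.client("cloudformation")
    cloudformation.create_stack_set(**stack_set)
    return cloudformation.create_stack_instances(**instances)["OperationId"]
```

When running from a delegated administrator account instead of the management account, both calls also need `CallAs="DELEGATED_ADMIN"`.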

Step 2: Deploy Central Infrastructure

Deploy 2-aiml-security-codebuild.yaml in your central assessment account. This can be your AWS Organizations management account or a delegated administrator/central tooling account.

Upload the template and set MultiAccountScan to true
Optionally set TargetRegions for multi-region scanning
Optionally provide an email address for notifications
Acknowledge IAM capabilities and submit
Stack creation automatically triggers the assessment across all accounts

Multi-Region Scanning

Both deployment modes support scanning multiple AWS regions in parallel via the TargetRegions parameter:

Value	Behavior
Empty (default)	Scans deployment region only — fully backward compatible
Comma- or space-separated (for example, `us-east-1,us-west-2` or `us-east-1 us-west-2`)	Scans those regions in parallel
`all`	Discovers and scans all regions where assessed services are available

Scanning uses a Step Functions Map state, so regions are assessed in parallel rather than one after another, and adding regions adds little to the overall run time. Services unavailable in a region produce an informational N/A finding.
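
The three `TargetRegions` behaviors in the table can be illustrated with a small parser. This is a sketch of the documented semantics, not the project's actual implementation; the function name and the `available_regions` argument (standing in for region discovery) are assumptions.

```python
import re


def resolve_target_regions(target_regions, deployment_region, available_regions):
    """Turn a TargetRegions parameter value into a list of regions to scan."""
    value = (target_regions or "").strip()
    if not value:
        # Empty (default): scan only the deployment region.
        return [deployment_region]
    if value.lower() == "all":
        # "all": scan every region where the assessed services are available.
        return sorted(available_regions)
    # Comma- or space-separated list; keep first occurrence of each region.
    regions = []
    for region in re.split(r"[,\s]+", value):
        if region and region not in regions:
            regions.append(region)
    return regions
```

For example, `resolve_target_regions("us-east-1 us-west-2", "eu-west-1", [])` and `resolve_target_regions("us-east-1,us-west-2", "eu-west-1", [])` both yield the same two regions.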

The HTML report includes a Region column, filter dropdown, and "Risk by Region" summary.

Upgrading an existing deployment? See Troubleshooting — it's a simple stack parameter update with no teardown.

How It Works

Deploy — CloudFormation creates CodeBuild, S3, IAM roles, and a Lambda trigger
CodeBuild runs — builds and deploys the SAM assessment stack (per account in multi-account mode)
Step Functions execute — orchestrates: S3 cleanup → IAM permission caching → resolve regions → Map state fans out per-region assessments (Bedrock, SageMaker, AgentCore in parallel) → optionally run FinServ checks → generate consolidated report
Results — HTML and CSV reports are stored in your S3 bucket

Optional: Financial Services GenAI Risk Checks (`EnableFinServAssessment`)

The 64 Financial Services (FS-XX) GenAI risk checks are opt-in and default to false. Set the EnableFinServAssessment deployment parameter to true when you want the additional Financial Services GenAI risk assessment. When enabled, the FinServ assessment Lambda runs and its findings appear in a dedicated Financial Services section of the HTML report. When left false, no FinServ findings are produced and the report omits the FinServ section entirely. The toggle is threaded into the Step Functions execution input (enableFinServ); the FinServ Lambda is always deployed but is invoked only when the flag is true.

Deployment path note. The EnableFinServAssessment parameter is wired through the CodeBuild-based deployment templates (deployment/aiml-security-single-account.yaml and deployment/2-aiml-security-codebuild.yaml), which thread it into every Step Functions start-execution call as enableFinServ. This is the supported install path. If you instead deploy aiml-security-assessment/template.yaml directly with sam deploy and start executions yourself, the state machine has no built-in trigger, so FinServ stays off unless you include "enableFinServ": "true" in the execution input you pass to StartExecution.
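
For the direct `sam deploy` path, the execution input might be built as follows. This is a minimal sketch, assuming boto3 and a known state machine ARN; the helper names and the `base_input` argument are hypothetical, and the state machine may expect other input keys that are not documented here.

```python
import json


def build_execution_input(enable_finserv, base_input=None):
    """Return the JSON string to pass as the Step Functions execution input."""
    payload = dict(base_input or {})
    if enable_finserv:
        # The flag is passed as the string "true", as described above.
        payload["enableFinServ"] = "true"
    return json.dumps(payload)


def start_assessment(state_machine_arn, enable_finserv=False, base_input=None):
    """Start one execution of the assessment state machine (hypothetical helper)."""
    import boto3  # imported lazily so build_execution_input works without boto3

    stepfunctions = boto3.client("stepfunctions")
    response = stepfunctions.start_execution(
        stateMachineArn=state_machine_arn,
        input=build_execution_input(enable_finserv, base_input),
    )
    return response["executionArn"]
```

Omitting the key when the flag is off mirrors the behavior described above, where FinServ stays off unless `enableFinServ` is supplied.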

Scope and limitations

FinServ Region scope. Core Bedrock, SageMaker, AgentCore, and optional FinServ checks use the resolved TargetRegions from the deployment parameters. FinServ findings are emitted with Region values so they appear alongside the same regional filter and per-region report views as the core service checks.
Heuristic and advisory checks. Some controls cannot be verified through an API (application-layer controls, dataset contents, resource associations); these are reported as ADVISORY/N/A and require manual review. See How finding severities are determined.
Permissions. A check that lacks an IAM permission is reported as COULD NOT ASSESS (not a failure). Re-deploy the member role after any IAM template change so newer actions take effect.

For detailed architecture, execution flow, and extension guidance, see the Developer Guide.

Viewing Results

Open your infrastructure stack in CloudFormation → Outputs tab → copy AssessmentBucket
Navigate to that S3 bucket
For single-account, open {account_id}/security_assessment_single_account_*.html
For multi-account, open consolidated-reports/security_assessment_multi_account_*.html
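
Because report names end in a `YYYYMMDD_HHMMSS` timestamp, the newest report can be picked by sorting on that suffix. The sketch below assumes boto3 for listing and 12-digit account ID prefixes; the function names are illustrative.

```python
import re

SINGLE_ACCOUNT_REPORT = re.compile(
    r"^\d{12}/security_assessment_single_account_(?P<stamp>\d{8}_\d{6})\.html$"
)
MULTI_ACCOUNT_REPORT = re.compile(
    r"^consolidated-reports/security_assessment_multi_account_(?P<stamp>\d{8}_\d{6})\.html$"
)


def latest_report_key(keys, multi_account=False):
    """Return the S3 key of the most recent HTML report, or None if there is none."""
    pattern = MULTI_ACCOUNT_REPORT if multi_account else SINGLE_ACCOUNT_REPORT
    # Timestamps in YYYYMMDD_HHMMSS form sort chronologically as plain strings.
    matches = [(m.group("stamp"), key) for key in keys if (m := pattern.match(key))]
    return max(matches)[1] if matches else None


def list_keys(bucket, prefix=""):
    """List all object keys in the assessment bucket (hypothetical helper)."""
    import boto3  # imported lazily so latest_report_key works without boto3

    paginator = boto3.client("s3").get_paginator("list_objects_v2")
    return [
        obj["Key"]
        for page in paginator.paginate(Bucket=bucket, Prefix=prefix)
        for obj in page.get("Contents", [])
    ]
```

A typical call would be `latest_report_key(list_keys(bucket_name))`, where `bucket_name` is the AssessmentBucket output value.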

Assessment Execution Process

Automatic Trigger

The AWS CodeBuild project starts automatically after central stack creation
An AWS Lambda trigger function initiates the assessment workflow

Multi-Account Orchestration

Account Discovery: AWS CodeBuild queries AWS Organizations for active accounts
Role Assumption: Assumes AIMLSecurityMemberRole in each target account
Module Deployment: Deploys the AI/ML assessment module:
- Amazon Bedrock Assessment AWS Lambda
- Amazon SageMaker AI Assessment AWS Lambda
- Amazon Bedrock AgentCore Assessment AWS Lambda
- Financial Services GenAI Risk Assessment AWS Lambda
- AWS IAM Permission Caching AWS Lambda
- Consolidated Report Generation AWS Lambda
Assessment Execution: AWS Step Functions orchestrate parallel AWS Lambda execution
Results Collection: Individual AWS Lambda functions store results in local Amazon S3 buckets
Consolidation: AWS CodeBuild collects and consolidates results from all accounts
Reporting: Generates multi-account HTML and CSV reports
Notification: Sends completion notification through Amazon SNS (if configured)
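
The account discovery and role assumption steps can be sketched as follows. This is an illustration of the pattern, not the project's code: it assumes boto3, a role without an IAM path, and a session name chosen for this example.

```python
def active_account_ids(accounts):
    """Keep only active accounts from an Organizations list_accounts result."""
    return [acct["Id"] for acct in accounts if acct.get("Status") == "ACTIVE"]


def member_role_arn(account_id, role_name="AIMLSecurityMemberRole", partition="aws"):
    """Build the ARN of the member role (assumes the role has no IAM path)."""
    return f"arn:{partition}:iam::{account_id}:role/{role_name}"


def discover_accounts():
    """Query AWS Organizations for active account IDs (hypothetical helper)."""
    import boto3  # imported lazily so the pure helpers work without boto3

    paginator = boto3.client("organizations").get_paginator("list_accounts")
    accounts = [acct for page in paginator.paginate() for acct in page["Accounts"]]
    return active_account_ids(accounts)


def assume_member_role(account_id):
    """Return a boto3 session that uses the member role's temporary credentials."""
    import boto3

    credentials = boto3.client("sts").assume_role(
        RoleArn=member_role_arn(account_id),
        RoleSessionName="aiml-security-assessment",  # illustrative session name
    )["Credentials"]
    return boto3.Session(
        aws_access_key_id=credentials["AccessKeyId"],
        aws_secret_access_key=credentials["SecretAccessKey"],
        aws_session_token=credentials["SessionToken"],
    )
```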

Monitoring and Results

Amazon S3 Bucket: Central storage for all assessment results
Amazon CloudWatch Logs: AWS CodeBuild execution logs
Amazon SNS Notifications: Email alerts on completion/failure
Amazon EventBridge Rules: Automated workflow triggers

You can check the AWS CodeBuild console to confirm the assessment completed successfully before accessing the results.

Accessing Results

Find the Amazon S3 Bucket Name:
- Navigate to AWS CloudFormation > Stacks in the AWS Console
- For single-account deployments using the standalone template (aiml-security-single-account.yaml), select the stack you deployed (for example, aiml-security-single-account). Results are synced to the assessment bucket under the {account_id}/ prefix.
- For multi-account deployments, select the aiml-security-multi-account stack created in Step 2: Deploy Central Infrastructure
- Go to the Outputs tab and find the AssessmentBucket output
- Copy the Amazon S3 bucket name
Note: The deployment creates multiple Amazon S3 buckets. Only use the bucket from the AssessmentBucket output above. Other buckets (such as aiml-sec-*-aimlassessmentbucket-* from nested stacks or aws-sam-cli-managed-* for deployment artifacts) are for internal use and can be ignored.
Navigate to the Amazon S3 Bucket:
- Go to Amazon S3 in the AWS Console
- Search for and open your assessment bucket
- For single-account deployments, open the {account_id}/ folder and then open the security_assessment_single_account_YYYYMMDD_HHMMSS.html report
- For multi-account deployments, follow the Report Structure guidance below

Report Structure

Consolidated Reports

Location: consolidated-reports/ folder in the bucket
Content: Multi-account HTML report combining all account assessments
File Format: security_assessment_multi_account_YYYYMMDD_HHMMSS.html
Features:
- Executive summary with metrics (Total, High, Medium, Low severity counts)
- Service breakdown (Amazon Bedrock, Amazon SageMaker AI, Amazon Bedrock AgentCore, Financial Services GenAI Risk)
- Priority recommendations
- Light/dark mode toggle (persists through localStorage)
- Dropdown filters for Account ID, Region, Service, Severity, Status
- Text search filter for findings
- "View Docs" buttons for reference links

Individual Account Reports

Location: Folders named with account IDs (for example, 123456789012/)
Content: Account-specific CSV and HTML files for AI/ML assessments
Files Include:
- bedrock_security_report_{execution_id}.csv - Amazon Bedrock security assessment results
- sagemaker_security_report_{execution_id}.csv - Amazon SageMaker AI security assessment results
- agentcore_security_report_{execution_id}.csv - Amazon Bedrock AgentCore security assessment results
- finserv_security_report_{execution_id}.csv - Financial Services GenAI risk assessment results (64 FS-XX checks)
- permissions_cache_{execution_id}.json - IAM permissions cache
- security_assessment_single_account_{timestamp}.html - Consolidated HTML report (same features as multi-account report)

Understanding Results

Severity	Meaning
High	Critical — immediate action required
Medium	Important — should be addressed
Low	Minor — best practice optimization
Informational	Advisory — no action required

Status	Meaning
Failed	Security issue identified
Passed	Resource meets best practice
N/A	No resources to assess or service not available in region
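
To triage a per-service CSV report outside the HTML dashboard, failed findings can be counted by severity. The sketch below is an assumption-heavy illustration: the column names `Severity` and `Status` are hypothetical, so check the header row of your CSV files and pass the real names.

```python
import csv
import io
from collections import Counter

SEVERITY_ORDER = ["High", "Medium", "Low", "Informational"]


def failed_by_severity(csv_text, severity_col="Severity", status_col="Status"):
    """Count findings with status Failed, grouped by severity label."""
    rows = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(
        row[severity_col] for row in rows if row[status_col] == "Failed"
    )
    # Report every severity level, including those with no failed findings.
    return {severity: counts.get(severity, 0) for severity in SEVERITY_ORDER}
```

Passed and N/A rows are ignored, so the result reflects only items that need remediation.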

How finding severities are determined

FinServ (FS-) check severities are assigned by a documented, reproducible methodology rather than per-check intuition. Each control is scored on two axes — Impact (harm if the control is absent) and Likelihood (probability the adverse outcome occurs given the control is absent) — and the pair is mapped to a severity via a 3×3 matrix. The labels align with the AWS Security Hub ASFF severity scale, so findings can be forwarded to Security Hub with consistent severities:

Label	ASFF normalized	Meaning
Informational	0	No actionable issue (control not applicable, advisory/manual-review, or could-not-assess context)
Low	1–39	Does not require action on its own; compensating controls exist
Medium	40–69	Should be addressed, but not urgently
High	70–89	Should be addressed as a priority

Severity is a property of the control (its inherent risk), so a check's Passed and Failed rows carry the same severity. The N/A family is fixed by disposition: not-applicable and advisory findings are Informational; could-not-assess (access-denied / unsupported region) findings are Low. Critical is reserved and not currently emitted.
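
The label ranges in the table and the N/A disposition rule can be expressed directly in code. This sketch covers only those two rules; it does not reproduce the 3×3 Impact and Likelihood matrix, which is defined in the methodology document. The function names and disposition keys are illustrative.

```python
def asff_label(normalized):
    """Map an ASFF normalized severity (0-100) to its label."""
    if not 0 <= normalized <= 100:
        raise ValueError("normalized severity must be between 0 and 100")
    if normalized == 0:
        return "Informational"
    if normalized <= 39:
        return "Low"
    if normalized <= 69:
        return "Medium"
    if normalized <= 89:
        return "High"
    # 90-100 is Critical in ASFF; this framework reserves it and does not emit it.
    return "Critical"


# Fixed severities for the N/A family, keyed by disposition (illustrative keys).
NA_FAMILY_SEVERITY = {
    "not-applicable": "Informational",
    "advisory": "Informational",
    "could-not-assess": "Low",
}


def na_family_severity(disposition):
    """Return the fixed severity for a not-applicable, advisory, or could-not-assess finding."""
    return NA_FAMILY_SEVERITY[disposition]
```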

For the full methodology (matrix, factor definitions, disposition rules) and the authoritative per-finding assignments, see FinServ Severity Methodology and the FinServ Severity Register. Mappings are preliminary — validate with your MRM/Legal/Compliance teams before relying on them as audit evidence.

Customization

Task	How
Add new accounts	Add to StackSet deployment targets
Modify permissions scope	Edit `1-aiml-security-member-roles.yaml`
Adjust concurrency	Change `ConcurrentAccountScans` parameter
Add new service checks	See Developer Guide

Permissions Required

The deployment uses multiple IAM roles with different trust and permission boundaries. They are not all read-only.

CodeBuildRole / MultiAccountCodeBuildRole: orchestration roles used by the infrastructure stack to clone the repo, build SAM, deploy/update the assessment stack, and start Step Functions executions. These roles require infrastructure-management permissions such as CloudFormation, Lambda, IAM, Step Functions, and S3 actions.
AIMLSecurityMemberRole: role assumed in the target account during single-account and multi-account runs. In the multi-account flow this role is also not read-only. It needs both service-read permissions for the checks and deployment permissions so CodeBuild can create or update the per-account SAM assessment stack.
SAM-created Lambda execution roles: runtime roles for the assessment functions. These are the closest thing to read-only assessment roles. They primarily use List*, Describe*, and Get* access against Bedrock, SageMaker, AgentCore, IAM analysis APIs, and supporting read APIs, plus S3 access to write reports and read the cached IAM permissions file.

If you need to reduce scope, review the role policies in:

Documentation

Document	Description
Security Checks Reference	Complete reference for all 116 security checks with severity levels
FinServ GenAI Risk Checks	Complete FS-01..69 reference: shared introduction, severity rubric, upstream-overlap table, compliance framework mapping, and all check definitions (Part 1 infrastructure controls, Part 2 guardrails & content safety, Part 3 app-layer controls & gaps)
FinServ Severity Methodology	Likelihood × Impact → ASFF severity model, disposition rules, and research basis for FS check severities
FinServ Severity Register	Authoritative per-finding severity assignments (the single source of truth enforced by the drift-guard test)
FinServ Compliance Mappings	Preliminary mapping of FS checks to SR 11-7, FFIEC CAT, NYDFS 500, PCI-DSS, DORA, MAS TRM, ISO 27001, ECOA, and OWASP LLM Top 10
Troubleshooting Guide	Common issues, stack identification, upgrade guide, debugging
Developer Guide	Architecture details, adding custom checks, and contributing
Cleanup Guide	Step-by-step resource removal instructions

CI/CD

GitHub Actions workflows run automatically on pull requests and selected pushes:

Workflow	Trigger	What It Checks
Python Code Quality	PR	`ruff check` and `ruff format --check` on changed Python files
AI/ML Security Assessment Tests	PR, push to `main`/`develop`	Runs the `pytest` suite (assessment functions and report pipeline) on Python 3.11 and 3.12
CloudFormation Lint	PR	Validates deployment and SAM templates with `cfn-lint`
SAM Validate & Build	PR	`sam validate --lint` and `sam build` on SAM templates
ASH Security Scan	PR	Scans for secrets, dependency vulnerabilities, and IaC misconfigurations
ASH Full Repository Scan	Push to main, monthly	Full repository security scan

Contributing

We welcome community contributions! See the Developer Guide for guidelines.

Security

See CONTRIBUTING for reporting security issues.

License

This library is licensed under the MIT-0 License. See the LICENSE file.
