feat(compliance): OWASP LLM Top 10 compliance section + Executive Scorecard #32
Closed
prancst2004 wants to merge 6 commits into
added 6 commits
June 1, 2026 18:03
…6206921018
Captures the artifacts produced by the first real-account run of the OWASP
LLM Top 10 overlay, plus a 42-assertion validation harness that confirms
every piece of feedback from the May 12 Slack thread with Agasthi has
been applied to the rendered report.
Account state at run time (us-east-1):
- 0 Bedrock guardrails / agents / KBs / custom models / flows
- 1 Bedrock managed prompt
- 2 AgentCore runtimes (example_runtime, RetailRadar_Agent)
- 2 AgentCore memories
- 3 SageMaker Unified Studio domains
- 1 ECR repo (bedrock-agentcore-retailradar_agent) with scanOnPush=false
- 0 CloudWatch alarms in AWS/Bedrock namespace
- 0 AWS Budgets scoped to Amazon Bedrock
Run output: 84 findings (66 service-level + 18 OWASP overlay), 33 failed
(13 High / 19 Medium / 1 Low), 6 passed, 45 N/A. OWASP compliance: 2/10
categories (20%).
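As a quick cross-check of the figures above, the totals can be re-derived from their parts. This is only an illustrative sketch: the variable names and the `compliance_pct` helper are hypothetical and are not taken from the harness in this PR.

```python
from collections import Counter

# Counts copied from the run output above.
failed_by_severity = Counter({"High": 13, "Medium": 19, "Low": 1})
passed, not_applicable = 6, 45
service_level, owasp_overlay = 66, 18

failed = sum(failed_by_severity.values())
total = failed + passed + not_applicable


def compliance_pct(compliant: int, categories: int) -> int:
    """Share of compliant categories as a whole percentage (hypothetical helper)."""
    return round(100 * compliant / categories)


summary = {
    "total": total,
    "failed": failed,
    # Findings by status should add up to findings by source.
    "consistent": total == service_level + owasp_overlay,
    "owasp_pct": compliance_pct(2, 10),
}
print(summary)
```

Both breakdowns (by status and by source) should give the same total, so a mismatch here would point at a counting error in the report.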
Validation harness asserts:
- 11 demo-only artifacts removed (Live AWS Evidence, Testing Summary,
What's Being Pushed, review banner, phase commit refs, etc.)
- 6 sidebar checks (only Navigation + By Service, no third compliance
group, no demo-only links)
- 4 combined-table checks (single table, 10 LLM rows + 18 OW-XX
sub-rows, prior standalone 18-extensions table is gone)
- 14 real-data checks (real account/region, no fixture resources like
qxjfofitorgf, OrderBot, SupportKB, prod-guardrail leaking in)
- 2 docs-link checks (59 unique AWS doc links + all 10 OWASP cat docs)
- 1 HTML hygiene check (tags balanced)
- 4 footer checks (real run date, real account, no demo metadata)
Last run: 42/42 PASS.
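Two of the check groups above (HTML hygiene and fixture leakage) could be sketched roughly as follows. This is an assumption about how such checks might look, not the harness shipped in this PR: `TagBalanceChecker`, `tags_balanced`, `leaked_fixtures`, and `run_checks` are hypothetical names, and only the fixture strings come from the list above.

```python
from html.parser import HTMLParser

# Void elements never take a closing tag, so they are skipped.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}

# Fixture resource names listed in the real-data checks above.
FIXTURE_NAMES = ("qxjfofitorgf", "OrderBot", "SupportKB", "prod-guardrail")


class TagBalanceChecker(HTMLParser):
    """Tracks open tags on a stack and records any mismatched close."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.errors = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
        else:
            self.errors.append(f"unexpected </{tag}>")


def tags_balanced(html: str) -> bool:
    """True when every non-void tag is closed in strict nesting order."""
    checker = TagBalanceChecker()
    checker.feed(html)
    checker.close()
    return not checker.stack and not checker.errors


def leaked_fixtures(html: str) -> list:
    """Return the fixture names that appear anywhere in the report."""
    return [name for name in FIXTURE_NAMES if name in html]


def run_checks(html: str) -> dict:
    return {
        "html_balanced": tags_balanced(html),
        "no_fixture_leak": leaked_fixtures(html) == [],
    }
```

Note that this balance check is stricter than the HTML specification, which allows some end tags (for example `</p>` and `</li>`) to be omitted; a generated report that always closes its tags would pass it.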
This is staged in docs/real-run-validation/ to keep the artifacts
co-located with the OWASP work without touching the deployed assessment
stack code on this branch.
…fficial template
Rebuilt the OWASP report to pass a 32-check regression against the official
resco sample report (sample-reports/security_assessment_single_account.html).
Approach: OWASP is exactly ONE additive section (#compliance) inserted
between #findings and #risk. All existing sections (CSS, JS, head, sidebar,
filter bars, per-service tables, metrics, methodology, footer) are
structurally identical to the official template.
Regression proves:
- CSS: byte-identical (12,555 chars)
- JS: verbatim embedded (10,119 chars) + 1 extra createServiceFilter call
- Main filter bar: official IDs unchanged (searchInput, serviceFilter, etc.)
- Service filter: Bedrock/SageMaker/AgentCore only (no OWASP — it's not a
service)
- findingsTable header: identical columns + sortable attrs
- Per-service sections: same IDs, same filter patterns, same table structure
- Overview metric labels: identical set
- Sidebar 'By Service': unchanged (3 services)
- Section order (minus OWASP): identical to official
- Scrollbar CSS + sticky headers: present
- HTML: balanced
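A minimal sketch of how the CSS and section-order checks listed above might be written. It is not the contents of `regression_check.py`: the function names are hypothetical, and it assumes top-level sections are marked up as `<section id="...">`, which the source does not confirm.

```python
import re

STYLE_RE = re.compile(r"<style[^>]*>(.*?)</style>", re.DOTALL | re.IGNORECASE)
# Assumption: each report section is a <section> element with an id attribute.
SECTION_RE = re.compile(r'<section[^>]*\sid="([^"]+)"', re.IGNORECASE)


def extract_css(html: str) -> str:
    """Concatenate the contents of every <style> block in document order."""
    return "".join(STYLE_RE.findall(html))


def css_identical(official_html: str, candidate_html: str) -> bool:
    """Byte-for-byte comparison of the embedded stylesheets."""
    official = extract_css(official_html).encode("utf-8")
    candidate = extract_css(candidate_html).encode("utf-8")
    return official == candidate


def section_ids(html: str) -> list:
    """Section ids in the order they appear in the document."""
    return SECTION_RE.findall(html)


def order_matches_minus(official_html: str, candidate_html: str,
                        extra_id: str = "compliance") -> bool:
    """Section order must match the official template once the added id is dropped."""
    candidate = [s for s in section_ids(candidate_html) if s != extra_id]
    return candidate == section_ids(official_html)
```

Comparing encoded bytes rather than normalized text is deliberate here: any whitespace or ordering change in the stylesheet counts as drift.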
The additive OWASP section contains:
- 4 summary metric cards (Categories, Compliant, Non-Compliant, N/A)
- Filter bar (Search + Status) reusing official .filter-bar class
- Single combined OWASP Top 10 table (10 rows, one per LLM01-LLM10
category) with Contributing Checks, Status, Coverage, and OWASP
reference doc links
OWASP scoring: 2 compliant, 6 non-compliant, 2 N/A (based on live account
state for 676206921018 on 2026-04-18).
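The roll-up from contributing checks to a per-category status is not spelled out here, so the sketch below uses an assumed rule: any failed contributing check makes the category Non-Compliant, otherwise at least one pass makes it Compliant, otherwise it is N/A. The function names and the sample data are hypothetical and do not reflect the real per-category results.

```python
from collections import Counter


def category_status(check_statuses) -> str:
    """Roll contributing check statuses up to one category status.

    Assumed rule (not confirmed by the source):
    any Failed -> Non-Compliant; else any Passed -> Compliant; else N/A.
    """
    if "Failed" in check_statuses:
        return "Non-Compliant"
    if "Passed" in check_statuses:
        return "Compliant"
    return "N/A"


def scorecard(categories: dict) -> dict:
    """Count categories per status and compute the compliant share."""
    counts = Counter(category_status(s) for s in categories.values())
    pct = round(100 * counts["Compliant"] / len(categories)) if categories else 0
    return {
        "Compliant": counts["Compliant"],
        "Non-Compliant": counts["Non-Compliant"],
        "N/A": counts["N/A"],
        "pct": pct,
    }


# Hypothetical categories, for illustration only.
example = {
    "CAT-A": ["Passed", "N/A"],
    "CAT-B": ["Failed", "Passed"],
    "CAT-C": ["N/A"],
    "CAT-D": [],
}
print(scorecard(example))
```

Under this rule, a category with no evaluable resources is reported as N/A rather than as compliant, which keeps empty accounts from inflating the headline percentage.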
Sidebar now has 3 groups matching Agasthi's prototype:
- Navigation: Overview, Security Findings, Risk Distribution,
Compliance Dashboard, Methodology
- By Service: Bedrock, SageMaker, AgentCore
- Compliance Frameworks: OWASP Top 10 LLM (active),
NIST AI RMF 1.0, MITRE ATLAS, HIPAA (placeholders, dimmed)
OWASP table already had .table-wrap scrollable treatment (confirmed).
Added 18 OW-XX individual check rows below the 10 LLM category rows in the OWASP compliance table (28 total rows). This overflows the 900px table-wrap container and makes the internal scrollbar clearly visible, proving the table has the same scrollable treatment as the service findings tables.
Data is real (account 676206921018, 2026-04-18):
- 3 Failed: OW-04 (log retention), OW-15 (cost controls), OW-16 (ECR scan)
- 1 Passed: OW-11 (system prompt protection)
- 14 N/A: no resources to evaluate
Removed NIST AI RMF 1.0, MITRE ATLAS, and HIPAA placeholder nav items. Compliance Frameworks group now shows only OWASP Top 10 LLM (the one active framework). Ready for PR testing.
Compact 4-cell scorecard replaces the plain page-header:
- Critical Actions: 13 (high-severity failed)
- Failed Checks: 30/66 (0% actionable pass rate)
- Service Risk: Bedrock 85% / SageMaker 71% / AgentCore 100% fail rates
with inline progress bars
- OWASP LLM Top 10: 20% (2/10 compliant) with gauge bar
No placeholders for future frameworks (per feedback). Only OWASP shown
since that's the active compliance framework.
Approved by Agasthi and Vivek in Slack thread.
Author
Closing: need to include multi-account testing before resubmitting.
Summary
Adds OWASP LLM Top 10 (2025) compliance assessment to the AI/ML Security report as a seamless additive section, with a compact Executive Scorecard widget at the top of the Overview.
What's added
Executive Scorecard (top of Overview)
Compact 4-cell header widget answering 'how bad is it?' at a glance:
Compliance Section (#compliance)
- … max-height: 900px)
- Compliance Frameworks nav group with OWASP Top 10 LLM
Formatting guarantee
- … createServiceFilter() call
- … regression_check.py) proves zero drift
Real account data
Account 676206921018 (us-east-1, April 18 2026):
Files
- docs/real-run-validation/security_assessment_owasp_676206921018.html
- docs/real-run-validation/build_owasp_report.py
- docs/real-run-validation/regression_check.py
- docs/real-run-validation/README.md
Reviewers
Approved in Slack by @agasthik and @vivekml (June 3, 2026).