
feat(compliance): OWASP LLM Top 10 compliance section + Executive Scorecard#32

Closed
prancst2004 wants to merge 6 commits into aws-samples:main from prancst2004:feature/owasp-real-run-validation

Conversation

@prancst2004

Summary

Adds an OWASP LLM Top 10 (2025) compliance assessment to the AI/ML Security report as a single additive section, plus a compact Executive Scorecard widget at the top of the Overview.

What's added

Executive Scorecard (top of Overview)

Compact 4-cell header widget answering 'how bad is it?' at a glance:

  • Critical Actions: high-severity failed count
  • Failed Checks: failed/total ratio + pass rate
  • Service Risk: per-service fail rates with inline progress bars
  • OWASP LLM Top 10: compliance percentage + gauge
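
As a rough illustration of how the four cells could be derived from a findings list, here is a minimal Python sketch. The `Finding` record, the `scorecard` function, and the choice to count N/A findings in the denominators are all assumptions made for this example. The PR does not show how `build_owasp_report.py` computes these values.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """Hypothetical finding record; field names are assumptions."""
    service: str   # e.g. "Bedrock", "SageMaker", "AgentCore"
    severity: str  # "High", "Medium", or "Low"
    status: str    # "Failed", "Passed", or "N/A"


def scorecard(findings, compliant_categories, total_categories=10):
    """Derive the four scorecard cells from a list of findings."""
    failed = [f for f in findings if f.status == "Failed"]
    passed = [f for f in findings if f.status == "Passed"]
    totals = Counter(f.service for f in findings)
    fails = Counter(f.service for f in failed)
    return {
        # Critical Actions: failed findings with High severity
        "critical_actions": sum(f.severity == "High" for f in failed),
        # Failed Checks: failed/total ratio plus pass rate
        "failed_checks": f"{len(failed)}/{len(findings)}",
        "pass_rate_pct": round(100 * len(passed) / len(findings)) if findings else 0,
        # Service Risk: per-service fail rate (assumed denominator: all findings)
        "service_fail_pct": {s: round(100 * fails[s] / n) for s, n in totals.items()},
        # OWASP LLM Top 10: compliant categories as a percentage
        "owasp_pct": round(100 * compliant_categories / total_categories),
    }
```

If the real generator excludes N/A findings from the denominators, the percentages will differ from what this sketch produces.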

Compliance Section (#compliance)

  • Single combined OWASP Top 10 table (10 LLM category rows + 18 OW-XX individual check detail rows)
  • Summary metric cards (Categories, Compliant, Non-Compliant, N/A)
  • Filter bar (Search + Status) with internal scrollbar (max-height: 900px)
  • Sidebar: Compliance Frameworks nav group with OWASP Top 10 LLM
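
One plausible way to roll the OW-XX check results up into per-category statuses is sketched below. The roll-up rule (any failed check makes the category Non-Compliant, otherwise any passed check makes it Compliant, otherwise N/A) and the check-to-category mapping are assumptions for illustration only. The PR does not state the rule or the mapping it uses.

```python
def category_status(check_statuses):
    """Roll contributing check statuses up to one category status (assumed rule)."""
    if "Failed" in check_statuses:
        return "Non-Compliant"
    if "Passed" in check_statuses:
        return "Compliant"
    return "N/A"


def compliance_summary(category_checks, check_results):
    """Return per-category statuses, the compliant count, and the percentage."""
    statuses = {
        category: category_status([check_results.get(c, "N/A") for c in checks])
        for category, checks in category_checks.items()
    }
    compliant = sum(status == "Compliant" for status in statuses.values())
    percent = round(100 * compliant / len(statuses)) if statuses else 0
    return statuses, compliant, percent


# Illustrative mapping only; not the PR's actual check-to-category mapping.
example_mapping = {"LLM-A": ["OW-01", "OW-02"], "LLM-B": ["OW-03"], "LLM-C": ["OW-04"]}
example_results = {"OW-01": "Passed", "OW-02": "N/A", "OW-03": "Failed"}
statuses, compliant, percent = compliance_summary(example_mapping, example_results)
```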

Formatting guarantee

  • CSS: byte-identical to official template
  • JS: official script embedded verbatim + 1 extra createServiceFilter() call
  • All existing sections (Overview metrics, Security Findings, Risk Distribution, per-service tables, Methodology) structurally unchanged
  • 32-check regression harness (regression_check.py) reports no structural drift from the official template
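
The first two guarantees could be checked along the following lines. This is a sketch under stated assumptions, not the contents of `regression_check.py`: it assumes each report has a single inline `<style>` block and a single inline `<script>` block, and the function names are hypothetical.

```python
import re

STYLE_RE = re.compile(r"<style[^>]*>(.*?)</style>", re.DOTALL | re.IGNORECASE)
SCRIPT_RE = re.compile(r"<script[^>]*>(.*?)</script>", re.DOTALL | re.IGNORECASE)


def first_block(pattern, html):
    """Return the body of the first matching block, or an empty string."""
    match = pattern.search(html)
    return match.group(1) if match else ""


def check_formatting(official_html, generated_html):
    """Compare CSS and JS of a generated report against the official template."""
    official_css = first_block(STYLE_RE, official_html)
    official_js = first_block(SCRIPT_RE, official_html).strip()
    generated_js = first_block(SCRIPT_RE, generated_html)
    return {
        # CSS must match exactly, character for character
        "css_identical": first_block(STYLE_RE, generated_html) == official_css,
        # The official script must appear unmodified inside the generated one
        "js_embedded_verbatim": bool(official_js) and official_js in generated_js,
        # Number of createServiceFilter( occurrences beyond the official script
        "extra_filter_calls": generated_js.count("createServiceFilter(")
        - official_js.count("createServiceFilter("),
    }
```

With this check, the PR's claim corresponds to `css_identical` and `js_embedded_verbatim` both being true and `extra_filter_calls` being 1.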

Real account data

Account 676206921018 (us-east-1, April 18, 2026):

  • Service findings: 66 (Bedrock=13, SageMaker=34, AgentCore=19)
  • OWASP: 3 Failed (OW-04, OW-15, OW-16), 1 Passed (OW-11), 14 N/A
  • OWASP Top 10: 2/10 compliant (20%)

Files

| File | Purpose |
| --- | --- |
| docs/real-run-validation/security_assessment_owasp_676206921018.html | Rendered report (208 KB) |
| docs/real-run-validation/build_owasp_report.py | Generator |
| docs/real-run-validation/regression_check.py | 32-check regression harness |
| docs/real-run-validation/README.md | Context |

Reviewers

Approved in Slack by @agasthik and @vivekml (June 3, 2026).

Pranjit Biswas added 6 commits June 1, 2026 18:03
…6206921018

Captures the artifacts produced by the first real-account run of the OWASP
LLM Top 10 overlay, plus a 42-assertion validation harness that confirms
every piece of feedback from the May 12 Slack thread with Agasthi has
been applied to the rendered report.

Account state at run time (us-east-1):
  - 0 Bedrock guardrails / agents / KBs / custom models / flows
  - 1 Bedrock managed prompt
  - 2 AgentCore runtimes (example_runtime, RetailRadar_Agent)
  - 2 AgentCore memories
  - 3 SageMaker Unified Studio domains
  - 1 ECR repo (bedrock-agentcore-retailradar_agent) with scanOnPush=false
  - 0 CloudWatch alarms in AWS/Bedrock namespace
  - 0 AWS Budgets scoped to Amazon Bedrock

Run output: 84 findings (66 service-level + 18 OWASP overlay), 33 failed
(13 High / 19 Medium / 1 Low), 6 passed, 45 N/A. OWASP compliance: 2/10
categories (20%).
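
The reported tallies can be cross-checked with a few lines of Python. The numbers are copied from the run output above; the dictionary layout is only an illustration.

```python
run = {
    "service_findings": 66,
    "owasp_findings": 18,
    "failed": {"High": 13, "Medium": 19, "Low": 1},
    "passed": 6,
    "not_applicable": 45,
}

total = run["service_findings"] + run["owasp_findings"]
failed = sum(run["failed"].values())

# Findings split by source and by status must add up to the same total.
assert total == 84
assert failed == 33
assert failed + run["passed"] + run["not_applicable"] == total

# 2 of 10 OWASP categories compliant
owasp_pct = round(100 * 2 / 10)
assert owasp_pct == 20
```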

Validation harness asserts:
  - 11 demo-only artifacts removed (Live AWS Evidence, Testing Summary,
    What's Being Pushed, review banner, phase commit refs, etc.)
  - 6 sidebar checks (only Navigation + By Service, no third compliance
    group, no demo-only links)
  - 4 combined-table checks (single table, 10 LLM rows + 18 OW-XX
    sub-rows, prior standalone 18-extensions table is gone)
  - 14 real-data checks (real account/region, no fixture resources like
    qxjfofitorgf, OrderBot, SupportKB, prod-guardrail leaking in)
  - 2 docs-link checks (59 unique AWS doc links + all 10 OWASP cat docs)
  - 1 HTML hygiene check (tags balanced)
  - 4 footer checks (real run date, real account, no demo metadata)

Last run: 42/42 PASS.
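
For the HTML hygiene assertion, a tag-balance check could look like the sketch below, built on the standard library's `html.parser`. It is an assumption about how such a check might be written, not the harness's actual code, and it is deliberately strict: HTML that omits optional end tags (for example `</li>`) would be reported as unbalanced.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag in HTML
VOID_TAGS = {
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr",
}


class TagBalanceChecker(HTMLParser):
    """Track open tags on a stack and record mismatched closing tags."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.errors = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
        else:
            self.errors.append(f"unexpected </{tag}>")


def tags_balanced(html):
    """Return True when every opened tag is closed in the right order."""
    checker = TagBalanceChecker()
    checker.feed(html)
    checker.close()
    return not checker.errors and not checker.stack
```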

This is staged in docs/real-run-validation/ to keep the artifacts
co-located with the OWASP work without touching the deployed assessment
stack code on this branch.
…fficial template

Rebuilt the OWASP report to pass a 32-check regression against the official
resco sample report (sample-reports/security_assessment_single_account.html).

Approach: OWASP is exactly ONE additive section (#compliance) inserted
between #findings and #risk. All existing sections (CSS, JS, head, sidebar,
filter bars, per-service tables, metrics, methodology, footer) are
structurally identical to the official template.

Regression proves:
  - CSS: byte-identical (12,555 chars)
  - JS: verbatim embedded (10,119 chars) + 1 extra createServiceFilter call
  - Main filter bar: official IDs unchanged (searchInput, serviceFilter, etc.)
  - Service filter: Bedrock/SageMaker/AgentCore only (no OWASP — it's not a
    service)
  - findingsTable header: identical columns + sortable attrs
  - Per-service sections: same IDs, same filter patterns, same table structure
  - Overview metric labels: identical set
  - Sidebar 'By Service': unchanged (3 services)
  - Section order (minus OWASP): identical to official
  - Scrollbar CSS + sticky headers: present
  - HTML: balanced
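
The "section order (minus OWASP)" check can be illustrated with a short sketch. It assumes the report marks each section as a `<section id="...">` element, which the PR does not confirm; the helper names are hypothetical.

```python
import re

# Assumption: every report section is a <section> element with an id attribute.
SECTION_ID_RE = re.compile(r'<section[^>]*\bid="([^"]+)"', re.IGNORECASE)


def section_ids(html):
    """Return section ids in document order."""
    return SECTION_ID_RE.findall(html)


def is_additive(official_html, generated_html, added_id="compliance"):
    """True if the generated report equals the official one plus one new section."""
    official = section_ids(official_html)
    generated = section_ids(generated_html)
    without_added = [s for s in generated if s != added_id]
    return generated.count(added_id) == 1 and without_added == official


def sits_between(generated_html, added_id, before_id, after_id):
    """True if added_id appears directly after before_id and before after_id."""
    ids = section_ids(generated_html)
    if added_id not in ids:
        return False
    i = ids.index(added_id)
    return 0 < i < len(ids) - 1 and ids[i - 1] == before_id and ids[i + 1] == after_id
```

Under these assumptions, the approach described above corresponds to `is_additive(...)` being true and `sits_between(generated, "compliance", "findings", "risk")` being true.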

The additive OWASP section contains:
  - 4 summary metric cards (Categories, Compliant, Non-Compliant, N/A)
  - Filter bar (Search + Status) reusing official .filter-bar class
  - Single combined OWASP Top 10 table (10 rows, one per LLM01-LLM10
    category) with Contributing Checks, Status, Coverage, and OWASP
    reference doc links

OWASP scoring: 2 compliant, 6 non-compliant, 2 N/A (based on live account
state for 676206921018 on 2026-04-18).

Sidebar now has 3 groups matching Agasthi's prototype:
  - Navigation: Overview, Security Findings, Risk Distribution,
    Compliance Dashboard, Methodology
  - By Service: Bedrock, SageMaker, AgentCore
  - Compliance Frameworks: OWASP Top 10 LLM (active),
    NIST AI RMF 1.0, MITRE ATLAS, HIPAA (placeholders, dimmed)

OWASP table already had .table-wrap scrollable treatment (confirmed).

Added 18 OW-XX individual check rows below the 10 LLM category rows in the
OWASP compliance table (28 total rows). This overflows the 900px table-wrap
container, so the internal scrollbar is visible and shows that the table has
the same scrollable treatment as the service findings tables.

Data is real (account 676206921018, 2026-04-18):
  - 3 Failed: OW-04 (log retention), OW-15 (cost controls), OW-16 (ECR scan)
  - 1 Passed: OW-11 (system prompt protection)
  - 14 N/A: no resources to evaluate

Removed NIST AI RMF 1.0, MITRE ATLAS, and HIPAA placeholder nav items.
Compliance Frameworks group now shows only OWASP Top 10 LLM (the one
active framework). Ready for PR testing.

Compact 4-cell scorecard replaces the plain page-header:
  - Critical Actions: 13 (high-severity failed)
  - Failed Checks: 30/66 (0% actionable pass rate)
  - Service Risk: Bedrock 85% / SageMaker 71% / AgentCore 100% fail rates
    with inline progress bars
  - OWASP LLM Top 10: 20% (2/10 compliant) with gauge bar

No placeholders for future frameworks (per feedback). Only OWASP shown
since that's the active compliance framework.

Approved by Agasthi and Vivek in Slack thread.
@github-actions (bot) added the documentation label on Jun 4, 2026
@agasthik added the enhancement label on Jun 4, 2026
@prancst2004

Author

Closing — need to include multi-account testing before resubmitting.

@prancst2004 closed this on Jun 4, 2026