realm-security/agent-union-type

Embracing Uncertainty with AI Agents: Union-type Structured Output

A practical guide to building a Vulnerability Assessment triage Agent.

TLDR: We show that union-type structured output lets AI agents report uncertain outcomes explicitly, which is critical for auditable and accurate vulnerability triage.

This code is a companion to our technical blog post, published by Realm.Security.

Contents

  • vuln_agent CLI runs the Pydantic AI agent to evaluate a pip-audit scan.

  • src/vuln_agent/ package contains the agent, code search, and Pydantic AI definitions.

  • vuln_demo/ directory contains a Python package with a PyJWT vulnerability (CVE-2022-29217).

Setup

This demonstration requires pip-audit, ripgrep and uv.

brew install pip-audit
brew install ripgrep
brew install uv

Depending on your LLM of choice, change the MODEL_ID variable in agent.py to match a KnownModelName literal string. Make sure the credentials for that provider are set as environment variables when you call the CLI through uv.

Usage

The following steps illustrate the usage of the vuln_agent CLI tool for assessing a Python package for critical vulnerabilities.

Collect telemetry

We use OpenTelemetry (OTEL) to emit logs and traces that make the AI agent's actions and outcomes observable. For the demo, we recommend Jaeger as a simple frontend for viewing the results.

docker run --rm --name jaeger \
  -p 16686:16686 \
  -p 4317:4317 \
  -p 4318:4318 \
  -p 5778:5778 \
  -p 9411:9411 \
  cr.jaegertracing.io/jaegertracing/jaeger:2.11.0

Export the following environment variable to connect to the local Jaeger frontend.

export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
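Port 4318 is the OTLP/HTTP port exposed by the Jaeger container above (4317 is the gRPC one). As a rough illustration of what the variable means, and not a description of how this repository configures its exporter, the sketch below derives the traces URL that an OTLP/HTTP exporter conventionally posts to. The helper name otlp_traces_url is hypothetical.

```python
import os


def otlp_traces_url(environ=os.environ) -> str:
    """Return the OTLP/HTTP traces URL implied by the base endpoint.

    Assumption: the exporter follows the OTLP/HTTP convention of appending
    the signal path "v1/traces" to OTEL_EXPORTER_OTLP_ENDPOINT.
    """
    base = environ.get("OTEL_EXPORTER_OTLP_ENDPOINT", "http://localhost:4318")
    return base.rstrip("/") + "/v1/traces"


if __name__ == "__main__":
    print(otlp_traces_url())
```

If traces do not show up in Jaeger, checking that this URL is reachable from the machine running the CLI is a reasonable first step.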

Scan a Python package

Use pip-audit to scan a package and produce a JSON scan result. The vuln_demo package provides an example of vulnerable PyJWT usage for testing the agent.

pip-audit -s osv -l -f json -o vuln_demo/scan_results.json vuln_demo/
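To see what the agent receives, it can help to read the scan result yourself. The sketch below assumes the JSON layout used by recent pip-audit releases (a top-level "dependencies" list whose entries carry "name", "version" and "vulns"); older releases may differ. The embedded sample is hand-written for illustration, including the placeholder id and the hypothetical example-lib package, and is not real scanner output.

```python
import json

# Hand-written sample in the assumed pip-audit JSON layout (not real output).
SAMPLE_REPORT = """
{
  "dependencies": [
    {"name": "pyjwt", "version": "2.3.0", "vulns": [
      {"id": "EXAMPLE-ID", "fix_versions": ["2.4.0"],
       "aliases": ["CVE-2022-29217"], "description": "..."}
    ]},
    {"name": "example-lib", "version": "1.0.0", "vulns": []}
  ],
  "fixes": []
}
"""


def list_findings(report: dict) -> list[dict]:
    """Flatten the report into one record per (package, vulnerability)."""
    findings = []
    for dep in report.get("dependencies", []):
        for vuln in dep.get("vulns", []):
            findings.append({
                "package": dep["name"],
                "version": dep["version"],
                "id": vuln["id"],
                "aliases": vuln.get("aliases", []),
                "fix_versions": vuln.get("fix_versions", []),
            })
    return findings


if __name__ == "__main__":
    for finding in list_findings(json.loads(SAMPLE_REPORT)):
        print(finding)
```

To inspect the demo scan, load vuln_demo/scan_results.json with json.load instead of using the embedded sample.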

Run the Agent

Provide the scan results as the input to the vuln_agent CLI.

uv run vuln_agent assess --input-file vuln_demo/scan_results.json

Include a search path for more confident results. This lets the LLM search the code through a read-only ripgrep tool that is bounded to the given directory. The tool is disabled when no path is provided.

uv run vuln_agent assess --input-file vuln_demo/scan_results.json --search-path vuln_demo
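One way to bound a search tool to a directory is sketched below. This is a minimal illustration, not the repository's implementation: the function name build_rg_command and its parameters are hypothetical. It resolves the requested path, refuses anything outside the search root, and returns an argument list that can be passed to subprocess.run without a shell.

```python
from pathlib import Path


def build_rg_command(pattern: str, search_root: str, relative_path: str = ".") -> list[str]:
    """Build a ripgrep command confined to search_root.

    Raises ValueError if relative_path resolves outside the root,
    for example through ".." segments or an absolute path.
    """
    root = Path(search_root).resolve()
    target = (root / relative_path).resolve()
    if target != root and root not in target.parents:
        raise ValueError(f"{relative_path!r} escapes the search root")
    return [
        "rg",
        "--line-number",    # include line numbers in matches
        "--no-heading",     # one "path:line:text" record per match
        "--fixed-strings",  # treat the pattern as a literal, not a regex
        "--",               # stop option parsing before the pattern
        pattern,
        str(target),
    ]


if __name__ == "__main__":
    # Pass the list to subprocess.run(cmd, capture_output=True, text=True)
    # to execute it; no shell is involved, so the pattern is never interpolated.
    print(build_rg_command("jwt.decode", "vuln_demo"))
```

Because ripgrep only reads files, confining the target path is the main safeguard needed here. Treating the pattern as a fixed string is an extra, optional restriction.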

Example vuln_demo package

The vuln_demo package illustrates the value of union-type structured output for AI agents (as discussed in our technical blog post): it gracefully handles the case of insufficient information. When the code search path is not included in the CLI arguments, the agent cannot assess whether the PyJWT vulnerability is exploitable in the application. Instead of hallucinating a falsely confident response, it admits the uncertainty and escalates the case to a security expert. This design pattern ensures that agent uncertainty results in the correct action, and it provides a detailed structured-output analysis for that case.
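The idea can be sketched with plain standard-library types. The outcome names UnableToAssess and CriticalVulnerability come from this README; the field names are assumptions inferred from the printed output below, and the repository's actual definitions use Pydantic AI rather than dataclasses. The point is that "I cannot tell" is a first-class, structured result that the caller must handle, not free text hidden inside a confident answer.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class CriticalVulnerability:
    # Field names are illustrative guesses based on the CLI output.
    cve_id: str
    package: str
    severity_score: float
    priority: int
    exploitability: str


@dataclass
class UnableToAssess:
    # The "uncertain" branch carries its own structured explanation.
    justification: str
    uncertainty: str
    flagged_cves: list[str]
    recommended_action: str


# The union is the agent's declared output type: exactly one branch is returned.
AssessmentOutcome = Union[CriticalVulnerability, UnableToAssess]


def summarize(outcome: AssessmentOutcome) -> str:
    """Route each branch of the union to a different action."""
    if isinstance(outcome, UnableToAssess):
        cves = ", ".join(outcome.flagged_cves)
        return f"MANUAL REVIEW REQUIRED ({outcome.uncertainty}): {cves}"
    return f"CRITICAL: {outcome.cve_id} in {outcome.package} (priority {outcome.priority})"


if __name__ == "__main__":
    print(summarize(UnableToAssess(
        justification="Code search is disabled.",
        uncertainty="insufficient_context",
        flagged_cves=["CVE-2022-29217"],
        recommended_action="Audit all jwt.decode() calls.",
    )))
```

Because the two branches are distinct types, downstream code can send UnableToAssess results to a human review queue and CriticalVulnerability results to a patching workflow, without parsing prose.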

UnableToAssess outcome

Assessment Complete:
------------------------------------------------------------
MANUAL REVIEW REQUIRED: Unable to complete assessment.
  - Justification: Code search functionality is disabled, preventing analysis of how
  PyJWT is actually used in the application. CVE-2022-29217 is an algorithm confusion
  vulnerability (CVSS 7.4) in PyJWT 2.3.0 that allows attackers to bypass signature
  verification by using HMAC with a public key when
  jwt.algorithms.get_default_algorithms() is used. However, exploitability depends
  critically on:

1. Whether jwt.decode() is called with algorithms=jwt.algorithms.get_default_algorithms()
or a broad algorithm list
2. Whether the application uses asymmetric keys (EdDSA/ECDSA) for JWT validation
3. Whether public keys are exposed that could be abused for HMAC signing

The vulnerability is NOT exploitable if the application:
- Explicitly specifies allowed algorithms (e.g., algorithms=['RS256'])
- Only uses HMAC with symmetric secrets
- Does not use EdDSA or ECDSA keys in SSH format

Without access to the codebase to verify JWT usage patterns, authentication
implementation, and algorithm configuration, I cannot determine if the vulnerable
code path is reachable. Given this is an internet-facing production background worker
with authentication required (suggesting JWT tokens may be in use), the risk could be
significant if improperly configured.
  - Uncertainty: insufficient_context
  - Flagged CVEs: CVE-2022-29217

Recommended Action:
  1. Manually audit all jwt.decode() calls in the codebase to verify that explicit
  algorithm lists are used (e.g., algorithms=['RS256'] instead of
  get_default_algorithms())
  2. Check if EdDSA or ECDSA keys in SSH format are used for JWT validation
  3. Review authentication middleware and helper functions that may wrap jwt.decode()
  4. If vulnerable patterns are found, upgrade PyJWT from 2.3.0 to 2.4.0+ immediately
  5. Implement code scanning rules to prevent use of get_default_algorithms() in
  production code

CriticalVulnerability outcome

Assessment Complete:
------------------------------------------------------------
CRITICAL: CVE-2022-29217 in pyjwt
  - Severity Score: 7.4
  - Package: pyjwt (Current: 2.3.0, Fixed: 2.4.0)
  - Priority: 1 (Patch immediately)
  - Public Exploit: Yes

Business Impact:
  Complete authentication bypass allowing unauthorized access to the background worker
  system. Attackers can forge arbitrary JWT tokens to impersonate any user, execute
  unauthorized operations, access sensitive data processed by the worker, and
  potentially pivot to other AWS resources. This is a critical security control
  failure in a production environment.

Exploitability:
  The application is CRITICALLY vulnerable to algorithm confusion attacks. Code
  analysis reveals that app.py lines 5-8 use jwt.decode() with
  algorithms=jwt.algorithms.get_default_algorithms(), which is the exact vulnerable
  pattern described in CVE-2022-29217. The verify_token() function accepts a token
  and public_key parameter, making it callable from external code. An attacker can
  craft a JWT token using HMAC (HS256) signed with the application's public key (which
  is typically publicly accessible for JWT verification). When the application calls
  verify_token() with this malicious token, it will accept HMAC as a valid algorithm
  and use the public key as the HMAC secret, allowing the attacker to forge valid JWT
  tokens and bypass authentication entirely. Given that authentication_required=true
  and internet_facing=true in the deployment context, this enables complete
  authentication bypass for a production background worker on AWS ECS.
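
The agent's reasoning above rests on the general "algorithm confusion" weakness. The toy below reproduces that class of bug using only the standard library, so it can be run without PyJWT. It is a simplified illustration of the attack class, not PyJWT's code and not the exact conditions of CVE-2022-29217; all function names are hypothetical, and the public key is a stand-in byte string.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_hs256(payload: dict, secret: bytes) -> str:
    """Create a compact HS256 token (what the attacker does with the public key)."""
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = b64url(json.dumps(payload).encode())
    signature = hmac.new(secret, f"{header}.{body}".encode(), hashlib.sha256).digest()
    return f"{header}.{body}.{b64url(signature)}"


def verify(token: str, key: bytes, allowed_algorithms: set[str]) -> dict:
    """Toy verifier that trusts the token header to pick the algorithm."""
    header, body, signature = token.split(".")
    alg = json.loads(b64url_decode(header))["alg"]
    if alg not in allowed_algorithms:
        raise ValueError(f"algorithm {alg!r} is not allowed")
    if alg == "HS256":
        expected = hmac.new(key, f"{header}.{body}".encode(), hashlib.sha256).digest()
        if not hmac.compare_digest(expected, b64url_decode(signature)):
            raise ValueError("bad signature")
        return json.loads(b64url_decode(body))
    raise NotImplementedError("asymmetric verification is omitted in this toy")


if __name__ == "__main__":
    public_key = b"stand-in for a published public key"
    forged = sign_hs256({"sub": "admin"}, public_key)
    # Broad algorithm list: the forged token is accepted.
    print(verify(forged, public_key, {"HS256", "RS256"}))
    # Explicit list without HMAC: the forged token is rejected.
    try:
        verify(forged, public_key, {"RS256"})
    except ValueError as err:
        print("rejected:", err)
```

This is why the recommended action in the UnableToAssess example asks reviewers to confirm that every jwt.decode() call passes an explicit algorithm list.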

About

Demonstrates union-type structured output technique to triage vulnerability assessments with Pydantic AI
