Note
Shaart: 96.15% success rate on hint-free, source-aware XBOW benchmark. View Results →
Autonomous AI Penetration Testing • Real Exploits, Not Alerts
When you launch Shaart, you're greeted with a retro-futuristic CRT terminal experience:
BIGMAC-ATTACK CORP :: SECURITY TERMINAL v1.0.0
══════════════════════════════════════════════════════════════════════
> INITIALIZING SYSTEM...
> LOADING BIOS... OK
> CHECKING MEMORY... 64K OK
> MOUNTING DRIVES... OK
> BOOTING SECURITY SUBSYSTEM...
> LOADING MU/TH/UR 6000 PROTOCOLS...
> INITIALIZING AI MODULES...
[SHAART ASCII-art logo]
> SECURITY HUNTING AI AGENT FOR RECON & TESTING
> SYSTEM STATUS........................... [ACTIVE]
> AUTHORIZATION................. [REQUIRED - READ ONLY]
⚠ WARNING: DEFENSIVE SECURITY OPERATIONS ONLY
⚠ UNAUTHORIZED ACCESS PROHIBITED :: 18 U.S.C. § 1030
> AWAITING TARGET CONFIGURATION...
Skip the animation in CI/CD: SHAART_SKIP_ANIMATION=true
Your dev team ships code daily with Claude Code and Cursor. But security testing? That's a once-a-year event.
364 days of vulnerability exposure while you wait for the next penetration test.
This gap is where breaches happen. By the time you discover a critical SQL injection or auth bypass, it's been in production for months, or worse, already exploited.
Shaart is an autonomous AI pentester that executes real exploits against your running application.
Not a scanner. Not a linter. An actual penetration tester that:
- Reads your source code to understand attack vectors
- Launches a browser to exploit your live application
- Proves vulnerabilities with working exploits
- Delivers pentester-quality reports with proof-of-concepts
Ship with confidence. Test every build. Close the security gap.
Traditional scanners flag "potential" vulnerabilities. Shaart proves them.
- ❌ Scanner: "This endpoint might be vulnerable to SQL injection"
- ✅ Shaart: "SQL injection confirmed. Here's the database dump. Here's the exact command."
No exploit = No report. If Shaart can't prove it works, it doesn't waste your time.
White-box analysis tells Shaart where the vulnerabilities hide. Black-box exploitation proves they're actually exploitable.
This hybrid approach finds what pure scanners miss and validates what static analysis only guesses.
Launch one command. Shaart's AI agents handle the rest:
┌─────────────────────────────────────────┐
│            1. RECONNAISSANCE            │
│        Maps your attack surface         │
│    Tech stack, endpoints, auth flows    │
└────────────────────┬────────────────────┘
                     │
          ┌──────────┴──────────┐
          ▼                     ▼
    ┌────────────┐        ┌────────────┐
    │  2. VULN   │        │  2. VULN   │
    │   AGENTS   │  ...   │   AGENTS   │
    │ (PARALLEL) │        │ (PARALLEL) │
    └─────┬──────┘        └─────┬──────┘
          │                     │
          ▼                     ▼
    ┌────────────┐        ┌────────────┐
    │ 3. EXPLOIT │        │ 3. EXPLOIT │
    │   AGENTS   │  ...   │   AGENTS   │
    │ (PARALLEL) │        │ (PARALLEL) │
    └─────┬──────┘        └─────┬──────┘
          │                     │
          └──────────┬──────────┘
                     ▼
             ┌────────────────┐
             │  4. REPORTING  │
             │  Proven Vulns  │
             │      Only      │
             └────────────────┘
Parallelized for speed. Multiple vulnerability categories tested simultaneously.
Most automated tools fail at login. Shaart doesn't.
- ✅ Multi-Factor Authentication (TOTP/2FA): Auto-generates time-based codes
- ✅ OAuth & SSO: "Sign in with Google" and federated identity
- ✅ Custom Workflows: Step-by-step login instructions you define
- ✅ Session Management: Stays authenticated throughout the entire test
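Time-based codes for 2FA follow the standard RFC 6238 algorithm. How Shaart generates them internally is not shown here; the block below is only a minimal sketch of the algorithm itself using the Python standard library. The function name `totp` and its parameters are illustrative, and the secret is the RFC test value, not a real credential.

```python
import base64
import hashlib
import hmac
import struct
import time


def totp(secret_b32, at=None, digits=6, period=30):
    """Return the RFC 6238 TOTP code (HMAC-SHA1) for a base32 secret."""
    key = base64.b32decode(secret_b32.upper())
    # Number of whole periods elapsed since the Unix epoch.
    now = time.time() if at is None else at
    counter = int(now // period)
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation (RFC 4226): the low nibble of the last byte is the offset.
    offset = digest[-1] & 0x0F
    value = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


if __name__ == "__main__":
    # Base32 encoding of the RFC 6238 test key "12345678901234567890".
    print(totp("GEZDGNBVGY3TQOJQGEZDGNBVGY3TQOJQ"))
```

A code like this is only valid for one `period` (30 seconds by default), which is why a tool that logs in repeatedly needs the shared secret rather than a single code.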
Shaart augments AI reasoning with proven security tools:
- Nmap: Port scanning, service fingerprinting
- Subfinder: Subdomain discovery and enumeration
- WhatWeb: Technology and framework detection
- Schemathesis: OpenAPI/Swagger API fuzzing
Stop drowning in false positives. Shaart delivers actionable findings:
- Copy-paste proof-of-concepts: Commands that reproduce the exploit
- Business impact analysis: What an attacker could actually do
- Code-level remediation: Exact fixes for your tech stack
- OWASP mapping: Industry-standard vulnerability classification
- Injection Attacks: SQL injection, command injection, NoSQL injection
- Cross-Site Scripting (XSS): Reflected, stored, DOM-based
- Authentication Bypass: Login circumvention, session hijacking, JWT attacks
- Authorization Failures: Privilege escalation, IDOR, missing access controls
- Server-Side Request Forgery (SSRF): Internal network access, cloud metadata exploitation
- Business Logic Vulnerabilities (#8): Workflow bypass, rate limit evasion, price manipulation
- API Security Testing (#7): REST/GraphQL native exploitation
- Expanded Injection Coverage (#9): LDAP, XML/XPath, XXE
- File Upload Exploitation (#11): Polyglot files, MIME bypass
- Blind Exploitation (#10): Time-based detection, DNS exfiltration
Issue #32: GitHub integration for complete vulnerability lifecycle management
- Auto-create GitHub issues for each discovered vulnerability
- Remediation guidance with multiple code-level fix options
- Retest workflow to verify fixes automatically
- Delta reporting to track new vs. fixed vulnerabilities over time
- Project board integration for remediation tracking
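Delta reporting is a planned feature, so its actual design is not known. As a rough illustration of the idea only, comparing two runs can be reduced to set differences over findings. The `Finding` type and `delta` function below are hypothetical names invented for this sketch, not part of Shaart.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Finding:
    """Hypothetical identity of a finding: category plus affected endpoint."""
    category: str
    endpoint: str


def delta(previous, current):
    """Classify findings as new, fixed, or still open between two runs."""
    previous, current = set(previous), set(current)
    return {
        "new": sorted(current - previous),
        "fixed": sorted(previous - current),
        "open": sorted(previous & current),
    }


if __name__ == "__main__":
    last_run = [Finding("sqli", "POST /api/login"), Finding("idor", "GET /api/cart")]
    this_run = [Finding("idor", "GET /api/cart"), Finding("ssrf", "POST /api/fetch")]
    for status, findings in delta(last_run, this_run).items():
        print(status, findings)
```

The sketch treats two findings as the same when category and endpoint match; a real implementation would need a more robust notion of identity.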
- Docker (recommended deployment method)
- Claude API access (Console account with credits or API key)
docker build -t shaart:latest .
Shaart needs access to your application's source code:
# Clone your repository
git clone https://github.com/your-org/your-app.git repos/your-app
# Or for multi-repo apps, organize in one folder:
mkdir -p repos/your-app
cd repos/your-app
git clone https://github.com/your-org/frontend.git
git clone https://github.com/your-org/backend.git
With OAuth Token:
docker run --rm -it \
  --network host \
  --cap-add=NET_RAW \
  --cap-add=NET_ADMIN \
  -e CLAUDE_CODE_OAUTH_TOKEN="$CLAUDE_CODE_OAUTH_TOKEN" \
  -e CLAUDE_CODE_MAX_OUTPUT_TOKENS=64000 \
  -v "$(pwd)/repos:/app/repos" \
  shaart:latest \
  "https://your-app.com" \
  "/app/repos/your-app"
With API Key:
docker run --rm -it \
  --network host \
  --cap-add=NET_RAW \
  --cap-add=NET_ADMIN \
  -e ANTHROPIC_API_KEY="$ANTHROPIC_API_KEY" \
  -e CLAUDE_CODE_MAX_OUTPUT_TOKENS=64000 \
  -v "$(pwd)/repos:/app/repos" \
  shaart:latest \
  "https://your-app.com" \
  "/app/repos/your-app"
Testing localhost apps? Use host.docker.internal instead of localhost:
docker run --rm -it \
  --add-host=host.docker.internal:host-gateway \
  --cap-add=NET_RAW \
  --cap-add=NET_ADMIN \
  -e CLAUDE_CODE_OAUTH_TOKEN="$CLAUDE_CODE_OAUTH_TOKEN" \
  -e CLAUDE_CODE_MAX_OUTPUT_TOKENS=64000 \
  -v "$(pwd)/repos:/app/repos" \
  shaart:latest \
  "http://host.docker.internal:3000" \
  "/app/repos/your-app"
Create configs/my-app.yaml:
authentication:
  login_type: form
  login_url: "https://your-app.com/login"
  credentials:
    username: "test@example.com"
    password: "testpassword"
    totp_secret: "BASE32SECRETHERE"  # For 2FA
  login_flow:
    - "Type $username into the email field"
    - "Type $password into the password field"
    - "Click the 'Sign In' button"
  success_condition:
    type: url_contains
    value: "/dashboard"
rules:
  focus:
    - description: "Prioritize API endpoints"
      type: path
      url_path: "/api"
  avoid:
    - description: "Skip logout testing"
      type: path
      url_path: "/logout"
Then run with: --config /app/configs/my-app.yaml
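To make the config semantics concrete, here is a small sketch of how the `$username` and `$password` placeholders in `login_flow` and a `url_contains` success condition could be interpreted. This is an assumption about the intended behavior, not Shaart's actual code; `resolve_steps` and `login_succeeded` are hypothetical helper names.

```python
from string import Template

# Values mirroring the example config above.
credentials = {"username": "test@example.com", "password": "testpassword"}
login_flow = [
    "Type $username into the email field",
    "Type $password into the password field",
    "Click the 'Sign In' button",
]
success_condition = {"type": "url_contains", "value": "/dashboard"}


def resolve_steps(steps, creds):
    """Replace $placeholders in each step with the matching credential."""
    # safe_substitute leaves unknown placeholders untouched instead of raising.
    return [Template(step).safe_substitute(creds) for step in steps]


def login_succeeded(current_url, condition):
    """Evaluate a success condition against the URL reached after login."""
    if condition["type"] == "url_contains":
        return condition["value"] in current_url
    raise ValueError(f"unsupported condition type: {condition['type']}")


if __name__ == "__main__":
    for step in resolve_steps(login_flow, credentials):
        print(step)
    print(login_succeeded("https://your-app.com/dashboard", success_condition))
```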
Deliverables saved to repos/your-app/deliverables/:
- code_analysis_deliverable.md - Source code reconnaissance
- recon_deliverable.md - Attack surface mapping
- *_analysis_deliverable.md - Vulnerability hypotheses
- *_exploitation_evidence.md - Proven exploits with PoCs
- comprehensive_security_assessment_report.md - Final report
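Because the deliverables follow predictable file name patterns, a script can check a finished run for proven exploits. The sketch below is one hypothetical way to do that; `proven_findings` is an invented helper, and it assumes that a non-empty `*_exploitation_evidence.md` file means at least one exploit was proven, which this document does not confirm.

```python
from pathlib import Path


def proven_findings(deliverables_dir):
    """Return the names of non-empty exploitation evidence files, sorted."""
    root = Path(deliverables_dir)
    return sorted(
        path.name
        for path in root.glob("*_exploitation_evidence.md")
        if path.stat().st_size > 0
    )
```

Calling `proven_findings("repos/your-app/deliverables")` after a run returns a list that a CI job could use to fail the build when it is non-empty.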
The most deliberately insecure web application in existence
Achievements:
- ✅ Complete authentication bypass via SQL injection
- ✅ Full database exfiltration (all users, passwords, cards)
- ✅ Admin account creation through registration workflow bypass
- ✅ Privilege escalation to administrator access
- ✅ IDOR vulnerabilities across cart and profile management
- ✅ SSRF for internal network reconnaissance
Checkmarx's intentionally vulnerable API for OWASP API Top 10 testing
Achievements:
- ✅ Root-level command injection via debug endpoint
- ✅ Authentication bypass using legacy v1 API endpoint
- ✅ Mass assignment to escalate user to admin
- ✅ Zero false positives (correctly identified XSS defenses)
Modern vulnerable API designed for OWASP API Security Top 10
Achievements:
- ✅ JWT attacks (Algorithm Confusion, alg:none, weak key)
- ✅ SQL injection for full database compromise
- ✅ SSRF to forward internal auth tokens externally
- ✅ High accuracy with zero XSS false positives
Shaart executes real attacks. This is not passive scanning.
Never run Shaart against:
- Production environments
- Systems you don't own
- Applications without explicit authorization
Appropriate targets:
- Local development setups
- Staging environments
- Sandboxed test instances
- Your own applications with proper authorization
Legal requirements:
- Written authorization required from system owner
- Unauthorized testing violates the Computer Fraud and Abuse Act (CFAA)
- User assumes all liability for misuse
Side effects to expect during a test:
- Creates test accounts and users
- Modifies application data
- May trigger side effects (emails, webhooks, etc.)
- Can delete or corrupt test data
You are responsible for ensuring proper authorization and environment selection.
Shaart emulates a human penetration tester's workflow using specialized AI agents orchestrated across four phases.
Phase 1: Reconnaissance
- Source code analysis + live application exploration
- Maps attack surface, tech stack, auth mechanisms
- Produces comprehensive entry point inventory
Phase 2: Vulnerability Analysis (Parallel)
- Specialized agents per OWASP category
- Data flow tracing from user input to dangerous sinks
- Generates hypothesized exploitation paths
Phase 3: Exploitation (Parallel)
- Attempts real-world attacks via browser and CLI
- "No Exploit, No Report" policy
- Discards unproven hypotheses as false positives
Phase 4: Reporting
- Consolidates only verified findings
- Includes reproducible proof-of-concepts
- Delivers pentester-grade actionable reports
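The four phases can be pictured as a small concurrent pipeline. The sketch below uses `asyncio` with stub functions standing in for the AI agents; the category list, function names, and the rule that only injection is "proven" are all invented for illustration and do not reflect Shaart's internals.

```python
import asyncio

# Hypothetical category names, chosen only for this example.
CATEGORIES = ["injection", "xss", "auth", "authz", "ssrf"]


async def analyze(category, recon):
    """Phase 2 stand-in: produce one exploitation hypothesis per category."""
    return {"category": category, "target": recon["entry_points"][0]}


async def exploit(hypothesis):
    """Phase 3 stand-in: pretend only the injection hypothesis is proven."""
    proven = hypothesis["category"] == "injection"
    return {**hypothesis, "proven": proven}


async def category_pipeline(category, recon):
    """Run analysis, then exploitation, for a single category."""
    return await exploit(await analyze(category, recon))


async def run_pipeline():
    recon = {"entry_points": ["/api/login"]}  # Phase 1 stand-in
    # Phases 2 and 3: every category runs concurrently.
    results = await asyncio.gather(
        *(category_pipeline(c, recon) for c in CATEGORIES)
    )
    # Phase 4: "No Exploit, No Report" keeps only proven findings.
    return [r for r in results if r["proven"]]


if __name__ == "__main__":
    print(asyncio.run(run_pipeline()))
```

Chaining analysis and exploitation per category means a slow category does not block exploitation of the others.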
Time: 1-1.5 hours for a typical web application
Cost: ~$50 USD using Claude 4.5 Sonnet (subject to change)
Token Optimization: Multi-model strategy (Haiku for analysis, Sonnet for exploitation)
Track active development and request features:
- Open Issues
- Feature Roadmap
- Join Discord to influence priorities
- Report Bugs
- Discord Community
- Feature Requests
- @KeygraphHQ on Twitter
- Keygraph on LinkedIn
- keygraph.io
Need enterprise features?
- Advanced data flow analysis engine
- Pre-built CI/CD integrations
- Multiple export formats (PDF, JSON, JIRA)
- Dedicated support with SLAs
- Compliance-ready audit reports
Express interest in enterprise features
Email: shaart@keygraph.io
Shaart: AGPL-3.0
- ✅ Free for internal security testing
- ✅ Private modifications allowed
⚠️ Network service providers must open-source modifications
Built by Keygraph
Autonomous security for the AI era