Truth Verification System

rUv edited this page Aug 13, 2025 · 2 revisions

Overview

The Truth Verification System is a framework that provides verification and truth scoring for multi-agent operations in Claude-Flow. It includes real verification checks, training integration for continuous improvement, and practical tools for quality assurance.

Important Note: This system is currently a functional prototype that demonstrates verification concepts. While it performs real checks (compile, test, lint), full integration with all agent operations is still in development.

Quick Start

Initialize Verification System

# Initialize with specific mode (local version)
./claude-flow verify init strict      # Production mode (0.95 threshold)
./claude-flow verify init moderate    # Default mode (0.85 threshold)
./claude-flow verify init development # Development mode (0.75 threshold)

# Once published to npm, this will be available as:
npx claude-flow@alpha verify init strict

Current Implementation Status

✅ What's Working

  1. Real Verification Checks

    • Runs the actual npm run typecheck, npm test, and npm run lint commands
    • Returns real scores based on actual command output
    • Stores verification history in .swarm/verification-memory.json
  2. Truth Scoring & Reporting

    ./claude-flow truth              # Basic truth scores
    ./claude-flow truth --report     # Detailed breakdown
    ./claude-flow truth --analyze    # Failure pattern analysis
    ./claude-flow truth --json       # Machine-readable output
    ./claude-flow truth --export report.json  # Export to file
  3. Training Integration (NEW!)

    ./claude-flow verify-train status     # Show training status
    ./claude-flow verify-train feed       # Feed verification data to training
    ./claude-flow verify-train predict    # Predict verification outcomes
    ./claude-flow verify-train recommend  # Get agent recommendations
  4. Verification Hooks

    # Manual verification hooks
    node src/cli/simple-commands/verification-hooks.js pre task-123 coder
    node src/cli/simple-commands/verification-hooks.js post task-123 coder
    node src/cli/simple-commands/verification-hooks.js status

⚠️ What's In Progress

  1. Automatic Integration: Verification isn't automatically called by swarm/agent commands yet
  2. Limited Agent Types: Only 4 agent types have specific verification logic
  3. Consensus Features: Multi-agent consensus is simulated, not implemented

Verification-Training Integration

How Learning Works

The system uses lightweight statistical learning to improve over time:

  1. Exponential Moving Average: Updates agent reliability scores with learning rate of 0.1
  2. Pattern Recognition: Tracks which checks (compile, test, lint) succeed/fail most often
  3. Trend Detection: Identifies if agents are improving or declining
  4. Predictive Scoring: Predicts future verification outcomes based on history
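The exponential-moving-average update in step 1 can be sketched as follows. This is a minimal illustration with a learning rate of 0.1 as documented; the function name and call pattern are assumptions, not the actual claude-flow source:

```javascript
// Sketch of an exponential-moving-average reliability update (learning rate 0.1).
// Function name and call pattern are illustrative, not the actual source.
const LEARNING_RATE = 0.1;

function updateReliability(current, observedScore) {
  // Move the running estimate 10% of the way toward the latest observation.
  return current + LEARNING_RATE * (observedScore - current);
}

// Starting at 62.5% reliability, ten perfect verifications in a row:
let reliability = 0.625;
for (let i = 0; i < 10; i++) {
  reliability = updateReliability(reliability, 1.0);
}
console.log(reliability.toFixed(3)); // → 0.869
```

Because each update only moves the estimate a fraction of the way toward the newest score, a single outlier verification cannot swing an agent's reliability dramatically.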

Example: System Learning in Action

# Initial state: coder agent at 62.5% reliability
./claude-flow verify-train status

# Feed verification data to training
./claude-flow verify-train feed

# After 10 successful verifications:
# - Coder reliability: 62.5% → 81.5% 
# - System shows: "📈 Agent coder is improving (+0.289)"
# - Prediction changes: "use_different_agent" → "add_additional_checks"
# - Confidence increases: 60% → 70%

Training Data Storage

.claude-flow/
├── training/
│   └── verification-data.jsonl    # Training data in JSONL format
├── models/
│   ├── verification-model.json    # Main learning model
│   └── agent-coder.json          # Agent-specific models
└── metrics/
    └── agent-performance.json     # Performance metrics

Actual Verification Process

What Happens During Verification

// For Coder Agents:
1. compile:   Runs 'npm run typecheck' → Score based on errors
2. test:      Runs 'npm test'          → Score based on pass/fail
3. lint:      Runs 'npm run lint'      → Score based on warnings/errors
4. typecheck: Runs 'npm run typecheck' → Score based on errors

// Scores:
- No errors:      1.0
- Warnings only:  0.8
- Errors:         0.5
- Command fails:  0.3
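The scoring tiers above can be sketched as a small rubric function. The result shape ({ exitCode, errors, warnings }) is an assumption for illustration, not the actual claude-flow internals:

```javascript
// Illustrative scoring rubric mirroring the tiers above.
// The result shape ({ exitCode, errors, warnings }) is an assumption.
function scoreCheck({ exitCode, errors = 0, warnings = 0 }) {
  if (exitCode !== 0 && errors === 0) return 0.3; // command itself failed to run
  if (errors > 0) return 0.5;                     // command ran, reported errors
  if (warnings > 0) return 0.8;                   // clean except for warnings
  return 1.0;                                     // fully clean run
}
```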

Real Rollback Mechanism

When verification fails and rollback is enabled:

# Set environment variable
export VERIFICATION_ROLLBACK=true

# If verification fails, system runs:
git reset --hard HEAD

# You'll see:
"🔄 Attempting rollback..."
"✅ Rollback completed"

CLI Commands

Verification Commands

# Initialize verification system
./claude-flow verify init <mode>

# Run verification on a task
./claude-flow verify verify task-123 --agent coder

# Check verification status
./claude-flow verify status

Truth Scoring Commands

# Basic truth report
./claude-flow truth

# Detailed analysis options
./claude-flow truth --report           # Detailed breakdown
./claude-flow truth --analyze          # Failure patterns
./claude-flow truth --agent coder      # Agent-specific
./claude-flow truth --detailed         # With history
./claude-flow truth --json            # JSON output only
./claude-flow truth --export file.json # Export to file

Training Integration Commands

# Check training status
./claude-flow verify-train status

# Feed existing verifications to training
./claude-flow verify-train feed

# Predict if a task will pass
./claude-flow verify-train predict default coder
# Output: Predicted Score: 0.613, Confidence: 70%

# Get agent recommendation
./claude-flow verify-train recommend
# Output: Recommended: coder, Reliability: 81.5%

# Get improvement recommendations
./claude-flow verify-train recommendations

Environment Variables

# Set verification mode
export VERIFICATION_MODE=strict        # strict/moderate/development

# Enable automatic rollback
export VERIFICATION_ROLLBACK=true

# Custom threshold
export VERIFICATION_THRESHOLD=0.95
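A sketch of how these variables might combine; the precedence shown (an explicit threshold overriding the mode default) is an assumption for illustration:

```javascript
// Sketch: resolving the effective threshold from environment variables.
// Precedence (explicit threshold beats mode default) is an assumption.
const MODE_THRESHOLDS = { strict: 0.95, moderate: 0.85, development: 0.75 };

function resolveThreshold(env = process.env) {
  if (env.VERIFICATION_THRESHOLD !== undefined) {
    return parseFloat(env.VERIFICATION_THRESHOLD);
  }
  return MODE_THRESHOLDS[env.VERIFICATION_MODE] ?? 0.85; // moderate default
}
```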

Integration Examples

Manual Verification in Scripts

#!/bin/bash
# Run pre-task verification
node src/cli/simple-commands/verification-hooks.js pre $TASK_ID coder

# Execute task
npm run build

# Run post-task verification
node src/cli/simple-commands/verification-hooks.js post $TASK_ID coder

# Feed results to training
node src/cli/simple-commands/verification-hooks.js train $TASK_ID coder

CI/CD Integration

name: Verification Pipeline
on: [push]

jobs:
  verify:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      
      - name: Run Verification
        run: |
          ./claude-flow verify init strict
          ./claude-flow verify verify ${{ github.sha }} --agent coder
      
      - name: Check Truth Score
        run: |
          SCORE=$(./claude-flow truth --json | jq .averageScore)
          echo "Truth score: $SCORE"
          if (( $(echo "$SCORE < 0.85" | bc -l) )); then
            exit 1
          fi

How Training Improves the System

Learning Cycle

  1. Verification Runs → Generates scores (pass/fail)
  2. Training Ingests → Updates agent reliability scores
  3. Pattern Detection → Identifies common failure points
  4. Prediction Improves → Better task outcome predictions
  5. Recommendations Update → Suggests best agents for tasks

Agent Performance Tracking

// Example: .claude-flow/models/agent-coder.json
{
  "agentType": "coder",
  "totalTasks": 72,
  "successfulTasks": 12,
  "averageScore": 0.815,
  "trend": {
    "direction": "improving",
    "change": 0.289,
    "recentAverage": 0.914,
    "previousAverage": 0.625
  },
  "checkPerformance": {
    "compile": { "total": 20, "passed": 15, "avgScore": 0.75 },
    "test": { "total": 20, "passed": 10, "avgScore": 0.50 },
    "lint": { "total": 20, "passed": 18, "avgScore": 0.90 }
  }
}
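The trend block can be derived by comparing two windows of scores; the arithmetic matches the example above (0.914 − 0.625 = 0.289). The window size and function name are assumptions for illustration:

```javascript
// Sketch: deriving a trend from recent vs. previous score windows.
// Window size of 10 and the function name are assumptions.
function computeTrend(scores, windowSize = 10) {
  const avg = xs => xs.reduce((a, b) => a + b, 0) / xs.length;
  const recentAverage = avg(scores.slice(-windowSize));
  const previousAverage = avg(scores.slice(-2 * windowSize, -windowSize));
  const change = recentAverage - previousAverage;
  return {
    direction: change > 0 ? 'improving' : change < 0 ? 'declining' : 'stable',
    change,
    recentAverage,
    previousAverage,
  };
}
```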

Current Limitations

Integration Gaps

  1. Not Auto-Integrated: You need to manually run verification commands
  2. Basic Checks Only: Runs npm scripts, not sophisticated analysis
  3. Limited Agent Types: Only coder, reviewer, tester, architect have specific logic
  4. No Real Consensus: Multi-agent consensus is simulated

What's Simulated vs Real

| Feature | Status | Details |
|---------|--------|---------|
| Compile Check | ✅ Real | Runs actual npm run typecheck |
| Test Execution | ✅ Real | Runs actual npm test |
| Lint Check | ✅ Real | Runs actual npm run lint |
| Git Rollback | ✅ Real | Runs actual git reset --hard |
| Training System | ✅ Real | Real learning with persistence |
| Agent Consensus | ❌ Simulated | Returns hardcoded values |
| Byzantine Tolerance | ❌ Simulated | Not implemented |
| Cryptographic Signing | ❌ Simulated | Not implemented |

Best Practices

1. Regular Training Updates

# Daily: Feed new verification data to training
./claude-flow verify-train feed

# Weekly: Check training recommendations
./claude-flow verify-train recommendations

# Monthly: Review agent performance trends
./claude-flow verify-train status

2. Monitor Agent Reliability

# Check which agents are performing well
./claude-flow verify-train status | grep "Agent Reliability"

# Get recommendations for underperforming agents
./claude-flow verify-train recommendations

3. Use Predictions for Task Planning

# Before assigning a task, check predicted success
./claude-flow verify-train predict default coder

# If prediction is low, consider:
# - Using a different agent
# - Adding additional checks
# - Reviewing recent failures

Troubleshooting

Low Truth Scores

# Analyze what's failing
./claude-flow truth --analyze

# Check specific agent
./claude-flow truth --agent coder --detailed

# Review training recommendations
./claude-flow verify-train recommendations

Verification Not Running

# Check if npm scripts exist
npm run typecheck  # Should be defined in package.json
npm run test       # Should be defined in package.json
npm run lint       # Should be defined in package.json

# Run manual verification
node src/cli/simple-commands/verification-hooks.js status

Training Not Improving

# Check if data is being fed
./claude-flow verify-train status
# Look for "Training Data Points" - should be increasing

# Manually trigger learning
./claude-flow verify-train feed

# Check agent trends
cat .claude-flow/models/agent-coder.json | jq '.trend'

Future Development

Planned Improvements

  1. Auto-Integration: Automatic verification for all agent operations
  2. Deep Code Analysis: AST-based verification, not just npm scripts
  3. Real Consensus: Actual multi-agent voting system
  4. Smart Rollback: Selective rollback of only failed changes
  5. Dashboard UI: Web interface for monitoring verification metrics

How to Contribute

The verification system is ready for enhancement. Key areas:

  1. Integration Points: Add verification hooks to swarm/agent commands
  2. Check Types: Add more verification checks beyond compile/test/lint
  3. Agent Types: Add verification logic for more agent types
  4. Training Models: Improve prediction algorithms


The Truth Verification System provides a foundation for quality assurance in AI-generated code. While not fully integrated, it offers real verification checks, learning capabilities, and practical tools for improving agent reliability over time.