# Verification Training Integration
The Verification-Training Integration creates a continuous learning system that improves agent performance over time by learning from verification results. This is a real machine learning implementation that tracks, learns, and predicts based on actual verification outcomes.
The system uses an Exponential Moving Average (EMA) with a learning rate of 0.1:
```
newReliability = oldReliability * 0.9 + newScore * 0.1
```

This means:
- Recent results have more weight than old results
- System adapts quickly to changes in agent performance
- Reliability scores smooth out temporary fluctuations
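The update rule above can be sketched in a few lines of JavaScript. The names below are illustrative, not the actual implementation (which lives in `src/cli/simple-commands/verification-training-integration.js`):

```javascript
// Illustrative sketch of the EMA reliability update with learning rate 0.1.
const LEARNING_RATE = 0.1;

function updateReliability(oldReliability, newScore) {
  // Recent score contributes 10%, accumulated history contributes 90%.
  return oldReliability * (1 - LEARNING_RATE) + newScore * LEARNING_RATE;
}

// Starting from 0.5, a run of perfect scores converges toward 1.0
// without ever overshooting:
let r = 0.5;
for (let i = 0; i < 10; i++) r = updateReliability(r, 1.0);
console.log(r.toFixed(3));
```

Because each update only moves the score 10% of the way toward the newest result, a single bad verification cannot crater a well-established reliability score.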
```bash
# Check current training status
./claude-flow verify-train status

# Feed verification data to training system
./claude-flow verify-train feed

# Predict verification outcome for a task
./claude-flow verify-train predict default coder

# Get agent recommendation
./claude-flow verify-train recommend

# Get improvement recommendations
./claude-flow verify-train recommendations
```

```bash
./claude-flow verify-train status
# Output:
# 🤖 Agent Reliability:
#   coder: 62.5%
```

```bash
./claude-flow verify-train feed
# Output:
# 📊 Learning update for coder: reliability 0.625
# 📉 Agent coder is declining (0.000)
```

```bash
# Run simulation with successful verifications
node test-verification.js
# Output:
# 📈 Agent coder is improving (+0.289)
# 📊 Learning update for coder: reliability 0.815
```

```bash
./claude-flow verify-train predict default coder
# Before: Recommendation: use_different_agent
# After:  Recommendation: add_additional_checks
```

```
.claude-flow/
├── training/
│   └── verification-data.jsonl    # Training data (append-only)
├── models/
│   ├── verification-model.json    # Main learning model
│   ├── agent-coder.json           # Coder agent model
│   ├── agent-reviewer.json        # Reviewer agent model
│   └── agent-tester.json          # Tester agent model
└── metrics/
    └── agent-performance.json     # Performance metrics
```
```json
{
  "version": "1.0.0",
  "agentReliability": {
    "coder": 0.815,
    "reviewer": 0.566,
    "tester": 0.750
  },
  "patterns": {
    "coder_success": {
      "count": 10,
      "avgScore": 0.92,
      "checks": {
        "compile": { "success": 8, "failure": 2 },
        "test": { "success": 6, "failure": 4 },
        "lint": { "success": 9, "failure": 1 }
      }
    },
    "coder_failure": {
      "count": 30,
      "avgScore": 0.625,
      "checks": {
        "compile": { "success": 10, "failure": 20 },
        "test": { "success": 5, "failure": 25 },
        "lint": { "success": 15, "failure": 15 }
      }
    }
  }
}
```

```json
{
  "agentType": "coder",
  "totalTasks": 72,
  "successfulTasks": 12,
  "averageScore": 0.815,
  "scoreHistory": [],
  "trend": {
    "direction": "improving",
    "change": 0.289,
    "recentAverage": 0.914,
    "previousAverage": 0.625
  },
  "checkPerformance": {
    "compile": { "total": 20, "passed": 15, "avgScore": 0.75 },
    "test": { "total": 20, "passed": 10, "avgScore": 0.50 },
    "lint": { "total": 20, "passed": 18, "avgScore": 0.90 }
  }
}
```

- Tracks success rate for each agent type
- Updates reliability score with each verification
- Maintains historical performance data
- Identifies which checks fail most often
- Tracks success/failure patterns per agent
- Learns which combinations lead to failures
- Compares recent performance to historical
- Identifies improving or declining agents
- Shows performance delta
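Trend detection can be sketched as comparing the mean of the most recent scores against the mean of the window before them. The window size and the ±0.05 stability band below are illustrative assumptions, not the actual thresholds:

```javascript
// Hypothetical sketch of trend detection over an agent's score history.
// Window size and the "stable" band are assumptions for illustration.
function detectTrend(scoreHistory, window = 10) {
  if (scoreHistory.length < window * 2) {
    return { direction: 'insufficient_data', change: 0 };
  }
  const mean = (xs) => xs.reduce((a, b) => a + b, 0) / xs.length;
  const recentAverage = mean(scoreHistory.slice(-window));
  const previousAverage = mean(scoreHistory.slice(-window * 2, -window));
  const change = recentAverage - previousAverage;
  const direction =
    change > 0.05 ? 'improving' :
    change < -0.05 ? 'declining' : 'stable';
  return { direction, change, recentAverage, previousAverage };
}
```

A jump from a 0.625 average to a 0.914 average, as in the model file above, would report `improving` with a positive delta.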
```bash
./claude-flow verify-train predict default coder
```

Output includes:
- Predicted Score: Expected verification score (0.0-1.0)
- Confidence: How confident the prediction is (based on data points)
- Recommendation: Action to take based on prediction
- Historical Success Rate: Past performance percentage
- Data Points: Number of verifications used for prediction
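As a minimal sketch of how these fields might be combined: the recommendation thresholds below come from the customization section later on this page, but the score blend and the confidence formula (saturating at 50 data points, per the troubleshooting guidance) are assumptions, not the actual code:

```javascript
// Hypothetical prediction sketch: blend agent reliability with the
// matching pattern's average score, scale confidence by data volume.
function predictOutcome(reliability, patternAvgScore, dataPoints) {
  const predictedScore = (reliability + patternAvgScore) / 2; // assumed blend
  const confidence = Math.min(1, dataPoints / 50);            // assumed formula
  const recommendation =
    predictedScore < 0.5 ? 'use_different_agent' :
    predictedScore < 0.75 ? 'add_additional_checks' : 'proceed';
  return { predictedScore, confidence, recommendation };
}
```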
```bash
./claude-flow verify-train recommend
```

Suggests the best agent based on:
- Historical reliability scores
- Recent performance trends
- Task type requirements
The system provides actionable recommendations:
"Retrain coder agent - reliability below 70%"
Triggered when agent reliability falls below threshold.
"Focus training on test, compile for coder"
Identifies specific checks that fail most often.
"Run more verification cycles to improve training accuracy"
Suggests when more data is needed for accurate predictions.
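The three recommendation rules above can be sketched as simple threshold checks over the model file. The 70% reliability floor is documented here; treating a check as "weak" when failures outnumber successes, and the 50-verification data floor, are illustrative assumptions:

```javascript
// Hypothetical sketch of recommendation generation from a model snapshot.
function generateRecommendations(model, totalVerifications) {
  const recs = [];
  // Rule 1: reliability below the documented 70% threshold.
  for (const [agent, reliability] of Object.entries(model.agentReliability)) {
    if (reliability < 0.7) recs.push(`Retrain ${agent} agent - reliability below 70%`);
  }
  // Rule 2: checks that fail more often than they pass (assumed criterion).
  for (const [name, pattern] of Object.entries(model.patterns)) {
    if (!name.endsWith('_failure')) continue;
    const agent = name.replace('_failure', '');
    const weak = Object.entries(pattern.checks)
      .filter(([, c]) => c.failure > c.success)
      .map(([check]) => check);
    if (weak.length) recs.push(`Focus training on ${weak.join(', ')} for ${agent}`);
  }
  // Rule 3: not enough data yet (assumed floor of 50 verifications).
  if (totalVerifications < 50) {
    recs.push('Run more verification cycles to improve training accuracy');
  }
  return recs;
}
```

Applied to the example model file above, this would flag the reviewer agent (0.566 reliability) and the coder agent's compile and test checks.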
Every verification automatically feeds the training system:
1. Score is calculated
2. Result fed to training
3. Agent model updated
4. Reliability recalculated
5. Trends detected
6. Predictions updated

To batch-process existing data:

```bash
# Feed all existing verification data
./claude-flow verify-train feed

# This processes all records in .swarm/verification-memory.json
```

The system tracks the following metrics:

- Total Verifications: Number of verification runs
- Pass Rate: Percentage of successful verifications
- Average Score: Mean verification score
- Agent Reliability: Per-agent success rate
- Performance by Hour: Time-based patterns
- Check-Specific Performance: Success rate per check type
```bash
# Full status report
./claude-flow verify-train status

# Agent-specific metrics
cat .claude-flow/models/agent-coder.json | jq

# Performance trends
cat .claude-flow/metrics/agent-performance.json | jq
```

Before assigning a task, check predicted success:
```bash
./claude-flow verify-train predict api-development coder
# If low confidence, consider different agent
```

Get the best agent for a task:
```bash
./claude-flow verify-train recommend
# Use recommended agent with highest reliability
```

Track agent improvement over time:
```bash
./claude-flow verify-train status
# Review trends and reliability scores
```

Identify areas needing attention:
```bash
./claude-flow verify-train recommendations
# Follow suggestions to improve system
```

```bash
# Daily cron job
0 0 * * * cd /project && ./claude-flow verify-train feed
```

```bash
# Check for declining agents weekly
./claude-flow verify-train status | grep "declining"
```

```bash
# Review and implement suggestions
./claude-flow verify-train recommendations

# Example actions:
# - Retrain underperforming agents
# - Focus on failing check types
# - Collect more verification data
```

```bash
# Before critical tasks
PREDICTION=$(./claude-flow verify-train predict critical coder)
echo $PREDICTION | jq '.confidence'

# If confidence < 0.7, add extra verification steps
```

```bash
# Check if new data is being added
ls -la .claude-flow/training/verification-data.jsonl
# File should be growing

# Verify learning rate
grep learningRate src/cli/simple-commands/verification-training-integration.js
# Should be 0.1 for balanced learning
```

```bash
# Check data points
./claude-flow verify-train status
# Need at least 50+ data points for good predictions

# Review pattern distribution
cat .claude-flow/models/verification-model.json | jq '.patterns'
# Should have both success and failure patterns
```

```bash
# Manually trigger update
./claude-flow verify-train feed

# Check model file
cat .claude-flow/models/verification-model.json | jq '.agentReliability'
# Values should change after feeding data
```

Edit src/cli/simple-commands/verification-training-integration.js:
```javascript
this.learningRate = 0.1; // Default: 0.1
// Higher = faster learning but more volatile
// Lower = slower learning but more stable
```

To change the prediction thresholds:

```javascript
// In predictVerificationOutcome()
if (predictedScore < 0.5) {
  recommendation = 'use_different_agent';
} else if (predictedScore < 0.75) {
  recommendation = 'add_additional_checks';
}
```

To adjust how many scores are kept for trend analysis:

```javascript
// Keep last N scores for trend analysis
if (agentModel.scoreHistory.length > 100) {
  agentModel.scoreHistory = agentModel.scoreHistory.slice(-100);
}
```

- Neural Network Integration: Deep learning for complex patterns
- Multi-Factor Predictions: Consider time, complexity, dependencies
- A/B Testing: Compare different learning algorithms
- Collaborative Filtering: Learn from similar projects
- Real-time Dashboards: Web UI for monitoring
- Reinforcement learning for agent selection
- Transfer learning between projects
- Ensemble methods for predictions
- Anomaly detection for unusual failures
The Verification-Training Integration creates a self-improving system that learns from every verification, continuously enhancing agent reliability and task success rates through real machine learning.