Last Updated: 2026-04-12 Status: Production Ready
This document describes the standardized bug fix process at Atom, following Test-Driven Development (TDD) principles. All bug fixes must follow the red-green-refactor cycle to ensure quality and prevent regressions.
Why TDD for Bug Fixes?
- Prevents Regression: Failing tests ensure bugs don't reoccur
- Documents Intent: Tests explain what the code should do
- Enables Refactoring: Safe code improvements with test safety net
- Catches Root Causes: Tests often reveal deeper issues
Core Principle: Never fix a bug without a test that fails before the fix and passes after.
Goal: Reproduce the bug with a failing test.
Steps:
-
Understand the Bug
- Read bug report carefully
- Reproduce the bug manually
- Identify the affected component
- Understand expected vs. actual behavior
-
Write a Minimal Test Case
# Example: Bug - Agent maturity validation not working def test_agent_maturity_validation_blocks_demotion(): """AUTONOMOUS agents cannot be demoted to STUDENT.""" agent = AgentRegistry( id="test-auto", maturity=AgentMaturity.AUTONOMOUS ) service = AgentGovernanceService(db_session) # This should fail but currently doesn't (BUG) with pytest.raises(ValueError, match="Invalid maturity transition"): service.update_maturity("test-auto", AgentMaturity.STUDENT)
-
Verify Test Fails
pytest tests/test_bug_fix.py -v # Expected: FAILED (test identifies the bug) -
Commit Failing Test
git add tests/test_bug_fix.py git commit -m "test: add failing test for maturity validation bug"
Success Criteria:
- Test reproduces the bug
- Test fails with clear error message
- Test name describes the bug
- Test is isolated and independent
Goal: Fix the bug with minimal code changes.
Steps:
-
Identify Root Cause
- Use debugger or print statements
- Trace code execution
- Find where expected behavior diverges
- Don't just patch symptoms
-
Write Minimal Fix
# Example fix in agent_governance_service.py def update_maturity(self, agent_id: str, new_maturity: AgentMaturity): agent = self.get_agent(agent_id) if not agent: raise ValueError(f"Agent {agent_id} not found") # FIX: Add validation to prevent demotion if self._is_demotion(agent.maturity, new_maturity): raise ValueError( f"Invalid maturity transition: " f"{agent.maturity} → {new_maturity}" ) agent.maturity = new_maturity self.db.commit() return agent
-
Run Test
pytest tests/test_bug_fix.py -v # Expected: PASSED (fix works) -
Run Full Test Suite
pytest tests/ -v # Ensure no regressions -
Commit Fix
git add core/agent_governance_service.py git commit -m "fix: validate maturity transitions to prevent demotion"
Success Criteria:
- Test passes
- Full test suite passes (no regressions)
- Fix is minimal and focused
- Fix addresses root cause
Goal: Improve code quality while maintaining behavior.
Steps:
-
Look for Code Smells
- Duplicate code
- Long methods
- Complex conditionals
- Poor naming
-
Refactor with Test Safety Net
# Extract validation logic def _is_demotion(self, current: AgentMaturity, new: AgentMaturity) -> bool: """Check if transition is a demotion.""" maturity_levels = { AgentMaturity.STUDENT: 1, AgentMaturity.INTERN: 2, AgentMaturity.SUPERVISED: 3, AgentMaturity.AUTONOMOUS: 4 } return maturity_levels[new] < maturity_levels[current]
-
Verify Tests Still Pass
pytest tests/test_bug_fix.py tests/test_governance.py -v
-
Commit Refactoring
git add core/agent_governance_service.py git commit -m "refactor: extract maturity level comparison logic"
Success Criteria:
- All tests still pass
- Code is cleaner and more maintainable
- No behavior changes
- Better code organization
Symptom: Invalid data causes crashes or wrong behavior.
Test:
def test_invalid_input_raises_error():
with pytest.raises(ValueError):
process_agent_id(None) # Should reject NoneFix:
def process_agent_id(agent_id: str):
if not agent_id:
raise ValueError("agent_id cannot be None or empty")
# ... rest of logicSymptom: Code fails on boundary conditions (empty, zero, negative).
Test:
def test_empty_list_handling():
result = get_top_agents([]) # Empty input
assert result == [] # Should handle gracefullyFix:
def get_top_agents(agents):
if not agents:
return [] # Handle edge case
# ... rest of logicSymptom: Unexpected side effects modify shared state.
Test:
def test_no_state_mutation():
original = {"key": "value"}
process_dict(original)
assert original == {"key": "value"} # Should not changeFix:
def process_dict(data):
copy = data.copy() # Don't mutate input
# ... work with copySymptom: Components don't work together correctly.
Test:
def test_workflow_execution_integration():
agent = create_test_agent()
workflow = create_test_workflow()
result = execute_workflow(workflow.id, agent.id)
assert result.status == "completed"
assert agent.maturity == AgentMaturity.AUTONOMOUSFix:
def execute_workflow(workflow_id, agent_id):
# Fix integration issue between components
workflow = get_workflow(workflow_id)
agent = get_agent(agent_id)
# Ensure proper state synchronization
if not agent.can_execute(workflow):
raise PermissionError(...)
# ... rest of logic- Bug report is clear and reproducible
- Affected component identified
- Expected behavior understood
- Related tests reviewed
- Failing test written (Red phase)
- Test committed before fix
- Minimal fix implemented (Green phase)
- Full test suite passes
- Code refactored if needed (Refactor phase)
- All tests still pass
- Bug verified as fixed
- No regressions introduced
- Documentation updated (if needed)
- PR created with clear description
- Code review completed
- Tests pass in CI/CD
Automated Enforcement:
- Test Pass Rate Gate: All tests must pass (100%)
- Coverage Gate: Coverage must not decrease
- Build Gate: Build fails if any test fails
Workflow:
Push → Quality Gate → 100% Pass Required → Merge Allowed
↓
Any Failure → Block Merge
Bug: STUDENT agents can execute HIGH complexity actions.
Red Phase:
def test_student_blocked_from_high_complexity():
student = AgentRegistry(id="student", maturity=AgentMaturity.STUDENT)
service = AgentGovernanceService(db)
result = service.check_permission(student, "execute_workflow", complexity=3)
assert result.granted == False # Currently fails (BUG)Green Phase:
# Fix in agent_governance_service.py
def check_permission(self, agent, action, complexity=1):
if agent.maturity == AgentMaturity.STUDENT and complexity > 1:
return PermissionResult(granted=False, reason="STUDENT not allowed")
# ... rest of logicRefactor Phase:
# Extract maturity level mapping
MATURITY_LEVELS = {
AgentMaturity.STUDENT: 1,
AgentMaturity.INTERN: 2,
AgentMaturity.SUPERVISED: 3,
AgentMaturity.AUTONOMOUS: 4
}
def _check_complexity_allowed(self, maturity, complexity):
return MATURITY_LEVELS[maturity] >= complexityBug: Coverage decreased from 75% to 73% after refactor.
Red Phase:
def test_new_function_is_covered():
result = new_function("test")
assert result == "expected" # Currently uncoveredGreen Phase:
# Add test to cover new function
# Coverage increases back to 75%Possible Causes:
- Test not actually testing the bug
- Bug already fixed elsewhere
- Test setup incorrect
Solutions:
- Add debug print to verify test path
- Check test isolation
- Verify bug reproduction steps
Possible Causes:
- Fix too broad
- Shared state mutation
- Test dependency
Solutions:
- Narrow fix scope
- Use test fixtures for isolation
- Check for test coupling
Possible Causes:
- Environment-specific bug
- Race condition
- Incomplete bug report
Solutions:
- Add more context to bug report
- Check logs and traces
- Use debugger to trace execution
-
Test First, Always
- Never write code without a failing test
- Commit test before fix
- Keep tests independent
-
Fix Root Causes
- Don't patch symptoms
- Understand why bug occurred
- Prevent similar bugs
-
Keep Fixes Small
- One fix per commit
- Minimal code changes
- Easy to review and revert
-
Maintain Coverage
- Never decrease coverage
- Add tests for new code
- Cover edge cases
-
Document Decisions
- Why bug occurred
- Why fix approach chosen
- Lessons learned
- TDD Workflow:
docs/TDD_WORKFLOW.md - Testing Guide:
TESTING.md - Quality Gates:
.github/workflows/quality-gate.yml - Quality Dashboard:
docs/QUALITY_DASHBOARD.md
Bug Fix Process Version: 1.0 Last Updated: 2026-04-12 Maintained By: Development Team