1 | 1 | # Python Automated Testing Pipeline |
2 | 2 |
3 | | -A three-agent system for automated Python testing, security analysis, and coverage improvement. |
| 3 | +Current docs for the CLI, API, GUI, and extension-facing parts of the pipeline. |
4 | 4 |
5 | | -## Overview |
| 5 | +## Flow |
6 | 6 |
7 | | -This pipeline uses three specialized AI agents to ensure code quality: |
8 | | - |
9 | | -1. **Identification Agent**: Finds test scenarios (edge cases, security, critical paths). |
10 | | -2. **Implementation Agent**: Generates PyTest scripts with security awareness. |
11 | | -3. **Evaluation Agent**: Runs tests, checks coverage (target 90%), and analyzes security. |
12 | | - |
13 | | -**Key Features:** |
14 | | - |
15 | | -- **Auto-Improvement**: Iteratively generates tests until coverage goals are met. |
16 | | -- **Security Analysis**: Detects SQLi, XSS, secrets, and more. |
17 | | -- **Robustness**: Auto-fixes syntax errors, rotates API keys, and handles rate limits. |
18 | | - |
19 | | -## Architecture |
20 | | - |
21 | | -``` |
22 | | -┌─────────────────────────────────────────────────────────────────────────┐ |
23 | | -│ Python Testing Pipeline │ |
24 | | -├─────────────────────────────────────────────────────────────────────────┤ |
25 | | -│ ┌──────────────────┐ JSON ┌──────────────────┐ │ |
26 | | -│ │ Identification │ ─────────▶ │ Human Approval │ │ |
27 | | -│ │ Agent │ │ (Review) │ │ |
28 | | -│ └──────────────────┘ └────────┬─────────┘ │ |
29 | | -│ │ ▼ │ |
30 | | -│ │ ┌──────────────────┐ │ |
31 | | -│ │ │ Implementation │ │ |
32 | | -│ │ │ Agent │ │ |
33 | | -│ │ └────────┬─────────┘ │ |
34 | | -│ │ ▼ │ |
35 | | -│ │ ┌──────────────────┐ │ |
36 | | -│ │ │ Evaluation │◀────────────┐ │ |
37 | | -│ │ │ + Security │ │ │ |
38 | | -│ │ └────────┬─────────┘ │ │ |
39 | | -│ │ │ Coverage < 90%? │ │ |
40 | | -│ │ ▼ Yes │ │ |
41 | | -│ │ ┌──────────────────┐ │ │ |
42 | | -│ │ │ Generate More │─────────────┘ │ |
43 | | -│ │ │ Tests │ │ |
44 | | -│ │ └──────────────────┘ │ |
45 | | -└─────────────────────────────────────────────────────────────────────────┘ |
46 | | -``` |
47 | | - |
48 | | -## Usage |
49 | | - |
50 | | -### VS Code Integration |
51 | | - |
52 | | -Use the command in Copilot Chat: |
53 | | - |
54 | | -``` |
55 | | -@workspace /generatePythonTests ./my_project |
56 | | -``` |
57 | | - |
58 | | -### CLI Usage |
59 | | - |
60 | | -Run the standalone script: |
61 | | - |
62 | | -```bash |
63 | | -# Basic usage |
64 | | -python pythonTestingPipeline.py ./my_project |
65 | | - |
66 | | -# Common options |
67 | | -python pythonTestingPipeline.py ./my_project --coverage # Measure coverage |
68 | | -python pythonTestingPipeline.py ./my_project --auto-approve # Skip manual review |
69 | | -python pythonTestingPipeline.py ./my_project --no-run-tests # Generate only |
| 7 | +```text |
| 8 | +Identify -> Approve or refine -> Implement -> Run tests -> Evaluate |
| 9 | + ^ | |
| 10 | + |------ improve loop -----| |
| 11 | +Artifacts: tests, prompts, report, governance, coverage report |
70 | 12 | ``` |
71 | 13 |
72 | | -## Configuration |
73 | | - |
74 | | -**Requirements:** |
75 | | - |
76 | | -- Python 3.10+ |
77 | | -- `pip install pytest pytest-cov openai matplotlib` |
78 | | -- VS Code + GitHub Copilot (for extension usage) |
79 | | - |
80 | | -**LLM Setup:** |
81 | | -Configure `scripts/llm_config.py` and `scripts/.env`. |
82 | | - |
83 | | -- **Keys**: `GROQ_API_KEY`, `GROQ_API_KEY_1`, etc. (auto-rotates on 429 errors). |
84 | | -- **Models**: Defaults to `openai/gpt-oss-120b`, falls back to `groq/compound`, `llama`, etc. |
85 | | - |
86 | | -## Agents & Communication |
87 | | - |
88 | | -Agents communicate via JSON. |
89 | | - |
90 | | -- **Identification**: Outputs `test_scenarios` (description, priority). |
91 | | -- **Implementation**: Receives scenarios, outputs raw PyTest code. |
92 | | -- **Evaluation**: Outputs `execution_summary`, `code_coverage_percentage`, and `security_issues`. |
93 | | - |
94 | | -**Security Checks:** |
95 | | -The pipeline flags **Critical** to **Low** severity issues including: |
96 | | - |
97 | | -- SQL/Command Injection & XSS |
98 | | -- Path Traversal & Data Exposure |
99 | | -- Weak Authentication & Hardcoded Secrets |
100 | | - |
101 | | -## Contributing |
102 | | - |
103 | | -1. Follow existing patterns. |
104 | | -2. Add unit tests (`npm run test:unit`). |
105 | | -3. Ensure TypeScript compilation passes. |
| 14 | +## Entry Points |
| 15 | + |
| 16 | +- CLI |
| 17 | + Run: |
| 18 | + `python src/extension/pythonTestingPipeline/scripts/pythonTestingPipeline.py <codebase_path>` |
| 19 | + Common options: `--auto-approve`, `--no-run-tests`, `--output-dir`, `--model`. |
| 20 | + `--coverage` and `--run-tests` are compatibility flags. |
| 21 | +- API |
| 22 | + Run: |
| 23 | + `uvicorn src.extension.api.main:app --reload` |
| 24 | +- GUI |
| 25 | + Run: |
| 26 | + `python -m src.extension.GUI.main` |
| 27 | +- VS Code extension |
| 28 | + The command palette action is `Agentic Testing: Generate Tests` |
| 29 | + (`agentic-testing.generateTests`). |
| 30 | + It currently handles folder selection and progress UI, but it does not yet run |
| 31 | + the full end-to-end pipeline. |
| 32 | +- Internal model tools |
| 33 | + `generatePythonTests`, `implementPythonTests`, and `evaluatePythonTests` are |
| 34 | + internal tools, not public slash commands. |
| 35 | + |
| 36 | +## Outputs |
| 37 | + |
| 38 | +Default output location: `<codebase_path>/tests`, unless `--output-dir` is set. |
| 39 | + |
| 40 | +Generated artifacts may include: |
| 41 | +- `test_generated_<timestamp>.py` |
| 42 | +- `prompts_<run_id>.json` |
| 43 | +- `report_<run_id>.md` |
| 44 | +- `governance_<run_id>.json` |
| 45 | +- `coverage_report_<run_id>.json` |
| 46 | + |
| 47 | +## Runtime Notes |
| 48 | + |
| 49 | +- Python 3.10+ is the practical baseline for the code in `src/extension`. |
| 50 | +- Pipeline-related Python dependencies live in `requirements.txt`. |
| 51 | +- `scripts/llm_config.py` is the source of truth for model ordering. |
| 52 | +- Current model selection prefers Ollama-hosted models first, then Groq-backed |
| 53 | + fallbacks. |
| 54 | +- `GROQ_API_KEY`, `GROQ_API_KEY_1`, and similar variables are rotated when the |
| 55 | + client needs another key. |
| 56 | +- Safety checks are implemented in `scripts/prompt_safety.py`. |
| 57 | + |
| 58 | +## Current Caveats |
| 59 | + |
| 60 | +- Coverage is effectively always collected when generated tests are executed. |
| 61 | +- API prompt-history discovery can legitimately return no runs. |
| 62 | +- API pipeline status is in-memory only. |
| 63 | +- The Python CLI is the most complete execution path today. |
| 64 | + |
| 65 | +## Keep In Sync |
| 66 | + |
| 67 | +When updating docs here, cross-check: |
| 68 | +- `scripts/pythonTestingPipeline.py` |
| 69 | +- `scripts/llm_config.py` |
| 70 | +- `src/extension/api/main.py` |
| 71 | +- `src/extension/api/schemas.py` |
| 72 | +- `package.json` |
| 73 | +- `src/extension.ts` |
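
The new "Outputs" section says artifacts default to `<codebase_path>/tests` unless `--output-dir` is set, and it lists five artifact name patterns. A minimal sketch of that rule follows. It is not taken from `pythonTestingPipeline.py`: the function names `resolve_output_dir` and `artifact_paths` are hypothetical, and the formats of `run_id` and `timestamp` are assumptions, since the diff does not specify them.

```python
from pathlib import Path
from typing import Dict, Optional


def resolve_output_dir(codebase_path: str, output_dir: Optional[str] = None) -> Path:
    """Return the explicit output directory if given, else <codebase_path>/tests."""
    if output_dir:
        return Path(output_dir)
    return Path(codebase_path) / "tests"


def artifact_paths(out_dir: Path, run_id: str, timestamp: str) -> Dict[str, Path]:
    """Map each artifact kind listed in the README to a path under out_dir.

    The run_id and timestamp values are passed in as opaque strings because
    their real formats are not documented in the diff.
    """
    return {
        "tests": out_dir / f"test_generated_{timestamp}.py",
        "prompts": out_dir / f"prompts_{run_id}.json",
        "report": out_dir / f"report_{run_id}.md",
        "governance": out_dir / f"governance_{run_id}.json",
        "coverage": out_dir / f"coverage_report_{run_id}.json",
    }


if __name__ == "__main__":
    out_dir = resolve_output_dir("./my_project")
    for kind, path in artifact_paths(out_dir, "demo", "20240101_000000").items():
        print(f"{kind}: {path}")
```

Because the README says artifacts "may include" these files, a caller would presumably check whether each path exists rather than assume every run produces all five.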
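
The "Runtime Notes" section states that `GROQ_API_KEY`, `GROQ_API_KEY_1`, and similar variables are rotated when the client needs another key, without describing the mechanism. One way such rotation could work is sketched below. The `KeyRotator` class and `collect_groq_keys` helper are hypothetical and do not reflect the actual code in `scripts/llm_config.py`. The condition that triggers a rotation (for example a rate-limit response) is left to the caller.

```python
import os
import re
from typing import List, Mapping, Optional

# Matches GROQ_API_KEY and numbered variants such as GROQ_API_KEY_1.
# The exact naming scheme beyond these two examples is an assumption.
_KEY_PATTERN = re.compile(r"^GROQ_API_KEY(_\d+)?$")


def collect_groq_keys(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Collect non-empty key values from matching variables, sorted by name."""
    env = os.environ if env is None else env
    names = sorted(name for name in env if _KEY_PATTERN.match(name) and env[name])
    return [env[name] for name in names]


class KeyRotator:
    """Hypothetical round-robin holder for a list of API keys."""

    def __init__(self, keys: List[str]) -> None:
        if not keys:
            raise ValueError("at least one API key is required")
        self._keys = list(keys)
        self._index = 0

    @property
    def current(self) -> str:
        return self._keys[self._index]

    def rotate(self) -> str:
        """Advance to the next key, wrapping around, and return it."""
        self._index = (self._index + 1) % len(self._keys)
        return self.current


if __name__ == "__main__":
    demo_env = {"GROQ_API_KEY": "key-a", "GROQ_API_KEY_1": "key-b"}
    rotator = KeyRotator(collect_groq_keys(demo_env))
    print(rotator.current)   # first key in name order
    print(rotator.rotate())  # next key after a simulated failure
```

Passing the environment as a mapping keeps the helper testable without touching real credentials.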