Commit 14df6cd

Update python testing pipeline scripts and services
1 parent 18d878a commit 14df6cd

5 files changed

Lines changed: 91 additions & 108 deletions

Lines changed: 67 additions & 99 deletions
@@ -1,105 +1,73 @@
 # Python Automated Testing Pipeline
 
-A three-agent system for automated Python testing, security analysis, and coverage improvement.
+Current docs for the CLI, API, GUI, and extension-facing parts of the pipeline.
 
-## Overview
+## Flow
 
-This pipeline uses three specialized AI agents to ensure code quality:
-
-1. **Identification Agent**: Finds test scenarios (edge cases, security, critical paths).
-2. **Implementation Agent**: Generates PyTest scripts with security awareness.
-3. **Evaluation Agent**: Runs tests, checks coverage (target 90%), and analyzes security.
-
-**Key Features:**
-
-- **Auto-Improvement**: Iteratively generates tests until coverage goals are met.
-- **Security Analysis**: Detects SQLi, XSS, secrets, and more.
-- **Robustness**: Auto-fixes syntax errors, rotates API keys, and handles rate limits.
-
-## Architecture
-
-```
-┌─────────────────────────────────────────────────────────────────────────┐
-│ Python Testing Pipeline │
-├─────────────────────────────────────────────────────────────────────────┤
-│ ┌──────────────────┐ JSON ┌──────────────────┐ │
-│ │ Identification │ ─────────▶ │ Human Approval │ │
-│ │ Agent │ │ (Review) │ │
-│ └──────────────────┘ └────────┬─────────┘ │
-│ │ ▼ │
-│ │ ┌──────────────────┐ │
-│ │ │ Implementation │ │
-│ │ │ Agent │ │
-│ │ └────────┬─────────┘ │
-│ │ ▼ │
-│ │ ┌──────────────────┐ │
-│ │ │ Evaluation │◀────────────┐ │
-│ │ │ + Security │ │ │
-│ │ └────────┬─────────┘ │ │
-│ │ │ Coverage < 90%? │ │
-│ │ ▼ Yes │ │
-│ │ ┌──────────────────┐ │ │
-│ │ │ Generate More │─────────────┘ │
-│ │ │ Tests │ │
-│ │ └──────────────────┘ │
-└─────────────────────────────────────────────────────────────────────────┘
-```
-
-## Usage
-
-### VS Code Integration
-
-Use the command in Copilot Chat:
-
-```
-@workspace /generatePythonTests ./my_project
-```
-
-### CLI Usage
-
-Run the standalone script:
-
-```bash
-# Basic usage
-python pythonTestingPipeline.py ./my_project
-
-# Common options
-python pythonTestingPipeline.py ./my_project --coverage # Measure coverage
-python pythonTestingPipeline.py ./my_project --auto-approve # Skip manual review
-python pythonTestingPipeline.py ./my_project --no-run-tests # Generate only
+```text
+Identify -> Approve or refine -> Implement -> Run tests -> Evaluate
+^ |
+|------ improve loop -----|
+Artifacts: tests, prompts, report, governance, coverage report
 ```
 
-## Configuration
-
-**Requirements:**
-
-- Python 3.10+
-- `pip install pytest pytest-cov openai matplotlib`
-- VS Code + GitHub Copilot (for extension usage)
-
-**LLM Setup:**
-Configure `scripts/llm_config.py` and `scripts/.env`.
-
-- **Keys**: `GROQ_API_KEY`, `GROQ_API_KEY_1`, etc. (auto-rotates on 429 errors).
-- **Models**: Defaults to `openai/gpt-oss-120b`, falls back to `groq/compound`, `llama`, etc.
-
-## Agents & Communication
-
-Agents communicate via JSON.
-
-- **Identification**: Outputs `test_scenarios` (description, priority).
-- **Implementation**: Receives scenarios, outputs raw PyTest code.
-- **Evaluation**: Outputs `execution_summary`, `code_coverage_percentage`, and `security_issues`.
-
-**Security Checks:**
-The pipeline flags **Critical** to **Low** severity issues including:
-
-- SQL/Command Injection & XSS
-- Path Traversal & Data Exposure
-- Weak Authentication & Hardcoded Secrets
-
-## Contributing
-
-1. Follow existing patterns.
-2. Add unit tests (`npm run test:unit`).
-3. Ensure TypeScript compilation passes.
+## Entry Points
+
+- CLI
+Run:
+`python src/extension/pythonTestingPipeline/scripts/pythonTestingPipeline.py <codebase_path>`
+Common options: `--auto-approve`, `--no-run-tests`, `--output-dir`, `--model`.
+`--coverage` and `--run-tests` are compatibility flags.
+- API
+Run:
+`uvicorn src.extension.api.main:app --reload`
+- GUI
+Run:
+`python -m src.extension.GUI.main`
+- VS Code extension
+The command palette action is `Agentic Testing: Generate Tests`
+(`agentic-testing.generateTests`).
+It currently handles folder selection and progress UI, but it does not yet run
+the full end-to-end pipeline.
+- Internal model tools
+`generatePythonTests`, `implementPythonTests`, and `evaluatePythonTests` are
+internal tools, not public slash commands.
+
+## Outputs
+
+Default output location: `<codebase_path>/tests`, unless `--output-dir` is set.
+
+Generated artifacts may include:
+- `test_generated_<timestamp>.py`
+- `prompts_<run_id>.json`
+- `report_<run_id>.md`
+- `governance_<run_id>.json`
+- `coverage_report_<run_id>.json`
+
+## Runtime Notes
+
+- Python 3.10+ is the practical baseline for the code in `src/extension`.
+- Pipeline-related Python dependencies live in `requirements.txt`.
+- `scripts/llm_config.py` is the source of truth for model ordering.
+- Current model selection prefers Ollama-hosted models first, then Groq-backed
+fallbacks.
+- `GROQ_API_KEY`, `GROQ_API_KEY_1`, and similar variables are rotated when the
+client needs another key.
+- Safety checks are implemented in `scripts/prompt_safety.py`.
+
+## Current Caveats
+
+- Coverage is effectively always collected when generated tests are executed.
+- API prompt-history discovery can legitimately return no runs.
+- API pipeline status is in-memory only.
+- The Python CLI is the most complete execution path today.
+
+## Keep In Sync
+
+When updating docs here, cross-check:
+- `scripts/pythonTestingPipeline.py`
+- `scripts/llm_config.py`
+- `src/extension/api/main.py`
+- `src/extension/api/schemas.py`
+- `package.json`
+- `src/extension.ts`
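
Note on the Runtime Notes entry about key rotation: the rotation logic itself is not part of this diff, so the following is only a minimal sketch of how `GROQ_API_KEY`, `GROQ_API_KEY_1`, and similar variables could be discovered and cycled. The names `collect_groq_keys` and `KeyRotator` are hypothetical and are not taken from `scripts/llm_config.py`; the trigger for rotating (for example a rate-limit response) is likewise an assumption left to the caller.

```python
import os
import re


def collect_groq_keys(env=None):
    """Return the values of GROQ_API_KEY, GROQ_API_KEY_1, ... in suffix order.

    Hypothetical helper: the real discovery logic is not shown in the commit.
    """
    env = os.environ if env is None else env
    pattern = re.compile(r"^GROQ_API_KEY(?:_(\d+))?$")
    found = []
    for name, value in env.items():
        match = pattern.match(name)
        if match and value:
            # The unsuffixed variable sorts first, then numeric suffixes ascending.
            index = int(match.group(1)) if match.group(1) is not None else -1
            found.append((index, value))
    return [value for _, value in sorted(found)]


class KeyRotator:
    """Round-robin over a fixed list of keys (assumed policy, for illustration)."""

    def __init__(self, keys):
        if not keys:
            raise ValueError("at least one API key is required")
        self._keys = list(keys)
        self._position = 0

    @property
    def current(self):
        return self._keys[self._position]

    def rotate(self):
        """Advance to the next key, wrapping around, and return it."""
        self._position = (self._position + 1) % len(self._keys)
        return self.current
```

A caller would build `KeyRotator(collect_groq_keys())` once and call `rotate()` whenever the client needs another key; wrapping around keeps the sketch simple but does not track which keys are exhausted.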

src/extension/pythonTestingPipeline/common/pythonTestingPipelineService.ts

Lines changed: 3 additions & 2 deletions
@@ -47,10 +47,11 @@ export interface IPythonTestingPipelineService {
 	identifyTestScenarios(codebasePath: string, targetFiles: readonly string[] | undefined, token: CancellationToken): Promise<ITestScenariosOutput>;
 
 	/**
-	 * Requests human approval for identified scenarios.
+	 * Approval hook for identified scenarios.
+	 * The current node-side implementation is non-interactive and returns the scenarios unchanged.
 	 * @param scenarios The identified test scenarios
 	 * @param token Cancellation token
-	 * @returns Approved scenarios (may be modified by user)
+	 * @returns Approved scenarios
 	 */
 	requestApproval(scenarios: ITestScenariosOutput, token: CancellationToken): Promise<ITestScenariosOutput>;

src/extension/pythonTestingPipeline/common/types.ts

Lines changed: 4 additions & 1 deletion
@@ -105,7 +105,10 @@ export interface IPythonTestingPipelineOptions {
 	readonly testOutputDir?: string;
 	/** Optional: whether to run tests automatically after generation. Defaults to true. */
 	readonly autoRunTests?: boolean;
-	/** Optional: collect coverage data when running tests. Defaults to true. */
+	/**
+	 * Optional compatibility flag for callers that want to express a coverage preference.
+	 * The current node-side pipeline implementation still collects coverage whenever tests run.
+	 */
 	readonly collectCoverage?: boolean;
 	/** Optional: target coverage percentage (default: 90) */
 	readonly targetCoverage?: number;

src/extension/pythonTestingPipeline/node/pythonTestingPipelineService.ts

Lines changed: 3 additions & 3 deletions
@@ -186,14 +186,14 @@ export class PythonTestingPipelineService implements IPythonTestingPipelineServi
 	}
 
 	/**
-	 * Requests human approval for identified scenarios.
-	 * In a real implementation, this would show a UI for user interaction.
+	 * Placeholder approval step.
+	 * This implementation currently returns the identified scenarios unchanged.
 	 */
 	async requestApproval(
 		scenarios: ITestScenariosOutput,
 		_token: CancellationToken
 	): Promise<ITestScenariosOutput> {
-		// Placeholder - in a real implementation, show UI dialog
+		// Placeholder - interactive approval is not wired into the node service yet.
 		return scenarios;
 	}

src/extension/pythonTestingPipeline/scripts/pythonTestingPipeline.py

Lines changed: 14 additions & 3 deletions
@@ -3,11 +3,13 @@
 Python Automated Testing Pipeline
 
 Usage:
-    python pythonTestingPipeline.py <codebase_path> [--coverage] [--auto-approve] \\
-        [--no-run-tests]
+    python pythonTestingPipeline.py <codebase_path> [--auto-approve] \\
+        [--no-run-tests] [--output-dir <dir>] [--model <model>]
 
 Note:
     Generated tests are run by default unless --no-run-tests is supplied.
+    The --coverage flag is accepted for compatibility, but coverage is already
+    collected automatically when tests are executed.
 
 Example:
     python pythonTestingPipeline.py ./my_project --auto-approve
@@ -475,6 +477,8 @@ def run_pipeline(
 ) -> dict:
     """
     Runs the complete testing pipeline.
+
+    Coverage is collected automatically whenever test execution is enabled.
     """
     print("=" * 60)
     print("🚀 Python Automated Testing Pipeline")
@@ -969,7 +973,14 @@ def main():
         action="store_true",
         help="Do not run generated tests",
     )
-    parser.add_argument("--coverage", action="store_true", help="Collect coverage")
+    parser.add_argument(
+        "--coverage",
+        action="store_true",
+        help=(
+            "Compatibility flag; coverage is already collected automatically "
+            "when tests run"
+        ),
+    )
     parser.add_argument(
         "--auto-approve", action="store_true", help="Auto-approve scenarios"
     )
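
Note on the `--coverage` change: the hunks above show the flag being kept only for compatibility, but not how `main()` consumes it. The sketch below illustrates one way a parser can accept such a flag while the effective behaviour depends only on whether tests run. It uses the flag names from this diff; `build_parser` and `should_collect_coverage` are hypothetical names, and the real script may wire this differently.

```python
import argparse


def build_parser():
    """Build a reduced parser with the flags visible in this diff (sketch only)."""
    parser = argparse.ArgumentParser(description="Python Automated Testing Pipeline")
    parser.add_argument("codebase_path")
    parser.add_argument(
        "--no-run-tests",
        action="store_true",
        help="Do not run generated tests",
    )
    parser.add_argument(
        "--coverage",
        action="store_true",
        help=(
            "Compatibility flag; coverage is already collected automatically "
            "when tests run"
        ),
    )
    parser.add_argument(
        "--auto-approve", action="store_true", help="Auto-approve scenarios"
    )
    return parser


def should_collect_coverage(args):
    """Hypothetical helper: coverage follows test execution, not --coverage."""
    # args.coverage is parsed so old invocations keep working, but it is ignored.
    return not args.no_run_tests
```

Keeping the flag in the parser avoids breaking existing invocations such as `python pythonTestingPipeline.py ./my_project --coverage`, while a single helper makes the "coverage whenever tests run" rule explicit.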
