feat(python): Python starter checks, fixture test coverage, and real-world validation by brettdavies · Pull Request #15 · brettdavies/agentnative-cli

brettdavies · 2026-04-16T01:19:03Z

Summary

Implements Plan 004: 3 starter Python source checks proving the ast-grep + tree-sitter-python pipeline works end-to-end,
un-ignores 3 fixture integration tests, validates against 4 real-world CLIs (bird, xurl-rs, ripgrep, markitdown), then
resolves 20 review findings (P1/P2 correctness + testing gaps) and adds a Python integration fixture.

Changelog

Added

Add code-bare-except Python source check — detects bare except: clauses without exception types
Add p4-sys-exit Python source check — detects sys.exit() calls outside if __name__ == "__main__": guards and
__main__.py files
Add p6-no-color Python source check — detects NO_COLOR env var handling (Warn, not Fail — behavioral check is the
primary gate)
Add language-parameterized source helpers has_pattern_in(), find_pattern_matches_in(), and has_string_literal_in()
supporting Python and Rust

Fixed

Fix false positive: sys.exit() in __main__.py (Python entry point) no longer flagged
Fix is_main_guard: now handles inline comments, parenthesized guards, no-space operators, and reversed operand
order (e.g. if "__main__" == __name__:)
Fix is_bare_except: restrict parsing to first line of node text (prevents false negatives on error-recovery nodes)
Fix __main__.py skip to check filename component, not path suffix (prevents false skips on files like my__main__.py)
Fix TOCTOU gap in parsed_files lazy initialization (replaced RefCell with OnceLock)
Remove dead except* branch from bare-except detection (PEP 654 makes bare except*: a syntax error)

Type of Change

feat: New feature (non-breaking change which adds functionality)
fix: Bug fix (non-breaking change which fixes an issue)
test: Adding or updating tests

Related Issues/Stories

Plan: docs/plans/2026-04-02-004-feat-python-checks-validation-coverage-plan.md
Design: ~/.gstack/projects/brettdavies-agentnative/brett-main-design-20260327-214808.md (updated)
Review: ce-review run on 2026-04-15 surfaced 20 findings across 7 reviewers (correctness, testing, maintainability,
project-standards, agent-native, learnings-researcher, adversarial) — all addressable findings fixed

Testing

Unit tests added/updated
Integration tests added/updated
Manual testing completed
All tests passing

Test Summary:

Unit tests: 304 passing (was 253 — +51 new across Python checks, source helpers, and review-driven coverage)
Integration tests: 38 passing, 0 ignored (was 34 + 3 ignored — +4 new including the Python fixture test)
Clippy: clean with -Dwarnings

Files Modified

Modified:

src/source.rs — added has_pattern_in(), find_pattern_matches_in(), has_string_literal_in() with generic language
dispatch
src/project.rs — parsed_files lazy init now uses OnceLock (fixes TOCTOU)
src/checks/source/python/mod.rs — registered 3 Python checks
src/checks/source/rust/no_color.rs — uses shared has_string_literal_in()
src/checks/behavioral/mod.rs, src/checks/project/dry_run.rs — adopt OnceLock constructor
tests/integration.rs — removed #[ignore] from 3 fixture tests; added test_broken_python_fixture

Created:

src/checks/source/python/bare_except.rs — bare-except detection via AST walking
src/checks/source/python/sys_exit.rs — sys.exit() guard analysis with robust __main__ guard parser and
__main__.py skip
src/checks/source/python/no_color.rs — NO_COLOR env var pattern matching with string-literal fallback
tests/fixtures/broken-python/pyproject.toml, tests/fixtures/broken-python/src/app.py — integration fixture

Key Details

Python checks use direct AST walking (tree-sitter-python) for patterns that aren't expressible as single ast-grep
templates (guard scope analysis, bare-except detection)
sys.exit() check skips __main__.py files (entry point) — mirrors Rust's main.rs skip for process::exit()
NO_COLOR check returns Warn (not Fail) since many Python libraries handle it transparently
is_main_guard normalizes the if __name__ == "__main__": comparison rather than exact-matching a small set of
canonical forms, handling real-world variants (inline comments, parens, whitespace, reversed ordering)
has_string_literal_in() lives in src/source.rs alongside the other cross-language helpers — the previous
per-language private copies diverged in behavior (single vs double quote handling)
check_x() functions in the Python checks return CheckStatus directly; run() is the sole CheckResult
constructor using self.id()/self.group()/self.layer() — eliminates ID string literal triplication (the matching
refactor for the 16 existing Rust checks ships in follow-up PR refactor: enforce check_x → CheckStatus convention across all source checks #16)

Real-World Validation Results

Target	Type	Checks	Pass	Warn	Fail	Notes
bird	Rust (source+behavioral)	30	20	8	1	No regressions
xurl-rs	Rust (source+behavioral)	30	20	8	1	No regressions
ripgrep	Binary (behavioral only)	8	7	0	0	Clean scorecard
markitdown	Python (source only)	6	2	4	0	`__main__.py` false positive caught and fixed

No breaking changes

Checklist

Code follows project conventions and style guidelines
Commit messages follow Conventional Commits
Self-review of code completed
Tests added/updated and passing
No new warnings or errors introduced
Changes are backward compatible

Implements the first Python source checks, proving the ast-grep + tree-sitter-python pipeline works end-to-end: - code-bare-except: detects bare `except:` clauses (P4 — actionable errors) - p4-sys-exit: detects `sys.exit()` outside `if __name__ == "__main__":` guard - p6-no-color: detects NO_COLOR env var handling for color output Adds language-parameterized helpers `has_pattern_in()` and `find_pattern_matches_in()` to `src/source.rs` so future Python checks can dispatch through the same ast-grep machinery as Rust checks. The original Rust-only helpers remain as thin wrappers. The bare-except and sys-exit checks walk the AST manually to detect patterns that aren't expressible as a single ast-grep template (e.g. "any except_clause without a type child", "any sys.exit() not under a __main__ guard"). The no-color check uses pattern matching across common Python env-var access forms with a string-literal fallback. Test coverage: 38 new unit tests covering happy paths, edge cases (string literals, comments, nested guards, multiple violations), and applicability across language manifests.

The source-only, broken-rust, and perfect-rust fixture tests were marked #[ignore] pending parseable source and built binaries in fixture directories. Both conditions are now met. Removing the ignore attributes brings these into CI coverage. Integration suite: 37 passed, 0 ignored (was 34 + 3 ignored).

__main__.py is the Python entry-point module (like Rust's main.rs). sys.exit() is expected there. Real-world validation against markitdown exposed this as a false positive.

All 6 implementation units complete: 3 Python source checks, 3 fixture tests un-ignored, real-world validation passed, design doc updated.

Review-driven fixes across Python source checks, source helpers, and project infrastructure: - Fix is_main_guard false positives: normalize comparison instead of exact string match — handles inline comments, parenthesized forms, no-spaces, and reversed operand order - Fix is_bare_except to split only on first line of node text - Fix __main__.py skip to check filename component, not suffix - Remove dead except* branch (PEP 654 requires a type after except*) - Replace RefCell with OnceLock for parsed_files lazy init (TOCTOU fix) - Refactor check_X functions to return CheckStatus instead of CheckResult — eliminates ID string literal triplication per check - Deduplicate has_string_literal into source::has_string_literal_in - Pre-build AST patterns in no_color (parse source once, not 10x) - Inline single-use find_bare_excepts/find_unguarded_sys_exits wrappers - Add tests for all new guard variants, multi-file aggregation, run()-level pass/warn branches, typed except*, and string fallback

Proves the tree-sitter-python pipeline works end-to-end via binary invocation, not just unit tests. The fixture contains a bare except clause and an unguarded sys.exit() call. The integration test verifies source-layer checks fire and code-bare-except specifically fails.

…-chain pin (#17) ## Summary Two related hardening efforts bundled: 1. **Convention enforcement** — standardizes all source checks on a single pattern (`check_x()` returns `CheckStatus`; `run()` is the sole `CheckResult` constructor, using `self.id()`/`self.group()`/`self.layer()`). Documents the convention in CLAUDE.md, refactors all 16 Rust source checks to match, and adds a Rust-only integration test that walks the source tree and fails CI if any `check_x()` still returns `CheckResult`. 2. **Toolchain supply-chain pin** — `rust-toolchain.toml` now pins to a specific `X.Y.Z` release with a trailing rustc commit-SHA comment. Rustup verifies component SHA256s from the distribution manifest, so the pin is effectively a SHA pin (the manifest is the toolchain's lockfile). Both local and CI read the same file and install identical bits. Motivated by a CI-only clippy failure during this PR's own review: local clippy 1.94 passed a lint that CI clippy 1.95 rejected. Convention enforcement was identified during the [PR #15 code review](#15) — the 3 new Python checks were fixed in that PR, but the 16 existing Rust checks still used the old pattern. The toolchain pin was added mid-review after the clippy divergence exposed that floating `channel = "stable"` leaves local and CI on different rustc versions. ## Changelog  ## Type of Change - [x] `refactor`: Code refactoring (no functional changes) - [x] `chore`: Maintenance tasks (supply-chain toolchain pin) - [x] `docs`: Documentation update (CLAUDE.md conventions + toolchain policy) - [x] `test`: Adding or updating tests (convention enforcement test) ## Related Issues/Stories - Related PRs: #15 (review finding that identified the convention drift) - Follow-up: `dot-github` todo #4 — determine the automated bump workflow for `rust-toolchain.toml` across repos (central reusable workflow vs. local script vs. hybrid) ## Testing - [x] Unit tests added/updated - [x] Integration tests added/updated - [x] All tests passing **Test Summary:** - Unit tests: 304 passing (unchanged count — refactor preserves behavior) - Integration tests: 39 passing (+1 new: `convention_check_x_returns_check_status_not_check_result`) - Clippy: clean with `-Dwarnings` on rustc 1.94.1 (pinned) ## Files Modified **Modified:** - `CLAUDE.md` — added Source Check Convention section (Tier 1) and Toolchain Pin section documenting the 7-day quarantine policy - `rust-toolchain.toml` — pinned `channel = "1.94.1"` with trailing comment naming the rustc commit SHA and release date; bumped from floating `channel = "stable"` - `scripts/hooks/pre-push` — removed `rustup update stable` step; pin + rustup's manifest verification now handle toolchain integrity - 16 files in `src/checks/source/rust/` — `check_x()` returns `CheckStatus`, `run()` uses `self.id()`/`self.group()`/ `self.layer()` instead of duplicated string/enum literals - `src/checks/source/rust/no_pager.rs` — collapsed nested `if` inside a `match` arm into a match guard (fixes clippy `collapsible_match` lint introduced in clippy 1.95) - `tests/integration.rs` — added `convention_check_x_returns_check_status_not_check_result` enforcement test ## Key Details **Tier 1 (Document):** CLAUDE.md now codifies the source check structure. Covers the `Check` trait impl shape, the `check_x() → CheckStatus` separation, and the convention that `run()` is the sole `CheckResult` constructor. Also adds a new Toolchain Pin section explaining the supply-chain pin policy. **Tier 2 (Refactor):** All 16 Rust source checks aligned to the convention. Net -132 lines by removing duplicated `CheckResult` construction (previously the `id` / `label` / `group` / `layer` fields were hardcoded in both `run()` and every `check_x()` helper). **Tier 3 (Enforce):** Integration test walks `src/checks/source/` with `std::fs` (no shell dependencies), finds every `fn check_` signature (any visibility, single-line or multi-line), and fails if any return `CheckResult` instead of `CheckStatus`. Uses `CARGO_MANIFEST_DIR` for a stable absolute path. **Toolchain pin format:** `channel = "1.94.1" # rustc e408947bfd200af42db322daf0fadfe7e26d3bd1, released 2026-03-25`. Comment mirrors the GitHub Actions SHA-pin pattern (`action@<sha> # vN.N.N`). Policy: bump only via reviewed PR after the new stable has aged ≥7 days (supply-chain quarantine consistent with `UV_EXCLUDE_NEWER` / bun `minimumReleaseAge` / `npm_config_min_release_age` in dotfiles). ## Benefits - Eliminates ID string literal triplication (3 copies per check → 1 authoritative source in `fn id()`) - New checks can't drift — the convention is documented AND enforced by CI - -132 lines of boilerplate removed - Local and CI toolchains are guaranteed identical — no more "local clippy older than CI clippy" false greens - Toolchain updates route through reviewed PRs with a 7-day quarantine, matching the broader brettdavies supply-chain posture ## Breaking Changes - [x] No breaking changes ## Deployment Notes - [x] No special deployment steps required ## Checklist - [x] Code follows project conventions and style guidelines - [x] Commit messages follow Conventional Commits - [x] Self-review of code completed - [x] Tests added/updated and passing - [x] No new warnings or errors introduced - [x] Changes are backward compatible

…world validation (#15)

…-chain pin (#17) ## Summary Two related hardening efforts bundled: 1. **Convention enforcement** — standardizes all source checks on a single pattern (`check_x()` returns `CheckStatus`; `run()` is the sole `CheckResult` constructor, using `self.id()`/`self.group()`/`self.layer()`). Documents the convention in CLAUDE.md, refactors all 16 Rust source checks to match, and adds a Rust-only integration test that walks the source tree and fails CI if any `check_x()` still returns `CheckResult`. 2. **Toolchain supply-chain pin** — `rust-toolchain.toml` now pins to a specific `X.Y.Z` release with a trailing rustc commit-SHA comment. Rustup verifies component SHA256s from the distribution manifest, so the pin is effectively a SHA pin (the manifest is the toolchain's lockfile). Both local and CI read the same file and install identical bits. Motivated by a CI-only clippy failure during this PR's own review: local clippy 1.94 passed a lint that CI clippy 1.95 rejected. Convention enforcement was identified during the [PR #15 code review](#15) — the 3 new Python checks were fixed in that PR, but the 16 existing Rust checks still used the old pattern. The toolchain pin was added mid-review after the clippy divergence exposed that floating `channel = "stable"` leaves local and CI on different rustc versions. ## Changelog  ## Type of Change - [x] `refactor`: Code refactoring (no functional changes) - [x] `chore`: Maintenance tasks (supply-chain toolchain pin) - [x] `docs`: Documentation update (CLAUDE.md conventions + toolchain policy) - [x] `test`: Adding or updating tests (convention enforcement test) ## Related Issues/Stories - Related PRs: #15 (review finding that identified the convention drift) - Follow-up: `dot-github` todo #4 — determine the automated bump workflow for `rust-toolchain.toml` across repos (central reusable workflow vs. local script vs. hybrid) ## Testing - [x] Unit tests added/updated - [x] Integration tests added/updated - [x] All tests passing **Test Summary:** - Unit tests: 304 passing (unchanged count — refactor preserves behavior) - Integration tests: 39 passing (+1 new: `convention_check_x_returns_check_status_not_check_result`) - Clippy: clean with `-Dwarnings` on rustc 1.94.1 (pinned) ## Files Modified **Modified:** - `CLAUDE.md` — added Source Check Convention section (Tier 1) and Toolchain Pin section documenting the 7-day quarantine policy - `rust-toolchain.toml` — pinned `channel = "1.94.1"` with trailing comment naming the rustc commit SHA and release date; bumped from floating `channel = "stable"` - `scripts/hooks/pre-push` — removed `rustup update stable` step; pin + rustup's manifest verification now handle toolchain integrity - 16 files in `src/checks/source/rust/` — `check_x()` returns `CheckStatus`, `run()` uses `self.id()`/`self.group()`/ `self.layer()` instead of duplicated string/enum literals - `src/checks/source/rust/no_pager.rs` — collapsed nested `if` inside a `match` arm into a match guard (fixes clippy `collapsible_match` lint introduced in clippy 1.95) - `tests/integration.rs` — added `convention_check_x_returns_check_status_not_check_result` enforcement test ## Key Details **Tier 1 (Document):** CLAUDE.md now codifies the source check structure. Covers the `Check` trait impl shape, the `check_x() → CheckStatus` separation, and the convention that `run()` is the sole `CheckResult` constructor. Also adds a new Toolchain Pin section explaining the supply-chain pin policy. **Tier 2 (Refactor):** All 16 Rust source checks aligned to the convention. Net -132 lines by removing duplicated `CheckResult` construction (previously the `id` / `label` / `group` / `layer` fields were hardcoded in both `run()` and every `check_x()` helper). **Tier 3 (Enforce):** Integration test walks `src/checks/source/` with `std::fs` (no shell dependencies), finds every `fn check_` signature (any visibility, single-line or multi-line), and fails if any return `CheckResult` instead of `CheckStatus`. Uses `CARGO_MANIFEST_DIR` for a stable absolute path. **Toolchain pin format:** `channel = "1.94.1" # rustc e408947bfd200af42db322daf0fadfe7e26d3bd1, released 2026-03-25`. Comment mirrors the GitHub Actions SHA-pin pattern (`action@<sha> # vN.N.N`). Policy: bump only via reviewed PR after the new stable has aged ≥7 days (supply-chain quarantine consistent with `UV_EXCLUDE_NEWER` / bun `minimumReleaseAge` / `npm_config_min_release_age` in dotfiles). ## Benefits - Eliminates ID string literal triplication (3 copies per check → 1 authoritative source in `fn id()`) - New checks can't drift — the convention is documented AND enforced by CI - -132 lines of boilerplate removed - Local and CI toolchains are guaranteed identical — no more "local clippy older than CI clippy" false greens - Toolchain updates route through reviewed PRs with a 7-day quarantine, matching the broader brettdavies supply-chain posture ## Breaking Changes - [x] No breaking changes ## Deployment Notes - [x] No special deployment steps required ## Checklist - [x] Code follows project conventions and style guidelines - [x] Commit messages follow Conventional Commits - [x] Self-review of code completed - [x] Tests added/updated and passing - [x] No new warnings or errors introduced - [x] Changes are backward compatible

brettdavies added 6 commits April 15, 2026 20:11

fix(python): skip __main__.py in sys.exit() check

8c05591

__main__.py is the Python entry-point module (like Rust's main.rs). sys.exit() is expected there. Real-world validation against markitdown exposed this as a false positive.

docs: mark Plan 004 as done

b9d6aa7

All 6 implementation units complete: 3 Python source checks, 3 fixture tests un-ignored, real-world validation passed, design doc updated.

brettdavies mentioned this pull request Apr 16, 2026

refactor: enforce check_x → CheckStatus convention across all source checks #16

Closed

15 tasks

brettdavies merged commit 0990900 into dev Apr 16, 2026
6 checks passed

brettdavies deleted the feat/python-checks-and-validation branch April 16, 2026 17:03

brettdavies mentioned this pull request Apr 16, 2026

refactor: enforce check_x → CheckStatus convention + toolchain supply-chain pin #17

Merged

15 tasks

brettdavies added a commit that referenced this pull request Apr 16, 2026

feat(python): Python starter checks, fixture test coverage, and real-…

266b046

…world validation (#15)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(python): Python starter checks, fixture test coverage, and real-world validation#15

feat(python): Python starter checks, fixture test coverage, and real-world validation#15
brettdavies merged 6 commits intodevfrom
feat/python-checks-and-validation

brettdavies commented Apr 16, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

brettdavies commented Apr 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changelog

Added

Fixed

Type of Change

Related Issues/Stories

Testing

Files Modified

Key Details

Real-World Validation Results

Checklist

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

brettdavies commented Apr 16, 2026 •

edited

Loading