Is your feature request related to a problem? Please describe.
While testing the graph builder against a real Python project, the .venv directory was not being ignored during file collection, causing the parser to traverse thousands of vendored dependency files. Adding .venv to the hardcoded _IGNORED_DIRS blocklist resolved it — but this highlighted a deeper problem: the blocklist can never be exhaustive. Any project can have large directories that are irrelevant to analysis (generated code, internal tooling output, large data folders, etc.) with no way for the user to exclude them without patching codesteward itself.
Describe the solution you'd like
Support a .codestewardignore file placed in the root of the repository being analyzed. The file uses the same gitignore pattern syntax users already know (**/*.generated.ts, internal/, src/fixtures/). When present, codesteward skips any files matched by its patterns during graph construction — in addition to the built-in hardcoded blocklist. If the file does not exist, behavior is unchanged.
Describe alternatives you've considered
- Expanding the hardcoded blocklist: already done incrementally (.venv, .ruff_cache, htmlcov, etc.), but this is a maintenance game. It cannot cover project-specific directories and forces users to open issues for every new case.
- Respecting .gitignore: considered, but rejected. .gitignore exists for VCS purposes, not for analysis tooling. A project may intentionally commit files (test fixtures, seed data) that should be excluded from analysis, or conversely gitignore build output that should still be analyzed. A separate file keeps the concerns independent.
- CLI flag / config option: possible, but a file in the repo is portable, version-controllable, and works without changing any invocation.
Additional context
The implementation uses pathspec (gitignore pattern format), which is already a transitive dependency of the project via mypy. Patterns are loaded once per full-build scan and evaluated against each file's repo-relative path before the file enters the parse queue.
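
For discussion, here is a minimal sketch of how the load-once, match-per-file flow described above could look. It is not the actual codesteward code: the function names `load_ignore_spec` and `collect_files`, the constant `IGNORE_FILENAME`, and the three-entry `_IGNORED_DIRS` set are assumptions made for illustration. The only pathspec calls used are `PathSpec.from_lines("gitwildmatch", ...)` and `PathSpec.match_file(...)`.

```python
from __future__ import annotations

from pathlib import Path

import pathspec

IGNORE_FILENAME = ".codestewardignore"

# Illustrative subset only; the real built-in blocklist is assumed to be larger.
_IGNORED_DIRS = frozenset({".venv", ".ruff_cache", "htmlcov"})


def load_ignore_spec(repo_root: Path) -> pathspec.PathSpec | None:
    """Return the compiled patterns, or None when no ignore file exists."""
    ignore_file = repo_root / IGNORE_FILENAME
    if not ignore_file.is_file():
        return None
    lines = ignore_file.read_text(encoding="utf-8").splitlines()
    # "gitwildmatch" selects pathspec's gitignore-style pattern syntax.
    return pathspec.PathSpec.from_lines("gitwildmatch", lines)


def collect_files(repo_root: Path) -> list[str]:
    """Return repo-relative POSIX paths of files that should be parsed."""
    spec = load_ignore_spec(repo_root)  # loaded once per scan
    kept: list[str] = []
    for path in sorted(repo_root.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(repo_root)
        # Built-in blocklist: skip files under any blocklisted directory.
        if any(part in _IGNORED_DIRS for part in rel.parts[:-1]):
            continue
        rel_posix = rel.as_posix()
        # User patterns apply in addition to the blocklist.
        if spec is not None and spec.match_file(rel_posix):
            continue
        kept.append(rel_posix)
    return kept
```

With a .codestewardignore containing `**/*.generated.ts`, `internal/`, and `src/fixtures/`, this sketch would skip files under those paths while leaving everything else untouched. A real implementation would probably prune ignored directories during traversal rather than filtering each file after a full `rglob`, which matters for large trees like .venv.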