[FEATURE] Support .codestewardignore for project-specific file exclusions #5

@barnakun

Is your feature request related to a problem? Please describe.

While testing the graph builder against a real Python project, the .venv directory was not being ignored during file collection, causing the parser to traverse thousands of vendored dependency files. Adding .venv to the hardcoded _IGNORED_DIRS blocklist resolved it — but this highlighted a deeper problem: the blocklist can never be exhaustive. Any project can have large directories that are irrelevant to analysis (generated code, internal tooling output, large data folders, etc.) with no way for the user to exclude them without patching codesteward itself.

Describe the solution you'd like

Support a .codestewardignore file placed in the root of the repository being analyzed. The file uses the same gitignore pattern syntax users already know (**/*.generated.ts, internal/, src/fixtures/). When present, codesteward skips any files matched by its patterns during graph construction — in addition to the built-in hardcoded blocklist. If the file does not exist, behavior is unchanged.
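
To make the matching semantics concrete, here is a minimal sketch of how the three example patterns would behave under gitignore rules. It assumes the `pathspec` library mentioned under "Additional context"; the file contents and the `is_ignored` helper are hypothetical and only serve as illustration.

```python
import pathspec

# Hypothetical contents of a .codestewardignore file (illustration only).
IGNORE_TEXT = """\
# generated TypeScript clients
**/*.generated.ts
internal/
src/fixtures/
"""

# "gitwildmatch" is pathspec's name for gitignore-style patterns.
spec = pathspec.PathSpec.from_lines("gitwildmatch", IGNORE_TEXT.splitlines())


def is_ignored(rel_path: str) -> bool:
    """Return True if a repo-relative POSIX path matches any ignore pattern."""
    return spec.match_file(rel_path)


for path in (
    "src/api/client.generated.ts",  # matched by **/*.generated.ts
    "pkg/internal/tool.py",         # "internal/" matches at any depth
    "src/fixtures/seed.json",       # matched by src/fixtures/
    "tests/src/fixtures/seed.json", # not matched: src/fixtures/ is anchored to the root
    "src/main.py",                  # not matched
):
    print(path, is_ignored(path))
```

Note that a pattern containing a slash in the middle (`src/fixtures/`) is anchored to the repository root, while a bare directory name (`internal/`) matches at any depth, exactly as in `.gitignore`.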

Describe alternatives you've considered

  • Expanding the hardcoded blocklist: already done incrementally (.venv, .ruff_cache, htmlcov, etc.), but this is an endless maintenance chase. It cannot cover project-specific directories and forces users to open an issue for every new case.
  • Respecting .gitignore: considered but rejected, because .gitignore exists for VCS purposes, not for analysis tooling. A project may intentionally commit files (test fixtures, seed data) that should nevertheless be excluded from analysis, or conversely gitignore build output that should still be analyzed. A separate file keeps the two concerns independent.
  • CLI flag / config option — possible, but a file in the repo is portable, version-controllable, and works without changing any invocation.

Additional context

The implementation uses pathspec (gitignore pattern format), which is already a transitive dependency of the project via mypy. Patterns are loaded once per full-build scan and evaluated against each file's repo-relative path before it enters the parse queue.
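
The load-once, filter-per-file flow described above could look roughly like the sketch below. This is not the actual codesteward code: `load_ignore_spec`, `collect_files`, and the contents of `_IGNORED_DIRS` shown here are assumptions made for illustration, and only the `pathspec` calls are real library API.

```python
import os
from pathlib import Path

import pathspec

# Illustrative subset of the built-in blocklist (assumed, not the real list).
_IGNORED_DIRS = {".git", ".venv", ".ruff_cache", "htmlcov"}
IGNORE_FILENAME = ".codestewardignore"


def load_ignore_spec(repo_root: Path):
    """Load patterns once per scan; return None if the file does not exist."""
    ignore_file = repo_root / IGNORE_FILENAME
    if not ignore_file.is_file():
        return None
    lines = ignore_file.read_text(encoding="utf-8").splitlines()
    return pathspec.PathSpec.from_lines("gitwildmatch", lines)


def collect_files(repo_root: Path) -> list[str]:
    """Return repo-relative POSIX paths that survive both filters."""
    spec = load_ignore_spec(repo_root)
    collected = []
    for dirpath, dirnames, filenames in os.walk(repo_root):
        # Prune blocklisted directories in place so os.walk never descends into them.
        dirnames[:] = [d for d in dirnames if d not in _IGNORED_DIRS]
        for name in filenames:
            rel = (Path(dirpath) / name).relative_to(repo_root).as_posix()
            # User patterns apply in addition to the built-in blocklist.
            if spec is not None and spec.match_file(rel):
                continue
            collected.append(rel)
    return sorted(collected)
```

When no `.codestewardignore` exists, `spec` is `None` and only the built-in blocklist applies, so behavior is unchanged. A real implementation might also prune user-ignored directories during the walk instead of filtering file by file; this sketch keeps the simpler per-file check described in the issue.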


Labels: bug, enhancement
