[spike] Better support for pre-existing codebases #38

@spencatro-pub

agentblame should better support pre-existing codebases via optional initialization configurations. Marked as a spike to kick off the discussion, but below are some ideas.

Currently, the only initialization option is to assume an entire codebase is human-generated. This is a sane default, but certainly an opinionated one. Teams with stricter requirements may want a different initialization strategy. (Personally, in an existing codebase that has used agentic generation before integrating agentblame, I would prefer to default all existing code to an "unknown" state, and thus force a human to take attribution over the lines they really care about.) Marking a line's attribution as unknown would give humans a signal somewhat stronger than "line was human generated," but softer than "line was AI generated," when the concern is "does this code require additional scrutiny?"
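To make the idea concrete, here is a minimal sketch of a three-state attribution model with a configurable initial state. Everything in it is hypothetical: the names `Attribution`, `initialize_attribution`, and `claim_lines` are made up for illustration and are not taken from agentblame's actual code or data model.

```python
from __future__ import annotations

from enum import IntEnum


class Attribution(IntEnum):
    """Hypothetical states, ordered by how much extra scrutiny a line may warrant."""

    HUMAN = 0
    UNKNOWN = 1
    AI = 2


def initialize_attribution(
    line_count: int, default: Attribution = Attribution.HUMAN
) -> list[Attribution]:
    """Assign the same initial state to every pre-existing line.

    The HUMAN default mirrors the current behavior described above;
    passing UNKNOWN is the stricter strategy proposed in this issue.
    """
    return [default] * line_count


def claim_lines(
    states: list[Attribution], line_numbers: list[int]
) -> list[Attribution]:
    """Return a copy in which a human has taken attribution over the given 1-indexed lines."""
    updated = list(states)
    for number in line_numbers:
        updated[number - 1] = Attribution.HUMAN
    return updated


# Start a four-line file as UNKNOWN, then have a human claim lines 2 and 3.
states = initialize_attribution(4, default=Attribution.UNKNOWN)
states = claim_lines(states, [2, 3])
```

Using an ordered enum means "needs at least as much scrutiny as unknown" can be written as a simple comparison such as `state >= Attribution.UNKNOWN`.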

This feature might fit hand in hand with a mechanism similar to CODEOWNERS, where a repo can set up more fine-grained attribution rules, much as it might set rules about test coverage or ownership. (Perhaps the source files in src/lib/auth/*.rs should be stricter about allowing non-human-attributed code than, say, src/view/styles/*.css.) I'm imagining a CI pipeline that simply calls something like agentblame verify-attribution-rules to automatically reject PRs that don't meet some criteria. (The value here is maybe more about codifying an org's posture into actionable code that can generate actionable data, instead of e.g. a Confluence / wiki page with no real enforcement or method of measurement.)
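A rough sketch of what such a check could look like, under stated assumptions: the `Rule` type, the `max_non_human_fraction` threshold, the "last matching rule wins" ordering (borrowed from how CODEOWNERS resolves patterns), and the string state names are all invented for illustration. None of this reflects an existing agentblame command or file format.

```python
from __future__ import annotations

from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: a path pattern plus the allowed share of non-human lines."""

    pattern: str
    max_non_human_fraction: float


# Later rules override earlier ones, as in a CODEOWNERS file.
# Note: fnmatch-style "*" also matches "/" characters, so these
# patterns cover nested directories too.
RULES = [
    Rule("*", 1.0),                      # default: no restriction
    Rule("src/view/styles/*.css", 1.0),  # styles: anything goes
    Rule("src/lib/auth/*.rs", 0.0),      # auth: human-attributed lines only
]


def rule_for(path: str, rules: list[Rule]) -> Rule | None:
    """Return the last rule whose pattern matches the path, if any."""
    matched = None
    for rule in rules:
        if fnmatchcase(path, rule.pattern):
            matched = rule
    return matched


def verify(files: dict[str, list[str]], rules: list[Rule]) -> list[str]:
    """Return one violation message per file that exceeds its rule's threshold.

    `files` maps a path to its per-line attribution states
    ("human", "unknown", or "ai" in this sketch).
    """
    violations = []
    for path, states in sorted(files.items()):
        rule = rule_for(path, rules)
        if rule is None or not states:
            continue
        non_human = sum(1 for state in states if state != "human")
        fraction = non_human / len(states)
        if fraction > rule.max_non_human_fraction:
            violations.append(
                f"{path}: {non_human}/{len(states)} lines are not human-attributed "
                f"(rule {rule.pattern!r})"
            )
    return violations


changed = {
    "src/lib/auth/token.rs": ["human", "ai"],
    "src/view/styles/app.css": ["ai", "ai"],
}
violations = verify(changed, RULES)
```

A CI step could then fail the build whenever `violations` is non-empty, which is the "reject the PR" behavior described above. Treating "unknown" the same as "ai" for the threshold is a design choice of this sketch; a real rule format might want to distinguish the two.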
