Description
agentblame should better support pre-existing codebases via optional initialization configurations. Marked as a spike to kick off the discussion, but below are some ideas.
Currently, the only initialization option is to assume an entire codebase is human-generated. This is a sane default, but certainly one with an opinion. Different teams with stricter requirements may want a different initialization strategy. (Personally, in an existing codebase that has used agentic generation before integrating agentblame, I would prefer to default all existing code to an "unknown" state, and thus force a human to take attribution over the lines they really care about.) Marking a line of code as unknown attribution would be a signal for humans that is a bit stronger than "line was human generated," but a softer signal than "line was AI generated," when it comes to the concern of "does this code require additional scrutiny?"
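To make the "unknown" idea a bit more concrete, here is a rough sketch of what a configurable initialization default could look like. Everything in it (the `Attribution` enum, `initialize_attribution`, `claim_lines`) is a hypothetical name invented for illustration, not an existing agentblame API, and the ordering of the enum values is just one way to encode "how much extra scrutiny does this line need?"

```python
from enum import IntEnum


class Attribution(IntEnum):
    """Hypothetical attribution states, ordered by assumed scrutiny needed."""
    HUMAN = 0    # weakest signal: no extra scrutiny implied
    UNKNOWN = 1  # stronger than HUMAN, softer than AI
    AI = 2       # strongest signal: extra scrutiny expected


def initialize_attribution(line_count, default=Attribution.HUMAN):
    """Assign the same starting attribution to every pre-existing line.

    The HUMAN default mirrors the current behavior described above;
    passing UNKNOWN is the alternative strategy being proposed.
    """
    return [default] * line_count


def claim_lines(attributions, line_numbers):
    """Return a copy in which a human has taken attribution for the
    given 1-based line numbers; all other lines are left unchanged."""
    claimed = list(attributions)
    for n in line_numbers:
        claimed[n - 1] = Attribution.HUMAN
    return claimed
```

With something like this, a team could initialize an existing file as `initialize_attribution(3, Attribution.UNKNOWN)` and then have a reviewer explicitly claim only the lines they have vetted via `claim_lines(state, [2])`.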
This feature might fit hand-in-hand with a mechanism similar to CODEOWNERS, where a repo can set up more fine-grained attribution rules, similar to how they might assign rules about test coverage or ownership. (Perhaps the source files in src/lib/auth/*.rs should be stricter about allowing non-human-attributed code than, say, src/view/styles/*.css.) I'm imagining a CI pipeline that simply calls something like agentblame verify-attribution-rules to automatically reject PRs that don't meet some criteria. (The value here is maybe more about codifying an org's posture into enforceable rules that can generate actionable data, instead of e.g. a Confluence / wiki page with no real enforcement or method of measurement.)
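As a starting point for discussion, the rule check could be sketched roughly as below. The rule format (glob pattern plus a set of allowed attributions), the "last matching rule wins" precedence borrowed from CODEOWNERS, and the function names are all assumptions made up for this sketch; none of it reflects an existing agentblame command or file format.

```python
from fnmatch import fnmatchcase

# Hypothetical rules: (glob pattern, attributions allowed for matching files).
# Later rules override earlier ones, as in CODEOWNERS.
RULES = [
    ("*", {"human", "unknown", "ai"}),
    ("src/view/styles/*.css", {"human", "unknown", "ai"}),
    ("src/lib/auth/*.rs", {"human"}),
]


def allowed_for(path, rules=RULES):
    """Return the allowed attribution set from the last rule matching
    `path`, or None if no rule matches."""
    allowed = None
    for pattern, kinds in rules:
        if fnmatchcase(path, pattern):
            allowed = kinds
    return allowed


def find_violations(changed_lines, rules=RULES):
    """Return the (path, line_number, attribution) tuples that break a rule.

    `changed_lines` is an iterable of such tuples, e.g. derived from a PR diff.
    """
    violations = []
    for path, line_no, attribution in changed_lines:
        allowed = allowed_for(path, rules)
        if allowed is not None and attribution not in allowed:
            violations.append((path, line_no, attribution))
    return violations
```

A CI step could then exit non-zero whenever `find_violations(...)` returns a non-empty list, and the same list could be logged as the measurable data mentioned above.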