-
Notifications
You must be signed in to change notification settings - Fork 1.4k
Upstream model generated file and line linkification #1803
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
|
||
| export class FileLinkificationInstructions extends PromptElement<{}> { | ||
| render() { | ||
| return <Tag name='file_linkification'> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a gut feeling but this seems pretty verbose and I'm not sure how well models will follow the good/bad method written this way
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
| For casual greetings, acknowledgements, or other one-off conversational messages that are not delivering substantive information or structured results, respond naturally without section headers or bullet formatting.<br /> | ||
| <br /> | ||
| When referring to a filename or symbol in the user's workspace, wrap it in backticks.<br /> | ||
| <FileLinkificationInstructions /> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this should break up the example
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
| - Don'ts: no nested bullets/hierarchies; no ANSI codes; don't cram unrelated keywords; keep keyword lists short—wrap/reformat if long; avoid naming formatting styles in answers.<br /> | ||
| - Adaptation: code explanations → precise, structured with code refs; simple tasks → lead with outcome; big changes → logical walkthrough + rationale + next actions; casual one-offs → plain sentences, no headers/bullets.<br /> | ||
| - File References: When referencing files in your response, always follow the below rules:<br /> | ||
| * Use inline code to make file paths clickable.<br /> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should revisit these instructions too as they duplicate the linkfication stuff. I'd expect to have some base set of linkfication rules and then a (hopefully) small set of additional customizations for each model
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
| // Example: [src/file.ts](src/file.ts#L10-12) or [src/file.ts](src/file.ts) | ||
| const modelLinkRe = /\[(?<text>[^\]\n]+)\]\((?<target>[^\s)]+)\)/gu; | ||
|
|
||
| export class ModelFilePathLinkifier implements IContributedLinkifier { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's break up the existing FilePathLinkifier into the real links functionality and the inline code functionality. Maybe just delete the real links stuff from FilePathLinkifier and keep this class. I'd like to avoid the duplication though and make it so we just have one place that handles the markdown file links
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
… into vijayu/autoGen-links-3
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR moves file path linkification upstream to AI models by instructing them to emit canonical Markdown links with line number anchors ([path](path#L10-12)). A new ModelFilePathLinkifier processes these structured links first, with the existing FilePathLinkifier serving as fallback for other patterns.
Key changes:
- New
FileLinkificationInstructionscomponent with detailed prompting rules for models ModelFilePathLinkifierclass to parse model-generated links with#Lline anchors- Integration across all model-specific prompts (OpenAI, Anthropic, Gemini, xAI, VSC)
- Removal of markdown link handling from legacy
FilePathLinkifier - Enhanced
LinkifyLocationAnchorwithtitleparameter for display paths
Reviewed Changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| src/extension/prompts/node/agent/fileLinkificationInstructions.tsx | New shared prompt component with linkification instructions for models |
| src/extension/prompts/node/agent/openAIPrompts.tsx | Integrated FileLinkificationInstructions, updated GPT-5 examples, removed old inline instructions |
| src/extension/prompts/node/agent/anthropicPrompts.tsx | Added FileLinkificationInstructions to Claude prompts |
| src/extension/prompts/node/agent/geminiPrompts.tsx | Added FileLinkificationInstructions to Gemini prompts |
| src/extension/prompts/node/agent/xAIPrompts.tsx | Added FileLinkificationInstructions to Grok prompts |
| src/extension/prompts/node/agent/vscModelPrompts.tsx | Added FileLinkificationInstructions to VSC model prompts |
| src/extension/linkify/common/modelFilePathLinkifier.ts | New linkifier for parsing model-generated markdown links with line anchors |
| src/extension/linkify/common/filePathLinkifier.ts | Removed markdown link handling, improved directory slash handling |
| src/extension/linkify/common/linkifyService.ts | Registered ModelFilePathLinkifier before FilePathLinkifier for priority processing |
| src/extension/linkify/test/node/modelFilePathLinkifier.spec.ts | Test suite for new model linkifier functionality |
| src/extension/linkify/test/node/util.ts | Enhanced mock services to support directory detection and workspace methods |
| src/extension/prompt/node/chatParticipantRequestHandler.ts | Updated anchor-to-markdown conversion to use title field and include line numbers in labels |
Comments suppressed due to low confidence (4)
src/extension/prompts/node/agent/xAIPrompts.tsx:115
- [nitpick] The examples only show file references with backticks but don't demonstrate the line number linkification format introduced by
<FileLinkificationInstructions />. For consistency with the GPT-5 prompt (which includes an example like[src/network/httpClient.ts](src/network/httpClient.ts#L42)), consider adding a similar example to help the model understand when and how to use the line number format:
<Tag name='example'>
The class `Person` is in `src/models/person.ts`.<br />
The function `calculateTotal` is defined in [lib/utils/math.ts](lib/utils/math.ts#L42).<br />
You can find the configuration in `config/app.config.json`.
</Tag> <Tag name='example'>
The class `Person` is in `src/models/person.ts`.<br />
The function `calculateTotal` is defined in `lib/utils/math.ts`.<br />
You can find the configuration in `config/app.config.json`.
</Tag>
src/extension/prompts/node/agent/geminiPrompts.tsx:110
- [nitpick] The examples only show file references with backticks but don't demonstrate the line number linkification format introduced by
<FileLinkificationInstructions />. For consistency with the GPT-5 prompt (which includes an example like[src/network/httpClient.ts](src/network/httpClient.ts#L42)), consider adding a similar example to help the model understand when and how to use the line number format:
<Tag name='example'>
The class `Person` is in `src/models/person.ts`.<br />
The function `calculateTotal` is defined in [lib/utils/math.ts](lib/utils/math.ts#L42).<br />
You can find the configuration in `config/app.config.json`.
</Tag> <Tag name='example'>
The class `Person` is in `src/models/person.ts`.<br />
The function `calculateTotal` is defined in `lib/utils/math.ts`.<br />
You can find the configuration in `config/app.config.json`.
src/extension/prompts/node/agent/openAIPrompts.tsx:110
- [nitpick] The examples only show file references with backticks but don't demonstrate the line number linkification format introduced by
<FileLinkificationInstructions />. For consistency with the GPT-5 prompt (which includes an example like[src/network/httpClient.ts](src/network/httpClient.ts#L42)), consider adding a similar example to help the model understand when and how to use the line number format:
<Tag name='example'>
The class `Person` is in `src/models/person.ts`.<br />
The function `calculateTotal` is defined in [lib/utils/math.ts](lib/utils/math.ts#L42).<br />
You can find the configuration in `config/app.config.json`.
</Tag> <Tag name='example'>
The class `Person` is in `src/models/person.ts`.<br />
The function `calculateTotal` is defined in `lib/utils/math.ts`.<br />
You can find the configuration in `config/app.config.json`.
</Tag>
src/extension/prompt/node/chatParticipantRequestHandler.ts:463
- The markdown link format on line 463 appears incorrect. When generating markdown links for Location anchors with line numbers, the line number should be part of the URL fragment (e.g.,
path#L10), not as a separate title attribute in the markdown link syntax.
Currently: [text](path "title") where text includes #L10 but path doesn't
Expected: [text](path#L10) where the line number is part of the URL
Consider updating to:
if (isLocation(anchor.value)) {
const lineFragment = `#L${anchor.value.range.start.line + 1}${anchor.value.range.start.line === anchor.value.range.end.line ? '' : `-${anchor.value.range.end.line + 1}`}`;
path = `${getWorkspaceFileDisplayPath(workspaceService, anchor.value.uri)}${lineFragment}`;
const label = anchor.title ?? path;
text = `\`${label}\``;
}This ensures the URL includes the line fragment so it's linkifiable by the ModelFilePathLinkifier.
} else if (isLocation(anchor.value)) {
path = getWorkspaceFileDisplayPath(workspaceService, anchor.value.uri);
const label = anchor.title ?? `${path}#L${anchor.value.range.start.line + 1}${anchor.value.range.start.line === anchor.value.range.end.line ? '' : `-${anchor.value.range.end.line + 1}`}`;
text = `\`${label}\``;
} else if (isSymbolInformation(anchor.value)) {
path = getWorkspaceFileDisplayPath(workspaceService, anchor.value.location.uri);
text = `\`${anchor.value.name}\``;
} else {
// Unknown anchor type
return '';
}
return `[${text}](${path} ${anchor.title ? `"${anchor.title}"` : ''})`;
src/extension/prompts/node/agent/fileLinkificationInstructions.tsx
Outdated
Show resolved
Hide resolved
src/extension/prompts/node/agent/fileLinkificationInstructions.tsx
Outdated
Show resolved
Hide resolved
| * foo.ts | ||
| * ``` | ||
| */ | ||
| export class FilePathLinkifier implements IContributedLinkifier { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we still need this class at all? Maybe just delete it and call new ModelFilePathLinkifier FilePathLinkifier instead
It seems like the new linkifier should handle everything that the current one does. If there are any gaps we can fix those instead of keeping the fallback around
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now ModelFilePathLinkifier handles structured markdown links with anchors texttext, while FilePathLinkifier handles inline code and plain text paths.
They're registered with clear separation of concerns, where FilePathLinkifier acts as a fallback for informal references.
That said, I'm happy to explore merging them if you think that would be better! Let me know your preference and I can make it work either way.
|
|
||
| // Matches markdown links where the text is a path and optional #L anchor is present | ||
| // Example: [src/file.ts](src/file.ts#L10-12) or [src/file.ts](src/file.ts) | ||
| const modelLinkRe = /\[(?<text>[^\]\n]+)\]\((?<target>[^\s)]+)\)/gu; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sure this can support the case where the path text is code too as this is fairly common and models have likely been trained on a lot of it:
[`file.ts`](file.ts)
Especially true if we get rid of the old linkifier
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, works. Added unit test to validate this.
| continue; | ||
| } | ||
|
|
||
| const resolved = await this.resolveTarget(parsed.targetPath, workspaceFolders, parsed.preserveDirectorySlash); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's try to run these type of calls in parallel instead of awaiting each iteration
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Valid point. Updated
| - `See [path/to/file.ts](path/to/file.ts#L10-12) for the range.`<br /> | ||
| - `Configuration is defined in [path/to/file.ts](path/to/file.ts).` (whole file)<br /> | ||
| When you need a bullet list of references, write a short description before the link, for example:<br /> | ||
| - Await chat view: [path/to/chatQuick.ts](path/to/chatQuick.ts#L142)<br /> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this the syntax we want or should we support:
[Await chat view](path/to/chatQuick.ts#L142)
That feels more natural if we can get the model to generate these links properly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, updated
| - Show widget: [path/to/chatQuick.ts](path/to/chatQuick.ts#L321)<br /> | ||
| Examples:<br /> | ||
| ❌ `The function is in exampleScript.ts at line 25.`<br /> | ||
| ✓ `The function is in [exampleScript.ts](exampleScript.ts#L25).`<br /> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the ❌ how we provided counter examples to the model? Seems like it's mostly used in our docs, not in the actual prompts
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated
| return <Tag name='file_linkification'> | ||
| ALWAYS convert file paths to markdown links with 1-based line numbers whenever you cite specific code locations. Use links inside normal sentences, not as the entire answer.<br /> | ||
| Format examples:<br /> | ||
| - `The handler lives in [path/to/file.ts](path/to/file.ts#L10).` (single line)<br /> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the advantage of this over a path in backticks? We're asking the model to repeat the path twice, it could be long.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In every other path scenario, we ask it to use absolute paths. How do you handle relative paths within multiroot workspaces?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And right now I think we support linkifying many different patterns, does this mean we would no longer support those patterns? (I strongly support simplifying the patterns we handle)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For multi root workspace, currently the first match is selected.
| - `The handler lives in [path/to/file.ts](path/to/file.ts#L10).` (single line)<br /> | ||
| - `See [path/to/file.ts](path/to/file.ts#L10-12) for the range.`<br /> | ||
| - `Configuration is defined in [path/to/file.ts](path/to/file.ts).` (whole file)<br /> | ||
| When you need a bullet list of references, write a short description before the link, for example:<br /> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
model was generating responses like below, without any description on what those lines represent.
chatListRenderer.ts#857-857
chatListRenderer.ts#1001-1001
chatListRenderer.ts#1477-1477
chatListRenderer.ts#1486-1486
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's interesting. I wonder if all the rules about links caused it to hyper-focus on links specifically and forget what is was actually trying to communicate 😅
|
|
||
| export class FileLinkificationInstructions extends PromptElement<{}> { | ||
| render() { | ||
| return <Tag name='file_linkification'> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a lot, I just don't think all these rules and examples are necessary for a modern model. What you're asking for is extremely simple, I would expect that a decent model can pick it up with just a couple sentences.
We could do it in a scientific way and write some evals to test all models and how far we can trim the rules while still getting good results.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Valid point, I started with simple few sentence prompts, but I got inconsistent results across models. Within the model also it was inconsistent between different files. I will try pruning it offline separately (with evals) and simplify without loosing the intent.
Motivation
Move file path linkification upstream to the model so it emits canonical, machine‑parsable Markdown links (path), reducing reliance on heuristic fallback and enabling precise line anchors. Linkification falls back to current component if model does not handle it.
Also ensure consistent behavior across all model families (OpenAI/GPT‑5, Anthropic/Claude, etc.) while eliminating duplicated prompt text.
This is example response with line number linkification: