-
Notifications
You must be signed in to change notification settings - Fork 3.1k
Edit Logging Functionality + Utilities for Next Edit #6268
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…ture/next-edit-mvp
…ture/next-edit-mvp
…ture/next-edit-mvp
…to jacob/feature/next-edit-mvp
…s context to vscode messenger
…ture/next-edit-mvp
Jacob/feature/next edit mvp
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool to run this locally and see my first next edit prediction! Worked well for the most part, some of the insertion logic felt a bit off but will start to share more concrete examples of that once we get this shipped.
I got this error trying to use the feature with a Hub secret, but didn't get the error with a hardcoded API key, so probably fine to ignore for now:
HTTP 500 Internal Server Error from https://api.continue-stage.tools/model-proxy/v1/chat/completions Error in Continue server: 502 Bad Gateway
Lot of code in the PR so I didn't dive too deep into the meat of the nextEdit
code.
A few high level thoughts as we build out the feature
- Where are there opportunities to unit test some of the behavior? Benefits being to provide examples of expected inputs/outputs, speed up feedback loops (eg the agent can run the tests to iterate on your feedback), lockdown behavior, etc. Unit tests are pretty easy to write with LLMs
- We tend to not comment our code very heavily, but in this case with you guys building a large feature but not necessarily being around afterwards to maintain it, I would like to err on the side of over-communication/documentation in the code. I think what this looks like in practice: a
nextEdit/README.md
with things like general feature overview, basic diagrams, concept definitions, etc; comments on non-obvious params, functions, etc - e2e tests for a sanity check that next edit is working would be nice. This might be a bit tricky with the experimental toggle. One way we could maybe get around that is by just enabling next edit by default for Continue org members
core/core.ts
Outdated
on("files/smallEdit", async ({ data }) => { | ||
const EDIT_AGGREGATION_OPTIONS = { | ||
deltaT: 1.0, | ||
deltaL: 5, | ||
maxEdits: 250, | ||
maxDuration: 100.0, | ||
contextSize: 5, | ||
}; | ||
|
||
if (!global._editAggregator) { | ||
global._editAggregator = new EditAggregator( | ||
EDIT_AGGREGATION_OPTIONS, | ||
( | ||
beforeAfterdiff: BeforeAfterDiff, | ||
cursorPosBeforeEdit: Position, | ||
cursorPosAfterPrevEdit: Position, | ||
) => { | ||
NextEditProvider.getInstance().addDiffToContext( | ||
createDiff( | ||
beforeAfterdiff.beforeContent, | ||
beforeAfterdiff.afterContent, | ||
beforeAfterdiff.filePath, | ||
DiffFormatType.Unified, | ||
), | ||
); | ||
|
||
// Get the current context data from the most recent message | ||
const currentData = (global._editAggregator as any) | ||
.latestContextData || { | ||
configHandler: data.configHandler, | ||
getDefsFromLspFunction: data.getDefsFromLspFunction, | ||
recentlyEditedRanges: [], | ||
recentlyVisitedRanges: [], | ||
}; | ||
|
||
void processNextEditData( | ||
beforeAfterdiff.filePath, | ||
beforeAfterdiff.beforeContent, | ||
beforeAfterdiff.afterContent, | ||
cursorPosBeforeEdit, | ||
cursorPosAfterPrevEdit, | ||
this.ide, | ||
currentData.configHandler, | ||
currentData.getDefsFromLspFunction, | ||
currentData.recentlyEditedRanges, | ||
currentData.recentlyVisitedRanges, | ||
currentData.workspaceDir, | ||
); | ||
}, | ||
); | ||
} | ||
|
||
const workspaceDir = | ||
data.actions.length > 0 ? data.actions[0].workspaceDir : undefined; | ||
|
||
// Store the latest context data on the aggregator | ||
(global._editAggregator as any).latestContextData = { | ||
configHandler: data.configHandler, | ||
getDefsFromLspFunction: data.getDefsFromLspFunction, | ||
recentlyEditedRanges: data.recentlyEditedRanges, | ||
recentlyVisitedRanges: data.recentlyVisitedRanges, | ||
workspaceDir: workspaceDir, | ||
}; | ||
|
||
// queueMicrotask prevents blocking the UI thread during typing | ||
queueMicrotask(() => { | ||
void global._editAggregator.processEdits(data.actions); | ||
}); | ||
}); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we pull this out into a helper fn in the nextEdit
folder? This file has a lot of inline fn definitions like this but we've been trying to avoid it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
resolved as per commit 7aebea0
export interface EditClusterConfig { | ||
deltaT: number; | ||
deltaL: number; | ||
maxEdits: number; | ||
maxDuration: number; | ||
contextSize: number; | ||
contextLines: number; | ||
verbose: boolean; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comments on what these are would be helpful. Most are somewhat self-explanatory but the delta
ones I don't know
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This probably goes for most of the interfaces. Would be nice to have this well documented.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
resolved as per commit 7aebea0
core/core.ts
Outdated
( | ||
beforeAfterdiff: BeforeAfterDiff, | ||
cursorPosBeforeEdit: Position, | ||
cursorPosAfterPrevEdit: Position, | ||
) => { | ||
NextEditProvider.getInstance().addDiffToContext( | ||
createDiff( | ||
beforeAfterdiff.beforeContent, | ||
beforeAfterdiff.afterContent, | ||
beforeAfterdiff.filePath, | ||
DiffFormatType.Unified, | ||
), | ||
); | ||
|
||
// Get the current context data from the most recent message | ||
const currentData = (global._editAggregator as any) | ||
.latestContextData || { | ||
configHandler: data.configHandler, | ||
getDefsFromLspFunction: data.getDefsFromLspFunction, | ||
recentlyEditedRanges: [], | ||
recentlyVisitedRanges: [], | ||
}; | ||
|
||
void processNextEditData( | ||
beforeAfterdiff.filePath, | ||
beforeAfterdiff.beforeContent, | ||
beforeAfterdiff.afterContent, | ||
cursorPosBeforeEdit, | ||
cursorPosAfterPrevEdit, | ||
this.ide, | ||
currentData.configHandler, | ||
currentData.getDefsFromLspFunction, | ||
currentData.recentlyEditedRanges, | ||
currentData.recentlyVisitedRanges, | ||
currentData.workspaceDir, | ||
); | ||
}, | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we pull this out into a function def instead of defining it anonymously in the callback?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
resolved as per commit 7aebea0
export const processNextEditData = async ( | ||
filePath: string, | ||
beforeContent: string, | ||
afterContent: string, | ||
cursorPosBeforeEdit: Position, | ||
cursorPosAfterPrevEdit: Position, | ||
ide: IDE, | ||
configHandler: ConfigHandler, | ||
getDefinitionsFromLsp: GetLspDefinitionsFunction, | ||
recentlyEditedRanges: RecentlyEditedRange[], | ||
recentlyVisitedRanges: AutocompleteCodeSnippet[], | ||
workspaceDir: string, | ||
modelNameOrInstance?: string | undefined, | ||
// eslint-disable-next-line max-params | ||
) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would agree with the eslint-disable-next-line max-params
here 😅
The easy/nice fix is to just define a single param which is an object. The main benefits here are that 1) it makes each param more clear, and 2) you can pass the args in any order you want
export const processNextEditData = async ( | |
filePath: string, | |
beforeContent: string, | |
afterContent: string, | |
cursorPosBeforeEdit: Position, | |
cursorPosAfterPrevEdit: Position, | |
ide: IDE, | |
configHandler: ConfigHandler, | |
getDefinitionsFromLsp: GetLspDefinitionsFunction, | |
recentlyEditedRanges: RecentlyEditedRange[], | |
recentlyVisitedRanges: AutocompleteCodeSnippet[], | |
workspaceDir: string, | |
modelNameOrInstance?: string | undefined, | |
// eslint-disable-next-line max-params | |
) => { | |
interface ProcessNextEditDataParams { | |
filePath: string; | |
beforeContent: string; | |
afterContent: string; | |
cursorPosBeforeEdit: Position; | |
cursorPosAfterPrevEdit: Position; | |
ide: IDE; | |
configHandler: ConfigHandler; | |
getDefinitionsFromLsp: GetLspDefinitionsFunction; | |
recentlyEditedRanges: RecentlyEditedRange[]; | |
recentlyVisitedRanges: AutocompleteCodeSnippet[]; | |
workspaceDir: string; | |
modelNameOrInstance?: string | undefined; | |
} | |
export const processNextEditData = async ({ | |
filePath, | |
beforeContent, | |
afterContent, | |
cursorPosBeforeEdit, | |
cursorPosAfterPrevEdit, | |
ide, | |
configHandler, | |
getDefinitionsFromLsp, | |
recentlyEditedRanges, | |
recentlyVisitedRanges, | |
workspaceDir, | |
modelNameOrInstance, | |
}: ProcessNextEditDataParams) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! resolved as per commit 7aebea0
content: string; | ||
} | ||
|
||
export function renderDefaultSystemPrompt(): SystemPrompt { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this would be easier to read if it was written as a template literal instead of an array of strings. We do that for our other system messages.
Also, thoughts on adding an example in this prompt? Feels likely that would help.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using templates now. I feel like the template makes it self-explaining, but let me know if you think otherwise.
Description
Logs user editing data to construct a next-edit-prediction dataset, as follows:
Introduces capability to aggregate (chunk) user actions into coherent edits based on spatial and temporal locality. Captures the user cursor position before the edit and after the previous edit, for use in constructing an editable-region excerpt. Establishes a pipeline to get autocomplete context snippets manually, which is useful for logging alongside edits. Routes items necessary for edit logging to dev data using a new nextEdit schema, conditional on the user enabling this experimental feature. Limits logging to the open-source continue repo for data safety.
This PR also includes several of @jpoly1219's updates which lay foundations for next edit functionality. These are disabled for now.
EDIT: @jpoly1219 merged his changes here as well ( see #6287 ). They show up under this PR.
Checklist