Skip to content

feat: transactional applier for structured multi-file edits#96

Open
jamubc wants to merge 1 commit into
mainfrom
feat/transactional-edit-apply
Open

feat: transactional applier for structured multi-file edits#96
jamubc wants to merge 1 commit into
mainfrom
feat/transactional-edit-apply

Conversation

@jamubc

@jamubc jamubc commented Jun 5, 2026

Copy link
Copy Markdown
Owner

Summary

Adds apply-edits, a tool that accepts the raw OLD/NEW text produced by ask-gemini changeMode:true and applies every edit atomically — all files are written or none are, with automatic rollback on partial failure. This makes repo-scale, multi-file refactors produced via changeMode safe to apply in one step.

What changes

  • src/utils/changeModeApplier.ts — a pure, IO-injectable utility:
    • planApply resolves each target path (confined to the project root; absolute, ~-prefixed, and ..-escaping paths rejected before any read/write), then requires each OLD block to match exactly once (0 → "not found", >1 → "ambiguous"), building the new file content in memory.
    • renderDiff produces a unified-diff preview with no external dependencies.
    • commitPlan validates everything first, then writes; if a write throws partway, already-written files are restored to their original content before re-throwing.
  • src/tools/apply-edits.tool.tsapply-edits UnifiedTool (category utility). Two-phase by design: dryRun (default true) returns the diff and writes nothing; applying requires dryRun: false and confirm: true.

Verification

  • npx tsc --noEmit — clean
  • node scripts/run-tests.mjs unit integration — 81 pass / 0 fail (planApply success/not-found/ambiguous/path-escape, renderDiff, and commitPlan atomicity + rollback are all covered)

Notes

  • Opt-in: a new tool; no existing behavior changes.
  • OLD-block matching is textual (exact substring), matching the existing changeMode contract — whitespace-divergent blocks report "not found" rather than applying a wrong edit.

Adds the `apply-edits` tool and its backing `changeModeApplier` utility,
enabling repo-scale refactors produced by `ask-gemini changeMode` to be
validated and applied atomically.

- `src/utils/changeModeApplier.ts`: pure, IO-injectable module with
  `planApply` (validate + in-memory apply), `renderDiff` (line-based
  unified diff, no external deps), and `commitPlan` (all-or-nothing
  write with rollback on partial failure).
- `src/tools/apply-edits.tool.ts`: `apply-edits` UnifiedTool (category
  utility). Two-phase: dryRun:true (default) previews; dryRun:false +
  confirm:true commits. Paths are confined to cwd (rejects absolute,
  ~-prefixed, and ..-escaping references).
- `src/constants.ts`: added APPLY_EDITS_* error and status messages.
- `src/tools/index.ts`: registers `applyEditsTool`.
- `test/unit/utils/changeModeApplier.test.ts`: hermetic unit tests for
  planApply (success, not-found, ambiguous, path escapes), renderDiff,
  and commitPlan atomicity / rollback.
- `test/unit/tools/applyEdits.test.ts`: tool-level dryRun, error-path,
  and registry/schema tests.
- `docs/usage/apply-edits.md`: usage guide (workflow, parameters, error
  table, example session).

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces the apply-edits tool, which transactionally applies structured changeMode edit sets (OLD/NEW blocks) produced by ask-gemini. It includes validation of edits, an in-memory diff preview, and atomic file writes with automatic rollback on failure. The PR also adds comprehensive unit tests, utility functions, and documentation. The review feedback highlights two important improvement opportunities: optimizing the $O(M \cdot N)$ LCS line diff algorithm by trimming common prefixes/suffixes to prevent OOM or event loop blocking on large files, and normalizing CRLF/LF line endings to ensure robust cross-platform matching on Windows.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment on lines +298 to +337
function diffLines(a: string[], b: string[]): DiffOp[] {
const m = a.length;
const n = b.length;

// Build LCS length table.
// dp[i][j] = LCS length of a[0..i-1] vs b[0..j-1]
const dp: number[][] = [];
for (let i = 0; i <= m; i++) {
dp[i] = new Array(n + 1).fill(0);
}
for (let i = 1; i <= m; i++) {
for (let j = 1; j <= n; j++) {
if (a[i - 1] === b[j - 1]) {
dp[i][j] = dp[i - 1][j - 1] + 1;
} else {
dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]);
}
}
}

// Back-track to build the edit sequence.
const ops: DiffOp[] = [];
let i = m;
let j = n;
while (i > 0 || j > 0) {
if (i > 0 && j > 0 && a[i - 1] === b[j - 1]) {
ops.push({ type: 'equal', line: a[i - 1] });
i--;
j--;
} else if (j > 0 && (i === 0 || dp[i][j - 1] >= dp[i - 1][j])) {
ops.push({ type: 'add', line: b[j - 1] });
j--;
} else {
ops.push({ type: 'remove', line: a[i - 1] });
i--;
}
}
ops.reverse();
return ops;
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The current diffLines implementation uses a standard $O(M \cdot N)$ LCS dynamic programming algorithm over the entire file contents. For large files (e.g., thousands of lines), this will allocate a massive 2D array, leading to high memory consumption, potential Out-Of-Memory (OOM) crashes, and blocking the single-threaded Node.js event loop for several seconds.\n\nSince code edits are typically highly localized, we can optimize this dramatically by:\n1. Trimming the common prefix and common suffix of the two files before running the DP. This reduces the DP table size to only the modified region (usually very small).\n2. Adding a safeguard threshold (e.g., $10,000,000$ operations) to fall back to a simple block-based diff if a massive file is completely rewritten, preventing any potential event loop freeze or OOM.

function diffLines(a: string[], b: string[]): DiffOp[] {\n  const m = a.length;\n  const n = b.length;\n\n  // Find common prefix\n  let start = 0;\n  while (start < m && start < n && a[start] === b[start]) {\n    start++;\n  }\n\n  // Find common suffix\n  let endA = m - 1;\n  let endB = n - 1;\n  while (endA >= start && endB >= start && a[endA] === b[endB]) {\n    endA--;\n    endB--;\n  }\n\n  const middleA = a.slice(start, endA + 1);\n  const middleB = b.slice(start, endB + 1);\n\n  const mMid = middleA.length;\n  const nMid = middleB.length;\n\n  // Reconstruct full ops helper\n  const reconstructOps = (middleOps: DiffOp[]): DiffOp[] => {\n    const prefixOps: DiffOp[] = [];\n    for (let k = 0; k < start; k++) {\n      prefixOps.push({ type: 'equal', line: a[k] });\n    }\n    const suffixOps: DiffOp[] = [];\n    for (let k = endA + 1; k < m; k++) {\n      suffixOps.push({ type: 'equal', line: a[k] });\n    }\n    return [...prefixOps, ...middleOps, ...suffixOps];\n  };\n\n  // Fallback to simple diff to prevent OOM/CPU block on massive completely-rewritten files\n  if (mMid * nMid > 10000000) {\n    const middleOps: DiffOp[] = [];\n    for (const line of middleA) {\n      middleOps.push({ type: 'remove', line });\n    }\n    for (const line of middleB) {\n      middleOps.push({ type: 'add', line });\n    }\n    return reconstructOps(middleOps);\n  }\n\n  // Build LCS length table for the middle part only.\n  const dp: number[][] = [];\n  for (let i = 0; i <= mMid; i++) {\n    dp[i] = new Array(nMid + 1).fill(0);\n  }\n  for (let i = 1; i <= mMid; i++) {\n    for (let j = 1; j <= nMid; j++) {\n      if (middleA[i - 1] === middleB[j - 1]) {\n        dp[i][j] = dp[i - 1][j - 1] + 1;\n      } else {\n        dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]);\n      }\n    }\n  }\n\n  // Back-track to build the edit sequence for the middle part.\n  const middleOps: DiffOp[] = [];\n  let i = mMid;\n  let j = nMid;\n  while (i > 0 || j > 0) {\n    if (i > 0 && j > 0 && middleA[i - 1] === middleB[j - 1]) {\n      middleOps.push({ type: 'equal', line: middleA[i - 1] });\n      i--;\n      j--;\n    } else if (j > 0 && (i === 0 || dp[i][j - 1] >= dp[i - 1][j])) {\n      middleOps.push({ type: 'add', line: middleB[j - 1] });\n      j--;\n    } else {\n      middleOps.push({ type: 'remove', line: middleA[i - 1] });\n      i--;\n    }\n  }\n  middleOps.reverse();\n\n  return reconstructOps(middleOps);\n}

Comment on lines +137 to +144
let currentContent = originalContent;
let fileErrors = 0;

for (const { filename, edit } of entries) {
const { oldCode, newCode } = edit;

// Count occurrences of oldCode in the current (already-partially-edited) content.
const occurrences = countOccurrences(currentContent, oldCode);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

On Windows platforms, files often use CRLF (\r\n) line endings, whereas LLM outputs (and parsed edit blocks) typically use LF (\n). If there is a mismatch, the exact substring matching in countOccurrences will fail to find the oldCode block, resulting in a "not found" error.\n\nTo make the tool robust across different platforms, we should detect the line endings of the target file and normalize the edit's oldCode and newCode line endings to match the file before performing the search and replacement.

    const hasCrlf = originalContent.includes('\r\n');\n    let currentContent = originalContent;\n    let fileErrors = 0;\n\n    for (const { filename, edit } of entries) {\n      let oldCode = edit.oldCode;\n      let newCode = edit.newCode;\n\n      if (hasCrlf) {\n        oldCode = oldCode.replace(/\r?\n/g, '\r\n');\n        newCode = newCode.replace(/\r?\n/g, '\r\n');\n      } else {\n        oldCode = oldCode.replace(/\r\n/g, '\n');\n        newCode = newCode.replace(/\r\n/g, '\n');\n      }\n\n      // Count occurrences of oldCode in the current (already-partially-edited) content.\n      const occurrences = countOccurrences(currentContent, oldCode);

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants