feat: expand git submodule contents in sandbox filetree#238
Conversation
The GitHub Trees API returns submodules as type "commit" entries, which previously appeared as files in the filetree. Now detects submodule entries, fetches .gitmodules to resolve their URLs, and recursively fetches each submodule's tree to include as directory contents. Co-Authored-By: Claude Opus 4.6 <[email protected]>
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
|
Warning Rate limit exceeded
⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ⛔ Files ignored due to path filters (1)
📒 Files selected for processing (2)
📝 WalkthroughWalkthroughAdds Git submodule support to repository file-tree retrieval: detects submodule entries, fetches and parses Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant getRepoFileTree as getRepoFileTree()
participant GitHubAPI as GitHub API
participant getRepoGitModules as getRepoGitModules()
participant parseGitModules as parseGitModules()
Client->>getRepoFileTree: Request file tree
getRepoFileTree->>GitHubAPI: Fetch repo tree
GitHubAPI-->>getRepoFileTree: Tree entries (may include submodules)
alt Submodules detected
getRepoFileTree->>getRepoGitModules: Fetch .gitmodules (owner,repo,branch)
getRepoGitModules->>GitHubAPI: Request raw .gitmodules
GitHubAPI-->>getRepoGitModules: .gitmodules content / 404
getRepoGitModules->>parseGitModules: Parse content
parseGitModules-->>getRepoFileTree: SubmoduleEntry[] (path,url)
loop For each submodule
getRepoFileTree->>GitHubAPI: Fetch submodule repo tree (via resolved URL)
GitHubAPI-->>getRepoFileTree: Submodule tree entries
getRepoFileTree->>getRepoFileTree: Recursively expand and prepend parent path
end
end
getRepoFileTree-->>Client: Expanded file tree (including nested submodules)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Poem
🚥 Pre-merge checks | ✅ 1✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
lib/github/getRepoFileTree.ts (1)
18-116: 🛠️ Refactor suggestion | 🟠 MajorDecompose
getRepoFileTreeto keep it under 50 lines and SRP‑aligned.This function now handles token validation, repo metadata retrieval, tree fetch, gitmodules parsing, and submodule expansion in one block. Extracting helpers (e.g.,
fetchDefaultBranch,fetchRepoTree,fetchGitmodules,expandSubmodules) will improve readability, testing, and maintenance.As per coding guidelines, “Single responsibility per function” and “Keep functions under 50 lines.”
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/github/getRepoFileTree.ts` around lines 18 - 116, getRepoFileTree is too large and does multiple responsibilities; split it into small helpers: implement fetchDefaultBranch(repoInfo, headers) to get default_branch using parseGitHubRepoUrl, fetchRepoTree(repoInfo, branch, headers) to call /git/trees?recursive=1 and return tree array, fetchGitmodules(repoInfo, branch, headers) to fetch the raw .gitmodules text (or null if missing), and expandSubmodules(submoduleEntries, submodulesMap, headers) (or expandSubmodules(submoduleEntries) that calls getRepoFileTree recursively) to resolve submodule entries into flattened FileTreeEntry[]; then refactor getRepoFileTree to validate GITHUB_TOKEN, build headers, call these helpers in sequence and assemble regularEntries exactly as before, preserving use of parseGitModules and the FileTreeEntry shape.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@lib/github/getRepoFileTree.ts`:
- Around line 97-104: The submodule expansion in getRepoFileTree ignores the
pinned submodule commit because it calls getRepoFileTree(url) without using
sub.sha; fix by resolving the submodule commit SHA to its tree SHA before
recursing: when iterating submoduleEntries (and using submoduleUrlMap), if
sub.sha exists fetch the commit object for that repo/sha to extract
commit.tree.sha, then call getRepoFileTree with that tree SHA (or extend
getRepoFileTree to accept an explicit tree SHA/ref) instead of calling
getRepoFileTree(url) so the recursive fetch uses the pinned commit’s tree rather
than the default branch.
---
Outside diff comments:
In `@lib/github/getRepoFileTree.ts`:
- Around line 18-116: getRepoFileTree is too large and does multiple
responsibilities; split it into small helpers: implement
fetchDefaultBranch(repoInfo, headers) to get default_branch using
parseGitHubRepoUrl, fetchRepoTree(repoInfo, branch, headers) to call
/git/trees?recursive=1 and return tree array, fetchGitmodules(repoInfo, branch,
headers) to fetch the raw .gitmodules text (or null if missing), and
expandSubmodules(submoduleEntries, submodulesMap, headers) (or
expandSubmodules(submoduleEntries) that calls getRepoFileTree recursively) to
resolve submodule entries into flattened FileTreeEntry[]; then refactor
getRepoFileTree to validate GITHUB_TOKEN, build headers, call these helpers in
sequence and assemble regularEntries exactly as before, preserving use of
parseGitModules and the FileTreeEntry shape.
ℹ️ Review info
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (2)
lib/github/__tests__/getRepoFileTree.test.tsis excluded by!**/*.test.*,!**/__tests__/**and included bylib/**lib/github/__tests__/parseGitModules.test.tsis excluded by!**/*.test.*,!**/__tests__/**and included bylib/**
📒 Files selected for processing (2)
lib/github/getRepoFileTree.tslib/github/parseGitModules.ts
lib/github/getRepoFileTree.ts
Outdated
| const submoduleResults = await Promise.all( | ||
| submoduleEntries.map(async sub => { | ||
| const url = submoduleUrlMap.get(sub.path); | ||
| if (!url) return { path: sub.path, sha: sub.sha, entries: null }; | ||
|
|
||
| const entries = await getRepoFileTree(url); | ||
| return { path: sub.path, sha: sub.sha, entries }; | ||
| }), |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
GitHub API git trees endpoint accept commit SHA parameter documentation
💡 Result:
The Git trees endpoint is documented as:
GET /repos/{owner}/{repo}/git/trees/{tree_sha}- where
tree_shais “the SHA1 value or ref (branch or tag) name of the tree” (i.e., a tree SHA, or a branch/tag ref name), per the official docs. [1]
If you have a commit SHA, the documented flow is:
GET /repos/{owner}/{repo}/git/commits/{commit_sha}to retrieve the commit object (the response includes atree.sha). [2]GET /repos/{owner}/{repo}/git/trees/{tree_sha}using thattree.sha. [1]
Docs: Git trees [1], Git commits (get commit object) [2].
Submodule expansion ignores pinned commit SHA.
Line 102 calls getRepoFileTree(url) without the submodule's sha, so the recursive fetch uses the submodule's default branch instead of the pinned commit, which can return incorrect contents.
The GitHub Trees API endpoint /git/trees/{tree_sha} accepts a tree SHA or branch/tag ref name, but not a commit SHA directly. To use a commit SHA, you must first fetch the commit object to extract its tree SHA.
🔧 Suggested fix (fetch commit object to get tree SHA)
-export async function getRepoFileTree(githubRepoUrl: string): Promise<FileTreeEntry[] | null> {
+export async function getRepoFileTree(
+ githubRepoUrl: string,
+ ref?: string,
+): Promise<FileTreeEntry[] | null> {
const token = process.env.GITHUB_TOKEN;
if (!token) {
console.error("GITHUB_TOKEN environment variable is not set");
return null;
}
const repoInfo = parseGitHubRepoUrl(githubRepoUrl);
if (!repoInfo) {
console.error(`Failed to parse GitHub repo URL: ${githubRepoUrl}`);
return null;
}
const headers = {
Authorization: `Bearer ${token}`,
Accept: "application/vnd.github.v3+json",
"User-Agent": "Recoup-API",
};
try {
- const repoResponse = await fetch(
- `https://api.github.com/repos/${repoInfo.owner}/${repoInfo.repo}`,
- { headers },
- );
- if (!repoResponse.ok) {
- console.error(`GitHub API error fetching repo: ${repoResponse.status}`);
- return null;
- }
- const repoData = (await repoResponse.json()) as { default_branch: string };
- const defaultBranch = repoData.default_branch;
+ let treeRef = ref;
+
+ // If ref is a commit SHA (64 hex chars), fetch the commit to get its tree SHA
+ if (treeRef && /^[a-f0-9]{40}$/.test(treeRef)) {
+ const commitResponse = await fetch(
+ `https://api.github.com/repos/${repoInfo.owner}/${repoInfo.repo}/git/commits/${treeRef}`,
+ { headers },
+ );
+ if (!commitResponse.ok) {
+ console.error(`GitHub API error fetching commit: ${commitResponse.status}`);
+ return null;
+ }
+ const commitData = (await commitResponse.json()) as { tree: { sha: string } };
+ treeRef = commitData.tree.sha;
+ } else if (!treeRef) {
+ // No ref provided; fetch default branch
+ const repoResponse = await fetch(
+ `https://api.github.com/repos/${repoInfo.owner}/${repoInfo.repo}`,
+ { headers },
+ );
+ if (!repoResponse.ok) {
+ console.error(`GitHub API error fetching repo: ${repoResponse.status}`);
+ return null;
+ }
+ const repoData = (await repoResponse.json()) as { default_branch: string };
+ treeRef = repoData.default_branch;
+ }
const treeResponse = await fetch(
- `https://api.github.com/repos/${repoInfo.owner}/${repoInfo.repo}/git/trees/${defaultBranch}?recursive=1`,
+ `https://api.github.com/repos/${repoInfo.owner}/${repoInfo.repo}/git/trees/${treeRef}?recursive=1`,
{ headers },- const entries = await getRepoFileTree(url);
+ const entries = await getRepoFileTree(url, sub.sha);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@lib/github/getRepoFileTree.ts` around lines 97 - 104, The submodule expansion
in getRepoFileTree ignores the pinned submodule commit because it calls
getRepoFileTree(url) without using sub.sha; fix by resolving the submodule
commit SHA to its tree SHA before recursing: when iterating submoduleEntries
(and using submoduleUrlMap), if sub.sha exists fetch the commit object for that
repo/sha to extract commit.tree.sha, then call getRepoFileTree with that tree
SHA (or extend getRepoFileTree to accept an explicit tree SHA/ref) instead of
calling getRepoFileTree(url) so the recursive fetch uses the pinned commit’s
tree rather than the default branch.
Moves the raw.githubusercontent.com fetch + .gitmodules parsing out of getRepoFileTree into a dedicated getRepoGitModules function for SRP. Co-Authored-By: Claude Opus 4.6 <[email protected]>
Moves submodule detection, URL resolution, tree fetching, and merging out of getRepoFileTree into a dedicated expandSubmoduleEntries function. getRepoFileTree now only fetches the raw GitHub tree and delegates submodule expansion. Co-Authored-By: Claude Opus 4.6 <[email protected]>
There was a problem hiding this comment.
♻️ Duplicate comments (1)
lib/github/getRepoFileTree.ts (1)
96-102: Submodule expansion still ignores pinned commit SHA.Line 101 calls
getRepoFileTree(url)without usingsub.sha, so contents may not match the pinned submodule state. This matches the prior review note.GitHub REST API "Get a tree" commit SHA vs tree SHA; how to resolve commit SHA to tree SHA🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/github/getRepoFileTree.ts` around lines 96 - 102, The submodule expansion ignores the pinned commit SHA because getRepoFileTree(url) is called without sub.sha; update the logic so submoduleResults passes the pinned SHA into getRepoFileTree (e.g., call getRepoFileTree(url, sub.sha)) or modify getRepoFileTree to accept an optional commitSha parameter and when provided resolve the commit to its tree SHA (use the GitHub API to get the commit -> commit.tree.sha or fetch the tree for that commit) before requesting the tree entries; reference submoduleResults, submoduleEntries, submoduleUrlMap, getRepoFileTree, and sub.sha when making this change.
🧹 Nitpick comments (1)
lib/github/getRepoFileTree.ts (1)
18-115: Consider extracting submodule expansion helpers to keep this function short.This function now spans well over 50 lines; splitting tree fetch and submodule merge into private helpers would improve readability/testability. As per coding guidelines, keep functions under 50 lines.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/github/getRepoFileTree.ts` around lines 18 - 115, The getRepoFileTree function is too long; extract the tree-fetching and submodule-expansion logic into small helpers to keep getRepoFileTree under 50 lines. Concretely: move the code that calls the GitHub tree API and builds regularEntries/submoduleEntries into a helper like fetchRepoTree(repoInfo, defaultBranch, headers) that returns { regularEntries, submoduleEntries }, and move the logic that resolves submodules (calls getRepoGitModules, maps submodule URLs, recursively calls getRepoFileTree, and merges entries) into a helper like expandSubmodules(submoduleEntries, repoInfo, defaultBranch, headers). Update getRepoFileTree to call these two helpers in sequence and return the final entries. Use the existing identifiers (getRepoFileTree, getRepoGitModules, submoduleEntries, regularEntries) so callers and tests remain easy to update.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@lib/github/getRepoFileTree.ts`:
- Around line 96-102: The submodule expansion ignores the pinned commit SHA
because getRepoFileTree(url) is called without sub.sha; update the logic so
submoduleResults passes the pinned SHA into getRepoFileTree (e.g., call
getRepoFileTree(url, sub.sha)) or modify getRepoFileTree to accept an optional
commitSha parameter and when provided resolve the commit to its tree SHA (use
the GitHub API to get the commit -> commit.tree.sha or fetch the tree for that
commit) before requesting the tree entries; reference submoduleResults,
submoduleEntries, submoduleUrlMap, getRepoFileTree, and sub.sha when making this
change.
---
Nitpick comments:
In `@lib/github/getRepoFileTree.ts`:
- Around line 18-115: The getRepoFileTree function is too long; extract the
tree-fetching and submodule-expansion logic into small helpers to keep
getRepoFileTree under 50 lines. Concretely: move the code that calls the GitHub
tree API and builds regularEntries/submoduleEntries into a helper like
fetchRepoTree(repoInfo, defaultBranch, headers) that returns { regularEntries,
submoduleEntries }, and move the logic that resolves submodules (calls
getRepoGitModules, maps submodule URLs, recursively calls getRepoFileTree, and
merges entries) into a helper like expandSubmodules(submoduleEntries, repoInfo,
defaultBranch, headers). Update getRepoFileTree to call these two helpers in
sequence and return the final entries. Use the existing identifiers
(getRepoFileTree, getRepoGitModules, submoduleEntries, regularEntries) so
callers and tests remain easy to update.
ℹ️ Review info
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
lib/github/__tests__/getRepoGitModules.test.tsis excluded by!**/*.test.*,!**/__tests__/**and included bylib/**
📒 Files selected for processing (2)
lib/github/getRepoFileTree.tslib/github/getRepoGitModules.ts
Summary
"commit") in the GitHub Trees API response, which previously appeared as plain files in the filetree.gitmodulesto resolve submodule GitHub URLs, then recursively fetches each submodule's file tree"tree") entries.gitmodulesor a submodule fetch fails, the submodule still appears as an empty directoryTest plan
parseGitModulesunit tests (6 tests) — parsing single/multiple entries, spaces, incomplete entriesgetRepoFileTreesubmodule tests (5 tests) — expansion,.gitmodulesfailure, missing URL, fetch failure, multiple submodulesgetRepoFileTreetests pass (8 existing + 5 new)getSandboxesHandlertests still pass (no changes needed at handler level)🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Bug Fixes
Documentation