feat: expand git submodule contents in sandbox filetree#238

Merged
sweetmantech merged 3 commits into test from
sweetmantech/myc-4324-api-get-apisandboxes-update-filetree-to-include-contents-of
Feb 24, 2026

Conversation

@sweetmantech sweetmantech commented Feb 24, 2026

Summary

  • Detects git submodule entries (type "commit") in the GitHub Trees API response, which previously appeared as plain files in the filetree
  • Fetches .gitmodules to resolve submodule GitHub URLs, then recursively fetches each submodule's file tree
  • Merges submodule contents into the parent tree with correct path prefixes, converting submodule entries to directory ("tree") entries
  • Gracefully degrades: if .gitmodules or a submodule fetch fails, the submodule still appears as an empty directory
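
The merge step described above can be sketched as a pure function. This is a minimal illustration, not the PR's code: the `TreeEntry` shape and the `mergeSubmoduleTrees` name are assumptions made for the example, and all fetching is left out so that only the path prefixing and the empty-directory fallback are shown.

```typescript
// Hypothetical entry shape; the PR's actual FileTreeEntry type is not shown in this thread.
interface TreeEntry {
  path: string;
  type: "blob" | "tree" | "commit";
  sha: string;
}

/**
 * Merge already-fetched submodule trees into the parent tree.
 * `fetched` maps a submodule path to its entries, or to null when the
 * URL could not be resolved or the fetch failed.
 */
function mergeSubmoduleTrees(
  parent: TreeEntry[],
  fetched: Map<string, TreeEntry[] | null>,
): TreeEntry[] {
  const merged: TreeEntry[] = [];
  for (const entry of parent) {
    if (entry.type !== "commit") {
      merged.push(entry);
      continue;
    }
    // A submodule becomes a directory entry, even when its contents are unknown.
    merged.push({ ...entry, type: "tree" });
    const children = fetched.get(entry.path) ?? [];
    for (const child of children) {
      // Prefix each child path with the submodule's path in the parent repo.
      merged.push({ ...child, path: `${entry.path}/${child.path}` });
    }
  }
  return merged;
}

const parentTree: TreeEntry[] = [
  { path: "README.md", type: "blob", sha: "a1" },
  { path: "vendor/lib", type: "commit", sha: "b2" },
  { path: "vendor/missing", type: "commit", sha: "c3" },
];
const fetchedTrees = new Map<string, TreeEntry[] | null>([
  ["vendor/lib", [{ path: "src/index.ts", type: "blob", sha: "d4" }]],
  ["vendor/missing", null],
]);
const mergedTree = mergeSubmoduleTrees(parentTree, fetchedTrees);
```

In this example `vendor/lib` gains a nested `vendor/lib/src/index.ts` entry, while `vendor/missing` still appears as an empty directory.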

Test plan

  • Added parseGitModules unit tests (6 tests) — parsing single/multiple entries, spaces, incomplete entries
  • Added getRepoFileTree submodule tests (5 tests) — expansion, .gitmodules failure, missing URL, fetch failure, multiple submodules
  • All 13 getRepoFileTree tests pass (8 existing + 5 new)
  • All 16 getSandboxesHandler tests still pass (no changes needed at handler level)
  • Lint clean on all changed files

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added support for Git submodules: .gitmodules are fetched and parsed, submodule contents are recursively expanded and shown as nested entries; unresolved submodules are added as placeholders.
    • New parsing and fetch logic to read .gitmodules and integrate submodule trees.
  • Bug Fixes

    • Improved error handling and graceful fallbacks for missing auth, parse failures, and API errors.
  • Documentation

    • Added note about submodule expansion in function docs.

The GitHub Trees API returns submodules as type "commit" entries, which
previously appeared as files in the filetree. Now detects submodule entries,
fetches .gitmodules to resolve their URLs, and recursively fetches each
submodule's tree to include as directory contents.

Co-Authored-By: Claude Opus 4.6 <[email protected]>

vercel bot commented Feb 24, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
|---|---|---|---|
| recoup-api | Ready | Preview | Feb 24, 2026 4:37pm |

Request Review

coderabbitai bot commented Feb 24, 2026

Warning

Rate limit exceeded

@sweetmantech has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 14 minutes and 48 seconds before requesting another review.

📥 Commits

Reviewing files that changed from the base of the PR and between c790eb5 and 5255b17.

⛔ Files ignored due to path filters (1)
  • lib/github/__tests__/expandSubmoduleEntries.test.ts is excluded by !**/*.test.*, !**/__tests__/** and included by lib/**
📒 Files selected for processing (2)
  • lib/github/expandSubmoduleEntries.ts
  • lib/github/getRepoFileTree.ts
Walkthrough

Adds Git submodule support to repository file-tree retrieval: detects submodule entries, fetches and parses .gitmodules, and recursively expands submodule file trees into the parent tree with fallback placeholders and existing error handling.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Submodule Expansion Core (lib/github/getRepoFileTree.ts) | Detects submodule entries (type "commit"), fetches .gitmodules via new helper, builds path→URL map, recursively resolves and inlines submodule file trees under parent paths; preserves fallbacks when .gitmodules or submodule fetches fail. |
| Submodule Parsing Utility (lib/github/parseGitModules.ts) | New module exporting SubmoduleEntry and parseGitModules(content: string) which parses .gitmodules content line-by-line and returns path/url pairs. |
| GitModules Fetcher (lib/github/getRepoGitModules.ts) | New module getRepoGitModules({owner, repo, branch}) that fetches raw .gitmodules from GitHub (honors GITHUB_TOKEN) and returns parsed SubmoduleEntry[] or null on non-OK responses. |
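
A line-by-line .gitmodules parser along these lines might look like the sketch below. It is an illustration under assumptions, not the code merged in this PR: the function is named `parseGitModulesSketch` to make that explicit, and the rule of dropping sections that lack a path or url is inferred from the "incomplete entries" test case mentioned in the test plan.

```typescript
interface SubmoduleEntry {
  path: string;
  url: string;
}

// Assumed behavior: one entry per [submodule "..."] section that has both path and url.
function parseGitModulesSketch(content: string): SubmoduleEntry[] {
  const sections: Array<{ path?: string; url?: string }> = [];
  for (const rawLine of content.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line.startsWith("[submodule")) {
      // A new section header starts a fresh, initially empty entry.
      sections.push({});
      continue;
    }
    const current = sections[sections.length - 1];
    const match = /^(path|url)\s*=\s*(.+)$/.exec(line);
    if (!current || !match) continue;
    const value = match[2].trim();
    if (match[1] === "path") current.path = value;
    else current.url = value;
  }
  const entries: SubmoduleEntry[] = [];
  for (const { path, url } of sections) {
    // Incomplete sections are skipped rather than reported as errors.
    if (path && url) entries.push({ path, url });
  }
  return entries;
}

const sampleGitModules = [
  '[submodule "vendor/lib"]',
  "\tpath = vendor/lib",
  "\turl = https://github.com/example/lib.git",
  '[submodule "broken"]',
  "\tpath = broken",
].join("\n");

const parsedModules = parseGitModulesSketch(sampleGitModules);
// The path→URL map used to look up each submodule entry found in the tree.
const urlByPath = new Map(parsedModules.map(e => [e.path, e.url] as const));
```

Here the `broken` section has no url, so it is absent from `urlByPath`; per the PR summary, such a submodule would still be shown as an empty directory.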

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant getRepoFileTree as getRepoFileTree()
    participant GitHubAPI as GitHub API
    participant getRepoGitModules as getRepoGitModules()
    participant parseGitModules as parseGitModules()

    Client->>getRepoFileTree: Request file tree
    getRepoFileTree->>GitHubAPI: Fetch repo tree
    GitHubAPI-->>getRepoFileTree: Tree entries (may include submodules)

    alt Submodules detected
        getRepoFileTree->>getRepoGitModules: Fetch .gitmodules (owner,repo,branch)
        getRepoGitModules->>GitHubAPI: Request raw .gitmodules
        GitHubAPI-->>getRepoGitModules: .gitmodules content / 404
        getRepoGitModules->>parseGitModules: Parse content
        parseGitModules-->>getRepoFileTree: SubmoduleEntry[] (path,url)

        loop For each submodule
            getRepoFileTree->>GitHubAPI: Fetch submodule repo tree (via resolved URL)
            GitHubAPI-->>getRepoFileTree: Submodule tree entries
            getRepoFileTree->>getRepoFileTree: Recursively expand and prepend parent path
        end
    end

    getRepoFileTree-->>Client: Expanded file tree (including nested submodules)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🌲 Submodules whispered, then widened the view,
Paths stitched together, old and new,
.gitmodules read with tidy regard,
Trees unfolded, nested and starred,
One repository, many stories now true.

🚥 Pre-merge checks | ✅ 1
✅ Passed checks (1 passed)
| Check name | Status | Explanation |
|---|---|---|
| Solid & Clean Code | ✅ Passed | PR demonstrates strong SOLID adherence with excellent separation of concerns. New modules (parseGitModules, getRepoGitModules) follow single responsibility principle with focused purposes. Code is well-tested with 13 comprehensive tests covering edge cases. |

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
lib/github/getRepoFileTree.ts (1)

18-116: 🛠️ Refactor suggestion | 🟠 Major

Decompose getRepoFileTree to keep it under 50 lines and SRP‑aligned.

This function now handles token validation, repo metadata retrieval, tree fetch, gitmodules parsing, and submodule expansion in one block. Extracting helpers (e.g., fetchDefaultBranch, fetchRepoTree, fetchGitmodules, expandSubmodules) will improve readability, testing, and maintenance.

As per coding guidelines, “Single responsibility per function” and “Keep functions under 50 lines.”

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/github/getRepoFileTree.ts` around lines 18 - 116, getRepoFileTree is too
large and does multiple responsibilities; split it into small helpers: implement
fetchDefaultBranch(repoInfo, headers) to get default_branch using
parseGitHubRepoUrl, fetchRepoTree(repoInfo, branch, headers) to call
/git/trees?recursive=1 and return tree array, fetchGitmodules(repoInfo, branch,
headers) to fetch the raw .gitmodules text (or null if missing), and
expandSubmodules(submoduleEntries, submodulesMap, headers) (or
expandSubmodules(submoduleEntries) that calls getRepoFileTree recursively) to
resolve submodule entries into flattened FileTreeEntry[]; then refactor
getRepoFileTree to validate GITHUB_TOKEN, build headers, call these helpers in
sequence and assemble regularEntries exactly as before, preserving use of
parseGitModules and the FileTreeEntry shape.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0f8656c and 3c9ac81.

⛔ Files ignored due to path filters (2)
  • lib/github/__tests__/getRepoFileTree.test.ts is excluded by !**/*.test.*, !**/__tests__/** and included by lib/**
  • lib/github/__tests__/parseGitModules.test.ts is excluded by !**/*.test.*, !**/__tests__/** and included by lib/**
📒 Files selected for processing (2)
  • lib/github/getRepoFileTree.ts
  • lib/github/parseGitModules.ts

Comment on lines +97 to +104
const submoduleResults = await Promise.all(
  submoduleEntries.map(async sub => {
    const url = submoduleUrlMap.get(sub.path);
    if (!url) return { path: sub.path, sha: sub.sha, entries: null };

    const entries = await getRepoFileTree(url);
    return { path: sub.path, sha: sub.sha, entries };
  }),

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

GitHub API git trees endpoint accept commit SHA parameter documentation

💡 Result:

The Git trees endpoint is documented as:

  • GET /repos/{owner}/{repo}/git/trees/{tree_sha}
  • where tree_sha is “the SHA1 value or ref (branch or tag) name of the tree” (i.e., a tree SHA, or a branch/tag ref name), per the official docs. [1]

If you have a commit SHA, the documented flow is:

  1. GET /repos/{owner}/{repo}/git/commits/{commit_sha} to retrieve the commit object (the response includes a tree.sha). [2]
  2. GET /repos/{owner}/{repo}/git/trees/{tree_sha} using that tree.sha. [1]

Docs: Git trees [1], Git commits (get commit object) [2].
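
The two-step flow in the result above can be sketched as a small standalone helper. This is a hedged illustration rather than the PR's implementation: `resolveTreeSha`, `buildTreeUrl`, and the `JsonFetcher` type are hypothetical names, and the fetcher is injected so the sketch does not depend on any particular `fetch` typing or on authentication headers.

```typescript
// Minimal response shape (an assumption for this sketch) so any fetch-like function can be plugged in.
type JsonFetcher = (url: string) => Promise<{
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}>;

// Step 1: fetch the commit object and read its tree SHA.
async function resolveTreeSha(
  owner: string,
  repo: string,
  commitSha: string,
  fetchJson: JsonFetcher,
): Promise<string | null> {
  const response = await fetchJson(
    `https://api.github.com/repos/${owner}/${repo}/git/commits/${commitSha}`,
  );
  // Returning null lets the caller fall back to an empty directory.
  if (!response.ok) return null;
  const commit = (await response.json()) as { tree: { sha: string } };
  return commit.tree.sha;
}

// Step 2: build the Git trees URL for that tree SHA.
function buildTreeUrl(owner: string, repo: string, treeSha: string): string {
  return `https://api.github.com/repos/${owner}/${repo}/git/trees/${treeSha}?recursive=1`;
}
```

A caller could pass a thin wrapper around the global `fetch` that adds the `Authorization` header, then request `buildTreeUrl(...)` with the returned tree SHA.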


Submodule expansion ignores pinned commit SHA.

Line 102 calls getRepoFileTree(url) without the submodule's sha, so the recursive fetch uses the submodule's default branch instead of the pinned commit, which can return incorrect contents.

The GitHub Trees API endpoint /git/trees/{tree_sha} accepts a tree SHA or branch/tag ref name, but not a commit SHA directly. To use a commit SHA, you must first fetch the commit object to extract its tree SHA.

🔧 Suggested fix (fetch commit object to get tree SHA)
-export async function getRepoFileTree(githubRepoUrl: string): Promise<FileTreeEntry[] | null> {
+export async function getRepoFileTree(
+  githubRepoUrl: string,
+  ref?: string,
+): Promise<FileTreeEntry[] | null> {
   const token = process.env.GITHUB_TOKEN;
   if (!token) {
     console.error("GITHUB_TOKEN environment variable is not set");
     return null;
   }

   const repoInfo = parseGitHubRepoUrl(githubRepoUrl);
   if (!repoInfo) {
     console.error(`Failed to parse GitHub repo URL: ${githubRepoUrl}`);
     return null;
   }

   const headers = {
     Authorization: `Bearer ${token}`,
     Accept: "application/vnd.github.v3+json",
     "User-Agent": "Recoup-API",
   };

   try {
-    const repoResponse = await fetch(
-      `https://api.github.com/repos/${repoInfo.owner}/${repoInfo.repo}`,
-      { headers },
-    );
-    if (!repoResponse.ok) {
-      console.error(`GitHub API error fetching repo: ${repoResponse.status}`);
-      return null;
-    }
-    const repoData = (await repoResponse.json()) as { default_branch: string };
-    const defaultBranch = repoData.default_branch;
+    let treeRef = ref;
+    
+    // If ref is a commit SHA (40 hex chars), fetch the commit to get its tree SHA
+    if (treeRef && /^[a-f0-9]{40}$/.test(treeRef)) {
+      const commitResponse = await fetch(
+        `https://api.github.com/repos/${repoInfo.owner}/${repoInfo.repo}/git/commits/${treeRef}`,
+        { headers },
+      );
+      if (!commitResponse.ok) {
+        console.error(`GitHub API error fetching commit: ${commitResponse.status}`);
+        return null;
+      }
+      const commitData = (await commitResponse.json()) as { tree: { sha: string } };
+      treeRef = commitData.tree.sha;
+    } else if (!treeRef) {
+      // No ref provided; fetch default branch
+      const repoResponse = await fetch(
+        `https://api.github.com/repos/${repoInfo.owner}/${repoInfo.repo}`,
+        { headers },
+      );
+      if (!repoResponse.ok) {
+        console.error(`GitHub API error fetching repo: ${repoResponse.status}`);
+        return null;
+      }
+      const repoData = (await repoResponse.json()) as { default_branch: string };
+      treeRef = repoData.default_branch;
+    }

     const treeResponse = await fetch(
-      `https://api.github.com/repos/${repoInfo.owner}/${repoInfo.repo}/git/trees/${defaultBranch}?recursive=1`,
+      `https://api.github.com/repos/${repoInfo.owner}/${repoInfo.repo}/git/trees/${treeRef}?recursive=1`,
       { headers },
-        const entries = await getRepoFileTree(url);
+        const entries = await getRepoFileTree(url, sub.sha);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/github/getRepoFileTree.ts` around lines 97 - 104, The submodule expansion
in getRepoFileTree ignores the pinned submodule commit because it calls
getRepoFileTree(url) without using sub.sha; fix by resolving the submodule
commit SHA to its tree SHA before recursing: when iterating submoduleEntries
(and using submoduleUrlMap), if sub.sha exists fetch the commit object for that
repo/sha to extract commit.tree.sha, then call getRepoFileTree with that tree
SHA (or extend getRepoFileTree to accept an explicit tree SHA/ref) instead of
calling getRepoFileTree(url) so the recursive fetch uses the pinned commit’s
tree rather than the default branch.

Moves the raw.githubusercontent.com fetch + .gitmodules parsing out of
getRepoFileTree into a dedicated getRepoGitModules function for SRP.

Co-Authored-By: Claude Opus 4.6 <[email protected]>

Moves submodule detection, URL resolution, tree fetching, and merging
out of getRepoFileTree into a dedicated expandSubmoduleEntries function.
getRepoFileTree now only fetches the raw GitHub tree and delegates
submodule expansion.

Co-Authored-By: Claude Opus 4.6 <[email protected]>

@coderabbitai coderabbitai bot left a comment

♻️ Duplicate comments (1)
lib/github/getRepoFileTree.ts (1)

96-102: Submodule expansion still ignores pinned commit SHA.

Line 101 calls getRepoFileTree(url) without using sub.sha, so contents may not match the pinned submodule state. This matches the prior review note.

GitHub REST API "Get a tree" commit SHA vs tree SHA; how to resolve commit SHA to tree SHA
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/github/getRepoFileTree.ts` around lines 96 - 102, The submodule expansion
ignores the pinned commit SHA because getRepoFileTree(url) is called without
sub.sha; update the logic so submoduleResults passes the pinned SHA into
getRepoFileTree (e.g., call getRepoFileTree(url, sub.sha)) or modify
getRepoFileTree to accept an optional commitSha parameter and when provided
resolve the commit to its tree SHA (use the GitHub API to get the commit ->
commit.tree.sha or fetch the tree for that commit) before requesting the tree
entries; reference submoduleResults, submoduleEntries, submoduleUrlMap,
getRepoFileTree, and sub.sha when making this change.
🧹 Nitpick comments (1)
lib/github/getRepoFileTree.ts (1)

18-115: Consider extracting submodule expansion helpers to keep this function short.

This function now spans well over 50 lines; splitting tree fetch and submodule merge into private helpers would improve readability/testability. As per coding guidelines, keep functions under 50 lines.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/github/getRepoFileTree.ts` around lines 18 - 115, The getRepoFileTree
function is too long; extract the tree-fetching and submodule-expansion logic
into small helpers to keep getRepoFileTree under 50 lines. Concretely: move the
code that calls the GitHub tree API and builds regularEntries/submoduleEntries
into a helper like fetchRepoTree(repoInfo, defaultBranch, headers) that returns
{ regularEntries, submoduleEntries }, and move the logic that resolves
submodules (calls getRepoGitModules, maps submodule URLs, recursively calls
getRepoFileTree, and merges entries) into a helper like
expandSubmodules(submoduleEntries, repoInfo, defaultBranch, headers). Update
getRepoFileTree to call these two helpers in sequence and return the final
entries. Use the existing identifiers (getRepoFileTree, getRepoGitModules,
submoduleEntries, regularEntries) so callers and tests remain easy to update.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3c9ac81 and c790eb5.

⛔ Files ignored due to path filters (1)
  • lib/github/__tests__/getRepoGitModules.test.ts is excluded by !**/*.test.*, !**/__tests__/** and included by lib/**
📒 Files selected for processing (2)
  • lib/github/getRepoFileTree.ts
  • lib/github/getRepoGitModules.ts

@sweetmantech sweetmantech merged commit c99f439 into test Feb 24, 2026
3 checks passed