Skip to content

⚡ Bolt: [Performance Fix] Optimize N+1 queries in report card validation#231

Open
ldsgroups225 wants to merge 1 commit intomasterfrom
bolt/optimize-report-card-validation-15308150819762732944
Open

⚡ Bolt: [Performance Fix] Optimize N+1 queries in report card validation#231
ldsgroups225 wants to merge 1 commit intomasterfrom
bolt/optimize-report-card-validation-15308150819762732944

Conversation

@ldsgroups225
Copy link
Copy Markdown
Owner

@ldsgroups225 ldsgroups225 commented Apr 1, 2026

💡 What:
Replaced the Promise.all logic inside validateReportCardPayloads that sequentially mapped over each ID array with single parallelized Drizzle bulk operations leveraging the inArray() condition.

🎯 Why:
The original approach suffered from an N+1 query vulnerability where for a large input payload of report cards, multiple single rows were queried instead of bulk retrieving everything. This led to high DB connection limits and heavy I/O operations for larger imports.

📊 Impact:
Significantly reduces query load from O(N) queries to an exact parallelization of O(1) query per unique ID group (5 queries total, run simultaneously via Promise.all()).

🔬 Measurement:
Run pnpm --filter @repo/data-ops test locally and check test passing results. The execution time of report card seeding routines should drop substantially.


PR created automatically by Jules for task 15308150819762732944 started by @ldsgroups225

Summary by CodeRabbit

  • Bug Fixes

    • Added explicit validation checks for report card payloads to ensure all referenced students, classes, terms, school years, and templates exist, with clear error messages when data is missing.
  • Refactor

    • Improved report card validation performance through optimized query patterns, reducing overall execution time.

@google-labs-jules
Copy link
Copy Markdown

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 1, 2026

📝 Walkthrough

Walkthrough

The validateReportCardPayloads function was refactored to replace sequential per-ID scoped lookups with bulk parallelized Drizzle queries using inArray() filters, eliminating N+1 query patterns and adding explicit count validation for each entity type (students, classes, terms, school years, templates).

Changes

Cohort / File(s) Summary
Database Query Optimization
packages/data-ops/src/queries/report-cards.ts
Refactored validateReportCardPayloads to replace per-ID getScoped* function calls with bulk inArray() Drizzle queries executed in parallel via Promise.all(), adds explicit length-matching validation for each result set, and throws dbError('NOT_FOUND', ...) when retrieval counts don't match requested IDs.
Changelog Documentation
pr_body.md
Added new changelog entry documenting the N+1 query pattern elimination and bulk query optimization strategy, with guidance for test execution and performance verification.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Poem

🐰 Queries once ran one-by-one, so slow,
But bulk operations steal the show!
Parallel threads dance through the night,
One database roundtrip, shining bright! ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and specifically describes the main change: optimizing N+1 queries in report card validation by converting sequential per-ID queries to parallelized bulk operations.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch bolt/optimize-report-card-validation-15308150819762732944

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
packages/data-ops/src/queries/report-cards.ts (2)

435-449: Count validation is correct; error messages could be more specific for debugging.

The cardinality checks correctly detect missing/unauthorized entities. However, unlike the previous per-ID approach, the error doesn't identify which specific ID(s) failed validation. For large batch imports, this could complicate troubleshooting.

💡 Optional: Include missing IDs in error details for easier debugging
   if (studentResults.length !== uniqueStudentIds.length) {
-    throw dbError('NOT_FOUND', 'One or more students not found or access denied')
+    const foundIds = new Set(studentResults.map(s => s.id))
+    const missingIds = uniqueStudentIds.filter(id => !foundIds.has(id))
+    throw dbError('NOT_FOUND', 'One or more students not found or access denied', { missingIds })
   }

Apply the same pattern for other entity types. The details parameter of dbError supports arbitrary metadata.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/data-ops/src/queries/report-cards.ts` around lines 435 - 449, The
count checks are fine but the thrown dbError calls should include which specific
IDs are missing to aid debugging: for each entity type (studentResults vs
uniqueStudentIds, classResults vs uniqueClassIds, termResults vs uniqueTermIds,
schoolYearResults vs uniqueSchoolYearIds, templateResults vs uniqueTemplateIds)
compute the set difference (e.g., IDs in uniqueXIds not present in results.map(r
=> r.id)) and pass those missing IDs in the dbError details parameter (use
dbError('NOT_FOUND', 'One or more X not found or access denied', { missingIds
})) so each error message names the missing IDs while keeping the existing error
code and text.

424-433: Performance optimization looks correct; consider breaking up long lines for readability.

The bulk query refactoring correctly preserves the existing behavior:

  • All queries maintain schoolId scoping for multi-tenant isolation
  • The filters match the original getScopedX helper functions (no additional status filters were present before)

The implementation is functionally sound, but the single-line queries (lines 428-432) are quite long and hard to scan.

♻️ Optional: Break queries into separate statements for readability
   const db = getDb()

-  // ⚡ Bolt Optimization: Replace Promise.all/N+1 queries with parallelized bulk Drizzle queries
-  const [studentResults, classResults, termResults, schoolYearResults, templateResults] = await Promise.all([
-    uniqueStudentIds.length ? db.select({ id: students.id }).from(students).where(and(inArray(students.id, uniqueStudentIds), eq(students.schoolId, schoolId))) : Promise.resolve([]),
-    uniqueClassIds.length ? db.select({ id: classes.id, schoolYearId: classes.schoolYearId }).from(classes).where(and(inArray(classes.id, uniqueClassIds), eq(classes.schoolId, schoolId))) : Promise.resolve([]),
-    uniqueTermIds.length ? db.select({ id: terms.id, schoolYearId: terms.schoolYearId }).from(terms).innerJoin(schoolYears, eq(terms.schoolYearId, schoolYears.id)).where(and(inArray(terms.id, uniqueTermIds), eq(schoolYears.schoolId, schoolId))) : Promise.resolve([]),
-    uniqueSchoolYearIds.length ? db.select({ id: schoolYears.id }).from(schoolYears).where(and(inArray(schoolYears.id, uniqueSchoolYearIds), eq(schoolYears.schoolId, schoolId))) : Promise.resolve([]),
-    uniqueTemplateIds.length ? db.select({ id: reportCardTemplates.id }).from(reportCardTemplates).where(and(inArray(reportCardTemplates.id, uniqueTemplateIds), eq(reportCardTemplates.schoolId, schoolId))) : Promise.resolve([]),
-  ])
+  // ⚡ Bolt Optimization: Replace Promise.all/N+1 queries with parallelized bulk Drizzle queries
+  const [studentResults, classResults, termResults, schoolYearResults, templateResults] = await Promise.all([
+    uniqueStudentIds.length
+      ? db.select({ id: students.id })
+          .from(students)
+          .where(and(inArray(students.id, uniqueStudentIds), eq(students.schoolId, schoolId)))
+      : Promise.resolve([]),
+    uniqueClassIds.length
+      ? db.select({ id: classes.id, schoolYearId: classes.schoolYearId })
+          .from(classes)
+          .where(and(inArray(classes.id, uniqueClassIds), eq(classes.schoolId, schoolId)))
+      : Promise.resolve([]),
+    uniqueTermIds.length
+      ? db.select({ id: terms.id, schoolYearId: terms.schoolYearId })
+          .from(terms)
+          .innerJoin(schoolYears, eq(terms.schoolYearId, schoolYears.id))
+          .where(and(inArray(terms.id, uniqueTermIds), eq(schoolYears.schoolId, schoolId)))
+      : Promise.resolve([]),
+    uniqueSchoolYearIds.length
+      ? db.select({ id: schoolYears.id })
+          .from(schoolYears)
+          .where(and(inArray(schoolYears.id, uniqueSchoolYearIds), eq(schoolYears.schoolId, schoolId)))
+      : Promise.resolve([]),
+    uniqueTemplateIds.length
+      ? db.select({ id: reportCardTemplates.id })
+          .from(reportCardTemplates)
+          .where(and(inArray(reportCardTemplates.id, uniqueTemplateIds), eq(reportCardTemplates.schoolId, schoolId)))
+      : Promise.resolve([]),
+  ])
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/data-ops/src/queries/report-cards.ts` around lines 424 - 433, The
bulk parallel query block using getDb() returns studentResults, classResults,
termResults, schoolYearResults, and templateResults but the individual
db.select(...) calls are written as very long single-line expressions; refactor
each conditional query (the expressions referencing students, classes, terms,
schoolYears, reportCardTemplates and their inArray/eq filters) into separate,
short, well-named intermediate statements (e.g., studentQuery, classQuery,
termQuery, schoolYearQuery, templateQuery) before the Promise.all call so each
query is readable while preserving the exact logic and schoolId scoping; then
feed those intermediate promises to Promise.all to keep behavior identical.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@packages/data-ops/src/queries/report-cards.ts`:
- Around line 435-449: The count checks are fine but the thrown dbError calls
should include which specific IDs are missing to aid debugging: for each entity
type (studentResults vs uniqueStudentIds, classResults vs uniqueClassIds,
termResults vs uniqueTermIds, schoolYearResults vs uniqueSchoolYearIds,
templateResults vs uniqueTemplateIds) compute the set difference (e.g., IDs in
uniqueXIds not present in results.map(r => r.id)) and pass those missing IDs in
the dbError details parameter (use dbError('NOT_FOUND', 'One or more X not found
or access denied', { missingIds })) so each error message names the missing IDs
while keeping the existing error code and text.
- Around line 424-433: The bulk parallel query block using getDb() returns
studentResults, classResults, termResults, schoolYearResults, and
templateResults but the individual db.select(...) calls are written as very long
single-line expressions; refactor each conditional query (the expressions
referencing students, classes, terms, schoolYears, reportCardTemplates and their
inArray/eq filters) into separate, short, well-named intermediate statements
(e.g., studentQuery, classQuery, termQuery, schoolYearQuery, templateQuery)
before the Promise.all call so each query is readable while preserving the exact
logic and schoolId scoping; then feed those intermediate promises to Promise.all
to keep behavior identical.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e17173ba-e0e3-4d52-acb5-52f3e7a5b4ca

📥 Commits

Reviewing files that changed from the base of the PR and between 7e37ef8 and 239f420.

📒 Files selected for processing (2)
  • packages/data-ops/src/queries/report-cards.ts
  • pr_body.md

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant