Skip to content

🚨 Lighthouse: Optimize multiple COUNT queries#228

Open
ldsgroups225 wants to merge 1 commit intomasterfrom
lighthouse-count-optimization-8444507562885530474
Open

🚨 Lighthouse: Optimize multiple COUNT queries#228
ldsgroups225 wants to merge 1 commit intomasterfrom
lighthouse-count-optimization-8444507562885530474

Conversation

@ldsgroups225
Copy link
Copy Markdown
Owner

@ldsgroups225 ldsgroups225 commented Apr 1, 2026

Impact on performance:
Significantly reduces database load and latency for the dashboard and analytics pages by combining multiple COUNT(*) queries on the same table (such as schools and students) into a single database query. This eliminates multiple parallel table scans and DB connection overhead, shifting the conditional logic to Postgres.

Technical rationale:
Previously, the code executed several independent db.select({ count: count() }) queries with different WHERE clauses for different statuses or date ranges using Promise.all(). Now, it uses a single db.select() with multiple sql<number>\count(*) filter (where ...)`columns, leveraging PostgreSQL'sFILTER` clause. This represents a substantial query optimization by converting N queries into 1.


PR created automatically by Jules for task 8444507562885530474 started by @ldsgroups225

Summary by CodeRabbit

Refactor

  • Optimized performance of analytics and dashboard statistics retrieval, delivering faster data loading for end-users.

@google-labs-jules
Copy link
Copy Markdown

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 1, 2026

📝 Walkthrough

Walkthrough

Multiple analytics and dashboard queries are refactored to consolidate parallel count() queries into single aggregated queries using SQL FILTER expressions, reducing database load and connection overhead across three files.

Changes

Cohort / File(s) Summary
Analytics Query Consolidation
packages/data-ops/src/queries/analytics.ts
Four query functions (getAnalyticsOverview, getSchoolsPerformance, getEnrollmentStats, getEnrollmentGrowth) are refactored to replace multiple parallel count() calls with single aggregated select statements using SQL count(*) FILTER (WHERE ...) expressions. Result extraction updated with Number(...) coercion and derived variable computation.
Dashboard Stats Query Consolidation
packages/data-ops/src/queries/dashboard-stats.ts
getDashboardStats consolidates four separate school count queries into a single aggregated query with filtered count(*) expressions for total, active, inactive, and recent (30-day window) counts, eliminating parallel Promise resolution.
Performance Documentation
pr-body.txt
New file documenting technical rationale for query consolidation: reduced database load, latency, and connection overhead by moving conditional aggregation into PostgreSQL via single db.select() with filtered aggregates rather than multiple db.select({ count: count() }) calls.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~18 minutes

Possibly related PRs

Poem

🐰 In database fields where queries did play,
Four counts became one, in a brighter way,
No parallel scans to make connections run,
One filtered SELECT—now the work is done!
Latency drops, and the database grins,
Consolidation magic—let optimization begin!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 40.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title references 'Lighthouse' and 'Optimize multiple COUNT queries', which aligns with the main objective of consolidating multiple COUNT(*) queries into single aggregated queries. However, the emoji (🚨) adds noise and reduces clarity.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch lighthouse-count-optimization-8444507562885530474

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/data-ops/src/queries/analytics.ts`:
- Around line 351-357: The students aggregates (db.select
{...}.from(students)... using count()/sql and the similar block at lines
417-420) currently omit tenant scoping and can aggregate across schools; require
a schoolId filter by ensuring the query includes where(eq(students.schoolId,
schoolId)) (or make schoolId a required parameter) and add this predicate into
baseConditions (or the query's where clause) so every execution against the
students table is scoped to the given schoolId; alternatively, separate this
into an explicit admin/global-only query with proper authorization checks if
global totals are needed.
- Around line 47-51: The db.select aggregation on school-scoped queries is
missing mandatory tenant filtering and the helpers bypass the ResultAsync
error-contract; update the school/scoped queries (the db.select that uses total:
count() and currentPeriod/previousPeriod sql counts and any functions
referencing baseConditions) to always include where(eq(students.schoolId,
schoolId)) instead of optional baseConditions (make schoolId non-optional or
return an error when missing), and refactor the four helper functions
(getEnrollmentStats, getEnrollmentGrowth and the two other student-query helpers
mentioned) to return ResultAsync<T, DatabaseError> by wrapping their async DB
calls and appending .mapErr(tapLogErr(logger, context)) so database errors are
captured and logged rather than thrown.

In `@packages/data-ops/src/queries/dashboard-stats.ts`:
- Around line 19-24: The getDashboardStats function currently returns a
Promise<DashboardStats> from the db.select call; change it to return
ResultAsync<DashboardStats, DatabaseError> by wrapping the database query in
ResultAsync.fromPromise (or equivalent from `@praha/byethrow`) and then attach
.mapErr(tapLogErr(logger, context)) to the resulting ResultAsync; specifically
wrap the db.select(...).from(schools) call that builds
total/active/recent/inactive (references: db.select, schools, thirtyDaysAgo,
DashboardStats) so all DB errors are mapped to DatabaseError and logged via
tapLogErr(logger, context) before returning the ResultAsync from
getDashboardStats.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f2dcd79b-767e-4c52-9143-799eedc20772

📥 Commits

Reviewing files that changed from the base of the PR and between 7e37ef8 and ab950bb.

📒 Files selected for processing (3)
  • packages/data-ops/src/queries/analytics.ts
  • packages/data-ops/src/queries/dashboard-stats.ts
  • pr-body.txt

Comment on lines +47 to +51
db.select({
total: count(),
currentPeriod: sql<number>`count(*) filter (where ${schools.createdAt} >= ${startDate})`,
previousPeriod: sql<number>`count(*) filter (where ${schools.createdAt} >= ${previousStartDate} and ${schools.createdAt} < ${startDate})`,
}).from(schools),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

cat -n packages/data-ops/src/queries/analytics.ts | head -60

Repository: ldsgroups225/Yeko

Length of output: 2498


🏁 Script executed:

cat -n packages/data-ops/src/queries/analytics.ts | sed -n '40,55p'

Repository: ldsgroups225/Yeko

Length of output: 842


🏁 Script executed:

cat -n packages/data-ops/src/queries/analytics.ts | sed -n '115,135p'

Repository: ldsgroups225/Yeko

Length of output: 969


🏁 Script executed:

cat -n packages/data-ops/src/queries/analytics.ts | sed -n '345,365p'

Repository: ldsgroups225/Yeko

Length of output: 1130


🏁 Script executed:

cat -n packages/data-ops/src/queries/analytics.ts | sed -n '410,425p'

Repository: ldsgroups225/Yeko

Length of output: 1095


🏁 Script executed:

cat -n packages/data-ops/src/queries/analytics.ts | sed -n '90,105p'

Repository: ldsgroups225/Yeko

Length of output: 476


🏁 Script executed:

cat -n packages/data-ops/src/queries/analytics.ts | sed -n '300,320p'

Repository: ldsgroups225/Yeko

Length of output: 838


🏁 Script executed:

cat -n packages/data-ops/src/queries/analytics.ts | sed -n '330,345p'

Repository: ldsgroups225/Yeko

Length of output: 520


🏁 Script executed:

cat -n packages/data-ops/src/queries/analytics.ts | sed -n '395,415p'

Repository: ldsgroups225/Yeko

Length of output: 862


🏁 Script executed:

cat -n packages/data-ops/src/queries/analytics.ts | sed -n '108,115p'

Repository: ldsgroups225/Yeko

Length of output: 457


🏁 Script executed:

rg "export async function getSchoolsPerformance" packages/data-ops/src/queries/analytics.ts -A 3

Repository: ldsgroups225/Yeko

Length of output: 238


Add mandatory tenant filtering for school-scoped tables and keep query helpers on the ResultAsync path.

The functions at lines 337-359 and 400-420 query the students table (school-scoped) with optional schoolId parameters. When schoolId is undefined, baseConditions becomes an empty array, causing .where() to evaluate to undefined — the query then executes without any tenant filtering, exposing cross-tenant data.

Additionally, all four touched query helpers run inside raw async functions and bypass the required ResultAsync<T, DatabaseError> + tapLogErr(...) contract. Database errors can throw uncaught.

Changes required:

  1. Add mandatory where(eq(students.schoolId, schoolId)) for getEnrollmentStats and getEnrollmentGrowth — remove the optional check; if schoolId is required, make it non-optional; if optional, you must still filter or return an error.
  2. Wrap all four functions to return ResultAsync<T, DatabaseError> with .mapErr(tapLogErr(logger, context)) on the result chain.

Applies to lines 47–51, 124–128, 351–357, 417–420 and their parent functions.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/data-ops/src/queries/analytics.ts` around lines 47 - 51, The
db.select aggregation on school-scoped queries is missing mandatory tenant
filtering and the helpers bypass the ResultAsync error-contract; update the
school/scoped queries (the db.select that uses total: count() and
currentPeriod/previousPeriod sql counts and any functions referencing
baseConditions) to always include where(eq(students.schoolId, schoolId)) instead
of optional baseConditions (make schoolId non-optional or return an error when
missing), and refactor the four helper functions (getEnrollmentStats,
getEnrollmentGrowth and the two other student-query helpers mentioned) to return
ResultAsync<T, DatabaseError> by wrapping their async DB calls and appending
.mapErr(tapLogErr(logger, context)) so database errors are captured and logged
rather than thrown.

Comment on lines +351 to +357
db.select({
total: count(),
active: sql<number>`count(*) filter (where ${students.status} = 'active')`,
graduated: sql<number>`count(*) filter (where ${students.status} = 'graduated')`,
transferred: sql<number>`count(*) filter (where ${students.status} = 'transferred')`,
withdrawn: sql<number>`count(*) filter (where ${students.status} = 'withdrawn')`,
}).from(students).where(baseConditions.length > 0 ? and(...baseConditions) : undefined),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Require schoolId scoping on these students aggregates.

students is a school-scoped table, but both queries drop the tenant filter when schoolId is absent. That leaves a cross-tenant aggregation path for enrollment totals and growth. Make schoolId mandatory here, or split the global/admin case into a separately authorized query.

Based on learnings: Every query on school-scoped tables MUST include where(eq(table.schoolId, schoolId)) for multi-tenant isolation.

Also applies to: 417-420

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/data-ops/src/queries/analytics.ts` around lines 351 - 357, The
students aggregates (db.select {...}.from(students)... using count()/sql and the
similar block at lines 417-420) currently omit tenant scoping and can aggregate
across schools; require a schoolId filter by ensuring the query includes
where(eq(students.schoolId, schoolId)) (or make schoolId a required parameter)
and add this predicate into baseConditions (or the query's where clause) so
every execution against the students table is scoped to the given schoolId;
alternatively, separate this into an explicit admin/global-only query with
proper authorization checks if global totals are needed.

Comment on lines +19 to +24
db.select({
total: count(),
active: sql<number>`count(*) filter (where ${schools.status} = 'active')`,
recent: sql<number>`count(*) filter (where ${schools.createdAt} >= ${thirtyDaysAgo})`,
inactive: sql<number>`count(*) filter (where ${schools.status} = 'inactive')`,
}).from(schools),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cat packages/data-ops/src/queries/dashboard-stats.ts

Repository: ldsgroups225/Yeko

Length of output: 8691


Return ResultAsync with tapLogErr error logging from getDashboardStats function.

The function returns Promise<DashboardStats> directly, but per the coding guidelines for packages/data-ops/src/queries/**/*.ts, it must return ResultAsync<T, DatabaseError> from @praha/byethrow and attach .mapErr(tapLogErr(logger, context)) to all database operations. Currently, any database error thrown by the queries at lines 19-24 will propagate uncaught instead of following the structured error-handling pattern.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/data-ops/src/queries/dashboard-stats.ts` around lines 19 - 24, The
getDashboardStats function currently returns a Promise<DashboardStats> from the
db.select call; change it to return ResultAsync<DashboardStats, DatabaseError>
by wrapping the database query in ResultAsync.fromPromise (or equivalent from
`@praha/byethrow`) and then attach .mapErr(tapLogErr(logger, context)) to the
resulting ResultAsync; specifically wrap the db.select(...).from(schools) call
that builds total/active/recent/inactive (references: db.select, schools,
thirtyDaysAgo, DashboardStats) so all DB errors are mapped to DatabaseError and
logged via tapLogErr(logger, context) before returning the ResultAsync from
getDashboardStats.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant