
feat: Setup Django Project and setup api app and a testing router#17

Merged
kaihere14 merged 41 commits into kaihere14:main from GURUDAS-DEV:main
Feb 21, 2026

Conversation

Contributor

@GURUDAS-DEV GURUDAS-DEV commented Feb 16, 2026

Summary by CodeRabbit

  • New Features

    • API endpoints: health check, single-route and batch route scoring; scheduler endpoints for manual and periodic batch scoring and health checks
    • Responses now include a searchId for cached breakpoint data; saving routes persists breakpoints for faster reuse
    • Client: improved fetch lifecycle to avoid stale results and include searchId when saving
  • Chores

    • Backend setup, docs, and dependency updates for scoring, scheduling, and caching infrastructure

GURUDAS-DEV and others added 30 commits February 14, 2026 12:20
- Added HomeMap component to display a Mapbox map with geolocation features.
- Updated Home page to render the HomeMap component.
- Introduced PrivateLayout component for consistent layout structure.
- Updated TypeScript configuration to include additional type definitions.
- Added Mapbox GL and related dependencies to package.json and package-lock.json.
- Created global.d.ts for Mapbox CSS module declaration.
…nation selection with search and geolocation.
…RouteItemClient components, and configure Next.js image domains.
GURUDAS-DEV and others added 7 commits February 15, 2026 17:33
…ature

- Updated type definitions to include aqiScore in RouteData.
- Modified API response handling to extract AQI data from backend.
- Enhanced UI component to display AQI information with color-coded categories.
- Implemented AQI score calculation logic in the backend, including pollutant breakdown.
- Added error handling for AQI data fetching and display.
- Improved breakpoint calculation to optimize API calls for AQI data.
- Updated documentation to reflect changes in AQI integration and testing procedures.

vercel bot commented Feb 16, 2026

@GURUDAS-DEV is attempting to deploy a commit to kaihere14's projects team on Vercel.

A member of the Team first needs to authorize it.


coderabbitai bot commented Feb 16, 2026

📝 Walkthrough

Adds a Django scoring API with Pathway integration and Python fallback, Node server scheduler and Pathway client, Redis searchId-based breakpoint caching, MongoDB breakpoint persistence, client changes to propagate searchId, and supporting project entrypoints and configs.

Changes

Cohort / File(s) Summary
Django core & entrypoints
data-processing/dataProcessingServer/dataProcessingServer/settings.py, .../wsgi.py, .../asgi.py, data-processing/dataProcessingServer/manage.py
New Django 6 settings and standard WSGI/ASGI/manage entrypoints added for the dataProcessingServer project.
Django URL routing
data-processing/dataProcessingServer/dataProcessingServer/urls.py, data-processing/dataProcessingServer/api/urls.py
Root URLs wired (admin, healthCheck, api/ include); API routes for health/, compute-scores/, and compute-score/ registered.
Django API views
data-processing/dataProcessingServer/api/views.py
New endpoints: health_check, compute_scores (batch with validation and optional Pathway delegation), and compute_single_score (single-route scoring); CSRF exemption and structured JSON responses.
Pathway pipeline & transformers
data-processing/dataProcessingServer/api/pathway/pipeline.py, .../transformers.py, .../__init__.py
New Pathway pipeline module with UDF batch pipeline, Python fallback, streaming skeleton, helpers, and scoring transformers (weather/AQI/traffic, per-route and batch scoring).
Node server scheduler & Pathway client
server/src/utils/scheduler/computeData.scheduler.ts, server/src/utils/scheduler/pathwayClient.ts, server/src/index.ts
New cron/manual batch scoring scheduler; Pathway HTTP client with typed inputs/outputs and health check; server endpoints to trigger scheduler and check Pathway health.
Redis utils & caching
server/src/utils/redis.ts, server/src/controllers/score.controller.ts
Upstash Redis client singleton; score controller generates searchId, caches raw breakpoints with TTL, and includes searchId in JSON responses.
Breakpoint persistence & saved routes
server/src/Schema/breakPoints.ts, server/src/controllers/savedRoutes.controllers.ts
New Mongoose BreakPoint schema with 2dsphere index; saveRoute reads cached breakpoints by searchId (or recomputes), bulk-inserts BreakPoint docs, and cascades deletions.
Client route save flow
client/app/(private)/home/routes/(from)/(to)/page.tsx
Client tracks fetch generation and searchId; discards stale score responses; includes searchId in save payload; uses route.overallScore for lastComputedScore.
Requirements & package changes
data-processing/dataProcessingServer/requirements.txt, server/package.json
Adds Pathway and pins Django>=6; Node deps added for cron, uuid, Upstash Redis and types.
Docs
server/README.md
README describing backend architecture, scoring pipeline, caching, and persistence strategy.
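The URL wiring summarized in the table above can be sketched as follows. This is a hypothetical reconstruction from the change summary (the `healthCheck` route, the `api/` include, and the api app's `health/`, `compute-scores/`, and `compute-score/` endpoints), not the PR's actual code; it also folds in the review's suggestion to use a named health-check view rather than an inline lambda.

```python
# dataProcessingServer/urls.py (sketch, names taken from the change summary)
from django.contrib import admin
from django.http import HttpResponse
from django.urls import include, path

def health_check(request):
    # Lightweight liveness probe; no database or external calls.
    return HttpResponse("OK")

urlpatterns = [
    path("admin/", admin.site.urls),
    path("healthCheck", health_check),
    # api/urls.py registers health/, compute-scores/, compute-score/
    path("api/", include("api.urls")),
]
```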

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client App
    participant NodeServer as Node Server
    participant DjangoAPI as Django API
    participant Pathway as Pathway (engine)
    participant Redis as Redis
    participant MongoDB as MongoDB

    Client->>NodeServer: POST /api/v1/score (route payload)
    NodeServer->>DjangoAPI: POST /api/compute-scores/ (batch payload)
    DjangoAPI->>Pathway: run pipeline or use Python fallback
    Pathway-->>DjangoAPI: computed routes, bestRoute, summary
    DjangoAPI-->>NodeServer: computation result (routes, bestRoute, summary)
    NodeServer->>Redis: SET route_search:${searchId} (breakpoints + ts) TTL=1h
    NodeServer->>MongoDB: INSERT/UPDATE Route and BreakPoint documents (save)
    NodeServer-->>Client: response with scores and searchId
sequenceDiagram
    participant Scheduler as Cron Scheduler
    participant MongoDB as MongoDB
    participant Services as External Services (Weather/AQI)
    participant NodeServer as Node Server
    participant DjangoAPI as Django API
    participant Pathway as Pathway

    Scheduler->>MongoDB: Fetch routes and breakpoints
    MongoDB-->>Scheduler: routes + breakpoints
    Scheduler->>Services: Parallel fetch weather & AQI
    Services-->>Scheduler: weather & AQI data
    Scheduler->>DjangoAPI: POST /api/compute-scores/ (batch)
    DjangoAPI->>Pathway: compute via UDF or fallback
    Pathway-->>DjangoAPI: per-route scores
    DjangoAPI-->>Scheduler: results
    Scheduler->>MongoDB: Update route documents with new scores

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 I hopped through Pathway, numbers in tow,

Cached my breakpoints where warm redis flows,
Scores gleamed like carrots, bestRoute in sight,
Mongo keeps footprints safe through the night,
A tiny rabbit cheers: computations take flight! ✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name | Status | Explanation
Description check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled.
Title check | ✅ Passed | The title accurately captures the primary objectives of the PR: setting up a Django project with an API app and routing configuration.
Docstring coverage | ✅ Passed | Docstring coverage is 88.89%, which is sufficient. The required threshold is 80.00%.




@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

🤖 Fix all issues with AI agents
In `@data-processing/dataProcessingServer/api/urls.py`:
- Line 2: The import HttpResponse in urls.py is unused; remove the unused symbol
by deleting the HttpResponse import (the "from django.http import HttpResponse"
line) so the module no longer imports HttpResponse unnecessarily and the file is
clean of unused imports.

In `@data-processing/dataProcessingServer/api/views.py`:
- Around line 3-8: The module uses a global mutable counter `a` and a function
`myRoute` which causes race conditions and per-process isolation; replace this
by using a persistent, atomic counter (e.g., Django model with an IntegerField
and atomic F() update or Django cache/Redis with cache.incr) instead of the
module-level `a`, rename `myRoute` to snake_case `my_route`, and rename `a` to a
descriptive name like `hit_counter`/`hits` in code and docs; update the view to
fetch and atomically increment the persistent counter and return its value so
counts are correct across threads/processes.
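The atomic-counter fix described above can be sketched without Django as follows. In the actual view, the review suggests `cache.incr("hit_counter")` or an `F("hits") + 1` model update; here a lock-guarded counter stands in so the idea is runnable standalone. The names `hit_counter` and `my_route` follow the review text; everything else is illustrative.

```python
# Stand-in for the suggested persistent atomic counter. In Django this would
# be cache.incr("hit_counter") (Redis/cache-backed) or an atomic F() update on
# an IntegerField; a threading.Lock demonstrates the same race-free increment.
import threading

_lock = threading.Lock()
_hit_counter = 0

def incr_hit_counter() -> int:
    """Atomically increment and return the hit counter."""
    global _hit_counter
    with _lock:
        _hit_counter += 1
        return _hit_counter

def my_route() -> dict:
    # snake_case per the review; the Django view would return JsonResponse(...)
    return {"hits": incr_hit_counter()}
```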

In `@data-processing/dataProcessingServer/dataProcessingServer/settings.py`:
- Around line 22-28: Replace the hardcoded SECRET_KEY with an environment-backed
value: import os at the top of settings.py and set SECRET_KEY =
os.getenv('DJANGO_SECRET_KEY') (provide a clear error or fail-fast behavior if
missing in non-dev environments); similarly gate DEBUG with
os.getenv('DJANGO_DEBUG') (coerce to boolean) and populate ALLOWED_HOSTS from an
env var like DJANGO_ALLOWED_HOSTS (split comma-separated hosts), ensuring
sensible development defaults only when no production flag is set and
documenting that DJANGO_SECRET_KEY must be set for production.
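A minimal settings fragment matching the review's description might look like the following. The environment variable names (`DJANGO_SECRET_KEY`, `DJANGO_DEBUG`, `DJANGO_ALLOWED_HOSTS`) come from the review text; the fail-fast behavior and defaults are one possible choice, not the PR's actual code.

```python
# settings.py (sketch): environment-backed values per the review comment.
import os

SECRET_KEY = os.getenv("DJANGO_SECRET_KEY")
if not SECRET_KEY:
    # Fail fast rather than booting with a predictable key.
    raise RuntimeError("DJANGO_SECRET_KEY must be set")

# Coerce the env var to a boolean; default to False outside development.
DEBUG = os.getenv("DJANGO_DEBUG", "false").lower() == "true"

# Comma-separated host list, with development-only defaults.
ALLOWED_HOSTS = [
    h.strip()
    for h in os.getenv("DJANGO_ALLOWED_HOSTS", "localhost,127.0.0.1").split(",")
    if h.strip()
]
```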

In `@data-processing/dataProcessingServer/dataProcessingServer/urls.py`:
- Line 4: Remove the unnecessary top-level import "from api import views" in
urls.py: it is unused because API routes are already included via
include('api.urls') (see the include(...) call). Delete the import line "from
api import views" and run tests/linter to confirm no remaining references to the
views symbol.
🧹 Nitpick comments (2)
data-processing/dataProcessingServer/api/urls.py (1)

5-7: Add a trailing slash and a name kwarg to the route.

Django convention is to include a trailing slash in URL patterns. Adding name= enables reverse() lookups and keeps routing maintainable.

Proposed fix
 urlpatterns = [
-   path('myRoute', views.myRoute)
+   path('myRoute/', views.myRoute, name='my-route'),
 ]
data-processing/dataProcessingServer/dataProcessingServer/urls.py (1)

8-8: Consider using a named view function and a trailing slash for the health-check URL.

The inline lambda works but triggers a linter warning (Ruff ARG005) for the unused request argument, and healthCheck uses camelCase which is unconventional for Django URL paths. A small named function or django.http shortcut is cleaner.

Proposed fix
-    path('healthCheck', lambda request: HttpResponse('Hello World!')),
+    path('health-check/', lambda _: HttpResponse('Hello World!')),

Or extract a proper view:

# In a views module or at the top of urls.py
def health_check(request):
    return HttpResponse('OK')

GURUDAS-DEV and others added 2 commits February 16, 2026 19:29
- Added scoring transformers in transformers.py to calculate weather, AQI, and traffic scores.
- Created functions for computing overall route scores based on environmental data.
- Introduced batch processing for multiple routes in compute_batch_scores.
- Established a cron job scheduler for periodic route score computation.
- Developed a Pathway client for communication with the scoring server.
- Added Redis utility for caching breakpoints.
- Created MongoDB schemas for storing breakpoints and routes.
- Documented backend architecture and data flow in README.md.
- Added tests for new functionalities in the tests directory.
- Updated requirements.txt with necessary dependencies.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)
client/app/(private)/home/routes/(from)/(to)/page.tsx (2)

200-212: ⚠️ Potential issue | 🟠 Major

Stale closure: routes[i] references previous state, not the newly fetched routes.

fetchScores is called at Line 349 with fetchedRoutes as routeData, but Line 204 reads routes[i] from the component's state, which hasn't been updated yet (React batches state updates). This means trafficDuration and duration will come from the previous render's routes (likely empty on first load).

Use routeData[i] instead:

Proposed fix
         overallScores.forEach((score: number, i: number) => {
           const dur =
-            routes[i]?.trafficDuration ?? routes[i]?.duration ?? Infinity;
+            routeData[i]?.trafficDuration ?? routeData[i]?.duration ?? Infinity;
           if (
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@client/app/`(private)/home/routes/(from)/(to)/page.tsx around lines 200 -
212, The selection loop uses the stale component state `routes[i]` instead of
the freshly fetched data passed into `fetchScores`, causing durations to come
from the previous render; update the duration lookup to use the `routeData`
array (i.e., replace `routes[i]?.trafficDuration ?? routes[i]?.duration` with
`routeData[i]?.trafficDuration ?? routeData[i]?.duration`) inside the block that
computes `bestIndex`/`bestDuration` alongside `overallScores`, so it references
the newly fetched routes rather than component state.

422-423: ⚠️ Potential issue | 🟠 Major

Random fallback score will persist arbitrary data to the backend.

Math.floor(Math.random() * 100) as a fallback for lastComputedScore means that if scoring failed or hasn't completed, a random number gets saved to the database. This corrupts stored data and is misleading. Consider using null or 0 instead, or prevent saving when scores are unavailable.

Proposed fix
-          lastComputedScore:
-            route.overallScore || Math.floor(Math.random() * 100),
+          lastComputedScore: route.overallScore ?? null,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@client/app/`(private)/home/routes/(from)/(to)/page.tsx around lines 422 -
423, The code sets lastComputedScore to route.overallScore ||
Math.floor(Math.random() * 100), which can persist misleading random values;
change the fallback to a deterministic empty value (e.g., null or 0) or skip
persisting the score when route.overallScore is undefined/invalid. Locate the
assignment of lastComputedScore (the object using route.overallScore) in
page.tsx and replace the Math.random fallback with null (or 0 per domain rules),
and add a guard where the payload is saved to avoid writing records that have no
valid score.
server/src/controllers/savedRoutes.controllers.ts (2)

66-142: ⚠️ Potential issue | 🟠 Major

Non-atomic Route + BreakPoint creation risks orphaned documents.

If BreakPoint.insertMany (Line 141) throws, the outer catch (Line 146) returns a 500 to the client, but the Route document created at Line 66 is already persisted. The client sees a failure and may retry, creating duplicate Route documents with no breakpoints.

Consider wrapping this in a Mongoose session/transaction, or at minimum, deleting the Route on breakpoint insertion failure:

🛡️ Minimal cleanup approach
       if (breakPointDocs.length > 0) {
-        await BreakPoint.insertMany(breakPointDocs);
+        try {
+          await BreakPoint.insertMany(breakPointDocs);
+        } catch (bpError) {
+          console.error("Failed to insert breakpoints, cleaning up route:", bpError);
+          await Route.findByIdAndDelete(newRoute._id);
+          throw bpError;
+        }
       }

A transaction-based approach would be more robust if the MongoDB deployment supports replica sets.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/src/controllers/savedRoutes.controllers.ts` around lines 66 - 142, The
current flow creates a Route via Route.create (newRoute) and later calls
BreakPoint.insertMany; if insertMany fails the Route remains orphaned—wrap the
Route + breakpoint creation in a Mongoose transaction/session (startSession,
session.withTransaction) so both Route.create and BreakPoint.insertMany run
atomically, or if transactions are not available, ensure you perform cleanup by
deleting the created Route (e.g., Route.deleteOne({_id: newRoute._id})) inside
the catch that handles failures from BreakPoint.insertMany; update the code
paths around Route.create, BreakPoint.insertMany and the surrounding try/catch
to either use session and pass { session } to both operations or call delete on
newRoute in the failure branch.

156-176: ⚠️ Potential issue | 🟠 Major

deleteRoute does not clean up associated BreakPoint documents.

When a Route is deleted, its BreakPoint documents remain orphaned in MongoDB. Add a cascade delete for breakpoints.

🛡️ Proposed fix
     const route = await Route.findOneAndDelete({ _id: id, userId });
 
     if (!route) {
       res.status(404).json({ success: false, message: "Route not found" });
       return;
     }
 
+    // Clean up associated breakpoints
+    await BreakPoint.deleteMany({ routeId: id });
+
     res.status(200).json({ success: true, message: "Route deleted" });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/src/controllers/savedRoutes.controllers.ts` around lines 156 - 176,
deleteRoute currently deletes the Route but leaves orphaned BreakPoint
documents; update deleteRoute to cascade-delete breakpoints by calling
BreakPoint.deleteMany({ routeId: route._id || id }) after successfully
finding/deleting the Route (or before returning) so associated breakpoints are
removed, and add/import the BreakPoint model if not already imported; ensure the
deleteMany runs only when a route was found and handle any errors the same way
as the existing catch block.
🧹 Nitpick comments (10)
server/src/utils/redis.ts (1)

3-6: Non-null assertions on env vars will produce a cryptic error if unset.

If UPSTASH_REDIS_REST_URL or UPSTASH_REDIS_REST_TOKEN is missing, the Redis client will be constructed with undefined, likely causing an unhelpful runtime error. Consider failing fast with a clear message.

Proposed fix
-const redis = new Redis({
-  url: process.env.UPSTASH_REDIS_REST_URL!,
-  token: process.env.UPSTASH_REDIS_REST_TOKEN!,
-});
+const url = process.env.UPSTASH_REDIS_REST_URL;
+const token = process.env.UPSTASH_REDIS_REST_TOKEN;
+
+if (!url || !token) {
+  throw new Error("Missing UPSTASH_REDIS_REST_URL or UPSTASH_REDIS_REST_TOKEN environment variables");
+}
+
+const redis = new Redis({ url, token });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/src/utils/redis.ts` around lines 3 - 6, The code currently constructs
the Redis client with non-null assertions on UPSTASH_REDIS_REST_URL and
UPSTASH_REDIS_REST_TOKEN which will produce cryptic runtime errors if those env
vars are missing; before creating the Redis instance (the const redis = new
Redis(...) line), add explicit checks for process.env.UPSTASH_REDIS_REST_URL and
process.env.UPSTASH_REDIS_REST_TOKEN and throw a clear, descriptive Error (or
call process.exit with a logged message) if either is undefined so the
application fails fast with a helpful message rather than constructing Redis
with undefined values.
server/package.json (1)

32-32: @types/uuid belongs in devDependencies.

Type definition packages are only needed at compile time and should not be shipped as a runtime dependency.

Proposed fix

Move to devDependencies:

  "devDependencies": {
+   "@types/uuid": "^10.0.0",
    ...
  },
  "dependencies": {
-   "@types/uuid": "^10.0.0",
    ...
  }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/package.json` at line 32, The package "@types/uuid" is a compile-time
type package and must be moved out of runtime dependencies; edit package.json to
remove "@types/uuid" from the dependencies section and add it to devDependencies
(or run npm/yarn with --save-dev to make the change), then update
lockfile/install to ensure the devDependency is recorded; target the dependency
entry "@types/uuid" in package.json to implement this change.
server/src/index.ts (2)

63-69: Scheduler is enabled by default — intentional?

process.env.ENABLE_SCHEDULER !== "false" means the scheduler runs unless explicitly disabled. This includes when the env var is unset. If this is intentional for development convenience, consider adding a note. For production, opt-in (=== "true") is generally safer to avoid accidental cron jobs.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/src/index.ts` around lines 63 - 69, The current condition
process.env.ENABLE_SCHEDULER !== "false" enables the scheduler by default (unset
env = enabled); change it to an explicit opt-in by checking
process.env.ENABLE_SCHEDULER === "true" (or add a separate config flag) so
initScheduler() only runs when ENABLE_SCHEDULER is explicitly set to "true", and
update the accompanying console logs to reflect the opt-in behavior and avoid
accidental cron execution in production.

38-46: try/catch is effectively dead code.

Since runManualBatchScoring() is fire-and-forget (errors are caught by .catch(console.error) on Line 41), the catch block on Line 43 will never execute — no synchronous exception is possible here.

Simplified version
 app.post("/api/v1/scheduler/run", async (_req, res) => {
-  try {
-    // Run asynchronously, don't wait
-    runManualBatchScoring().catch(console.error);
-    res.json({ message: "Batch scoring triggered", status: "running" });
-  } catch (error) {
-    res.status(500).json({ message: "Failed to trigger batch scoring", error });
-  }
+  runManualBatchScoring().catch(console.error);
+  res.json({ message: "Batch scoring triggered", status: "running" });
 });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/src/index.ts` around lines 38 - 46, The try/catch is ineffective
because runManualBatchScoring() is fire-and-forget with its own .catch, so
either remove the try/catch or make the route truly await the task; change the
handler in app.post("/api/v1/scheduler/run") to await runManualBatchScoring()
(remove the inline .catch(console.error)) so the catch block around the await
can capture errors and return 500, or alternatively remove the try/catch
entirely and keep the fire-and-forget call but adjust the response to a
non-error status (e.g., return 202 Accepted) and ensure errors are logged inside
runManualBatchScoring or its .catch.
data-processing/dataProcessingServer/api/views.py (1)

72-86: Validation order: isinstance check should precede emptiness check.

If routes is a non-empty, non-list value (e.g., a dict {"a": 1}), it passes the not routes check (Line 76) and is then caught by isinstance (Line 82). This works, but checking the type first is more robust and gives a clearer error message.

♻️ Suggested reorder
         routes = body.get("routes", [])
         use_pathway = body.get("usePathway", False)
 
         # Validation
-        if not routes:
-            return JsonResponse({
-                "success": False,
-                "message": "No routes provided. 'routes' array is required."
-            }, status=400)
-
         if not isinstance(routes, list):
             return JsonResponse({
                 "success": False,
                 "message": "'routes' must be an array."
             }, status=400)
 
+        if not routes:
+            return JsonResponse({
+                "success": False,
+                "message": "No routes provided. 'routes' array is required."
+            }, status=400)
+
         if len(routes) > 10:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@data-processing/dataProcessingServer/api/views.py` around lines 72 - 86, The
validation should check the type of routes before checking emptiness: in the
view where you read routes = body.get("routes", []) and use JsonResponse for
errors, first verify isinstance(routes, list) and return the "'routes' must be
an array." 400 response if not, then check if routes is empty and return the "No
routes provided. 'routes' array is required." 400 response; update the order of
these two checks around the routes variable and keep the same JsonResponse
payloads.
data-processing/dataProcessingServer/api/pathway/pipeline.py (2)

134-138: pw.debug.table_from_markdown is fragile with JSON-encoded column values.

The weather_points and aqi_points columns contain JSON strings that may include characters that interfere with markdown table parsing (e.g., pipe | inside nested JSON, or any future formatting). While _markdown_from_data escapes |, this approach inherits all the limitations of markdown table parsing and pw.debug utilities (which are designed for testing, not production).

Since the Pathway path already has run_simple_batch as a fallback, this is acceptable for the hackathon, but consider using pw.debug.table_from_pandas with a DataFrame instead for more robust data ingestion if you continue with Pathway.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@data-processing/dataProcessingServer/api/pathway/pipeline.py` around lines
134 - 138, pw.debug.table_from_markdown is fragile for JSON-encoded column
values (e.g., weather_points/aqi_points) because markdown parsing can break on
embedded characters; replace the markdown-based ingestion with a robust
DataFrame-based call: build a pandas.DataFrame from table_data and call
pw.debug.table_from_pandas(..., schema=RouteInputSchema) (or the equivalent
Pathway API that accepts DataFrames) instead of
pw.debug.table_from_markdown/_markdown_from_data so JSON strings are preserved;
keep run_simple_batch fallback logic unchanged.

176-183: Silently skipping malformed score results can mask bugs.

When json.loads(row["score_json"]) fails (Line 182), the route is silently dropped via continue. Consider logging these failures so scoring issues in process_route_row can be diagnosed.

♻️ Add logging
         except (json.JSONDecodeError, KeyError):
+            import logging
+            logging.warning(f"Failed to parse score result for row: {row}")
             continue
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@data-processing/dataProcessingServer/api/pathway/pipeline.py` around lines
176 - 183, The loop that parses score JSON in pipeline.py is silently dropping
malformed rows (the for loop over results.iterrows() where
json.loads(row["score_json"]) is inside a try/except); modify this to catch the
JSONDecodeError/KeyError, log a descriptive error via the module logger (include
the exception, the offending row identifier or index and the raw score_json
value) and then continue, so issues in process_route_row and score generation
are visible; ensure you reference the same loop/variable names (results, row,
route_scores) when adding the logging call.
server/src/utils/scheduler/computeData.scheduler.ts (2)

186-191: Unbounded Route.find({}).lean() may cause memory pressure at scale.

This loads the entire routes collection into memory. Consider pagination or cursor-based iteration, and potentially filtering (e.g., routes updated more than N minutes ago, or only favorites) to bound the working set.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/src/utils/scheduler/computeData.scheduler.ts` around lines 186 - 191,
The current code calls Route.find({}).lean() which pulls the entire collection
into memory; change it to iterate in bounded batches or a cursor to avoid memory
pressure: replace the direct Route.find({}).lean() usage (the routes variable)
with a cursor-based iteration (e.g., Route.find(query).lean().cursor() or
batched queries using limit/skip or range queries on _id/updatedAt) and apply
optional filters (e.g., updatedAt > X minutes ago or isFavorite) to reduce the
working set; update the processing loop that consumes routes to handle one
document at a time (or batch) rather than assuming an in-memory array.

129-129: Traffic value hardcoded to 0 — batch scores will always show perfect traffic.

The TODO notes this, but it means batch-recomputed scores will diverge from the initial scores (which use real traffic data). This discrepancy could confuse users seeing score changes without understanding why.

Would you like help implementing traffic data retrieval for the batch scheduler?

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/src/utils/scheduler/computeData.scheduler.ts` at line 129, Replace the
hardcoded trafficValue: 0 with a real traffic lookup: locate where trafficValue
is set (the object literal with trafficValue) and call the service/helper that
returns traffic metrics for the batch/feature (e.g.,
TrafficService.getTrafficForFeature, metricsClient.fetchTraffic, or implement a
new helper fetchTrafficForBatch if none exists), await the result and assign
trafficValue to the returned number, default to 0 only when the call fails or
returns null, and add error logging and a small unit test to cover the
happy/error paths; keep the TODO and remove the hardcoded 0 once the lookup is
wired in.
server/src/Schema/breakPoints.ts (1)

35-39: Redundant single-field index on routeId.

The index: true on routeId (Line 39) creates a standalone index, but the compound index at Line 61 ({ routeId: 1, routeOptionIndex: 1 }) already covers queries filtering by routeId alone (MongoDB uses the compound index's left prefix). The standalone index just adds write overhead and storage without query benefit.

♻️ Remove the redundant index
     routeId: {
       type: Schema.Types.ObjectId,
       ref: "Route",
       required: true,
-      index: true,
     },
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/src/Schema/breakPoints.ts` around lines 35 - 39, Remove the redundant
single-field index on routeId by deleting the index: true property from the
routeId field definition; keep the type, ref, and required settings intact. The
compound index declared on the schema (schema.index({ routeId: 1,
routeOptionIndex: 1 })) already covers queries on routeId via the left-prefix,
so leave that compound index as-is and remove only the standalone index flag
from the routeId field.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@data-processing/dataProcessingServer/api/pathway/pipeline.py`:
- Around line 262-312: The create_streaming_pipeline function currently calls
pw.io.jsonlines.read, pw.udf and other pw attributes without guarding for
missing Pathway, which will raise AttributeError when PATHWAY_AVAILABLE is
False; update create_streaming_pipeline to first check the PATHWAY_AVAILABLE
flag and either raise a clear, descriptive RuntimeError or return None
(consistent with project conventions) before any pw.* calls, and ensure
references to pw.io.jsonlines.read, pw.udf, and pw.io.jsonlines.write are only
executed after that guard so the function fails fast and clearly when Pathway is
unavailable.
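The guard described above can be sketched as follows (the flag name matches the review; the function body and parameters are hypothetical placeholders for the real Pathway I/O calls):

```python
PATHWAY_AVAILABLE = False  # would be set True when `import pathway as pw` succeeds

def create_streaming_pipeline(input_path: str, output_path: str):
    """Fail fast with a descriptive error before any pw.* attribute access."""
    if not PATHWAY_AVAILABLE:
        raise RuntimeError(
            "Pathway is not installed; create_streaming_pipeline() requires "
            "the 'pathway' package (PATHWAY_AVAILABLE is False)."
        )
    # Only past this guard would the real code call pw.io.jsonlines.read(...),
    # pw.udf-decorated transforms, and pw.io.jsonlines.write(...).
```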

In `@data-processing/dataProcessingServer/api/pathway/transformers.py`:
- Around line 70-88: The loop over weather_points currently increments
valid_points for any truthy main, which skews per-field averages; add per-field
counters (e.g., temp_count, humidity_count, pressure_count) and increment
temp_count only when temp is not None (alongside adding to total_temp and
raw_temp_sum), similarly for humidity and pressure; keep or adjust valid_points
if you still need an overall point count, and use the new per-field counts as
divisors when computing avg temperature/humidity/pressure so each average only
divides by the number of points that actually had that field; references: the
for loop over weather_points and variables
total_temp/raw_temp_sum/total_humidity/raw_humidity_sum/total_pressure/raw_pressure_sum,
and score helpers
calculate_temperature_score/calculate_humidity_score/calculate_pressure_score.
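A minimal sketch of the per-field counters fix, assuming OpenWeather-style points with a `main` dict (function and key names here are illustrative, not the actual transformer code):

```python
def average_weather(weather_points):
    """Average temp/humidity/pressure with per-field divisors so a point
    missing one field does not skew the averages of the others."""
    totals = {"temp": 0.0, "humidity": 0.0, "pressure": 0.0}
    counts = {"temp": 0, "humidity": 0, "pressure": 0}
    for point in weather_points:
        main = point.get("main") or {}
        for field in totals:
            value = main.get(field)
            if value is not None:
                totals[field] += value
                counts[field] += 1  # per-field count, not a shared valid_points
    # each average divides only by the number of points that had that field
    return {f: totals[f] / counts[f] if counts[f] else None for f in totals}
```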

In `@data-processing/dataProcessingServer/api/views.py`:
- Around line 115-120: Replace the JsonResponse bodies that include str(e) with
a generic error message so internal details are not leaked: in the except block
shown (the handler returning JsonResponse with "Internal server error:
{str(e)}") and in the compute_single_score exception handler, remove str(e) from
the response and instead return JsonResponse({"success": False, "message":
"Internal server error"}) with status=500; keep traceback.print_exc() (or use
existing server-side logging) to record the full exception details for
debugging.
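The pattern can be sketched framework-agnostically (the real handlers return Django `JsonResponse` objects; the helper name here is hypothetical):

```python
import logging
import traceback

logger = logging.getLogger(__name__)

def internal_error_response(exc: Exception):
    """Log full exception details server-side; return only a generic
    body and status to the client so internals are never leaked."""
    logger.error("Unhandled error: %s\n%s", exc, traceback.format_exc())
    return {"success": False, "message": "Internal server error"}, 500
```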

In `@data-processing/dataProcessingServer/dataProcessingServer/settings.py`:
- Around line 33-40: INSTALLED_APPS is missing the new api app so Django won't
discover app-level features; open settings.py and add the 'api' app to the
INSTALLED_APPS list (or, if the app defines a custom AppConfig, add its full
path like 'api.apps.ApiConfig') so that models, signals, management commands and
app configs in the api module are registered.

In `@data-processing/dataProcessingServer/requirements.txt`:
- Around line 3-4: Update the Django version constraint in requirements.txt from
"Django>=4.2,<5.0" to "Django>=6.0,<7.0" so it matches the project's settings.py
(references to Django 6.0.2); locate the Django entry in requirements.txt and
replace the version specifier accordingly to ensure compatibility with the
project's configuration.
- Around line 1-25: Add an explicit constraint for urllib3 to requirements.txt
to ensure transitive dependencies use the patched release: add a line like
"urllib3>=2.6.0,<3.0" alongside the other top-level packages so
pathway/requests/boto3 cannot pull in vulnerable urllib3 <2.6.0; update the
comments accordingly and run dependency resolution/tests to verify no conflicts.

In `@server/src/index.ts`:
- Around line 37-56: The two admin/debug routes (POST "/api/v1/scheduler/run"
using runManualBatchScoring and GET "/api/v1/scheduler/pathway-health" using
checkPathwayHealth) are unprotected and leak the PATHWAY_URL; protect them by
applying your existing auth middleware or an admin-only guard to both route
handlers (e.g., require authentication and an admin role before calling
runManualBatchScoring or checkPathwayHealth), and change the health response to
avoid returning internal infrastructure details (return only a status like
"healthy"/"unhealthy" and omit or redact the PATHWAY_URL environment value).

In `@server/src/utils/scheduler/computeData.scheduler.ts`:
- Around line 270-274: The scheduler currently calls runBatchScoring on every
CRON_SCHEDULE without guarding against overlap; add a guard in initScheduler to
prevent concurrent executions by using a lock (e.g., a process-level boolean
flag like isRunning scoped near initScheduler or a distributed lock via
Redis/Redlock for multi-instance deployments) so that when the cron fires it
checks the lock and if held it logs/skips the run, otherwise acquires the lock,
awaits runBatchScoring(), and always releases the lock in a finally block (also
handle and log errors inside the try/catch). Reference initScheduler,
CRON_SCHEDULE and runBatchScoring when implementing the guard.
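The overlap-guard pattern, sketched in Python for illustration (the actual scheduler is TypeScript with node-cron; a process-level flag as shown only protects a single instance — multi-instance deployments would need a Redis/Redlock lock):

```python
import asyncio

is_running = False  # process-level lock; use a distributed lock for multi-instance

async def guarded_run(run_batch_scoring):
    """Skip the tick if a previous run is still in flight; otherwise
    acquire the flag, run, and always release it in `finally`."""
    global is_running
    if is_running:
        print("batch scoring still running; skipping this tick")
        return False
    is_running = True
    try:
        await run_batch_scoring()
    except Exception as exc:
        print(f"batch scoring failed: {exc}")
    finally:
        is_running = False
    return True
```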

In `@server/src/utils/scheduler/pathwayClient.ts`:
- Around line 178-198: checkPathwayHealth lacks a timeout and can hang; add an
AbortController with the same 30s timeout logic used in sendToPathway, pass
controller.signal into fetch(url, { method: "GET", signal }), and ensure you
clear the timer after the request; update the catch to handle aborts similarly
(treat AbortError as a failed health check) and return false on timeout or
error.

---

Outside diff comments:
In `@client/app/`(private)/home/routes/(from)/(to)/page.tsx:
- Around line 200-212: The selection loop uses the stale component state
`routes[i]` instead of the freshly fetched data passed into `fetchScores`,
causing durations to come from the previous render; update the duration lookup
to use the `routeData` array (i.e., replace `routes[i]?.trafficDuration ??
routes[i]?.duration` with `routeData[i]?.trafficDuration ??
routeData[i]?.duration`) inside the block that computes
`bestIndex`/`bestDuration` alongside `overallScores`, so it references the newly
fetched routes rather than component state.
- Around line 422-423: The code sets lastComputedScore to route.overallScore ||
Math.floor(Math.random() * 100), which can persist misleading random values;
change the fallback to a deterministic empty value (e.g., null or 0) or skip
persisting the score when route.overallScore is undefined/invalid. Locate the
assignment of lastComputedScore (the object using route.overallScore) in
page.tsx and replace the Math.random fallback with null (or 0 per domain rules),
and add a guard where the payload is saved to avoid writing records that have no
valid score.

In `@server/src/controllers/savedRoutes.controllers.ts`:
- Around line 66-142: The current flow creates a Route via Route.create
(newRoute) and later calls BreakPoint.insertMany; if insertMany fails the Route
remains orphaned—wrap the Route + breakpoint creation in a Mongoose
transaction/session (startSession, session.withTransaction) so both Route.create
and BreakPoint.insertMany run atomically, or if transactions are not available,
ensure you perform cleanup by deleting the created Route (e.g.,
Route.deleteOne({_id: newRoute._id})) inside the catch that handles failures
from BreakPoint.insertMany; update the code paths around Route.create,
BreakPoint.insertMany and the surrounding try/catch to either use session and
pass { session } to both operations or call delete on newRoute in the failure
branch.
- Around line 156-176: deleteRoute currently deletes the Route but leaves
orphaned BreakPoint documents; update deleteRoute to cascade-delete breakpoints
by calling BreakPoint.deleteMany({ routeId: route._id || id }) after
successfully finding/deleting the Route (or before returning) so associated
breakpoints are removed, and add/import the BreakPoint model if not already
imported; ensure the deleteMany runs only when a route was found and handle any
errors the same way as the existing catch block.

---

Duplicate comments:
In `@data-processing/dataProcessingServer/dataProcessingServer/settings.py`:
- Around line 22-23: SECRET_KEY is hardcoded in settings.py; change it to read
from an environment variable (e.g., os.environ.get('DJANGO_SECRET_KEY')) and
avoid embedding defaults for production—if the env var is missing in prod, raise
an error; for local/dev only, fallback to a securely generated key via
django.core.management.utils.get_random_secret_key() or a clearly marked
dev-only branch. Update the SECRET_KEY assignment to use the env var and add a
clear failure or dev-only generation path so SECRET_KEY is never committed in
source control.
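A minimal sketch of the env-first loading described above (Django itself provides `django.core.management.utils.get_random_secret_key` for the dev fallback; `secrets.token_urlsafe` is used here only to keep the sketch dependency-free):

```python
import os
import secrets

def load_secret_key(debug: bool) -> str:
    """Read SECRET_KEY from the environment; fail hard outside DEBUG,
    generate a clearly-marked throwaway key only for local development."""
    key = os.environ.get("DJANGO_SECRET_KEY")
    if key:
        return key
    if not debug:
        raise RuntimeError("DJANGO_SECRET_KEY must be set outside DEBUG mode.")
    return "django-insecure-" + secrets.token_urlsafe(37)
```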

---

Nitpick comments:
In `@data-processing/dataProcessingServer/api/pathway/pipeline.py`:
- Around line 134-138: pw.debug.table_from_markdown is fragile for JSON-encoded
column values (e.g., weather_points/aqi_points) because markdown parsing can
break on embedded characters; replace the markdown-based ingestion with a robust
DataFrame-based call: build a pandas.DataFrame from table_data and call
pw.debug.table_from_pandas(..., schema=RouteInputSchema) (or the equivalent
Pathway API that accepts DataFrames) instead of
pw.debug.table_from_markdown/_markdown_from_data so JSON strings are preserved;
keep run_simple_batch fallback logic unchanged.
- Around line 176-183: The loop that parses score JSON in pipeline.py is
silently dropping malformed rows (the for loop over results.iterrows() where
json.loads(row["score_json"]) is inside a try/except); modify this to catch the
JSONDecodeError/KeyError, log a descriptive error via the module logger (include
the exception, the offending row identifier or index and the raw score_json
value) and then continue, so issues in process_route_row and score generation
are visible; ensure you reference the same loop/variable names (results, row,
route_scores) when adding the logging call.

In `@data-processing/dataProcessingServer/api/views.py`:
- Around line 72-86: The validation should check the type of routes before
checking emptiness: in the view where you read routes = body.get("routes", [])
and use JsonResponse for errors, first verify isinstance(routes, list) and
return the "'routes' must be an array." 400 response if not, then check if
routes is empty and return the "No routes provided. 'routes' array is required."
400 response; update the order of these two checks around the routes variable
and keep the same JsonResponse payloads.
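The reordered validation can be sketched as a plain helper returning `(routes, error)` (the real view returns `JsonResponse` objects; the payload text matches the review):

```python
def validate_routes(body: dict):
    """Check the type of `routes` before checking emptiness."""
    routes = body.get("routes", [])
    if not isinstance(routes, list):
        return None, ({"success": False,
                       "message": "'routes' must be an array."}, 400)
    if not routes:
        return None, ({"success": False,
                       "message": "No routes provided. 'routes' array is required."}, 400)
    return routes, None
```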

In `@server/package.json`:
- Line 32: The package "@types/uuid" is a compile-time type package and must be
moved out of runtime dependencies; edit package.json to remove "@types/uuid"
from the dependencies section and add it to devDependencies (or run npm/yarn
with --save-dev to make the change), then update lockfile/install to ensure the
devDependency is recorded; target the dependency entry "@types/uuid" in
package.json to implement this change.

In `@server/src/index.ts`:
- Around line 63-69: The current condition process.env.ENABLE_SCHEDULER !==
"false" enables the scheduler by default (unset env = enabled); change it to an
explicit opt-in by checking process.env.ENABLE_SCHEDULER === "true" (or add a
separate config flag) so initScheduler() only runs when ENABLE_SCHEDULER is
explicitly set to "true", and update the accompanying console logs to reflect
the opt-in behavior and avoid accidental cron execution in production.
- Around line 38-46: The try/catch is ineffective because
runManualBatchScoring() is fire-and-forget with its own .catch, so either remove
the try/catch or make the route truly await the task; change the handler in
app.post("/api/v1/scheduler/run") to await runManualBatchScoring() (remove the
inline .catch(console.error)) so the catch block around the await can capture
errors and return 500, or alternatively remove the try/catch entirely and keep
the fire-and-forget call but adjust the response to a non-error status (e.g.,
return 202 Accepted) and ensure errors are logged inside runManualBatchScoring
or its .catch.

In `@server/src/Schema/breakPoints.ts`:
- Around line 35-39: Remove the redundant single-field index on routeId by
deleting the index: true property from the routeId field definition; keep the
type, ref, and required settings intact. The compound index declared on the
schema (schema.index({ routeId: 1, routeOptionIndex: 1 })) already covers
queries on routeId via the left-prefix, so leave that compound index as-is and
remove only the standalone index flag from the routeId field.

In `@server/src/utils/redis.ts`:
- Around line 3-6: The code currently constructs the Redis client with non-null
assertions on UPSTASH_REDIS_REST_URL and UPSTASH_REDIS_REST_TOKEN which will
produce cryptic runtime errors if those env vars are missing; before creating
the Redis instance (the const redis = new Redis(...) line), add explicit checks
for process.env.UPSTASH_REDIS_REST_URL and process.env.UPSTASH_REDIS_REST_TOKEN
and throw a clear, descriptive Error (or call process.exit with a logged
message) if either is undefined so the application fails fast with a helpful
message rather than constructing Redis with undefined values.

In `@server/src/utils/scheduler/computeData.scheduler.ts`:
- Around line 186-191: The current code calls Route.find({}).lean() which pulls
the entire collection into memory; change it to iterate in bounded batches or a
cursor to avoid memory pressure: replace the direct Route.find({}).lean() usage
(the routes variable) with a cursor-based iteration (e.g.,
Route.find(query).lean().cursor() or batched queries using limit/skip or range
queries on _id/updatedAt) and apply optional filters (e.g., updatedAt > X
minutes ago or isFavorite) to reduce the working set; update the processing loop
that consumes routes to handle one document at a time (or batch) rather than
assuming an in-memory array.
- Line 129: Replace the hardcoded trafficValue: 0 with a real traffic lookup:
locate where trafficValue is set (the object literal with trafficValue) and call
the service/helper that returns traffic metrics for the batch/feature (e.g.,
TrafficService.getTrafficForFeature, metricsClient.fetchTraffic, or implement a
new helper fetchTrafficForBatch if none exists), await the result and assign
trafficValue to the returned number, default to 0 only when the call fails or
returns null, and add error logging and a small unit test to cover the
happy/error paths; keep the TODO and remove the hardcoded 0 once the lookup is
wired in.

…ute environmental scoring, integrated with client UI.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
client/app/(private)/home/routes/(from)/(to)/page.tsx (2)

441-443: ⚠️ Potential issue | 🟡 Minor

Remove debug console.log before merging

Line 442 logs the full server response to the browser console in all environments, including production. This can expose response structure and potentially sensitive user data.

🧹 Proposed fix
       const data = await response.json();
-      console.log("Save route response:", data);
-
       if (!response.ok) {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@client/app/`(private)/home/routes/(from)/(to)/page.tsx around lines 441 -
443, Remove the debug console.log call that prints the server response (the line
console.log("Save route response:", data) immediately after const data = await
response.json()); either delete that line or replace it with a logger that is
disabled in production (e.g., gated by NODE_ENV or a debug-level logger) so
responses are not printed in production.

473-475: ⚠️ Potential issue | 🟡 Minor

Add scoresLoading to the canSave guard to prevent saving routes with null scores

The save button currently enables when routes arrive (routes.length > 0), but doesn't wait for the score computation to finish. A user clicking Save Route while scoresLoading is true will save with:

  • searchId as null (scores API hasn't resolved)
  • Every lastComputedScore as null (route scores are still undefined)

While the server has a fallback to re-compute, the saved route will permanently record null scores instead of the computed values.

Suggested fix
-          canSave={!isLoading && !error && routes.length > 0}
+          canSave={!isLoading && !scoresLoading && !error && routes.length > 0}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@client/app/`(private)/home/routes/(from)/(to)/page.tsx around lines 473 -
475, The save button's enable guard should also block while score computation is
in progress: update the canSave prop passed where onSaveRoute={handleSaveClick}
to include the scoresLoading flag (in addition to !isLoading, !error, and
routes.length > 0) so the button is disabled until scoresLoading is false;
locate the usage in page.tsx and add !scoresLoading to the boolean expression to
prevent saving routes with null/undefined scores.
🧹 Nitpick comments (7)
server/src/controllers/savedRoutes.controllers.ts (2)

94-105: Silent fallback when breakpoint computation fails may mask data issues.

If computeBreakpoints throws (e.g., due to malformed route geometry), the route is saved without any breakpoints and the client receives a success response with no indication that breakpoints are missing. Consider including a flag or warning in the response so the caller knows the save was partial.

Also, the dynamic import() on line 98-99 adds unnecessary overhead given the comment acknowledges a static import works fine. A static import at the top of the file would be simpler and let bundlers/type-checkers handle it normally.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/src/controllers/savedRoutes.controllers.ts` around lines 94 - 105, The
current code silently drops breakpoints when computeBreakpoints fails and uses a
dynamic import; change to a static import of computeBreakpoints at the top of
the module and modify the save flow around
routeBreakpointsData/computeBreakpoints so that if computeBreakpoints throws you
still save the route but set a clear partial-save indicator (e.g., add a boolean
property like partialSave or breakpointError and an optional warning message on
the response body) so the caller is aware breakpoints are missing; keep the
try/catch to prevent crashes but populate that flag and include the error
message (or a sanitized message) in the response and server log to aid
debugging.

164-172: Non-atomic cascade delete can leave orphaned breakpoints.

Route.findOneAndDelete and BreakPoint.deleteMany are two independent operations. If the second call fails, the route is gone but its breakpoints persist as orphans. If your MongoDB deployment supports replica sets, wrapping both in a session.withTransaction() ensures atomicity. Otherwise, reversing the order (delete breakpoints first, then route) is a safer fallback — orphaned routes are easier to detect and retry than orphaned breakpoints.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/src/controllers/savedRoutes.controllers.ts` around lines 164 - 172,
The current cascade delete uses Route.findOneAndDelete followed by
BreakPoint.deleteMany which is non-atomic and can leave orphaned BreakPoint
docs; modify the controller to start a mongoose session (await
mongoose.startSession()), detect if transactions are supported and run both
operations inside session.withTransaction(), passing { session } to
Route.findOneAndDelete and BreakPoint.deleteMany so they commit atomically, and
if transactions are not available fall back to deleting BreakPoint.deleteMany({
routeId: id }) first and then remove the Route (Route.findOneAndDelete({ _id:
id, userId })) so breakpoints are not orphaned; ensure session.endSession() is
called and handle errors to abort/roll back appropriately.
server/src/utils/redis.ts (1)

6-11: Top-level throw crashes the process on import even when Redis isn't needed.

Because savedRoutes.controllers.ts imports this module statically, any missing env var will prevent the entire server from starting — even for endpoints that never touch Redis. Consider deferring validation to first use via a lazy getter or factory function, so the server can still serve non-Redis routes when the config is absent (e.g., local dev without Upstash).

♻️ Lazy-init alternative
-const UPSTASH_URL = process.env.UPSTASH_REDIS_REST_URL;
-const UPSTASH_TOKEN = process.env.UPSTASH_REDIS_REST_TOKEN;
-
-if (!UPSTASH_URL || !UPSTASH_TOKEN) {
-  throw new Error(
-    "Missing Redis configuration: UPSTASH_REDIS_REST_URL and UPSTASH_REDIS_REST_TOKEN " +
-      "must be set in environment variables."
-  );
-}
-
-const redis = new Redis({
-  url: UPSTASH_URL,
-  token: UPSTASH_TOKEN,
-});
-
-export default redis;
+let _redis: Redis | null = null;
+
+export function getRedis(): Redis {
+  if (!_redis) {
+    const url = process.env.UPSTASH_REDIS_REST_URL;
+    const token = process.env.UPSTASH_REDIS_REST_TOKEN;
+    if (!url || !token) {
+      throw new Error(
+        "Missing Redis configuration: UPSTASH_REDIS_REST_URL and UPSTASH_REDIS_REST_TOKEN " +
+          "must be set in environment variables."
+      );
+    }
+    _redis = new Redis({ url, token });
+  }
+  return _redis;
+}
+
+export default { get redis() { return getRedis(); } };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/src/utils/redis.ts` around lines 6 - 11, The top-level throw in
server/src/utils/redis.ts (checking UPSTASH_URL / UPSTASH_TOKEN) crashes the
whole process on import; remove that immediate check and replace it with a lazy
factory/getter (e.g., export a getRedisClient or createRedisClient function)
that performs the env validation and constructs/caches the client on first call;
update callers (like savedRoutes.controllers.ts) to call the factory/getter when
they actually need Redis so missing env vars only cause an error at first-use
rather than on module import.
data-processing/dataProcessingServer/api/pathway/pipeline.py (1)

237-259: _markdown_from_data may produce broken table rows if values contain newlines.

While JSON-serialized strings from json.dumps won't have literal newlines, other string fields (e.g., route_id) could. A newline in any cell value will split the row across multiple lines, corrupting the Pathway markdown table. Consider stripping or escaping newlines alongside the existing pipe escaping on line 256.

Proposed fix
-                values.append(str(val).replace("|", "\\|"))
+                values.append(str(val).replace("|", "\\|").replace("\n", " "))
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@data-processing/dataProcessingServer/api/pathway/pipeline.py` around lines
237 - 259, The _markdown_from_data function can produce broken markdown rows
when cell values contain newline characters; update _markdown_from_data so that
for each cell (inside the loop over headers for a given row) you normalize
newlines before escaping pipes — e.g., replace CR/LF and LF characters with a
single space or an escaped literal (like "\\n") — then apply the existing
pipe-escaping logic (str(val).replace("\n", " ").replace("\r", "").replace("|",
"\\|")) so no cell introduces extra lines in the produced markdown table.
data-processing/dataProcessingServer/dataProcessingServer/urls.py (1)

7-7: healthCheck path lacks a trailing slash and uses camelCase.

Other routes in this project (admin/, api/) use trailing slashes. Django's APPEND_SLASH middleware (enabled by default) only appends a missing slash, so a request to healthCheck/ will 404 because the pattern has no trailing slash to redirect to. Consider health-check/ for consistency.

Proposed fix
-    path('healthCheck', lambda request: HttpResponse('Hello World!')),
+    path('health-check/', lambda request: HttpResponse('Hello World!')),
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@data-processing/dataProcessingServer/dataProcessingServer/urls.py` at line 7,
Update the URL pattern in urls.py: replace the path call that currently reads
path('healthCheck', lambda request: HttpResponse('Hello World!')) with a
kebab-case pattern including a trailing slash, e.g. path('health-check/', lambda
request: HttpResponse('Hello World!')), so the route name and trailing slash are
consistent with other routes (keep the same lambda view/HttpResponse).
data-processing/dataProcessingServer/requirements.txt (2)

15-28: Move dev/optional dependencies to a dedicated requirements-dev.txt.

Commented-out dependencies are invisible to tooling (pip-compile, Dependabot, OSV Scanner) and can't receive automated security updates. Splitting them into a separate file keeps production installs minimal and keeps dev tooling (pytest, pytest-django) verifiable.

♻️ Suggested split

requirements-dev.txt (new file):

# Optional: async WSGI servers
uvicorn>=0.24.0
gunicorn>=21.0.0

# Optional: MongoDB drivers
pymongo>=4.6.0
motor>=3.3.0

# Optional: Redis
redis>=5.0.0

# Testing
pytest>=7.4.0
pytest-django>=4.5.0

Then keep only production-required packages in requirements.txt (the current active entries).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@data-processing/dataProcessingServer/requirements.txt` around lines 15 - 28,
The commented dev/optional dependencies in requirements.txt are invisible to
tooling—create a new requirements-dev.txt containing the entries shown (uvicorn,
gunicorn, pymongo, motor, redis, pytest, pytest-django with the suggested
versions) and remove the commented lines from requirements.txt so it only lists
production deps; then update any CI/automation config
(pip-compile/Dependabot/OSV Scanner references or test install steps) to install
or reference requirements-dev.txt for development/test jobs instead of relying
on commented lines.

3-4: Django 6.0 constraint is correct — note implicit Python 3.12+ requirement.

Django 6.0 was released December 3, 2025, and the latest official version is 6.0.2, so Django>=6.0,<7.0 is valid and resolves the version mismatch flagged in the previous review.

One operational note: because Python 3.12 is now the minimum supported version for Django 6.0, any optional dependencies must also meet that requirement. Ensure your Dockerfile / CI matrix enforces python:3.12+; otherwise installs will fail on older Python runtimes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@data-processing/dataProcessingServer/requirements.txt` around lines 3 - 4,
requirements.txt pins Django to 6.x which requires Python 3.12+, so update your
build/CI setup to enforce that Python version: change your Dockerfile base image
to a Python 3.12+ image (refer to your Dockerfile) and update your CI workflow
matrix/job steps (e.g., the Python version entry in your CI config) to include
only python-3.12 or newer so installs of Django>=6.0 succeed; also ensure any
test runner or environment setup steps reference the same python version to
avoid silent install failures.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@client/app/`(private)/home/routes/(from)/(to)/page.tsx:
- Line 78: The state variable searchId can become stale and corrupt save
payloads because it isn’t cleared when a new fetch begins and late/superseded
fetchScores responses can overwrite it; to fix, reset searchId to null at the
start of fetchRoutes (before calling fetchScores) and then ensure setSearchId
only applies to the latest request by adding a generation check (use a useRef
counter incremented at each fetchRoutes invocation and compare on response) or
using an AbortController to ignore out-of-order responses; update all places
that call setSearchId (including fetchScores success paths) to guard against
stale responses using the chosen mechanism.

In `@data-processing/dataProcessingServer/api/pathway/pipeline.py`:
- Around line 63-68: The current single try/except block parses both
weather_points and aqi_points so a JSONDecodeError in one wipes out the other;
change to parse each field independently (e.g., two separate try/excepts) so
json.loads(weather_points) is wrapped in its own try that only sets weather_data
= [] on error and likewise json.loads(aqi_points) is wrapped in its own try that
only sets aqi_data = [] on error; reference the weather_points/aqi_points
parsing in pipeline.py and ensure you don't clear valid data when the other
field fails (optionally log the specific JSONDecodeError per field).
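The independent-parsing fix can be sketched as (helper name is hypothetical; the real code lives inline in pipeline.py):

```python
import json

def parse_point_fields(weather_points, aqi_points):
    """Parse each JSON field in its own try/except so a decode error in
    one field does not clear valid data in the other."""
    try:
        weather_data = json.loads(weather_points)
    except (json.JSONDecodeError, TypeError):
        weather_data = []
    try:
        aqi_data = json.loads(aqi_points)
    except (json.JSONDecodeError, TypeError):
        aqi_data = []
    return weather_data, aqi_data
```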

In `@data-processing/dataProcessingServer/api/pathway/transformers.py`:
- Around line 219-239: The docstring examples for calculate_traffic_score are
inconsistent with the implemented power curve (penalty = math.pow(normalized,
0.7)); either update the formula to match the documented examples (change the
exponent from 0.7 to 0.5 in the penalty computation) or keep the current formula
and update the docstring example values to the actual outputs produced by
calculate_traffic_score (and any related tests/comments). Make the chosen change
in the calculate_traffic_score function and ensure the docstring and tests
reflect the final behavior.
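The exponent choice is easy to demonstrate numerically — at a normalized congestion of 0.25, an exponent of 0.5 yields a penalty of 0.5 while the implemented 0.7 yields roughly 0.379 (input values here are illustrative, not taken from the actual docstring):

```python
import math

def traffic_penalty(normalized: float, exponent: float) -> float:
    """Power-curve penalty on a 0..1 congestion value, clamped to range."""
    return math.pow(max(0.0, min(1.0, normalized)), exponent)
```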

In `@data-processing/dataProcessingServer/api/views.py`:
- Around line 159-163: The current guard `if not body:` incorrectly treats an
empty JSON object ({}) as missing input and returns 400, so change the check to
only reject truly missing bodies (e.g., `body is None` after parsing) or replace
it with explicit required-field validation: identify the required keys for
compute_route_score (use the same keys accessed via body.get(...) in
compute_route_score), iterate those keys to collect missing ones, and return a
400 only when required fields are absent, otherwise pass the (possibly empty)
body through to compute_route_score.
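The distinction between a missing body and an empty object can be sketched as below (`REQUIRED_FIELDS` is a hypothetical placeholder — the real keys are whatever compute_route_score reads via `body.get(...)`):

```python
REQUIRED_FIELDS = ("weather_points", "aqi_points")  # hypothetical key names

def missing_fields(body):
    """Return the list of absent required keys instead of rejecting any
    falsy body: only a None body or absent keys should trigger a 400."""
    if body is None:
        return list(REQUIRED_FIELDS)
    return [key for key in REQUIRED_FIELDS if body.get(key) is None]
```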

In `@data-processing/dataProcessingServer/requirements.txt`:
- Around line 6-7: The dependency constraint "pathway>=0.8.0" is too loose and
the package is BSL 1.1 with service-use restrictions; update the requirements
entry to add an upper bound (e.g., change the line "pathway>=0.8.0" to
"pathway>=0.8.0,<1.0") to prevent unexpected breaking upgrades, and
simultaneously flag the BSL 1.1 Additional Use Grant for legal review—either
obtain a commercial license or replace/remove Pathway if our streaming service
use (stream data processing) violates the grant before merging.

In `@server/src/controllers/savedRoutes.controllers.ts`:
- Around line 115-134: The loop that builds breakPointDocs in
savedRoutes.controllers.ts parses the suffix of keys like "point_*" into
pointIndex using parseInt(idxStr) which can yield NaN for non-numeric suffixes;
update the code (inside the Object.keys(rb).forEach block where rb,
newRoute._id, routeIndex and breakPointDocs are referenced) to call
parseInt(idxStr, 10), then guard with Number.isNaN(pointIndex) and return/skip
that key if true so you never push a document with pointIndex: NaN into MongoDB.
- Around line 79-91: The code uses unsanitized searchId from req.body directly
in the Redis key (redis.get(`route_search:${searchId}`)), which allows cache-key
injection; validate and/or normalize searchId before using it: ensure
req.body.searchId meets the expected format (e.g., UUID via a UUID regex or a
library like validate/uuid) and if it fails, skip the redis.get call or return a
400; alternatively derive a safe key by hashing the raw searchId (e.g., sha256)
and use that hashed value in the redis key (route_search:{hash}), and update the
code paths that read/write this cache (redis.get and any redis.set) to use the
same validated/hashed key; ensure any warning/log uses a sanitized value and do
not expose raw searchId in logs.
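The validate-then-hash approach is sketched below in Python for illustration (the real controller is TypeScript; the UUID format assumption matches the review's suggestion):

```python
import hashlib
import re

UUID_RE = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$", re.I
)

def cache_key_for(search_id):
    """Validate the client-supplied id, then hash it so arbitrary input
    can never be injected into the cache key namespace."""
    if not isinstance(search_id, str) or not UUID_RE.match(search_id):
        return None  # caller skips the cache lookup or returns 400
    digest = hashlib.sha256(search_id.encode("utf-8")).hexdigest()
    return f"route_search:{digest}"
```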

---

Outside diff comments:
In `@client/app/`(private)/home/routes/(from)/(to)/page.tsx:
- Around line 441-443: Remove the debug console.log call that prints the server
response (the line console.log("Save route response:", data) immediately after
const data = await response.json()); either delete that line or replace it with
a logger that is disabled in production (e.g., gated by NODE_ENV or a debug-level logger)
so responses are not printed in production.
- Around line 473-475: The save button's enable guard should also block while
score computation is in progress: update the canSave prop passed where
onSaveRoute={handleSaveClick} to include the scoresLoading flag (in addition to
!isLoading, !error, and routes.length > 0) so the button is disabled until
scoresLoading is false; locate the usage in page.tsx and add !scoresLoading to
the boolean expression to prevent saving routes with null/undefined scores.

---

Duplicate comments:
In `@data-processing/dataProcessingServer/api/pathway/pipeline.py`:
- Around line 262-322: The review contains a stray duplicate-review marker
([duplicate_comment]) alongside [approve_code_changes]; remove the duplicate
marker from the PR/review comment so only a single approval note remains—no
changes needed to the create_streaming_pipeline function or its
PATHWAY_AVAILABLE guard.

In `@data-processing/dataProcessingServer/api/pathway/transformers.py`:
- Around line 56-124: The per-field counters (temp_count, humidity_count,
pressure_count) correctly fix skewed averages in calculate_weather_score, but
tweak two small things: only increment valid_points when at least one of
temp/humidity/pressure is present (so the "no valid points" short-circuit is
accurate) and remove the duplicate review artifact/comment; locate and update
calculate_weather_score to check (temp is not None or humidity is not None or
pressure is not None) before valid_points += 1 and delete the stray duplicate
comment marker.

In `@data-processing/dataProcessingServer/dataProcessingServer/settings.py`:
- Around line 60-68: INSTALLED_APPS now includes the 'api' app so no code change
is required; confirm that the INSTALLED_APPS list contains 'api' (the symbol to
check) and remove the duplicate review marker/comment ([duplicate_comment]) from
the PR or reviewer notes so the approval ([approve_code_changes]) stands
cleanly.

---

Nitpick comments:
In `@data-processing/dataProcessingServer/api/pathway/pipeline.py`:
- Around line 237-259: The _markdown_from_data function can produce broken
markdown rows when cell values contain newline characters; update
_markdown_from_data so that for each cell (inside the loop over headers for a
given row) you normalize newlines before escaping pipes — e.g., replace CR/LF
and LF characters with a single space or an escaped literal (like "\\n") — then
apply the existing pipe-escaping logic (str(val).replace("\n", "
").replace("\r", "").replace("|", "\\|")) so no cell introduces extra lines in
the produced markdown table.
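As a rough illustration of that normalization (the project's function is Python's _markdown_from_data; this TypeScript sketch only mirrors the string handling):

```typescript
// Illustrative cell sanitizer: normalize newlines first, then escape the
// markdown column delimiter, so one cell can never span multiple rows.
function escapeCell(val: unknown): string {
  return String(val)
    .replace(/\r\n|\r|\n/g, " ") // collapse any newline flavor to a space
    .replace(/\|/g, "\\|");      // escape pipes after newline handling
}

// Assumed helper for joining a row; not part of the reviewed code.
function markdownRow(cells: unknown[]): string {
  return `| ${cells.map(escapeCell).join(" | ")} |`;
}
```

Doing the newline replacement before the pipe escaping matters only for ordering clarity; the two substitutions touch disjoint characters.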

In `@data-processing/dataProcessingServer/dataProcessingServer/urls.py`:
- Line 7: Update the URL pattern in urls.py: replace the path call that
currently reads path('healthCheck', lambda request: HttpResponse('Hello
World!')) with a kebab-case pattern including a trailing slash, e.g.
path('health-check/', lambda request: HttpResponse('Hello World!')), so the
route name and trailing slash are consistent with other routes (keep the same
lambda view/HttpResponse).

In `@data-processing/dataProcessingServer/requirements.txt`:
- Around line 15-28: The commented dev/optional dependencies in requirements.txt
are invisible to tooling—create a new requirements-dev.txt containing the
entries shown (uvicorn, gunicorn, pymongo, motor, redis, pytest, pytest-django
with the suggested versions) and remove the commented lines from
requirements.txt so it only lists production deps; then update any CI/automation
config (pip-compile/Dependabot/OSV Scanner references or test install steps) to
install or reference requirements-dev.txt for development/test jobs instead of
relying on commented lines.
- Around line 3-4: requirements.txt pins Django to 6.x which requires Python
3.12+, so update your build/CI setup to enforce that Python version: change your
Dockerfile base image to a Python 3.12+ image (refer to your Dockerfile) and
update your CI workflow matrix/job steps (e.g., the Python version entry in your
CI config) to include only python-3.12 or newer so installs of Django>=6.0
succeed; also ensure any test runner or environment setup steps reference the
same python version to avoid silent install failures.
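A minimal way to declare that floor, assuming a pyproject.toml-based setup (file contents are a sketch, not taken from the repository):

```toml
# pyproject.toml (fragment): blocks installs on interpreters older than
# the Python 3.12 floor that Django>=6.0 requires
[project]
requires-python = ">=3.12"
```

A `.python-version` file containing `3.12` serves the same purpose for pyenv-style tooling and CI images.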

In `@server/src/controllers/savedRoutes.controllers.ts`:
- Around line 94-105: The current code silently drops breakpoints when
computeBreakpoints fails and uses a dynamic import; change to a static import of
computeBreakpoints at the top of the module and modify the save flow around
routeBreakpointsData/computeBreakpoints so that if computeBreakpoints throws you
still save the route but set a clear partial-save indicator (e.g., add a boolean
property like partialSave or breakpointError and an optional warning message on
the response body) so the caller is aware breakpoints are missing; keep the
try/catch to prevent crashes but populate that flag and include the error
message (or a sanitized message) in the response and server log to aid
debugging.
- Around line 164-172: The current cascade delete uses Route.findOneAndDelete
followed by BreakPoint.deleteMany which is non-atomic and can leave orphaned
BreakPoint docs; modify the controller to start a mongoose session (await
mongoose.startSession()), detect if transactions are supported and run both
operations inside session.withTransaction(), passing { session } to
Route.findOneAndDelete and BreakPoint.deleteMany so they commit atomically, and
if transactions are not available fall back to deleting BreakPoint.deleteMany({
routeId: id }) first and then remove the Route (Route.findOneAndDelete({ _id:
id, userId })) so breakpoints are not orphaned; ensure session.endSession() is
called and handle errors to abort/roll back appropriately.

In `@server/src/utils/redis.ts`:
- Around line 6-11: The top-level throw in server/src/utils/redis.ts (checking
UPSTASH_URL / UPSTASH_TOKEN) crashes the whole process on import; remove that
immediate check and replace it with a lazy factory/getter (e.g., export a
getRedisClient or createRedisClient function) that performs the env validation
and constructs/caches the client on first call; update callers (like
savedRoutes.controllers.ts) to call the factory/getter when they actually need
Redis so missing env vars only cause an error at first-use rather than on module
import.
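A sketch of that lazy factory, with a stand-in RedisClient shape (the real module would construct an Upstash client from UPSTASH_URL/UPSTASH_TOKEN instead):

```typescript
type RedisClient = { url: string }; // placeholder for the real client type

let cached: RedisClient | null = null;

function getRedisClient(
  env: Record<string, string | undefined> = process.env
): RedisClient {
  if (cached) return cached;
  const url = env.UPSTASH_URL;
  const token = env.UPSTASH_TOKEN;
  if (!url || !token) {
    // The error now surfaces at first use, not at module import time.
    throw new Error("UPSTASH_URL and UPSTASH_TOKEN must be set");
  }
  cached = { url }; // real code: new Redis({ url, token })
  return cached;
}
```

Callers such as savedRoutes.controllers.ts would invoke getRedisClient() where they currently reference the module-level client.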

@kaihere14 kaihere14 merged commit ed2e6a4 into kaihere14:main Feb 21, 2026
4 of 6 checks passed

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
server/src/controllers/savedRoutes.controllers.ts (2)

174-182: ⚠️ Potential issue | 🟡 Minor

findOneAndDelete completes before deleteMany; a failure on line 182 leaves orphaned BreakPoint documents.

If BreakPoint.deleteMany throws after the route is already deleted, the outer catch returns 500 but the route is gone — orphaned breakpoints accumulate silently. Wrapping deleteMany in its own try-catch (similar to the suggested fix for insertMany above) prevents a stale-data cleanup failure from masking a successful deletion to the client.

🛡️ Proposed fix
-    // Cascade-delete associated breakpoints
-    await BreakPoint.deleteMany({ routeId: route._id });
+    // Cascade-delete associated breakpoints (best-effort)
+    try {
+      await BreakPoint.deleteMany({ routeId: route._id });
+    } catch (bpErr) {
+      console.error("Failed to cascade-delete breakpoints for route:", route._id, bpErr);
+    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/src/controllers/savedRoutes.controllers.ts` around lines 174 - 182,
The deletion currently calls Route.findOneAndDelete(...) before awaiting
BreakPoint.deleteMany(...), so if BreakPoint.deleteMany throws the controller
returns 500 while the Route is already removed, leaving orphaned BreakPoint
docs; fix by first deleting the BreakPoints in a guarded try-catch and
log/handle any breakpoint-delete errors separately, then delete the Route (or
alternatively perform Route.findOneAndDelete first but wrap
BreakPoint.deleteMany in its own try-catch so a failure doesn’t override the
successful route deletion). Specifically, adjust the controller around
Route.findOneAndDelete and BreakPoint.deleteMany: ensure BreakPoint.deleteMany({
routeId: route._id }) is wrapped in its own try-catch and does not cause the
endpoint to return 500 after a successful Route.findOneAndDelete, and include
contextual logging of the error for easier debugging.

66-73: ⚠️ Potential issue | 🟠 Major

Route is committed to MongoDB before insertMany; a failure returns 500 while the route document already exists.

Route.create on line 66 commits the document unconditionally. If BreakPoint.insertMany (line 151) subsequently throws, the outer catch block returns 500 to the client who will assume the save failed and may retry — creating a duplicate route. Because breakpoints are already treated as recomputable (see the fallback on lines 101-113), the simplest fix is to isolate the insertMany in its own try-catch so that a breakpoint persistence failure still returns 201.

🛡️ Proposed fix
-      if (breakPointDocs.length > 0) {
-        await BreakPoint.insertMany(breakPointDocs);
-      }
+      if (breakPointDocs.length > 0) {
+        try {
+          await BreakPoint.insertMany(breakPointDocs);
+        } catch (bpErr) {
+          // Non-fatal: breakpoints can be recomputed on next access.
+          console.error("Failed to persist breakpoints (route already saved):", bpErr);
+        }
+      }

Also applies to: 150-152

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/src/controllers/savedRoutes.controllers.ts` around lines 66 - 73, The
Route document is being created with Route.create(...) before attempting to
persist breakpoints with BreakPoint.insertMany(...), so if insertMany fails the
outer catch returns 500 despite the route already existing; change the
controller so that BreakPoint.insertMany is called inside its own try-catch
(surround the BreakPoint.insertMany call), log the insert error and swallow it
(or attach a non-fatal warning to the response) so the outer flow still returns
201 after successful Route.create, and do the same for the other
BreakPoint.insertMany usage; locate the Route.create call and the
BreakPoint.insertMany calls in the saved route controller and implement the
isolated try-catch around each insertMany.
🧹 Nitpick comments (4)
server/src/controllers/savedRoutes.controllers.ts (1)

102-112: !routeBreakpointsData is always false; the dynamic import can be replaced with a static import.

routeBreakpointsData is initialised to [] on line 76 and is never assigned null or undefined, so the !routeBreakpointsData guard is dead code. Additionally, the inline comment acknowledges that a static import is equally valid — a dynamic import() adds unnecessary indirection and loses the return-type inference for computeBreakpoints (requiring the unsafe as RouteBreakpoints[] cast on line 108).

♻️ Proposed refactor

At the top of the file, add a static import:

+import { computeBreakpoints } from "../utils/compute/breakPoint.compute.js";

Then simplify the fallback block:

-    // Fallback: Re-compute if cache miss or no searchId
-    if (!routeBreakpointsData || routeBreakpointsData.length === 0) {
-      try {
-        // Dynamically import to avoid circular dep issues if any,
-        // though static import is fine here too.
-        const { computeBreakpoints } =
-          await import("../utils/compute/breakPoint.compute.js");
-        routeBreakpointsData = computeBreakpoints(routes) as RouteBreakpoints[];
-      } catch (e) {
-        console.error("Failed to re-compute breakpoints during save:", e);
-        // If this fails, we save the route without breakpoints rather than crashing
-      }
-    }
+    // Fallback: Re-compute if cache miss or no searchId
+    if (routeBreakpointsData.length === 0) {
+      try {
+        routeBreakpointsData = computeBreakpoints(routes) as RouteBreakpoints[];
+      } catch (e) {
+        console.error("Failed to re-compute breakpoints during save:", e);
+        // Non-fatal: route is saved without breakpoints
+      }
+    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/src/controllers/savedRoutes.controllers.ts` around lines 102 - 112,
routeBreakpointsData is initialized to [] so the !routeBreakpointsData check is
dead and the dynamic import of computeBreakpoints is unnecessary; statically
import computeBreakpoints at the top of the file, change the guard to only check
routeBreakpointsData.length === 0, remove the dynamic import block and call
computeBreakpoints(routes) directly (no unsafe "as RouteBreakpoints[]" cast),
and keep or simplify error handling around the direct call if you still want to
swallow failures; referenced symbols: routeBreakpointsData and
computeBreakpoints.
data-processing/dataProcessingServer/api/pathway/pipeline.py (1)

149-164: Move compute_score_udf to module scope to avoid repeated UDF re-registration.

The @pw.udf-decorated compute_score_udf is redefined on every call to run_batch_pipeline. Since it doesn't close over any local state (it delegates directly to the module-level process_route_row), there is no reason to define it inside the function. Repeated re-decoration on each request may accumulate UDF objects in Pathway's internal registry.

♻️ Proposed refactor
+# Define UDF at module level so it is registered only once
+if PATHWAY_AVAILABLE and pw is not None:
+    @pw.udf
+    def _compute_score_udf(
+        route_id: str,
+        route_index: int,
+        distance: float,
+        duration: float,
+        travel_mode: str,
+        weather_points: str,
+        aqi_points: str,
+        traffic_value: float,
+        last_computed_score: Optional[float]
+    ) -> str:
+        return process_route_row(
+            route_id, route_index, distance, duration, travel_mode,
+            weather_points, aqi_points, traffic_value, last_computed_score
+        )
+

 def run_batch_pipeline(routes_data):
     ...
-    @pw.udf
-    def compute_score_udf(...) -> str:
-        return process_route_row(...)
-
     result_table = input_table.select(
-        score_json=compute_score_udf(
+        score_json=_compute_score_udf(
             input_table.route_id, ...
         )
     )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@data-processing/dataProcessingServer/api/pathway/pipeline.py` around lines
149 - 164, Move the `@pw.udf`-decorated compute_score_udf out of
run_batch_pipeline into module scope so it is defined once (it simply delegates
to process_route_row); remove the inner definition from run_batch_pipeline and
have the pipeline call the module-level compute_score_udf. Ensure the module
still imports pw and that compute_score_udf signature matches
process_route_row's parameters (route_id, route_index, distance, duration,
travel_mode, weather_points, aqi_points, traffic_value, last_computed_score) so
calls from run_batch_pipeline remain unchanged.
data-processing/dataProcessingServer/api/pathway/transformers.py (1)

97-108: valid_points guard misses the case where main exists but contains no recognized fields.

If every weather point has a non-empty main dict that happens to omit temp, humidity, and pressure (e.g., {"main": {"wind": 5}}), valid_points > 0 but all per-field counts are 0. The function bypasses the "no data" guard and returns {temperature: 0.0, …, overall: 0.0} — indistinguishable from genuinely poor conditions.

♻️ Proposed fix
-    if valid_points == 0:
+    if temp_count == 0 and humidity_count == 0 and pressure_count == 0:
         return {
             "temperature": 0.0,
             "humidity": 0.0,
             "pressure": 0.0,
             "overall": 0.0,
             "details": None
         }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@data-processing/dataProcessingServer/api/pathway/transformers.py` around
lines 97 - 108, The guard using valid_points misses the case where per-field
counts are zero; update the early-return condition to also check that at least
one of temp_count, humidity_count, or pressure_count is > 0 (e.g., treat
(temp_count + humidity_count + pressure_count) == 0 as "no data") so the
function returns the no-data payload (temperature/humidity/pressure 0.0, overall
0.0, details None) when main exists but contains no recognized fields; locate
and modify the block that computes temp_score/humidity_score/pressure_score
using variables valid_points, temp_count, humidity_count, and pressure_count to
implement this additional check.
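An illustrative TypeScript port of that guard (the real code is Python's calculate_weather_score; the field names here mirror the review text):

```typescript
type WeatherMain = { temp?: number; humidity?: number; pressure?: number };

// Returns null when no recognized field appeared in any point: the
// "no data" case the review distinguishes from genuinely poor conditions.
function weatherAverages(points: WeatherMain[]):
  { temp: number | null; humidity: number | null; pressure: number | null } | null {
  let tempSum = 0, tempCount = 0;
  let humSum = 0, humCount = 0;
  let presSum = 0, presCount = 0;
  for (const p of points) {
    if (p.temp !== undefined) { tempSum += p.temp; tempCount++; }
    if (p.humidity !== undefined) { humSum += p.humidity; humCount++; }
    if (p.pressure !== undefined) { presSum += p.pressure; presCount++; }
  }
  // Per-field counts, not a generic valid_points counter, decide "no data".
  if (tempCount + humCount + presCount === 0) return null;
  return {
    temp: tempCount > 0 ? tempSum / tempCount : null,
    humidity: humCount > 0 ? humSum / humCount : null,
    pressure: presCount > 0 ? presSum / presCount : null,
  };
}
```

A point whose main dict carries only unrecognized fields contributes to no counter, so it can no longer mask the no-data case.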
client/app/(private)/home/routes/(from)/(to)/page.tsx (1)

22-41: RouteData is duplicated — import it from RouteComparisonPanel.tsx instead.

The local definition (Lines 22–41) is structurally identical to the exported RouteData in client/components/routes/RouteComparisonPanel.tsx (Lines 8–27). If either drifts the two become silently incompatible.

♻️ Proposed refactor
-type RouteData = {
-  distance: number; // in meters
-  duration: number; // in seconds (base, without traffic)
-  trafficDuration?: number;
-  trafficFactor?: number;
-  geometry: {
-    coordinates: [number, number][];
-    type: string;
-  };
-  overallScore?: number;
-  weatherScore?: number;
-  aqiScore?: {
-    aqi: number;
-    score: number;
-    category: string;
-  };
-  trafficScore?: number;
-  pollutionReductionPct?: number;
-  exposureWarning?: string;
-};
+import { type RouteData } from "@/components/routes/RouteComparisonPanel";
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@client/app/(private)/home/routes/(from)/(to)/page.tsx` around lines 22 - 41,
Remove the duplicate local RouteData type and instead import the shared type
exported from RouteComparisonPanel (use the exported RouteData type); update any
local references to use the imported type, and if RouteComparisonPanel does not
currently export the type as a named export, add an export (e.g., export type
RouteData = ...) in RouteComparisonPanel so this file can import it—then delete
the local definition in page.tsx and import RouteData at the top.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@client/app/(private)/home/routes/(from)/(to)/page.tsx`:
- Around line 180-182: The stale-response finally block unconditionally calls
setScoresLoading(false) causing the spinner to flicker when a newer fetch is
still in flight; inside the finally(s) in the score-fetch flow (the block that
references generation and fetchGenerationRef.current and calls
setScoresLoading), add a guard that only calls setScoresLoading(false) if
generation === fetchGenerationRef.current, and apply the same guard to the other
finally block(s) around lines that currently set scoresLoading(false) so only
the most recent generation clears the loading state.
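The generation-guard pattern described above can be simulated outside React; fetchGenerationRef and setScoresLoading mirror the names in the review, while the fetch itself is a stand-in promise:

```typescript
// Simulation of the fetch-generation guard; not the component's actual code.
const fetchGenerationRef = { current: 0 };
let scoresLoading = false;
const setScoresLoading = (v: boolean) => { scoresLoading = v; };

async function fetchScores(work: () => Promise<void>): Promise<void> {
  const generation = ++fetchGenerationRef.current; // claim a new generation
  setScoresLoading(true);
  try {
    await work();
    // Discard stale responses entirely.
    if (generation !== fetchGenerationRef.current) return;
    // ...apply the fetched scores to state here...
  } finally {
    // Only the newest generation may clear the spinner.
    if (generation === fetchGenerationRef.current) {
      setScoresLoading(false);
    }
  }
}
```

Without the guard in finally, the older generation's cleanup would blank the spinner while the newer request is still pending.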

In `@data-processing/dataProcessingServer/api/pathway/pipeline.py`:
- Around line 10-11: Update the misleading NOTE comment/docstring that currently
states "Pathway requires Linux" to reflect that Pathway is supported on both
Linux and macOS; keep the guidance that Windows users should use
run_simple_batch() or run Pathway inside a VM/Docker/WSL for production. Locate
the comment referencing Pathway and run_simple_batch() in pipeline.py and change
the text to mention macOS explicitly while preserving the existing Windows and
production recommendations.
- Around line 143-146: The pw.debug.table_from_markdown calls are incorrectly
using the default whitespace-splitting behavior which breaks JSON-containing
columns; update both calls to pw.debug.table_from_markdown (the one that builds
input_table using _markdown_from_data and RouteInputSchema and the similar call
inside run_batch_from_json) to pass split_on_whitespace=False so JSON strings
(e.g., weather_points/aqi_points) are not split into separate columns.

In `@data-processing/dataProcessingServer/api/views.py`:
- Around line 14-15: Add a minimal shared API key guard to the compute endpoints
by validating a request header (e.g., "X-API-Key") against an environment
variable (e.g., COMPUTE_API_KEY) and returning HTTP 403 when missing or
mismatched; implement this check for both view functions compute_scores and
compute_single_score either as a small decorator applied above each view or as
the first lines inside each function, and ensure the check runs before any heavy
processing while preserving existing `@csrf_exempt`/`@require_http_methods`
behavior.

In `@data-processing/dataProcessingServer/requirements.txt`:
- Around line 3-4: The requirements.txt line "Django>=6.0,<7.0" can cause
installations to fail on Python 3.10/3.11 because Django 6.0+ requires Python
≥3.12; update the repository to pin the runtime Python version so deployments
use a compatible interpreter. Fix by adding one of the suggested runtime
declarations: create a .python-version with "3.12" (or the minimum 3.12+ you
support), or add a Pipfile with a [requires] section specifying python_version =
"3.12", or add a python_requires = ">=3.12" entry in your packaging metadata
(setup.cfg/pyproject.toml) so installs for the package that depends on
Django>=6.0,<7.0 will be blocked on older Python versions.

In `@server/src/controllers/savedRoutes.controllers.ts`:
- Around line 81-83: The uuidRegex constant has a typo in the third character
class (`[0-9-a-f]`) that incorrectly allows a hyphen; update the pattern used in
the uuidRegex declaration to use `[0-9a-f]` (i.e., remove the stray hyphen) so
all five UUID segments strictly match hexadecimal characters (keep the existing
i flag and overall structure of uuidRegex).
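A quick demonstration of why that character class matters; both patterns below are illustrative one-segment versions, not the full uuidRegex:

```typescript
// `[0-9-a-f]` parses as range 0-9, a LITERAL hyphen, then range a-f,
// so the buggy class also accepts "-" where only hex should appear.
const buggySegment = /^[0-9-a-f]{4}$/i;
const fixedSegment = /^[0-9a-f]{4}$/i;

const probe = "a-b1"; // hyphen in a position that must be hexadecimal
const buggyAccepts = buggySegment.test(probe); // the bug: this is true
const fixedAccepts = fixedSegment.test(probe); // correctly false
```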

---

Outside diff comments:
In `@server/src/controllers/savedRoutes.controllers.ts`:
- Around line 174-182: The deletion currently calls Route.findOneAndDelete(...)
before awaiting BreakPoint.deleteMany(...), so if BreakPoint.deleteMany throws
the controller returns 500 while the Route is already removed, leaving orphaned
BreakPoint docs; fix by first deleting the BreakPoints in a guarded try-catch
and log/handle any breakpoint-delete errors separately, then delete the Route
(or alternatively perform Route.findOneAndDelete first but wrap
BreakPoint.deleteMany in its own try-catch so a failure doesn’t override the
successful route deletion). Specifically, adjust the controller around
Route.findOneAndDelete and BreakPoint.deleteMany: ensure BreakPoint.deleteMany({
routeId: route._id }) is wrapped in its own try-catch and does not cause the
endpoint to return 500 after a successful Route.findOneAndDelete, and include
contextual logging of the error for easier debugging.
- Around line 66-73: The Route document is being created with Route.create(...)
before attempting to persist breakpoints with BreakPoint.insertMany(...), so if
insertMany fails the outer catch returns 500 despite the route already existing;
change the controller so that BreakPoint.insertMany is called inside its own
try-catch (surround the BreakPoint.insertMany call), log the insert error and
swallow it (or attach a non-fatal warning to the response) so the outer flow
still returns 201 after successful Route.create, and do the same for the other
BreakPoint.insertMany usage; locate the Route.create call and the
BreakPoint.insertMany calls in the saved route controller and implement the
isolated try-catch around each insertMany.

---

Nitpick comments:
In `@client/app/(private)/home/routes/(from)/(to)/page.tsx`:
- Around line 22-41: Remove the duplicate local RouteData type and instead
import the shared type exported from RouteComparisonPanel (use the exported
RouteData type); update any local references to use the imported type, and if
RouteComparisonPanel does not currently export the type as a named export, add
an export (e.g., export type RouteData = ...) in RouteComparisonPanel so this
file can import it—then delete the local definition in page.tsx and import
RouteData at the top.

In `@data-processing/dataProcessingServer/api/pathway/pipeline.py`:
- Around line 149-164: Move the `@pw.udf-decorated` compute_score_udf out of
run_batch_pipeline into module scope so it is defined once (it simply delegates
to process_route_row); remove the inner definition from run_batch_pipeline and
have the pipeline call the module-level compute_score_udf. Ensure the module
still imports pw and that compute_score_udf signature matches
process_route_row's parameters (route_id, route_index, distance, duration,
travel_mode, weather_points, aqi_points, traffic_value, last_computed_score) so
calls from run_batch_pipeline remain unchanged.

In `@data-processing/dataProcessingServer/api/pathway/transformers.py`:
- Around line 97-108: The guard using valid_points misses the case where
per-field counts are zero; update the early-return condition to also check that
at least one of temp_count, humidity_count, or pressure_count is > 0 (e.g.,
treat (temp_count + humidity_count + pressure_count) == 0 as "no data") so the
function returns the no-data payload (temperature/humidity/pressure 0.0, overall
0.0, details None) when main exists but contains no recognized fields; locate
and modify the block that computes temp_score/humidity_score/pressure_score
using variables valid_points, temp_count, humidity_count, and pressure_count to
implement this additional check.

In `@server/src/controllers/savedRoutes.controllers.ts`:
- Around line 102-112: routeBreakpointsData is initialized to [] so the
!routeBreakpointsData check is dead and the dynamic import of computeBreakpoints
is unnecessary; statically import computeBreakpoints at the top of the file,
change the guard to only check routeBreakpointsData.length === 0, remove the
dynamic import block and call computeBreakpoints(routes) directly (no unsafe "as
RouteBreakpoints[]" cast), and keep or simplify error handling around the direct
call if you still want to swallow failures; referenced symbols:
routeBreakpointsData and computeBreakpoints.

Comment on lines +180 to +182
// If a newer fetchRoutes was triggered while we were waiting,
// discard this stale response entirely.
if (generation !== fetchGenerationRef.current) return;

⚠️ Potential issue | 🟡 Minor

scoresLoading briefly resets to false when a stale response's finally block fires while the current request is still in flight.

When two score fetches are in flight (gen 1 then gen 2), gen 1's stale response returns first, hits the early return at Line 182, then the finally on Line 249 still fires and sets scoresLoading(false) — even though gen 2 is still pending. The spinner disappears until gen 2 finishes.

🛡️ Proposed fix
-      if (generation !== fetchGenerationRef.current) return;
+      if (generation !== fetchGenerationRef.current) {
+        setScoresLoading(false);
+        return;
+      }

Then guard the finally:

     } catch (err) {
       console.error("Error fetching scores:", err);
     } finally {
-      setScoresLoading(false);
+      if (generation === fetchGenerationRef.current) {
+        setScoresLoading(false);
+      }
     }

Also applies to: 246-250

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@client/app/(private)/home/routes/(from)/(to)/page.tsx` around lines 180 -
182, The stale-response finally block unconditionally calls
setScoresLoading(false) causing the spinner to flicker when a newer fetch is
still in flight; inside the finally(s) in the score-fetch flow (the block that
references generation and fetchGenerationRef.current and calls
setScoresLoading), add a guard that only calls setScoresLoading(false) if
generation === fetchGenerationRef.current, and apply the same guard to the other
finally block(s) around lines that currently set scoresLoading(false) so only
the most recent generation clears the loading state.

Comment on lines +10 to +11
NOTE: Pathway requires Linux. On Windows, use run_simple_batch() which
uses direct Python computation. For production, run in Docker/WSL.

⚠️ Potential issue | 🟡 Minor

Docstring incorrectly limits Pathway to Linux — macOS is also supported.

Pathway is available on macOS and Linux; users of other systems should run Pathway on a Virtual Machine.

-NOTE: Pathway requires Linux. On Windows, use run_simple_batch() which
-uses direct Python computation. For production, run in Docker/WSL.
+NOTE: Pathway is available on Linux and macOS. On Windows, use run_simple_batch() which
+uses direct Python computation. For production on Windows, run in Docker/WSL.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
NOTE: Pathway requires Linux. On Windows, use run_simple_batch() which
uses direct Python computation. For production, run in Docker/WSL.
NOTE: Pathway is available on Linux and macOS. On Windows, use run_simple_batch() which
uses direct Python computation. For production on Windows, run in Docker/WSL.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@data-processing/dataProcessingServer/api/pathway/pipeline.py` around lines 10
- 11, Update the misleading NOTE comment/docstring that currently states
"Pathway requires Linux" to reflect that Pathway is supported on both Linux and
macOS; keep the guidance that Windows users should use run_simple_batch() or run
Pathway inside a VM/Docker/WSL for production. Locate the comment referencing
Pathway and run_simple_batch() in pipeline.py and change the text to mention
macOS explicitly while preserving the existing Windows and production
recommendations.

Comment on lines +143 to +146
input_table = pw.debug.table_from_markdown(
_markdown_from_data(table_data),
schema=RouteInputSchema
)

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

find . -name "pipeline.py" -path "*/pathway/*" | head -5

Repository: kaihere14/BreathClean

Length of output: 127


🏁 Script executed:

cat -n data-processing/dataProcessingServer/api/pathway/pipeline.py | sed -n '140,170p'

Repository: kaihere14/BreathClean

Length of output: 1160


🏁 Script executed:

cat -n data-processing/dataProcessingServer/api/pathway/pipeline.py | sed -n '240,270p'

Repository: kaihere14/BreathClean

Length of output: 1085


🏁 Script executed:

grep -n "_markdown_from_data" data-processing/dataProcessingServer/api/pathway/pipeline.py

Repository: kaihere14/BreathClean

Length of output: 163


🏁 Script executed:

grep -n "RouteInputSchema" data-processing/dataProcessingServer/api/pathway/pipeline.py | head -10

Repository: kaihere14/BreathClean

Length of output: 206


🏁 Script executed:

cat -n data-processing/dataProcessingServer/api/pathway/pipeline.py | sed -n '1,50p'

Repository: kaihere14/BreathClean

Length of output: 1833


🏁 Script executed:

grep -n "table_data\|weather_points\|aqi_points" data-processing/dataProcessingServer/api/pathway/pipeline.py | head -20

Repository: kaihere14/BreathClean

Length of output: 1068


🏁 Script executed:

grep -n "table_from_markdown\|table_from_pandas\|split_on_whitespace" data-processing/dataProcessingServer/api/pathway/pipeline.py

Repository: kaihere14/BreathClean

Length of output: 116


🏁 Script executed:

python3 << 'EOF'
import sys
try:
    import pathway as pw
    # Check if table_from_markdown has split_on_whitespace parameter
    import inspect
    sig = inspect.signature(pw.debug.table_from_markdown)
    print("table_from_markdown signature:")
    print(sig)
    print("\nParameters:")
    for param_name, param in sig.parameters.items():
        print(f"  {param_name}: {param.annotation if param.annotation != inspect.Parameter.empty else 'no annotation'}")
except ImportError:
    print("Pathway not available in this environment")
except Exception as e:
    print(f"Error: {e}")
EOF

Repository: kaihere14/BreathClean

Length of output: 106


🏁 Script executed:

find . -name "*.py" -type f | xargs grep -l "table_from_markdown\|table_from_pandas" | head -10

Repository: kaihere14/BreathClean

Length of output: 127


🏁 Script executed:

grep -r "split_on_whitespace" . --include="*.py" 2>/dev/null

Repository: kaihere14/BreathClean

Length of output: 47


🌐 Web query:

Pathway table_from_markdown split_on_whitespace parameter documentation

💡 Result:

In Pathway pw.debug.table_from_markdown(...), the parameter:

  • split_on_whitespace: bool = True — when True (default), the markdown rows are parsed by splitting on whitespace. Set split_on_whitespace=False if you need string cell values that contain spaces (so whitespace inside values isn’t used as a delimiter). [1]

Source: [1]


🌐 Web query:

Pathway Python library table_from_markdown API reference

💡 Result:

pw.debug.table_from_markdown (Pathway Python) API reference:

Docs page: Pathway API Docs → pw.debug.table_from_markdown.[1]

Signature

```python
pw.debug.table_from_markdown(
    table_def,
    id_from=None,
    unsafe_trusted_ids=False,
    schema=None,
    *,
    split_on_whitespace=True,
) -> pw.Table
```
[1][2]

**Parameters**
- `table_def` (`str`): Markdown-like table text (header row + data rows).[1][2]
- `id_from` (`list[str] | None`): Column(s) to build row ids from (primary key). Mutually exclusive with `schema`.[2]
- `unsafe_trusted_ids` (`bool`): Passed through to the underlying datasource options.[1][2]
- `schema` (`type[pw.Schema] | None`): Explicit schema; if set, must match the table's ordinary (non-pseudo) columns; mutually exclusive with `id_from`.[2]
- `split_on_whitespace` (`bool`, keyword-only): Default `True`. If `True`, also splits columns on whitespace; set `False` if you need string cells containing spaces.[1][2]

**Special pseudo-columns**
- `__time__`: If present, rows are split into batches by this timestamp column.[1][2]
- `__diff__`: If present, event type: `1` = insert, `-1` = delete.[1][2]

**Returns**
- A `pw.Table`.[2]

Sources: Pathway `pw.debug` docs[1] and the implementation docstring/signature in `pathway/debug/__init__.py`[2].

---

🏁 Script executed:

cat -n data-processing/dataProcessingServer/api/pathway/pipeline.py | sed -n '1,35p'

Repository: kaihere14/BreathClean

Length of output: 1268


🏁 Script executed:

cat -n data-processing/dataProcessingServer/api/pathway/pipeline.py | sed -n '100,200p'

Repository: kaihere14/BreathClean

Length of output: 4071


🏁 Script executed:

python3 << 'EOF'
try:
    import pathway as pw
    # Check if table_from_pandas exists
    if hasattr(pw.debug, 'table_from_pandas'):
        import inspect
        sig = inspect.signature(pw.debug.table_from_pandas)
        print("table_from_pandas exists!")
        print("Signature:", sig)
    else:
        print("table_from_pandas not found in pw.debug")
        print("Available methods in pw.debug:")
        print([m for m in dir(pw.debug) if not m.startswith('_')])
except ImportError:
    print("Pathway not available in this environment")
EOF

Repository: kaihere14/BreathClean

Length of output: 106


🏁 Script executed:

python3 << 'EOF'
# Simulate the markdown parsing behavior
test_markdown = """route_id | weather_points
route1 | [{"main": {"temp": 22}}]
"""

# With split_on_whitespace=True (default), splitting happens like this:
lines = test_markdown.strip().split('\n')
header_line = lines[0]
data_line = lines[1]

print("Original markdown:")
print(test_markdown)

print("\nHeader line:", header_line)
print("Data line:", data_line)

print("\n--- With split_on_whitespace=True (current behavior) ---")
# Simulating default behavior - split on whitespace creates extra columns
parts = data_line.split('|')
print(f"Parts split by '|': {parts}")
for part in parts:
    tokens = part.strip().split()
    print(f"  Tokens in '{part.strip()}': {tokens}")

print("\n--- With split_on_whitespace=False (proposed fix) ---")
print("Values would be preserved as-is, no extra splitting on whitespace")
EOF

Repository: kaihere14/BreathClean

Length of output: 577


Add split_on_whitespace=False to table_from_markdown call — JSON columns with spaces are being whitespace-split into misaligned columns.

pw.debug.table_from_markdown splits column values on whitespace by default. The weather_points and aqi_points columns contain JSON strings (e.g., [{"main": {"temp": 22, "humidity": 55}}]) that include spaces. Without split_on_whitespace=False, each space-separated token becomes a separate column value, corrupting the entire table structure. The Pathway batch path will silently produce corrupted data for any route with non-empty environmental data.

Fix: Add the parameter to the table_from_markdown call at line 143:

    input_table = pw.debug.table_from_markdown(
        _markdown_from_data(table_data),
        schema=RouteInputSchema,
+       split_on_whitespace=False
    )

Also applies to line 291 (similar call in run_batch_from_json).
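To see the failure mode without installing Pathway, here is a stand-alone sketch of what whitespace splitting does to a JSON cell (the row format mirrors what `_markdown_from_data` is assumed to emit):

```python
import json

# One markdown-style row: a route id plus a JSON-encoded weather column.
row = 'route1 | [{"main": {"temp": 22, "humidity": 55}}]'

# Default behavior (split_on_whitespace=True): every space is a delimiter,
# so the single JSON cell shatters into many misaligned tokens.
tokens = row.split()
print(len(tokens))  # far more than the 2 columns the schema expects

# With split_on_whitespace=False, only '|' separates columns, so the
# JSON string survives intact and stays parseable.
cells = [c.strip() for c in row.split("|")]
weather = json.loads(cells[1])
print(weather[0]["main"]["temp"])
```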

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

    input_table = pw.debug.table_from_markdown(
        _markdown_from_data(table_data),
-       schema=RouteInputSchema
+       schema=RouteInputSchema,
+       split_on_whitespace=False
    )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@data-processing/dataProcessingServer/api/pathway/pipeline.py` around lines
143 - 146, The pw.debug.table_from_markdown calls are incorrectly using the
default whitespace-splitting behavior which breaks JSON-containing columns;
update both calls to pw.debug.table_from_markdown (the one that builds
input_table using _markdown_from_data and RouteInputSchema and the similar call
inside run_batch_from_json) to pass split_on_whitespace=False so JSON strings
(e.g., weather_points/aqi_points) are not split into separate columns.

Comment on lines +14 to +15
@csrf_exempt
@require_http_methods(["POST"])
Contributor


⚠️ Potential issue | 🟠 Major

No authentication on compute endpoints — add at minimum a shared API key check.

Both compute_scores and compute_single_score are @csrf_exempt with no authentication whatsoever. Anyone who can reach the Django server can fire arbitrarily many scoring requests without restriction, leading to CPU exhaustion or DoS. Even if this server is only reachable internally, service-to-service auth (e.g., a pre-shared key in a request header validated against an env var) is a standard baseline:

🔒 Minimal API key guard (apply to both views)
+import os
+from django.conf import settings
+
+def _require_api_key(request):
+    expected = os.environ.get("INTERNAL_API_KEY", "")
+    provided = request.headers.get("X-Internal-Api-Key", "")
+    if not expected or provided != expected:
+        return JsonResponse({"success": False, "message": "Unauthorized"}, status=401)
+    return None

 @csrf_exempt
 @require_http_methods(["POST"])
 def compute_scores(request):
+    auth_error = _require_api_key(request)
+    if auth_error:
+        return auth_error
     ...

Also applies to: 123-124

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@data-processing/dataProcessingServer/api/views.py` around lines 14 - 15, Add
a minimal shared API key guard to the compute endpoints by validating a request
header (e.g., "X-API-Key") against an environment variable (e.g.,
COMPUTE_API_KEY) and returning HTTP 403 when missing or mismatched; implement
this check for both view functions compute_scores and compute_single_score
either as a small decorator applied above each view or as the first lines inside
each function, and ensure the check runs before any heavy processing while
preserving existing `@csrf_exempt`/`@require_http_methods` behavior.
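Framework aside, the guard reduces to one comparison. A minimal framework-free sketch (the names `INTERNAL_API_KEY` and `X-Internal-Api-Key` are assumptions, not project conventions):

```python
import hmac
import os

def check_api_key(headers: dict) -> bool:
    """Return True only when the shared key is configured and matches.

    Hypothetical names: INTERNAL_API_KEY (env var) and
    X-Internal-Api-Key (request header) — adjust to the project's own.
    """
    expected = os.environ.get("INTERNAL_API_KEY", "")
    provided = headers.get("X-Internal-Api-Key", "")
    # An unset key fails closed: no request is authorized by accident.
    # compare_digest avoids leaking the key length/content via timing.
    return bool(expected) and hmac.compare_digest(provided, expected)

# Usage sketch: run this check before any heavy scoring work.
os.environ["INTERNAL_API_KEY"] = "s3cret"
print(check_api_key({"X-Internal-Api-Key": "s3cret"}))  # True
print(check_api_key({"X-Internal-Api-Key": "wrong"}))   # False
print(check_api_key({}))                                # False
```

In a Django view this would sit as the first lines of `compute_scores` and `compute_single_score`, returning a 401 `JsonResponse` when it fails.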

Comment on lines +3 to +4
# Django (must match version referenced in settings.py — Django 6.0.2)
Django>=6.0,<7.0
Contributor


⚠️ Potential issue | 🟡 Minor

Django 6.0 requires Python 3.12+ — ensure runtime is pinned accordingly.

Django 6.0 supports Python 3.12, 3.13, and 3.14 and drops 3.10/3.11. If the deployment environment runs Python 3.10 or 3.11, pip install -r requirements.txt will fail. Consider adding a .python-version file, a Pipfile [requires] block, or a python_requires entry to enforce ≥ 3.12 explicitly.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@data-processing/dataProcessingServer/requirements.txt` around lines 3 - 4,
The requirements.txt line "Django>=6.0,<7.0" can cause installations to fail on
Python 3.10/3.11 because Django 6.0+ requires Python ≥3.12; update the
repository to pin the runtime Python version so deployments use a compatible
interpreter. Fix by adding one of the suggested runtime declarations: create a
.python-version with "3.12" (or the minimum 3.12+ you support), or add a Pipfile
with a [requires] section specifying python_version = "3.12", or add a
python_requires = ">=3.12" entry in your packaging metadata
(setup.cfg/pyproject.toml) so installs for the package that depends on
Django>=6.0,<7.0 will be blocked on older Python versions.
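For example, a pyproject.toml fragment along these lines would make pip refuse to install on pre-3.12 interpreters (the project name is a placeholder, not taken from the repo):

```toml
[project]
name = "data-processing-server"  # placeholder
requires-python = ">=3.12"
dependencies = [
    "Django>=6.0,<7.0",
]
```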

Comment on lines +81 to +83
const uuidRegex =
/^[0-9a-f]{8}-[0-9a-f]{4}-[0-9-a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

Contributor


⚠️ Potential issue | 🟡 Minor

Typo in UUID regex makes the third segment too permissive.

The third group is [0-9-a-f] instead of [0-9a-f]. In a JS character class (non-Unicode mode) [0-9-a-f] is parsed as: range 0-9, literal hyphen -, range a-f. This means a - character would satisfy [0-9-a-f]{4}, allowing strings like a-b4 to pass as the third UUID segment and slip into the Redis key.

🛡️ Proposed fix
-      const uuidRegex =
-        /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9-a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;
+      const uuidRegex =
+        /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/src/controllers/savedRoutes.controllers.ts` around lines 81 - 83, The
uuidRegex constant has a typo in the third character class (`[0-9-a-f]`) that
incorrectly allows a hyphen; update the pattern used in the uuidRegex
declaration to use `[0-9a-f]` (i.e., remove the stray hyphen) so all five UUID
segments strictly match hexadecimal characters (keep the existing i flag and
overall structure of uuidRegex).
