Conversation

@sweetmantech sweetmantech commented Nov 14, 2025

Summary by CodeRabbit

  • New Features

    • Added POST /artist/socials/scrape to trigger batch scraping and return per-profile results (runId, datasetId, error).
  • Improvements

    • Standardized GET /artist/socials route behavior with clearer success/error responses and a new supported flag on scrape results.
    • Batch scraping now aggregates results across profiles.
  • Bug Fixes

    • Improved input validation and consistent 400/500 responses for bad requests and unexpected errors.


coderabbitai bot commented Nov 14, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Refactors the GET handler typing to RequestHandler, re-exports controller handlers, and adds a POST /artist/socials/scrape handler that validates artist_account_id, fetches account socials, calls a new batch scraper, and returns aggregated scrape results; also introduces the batch scraper and updates scrape-result typing.

Changes

Cohort / File(s): Summary

GET handler refactor (controllers/ArtistSocialsController/getArtistSocialsHandler.ts)
  Change the handler signature to RequestHandler (typed import), adjust imports/paths, update the route comment to /artist/socials, and ensure early returns after 400/500 responses plus an explicit return after success.

Controller index exports (controllers/ArtistSocialsController/index.ts)
  Add re-exports for getArtistSocialsHandler and postArtistSocialsScrapeHandler.

New POST scrape handler (controllers/ArtistSocialsController/postArtistSocialsScrapeHandler.ts)
  New RequestHandler that validates body.artist_account_id, calls getAccountSocials, maps socials to inputs, invokes scrapeProfileUrlBatch, returns aggregated results or appropriate 400/500 JSON responses, and logs unexpected errors.

Routes updated (routes.ts)
  Import both handlers from controllers/ArtistSocialsController and register GET /artist/socials plus the new POST /artist/socials/scrape route (removing the prior `as any` cast).

Scrape result typing & implementation (lib/apify/scrapeProfileUrl.ts)
  Introduce ScrapeProfileResult = ProfileScrapeResult & { supported: boolean }, remove supported from ProfileScrapeResult, and update the scrapeProfileUrl return type to `Promise<ScrapeProfileResult | null>`.

Batch scraper (lib/apify/scrapeProfileUrlBatch.ts)
  Add scrapeProfileUrlBatch(inputs) to run scrapeProfileUrl in parallel (Promise.all), normalize inputs, filter out nulls, and return an array of profile scrape results.

Manifest (package.json)
  Manifest file present in the diff (no detailed changes summarized).
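The batch scraper described above (parallel runs via Promise.all, input normalization, null filtering) can be sketched as follows. This is a minimal sketch: the scrapeProfileUrl stub stands in for the real Apify-backed implementation, and its exact behavior here is an assumption, not the PR's code.

```typescript
type ScrapeProfileResult = {
  runId: string | null;
  datasetId: string | null;
  error: string | null;
  supported: boolean;
};

type ScrapeProfileUrlBatchInput = {
  profileUrl?: string | null;
  username?: string | null;
};

// Stub standing in for the real Apify-backed scraper; it returns null
// for unsupported or missing inputs, mirroring the nullable return
// type described in the review context.
const scrapeProfileUrl = async (
  profileUrl: string | null,
  username: string
): Promise<ScrapeProfileResult | null> => {
  if (!profileUrl || !username) return null;
  return { runId: "run-1", datasetId: "ds-1", error: null, supported: true };
};

export const scrapeProfileUrlBatch = async (
  inputs: ScrapeProfileUrlBatchInput[]
): Promise<ScrapeProfileResult[]> => {
  // Run all scrapes in parallel, normalizing missing fields,
  // then drop nulls (unsupported or missing profile URLs).
  const results = await Promise.all(
    inputs.map(({ profileUrl, username }) =>
      scrapeProfileUrl(profileUrl ?? null, username ?? "")
    )
  );
  return results.filter((r): r is ScrapeProfileResult => r !== null);
};
```

Note that the null filtering silently drops failed inputs, which is one reason the reviews below suggest carrying identifiers through the batch.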

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Server
    participant getAccountSocials
    participant scrapeProfileUrlBatch
    participant Response

    Client->>Server: POST /artist/socials/scrape { artist_account_id }
    Server->>Server: Validate artist_account_id
    alt invalid
        Server->>Response: 400 { error: "artist_account_id required" }
        Response-->>Client: 400
    else valid
        Server->>getAccountSocials: fetch socials for account
        alt fetch error
            Server->>Response: 500 { error }
            Response-->>Client: 500
        else socials found
            Server->>scrapeProfileUrlBatch: scrape batch(inputs)
            scrapeProfileUrlBatch-->>Server: [ { runId?, datasetId?, error? , supported } ... ]
            Server->>Response: 200 [ aggregated results ]
            Response-->>Client: 200
        end
    end
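The flow in the diagram can be sketched as a pure function, which makes the branch structure easy to test. This is a sketch only: getAccountSocials and scrapeProfileUrlBatch are stubs standing in for the real Supabase fetch and Apify batch scraper, and the actual handler is an Express RequestHandler that wraps this logic in res.status(...).json(...).

```typescript
type Social = { id: number; username: string | null; profile_url: string | null };
type ScrapeResult = {
  runId: string | null;
  datasetId: string | null;
  error: string | null;
  supported: boolean;
};

// Stubbed dependencies; real versions live in lib/supabase and lib/apify.
const getAccountSocials = async (
  accountId: string
): Promise<{ status: string; socials: Social[] }> => ({
  status: "success",
  socials: [{ id: 1, username: "a", profile_url: "https://instagram.com/a" }],
});

const scrapeProfileUrlBatch = async (
  inputs: { profileUrl: string | null; username: string | null }[]
): Promise<ScrapeResult[]> =>
  inputs.map(() => ({ runId: "run-1", datasetId: "ds-1", error: null, supported: true }));

export const handleScrape = async (
  body: unknown
): Promise<{ status: number; payload: unknown }> => {
  // 400 when artist_account_id is missing or not a string.
  const artistAccountId = (body as { artist_account_id?: unknown } | null)
    ?.artist_account_id;
  if (typeof artistAccountId !== "string" || !artistAccountId) {
    return { status: 400, payload: { error: "artist_account_id required" } };
  }
  try {
    const { status, socials } = await getAccountSocials(artistAccountId);
    if (status !== "success") {
      return { status: 500, payload: { error: "Failed to fetch socials" } };
    }
    if (!socials.length) return { status: 200, payload: [] };
    // Map socials to scraper inputs and return aggregated results.
    const results = await scrapeProfileUrlBatch(
      socials.map((s) => ({ profileUrl: s.profile_url, username: s.username }))
    );
    return { status: 200, payload: results };
  } catch (err) {
    console.error(err);
    return { status: 500, payload: { error: "Unexpected error" } };
  }
};
```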

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Pay attention to:
    • postArtistSocialsScrapeHandler.ts: input validation, concurrency/error aggregation, and logging.
    • lib/apify/scrapeProfileUrl.ts and scrapeProfileUrlBatch.ts: type changes (new ScrapeProfileResult) and consistency of returned shapes.
    • routes.ts and controller exports: correct imports/registration and removal of prior casts.

Poem

🐰 I hop through routes with tidy care,
I check each id and launch the snare.
I call the scrapers, batch by batch,
Collect the runs, then send them back.
Happy hops for every successful snatch!

Pre-merge checks and finishing touches

❌ Failed checks (1 inconclusive)

Title check: ❓ Inconclusive
  Explanation: The pull request title is vague and uses non-descriptive formatting (lowercase, abbreviated Jira-like identifiers) that does not clearly convey what changes were made.
  Resolution: Revise the title to clearly describe the main change, such as "Add API endpoint for scraping artist social media profiles" or "Implement batch processing for artist socials scraping".

✅ Passed checks (2 passed)

Description Check: ✅ Passed
  Check skipped - CodeRabbit’s high-level summary is enabled.

Docstring Coverage: ✅ Passed
  No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✨ Finishing touches
  • 📝 Generate docstrings
  • 🧪 Generate unit tests (beta)
    • Create PR with unit tests
    • Post copyable unit tests in a comment
    • Commit unit tests in branch sweetmantech/myc-3422-api-apiartistsocialsscrape


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (1)
controllers/ArtistSocialsController/postArtistSocialsScrapeHandler.ts (1)

36-39: Consider consistent response structure.

The empty array response is inconsistent with error responses that include {status, message}. Consider returning a structured response for consistency:

     if (!socials.length) {
-      res.json([]);
+      res.status(200).json({ status: "success", results: [] });
       return;
     }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a876ae9 and 377ebef.

📒 Files selected for processing (4)
  • controllers/ArtistSocialsController/getArtistSocialsHandler.ts (3 hunks)
  • controllers/ArtistSocialsController/index.ts (1 hunks)
  • controllers/ArtistSocialsController/postArtistSocialsScrapeHandler.ts (1 hunks)
  • routes.ts (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
controllers/ArtistSocialsController/postArtistSocialsScrapeHandler.ts (2)
lib/supabase/getAccountSocials.ts (1)
  • getAccountSocials (9-64)
lib/apify/scrapeProfileUrl.ts (1)
  • scrapeProfileUrl (54-97)
controllers/ArtistSocialsController/getArtistSocialsHandler.ts (1)
lib/supabase/getArtistSocials.ts (1)
  • getArtistSocials (40-146)
🔇 Additional comments (9)
controllers/ArtistSocialsController/getArtistSocialsHandler.ts (3)

1-2: LGTM: Type import and path update.

The use of type import for RequestHandler follows TypeScript best practices, and the updated import path reflects the file reorganization.


4-8: LGTM: Standardized handler type and corrected documentation.

The handler now uses Express's RequestHandler type for consistency, and the route documentation correctly reflects the actual endpoint path.


13-51: LGTM: Explicit returns improve control flow.

Adding explicit return statements after each response prevents accidental fall-through and makes the control flow clearer.

routes.ts (2)

10-13: LGTM: Clean barrel import pattern.

The consolidated import of both handlers from ArtistSocialsController follows the established pattern and improves organization.


100-101: LGTM: Improved type safety and new scrape endpoint.

Removing the as any cast improves type safety for the GET route, and the new POST /artist/socials/scrape endpoint is properly wired.

controllers/ArtistSocialsController/index.ts (1)

1-2: LGTM: Clean barrel export.

This index file follows the standard barrel export pattern, enabling cleaner imports in other files.

controllers/ArtistSocialsController/postArtistSocialsScrapeHandler.ts (3)

16-24: LGTM: Proper input validation.

The handler correctly validates that artist_account_id is present and is a string, returning a 400 error with a clear message if validation fails.


26-34: LGTM: Error handling for fetching socials.

The handler properly checks the status from getAccountSocials and returns an appropriate 500 error if the fetch fails.


66-73: LGTM: Proper error handling.

The catch block appropriately logs the error for debugging while returning a generic error message to the client for security.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
controllers/ArtistSocialsController/postArtistSocialsScrapeHandler.ts (1)

38-59: Add social identifiers to map results back to their sources.

The results array contains only runId, datasetId, and error, with no way to determine which social account each result corresponds to. This issue was flagged in previous review comments and remains unresolved. Additionally, the type annotation uses ProfileScrapeResult[] instead of ScrapeProfileResult[], losing the supported field.

Apply this diff to fix both issues:

+type ArtistSocialScrapeResult = {
+  social_id: number;
+  username: string | null;
+  profile_url: string | null;
+  runId: string | null;
+  datasetId: string | null;
+  error: string | null;
+  supported: boolean;
+};
+
-    const results: ProfileScrapeResult[] = await Promise.all(
+    const results: ArtistSocialScrapeResult[] = await Promise.all(
       socials.map(async (social) => {
         const scrapeResult = await scrapeProfileUrl(
           social.profile_url ?? null,
           social.username ?? ""
         );
 
         if (!scrapeResult) {
           return {
+            social_id: social.id,
+            username: social.username,
+            profile_url: social.profile_url,
             runId: null,
             datasetId: null,
             error: "Unsupported or missing profile URL",
+            supported: false,
           };
         }
 
         return {
+          social_id: social.id,
+          username: social.username,
+          profile_url: social.profile_url,
           runId: scrapeResult.runId,
           datasetId: scrapeResult.datasetId,
           error: scrapeResult.error,
+          supported: scrapeResult.supported,
         };
       })
     );
🧹 Nitpick comments (2)
lib/apify/scrapeProfileUrl.ts (2)

21-23: Clarify the purpose of the supported field.

The supported field is always set to true in all return paths (lines 82, 90, 97), making it redundant. If the intent is to indicate whether a platform is supported, consider returning { supported: false } instead of null when no platform matches (line 71). Otherwise, if the field serves no functional purpose, consider removing it.

Can you clarify the intended behavior? If supported should vary based on platform availability, apply this diff:

   if (!platform) {
-    return null;
+    return {
+      runId: null,
+      datasetId: null,
+      error: "Unsupported platform",
+      supported: false,
+    };
   }

Otherwise, if it's always true, the field adds no value and could be removed.


60-72: Consider eliminating null returns for unsupported platforms.

The function returns null when no platform is found (line 71), but with the new supported field, this creates two ways to indicate lack of support: null or supported: false. This forces consumers to handle both cases.

Consider always returning ScrapeProfileResult and using the supported field consistently:

   if (!platform) {
-    return null;
+    return {
+      runId: null,
+      datasetId: null,
+      error: "Unsupported platform",
+      supported: false,
+    };
   }

Then update the return type to Promise<ScrapeProfileResult> (non-nullable) for a cleaner API.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 377ebef and 94f1eed.

📒 Files selected for processing (2)
  • controllers/ArtistSocialsController/postArtistSocialsScrapeHandler.ts (1 hunks)
  • lib/apify/scrapeProfileUrl.ts (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
controllers/ArtistSocialsController/postArtistSocialsScrapeHandler.ts (2)
lib/supabase/getAccountSocials.ts (1)
  • getAccountSocials (9-64)
lib/apify/scrapeProfileUrl.ts (2)
  • ProfileScrapeResult (15-19)
  • scrapeProfileUrl (57-100)
🔇 Additional comments (4)
lib/apify/scrapeProfileUrl.ts (1)

74-99: LGTM: Consistent handling of supported flag.

All return paths correctly include supported: true when a platform is matched. The implementation is consistent, though the utility of the field depends on addressing the earlier comments about unsupported platforms.

controllers/ArtistSocialsController/postArtistSocialsScrapeHandler.ts (3)

8-21: LGTM: Proper handler signature and input validation.

The handler correctly uses RequestHandler type and validates the required artist_account_id parameter with appropriate error handling.


23-36: LGTM: Proper error handling for fetching socials.

The handler correctly handles both error states and empty results from getAccountSocials.


63-70: LGTM: Appropriate error handling.

The catch block properly logs errors and returns a generic 500 response without exposing internal details.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (1)
controllers/ArtistSocialsController/postArtistSocialsScrapeHandler.ts (1)

35-42: Include social identifiers in scrape results to enable result mapping.

The batch results lack identifiers to map each runId/datasetId/error back to its corresponding social account. While Promise.all preserves order, relying on positional mapping is fragile—clients cannot correlate results to socials without explicit identifiers like social_id, username, or profile_url.

Consider enriching the batch results with identifiers after scraping:

     const results = await scrapeProfileUrlBatch(
       socials.map((social) => ({
         profileUrl: social.profile_url,
         username: social.username,
       }))
     );

-    res.json(results);
+    const enrichedResults = results.map((result, index) => ({
+      social_id: socials[index].id,
+      username: socials[index].username,
+      profile_url: socials[index].profile_url,
+      ...result,
+    }));
+
+    res.json(enrichedResults);

Alternatively, extend scrapeProfileUrlBatch to accept and preserve arbitrary metadata (e.g., a generic context field) so identifiers can be passed through the batch operation.
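A sketch of that alternative, assuming a hypothetical generic context parameter (this helper and its signature are not in the PR; scrapeProfileUrl is stubbed):

```typescript
type ScrapeProfileResult = {
  runId: string | null;
  datasetId: string | null;
  error: string | null;
  supported: boolean;
};

// Each input carries caller-defined context of type C.
type BatchInput<C> = {
  profileUrl?: string | null;
  username?: string | null;
  context: C;
};

// Stub standing in for the real scraper.
const scrapeProfileUrl = async (
  profileUrl: string | null,
  username: string
): Promise<ScrapeProfileResult | null> =>
  profileUrl && username
    ? { runId: "run-1", datasetId: "ds-1", error: null, supported: true }
    : null;

export const scrapeProfileUrlBatchWithContext = async <C>(
  inputs: BatchInput<C>[]
): Promise<(ScrapeProfileResult & { context: C })[]> => {
  const results = await Promise.all(
    inputs.map(async ({ profileUrl, username, context }) => {
      const result = await scrapeProfileUrl(profileUrl ?? null, username ?? "");
      // Attach the caller-supplied context to each result, so clients
      // never rely on positional order to map results back to socials.
      return result ? { ...result, context } : null;
    })
  );
  return results.filter(
    (r): r is ScrapeProfileResult & { context: C } => r !== null
  );
};
```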

🧹 Nitpick comments (1)
lib/apify/scrapeProfileUrlBatch.ts (1)

21-27: Consider preserving the supported field in batch results.

The function filters for ScrapeProfileResult (which includes a supported boolean indicating platform support), but the mapping drops this field and returns ProfileScrapeResult[]. Clients lose visibility into whether each profile URL was supported by a platform scraper or simply returned null for other reasons.

If the supported field is useful to consumers, consider updating the return type and mapping:

 export const scrapeProfileUrlBatch = async (
   inputs: ScrapeProfileUrlBatchInput[]
-): Promise<ProfileScrapeResult[]> => {
+): Promise<ScrapeProfileResult[]> => {
   const results = await Promise.all(
     inputs.map(({ profileUrl, username }) =>
       scrapeProfileUrl(profileUrl ?? null, username ?? "")
     )
   );

   return results
-    .filter((result): result is ScrapeProfileResult => result !== null)
-    .map(({ runId, datasetId, error }) => ({
-      runId,
-      datasetId,
-      error,
-    }));
+    .filter((result): result is ScrapeProfileResult => result !== null);
 };

Alternatively, if the supported field is intentionally omitted for batch results, document this design decision.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 94f1eed and 22a97a7.

📒 Files selected for processing (2)
  • controllers/ArtistSocialsController/postArtistSocialsScrapeHandler.ts (1 hunks)
  • lib/apify/scrapeProfileUrlBatch.ts (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
lib/apify/scrapeProfileUrlBatch.ts (1)
lib/apify/scrapeProfileUrl.ts (3)
  • ProfileScrapeResult (15-19)
  • scrapeProfileUrl (57-100)
  • ScrapeProfileResult (21-23)
controllers/ArtistSocialsController/postArtistSocialsScrapeHandler.ts (3)
controllers/ArtistSocialsController/index.ts (1)
  • postArtistSocialsScrapeHandler (2-2)
lib/supabase/getAccountSocials.ts (1)
  • getAccountSocials (9-64)
lib/apify/scrapeProfileUrlBatch.ts (1)
  • scrapeProfileUrlBatch (12-28)
🔇 Additional comments (1)
lib/apify/scrapeProfileUrlBatch.ts (1)

17-17: Empty string defaults are validated downstream, but upstream filtering may be better.

The platform scrapers (Instagram, TikTok) validate the handle parameter and throw "Invalid {Platform} handle" errors when empty strings are passed (after trimming). These errors are caught and returned in the result's error field, so failures are not silent—they're explicitly reported.

However, this still results in failed scrape attempts for any input with a missing username. Consider filtering out inputs with missing usernames upstream in scrapeProfileUrlBatch.ts rather than defaulting to empty strings, which would prevent unnecessary API calls and error handling.
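The upstream-filtering suggestion could look like the following. partitionScrapeInputs is a hypothetical helper, not part of the PR:

```typescript
type BatchInput = { profileUrl?: string | null; username?: string | null };

export const partitionScrapeInputs = (
  inputs: BatchInput[]
): { valid: BatchInput[]; skipped: BatchInput[] } => {
  const valid: BatchInput[] = [];
  const skipped: BatchInput[] = [];
  for (const input of inputs) {
    // Only inputs with a non-empty trimmed username reach the scraper,
    // avoiding guaranteed "Invalid handle" failures downstream.
    if (input.username?.trim()) valid.push(input);
    else skipped.push(input);
  }
  return { valid, skipped };
};
```

The skipped bucket could then be reported alongside the scrape results rather than surfacing as per-profile errors.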

@sweetmantech sweetmantech merged commit 538b565 into main Nov 14, 2025
2 checks passed