@dushyantzz dushyantzz commented Oct 2, 2025

Fixes #738

Solution: Filter Inferred Intent Confidence Scores

Problem

The frontend displayed intent confidence scores for intents that users never defined. Ludwig NLU infers intents from its training data, causing confusion when high confidence scores appear for non-existent user intents.

Example: User enters "Bonsoir" → Ludwig returns greetings_goodevening with 99.68% confidence → User never created this intent → Confusion!


Solution

Implemented server-side filtering in the /nlpsample/message endpoint to return only user-defined entities and values.

What Changed

Modified: api/src/nlp/controllers/nlp-sample.controller.ts

  • Added NlpValueService dependency injection
  • Implemented filtering logic in message() endpoint that:
    • Retrieves user's NLP configuration
    • Validates each predicted entity exists in user's config
    • For trait entities (like intent), validates the predicted value exists
    • Filters out non-user-defined entities/values

Added Tests: api/src/nlp/controllers/nlp-sample.controller.spec.ts

  • Test for filtering non-user-defined intent values
  • Test for preserving user-defined entities

How It Works

User Input → Ludwig NLU → Raw Prediction → Backend Filter → Only User-Defined → Frontend

Filtering Process:

  1. Get user's NLP entity configuration (cached)
  2. For each predicted entity:
    • Check if entity exists in user's config
    • For trait entities: validate the value exists
    • Filter out if not found
  3. Return only validated entities
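
The filtering steps above can be sketched in isolation as follows. This is a minimal, self-contained illustration rather than the controller code itself: the `PredictedEntity` and `EntityConfig` shapes, the `filterPrediction` helper, the entity ids, and the in-memory `valuesByEntity` map are assumptions that stand in for the Ludwig prediction, the cached NLP map, and the value lookup. Treating `language` as a trait entity and its confidence figure are also illustrative assumptions.

```typescript
// Hypothetical shapes, for illustration only.
interface PredictedEntity {
  entity: string;
  value: string;
  confidence: number;
}

interface EntityConfig {
  id: string;
  lookups: string[]; // e.g. ['trait'] or ['keywords']
}

// Keeps a predicted entity only if the user configured it; for trait
// entities the predicted value must also be one the user defined.
function filterPrediction(
  predicted: PredictedEntity[],
  config: Map<string, EntityConfig>, // entity name -> config
  valuesByEntity: Map<string, Set<string>>, // entity id -> defined values
): PredictedEntity[] {
  return predicted.filter((p) => {
    const cfg = config.get(p.entity);
    if (!cfg) return false; // entity unknown to the user
    if (cfg.lookups.includes('trait')) {
      return valuesByEntity.get(cfg.id)?.has(p.value) ?? false;
    }
    return true; // non-trait entity present in the config
  });
}

// Example mirroring the "Bonsoir" case from the description.
const config = new Map<string, EntityConfig>([
  ['intent', { id: 'e1', lookups: ['trait'] }],
  ['language', { id: 'e2', lookups: ['trait'] }],
]);
const valuesByEntity = new Map<string, Set<string>>([
  ['e1', new Set(['greeting'])],
  ['e2', new Set(['en', 'fr'])],
]);
const filtered = filterPrediction(
  [
    { entity: 'intent', value: 'greetings_goodevening', confidence: 0.9968 },
    { entity: 'language', value: 'fr', confidence: 0.9 }, // illustrative value
  ],
  config,
  valuesByEntity,
);
console.log(filtered.map((e) => `${e.entity}=${e.value}`));
```

In this sketch the inferred `greetings_goodevening` intent is dropped because the user only defined `greeting`, while the `language` entity survives.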

Impact

Before ❌

  • Shows confidence for intents user never created
  • Misleading high confidence scores
  • Users confused about intent origin

After ✅

  • Shows only user-defined intents
  • Accurate representation of user's configuration
  • Clear, honest UX

Testing

Run tests:

cd api
npm test -- nlp-sample.controller.spec.ts

Manual test:

  1. Create intent value greeting
  2. Enter "Bonsoir" → Only language shown (no inferred intent)
  3. Enter "Hello" → Your greeting intent shown with confidence ✅

Files Modified

  • api/src/nlp/controllers/nlp-sample.controller.ts (~45 lines)
  • api/src/nlp/controllers/nlp-sample.controller.spec.ts (~90 lines)

Benefits

✅ Eliminates user confusion
✅ Accurate confidence scores
✅ Backward compatible
✅ Well-tested
✅ Entity configuration lookup uses the cached NLP map


Summary

This fix filters out Ludwig NLU's inferred intents that don't exist in the user's configuration. Users now see confidence scores only for intents they've explicitly defined, providing an accurate representation of their chatbot's NLP capabilities.

Summary by CodeRabbit

  • Bug Fixes

    • NLP message responses now include only user-configured intents and entities, excluding inferred or unsupported items.
    • Trait-like entities are validated against allowed values to prevent incorrect matches.
    • Improves accuracy and consistency of detected entities in responses.
  • Tests

    • Added comprehensive tests to verify filtering behavior for mixed and fully user-defined entities, ensuring correct propagation and validation.

coderabbitai bot commented Oct 2, 2025

Walkthrough

Adds runtime filtering in NlpSampleController.message to return only user-defined entities and validated trait values, introducing NlpValueService dependency. Updates constructor accordingly. Adds tests that mock predictions to verify inferred intents are excluded and user-defined entities/values are preserved.

Changes

Cohort: Controller logic and dependency
File(s): api/src/nlp/controllers/nlp-sample.controller.ts
Summary: Filters NLU entities against user-defined mappings; validates trait values via NlpValueService; updates constructor to inject NlpValueService; returns only non-null filtered entities.

Cohort: Unit tests for filtering behavior
File(s): api/src/nlp/controllers/nlp-sample.controller.spec.ts
Summary: Adds tests for message endpoint to ensure inferred intents are excluded while preserving user-defined entities; covers scenarios with mixed and all user-defined entities; mocks helper predictions.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant C as Client
  participant Cn as NlpSampleController
  participant H as NlpHelper
  participant E as NlpEntityService
  participant V as NlpValueService

  C->>Cn: /nlpsample/message (text)
  Cn->>H: predict(text)
  H-->>Cn: entities (intent, other traits, others)

  Cn->>E: getNlpMap()
  E-->>Cn: map of user-defined entities

  rect rgba(200,230,255,0.3)
  note right of Cn: Filter phase
  loop for each entity
    alt entity not in map
      Cn-->>Cn: drop entity
    else trait entity (e.g., intent)
      Cn->>V: find values for entity
      V-->>Cn: user-defined values
      Cn-->>Cn: keep only if predicted value is user-defined
    else other entity
      Cn-->>Cn: keep entity
    end
  end
  end

  Cn-->>C: filtered entities (non-null only)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

I twitch my whiskers, sift the streams,
From weeds of guess to gardened themes.
Intent that wandered? I let it go—
Only planted seeds may grow.
With traits confirmed, I thump in cheer:
Clean carrots of data, crisp and clear. 🥕

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name: Description Check
Status: ⚠️ Warning
Explanation: The pull request description does not follow the repository’s template headings and structure: it lacks the required “Motivation” header and a “Type of change” section with the appropriate checkboxes, and the “Checklist” section is missing entirely, even though other sections are detailed.
Resolution: Update the description to match the template by adding a “# Motivation” section that summarizes the change and fixes, a “# Type of change” section with the relevant checkbox marked, and the “# Checklist” section with each prerequisite item reviewed and completed.

✅ Passed checks (4 passed)

Check name: Title Check
Status: ✅ Passed
Explanation: The title references filtering inferred intent confidence scores, which aligns with the main change, but the leading phrase “Created solution” is extraneous and reduces clarity.

Check name: Linked Issues Check
Status: ✅ Passed
Explanation: The controller changes implement server-side filtering of NLP predictions to include only user-defined intents and values. The added tests validate that non-user-defined intents are excluded and user-defined entities are preserved. This directly addresses the requirements of issue #738: do not display confidence for undefined intents, and ensure API responses surface only configured values.

Check name: Out of Scope Changes Check
Status: ✅ Passed
Explanation: All modifications are confined to the NLP sample controller and its spec to implement and test filtering logic for user-defined intents and values, with no unrelated files or features altered outside the scope of the linked issue’s objectives.

Check name: Docstring Coverage
Status: ✅ Passed
Explanation: No functions found in the changes. Docstring coverage check skipped.
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (4)
api/src/nlp/controllers/nlp-sample.controller.spec.ts (2)

458-518: Consider cleanup after test data creation.

The test creates a new NLP value (greeting) in the database. If this data persists across tests, it could cause test pollution or flakiness. Consider:

  • Moving this setup to a dedicated beforeEach or beforeAll hook with corresponding cleanup
  • Or ensuring the test suite cleans up created entities in afterEach

Example cleanup pattern:

+  let createdValueId: string | undefined;
+
   it('should filter out intent values that are not user-defined', async () => {
     const intentEntity = await nlpEntityService.findOne({ name: 'intent' });
-    await nlpValueService.create({
+    const createdValue = await nlpValueService.create({
       entity: intentEntity!.id,
       value: 'greeting',
       expressions: [],
     });
+    createdValueId = createdValue.id;
     
     // ... rest of test
   });
+
+  afterEach(async () => {
+    if (createdValueId) {
+      await nlpValueService.deleteOne(createdValueId);
+      createdValueId = undefined;
+    }
+  });

520-545: Strengthen test assertions with content verification.

The test only verifies result.entities.length === 2 but doesn't validate the actual entity content. This makes the test less robust and harder to debug if it fails.

Apply this diff to add content assertions:

     const result = await nlpSampleController.message('Hello');
 
     expect(result.entities).toHaveLength(2);
+    expect(result.entities).toEqual(
+      expect.arrayContaining([
+        expect.objectContaining({
+          entity: 'intent',
+          value: 'greeting',
+        }),
+        expect.objectContaining({
+          entity: 'language',
+          value: 'en',
+        }),
+      ]),
+    );
   });
api/src/nlp/controllers/nlp-sample.controller.ts (2)

256-258: Return type could be strengthened with explicit typing.

The return statement filters out null values but TypeScript may not infer the narrowed type correctly. Consider adding an explicit type assertion or type guard for clarity.

     return {
-      entities: filteredEntities.filter((e) => e !== null),
+      entities: filteredEntities.filter((e): e is NonNullable<typeof e> => e !== null),
     };

This ensures the returned entities array has the correct non-nullable type.
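
As a standalone illustration of the type-guard approach, the sketch below uses a reusable predicate instead of an inline one. The `Entity` shape and the `isNotNull` helper are hypothetical names introduced here for the example; they are not part of the PR.

```typescript
// Hypothetical entity shape, for illustration only.
interface Entity {
  entity: string;
  value: string;
}

const mixed: (Entity | null)[] = [
  { entity: 'language', value: 'en' },
  null,
  { entity: 'intent', value: 'greeting' },
];

// A generic type guard usable with Array.prototype.filter.
// The `item is T` return type tells the compiler what survives the filter.
function isNotNull<T>(item: T | null): item is T {
  return item !== null;
}

// Narrowed to Entity[]; no cast is needed.
const entities: Entity[] = mixed.filter(isNotNull);

console.log(entities.length);
```

Recent TypeScript releases (5.5 and later) can infer such predicates for simple arrow functions, so whether the explicit guard is required depends on the compiler version the project uses.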


216-259: Consider documenting the filtering behavior.

The filtering logic has several implicit behaviors that might not be obvious to future maintainers:

  • Entities not in nlpMap are silently excluded
  • Trait entities with invalid values are silently excluded
  • Non-trait entities pass through if they exist in nlpMap

Consider adding inline comments to clarify the filtering rules:

     const filteredEntities = await Promise.all(
       prediction.entities.map(async (entity) => {
         const nlpEntity = nlpMap.get(entity.entity);
 
+        // Filter out entities that don't exist in user's configuration
         if (!nlpEntity) {
           return null;
         }
 
+        // For trait entities (e.g., intent), validate that the value is user-defined
         if (nlpEntity.lookups?.includes('trait')) {
           const entityValues = await this.nlpValueService.find({
             entity: nlpEntity.id,
           });
 
           const valueExists = entityValues.some(
             (v) => v.value === entity.value,
           );
 
+          // Filter out trait values that user hasn't defined
           if (!valueExists) {
             return null;
           }
         }
 
+        // Keep entities that exist in config and have valid values
         return entity;
       }),
     );
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0108036 and 1df9409.

📒 Files selected for processing (2)
  • api/src/nlp/controllers/nlp-sample.controller.spec.ts (1 hunks)
  • api/src/nlp/controllers/nlp-sample.controller.ts (4 hunks)

Comment on lines +224 to +253
    const filteredEntities = await Promise.all(
      prediction.entities.map(async (entity) => {
        const nlpEntity = nlpMap.get(entity.entity);

        // If entity doesn't exist in user's configuration, exclude it
        if (!nlpEntity) {
          return null;
        }

        // For trait entities (like intent), check if the value exists
        if (nlpEntity.lookups?.includes('trait')) {
          // Get all values for this entity
          const entityValues = await this.nlpValueService.find({
            entity: nlpEntity.id,
          });

          // Check if the predicted value exists in user-defined values
          const valueExists = entityValues.some(
            (v) => v.value === entity.value,
          );

          // If value doesn't exist, exclude this entity
          if (!valueExists) {
            return null;
          }
        }

        return entity;
      }),
    );

⚠️ Potential issue | 🟠 Major

Potential N+1 query performance issue in entity filtering.

The filtering logic calls this.nlpValueService.find({ entity: nlpEntity.id }) inside a Promise.all(prediction.entities.map(...)), which means for each trait entity in the prediction, a separate database query is executed. If predictions contain many trait entities, this could cause performance degradation.

Consider optimizing by:

  1. Pre-fetching all entity values in a single query before the loop
  2. Building a Map<entityId, Set<value>> for O(1) lookups

Apply this optimization:

     const nlpMap = await this.nlpEntityService.getNlpMap();
+
+    // Pre-fetch all values for trait entities in one query
+    const traitEntityIds = Array.from(nlpMap.values())
+      .filter((entity) => entity.lookups?.includes('trait'))
+      .map((entity) => entity.id);
+    
+    const allEntityValues = traitEntityIds.length > 0
+      ? await this.nlpValueService.find({
+          entity: { $in: traitEntityIds },
+        })
+      : [];
+    
+    // Build a lookup map: entityId -> Set of valid values
+    const entityValueMap = new Map<string, Set<string>>();
+    for (const value of allEntityValues) {
+      if (!entityValueMap.has(value.entity)) {
+        entityValueMap.set(value.entity, new Set());
+      }
+      entityValueMap.get(value.entity)!.add(value.value);
+    }
 
     const filteredEntities = await Promise.all(
       prediction.entities.map(async (entity) => {
         const nlpEntity = nlpMap.get(entity.entity);
 
         if (!nlpEntity) {
           return null;
         }
 
         if (nlpEntity.lookups?.includes('trait')) {
-          const entityValues = await this.nlpValueService.find({
-            entity: nlpEntity.id,
-          });
-
-          const valueExists = entityValues.some(
-            (v) => v.value === entity.value,
-          );
+          const validValues = entityValueMap.get(nlpEntity.id);
+          const valueExists = validValues?.has(entity.value) ?? false;
 
           if (!valueExists) {
             return null;
           }
         }
 
         return entity;
       }),
     );
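
The batching idea in this suggestion can be tried out in isolation. The sketch below is an assumption-laden stand-in, not the service code: `ValueRecord`, the in-memory `store`, `findValuesByEntityIds`, `groupValues`, and `buildLookup` are hypothetical names, and the query counter only simulates database round trips.

```typescript
// Hypothetical record shape; the real value documents may differ.
interface ValueRecord {
  entity: string; // id of the owning entity
  value: string;
}

// In-memory stand-in for the value store that counts simulated queries.
const store: ValueRecord[] = [
  { entity: 'e1', value: 'greeting' },
  { entity: 'e1', value: 'goodbye' },
  { entity: 'e2', value: 'en' },
];
let queryCount = 0;

// Simulates a single "entity in [...]" query.
async function findValuesByEntityIds(ids: string[]): Promise<ValueRecord[]> {
  queryCount += 1;
  return store.filter((record) => ids.includes(record.entity));
}

// Groups value records into entityId -> Set of values.
function groupValues(records: ValueRecord[]): Map<string, Set<string>> {
  const grouped = new Map<string, Set<string>>();
  for (const { entity, value } of records) {
    const values = grouped.get(entity) ?? new Set<string>();
    values.add(value);
    grouped.set(entity, values);
  }
  return grouped;
}

// One query for all entities, then constant-time membership checks.
async function buildLookup(ids: string[]): Promise<Map<string, Set<string>>> {
  return groupValues(await findValuesByEntityIds(ids));
}

const lookupReady = buildLookup(['e1', 'e2']).then((lookup) => {
  console.log(lookup.get('e1')?.has('greeting'), queryCount);
  return lookup;
});
```

Once the lookup is built up front, the per-entity check no longer awaits anything, so the `async` callback and `Promise.all` wrapper could likely be replaced by a plain synchronous `filter`; that simplification is an observation here, not part of the posted suggestion.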
🤖 Prompt for AI Agents
In api/src/nlp/controllers/nlp-sample.controller.ts around lines 224 to 253, the
current loop issues a DB query per trait entity causing an N+1 problem; instead,
before the Promise.all loop collect all nlpEntity ids that have lookups
including 'trait', fetch all values for those entities in a single call (e.g.,
this.nlpValueService.find({ entity: { $in: [...] } })), build a Map<entityId,
Set<value>> for O(1) membership checks, and then inside the mapping use the
precomputed Map to decide whether to return the entity or null without any
additional DB calls.


Development

Successfully merging this pull request may close these issues.

🤔 [ISSUE] - Frontend UI Displays Confidence Scores for Ludwig NLU Inferred Intents, Even When User Intents Are Undefined

1 participant