Created solution Filter Inferred Intent Confidence Scores #1409

dushyantzz · 2025-10-02T10:35:52Z

Fixes #738

Solution: Filter Inferred Intent Confidence Scores

Problem

The frontend displayed intent confidence scores for intents that users never defined. Ludwig NLU infers intents from its training data, causing confusion when high confidence scores appear for non-existent user intents.

Example: User enters "Bonsoir" → Ludwig returns greetings_goodevening with 99.68% confidence → User never created this intent → Confusion!

Solution

Implemented server-side filtering in the /nlpsample/message endpoint to return only user-defined entities and values.

What Changed

Modified: api/src/nlp/controllers/nlp-sample.controller.ts

- Added NlpValueService dependency injection
- Implemented filtering logic in the message() endpoint that:
  - Retrieves the user's NLP configuration
  - Validates that each predicted entity exists in the user's config
  - For trait entities (like intent), validates that the predicted value exists
  - Filters out non-user-defined entities/values

Added Tests: api/src/nlp/controllers/nlp-sample.controller.spec.ts

Test for filtering non-user-defined intent values
Test for preserving user-defined entities

How It Works

User Input → Ludwig NLU → Raw Prediction → Backend Filter → Only User-Defined → Frontend

Filtering Process:

1. Get the user's NLP entity configuration (cached)
2. For each predicted entity:
   - Check if the entity exists in the user's config
   - For trait entities: validate that the value exists
   - Filter out if not found
3. Return only validated entities
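
The filtering process above can be sketched in isolation as a pure function. This is a minimal illustration, not the controller code: the `PredictedEntity` and `UserEntity` shapes, the `filterPrediction` name, and the in-memory maps are assumptions made for the example, and the language confidence value is invented for illustration.

```typescript
// Hypothetical, simplified shapes; the real types carry more fields.
interface PredictedEntity {
  entity: string;
  value: string;
  confidence: number;
}

interface UserEntity {
  id: string;
  lookups: string[];
}

// Keep a predicted entity only if the user defined it and, for trait
// entities, only if the predicted value is one the user defined.
function filterPrediction(
  predicted: PredictedEntity[],
  userEntities: Map<string, UserEntity>,
  userValues: Map<string, Set<string>>, // entity id -> user-defined values
): PredictedEntity[] {
  return predicted.filter((p) => {
    const userEntity = userEntities.get(p.entity);
    if (!userEntity) {
      return false; // entity is not in the user's configuration
    }
    if (userEntity.lookups.includes('trait')) {
      // Trait entities (e.g. intent) must also match a user-defined value.
      return userValues.get(userEntity.id)?.has(p.value) ?? false;
    }
    return true; // non-trait entities pass once the entity itself is known
  });
}

// Assumed configuration: the user created only the "greeting" intent.
const userEntities = new Map<string, UserEntity>([
  ['intent', { id: 'e1', lookups: ['trait'] }],
  ['language', { id: 'e2', lookups: ['trait'] }],
]);
const userValues = new Map<string, Set<string>>([
  ['e1', new Set(['greeting'])],
  ['e2', new Set(['en', 'fr'])],
]);

// "Bonsoir": the inferred intent is dropped, the language is kept.
const bonsoir = filterPrediction(
  [
    { entity: 'intent', value: 'greetings_goodevening', confidence: 0.9968 },
    { entity: 'language', value: 'fr', confidence: 0.99 },
  ],
  userEntities,
  userValues,
);
console.log(bonsoir.map((e) => `${e.entity}=${e.value}`));
```

Keeping the rule in a pure function like this would also make it testable without a database, though the PR itself implements the logic inline in the controller.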

Impact

Before ❌

Shows confidence for intents user never created
Misleading high confidence scores
Users confused about intent origin

After ✅

Shows only user-defined intents
Accurate representation of user's configuration
Clear, honest UX

Testing

Run tests:

cd api
npm test -- nlp-sample.controller.spec.ts

Manual test:

Create intent value greeting
Enter "Bonsoir" → Only language shown (no inferred intent)
Enter "Hello" → Your greeting intent shown with confidence ✅

Files Modified

api/src/nlp/controllers/nlp-sample.controller.ts (~45 lines)
api/src/nlp/controllers/nlp-sample.controller.spec.ts (~90 lines)

Benefits

✅ Eliminates user confusion
✅ Accurate confidence scores
✅ Backward compatible
✅ Well-tested
✅ Entity configuration is read from the cached NLP map

Summary

This fix filters out Ludwig NLU's inferred intents that don't exist in the user's configuration. Users now see confidence scores only for intents they've explicitly defined, providing an accurate representation of their chatbot's NLP capabilities.

Summary by CodeRabbit

Bug Fixes
- NLP message responses now include only user-configured intents and entities, excluding inferred or unsupported items.
- Trait-like entities are validated against allowed values to prevent incorrect matches.
- Improves accuracy and consistency of detected entities in responses.
Tests
- Added comprehensive tests to verify filtering behavior for mixed and fully user-defined entities, ensuring correct propagation and validation.

coderabbitai · 2025-10-02T10:35:59Z

Walkthrough

Adds runtime filtering in NlpSampleController.message to return only user-defined entities and validated trait values, introducing NlpValueService dependency. Updates constructor accordingly. Adds tests that mock predictions to verify inferred intents are excluded and user-defined entities/values are preserved.

Changes

Cohort / File(s)	Summary
Controller logic and dependency `api/src/nlp/controllers/nlp-sample.controller.ts`	Filters NLU entities against user-defined mappings; validates trait values via `NlpValueService`; updates constructor to inject `NlpValueService`; returns only non-null filtered entities.
Unit tests for filtering behavior `api/src/nlp/controllers/nlp-sample.controller.spec.ts`	Adds tests for message endpoint to ensure inferred intents are excluded while preserving user-defined entities; covers scenarios with mixed and all user-defined entities; mocks helper predictions.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant C as Client
  participant Cn as NlpSampleController
  participant H as NlpHelper
  participant M as NlpMappingStore
  participant V as NlpValueService

  C->>Cn: request /nlpsample/message (text)
  Cn->>H: predict(text)
  H-->>Cn: entities (intent, traits, others)

  Cn->>M: fetch user-defined NLP mappings
  M-->>Cn: mappings (intents, traits, values)

  rect rgba(200,230,255,0.3)
  note right of Cn: Filter phase
  loop for each entity
    alt trait entity (e.g., intent)
      Cn->>V: get allowed values for trait
      V-->>Cn: allowed values
      Cn-->>Cn: keep only if value in allowed set
    else other entity
      Cn-->>Cn: keep only if entity defined by user
    end
  end
  end

  Cn-->>C: filtered entities (non-null only)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

I twitch my whiskers, sift the streams,
From weeds of guess to gardened themes.
Intent that wandered? I let it go—
Only planted seeds may grow.
With traits confirmed, I thump in cheer:
Clean carrots of data, crisp and clear. 🥕

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Description Check	⚠️ Warning	The pull request description does not follow the repository’s template headings and structure: it lacks the required “Motivation” header, a “Type of change” section with the appropriate checkboxes, and the “Checklist” section is entirely missing despite detailed other sections being present.	Update the description to match the template by adding a “# Motivation” section that summarizes the change and fixes, a “# Type of change” section with the relevant checkbox marked, and the “# Checklist” section with each prerequisite item reviewed and completed.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title Check	✅ Passed	The title references filtering inferred intent confidence scores, which aligns with the main change, but the leading phrase “Created solution” is extraneous and reduces clarity.
Linked Issues Check	✅ Passed	The controller changes implement server-side filtering of NLP predictions to include only user-defined intents and values, and the added tests validate non-user-defined intents are excluded and user-defined entities are preserved, directly addressing the requirements of issue #738 to prevent displaying confidence for undefined intents and ensure API responses only surface configured values.
Out of Scope Changes Check	✅ Passed	All modifications are confined to the NLP sample controller and its spec to implement and test filtering logic for user-defined intents and values, with no unrelated files or features altered outside the scope of the linked issue’s objectives.
Docstring Coverage	✅ Passed	No functions found in the changes. Docstring coverage check skipped.


coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (4)

api/src/nlp/controllers/nlp-sample.controller.spec.ts (2)
458-518: Consider cleanup after test data creation.

The test creates a new NLP value (greeting) in the database. If this data persists across tests, it could cause test pollution or flakiness. Consider:

Moving this setup to a dedicated beforeEach or beforeAll hook with corresponding cleanup

Or ensuring the test suite cleans up created entities in afterEach

Example cleanup pattern:
+  let createdValueId: string | undefined;
+
   it('should filter out intent values that are not user-defined', async () => {
     const intentEntity = await nlpEntityService.findOne({ name: 'intent' });
-    await nlpValueService.create({
+    const createdValue = await nlpValueService.create({
       entity: intentEntity!.id,
       value: 'greeting',
       expressions: [],
     });
+    createdValueId = createdValue.id;

     // ... rest of test
   });
+
+  afterEach(async () => {
+    // Remove the value created by the test so it cannot leak into other tests
+    if (createdValueId) {
+      await nlpValueService.deleteOne(createdValueId);
+      createdValueId = undefined;
+    }
+  });
520-545: Strengthen test assertions with content verification.

The test only verifies result.entities.length === 2 but doesn't validate the actual entity content. This makes the test less robust and harder to debug if it fails.

Apply this diff to add content assertions:
     const result = await nlpSampleController.message('Hello');
 
     expect(result.entities).toHaveLength(2);
+    expect(result.entities).toEqual(
+      expect.arrayContaining([
+        expect.objectContaining({
+          entity: 'intent',
+          value: 'greeting',
+        }),
+        expect.objectContaining({
+          entity: 'language',
+          value: 'en',
+        }),
+      ]),
+    );
   });
api/src/nlp/controllers/nlp-sample.controller.ts (2)
256-258: Return type could be strengthened with explicit typing.

The return statement filters out null values but TypeScript may not infer the narrowed type correctly. Consider adding an explicit type assertion or type guard for clarity.
     return {
-      entities: filteredEntities.filter((e) => e !== null),
+      entities: filteredEntities.filter((e): e is NonNullable<typeof e> => e !== null),
     };
This ensures the returned entities array has the correct non-nullable type.
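
To illustrate the narrowing in isolation, here is a small self-contained sketch. The `Entity` type and the sample data are assumptions for the example only. Note that TypeScript 5.5 and later can infer a type predicate for a simple `e !== null` arrow function, so the explicit annotation mainly matters on older compiler versions or for readability.

```typescript
// Hypothetical minimal entity shape for the example.
type Entity = { entity: string; value: string };

const mixed: (Entity | null)[] = [
  { entity: 'language', value: 'en' },
  null,
  { entity: 'intent', value: 'greeting' },
];

// The explicit type predicate tells the compiler that every element
// that passes the filter is an Entity, so the result is Entity[].
const nonNull: Entity[] = mixed.filter(
  (e): e is Entity => e !== null,
);

// No null check is needed when reading properties afterwards.
console.log(nonNull.map((e) => e.value));
```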

216-259: Consider documenting the filtering behavior.

The filtering logic has several implicit behaviors that might not be obvious to future maintainers:

Entities not in nlpMap are silently excluded

Trait entities with invalid values are silently excluded

Non-trait entities pass through if they exist in nlpMap

Consider adding inline comments to clarify the filtering rules:
     const filteredEntities = await Promise.all(
       prediction.entities.map(async (entity) => {
         const nlpEntity = nlpMap.get(entity.entity);
 
+        // Filter out entities that don't exist in user's configuration
         if (!nlpEntity) {
           return null;
         }
 
+        // For trait entities (e.g., intent), validate that the value is user-defined
         if (nlpEntity.lookups?.includes('trait')) {
           const entityValues = await this.nlpValueService.find({
             entity: nlpEntity.id,
           });
 
           const valueExists = entityValues.some(
             (v) => v.value === entity.value,
           );
 
+          // Filter out trait values that user hasn't defined
           if (!valueExists) {
             return null;
           }
         }
 
+        // Keep entities that exist in config and have valid values
         return entity;
       }),
     );

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0108036 and 1df9409.

📒 Files selected for processing (2)

api/src/nlp/controllers/nlp-sample.controller.spec.ts (1 hunks)
api/src/nlp/controllers/nlp-sample.controller.ts (4 hunks)

coderabbitai · 2025-10-02T10:39:09Z

api/src/nlp/controllers/nlp-sample.controller.ts

+    const filteredEntities = await Promise.all(
+      prediction.entities.map(async (entity) => {
+        const nlpEntity = nlpMap.get(entity.entity);
+
+        // If entity doesn't exist in user's configuration, exclude it
+        if (!nlpEntity) {
+          return null;
+        }
+
+        // For trait entities (like intent), check if the value exists
+        if (nlpEntity.lookups?.includes('trait')) {
+          // Get all values for this entity
+          const entityValues = await this.nlpValueService.find({
+            entity: nlpEntity.id,
+          });
+
+          // Check if the predicted value exists in user-defined values
+          const valueExists = entityValues.some(
+            (v) => v.value === entity.value,
+          );
+
+          // If value doesn't exist, exclude this entity
+          if (!valueExists) {
+            return null;
+          }
+        }
+
+        return entity;
+      }),
+    );


⚠️ Potential issue | 🟠 Major

Potential N+1 query performance issue in entity filtering.

The filtering logic calls this.nlpValueService.find({ entity: nlpEntity.id }) inside a Promise.all(prediction.entities.map(...)), which means for each trait entity in the prediction, a separate database query is executed. If predictions contain many trait entities, this could cause performance degradation.

Consider optimizing by:

Pre-fetching all entity values in a single query before the loop

Building a Map<entityId, Set<value>> for O(1) lookups

Apply this optimization:

     const nlpMap = await this.nlpEntityService.getNlpMap();
+
+    // Pre-fetch all values for trait entities in one query
+    const traitEntityIds = Array.from(nlpMap.values())
+      .filter((entity) => entity.lookups?.includes('trait'))
+      .map((entity) => entity.id);
+
+    const allEntityValues = traitEntityIds.length > 0
+      ? await this.nlpValueService.find({
+          entity: { $in: traitEntityIds },
+        })
+      : [];
+
+    // Build a lookup map: entityId -> Set of valid values
+    const entityValueMap = new Map<string, Set<string>>();
+    for (const value of allEntityValues) {
+      if (!entityValueMap.has(value.entity)) {
+        entityValueMap.set(value.entity, new Set());
+      }
+      entityValueMap.get(value.entity)!.add(value.value);
+    }

     const filteredEntities = await Promise.all(
       prediction.entities.map(async (entity) => {
         const nlpEntity = nlpMap.get(entity.entity);

         if (!nlpEntity) {
           return null;
         }

         if (nlpEntity.lookups?.includes('trait')) {
-          const entityValues = await this.nlpValueService.find({
-            entity: nlpEntity.id,
-          });
-
-          const valueExists = entityValues.some(
-            (v) => v.value === entity.value,
-          );
+          const validValues = entityValueMap.get(nlpEntity.id);
+          const valueExists = validValues?.has(entity.value) ?? false;

           if (!valueExists) {
             return null;
           }
         }

         return entity;
       }),
     );

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change


    const nlpMap = await this.nlpEntityService.getNlpMap();

    // Pre-fetch all values for trait entities in one query
    const traitEntityIds = Array.from(nlpMap.values())
      .filter((entity) => entity.lookups?.includes('trait'))
      .map((entity) => entity.id);

    const allEntityValues = traitEntityIds.length > 0
      ? await this.nlpValueService.find({
          entity: { $in: traitEntityIds },
        })
      : [];

    // Build a lookup map: entityId -> Set of valid values
    const entityValueMap = new Map<string, Set<string>>();
    for (const value of allEntityValues) {
      if (!entityValueMap.has(value.entity)) {
        entityValueMap.set(value.entity, new Set());
      }
      entityValueMap.get(value.entity)!.add(value.value);
    }

    const filteredEntities = await Promise.all(
      prediction.entities.map(async (entity) => {
        const nlpEntity = nlpMap.get(entity.entity);

        // If entity doesn't exist in user's configuration, exclude it
        if (!nlpEntity) {
          return null;
        }

        // For trait entities (like intent), check if the value exists
        if (nlpEntity.lookups?.includes('trait')) {
          const validValues = entityValueMap.get(nlpEntity.id);
          const valueExists = validValues?.has(entity.value) ?? false;

          // If value doesn't exist, exclude this entity
          if (!valueExists) {
            return null;
          }
        }

        return entity;
      }),
    );

🤖 Prompt for AI Agents

In api/src/nlp/controllers/nlp-sample.controller.ts around lines 224 to 253, the current loop issues a DB query per trait entity causing an N+1 problem; instead, before the Promise.all loop collect all nlpEntity ids that have lookups including 'trait', fetch all values for those entities in a single call (e.g., this.nlpValueService.find({ entity: { $in: [...] } })), build a Map<entityId, Set<value>> for O(1) membership checks, and then inside the mapping use the precomputed Map to decide whether to return the entity or null without any additional DB calls.
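
The lookup-map step of this suggestion can be tried on its own. The sketch below assumes a hypothetical flat `ValueRow` shape standing in for the result of the single bulk query; `buildValueIndex` and `isDefined` are illustrative names, not part of the codebase.

```typescript
// Hypothetical flat row shape, standing in for the rows of one bulk query.
interface ValueRow {
  entity: string; // id of the owning entity
  value: string;
}

// Group rows into entityId -> Set of values so each membership check is O(1).
function buildValueIndex(rows: ValueRow[]): Map<string, Set<string>> {
  const index = new Map<string, Set<string>>();
  for (const row of rows) {
    let values = index.get(row.entity);
    if (!values) {
      values = new Set<string>();
      index.set(row.entity, values);
    }
    values.add(row.value);
  }
  return index;
}

const valueIndex = buildValueIndex([
  { entity: 'e1', value: 'greeting' },
  { entity: 'e1', value: 'goodbye' },
  { entity: 'e2', value: 'en' },
]);

// Membership check with no further queries; unknown entity ids yield false.
const isDefined = (entityId: string, value: string): boolean =>
  valueIndex.get(entityId)?.has(value) ?? false;

console.log(isDefined('e1', 'greeting'));
console.log(isDefined('e1', 'greetings_goodevening'));
```

The index is built once per request, so the cost of the filter loop no longer depends on how many trait entities appear in the prediction.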

Created solution Filter Inferred Intent Confidence Scores

1df9409

coderabbitai bot reviewed Oct 2, 2025
