Fix ContextRelevancy score #159

Merged
Qard merged 1 commit into main from fix-context-relevancy-scoring on Jan 13, 2026

@Qard
Contributor

@Qard Qard commented Jan 12, 2026

Fixes #80

@Qard Qard requested review from ankrgyl and ibolmo January 12, 2026 22:16
@Qard Qard self-assigned this Jan 12, 2026
@Qard Qard added the bug Something isn't working label Jan 12, 2026
@github-actions
github-actions bot commented Jan 12, 2026

Braintrust eval report

Autoevals (fix-context-relevancy-scoring-1768258918)

| Score | Average | Improvements | Regressions |
| --- | --- | --- | --- |
| NumericDiff | 73.8% (+2pp) | 7 🟢 | 1 🔴 |
| Time_to_first_token | 1.48tok (+0.1tok) | 16 🟢 | 102 🔴 |
| Llm_calls | 1.55 (+0) | - | - |
| Tool_calls | 0 (+0) | - | - |
| Errors | 0 (+0) | - | - |
| Llm_errors | 0 (+0) | - | - |
| Tool_errors | 0 (+0) | - | - |
| Prompt_tokens | 279.25tok (+0tok) | - | - |
| Prompt_cached_tokens | 0tok (+0tok) | - | - |
| Prompt_cache_creation_tokens | 0tok (+0tok) | - | - |
| Completion_tokens | 19.3tok (+0tok) | - | - |
| Completion_reasoning_tokens | 0tok (+0tok) | - | - |
| Total_tokens | 298.54tok (+0tok) | - | - |
| Estimated_cost | 0$ (+0$) | - | - |
| Duration | 1.5s (-1.27s) | 109 🟢 | 110 🔴 |
| Llm_duration | 3.04s (+0.17s) | 14 🟢 | 105 🔴 |

@github-actions

Braintrust eval report

Autoevals (fix-context-relevancy-scoring-1768256188)

| Score | Average | Improvements | Regressions |
| --- | --- | --- | --- |
| NumericDiff | 73.8% (+0pp) | 2 🟢 | - |
| Time_to_first_token | 1.49tok (+0.1tok) | 15 🟢 | 103 🔴 |
| Llm_calls | 1.55 (+0) | - | - |
| Tool_calls | 0 (+0) | - | - |
| Errors | 0 (+0) | - | - |
| Llm_errors | 0 (+0) | - | - |
| Tool_errors | 0 (+0) | - | - |
| Prompt_tokens | 279.25tok (+0tok) | - | - |
| Prompt_cached_tokens | 0tok (+0tok) | - | - |
| Prompt_cache_creation_tokens | 0tok (+0tok) | - | - |
| Completion_tokens | 19.3tok (+0tok) | - | - |
| Completion_reasoning_tokens | 0tok (+0tok) | - | - |
| Total_tokens | 298.54tok (+0tok) | - | - |
| Estimated_cost | 0$ (0$) | 1 🟢 | - |
| Duration | 1.5s (+0.04s) | 60 🟢 | 159 🔴 |
| Llm_duration | 3.07s (+0.17s) | 10 🟢 | 109 🔴 |

@Qard Qard force-pushed the fix-context-relevancy-scoring branch from 18fe998 to 1c6bf5b January 12, 2026 22:18
@Qard Qard merged commit 0e5793b into main Jan 13, 2026
7 checks passed
@Qard Qard deleted the fix-context-relevancy-scoring branch January 13, 2026 18:20
@github-actions
github-actions bot commented Jan 13, 2026

Braintrust eval report

Autoevals (main-1768328420)

| Score | Average | Improvements | Regressions |
| --- | --- | --- | --- |
| NumericDiff | 72.9% (0pp) | 3 🟢 | 2 🔴 |
| Time_to_first_token | 1.32tok (-0.08tok) | 97 🟢 | 22 🔴 |
| Llm_calls | 1.55 (+0) | - | - |
| Tool_calls | 0 (+0) | - | - |
| Errors | 0 (+0) | - | - |
| Llm_errors | 0 (+0) | - | - |
| Tool_errors | 0 (+0) | - | - |
| Prompt_tokens | 279.25tok (+0tok) | - | - |
| Prompt_cached_tokens | 0tok (+0tok) | - | - |
| Prompt_cache_creation_tokens | 0tok (+0tok) | - | - |
| Completion_tokens | 19.3tok (+0tok) | - | - |
| Completion_reasoning_tokens | 0tok (+0tok) | - | - |
| Total_tokens | 298.54tok (+0tok) | - | - |
| Estimated_cost | 0$ (+0$) | - | - |
| Duration | 3.22s (+0.2s) | 142 🟢 | 77 🔴 |
| Llm_duration | 2.72s (-0.19s) | 106 🟢 | 13 🔴 |

Development

Successfully merging this pull request may close these issues.

Context Relevancy issue with score not between 0 and 1.

2 participants