VER-226: [Backend] Improve Confidence Score Handling in Analysis Prompt #201

@nhphong

Description

We want the LLM to provide more reliable confidence scores in its analysis, so that each score accurately reflects the likelihood that the content contains false or misleading claims.

Acceptance Criteria:

  1. Revise Confidence Scoring Framework:
    • Update the prompt to include a detailed explanation of what the confidence score should represent, focusing on the degree of certainty that content contains demonstrably false or misleading claims.
    • Include a clear scoring framework with distinct categories and examples:
      • High (80-100): For content with specific false claims.
      • Medium (40-79): For misleading claims.
      • Low (1-39): For controversial opinions without false factual claims.
      • Zero (0): For content with no false or misleading claims.
  2. Require Evidence for Non-Zero Scores:
    • Ensure the prompt requires identification and explanation of specific false claims along with credible sources for any non-zero confidence score.
  3. Self-Validation Check:
    • Add a self-validation section to the prompt where the LLM must verify its scores by checking:
      • Identification of false claims.
      • Explanation of why claims are false.
      • Consistency between explanations and scores.
  4. Implement Self-Review and Validation Process:
    • Introduce a structured self-review process within Stage 3, where the LLM validates confidence scores and aligns them with disinformation categories.
    • Include specific verification questions to ensure consistency and defensibility of the scores.
  5. Consider Enhancements to Stage 1:
    • Explore adding self-reflection instructions to the Stage 1 prompt for better initial analysis, though confidence scoring is not part of this stage.
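
As a rough illustration of criteria 1 to 3, the score bands and the evidence requirement could also be checked in code after the LLM responds. The following is a minimal sketch only: the `Claim` structure, the function names `score_band` and `validate_analysis`, and the assumed output shape are all hypothetical and not part of the existing pipeline. It applies criterion 2 literally, so every non-zero score (including the Low band) must come with at least one identified claim, an explanation, and a source.

```python
from dataclasses import dataclass, field


@dataclass
class Claim:
    # Hypothetical shape of one flagged claim in the model's output.
    text: str
    explanation: str
    sources: list[str] = field(default_factory=list)


def score_band(score: int) -> str:
    """Map a 0-100 confidence score to the band names used in this issue."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    if score == 0:
        return "zero"
    if score <= 39:
        return "low"
    if score <= 79:
        return "medium"
    return "high"


def validate_analysis(score: int, claims: list[Claim]) -> list[str]:
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    if score_band(score) == "zero":
        # A zero score should not come with flagged claims.
        if claims:
            problems.append("score is 0 but claims were flagged")
        return problems
    # Any non-zero score needs at least one claim with evidence.
    if not claims:
        problems.append("non-zero score without any identified claim")
    for claim in claims:
        if not claim.explanation.strip():
            problems.append(f"missing explanation: {claim.text!r}")
        if not claim.sources:
            problems.append(f"missing source: {claim.text!r}")
    return problems
```

A check like this would not replace the in-prompt self-validation; it could serve as a backstop that flags responses whose scores and explanations are inconsistent, for example when testing the updated prompts.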

Tasks:

  • Revise the confidence scoring section of the Stage 3 prompt with the proposed changes.
  • Implement the self-validation check and self-review process in Stage 3.
  • Discuss and decide on potential modifications to the Stage 1 prompt for self-reflection enhancements.
  • Test the updated prompts to ensure they produce expected results and adjust as necessary.

Notes:

  • This update addresses unreliable confidence scores by giving the LLM clearer instructions and explicit validation steps.
  • The changes are based on suggestions and discussions from the Slack conversation, focusing on improving the accuracy and reliability of confidence scores.
