
[ENHANCEMENT] Improve prompt consistency to reduce hallucinated and inconsistent LLM outputs#419

Open
Lochit-Vinay wants to merge 11 commits into fireform-core:main from Lochit-Vinay:prompt-consistency

Conversation

@Lochit-Vinay

Closes #418

Context

While working with the extraction pipeline and trying out different inputs, I noticed that the current prompts sometimes lead to slightly inconsistent outputs from the LLM.

In particular, there were cases where:

  • values were inferred even when not clearly present in the input
  • some fields were filled inconsistently
  • the output format was mostly correct but not always strict

The earlier prompt improvements helped structure the output better, but there were still edge cases where the model “over-interprets” the input.

What I changed

This PR focuses on tightening the prompt instructions so the model behaves more predictably.

  • Added stricter rules to explicitly prevent inference or guessing
  • Ensured that missing fields return empty strings instead of assumed values
  • Clarified that no extra fields should be added beyond the defined schema
  • Strengthened the JSON requirement (output must be valid and directly parsable)
  • Added a negative example to show what incorrect output looks like and why
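
To make the changes concrete, here is a minimal sketch of what a prompt builder with these stricter rules could look like. The rule wording, schema fields, and the `build_extraction_prompt` signature shown here are illustrative assumptions, not the exact code in this PR:

```python
# Hypothetical sketch of a stricter extraction prompt builder.
# Rule wording and example schema fields are assumptions for illustration.

STRICT_RULES = """Rules:
1. Extract ONLY values explicitly present in the input text. Never infer or guess.
2. If a field is missing from the input, return an empty string "" for it.
3. Do not add any fields beyond the defined schema.
4. Output MUST be valid JSON, directly parsable, with no surrounding prose."""

NEGATIVE_EXAMPLE = """Incorrect output (do NOT produce this):
{"name": "John Doe", "email": "john.doe@example.com"}
If the email was not in the input, it must be "" rather than a guessed value."""

def build_extraction_prompt(input_text: str, schema_fields: list[str]) -> str:
    """Assemble an extraction prompt that forbids inference and extra fields."""
    field_list = ", ".join(f'"{f}"' for f in schema_fields)
    return (
        f"Extract the following fields as JSON: {field_list}\n\n"
        f"{STRICT_RULES}\n\n"
        f"{NEGATIVE_EXAMPLE}\n\n"
        f"Input:\n{input_text}"
    )
```

The negative example is placed after the rules so the model sees both what is forbidden and why, which tends to reinforce the "empty string over guessed value" behavior.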

Why this helps

These changes make the extraction more deterministic and reduce cases where the model tries to be “helpful” by adding or modifying information.

This should improve:

  • consistency of extracted data
  • reliability of downstream processing
  • alignment with the expected schema

Testing

Because the database setup in this repository is not fully configured, I wasn’t able to run a complete end-to-end API test.
Instead, I validated the prompt generation directly:

  • Ran the build_extraction_prompt function with different sample inputs

Verified that:

  • strict constraints are clearly included in the prompt
  • negative examples are present and correctly formatted
  • input text is injected properly into the final prompt
  • overall structure enforces non-inferential extraction

This ensures that the LLM receives clearer and stricter instructions, which helps reduce hallucination and improves consistency in the output.
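
The same constraints could also be enforced defensively on the model's side of the contract. As a sketch, a small validator like the following would reject outputs that violate the schema rules the prompt states (the `validate_extraction` helper is hypothetical, not part of this PR):

```python
import json

def validate_extraction(raw_output: str, schema_fields: list[str]) -> dict:
    """Hypothetical check mirroring the prompt rules: output must be valid
    JSON, contain no fields outside the schema, and use empty strings for
    any schema fields the model omitted."""
    data = json.loads(raw_output)  # raises ValueError if not directly parsable
    extra = set(data) - set(schema_fields)
    if extra:
        raise ValueError(f"fields outside schema: {sorted(extra)}")
    # Backfill omitted schema fields with empty strings, never guessed values.
    return {f: data.get(f, "") for f in schema_fields}
```

For example, `validate_extraction('{"name": "Alice"}', ["name", "email"])` would return `{"name": "Alice", "email": ""}`, while an output containing a field outside the schema would be rejected.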


@Lochit-Vinay

Happy to refine the prompt further if there are specific edge cases or scenarios you'd like me to cover.

@Lochit-Vinay

Resolved merge conflicts with latest main.

Also ensured:

  • LLM loop aligns with refactored prompt builder
  • No duplicate prompt construction logic remains
  • Code paths remain consistent with earlier refactor PRs

Happy to make further refinements if needed.
