Self-reported confidence scoring of LLMs to gastroenterology board exam self-assessment questions

This repository contains the code for extracting and analyzing the self-reported confidence scores of LLMs. The code for generating the answers and computing their accuracy is available in the VLM-LLM-in-Gastroenterology repository.
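
To illustrate what "extraction" of a self-reported confidence score can look like, here is a minimal sketch. It is not taken from this repository's pipeline. The function name `extract_confidence`, the regular expression, and the 0 to 10 scale are all assumptions made for the example; the actual prompts and output formats may differ.

```python
import re
from typing import Optional

# Hypothetical pattern (an assumption, not the repo's format): matches
# "Confidence: 8", "confidence score = 7.5", or "Confidence: 9/10".
_CONFIDENCE_RE = re.compile(
    r"confidence(?:\s+score)?\s*[:=]\s*(\d+(?:\.\d+)?)",
    re.IGNORECASE,
)


def extract_confidence(
    response: str, low: float = 0.0, high: float = 10.0
) -> Optional[float]:
    """Return the first self-reported confidence found in `response`, or None."""
    match = _CONFIDENCE_RE.search(response)
    if match is None:
        return None
    score = float(match.group(1))
    # Values outside the assumed scale are discarded rather than clamped.
    return score if low <= score <= high else None


if __name__ == "__main__":
    print(extract_confidence("Answer: B\nConfidence: 8"))
    print(extract_confidence("Answer: C"))
```

Returning `None` for missing or out-of-range values keeps unparseable responses visible in later analysis instead of silently turning them into a number.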

The preprint of the paper is available on arXiv: https://arxiv.org/abs/2503.18562

Team:

Nariman Naderi, Seyed Amir Ahmad Safavi-Naini, Thomas Savage, Ali Soroush

If you use this code or data in your research, please cite our paper:

@misc{naderi2025selfreportedconfidencelargelanguage,
      title={Self-Reported Confidence of Large Language Models in Gastroenterology: Analysis of Commercial, Open-Source, and Quantized Models}, 
      author={Nariman Naderi and Seyed Amir Ahmad Safavi-Naini and Thomas Savage and Zahra Atf and Peter Lewis and Girish Nadkarni and Ali Soroush},
      year={2025},
      eprint={2503.18562},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2503.18562}, 
}

Repository contents:

0_data
1_confidence extraction pipeline
2_analysis_and_figures
LICENSE
README.md
