This repo contains the code for extracting and analyzing the self-reported confidence scores of LLMs. The code for generating answers and evaluating their accuracy is available at VLM-LLM-in-Gastroenterology.
The preprint of the paper is available on arXiv: https://arxiv.org/abs/2503.18562
Nariman Naderi, Seyed Amir Ahmad Safavi-Naini, Thomas Savage, Ali Soroush
If you use this code or data in your research, please cite our paper:
@misc{naderi2025selfreportedconfidencelargelanguage,
  title={Self-Reported Confidence of Large Language Models in Gastroenterology: Analysis of Commercial, Open-Source, and Quantized Models},
  author={Nariman Naderi and Seyed Amir Ahmad Safavi-Naini and Thomas Savage and Zahra Atf and Peter Lewis and Girish Nadkarni and Ali Soroush},
  year={2025},
  eprint={2503.18562},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2503.18562},
}