Conference proceeding
Calibrating Large Language Models with Sample Consistency
Proceedings of the ... AAAI Conference on Artificial Intelligence, v 39(18), pp 19260-19268
11 Apr 2025
Abstract
Accurately gauging the confidence level of Large Language Models' (LLMs) predictions is pivotal for their reliable application. However, LLMs are often inherently uncalibrated and elude conventional calibration techniques due to their proprietary nature and massive scale. In this work, we derive model confidence from the distribution of multiple randomly sampled generations, using three measures of consistency. We extensively evaluate eleven open and closed-source models on nine reasoning datasets. Results show that consistency-based calibration methods outperform existing post-hoc approaches in terms of calibration error. Meanwhile, we find that factors such as intermediate explanations, model scaling, and larger sample sizes enhance calibration, while instruction-tuning makes calibration more difficult. Moreover, confidence scores obtained from consistency can potentially enhance model performance. Finally, we offer guidance on choosing suitable consistency metrics for calibration, tailored to model characteristics such as the exposure to instruction-tuning and RLHF.
Code - https://github.com/veronica320/Calibrating-LLMs-with-Consistency
Extended version - https://arxiv.org/abs/2402.13904
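The abstract describes deriving confidence from how consistently a model answers across several sampled generations, then judging that confidence by calibration error. The following Python sketch illustrates the idea under stated assumptions; it is not the authors' implementation (see the linked repository for that). The function names, the two consistency measures shown (agreement with the majority answer, and one minus normalized entropy), and the equal-width-bin expected calibration error are illustrative choices that may differ from the definitions used in the paper.

```python
from collections import Counter
import math


def agreement_confidence(answers):
    """Return (majority answer, fraction of samples agreeing with it).

    Assumption: `answers` holds the final answers extracted from
    several sampled generations for a single question.
    """
    counts = Counter(answers)
    top_answer, top_count = counts.most_common(1)[0]
    return top_answer, top_count / len(answers)


def entropy_confidence(answers):
    """Return 1 minus the normalized entropy of the answer distribution.

    Assumption: entropy is normalized by the log of the number of
    distinct answers, so the score lies in [0, 1].
    """
    counts = Counter(answers)
    if len(counts) == 1:
        return 1.0  # all samples agree
    n = len(answers)
    probs = [c / n for c in counts.values()]
    entropy = -sum(p * math.log(p) for p in probs)
    return 1.0 - entropy / math.log(len(counts))


def expected_calibration_error(confidences, correct, n_bins=10):
    """Expected calibration error with equal-width confidence bins."""
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[idx].append((conf, ok))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += len(members) / n * abs(avg_conf - accuracy)
    return ece


if __name__ == "__main__":
    samples = ["4", "4", "4", "5"]  # hypothetical sampled answers
    print(agreement_confidence(samples))
    print(entropy_confidence(samples))
```

A lower expected calibration error means the confidence scores track empirical accuracy more closely, which is the sense in which the abstract compares consistency-based methods against post-hoc approaches.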
Details
- Title
- Calibrating Large Language Models with Sample Consistency
- Creators
- Qing Lyu - University of Pennsylvania
- Kumar Shridhar - Board of the Swiss Federal Institutes of Technology
- Chaitanya Malaviya - University of Pennsylvania
- Li Zhang - University of Pennsylvania
- Yanai Elazar - Allen Institute
- Niket Tandon - Allen Institute
- Marianna Apidianaki - University of Pennsylvania
- Mrinmaya Sachan - Board of the Swiss Federal Institutes of Technology
- Chris Callison-Burch - University of Pennsylvania
- Contributors
- T Walsh (Editor)
- J Shah (Editor)
- Z Kolter (Editor)
- Publication Details
- Proceedings of the ... AAAI Conference on Artificial Intelligence, v 39(18), pp 19260-19268
- Series
- AAAI Conference on Artificial Intelligence
- Publisher
- Association for the Advancement of Artificial Intelligence
- Number of pages
- 9
- Grant note
- 2022-22072200005 / Office of the Director of National Intelligence (ODNI), Intelligence Advanced Research Projects Activity (IARPA), via the HIATUS Program
- HR00112520300 / Defense Advanced Research Projects Agency's (DARPA) SciFy program; United States Department of Defense; Defense Advanced Research Projects Agency (DARPA)
- Resource Type
- Conference proceeding
- Language
- English
- Academic Unit
- Computer Science
- Web of Science ID
- WOS:001477525800069
- Scopus ID
- 2-s2.0-105003908255
- Other Identifier
- 991022123354804721
InCites Highlights
Data related to this publication, from the InCites Benchmarking & Analytics tool:
- Collaboration types
- Domestic collaboration
- International collaboration
- Web of Science research areas
- Computer Science, Artificial Intelligence
- Computer Science, Interdisciplinary Applications
- Computer Science, Theory & Methods