Computer Science, Interdisciplinary Applications; Computer Science, Theory & Methods; Science & Technology; Artificial Intelligence or Cybernetics; Computer Science; Technology
Large Language Models (LLMs) show promising results in language generation and instruction following but frequently "hallucinate", making their outputs less reliable. Despite Uncertainty Quantification's (UQ) potential solutions, implementing it accurately within LLMs is challenging. Our research introduces a simple heuristic: not all tokens in auto-regressive LLM text equally represent the underlying meaning, as "linguistic redundancy" often allows a few keywords to convey the essence of long sentences. However, current methods underestimate this inequality when assessing uncertainty, causing tokens with limited semantics to be equally or excessively weighted in UQ. To correct this, we propose Shifting Attention to more Relevant (SAR) components at both token- and sentence-levels for better UQ. We conduct extensive experiments involving a range of popular "off-the-shelf" LLMs, such as Vicuna, WizardLM, and LLaMA-2-chat, with model sizes extending up to 33B parameters. We evaluate various free-form question-answering tasks, encompassing domains such as reading comprehension, science Q&A, and medical Q&A. Our experimental results, coupled with a comprehensive demographic analysis, demonstrate the superior performance of SAR. The code is available at https://github.com/jinhaoduan/SAR.
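The reweighting idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation (see the linked repository for that). It assumes per-token log-probabilities and per-token relevance scores are already available. The function names and the way relevance is supplied are hypothetical and chosen only for illustration.

```python
import math


def uniform_uncertainty(token_logprobs):
    """Baseline: average negative log-likelihood, every token weighted equally."""
    return -sum(token_logprobs) / len(token_logprobs)


def relevance_weighted_uncertainty(token_logprobs, relevance):
    """Illustrative sketch: weight each token's negative log-likelihood by its
    normalized relevance, so low-relevance tokens contribute less.

    `relevance` is a hypothetical input (non-negative scores, one per token);
    how such scores are obtained is outside the scope of this sketch.
    """
    if len(token_logprobs) != len(relevance):
        raise ValueError("need one relevance score per token")
    total = sum(relevance)
    if total <= 0:
        raise ValueError("relevance scores must sum to a positive value")
    return sum(-lp * (r / total) for lp, r in zip(token_logprobs, relevance))


# Made-up numbers: the first token is uncertain and relevant,
# the second is confident but carries little meaning.
logprobs = [math.log(0.5), math.log(0.9)]
print(uniform_uncertainty(logprobs))
print(relevance_weighted_uncertainty(logprobs, [0.9, 0.1]))
```

With equal relevance scores the weighted value reduces to the uniform baseline. Shifting weight toward the uncertain, relevant token raises the score.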
Shifting Attention to Relevance: Towards the Predictive Uncertainty Quantification of Free-Form Large Language Models
Creators
Jinhao Duan - Drexel University
Hao Cheng - University of Hong Kong
Shiqi Wang - Art Institute of Wisconsin
Alex Zavalny - Drexel University
Chenan Wang - Drexel University
Renjing Xu - University of Hong Kong
Bhavya Kailkhura - Lawrence Livermore National Laboratory
Kaidi Xu (Corresponding Author) - Drexel University
Contributors
L W Ku (Editor)
A Martins (Editor)
V Srikumar (Editor)
Publication Details
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics, v 1, pp 5050-5063
Conference
Annual Meeting of the Association for Computational Linguistics, 62 (Bangkok, Thailand, 11 Aug 2024–16 Aug 2024)
Publisher
Association for Computational Linguistics (ACL)
Number of pages
14
Grant note
2319242 / National Science Foundation (NSF)
23-ERD-030 (LLNL-CONF-851171) / LLNL-LDRD Program
DE-AC52-07NA27344 / U.S. Department of Energy by Lawrence Livermore National Laboratory; United States Department of Energy (DOE)
Resource Type
Conference proceeding
Language
English
Academic Unit
Computer Science
Web of Science ID
WOS:001356729805012
Scopus ID
2-s2.0-85204438471
Other Identifier
991022035263804721