Published, Version of Record (VoR); CC BY 4.0; Open Access
Abstract
Computer Science, Artificial Intelligence; Computer Science, Interdisciplinary Applications; Language & Linguistics; Linguistics; Science & Technology; Computer Science; Social Sciences; Technology
Script learning studies how stereotypical events unfold, enabling machines to reason about narratives with implicit information. Previous works mostly consider a script as a linear sequence of events while ignoring the potential branches that arise due to people's circumstantial choices. We hence propose Choice-75, the first benchmark that challenges intelligent systems to make decisions given descriptive scenarios, containing 75 scripts and more than 600 scenarios. We also present preliminary results with current large language models (LLMs). Although they demonstrate decent overall performance, there is still notable headroom in hard scenarios.
Choice-75: A Dataset on Decision Branching in Script Learning
Creators
Zhaoyi Joey Hou - University of Pittsburgh
Li Zhang - University of Pennsylvania
Chris Callison-Burch - University of Pennsylvania
Contributors
N Calzolari (Editor)
M Y Kan (Editor)
V Hoste (Editor)
A Lenci (Editor)
S Sakti (Editor)
N Xue (Editor)
Publication Details
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pp. 3215-3223
Conference
The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024) (Torino, Italy, 20 May 2024–25 May 2024)
Series
International Conference on Computational Linguistics Language Resources and Evaluation
Publisher
Association for Computational Linguistics (ACL)
Number of pages
9
Grant note
2022-22072200005 / Office of the Director of National Intelligence (ODNI) via the IARPA HIATUS Program
FA8750-23-C-0507 / AFRL; United States Department of Defense; US Air Force Research Laboratory
1928631 / NSF; National Science Foundation (NSF)
FA8750-19-2-1004 / DARPA KAIROS Program; United States Department of Defense