TurnaboutLLM: A Deductive Reasoning Benchmark from Detective Games

Yuan Yuan; Muyu He; Muhammad Adil Shahid; Jiani Huang; Ziyang Li; Li Zhang

doi:10.48550/arxiv.2505.15712

Preprint

ArXiv.org

22 Sep 2025

DOI: https://doi.org/10.48550/arxiv.2505.15712

Files and links (1)

https://arxiv.org/pdf/2505.15712

Abstract

Computer Science - Computation and Language

This paper introduces TurnaboutLLM, a novel framework and dataset for evaluating the deductive reasoning abilities of Large Language Models (LLMs) by leveraging the interactive gameplay of the detective games Ace Attorney and Danganronpa. The framework tasks LLMs with identifying contradictions between testimonies and evidence within long narrative contexts, a challenging task due to the large answer space and the diverse reasoning types its questions require. We evaluate twelve state-of-the-art LLMs on the dataset; the results hint at limitations of popular strategies for enhancing deductive reasoning, such as extensive thinking and Chain-of-Thought prompting. The results also suggest varying effects of context size, number of reasoning steps, and answer space size on model performance. Overall, TurnaboutLLM presents a substantial challenge for LLMs' deductive reasoning abilities in complex, narrative-rich environments.

Details

Title: TurnaboutLLM: A Deductive Reasoning Benchmark from Detective Games
Creators: Yuan Yuan - University of Pennsylvania
Muyu He - University of Pennsylvania
Muhammad Adil Shahid - University of Pennsylvania
Jiani Huang - University of Pennsylvania
Ziyang Li - University of Pennsylvania
Li Zhang - Drexel University, Computer Science
Publication Details: ArXiv.org
Resource Type: Preprint
Language: English
Academic Unit: Computer Science
Other Identifier: 991022122862004721
