What characteristics make ChatGPT effective for software issue resolution? An empirical study of task, project, and conversational signals in GitHub issues

Ramtin Ehsani; Sakshi Pathak; Esteban Parra; Sonia Haiduc; Preetha Chatterjee

doi:10.1007/s10664-025-10745-8

Journal article

Open access

Peer reviewed

Empirical software engineering, v 31(1), 22

18 Nov 2025

DOI: https://doi.org/10.1007/s10664-025-10745-8

Featured in Collection : Research Supported by Drexel Libraries' OA Programs

Files and links (1)

url

https://doi.org/10.1007/s10664-025-10745-8

Published, Version of Record (VoR)

Open Access via Drexel Libraries Read and Publish Program 2025

CC BY V4.0, Open

Abstract

Issue resolution

Large language models

GitHub

Conversation analysis

Artificial Intelligence or Cybernetics

Counseling

Conversational large language models (LLMs), such as ChatGPT, are extensively used for issue resolution tasks, particularly for generating ideas to implement new features or resolve bugs. However, not all developer-LLM conversations are useful for effective issue resolution, and it is still unknown what makes some of these conversations unhelpful. In this paper, we analyze 686 developer-ChatGPT conversations shared within GitHub issue threads to identify characteristics that make these conversations effective for issue resolution. First, we empirically analyze the conversations and their corresponding issue threads to distinguish helpful from unhelpful conversations. We begin by categorizing the types of tasks developers seek help with (e.g., code generation, bug identification and fixing, test generation), to better understand the scenarios in which ChatGPT is most effective. Next, we examine a wide range of conversational, project, and issue-related metrics to uncover statistically significant factors associated with helpful conversations. Finally, we identify common deficiencies in unhelpful ChatGPT responses to highlight areas that could inform the design of more effective developer-facing tools. We found that only 62% of the ChatGPT conversations were helpful for successful issue resolution. Among different tasks related to issue resolution, ChatGPT was most helpful in assisting with code generation and tool/library/API recommendations, but struggled with generating code explanations. Our conversational metrics reveal that helpful conversations are shorter, more readable, and exhibit higher semantic and linguistic alignment. Our project metrics reveal that larger, more popular projects and experienced developers benefit more from ChatGPT’s assistance. Our issue metrics indicate that ChatGPT is more effective on simpler issues characterized by limited developer activity and faster resolution times. These typically involve well-scoped technical problems such as compilation errors and tool feature requests. In contrast, it performs less effectively on complex issues that demand deep project-specific understanding, such as system-level code debugging and refactoring. The most common deficiencies in unhelpful ChatGPT responses include incorrect information and lack of comprehensiveness. Our findings have wide implications, including guiding developers on effective interaction strategies for issue resolution, informing the development of tools or frameworks to support optimal prompt design, and providing insights on fine-tuning LLMs for issue resolution tasks.

Metrics

7 Record Views

Details

Title: What characteristics make ChatGPT effective for software issue resolution? An empirical study of task, project, and conversational signals in GitHub issues
Creators: Ramtin Ehsani (Corresponding Author) - Drexel University, College of Computing and Informatics
Sakshi Pathak - Drexel University
Esteban Parra - Belmont University
Sonia Haiduc - Florida State University
Preetha Chatterjee - Drexel University, Computer Science
Publication Details: Empirical software engineering, v 31(1), 22
Publisher: Springer Nature
Number of pages: 36
Resource Type: Journal article
Language: English
Academic Unit: Computer Science; College of Computing and Informatics
Web of Science ID: WOS:001616768100001
Scopus ID: 2-s2.0-105022134374
Other Identifier: 991022132354304721

InCites Highlights

Data related to this publication, from InCites Benchmarking & Analytics tool:

Collaboration types: Domestic collaboration
Web of Science research areas: Computer Science, Software Engineering
