On the Lack of Consensus Among Technical Debt Detection Tools

Jason Lefever; Yuanfang Cai; Humberto Cervantes; Rick Kazman; Hongzhou Fang; IEEE COMP SOC

doi:10.1109/ICSE-SEIP52600.2021.00021


Conference proceeding

Open access

Jason Lefever, Yuanfang Cai, Humberto Cervantes, Rick Kazman, Hongzhou Fang and IEEE COMP SOC

2021 IEEE/ACM 43rd International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP), pp 121-130

May 2021

DOI: https://doi.org/10.1109/ICSE-SEIP52600.2021.00021

Files and links (1)

url

http://arxiv.org/abs/2103.04506

Submitted; Open Access (License Unspecified), Open

Abstract

Benchmark testing

Complexity theory

Maintenance engineering

Size measurement

Software Analysis

Software engineering

Software Maintainability

Software measurement

Technical Debt

Tools

A vigorous and growing set of technical debt analysis tools has been developed in recent years, both research tools and industrial products, such as Structure 101, SonarQube, and DV8. Each of these tools identifies problematic files using its own definitions and measures. But to what extent do these tools agree with each other on the files they identify as problematic? If the top-ranked files reported by these tools are largely consistent, then we can be confident in using any of them. Otherwise, a problem of accuracy arises. In this paper, we report the results of an empirical study analyzing 10 projects using multiple tools. Our results show that: 1) these tools report very different results even for the most common measures, such as size, complexity, file cycles, and package cycles; 2) these tools also differ dramatically in the sets of problematic files they identify, since each implements its own definition of "problematic", and after normalizing by size, the most problematic file sets that the tools identify barely overlap; and 3) code-based measures, other than size and complexity, do not even moderately correlate with a file's change-proneness or error-proneness, whereas co-change-related measures perform better. Our results suggest that, to identify files with true technical debt (those that experience excessive changes or bugs), co-change information must be considered. Code-based measures are largely ineffective at pinpointing true debt. Finally, this study reveals the need for the community to create benchmarks and data sets to assess the accuracy of software analysis tools in terms of commonly used measures.
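The question of whether tools agree on their top-ranked files can be illustrated with a small sketch. The abstract does not say which agreement measure the study used, so the Jaccard index below is an assumption chosen for illustration, and the tool names, file names, and scores are hypothetical.

```python
from itertools import combinations


def top_k(scores, k):
    """Return the set of the k highest-scoring files (ties broken by file name)."""
    ranked = sorted(scores, key=lambda f: (-scores[f], f))
    return set(ranked[:k])


def jaccard(a, b):
    """Jaccard index of two sets: |a & b| / |a | b| (1.0 if both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def pairwise_overlap(tool_scores, k):
    """Jaccard overlap of the top-k file sets for every pair of tools."""
    tops = {tool: top_k(scores, k) for tool, scores in tool_scores.items()}
    return {
        (t1, t2): jaccard(tops[t1], tops[t2])
        for t1, t2 in combinations(sorted(tops), 2)
    }


# Hypothetical "problematic-ness" scores per file, one dict per tool.
tool_scores = {
    "tool_a": {"A.java": 9, "B.java": 7, "C.java": 3, "D.java": 1},
    "tool_b": {"A.java": 2, "B.java": 8, "C.java": 1, "D.java": 6},
    "tool_c": {"A.java": 1, "B.java": 2, "C.java": 9, "D.java": 8},
}

overlap = pairwise_overlap(tool_scores, k=2)
for pair, value in overlap.items():
    print(pair, round(value, 3))
```

A value near 1 for a pair means the two tools flag largely the same files, while a value near 0 corresponds to the "barely overlap" situation the abstract describes.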

Metrics

9 Record Views

30 citations in Web of Science

30 citations in Scopus

Details

Title: On the Lack of Consensus Among Technical Debt Detection Tools
Creators: Jason Lefever - Drexel University
Yuanfang Cai - Drexel University
Humberto Cervantes - UAM Iztapalapa
Rick Kazman - University of Hawai’i
Hongzhou Fang - Drexel University
IEEE COMP SOC
Publication Details: 2021 IEEE/ACM 43rd International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP), pp 121-130
Publisher: IEEE
Grant note: National Science Foundation (10.13039/100000001)
Resource Type: Conference proceeding
Language: English
Academic Unit: Computer Science
Web of Science ID: WOS:000684234800013
Scopus ID: 2-s2.0-85115672324
Other Identifier: 991019167525604721

InCites Highlights

Data related to this publication, from InCites Benchmarking & Analytics tool:

Collaboration types: Domestic collaboration; International collaboration
Web of Science research areas: Computer Science, Theory & Methods
