Logo image
Prevalence and severity of design anti-patterns in open source programs—A large-scale study
Journal article   Peer reviewed

Prevalence and severity of design anti-patterns in open source programs—A large-scale study

Alan Liu, Jason Lefever, Yi Han and Yuanfang Cai
Information and software technology, v 170, 107429
Jun 2024

Abstract

Anti-pattern Software design Software evolution Software maintenance Technical debt
Design anti-patterns can be symptoms of problems that lead to long-term maintenance difficulty. How should development teams prioritize their treatment? Which ones are more severe and deserve more attention? Does the impact of anti-patterns and general maintenance efforts differ with different programming languages? In this study, we assess the prevalence and severity of anti-patterns in different programming languages and the impact of dynamic typing in Python, as well as the impact scopes of prevalent anti-patterns that manifest the violation of design principles. We conducted a large-scale study of anti-patterns using 1717 open-source projects written in Java, C/C++, and Python. For the 288 Python projects, we extracted both explicit and dynamic dependencies and compared how the detected anti-patterns and maintenance costs changed. Finally, we removed anti-patterns involving five or fewer files to assess the impact of trivial anti-patterns. The results reveal that 99.55% of these projects contain anti-patterns. Modularity Violation – frequent co-changes among seemingly unrelated files – is most prevalent (detected in 83.54% of all projects) and costly (incurred 61.55% of maintenance effort on average). Unstable Interface and Crossing, caused by influential but unstable files, although not as prevalent, tend to incur severe maintenance costs. Duck typing in Python incurs more anti-patterns, and the churn spent on Python files multiplies that of C/C++ and Java files. Several prevalent anti-patterns have a large portion of trivial instances, meaning that these common symptoms are usually not harmful. Implicit and visible dependencies are the most expensive to maintain, and dynamic typing in Python exacerbates the issue. Influential but unstable files need to be monitored and rectified early to prevent the accumulation of high maintenance costs. The violations of design principles are widespread, but many are not high-maintenance.

Metrics

15 Record Views
1 citations in Scopus

Details

InCites Highlights

Data related to this publication, from InCites Benchmarking & Analytics tool:

Collaboration types
Domestic collaboration
International collaboration
Web of Science research areas
Computer Science, Information Systems
Computer Science, Software Engineering
Logo image