Data Augmentation for Improving Emotion Recognition in Software Engineering Communication
Conference proceeding   Open access

Mia Mohammad Imran, Yashasvi Jain, Preetha Chatterjee, Kostadin Damevski and ASSOC COMPUTING MACHINERY
Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering, pp 1-13
10 Oct 2022
url
https://doi.org/10.1145/3551349.3556925
Published, Version of Record (VoR) Open

Abstract

General and reference
General and reference -- Cross-computing tools and techniques
Human-centered computing
Social and professional topics
Social and professional topics -- Professional topics
Software and its engineering
Software and its engineering -- Software creation and management
Software and its engineering -- Software creation and management -- Collaboration in software development
Software and its engineering -- Software creation and management -- Software verification and validation
Software and its engineering -- Software creation and management -- Software verification and validation -- Software defect analysis
Software and its engineering -- Software notations and tools
Software and its engineering -- Software notations and tools -- Software configuration management and version control systems

Emotions (e.g., Joy, Anger) are prevalent in daily software engineering (SE) activities, and are known to be significant indicators of work productivity (e.g., bug fixing efficiency). Recent studies have shown that directly applying general purpose emotion classification tools to SE corpora is not effective. Even within the SE domain, tool performance degrades significantly when trained on one communication channel and evaluated on another (e.g., StackOverflow vs. GitHub comments). Retraining a tool with channel-specific data takes significant effort since manually annotating a large dataset of ground truth data is expensive. In this paper, we address this data scarcity problem by automatically creating new training data using a data augmentation technique. Based on an analysis of the types of errors made by popular SE-specific emotion recognition tools, we specifically target our data augmentation strategy in order to improve the performance of emotion recognition. Our results show an average improvement of 9.3% in micro F1-Score for three existing emotion classification tools (ESEM-E, EMTk, SEntiMoji) when trained with our best augmentation strategy.
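
For readers unfamiliar with the metric named in the abstract, the following is a minimal sketch of how a micro-averaged F1 score is typically computed for emotion labels. It is not taken from the paper: the function name `micro_f1`, the multi-label representation (one set of labels per comment), and the example labels are all assumptions made for illustration, and this record does not state how the authors set up their evaluation.

```python
def micro_f1(gold, predicted):
    """Micro-averaged F1 over per-instance label sets.

    gold, predicted: equal-length lists of sets of emotion labels
    (hypothetical representation, assumed for this sketch).
    """
    tp = fp = fn = 0
    for g, p in zip(gold, predicted):
        tp += len(g & p)   # labels predicted and present in gold
        fp += len(p - g)   # labels predicted but absent from gold
        fn += len(g - p)   # gold labels that were missed
    if tp == 0:
        return 0.0         # avoids division by zero when nothing matches
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical annotations for four comments.
gold = [{"Joy"}, {"Anger"}, {"Joy", "Surprise"}, set()]
predicted = [{"Joy"}, {"Joy"}, {"Joy"}, {"Anger"}]
score = micro_f1(gold, predicted)
```

Micro-averaging pools true positives, false positives, and false negatives across all labels before computing precision and recall, so frequent emotions weigh more heavily than rare ones.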

Metrics

8 Record Views
20 citations in Scopus

Details

InCites Highlights

Data related to this publication, from InCites Benchmarking & Analytics tool:

Collaboration types
Domestic collaboration
Web of Science research areas
Automation & Control Systems
Computer Science, Software Engineering
Computer Science, Theory & Methods