Real vs. simulated: Questions on the capability of simulated datasets on building fault detection for energy efficiency from a data-driven perspective

Jiajing Huang; Jin Wen; Hyunsoo Yoon; Ojas Pradhan; Teresa Wu; Zheng O'Neill; Kasim Selcuk Candan

doi:10.1016/j.enbuild.2022.111872

Journal article

Open access

Peer reviewed

Energy and buildings, v 259, 111872

15 Mar 2022

DOI: https://doi.org/10.1016/j.enbuild.2022.111872

Featured in Collection: UN Sustainable Development Goals @ Drexel

Files and links (1)

url

https://doi.org/10.1016/j.enbuild.2022.111872

Accepted (AM); Open Access (Publisher-Specific), Open

Abstract

Keywords: Building AFDD; Machine learning; Real; Similarity; Simulated

Literature on building Automatic Fault Detection and Diagnosis (AFDD) mainly focuses on simulated system data because real building data are expensive and difficult to obtain and analyze. The performance and scalability of data-driven AFDD approaches built on simulated data, and how they compare with those built on real building data, have not been adequately validated. In this study, we conduct two sets of experiments to seek answers to this question. We first evaluate data-driven fault detection strategies on real and simulated building data separately. We observe that fault detection performance is not affected by the fault detection strategy, the size of the training data, or the number of cross-validation folds when the training and blind test data come from the same data source, namely, simulated or real building data. Next, we conduct a cross-dataset study, in which the model is developed using simulated data and tested on real building data. The results indicate that the model trained on simulated data does not generalize to real building data for fault detection. A Kolmogorov-Smirnov test is conducted to confirm that there are statistical differences between the simulated and real building data and to identify a subset of features that are similar across the two datasets. Using this subset of features, cross-dataset experiments show fault detection improvements on most fault cases. We conclude that even if the system produces simulated data with the same fault symptoms from a physical analysis perspective, not all features from simulated datasets are beneficial for AFDD; only a subset of features contains valuable information from a machine learning perspective.
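The feature screening step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a two-sample Kolmogorov-Smirnov test applied feature by feature, keeping a feature when the test does not reject at a chosen significance level. The selection rule, the significance level `alpha`, the feature names, and the toy values are all hypothetical and are not taken from the paper; the critical value uses the standard large-sample approximation, which is only rough for samples this small.

```python
import math
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    d = 0.0
    # The gap can only change at observed values, so checking those suffices.
    for x in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, x) / len(b_sorted)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def ks_critical_value(n, m, alpha=0.05):
    """Large-sample critical value for the two-sample KS test (approximation)."""
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2.0))
    return c_alpha * math.sqrt((n + m) / (n * m))


def similar_features(simulated, real, alpha=0.05):
    """Return features whose KS statistic does not exceed the critical value.

    Assumed rule for illustration only: failing to reject the test is treated
    as "similar enough" to keep the feature for cross-dataset training.
    """
    kept = []
    for name in simulated:
        if name not in real:
            continue
        d = ks_statistic(simulated[name], real[name])
        if d <= ks_critical_value(len(simulated[name]), len(real[name]), alpha):
            kept.append(name)
    return kept


# Hypothetical toy measurements; names and values are invented for this sketch.
simulated = {
    "supply_air_temp": [12.8, 13.0, 13.1, 13.3, 13.5, 13.6, 13.8, 14.0],
    "fan_power": [1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7],
}
real = {
    "supply_air_temp": [12.9, 13.0, 13.2, 13.3, 13.4, 13.7, 13.8, 14.1],
    "fan_power": [2.5, 2.6, 2.7, 2.8, 2.9, 3.0, 3.1, 3.2],
}

print(similar_features(simulated, real))  # ['supply_air_temp']
```

In this toy case the two `fan_power` samples do not overlap at all, so the feature is dropped, while `supply_air_temp` is retained. Failing to reject the test is weak evidence of similarity rather than proof, so a real study would likely combine it with domain knowledge about which sensors are modelled faithfully.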

Metrics

15 Record Views

33 citations in Web of Science

25 citations in Scopus

Details

Title: Real vs. simulated: Questions on the capability of simulated datasets on building fault detection for energy efficiency from a data-driven perspective
Creators: Jiajing Huang - Arizona State University
Jin Wen - Drexel University
Hyunsoo Yoon - Yonsei University
Ojas Pradhan - Drexel University
Teresa Wu - Arizona State University
Zheng O'Neill - Texas A&M University
Kasim Selcuk Candan - School of Computing and Augmented Intelligence, Arizona State University, Tempe, AZ 85281, USA
Publication Details: Energy and buildings, v 259, 111872
Publisher: Elsevier
Resource Type: Journal article
Language: English
Academic Unit: Civil, Architectural, and Environmental Engineering
Web of Science ID: WOS:000753993200010
Scopus ID: 2-s2.0-85123865522
Other Identifier: 991019168813304721

UN Sustainable Development Goals (SDGs)

This publication has contributed to the advancement of the following goals:

InCites Highlights

Data related to this publication, from InCites Benchmarking & Analytics tool:

Collaboration types: Domestic collaboration; International collaboration
Web of Science research areas: Construction & Building Technology; Energy & Fuels; Engineering, Civil
