Predicting cost performance in road projects with limited data: Exploring synthetic data generation using CTGAN
Journal article   Open access   Peer reviewed

Ali Foroutan Mirhosseini, Kelly Pitera, James Odeck and Amirreza Rouhi
Research in transportation economics, v 117, 101755
01 Jun 2026
Featured in Collection: Drexel's Newest Publications
URL: https://doi.org/10.1016/j.retrec.2026.101755
Published, Version of Record (VoR); CC BY 4.0; Open

Abstract

Subjects: Business & Economics; Science & Technology; Economics; Social Sciences; Technology; Transportation
In regions with scarce data, such as Norway, predicting cost performance in large-scale road (LSR) projects presents a unique challenge due to the high risk of cost overruns and their significant economic implications. This study aims to develop a data-driven framework for predicting cost performance in LSR projects by combining synthetic data generation and machine learning models. The approach uses a Conditional Tabular Generative Adversarial Network (CTGAN) to generate synthetic data that enlarges the data pool and improves predictive accuracy. By integrating 173 synthetically generated samples with 52 actual project samples, a robust dataset of 225 road projects was created. Three machine learning classifiers (XGBoost, MLP, and SVM) were applied to this enriched dataset. The models achieved an average accuracy of 0.76 and an F1 score of 0.74 when tested against real-world data, demonstrating substantial alignment with actual project outcomes. Further validation with 5-fold cross-validation on the combined dataset confirmed the consistency of these results, with similar accuracy and F1 scores. This research highlights the effectiveness of synthetic data in overcoming the limitations of small datasets and underscores its potential to substantially improve decision-making in highway engineering by providing more accurate, data-driven insights for project planning, design, and management.
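
The evaluation arithmetic described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' code: the helper names (`k_fold_indices`, `accuracy`, `f1_binary`), the contiguous unshuffled fold assignment, the binary form of the F1 score, and the toy labels are all assumptions made for illustration. Only the sample counts (52 real, 173 synthetic) and the choice of 5 folds come from the abstract.

```python
from typing import List, Sequence, Tuple

N_REAL = 52        # actual project samples (from the abstract)
N_SYNTHETIC = 173  # CTGAN-generated samples (from the abstract)


def k_fold_indices(n_samples: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k contiguous folds and return
    (train_indices, test_indices) pairs. No shuffling or stratification
    is done here; a real study would normally add both (assumption)."""
    base, extra = divmod(n_samples, k)
    folds: List[List[int]] = []
    start = 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    splits = []
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def f1_binary(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Binary F1 = 2*TP / (2*TP + FP + FN) for the chosen positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Combined dataset size and 5-fold split sizes.
n_total = N_REAL + N_SYNTHETIC
splits = k_fold_indices(n_total, 5)
fold_sizes = [(len(train), len(test)) for train, test in splits]

# Toy labels (hypothetical, not project data) to exercise the metrics.
toy_true = [1, 0, 1, 1, 0, 0, 1, 0]
toy_pred = [1, 0, 0, 1, 0, 1, 1, 0]
toy_accuracy = accuracy(toy_true, toy_pred)
toy_f1 = f1_binary(toy_true, toy_pred)
```

With 225 samples and 5 folds, each fold holds out 45 samples and trains on the remaining 180. The abstract does not say whether its F1 score is binary, macro-averaged, or weighted, so the binary form above is only one plausible reading.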
