Journal article
Predicting cost performance in road projects with limited data: Exploring synthetic data generation using CTGAN
Research in transportation economics, v 117, 101755
01 Jun 2026
Featured in Collection: UN Sustainable Development Goals @ Drexel
Abstract
In regions with scarce data, such as Norway, predicting cost performance in large-scale road (LSR) projects presents a unique challenge due to the high risk of cost overruns and their significant economic implications. This study aims to develop a data-driven framework for predicting cost performance in LSR projects by combining synthetic data generation and machine learning models. The approach employs synthetic data generation via a Conditional Tabular Generative Adversarial Network (CTGAN) to enlarge the data pool and improve predictive accuracy. By integrating 173 synthetically generated samples with 52 actual project samples, a robust dataset of 225 road projects was created. Three machine learning classifiers (i.e., XGBoost, MLP, and SVM) were applied to this enriched dataset. The models achieved an average accuracy of 0.76 and an F1 score of 0.74 when tested against real-world data, demonstrating substantial alignment with actual project outcomes. Further validation with 5-fold cross-validation on the combined dataset confirmed the consistency of these results, with similar accuracy and F1 scores. This research highlights the effectiveness of synthetic data in overcoming the limitations of small datasets and underscores its potential to substantially improve decision-making in highway engineering by providing more accurate, data-driven insights for project planning, design, and management.
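The evaluation protocol described in the abstract (pooling 52 real and 173 synthetic samples, then scoring with accuracy and F1 under 5-fold cross-validation) can be illustrated with a minimal, standard-library-only sketch. Only the sample counts and the fold count come from the abstract. The labels are random placeholders, the binary overrun/no-overrun framing is an assumption, and the majority-class predictor merely stands in for the XGBoost, MLP, and SVM classifiers; all function names are hypothetical and none of this reproduces the authors' implementation or results.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle indices 0..n_samples-1 and deal them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def majority_class(train_labels):
    """Placeholder 'model': always predict the most common training label."""
    return int(2 * sum(train_labels) >= len(train_labels))

N_REAL, N_SYNTHETIC = 52, 173        # sample counts stated in the abstract
n_total = N_REAL + N_SYNTHETIC       # 225 pooled projects

# Placeholder binary labels (1 = cost overrun); NOT data from the study.
rng = random.Random(42)
labels = [rng.randint(0, 1) for _ in range(n_total)]

folds = k_fold_indices(n_total, k=5)
fold_sizes = [len(fold) for fold in folds]

scores = []
for i, test_idx in enumerate(folds):
    # Train on the other four folds, evaluate on the held-out fold.
    train_idx = [j for f_i, fold in enumerate(folds) if f_i != i for j in fold]
    guess = majority_class([labels[j] for j in train_idx])
    y_true = [labels[j] for j in test_idx]
    y_pred = [guess] * len(test_idx)
    scores.append((accuracy(y_true, y_pred), f1_score(y_true, y_pred)))

mean_accuracy = sum(a for a, _ in scores) / len(scores)
mean_f1 = sum(f for _, f in scores) / len(scores)
```

With 225 pooled samples, each of the five folds holds 45 projects, so every model is trained on 180 samples and scored on 45. In practice a real classifier and a library implementation of stratified cross-validation would replace the placeholder pieces here.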
Details
- Title
- Predicting cost performance in road projects with limited data: Exploring synthetic data generation using CTGAN
- Creators
- Ali Foroutan Mirhosseini - Norwegian Public Roads Administration
- Kelly Pitera - Norwegian University of Science and Technology
- James Odeck - Norwegian Public Roads Administration
- Amirreza Rouhi - Drexel University, Electrical and Computer Engineering
- Publication Details
- Research in transportation economics, v 117, 101755
- Publisher
- Elsevier
- Number of pages
- 13
- Resource Type
- Journal article
- Language
- English
- Academic Unit
- Electrical and Computer Engineering
- Web of Science ID
- WOS:001713220000001
- Scopus ID
- 2-s2.0-105033501628
- Other Identifier
- 991022170455304721
InCites Highlights
Data related to this publication, from InCites Benchmarking & Analytics tool:
- Collaboration types
- Domestic collaboration
- International collaboration
- Web of Science research areas
- Economics
- Transportation