Dissertation
Machine learning models for detecting pesticides in chlorinated drinking water distribution systems
Doctor of Philosophy (Ph.D.), Drexel University
Sep 2023
DOI: https://doi.org/10.17918/00001876
Abstract
This dissertation encompasses anomaly detection, pesticide kinetics, and optimized machine learning models. It demonstrates the potential of machine learning techniques for predicting the occurrence of pesticides in chlorinated drinking water distribution systems. Using data from online sensors that captured water quality parameters in a prototype drinking water system into which pesticides such as glyphosate, Dicamba, and Aldicarb were introduced, regression models were created that could identify anomalies associated with the presence of pesticides or their byproducts in a chlorine-rich aqueous environment. In total, six regression algorithms were created and assessed as prediction models: Decision Tree Regressor (DTR), Support Vector Regression (SVR), KNN Regressor (KNNR), Random Forest Regressor (RFR), Gradient Boosting Regressor (GBR), and AdaBoost Regressor (ABR). These models leverage historical data to uncover patterns between input features and target variables and, ultimately, to predict pesticide presence in the piped system. The project employed data preprocessing, division into training and testing subsets, feature selection, and hyperparameter tuning to enhance model accuracy and efficiency. The six regression models were evaluated using metrics such as R-squared (R²), root mean square error (RMSE), and mean absolute error (MAE). The RFR and GBR algorithms demonstrated promising R² scores for predicting glyphosate and Aldicarb. For Dicamba, hyperparameter tuning resulted in the RFR algorithm achieving an R² score of 0.87, closely followed by KNNR and GBR with R² scores of 0.86, underscoring the critical role of hyperparameter tuning in optimizing the predictive capabilities of regression algorithms.
By integrating machine learning techniques with water quality data, this dissertation also offers insights into water quality assessment, emphasizing the potential for early detection of pesticide contamination within drinking water distribution systems. In addition to the regression models, logistic regression was used to develop an early detection model for identifying potential pesticide events. While the models developed in this dissertation have the potential to contribute significantly to innovative contamination detection, public health, and sustainability, it is essential to acknowledge the limitations of the datasets used to develop the models and of the interpretability of the models' results when assessing their potential benefits and shortcomings.
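The three evaluation metrics named in the abstract can be written out directly. The following is a minimal sketch in plain Python, not the dissertation's own code; the function names and the toy values in `y_true` and `y_pred` are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean square error: square root of the mean squared residual."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def mae(y_true, y_pred):
    """Mean absolute error: mean of the absolute residuals."""
    n = len(y_true)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n


# Hypothetical toy values (NOT data from the dissertation).
y_true = [0.0, 1.0, 2.0, 3.0]  # observed target values
y_pred = [0.5, 1.0, 1.5, 3.0]  # values predicted by some regression model

print(f"R2   = {r_squared(y_true, y_pred):.3f}")  # R2   = 0.900
print(f"RMSE = {rmse(y_true, y_pred):.3f}")       # RMSE = 0.354
print(f"MAE  = {mae(y_true, y_pred):.3f}")        # MAE  = 0.250
```

A higher R² and lower RMSE and MAE indicate a closer fit. RMSE penalizes large individual errors more heavily than MAE, which is one reason the two are often reported together.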
Details
- Title: Machine learning models for detecting pesticides in chlorinated drinking water distribution systems
- Creators: Angela Maria Fasnacht
- Contributors: Christopher Sales (Advisor)
- Awarding Institution: Drexel University
- Degree Awarded: Doctor of Philosophy (Ph.D.)
- Publisher: Drexel University; Philadelphia, Pennsylvania
- Number of pages: 290 pages
- Resource Type: Dissertation
- Language: English
- Academic Unit: Civil/Architectural/Environmental Engineering (1970-2026); College of Engineering (1970-2026); Drexel University
- Other Identifier: 991021230305504721