A semi-automatic computer program, GA/CA (genetic algorithm/correlation analysis), was developed in this project for the classification of chemical compounds using mass spectra. The program uses a genetic algorithm as the optimization method and correlation analysis as the evaluation method. To perform a classification, the GA/CA program searches the mass spectra of known compounds for the group of mass peaks that best discriminates the substructure of interest, and then applies the search results to unknowns for prediction. The program can perform the classification using mass spectra, neutral loss spectra, and parent loss spectra, and can apply data preprocessing techniques such as intensity exponent scaling and thresholding. The GA/CA program was used successfully in two tests with library spectra: the classification of lower aromatic compounds and the classification of chlorine-containing compounds. The chromosomes developed by the GA/CA program showed 100% prediction accuracy for the test compounds in both classification experiments. In the classification of carbamates, the best chromosomes developed by the GA/CA program resulted from use of the neutral loss spectra and showed a prediction accuracy of 93% on the test set. The prediction accuracy increased when the individual results obtained from mass spectra, neutral loss spectra, and parent loss spectra were combined. The GA/CA program was also used to identify the metabolites of the carbamate methyl thiophanate from LC-MS/MS data. Chromosomes were developed by the GA/CA program using spectra collected in the laboratory. The results showed that the GA/CA program identified three of the known metabolites correctly and one metabolite incorrectly. The GA/CA program also identified another possible metabolite that had not been identified in a previous metabolic study.
The GA/CA program needs to be rewritten as a fully automatic program so that it can handle a larger amount of spectral data and run for a large number of generations.
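The abstract does not give the encoding or scoring details of GA/CA, so the following Python sketch is only an illustration of the general idea under stated assumptions, not the dissertation's implementation. It assumes that a chromosome is a fixed-size set of m/z values, that a compound's score is the summed preprocessed intensity at those m/z values, and that fitness is the absolute Pearson correlation between the scores and the 0/1 class labels. All function names, parameter values, and the toy spectra are hypothetical.

```python
import random


def preprocess(spectrum, exponent=0.5, threshold=0.01):
    """Normalize to the base peak, drop peaks below the relative threshold,
    then raise the remaining relative intensities to the given exponent.
    (Assumed form of 'intensity exponent scaling and thresholding'.)"""
    base = max(spectrum.values())
    out = {}
    for mz, intensity in spectrum.items():
        rel = intensity / base
        if rel >= threshold:
            out[mz] = rel ** exponent
    return out


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def fitness(chromosome, spectra, labels):
    """Assumed evaluation: |correlation| between per-compound scores and labels."""
    scores = [sum(s.get(mz, 0.0) for mz in chromosome) for s in spectra]
    return abs(pearson(scores, labels))


def evolve(spectra, labels, mz_pool, n_peaks=3, pop_size=20,
           generations=30, mutation_rate=0.2, seed=0):
    """Tiny elitist GA: keep the better half, refill with crossover + mutation."""
    rng = random.Random(seed)
    pop = [tuple(sorted(rng.sample(mz_pool, n_peaks))) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda c: fitness(c, spectra, labels), reverse=True)
        parents = pop[:pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            genes = sorted(set(a) | set(b))          # crossover: pool both parents' peaks
            child = rng.sample(genes, n_peaks)
            if rng.random() < mutation_rate:         # mutation: swap in a random m/z
                child[rng.randrange(n_peaks)] = rng.choice(mz_pool)
            child = set(child)
            while len(child) < n_peaks:              # repair duplicates after mutation
                child.add(rng.choice(mz_pool))
            children.append(tuple(sorted(child)))
        pop = parents + children
    return max(pop, key=lambda c: fitness(c, spectra, labels))


# Made-up toy spectra (m/z -> intensity); label 1 marks the class of interest.
raw_spectra = [
    {77: 100, 51: 40, 43: 5},
    {77: 80, 91: 100, 43: 10},
    {43: 100, 57: 60, 71: 30},
    {43: 70, 57: 100, 85: 20},
]
labels = [1, 1, 0, 0]

spectra = [preprocess(s) for s in raw_spectra]
mz_pool = sorted({mz for s in spectra for mz in s})
best = evolve(spectra, labels, mz_pool, n_peaks=2)
print(best, round(fitness(best, spectra, labels), 3))
```

In this sketch the best chromosome found on known compounds would then be scored against unknown spectra for prediction. Because the loop runs unattended for a set number of generations, it also hints at what a fully automatic version might look like.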
Details
Title
Development of a genetic algorithm-correlation analysis (GA/CA) program for classification of chemical compounds using mass spectral data
Creators
Fang Li - DU
Contributors
Kevin Glenn Owens (Advisor) - Drexel University (1970-)
Awarding Institution
Drexel University
Degree Awarded
Doctor of Philosophy (Ph.D.)
Publisher
Drexel University; Philadelphia, Pennsylvania
Resource Type
Dissertation
Language
English
Academic Unit
College of Arts and Sciences; Chemistry; Drexel University
Other Identifier
2803; 991014632706904721