Conference proceeding
Feature subset selection for inferring relative importance of taxonomy
Proceedings of the 5th ACM Conference on bioinformatics, computational biology, and health informatics, pp 673-679
20 Sep 2014
Abstract
Examining the bacterial or functional differences between multiple habitats, populations, or phenotypes plays an important role in making inferences about the roles that taxonomic and functional profiles can take on in microbial ecology. It is therefore important to the field of comparative metagenomics, using α- and β-diversity, that methods or algorithms can detect the particular subsets of variables that best differentiate the multiple phenotypes in the data. Given today's genomic data deluge, the importance of efficient methods that can carry out these inferences cannot be overstated. We assume observations are collected from a multitude of different environments (e.g., males vs. females, control vs. stimulus), and that each observation comprises hundreds or thousands of different taxonomic or functional features (i.e., 16S or whole-genome shotgun). Our goal in this work is to examine the role, assumptions, and inferences that feature subset selection can provide to the field of microbial ecology and comparative metagenomics. Specifically, we examine feature subset selection algorithms using embedded and filter approaches to infer taxa importance on data collected from the human gut microbiome. We compare several widely adopted approaches from machine learning, including greedy algorithms and ℓ1 regularization methods, as well as some software tools provided with QIIME, on data collected from the American Gut Project and other canonical studies of the human gut microbiome. We find that very few OTUs carry information for predicting the sex of a gut sample, and that Bacteroidetes is quite frequently found among the top-ranked OTUs.
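To make the distinction between filter and embedded feature selection concrete, here is a minimal Python sketch. It is not the authors' code and does not use their data: the presence/absence table is synthetic, the choice of mutual information as the filter score is an assumption, and the ℓ1-regularized logistic regression is fitted with a simple proximal gradient loop chosen only to keep the example dependency-free. All function and variable names are hypothetical.

```python
import math
import random


def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    joint, px, py = {}, {}, {}
    for x, y in zip(xs, ys):
        joint[(x, y)] = joint.get((x, y), 0) + 1
        px[x] = px.get(x, 0) + 1
        py[y] = py.get(y, 0) + 1
    return sum((c / n) * math.log(c * n / (px[x] * py[y]))
               for (x, y), c in joint.items())


def filter_rank(X, y, k):
    """Filter approach: score every feature independently, keep the top k."""
    n_features = len(X[0])
    scores = [mutual_information([row[j] for row in X], y)
              for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: -scores[j])[:k]


def soft_threshold(w, t):
    """Proximal operator of the l1 penalty: shrink w toward zero by t."""
    return math.copysign(max(abs(w) - t, 0.0), w)


def l1_logistic(X, y, lam=0.05, step=0.1, iters=300):
    """Embedded approach: l1-penalized logistic regression fitted by
    proximal gradient descent. Features whose weight stays at zero are
    effectively discarded while the classifier is being trained."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(iters):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, label in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, row))
            err = 1.0 / (1.0 + math.exp(-z)) - label
            grad_b += err / n
            for j, xj in enumerate(row):
                grad_w[j] += err * xj / n
        b -= step * grad_b
        w = [soft_threshold(wj - step * gj, step * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b


# Synthetic presence/absence table (an assumption, not real OTU data):
# 200 samples, 20 features, and only feature 3 depends on the binary label.
rng = random.Random(0)
n_samples, n_features, informative = 200, 20, 3
y = [i % 2 for i in range(n_samples)]
X = []
for label in y:
    row = [1 if rng.random() < 0.5 else 0 for _ in range(n_features)]
    row[informative] = 1 if rng.random() < (0.9 if label == 1 else 0.1) else 0
    X.append(row)

top_filter = filter_rank(X, y, k=5)
weights, bias = l1_logistic(X, y)
selected_embedded = [j for j, wj in enumerate(weights) if wj != 0.0]

print("filter ranking (top 5):", top_filter)
print("features kept by l1 penalty:", selected_embedded)
```

The filter ranking looks at each feature in isolation, whereas the ℓ1 penalty selects features jointly as a side effect of fitting the classifier; on real OTU tables the two can disagree, which is one reason a comparison of the kind described in the abstract is informative.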
Details
- Title
- Feature subset selection for inferring relative importance of taxonomy
- Creators
- Gregory Ditzler - Drexel University
- Gail Rosen - Drexel University
- Publication Details
- Proceedings of the 5th ACM Conference on bioinformatics, computational biology, and health informatics, pp 673-679
- Conference
- 5th ACM Conference on bioinformatics, computational biology, and health informatics (Newport Beach, California, United States, 20 Sep 2014–23 Sep 2014)
- Series
- BCB '14
- Publisher
- Association for Computing Machinery (ACM)
- Number of pages
- 7
- Resource Type
- Conference proceeding
- Language
- English
- Academic Unit
- Electrical and Computer Engineering
- Scopus ID
- 2-s2.0-84920749750
- Other Identifier
- 991019174750604721