Feature subset selection for inferring relative importance of taxonomy
Conference proceeding

Gregory Ditzler and Gail Rosen
Proceedings of the 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, pp. 673-679
20 Sep 2014

Abstract

Examining the bacterial or functional differences between multiple habitats/populations/phenotypes plays an important role in making inferences about the roles that taxonomic and functional profiles can take on in microbial ecology. It is therefore important to the field of comparative metagenomics, which uses α- and β-diversity, that methods or algorithms can detect the particular subsets of variables that best differentiate the multiple phenotypes in the data. Given today's genomic data deluge, the importance of efficient methods that can carry out these inferences cannot be overstated. We assume observations are collected from a multitude of different environments (e.g., males vs. females, control vs. stimulus, etc.), and each observation is composed of hundreds or thousands of different taxa/functional features (i.e., from 16S or whole-genome shotgun sequencing). Our goal in this work is to examine the role, assumptions, and inferences that feature subset selection can provide to the fields of microbial ecology and comparative metagenomics. Specifically, we examine feature subset selection algorithms using embedded and filter approaches to infer taxa importance on data collected from the human gut microbiome. We compare several widely adopted approaches from machine learning, including greedy algorithms and ℓ1 regularization methods, as well as some software tools provided with QIIME, on data collected from the American Gut Project and other canonical studies of the human gut microbiome. We find that very few OTUs carry information for predicting the sex of a gut sample, and that Bacteroidetes is quite frequently found among the top-ranked OTUs.
