Background: Some current software tools for comparative metagenomics allow ecologists to investigate and explore bacterial communities using alpha- and beta-diversity. Feature subset selection, a sub-field of machine learning, can also provide unique insight into the differences between metagenomic or 16S phenotypes. In particular, feature subset selection methods can identify the operational taxonomic units (OTUs), or functional features, that have a high level of influence on the condition being studied. For example, in a previous study we used information-theoretic feature selection to understand which protein family abundances best discriminate between age groups in the human gut microbiome.
Results: We have developed a new Python command-line tool for microbial ecologists that implements information-theoretic feature subset selection methods and is compatible with the widely adopted BIOM format. We demonstrate the software tool's capabilities on publicly available datasets.
Conclusions: We have made the software implementation of Fizzy available to the public under the GNU GPL license.
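To make the idea of information-theoretic feature selection concrete, the following is a minimal sketch of one simple filter of this kind: ranking features by their estimated mutual information with the phenotype label. It is not Fizzy's implementation or command-line interface. The function names, the toy OTU table, and the phenotype labels are all hypothetical, and the abundances are assumed to have been discretized to presence/absence beforehand.

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Estimate I(X; Y) in bits from two equal-length discrete sequences."""
    n = len(x)
    px = Counter(x)
    py = Counter(y)
    pxy = Counter(zip(x, y))
    mi = 0.0
    for (a, b), count in pxy.items():
        # p(a, b) * log2(p(a, b) / (p(a) * p(b))), written in terms of counts
        mi += (count / n) * math.log2(count * n / (px[a] * py[b]))
    return mi


def rank_features(table, labels, k):
    """Return the k feature names with the highest mutual information with labels."""
    scores = {name: mutual_information(values, labels) for name, values in table.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical toy data: presence (1) or absence (0) of three OTUs in six samples.
otu_table = {
    "OTU_A": [1, 1, 1, 0, 0, 0],  # perfectly tracks the phenotype
    "OTU_B": [1, 0, 1, 0, 1, 0],  # only weakly related to the phenotype
    "OTU_C": [1, 1, 1, 1, 1, 1],  # constant, so it carries no information
}
phenotype = ["infant", "infant", "infant", "adult", "adult", "adult"]

top = rank_features(otu_table, phenotype, k=2)
print(top)
```

Scoring each feature independently keeps the sketch short, but it ignores redundancy between features; subset selection methods that account for redundancy would score a candidate feature jointly with those already selected.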
SC004335 / DoE; United States Department of Energy (DOE)
1120622 / NSF; National Science Foundation (NSF)
Drexel's University Research Computing Facility
Resource Type
Journal article
Language
English
Academic Unit
Electrical and Computer Engineering
Web of Science ID
WOS:000364115500001
Scopus ID
2-s2.0-84946416256
Other Identifier
991019169598304721