Information Theoretic Feature Selection for High Dimensional Metagenomic Data

Gregory Ditzler; Gail Rosen; Robi Polikar; IEEE

doi:10.1109/GENSIPS.2012.6507749

Conference proceeding

2012 IEEE INTERNATIONAL WORKSHOP ON GENOMIC SIGNAL PROCESSING AND STATISTICS (GENSIPS), pp 143-146

01 Jan 2012

Featured in Collection : UN Sustainable Development Goals @ Drexel

Subjects: Life Sciences & Biomedicine; Mathematical & Computational Biology; Mathematics; Mathematics, Applied; Physical Sciences; Science & Technology

Abstract

Extremely high dimensional data sets are common in genomic classification scenarios, but they are particularly prevalent in metagenomic studies that represent samples as abundances of taxonomic units. Furthermore, the data dimensionality is typically much larger than the number of observations collected, a phenomenon known as the curse of dimensionality and a particularly challenging problem for most machine learning algorithms. The biologists collecting and analyzing data need efficient methods to determine relationships between classes in a data set and the variables that are capable of differentiating between multiple groups in a study. The most common methods of metagenomic data analysis are those characterized by alpha- and beta-diversity tests; however, neither of these tests allows scientists to identify the organisms that are most responsible for differentiating between different categories in a study. In this paper, we present an analysis of information theoretic feature selection methods for improving the classification accuracy with metagenomic data.
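
The abstract does not name the specific methods analyzed, so the following is only a minimal sketch of one generic information theoretic filter: ranking features by their empirical mutual information with the class label. The presence/absence discretization, the function names (`mutual_information`, `discretize`, `rank_features`), and the toy abundance table are all assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information I(X;Y) in bits for two discrete sequences."""
    n = len(xs)
    px = Counter(xs)            # marginal counts of X
    py = Counter(ys)            # marginal counts of Y
    pxy = Counter(zip(xs, ys))  # joint counts of (X, Y)
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


def discretize(values, threshold=0.0):
    """Binarize abundances into absent (0) / present (1).

    The threshold is an assumption; real studies may bin abundances differently.
    """
    return [1 if v > threshold else 0 for v in values]


def rank_features(samples, labels, k):
    """Return indices of the k features with the highest MI with the labels."""
    n_features = len(samples[0])
    scores = []
    for j in range(n_features):
        column = discretize([row[j] for row in samples])
        scores.append((mutual_information(column, labels), j))
    # Sort by descending score; break ties by feature index for determinism.
    scores.sort(key=lambda t: (-t[0], t[1]))
    return [j for _, j in scores[:k]]


# Toy abundance table (hypothetical): rows are samples, columns are taxa.
samples = [
    [5, 0, 2],
    [7, 0, 0],
    [0, 3, 1],
    [0, 4, 0],
]
labels = ["A", "A", "B", "B"]
top = rank_features(samples, labels, k=2)
print(top)
```

In this toy table the first two taxa each separate the classes perfectly, while the third is present in one sample of each class and so carries no information about the label. A filter of this kind scores each feature independently of the others, which keeps it cheap when features far outnumber samples but ignores redundancy between features.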

Metrics

2 citations in Web of Science

7 citations in Scopus

Details

Title: Information Theoretic Feature Selection for High Dimensional Metagenomic Data
Creators: Gregory Ditzler - Drexel University
Gail Rosen - Drexel University
Robi Polikar - Rowan University
IEEE
Publication Details: 2012 IEEE INTERNATIONAL WORKSHOP ON GENOMIC SIGNAL PROCESSING AND STATISTICS (GENSIPS), pp 143-146
Series: IEEE International Workshop on Genomic Signal Processing and Statistics
Publisher: IEEE
Number of pages: 4
Resource Type: Conference proceeding
Language: English
Academic Unit: Electrical and Computer Engineering
Web of Science ID: WOS:000323215000038
Scopus ID: 2-s2.0-84877822554
Other Identifier: 991019170331404721

InCites Highlights

Data related to this publication, from InCites Benchmarking & Analytics tool:

Web of Science research areas: Mathematical & Computational Biology; Mathematics, Applied
