Journal article
A Sequential Learning Approach for Scaling Up Filter-Based Feature Subset Selection
IEEE Transactions on Neural Networks and Learning Systems, v 29(6), pp 2530-2544
Jun 2018
PMID: 28504951
Featured in Collection: UN Sustainable Development Goals @ Drexel
Abstract
Increasingly, many machine learning applications are now associated with very large data sets whose sizes were almost unimaginable just a short time ago. As a result, many of the current algorithms cannot handle, or do not scale to, today's extremely large volumes of data. Fortunately, not all features that make up a typical data set carry information that is relevant or useful for prediction, and identifying and removing such irrelevant features can significantly reduce the total data size. The unfortunate dilemma, however, is that some of the current data sets are so large that common feature selection algorithms, whose very goal is to reduce the dimensionality, cannot handle such large data sets, creating a vicious cycle. We describe a sequential learning framework for feature subset selection (SLSS) that can scale with both the number of features and the number of observations. The proposed framework uses multi-armed bandit algorithms to sequentially search a subset of variables and assign a level of importance to each feature. The novel contribution of SLSS is its ability to naturally scale to large data sets, evaluate such data in a very small amount of time, and be performed independently of the optimization of any classifier to reduce unnecessary complexity. We demonstrate the capabilities of SLSS on synthetic and real-world data sets.
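The abstract describes the bandit-based search only at a high level. As a rough illustration of the general idea, and not the authors' SLSS implementation, the Python sketch below treats each feature as an arm of an epsilon-greedy bandit, evaluates a small subset of features on a small batch of observations in each round, and keeps a running mean reward per feature as its importance. The function name `bandit_feature_importance`, the epsilon-greedy policy, the absolute Pearson correlation used as the filter score, and all parameter values are assumptions made for this example.

```python
import numpy as np

def bandit_feature_importance(X, y, n_rounds=300, subset_size=5,
                              batch_size=100, epsilon=0.2, seed=0):
    """Illustrative epsilon-greedy sketch: one bandit arm per feature.

    Hypothetical helper, not the published SLSS algorithm.
    """
    rng = np.random.default_rng(seed)
    n_obs, n_feat = X.shape
    pulls = np.zeros(n_feat)   # number of times each feature was evaluated
    value = np.zeros(n_feat)   # running mean reward (importance) per feature

    for _ in range(n_rounds):
        if rng.random() < epsilon:
            # Explore: evaluate a uniformly random subset of features.
            arms = rng.choice(n_feat, size=subset_size, replace=False)
        else:
            # Exploit: evaluate the currently best features; tiny jitter breaks ties.
            jitter = 1e-9 * rng.random(n_feat)
            arms = np.argsort(-(value + jitter))[:subset_size]

        # Touch only a small batch of rows and the chosen columns.
        rows = rng.choice(n_obs, size=min(batch_size, n_obs), replace=False)
        Xb, yb = X[rows][:, arms], y[rows]

        # Filter score (assumption): absolute Pearson correlation with the label.
        Xc = Xb - Xb.mean(axis=0)
        yc = yb - yb.mean()
        denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum()) + 1e-12
        reward = np.abs(Xc.T @ yc) / denom

        # Incremental mean update for the evaluated arms only.
        pulls[arms] += 1
        value[arms] += (reward - value[arms]) / pulls[arms]

    return value

# Usage on synthetic data: only features 0 and 1 determine the label.
data_rng = np.random.default_rng(42)
X_demo = data_rng.normal(size=(2000, 20))
y_demo = (X_demo[:, 0] + X_demo[:, 1] > 0).astype(float)
importance = bandit_feature_importance(X_demo, y_demo)
print(np.argsort(-importance)[:5])  # indices of the five highest-ranked features
```

Each round in this sketch reads only `batch_size` rows and `subset_size` columns, so the per-round cost does not depend on the full size of the data set, and no classifier is trained at any point. Other bandit policies or filter scores could be substituted without changing the overall structure.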
Details
- Title
- A Sequential Learning Approach for Scaling Up Filter-Based Feature Subset Selection
- Creators
- Gregory Ditzler - Drexel University
- Robi Polikar - Rowan University
- Gail Rosen - Drexel University
- Publication Details
- IEEE Transactions on Neural Networks and Learning Systems, v 29(6), pp 2530-2544
- Publisher
- IEEE
- Grant note
- #1429467; #1310496; #1120622 / NSF (10.13039/100000148)
- Resource Type
- Journal article
- Language
- English
- Academic Unit
- Electrical and Computer Engineering
- Web of Science ID
- WOS:000432398300039
- Scopus ID
- 2-s2.0-85018915227
- Other Identifier
- 991019168655904721
InCites Highlights
Data related to this publication, from the InCites Benchmarking & Analytics tool:
- Collaboration types
- Domestic collaboration
- Web of Science research areas
- Computer Science, Artificial Intelligence
- Computer Science, Hardware & Architecture
- Computer Science, Theory & Methods
- Engineering, Electrical & Electronic