Journal article
Study design in high-dimensional classification analysis
Biostatistics (Oxford, England), v 17(4), pp 722-736
01 Oct 2016
PMID: 27154835
Featured in Collection: UN Sustainable Development Goals @ Drexel
Abstract
Advances in high throughput technology have accelerated the use of hundreds to millions of biomarkers to construct classifiers that partition patients into different clinical conditions. Prior to classifier development in actual studies, a critical need is to determine the sample size required to reach a specified classification precision. We develop a systematic approach for sample size determination in high-dimensional (large p small n) classification analysis. Our method utilizes the probability of correct classification (PCC) as the optimization objective function and incorporates the higher criticism thresholding procedure for classifier development. Further, we derive the theoretical bound of maximal PCC gain from feature augmentation (e.g. when molecular and clinical predictors are combined in classifier development). Our methods are motivated and illustrated by a study using proteomics markers to classify post-kidney transplantation patients into stable and rejecting classes.
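As a rough illustration of the design question in the abstract, the Python sketch below estimates PCC by Monte Carlo for a few candidate sample sizes, selecting features with a higher criticism (HC) threshold. It is not the authors' method or code. The two-class Gaussian model with unit variance, the sign-weighted linear rule, the particular form of the HC objective, the search fraction alpha0 = 0.1, and all function and parameter names (hc_threshold, simulate_pcc, smallest_n, delta, n_signal) are assumptions made for this example only.

```python
import math
import numpy as np


def hc_threshold(z, alpha0=0.1):
    """Return the |z| cutoff picked by a higher criticism (HC) objective.

    Assumption: uses one common form of the objective,
    sqrt(p) * (i/p - p_(i)) / sqrt((i/p) * (1 - i/p)),
    maximised over the alpha0 fraction of smallest p-values. Requires p >= 2.
    """
    p = z.size
    absz = np.abs(z)
    # Two-sided p-values under N(0, 1): P(|Z| > a) = erfc(a / sqrt(2)).
    pvals = np.array([math.erfc(a / math.sqrt(2.0)) for a in absz])
    order = np.argsort(pvals)
    imax = min(max(1, int(alpha0 * p)), p - 1)
    frac = np.arange(1, imax + 1) / p
    hc = math.sqrt(p) * (frac - pvals[order][:imax]) / np.sqrt(frac * (1.0 - frac))
    k = int(np.argmax(hc))
    return float(absz[order][k])


def simulate_pcc(n_per_class, p=1000, n_signal=20, delta=0.8,
                 n_test=500, n_rep=20, seed=0):
    """Monte Carlo estimate of PCC for a given training size per class.

    Assumed toy model: features are independent N(+mu, 1) in one class and
    N(-mu, 1) in the other; only the first n_signal features differ, by delta.
    """
    rng = np.random.default_rng(seed)
    mu = np.zeros(p)
    mu[:n_signal] = delta / 2.0
    pccs = []
    for _ in range(n_rep):
        x1 = rng.normal(mu, 1.0, size=(n_per_class, p))
        x0 = rng.normal(-mu, 1.0, size=(n_per_class, p))
        # Two-sample z-scores, treating the unit variance as known.
        z = (x1.mean(axis=0) - x0.mean(axis=0)) / math.sqrt(2.0 / n_per_class)
        cutoff = hc_threshold(z)
        # Keep features at or above the cutoff, weighted by the sign of z.
        w = np.where(np.abs(z) >= cutoff, np.sign(z), 0.0)
        mid = (x1.mean(axis=0) + x0.mean(axis=0)) / 2.0
        # Fresh test data from the same model.
        t1 = rng.normal(mu, 1.0, size=(n_test, p))
        t0 = rng.normal(-mu, 1.0, size=(n_test, p))
        correct = np.sum((t1 - mid) @ w > 0) + np.sum((t0 - mid) @ w < 0)
        pccs.append(correct / (2.0 * n_test))
    return float(np.mean(pccs))


def smallest_n(target_pcc, candidates=(5, 10, 20, 40, 80), **kwargs):
    """Return the first candidate size whose simulated PCC meets the target."""
    for n in candidates:
        if simulate_pcc(n, **kwargs) >= target_pcc:
            return n
    return None
```

Calling `smallest_n(0.9, p=500, n_signal=10, delta=1.0)` scans the candidate sizes and returns the first one whose simulated PCC reaches 0.9, or `None` if none does. The answer depends entirely on the assumed effect size and number of informative features, which a real study would have to take from pilot data.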
Details
- Title
- Study design in high-dimensional classification analysis
- Creators
- Brisa N. Sanchez - University of Michigan; Meihua Wu - Gilead Sciences (United States); Peter X. K. Song - University of Michigan; Wen Wang - University of Michigan
- Publication Details
- Biostatistics (Oxford, England), v 17(4), pp 722-736
- Publisher
- Oxford Univ Press
- Number of pages
- 15
- Grant note
- U54DK083912 / NATIONAL INSTITUTE OF DIABETES AND DIGESTIVE AND KIDNEY DISEASES; United States Department of Health & Human Services; National Institutes of Health (NIH) - USA; NIH National Institute of Diabetes & Digestive & Kidney Diseases (NIDDK)
- DMS-1513595 / NSF; National Science Foundation (NSF)
- R21DA024273; U54-DK-083912-05 / NIH; United States Department of Health & Human Services; National Institutes of Health (NIH) - USA
- P30ES017885 / NATIONAL INSTITUTE OF ENVIRONMENTAL HEALTH SCIENCES; United States Department of Health & Human Services; National Institutes of Health (NIH) - USA; NIH National Institute of Environmental Health Sciences (NIEHS)
- R21DA024273 / NATIONAL INSTITUTE ON DRUG ABUSE; United States Department of Health & Human Services; National Institutes of Health (NIH) - USA; NIH National Institute on Drug Abuse (NIDA); European Commission
- Resource Type
- Journal article
- Language
- English
- Academic Unit
- Urban Health Collaborative; Epidemiology and Biostatistics
- Web of Science ID
- WOS:000386970600009
- Scopus ID
- 2-s2.0-84995542942
- Other Identifier
- 991020100065504721
InCites Highlights
Data related to this publication, from InCites Benchmarking & Analytics tool:
- Collaboration types
- Industry collaboration
- Domestic collaboration
- Web of Science research areas
- Mathematical & Computational Biology
- Statistics & Probability