Impact of sample size on false alarm and missed detection rates in PCA-based anomaly detection
Conference proceeding

Ni An and Steven Weber
2017 51st Annual Conference on Information Sciences and Systems (CISS), pp 1-6
Mar 2017

Abstract

Computer Science, Information Systems; Engineering, Electrical & Electronic; Science & Technology; Eigenvalues and eigenfunctions; anomaly detection; Gaussian distribution; Computer Science; Engineering; Technology
Principal component analysis (PCA) is widely used for anomaly detection, specifically by computing the distances of each point in the dataset to the (sample) subspace spanned by the principal components of the sample covariance matrix. Points with distances above a threshold are labeled as anomalous. Although it is typically unknown in practice, the distribution from which the sample is drawn has its own (population) subspace, spanned by the principal components of the distribution's covariance matrix, and the thresholded distances of each point in the dataset from this subspace determine the true labels of the points. It follows that the sample covariance matrix produces false alarms (points labeled anomalous that are not) and missed detections (points labeled non-anomalous that are). In this paper we study how the false alarm rate (FAR) and missed detection rate (MDR) depend upon the number of samples, focusing on the case where the sample points are drawn from a normal distribution with a covariance matrix having a single spike.
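
The setup described in the abstract can be illustrated with a small simulation. The following is only a minimal sketch under stated assumptions, not the authors' method or code: the dimension `p`, spike eigenvalue `spike`, threshold `tau`, and the function names are all hypothetical choices, the mean is taken as known and zero, and FAR and MDR are normalized here by the counts of truly non-anomalous and truly anomalous points, which may differ from the paper's definitions.

```python
import numpy as np


def residual_distance(X, basis):
    """Distance from each row of X to the subspace spanned by the
    orthonormal columns of `basis`."""
    projection = X @ basis @ basis.T
    return np.linalg.norm(X - projection, axis=1)


def far_mdr(n, p=10, spike=5.0, tau=4.0, seed=0):
    """Estimate FAR and MDR for one sample of size n (illustrative only).

    Assumptions (not from the source): zero-mean normal data, population
    covariance diag(spike, 1, ..., 1), a one-dimensional principal
    subspace, and an arbitrary fixed distance threshold `tau`.
    """
    rng = np.random.default_rng(seed)

    # Population subspace: the single spiked direction e_1.
    u = np.zeros((p, 1))
    u[0, 0] = 1.0

    # Draw n points with covariance diag(spike, 1, ..., 1).
    eigvals = np.ones(p)
    eigvals[0] = spike
    X = rng.standard_normal((n, p)) * np.sqrt(eigvals)

    # Sample subspace: top eigenvector of the sample covariance matrix.
    S = X.T @ X / n
    _, V = np.linalg.eigh(S)  # eigenvalues returned in ascending order
    u_hat = V[:, [-1]]

    # True labels use the population subspace, assigned labels the sample one.
    true_anom = residual_distance(X, u) > tau
    pred_anom = residual_distance(X, u_hat) > tau

    false_alarms = np.sum(pred_anom & ~true_anom)
    missed = np.sum(~pred_anom & true_anom)
    far = false_alarms / max(np.sum(~true_anom), 1)
    mdr = missed / max(np.sum(true_anom), 1)
    return far, mdr


for n in (20, 100, 1000, 10000):
    far, mdr = far_mdr(n)
    print(f"n={n:6d}  FAR={far:.4f}  MDR={mdr:.4f}")
```

Running the loop for increasing `n` gives a rough feel for how both rates change as the sample subspace approaches the population subspace; a single seed per `n` is noisy, so averaging over many seeds would be needed for anything beyond illustration.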

Metrics

10 Record Views
7 citations in Scopus

Details

UN Sustainable Development Goals (SDGs)

This publication has contributed to the advancement of the following goals:

#6 Clean Water and Sanitation

InCites Highlights

Data related to this publication, from InCites Benchmarking & Analytics tool:

Web of Science research areas
Computer Science, Information Systems
Engineering, Electrical & Electronic