Principal component analysis (PCA) is both a fundamental dimensionality reduction method and a widely used network anomaly detection technique. Traditionally, PCA is performed in a centralized manner, which scales poorly to large distributed systems because gathering the distributed state at a fusion center consumes substantial network bandwidth. Consequently, several recent works have proposed distributed PCA algorithms that aim to reduce the communication overhead of PCA without losing its inferential power. This paper evaluates the tradeoff between communication cost and solution quality for two distributed PCA algorithms on a real domain name system (DNS) query dataset from a large network. We also apply distributed PCA to network anomaly detection and demonstrate that both distributed PCA-based methods suffer little degradation in detection accuracy while achieving significant savings in communication bandwidth.
Details
Title
On the performance overhead tradeoff of distributed principal component analysis via data partitioning
Creators
Ni An - Drexel University
Steven Weber - Drexel University
Publication Details
2016 Annual Conference on Information Science and Systems (CISS), pp 578-583
Conference
2016 Annual Conference on Information Science and Systems (CISS) (Princeton, NJ, 16 Mar 2016–18 Mar 2016)
Publisher
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Number of pages
6
Resource Type
Conference proceeding
Language
English
Academic Unit
Electrical and Computer Engineering
Web of Science ID
WOS:000386277800102
Scopus ID
2-s2.0-84992411225
Other Identifier
991019170442304721