On the performance overhead tradeoff of distributed principal component analysis via data partitioning
Conference proceeding   Open access

Ni An and Steven Weber
2016 Annual Conference on Information Science and Systems (CISS), pp 578-583
01 Mar 2016
URL: https://ieeexplore.ieee.org/document/7460567
Published, Version of Record (VoR); Open Access (License Unspecified)

Abstract

Bandwidths, Communication, Cost control, Algorithms
Principal component analysis (PCA) is not only a fundamental dimension reduction method, but is also a widely used network anomaly detection technique. Traditionally, PCA is performed in a centralized manner, which scales poorly for large distributed systems because of the large network bandwidth required to gather the distributed state at a fusion center. Consequently, several recent works have proposed distributed PCA algorithms that aim to reduce the communication overhead incurred by PCA without losing its inferential power. This paper evaluates the tradeoff between communication cost and solution quality of two distributed PCA algorithms on a real domain name system (DNS) query dataset from a large network. We also apply both distributed PCA algorithms to network anomaly detection and demonstrate that they incur little degradation in detection accuracy while achieving significant savings in communication bandwidth.
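
As a rough illustration of the tradeoff the abstract describes, the sketch below compares centralized PCA with one generic distributed scheme in which each node sends only a rank-k summary of its data to the fusion center. It is not taken from the paper and does not reproduce either of the two algorithms evaluated there. The row-wise partitioning, the helper names (`local_summary`, `fuse`, `centralized_pca`, `residual_energy`), the synthetic data, and the assumption that the data is already zero-mean are all hypothetical choices made for illustration.

```python
import numpy as np

def local_summary(X_i, k):
    """Node side (hypothetical): top-k right singular vectors of the local
    block, scaled by their singular values. Returns a k x d matrix."""
    _, s, Vt = np.linalg.svd(X_i, full_matrices=False)
    return s[:k, None] * Vt[:k]

def fuse(summaries, k):
    """Fusion center (hypothetical): stack the local summaries and keep the
    top-k right singular vectors as a d x k orthonormal basis."""
    _, _, Vt = np.linalg.svd(np.vstack(summaries), full_matrices=False)
    return Vt[:k].T

def centralized_pca(X, k):
    """Baseline: PCA on the full data gathered at the fusion center."""
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt[:k].T

def residual_energy(X, V):
    """Squared norm of each row's projection onto the residual subspace,
    a common score in PCA-based anomaly detection."""
    R = X - X @ V @ V.T
    return np.sum(R ** 2, axis=1)

# Synthetic zero-mean data (assumption): a rank-3 signal plus small noise.
rng = np.random.default_rng(0)
n, d, k, m = 2000, 50, 3, 10          # samples, features, components, nodes
basis = np.linalg.qr(rng.standard_normal((d, k)))[0]
scores = rng.standard_normal((n, k)) * np.array([10.0, 6.0, 3.0])
X = scores @ basis.T + 0.1 * rng.standard_normal((n, d))

# Each node holds a block of rows and transmits only its k x d summary.
parts = np.array_split(X, m)
V_dist = fuse([local_summary(P, k) for P in parts], k)
V_cent = centralized_pca(X, k)

# Solution quality: cosines of the principal angles between the subspaces.
cosines = np.linalg.svd(V_cent.T @ V_dist, compute_uv=False)

# Communication cost, counted as numbers sent to the fusion center.
floats_central = n * d
floats_dist = m * k * d

# Anomaly check: a point far from the principal subspace scores highly.
x_anom = 5.0 * rng.standard_normal((1, d))
spe_normal = residual_energy(X, V_dist)
spe_anom = residual_energy(x_anom, V_dist)[0]

print("min cosine of principal angles:", cosines.min())
print("numbers sent (centralized vs distributed):", floats_central, floats_dist)
print("anomaly residual vs largest normal residual:", spe_anom, spe_normal.max())
```

In this toy setting the communication count depends only on the number of nodes, the number of components, and the feature dimension, not on the number of samples. Real data with weaker spectral gaps or non-uniform partitions may behave quite differently.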

Metrics

4 Record Views
2 citations in Scopus

Details

InCites Highlights

Data related to this publication, from InCites Benchmarking & Analytics tool:

Web of Science research areas
Computer Science, Information Systems
Information Science & Library Science