Journal article
Towards effective document clustering: A constrained K-means based approach
Information processing & management, v 44(4), pp 1397-1409
2008
Abstract
Document clustering is an important tool for organizing and browsing document collections. In real applications, limited knowledge about the cluster membership of a small number of documents is often available, such as pairs of documents known to belong to the same cluster. This kind of prior knowledge can serve as constraints for the clustering process. We integrate the constraints into the trace formulation of the sum of squared Euclidean distances function of K-means. Then, the combined criterion function is transformed into a trace maximization problem, which is further optimized by eigen-decomposition. Our experimental evaluation shows that the proposed semi-supervised clustering method achieves better performance than three existing methods.
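The abstract does not give the authors' exact criterion, so the following is only a minimal sketch of the general idea, under stated assumptions. It relies on the standard spectral relaxation of K-means, in which minimizing the sum of squared distances is equivalent to maximizing trace(Y^T X X^T Y) over orthonormal Y. The way constraints enter here (adding a symmetric must-link matrix with a hypothetical `weight` parameter to the Gram matrix) and the final discretization step are assumptions for illustration, not the published method. The function names are likewise hypothetical.

```python
import numpy as np

def constrained_spectral_embedding(X, must_link, k, weight=1.0):
    """Relaxed constrained K-means embedding (illustrative assumption).

    X         : (n_docs, n_features) array, one document vector per row
    must_link : iterable of (i, j) index pairs assumed to share a cluster
    k         : number of clusters
    weight    : hypothetical reward added for each must-link pair
    """
    n = X.shape[0]
    gram = X @ X.T                      # relaxed K-means maximizes trace(Y^T gram Y)
    W = np.zeros((n, n))
    for i, j in must_link:              # assumed encoding of pairwise constraints
        W[i, j] = W[j, i] = weight
    # The combined matrix is symmetric, so eigh applies; eigenvalues come back ascending.
    _, eigvecs = np.linalg.eigh(gram + W)
    return eigvecs[:, -k:]              # eigenvectors of the k largest eigenvalues

def kmeans_labels(Y, k, n_iter=50):
    """Discretize the embedding with plain Lloyd iterations.

    Uses deterministic farthest-first initialization starting from row 0.
    """
    centers = [Y[0]]
    for _ in range(1, k):
        d = np.min([np.sum((Y - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(Y[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = ((Y[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):     # leave empty clusters where they are
                centers[c] = Y[labels == c].mean(axis=0)
    return labels

# Toy usage: four 2-D "documents", with documents 0 and 1 constrained together.
X = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
Y = constrained_spectral_embedding(X, must_link=[(0, 1)], k=2)
labels = kmeans_labels(Y, k=2)
```

In this sketch, a larger `weight` pulls constrained documents closer together in the embedding; how the paper actually weights or normalizes its constraints is not stated in the abstract.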
Details
- Title
- Towards effective document clustering: A constrained K-means based approach
- Creators
- Guobiao Hu - Fudan University; Shuigeng Zhou - Fudan University; Jihong Guan - Tongji University; Xiaohua Hu - Drexel University
- Publication Details
- Information processing & management, v 44(4), pp 1397-1409
- Publisher
- Elsevier
- Resource Type
- Journal article
- Language
- English
- Academic Unit
- Information Science
- Web of Science ID
- WOS:000257276800001
- Scopus ID
- 2-s2.0-44449138321
- Other Identifier
- 991019167431604721
InCites Highlights
Data related to this publication, from InCites Benchmarking & Analytics tool:
- Collaboration types
- Domestic collaboration
- International collaboration
- Web of Science research areas
- Computer Science, Information Systems
- Information Science & Library Science