Web Clustering based on the Information of Sibling Pages

Caimei Lu; Xiaodan Zhang; Jung-ran Park; Xiaohua Hu; Tingting He

doi:10.1109/GRC.2008.4664743

Conference proceeding

2008 IEEE INTERNATIONAL CONFERENCE ON GRANULAR COMPUTING, VOLS 1 AND 2, pp 480-485

01 Jan 2008

DOI: https://doi.org/10.1109/GRC.2008.4664743

Abstract

Computer Science, Artificial Intelligence

Computer Science, Interdisciplinary Applications

Computer Science, Theory & Methods

Science & Technology

Computer Science

Technology

This paper investigates the value of information from sibling pages for web page clustering. We use a link-based clustering algorithm to examine the usefulness of sibling links for improving clustering quality. The algorithm is extended with two types of edge-weighting techniques. Experiments on the WebKB4 dataset show that: (1) using information from sibling pages can significantly improve clustering quality; (2) sibling pages are more useful than parent and child pages in enhancing clustering performance; (3) weighting and pruning sibling links cannot improve clustering quality. We also conducted an experiment on the citation dataset Cora7. The results indicate that sibling links are not more useful than direct citation links for clustering collections of research papers.
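
The abstract does not define sibling links or the weighting and pruning schemes, so the following is only an illustrative sketch, not the authors' algorithm. It assumes that two pages are siblings when some common parent page links to both, weights each sibling pair by its number of shared parents, and prunes pairs below a threshold. The names `sibling_links`, `prune`, and `min_shared`, and the toy link data, are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def sibling_links(out_links):
    """Map each sibling pair (a, b) to the number of parents linking to both.

    out_links: dict of parent page -> iterable of pages it links to.
    Assumption: "sibling" means "shares at least one parent page".
    """
    weights = defaultdict(int)
    for parent, children in out_links.items():
        # Ignore self-links and duplicate links; sort so each pair has one key.
        unique_children = sorted(set(children) - {parent})
        for a, b in combinations(unique_children, 2):
            weights[(a, b)] += 1
    return dict(weights)


def prune(weights, min_shared=2):
    """Keep only sibling pairs that share at least min_shared parents."""
    return {pair: w for pair, w in weights.items() if w >= min_shared}


# Toy hyperlink graph (hypothetical page names).
links = {
    "dept": ["course1", "course2", "faculty1"],
    "index": ["course1", "course2"],
}
siblings = sibling_links(links)
strong = prune(siblings, min_shared=2)
```

In this toy graph, `course1` and `course2` share two parents and survive pruning, while the pairs involving `faculty1` share only one parent and are dropped. A link-based clustering algorithm could then take the weighted pairs as edges of an undirected graph.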

Metrics

9 Record Views

4 citations in Web of Science

6 citations in Scopus

Details

Title: Web Clustering based on the Information of Sibling Pages
Creators: Caimei Lu - Drexel University
Xiaodan Zhang - Drexel University
Jung-ran Park - Drexel University
Xiaohua Hu - Drexel University
Tingting He - Central China Normal University
Publication Details: 2008 IEEE INTERNATIONAL CONFERENCE ON GRANULAR COMPUTING, VOLS 1 AND 2, pp 480-485
Publisher: IEEE
Number of pages: 2
Resource Type: Conference proceeding
Language: English
Academic Unit: Information Science
Web of Science ID: WOS:000263829500109
Scopus ID: 2-s2.0-57949107288
Other Identifier: 991019168574104721

InCites Highlights

Data related to this publication, from InCites Benchmarking & Analytics tool:

Collaboration types: Domestic collaboration; International collaboration
Web of Science research areas: Computer Science, Artificial Intelligence; Computer Science, Interdisciplinary Applications; Computer Science, Theory & Methods
