Extracting a website's content structure from its link structure

Nan Liu; Christopher C. Yang

doi:10.1145/1099554.1099660

Conference proceeding

Proceedings of the 14th ACM international conference on Information and knowledge management, pp 345-346

31 Oct 2005

DOI: https://doi.org/10.1145/1099554.1099660

Abstract

Information systems -- Information retrieval

Information systems -- Information retrieval -- Information retrieval query processing

Information systems -- Information retrieval -- Retrieval models and ranking

Hierarchical models are commonly used to organize a Website's content. A Website's content structure can be represented by a topic hierarchy, a directed tree rooted at the Website's homepage in which the vertices and edges correspond to Web pages and hyperlinks. In this work, we propose an algorithm for extracting a Website's topic hierarchy from its link structure. The proposed algorithm consists of a construction stage and a refining stage, in which we analyze the semantic relationships between Web pages based on link structure, Web page content, and directory structure. We conducted extensive experiments on different Websites and obtained very promising results.
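
To illustrate the data structure the abstract describes (a directed tree rooted at the homepage, with pages as vertices and hyperlinks as edges), here is a minimal sketch. It is not the authors' algorithm: it uses a plain breadth-first spanning tree as a stand-in for a construction stage and omits any refining stage. The function name `bfs_topic_tree` and the toy link graph are hypothetical, chosen only for illustration.

```python
from collections import deque


def bfs_topic_tree(links, homepage):
    """Return {page: parent} for a breadth-first spanning tree rooted at homepage.

    `links` maps each page to the list of pages it links to.
    Assumption: the first page to reach a target (along a shortest
    click path from the homepage) is taken as its parent. The paper's
    method may choose parents differently.
    """
    parent = {homepage: None}
    queue = deque([homepage])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in parent:  # keep only the first discovery
                parent[target] = page
                queue.append(target)
    return parent


# Hypothetical toy site: link graph with a back-link and a cross-link.
links = {
    "/": ["/products/", "/about/"],
    "/products/": ["/products/widget", "/"],
    "/about/": ["/products/widget", "/about/team"],
}
tree = bfs_topic_tree(links, "/")
```

In this sketch the cross-link from `/about/` to `/products/widget` is dropped because `/products/` reaches that page first. A refining stage of the kind the abstract mentions could revisit such choices using page content or directory structure.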

Details

Title: Extracting a website's content structure from its link structure
Creators: Nan Liu - Chinese University of Hong Kong; Christopher C. Yang - Chinese University of Hong Kong
Publication Details: Proceedings of the 14th ACM international conference on Information and knowledge management, pp 345-346
Conference: CIKM05: Conference on Information and Knowledge Management (Bremen, Germany, 31 Oct 2005–05 Nov 2005)
Series: ACM Conferences
Publisher: ACM
Number of pages: 2
Resource Type: Conference proceeding
Language: English
Academic Unit: Information Science (Informatics)
Other Identifier: 991021861112804721
