The evaluation of sentence similarity measures

Palakorn Achananuparp; Xiaohua Hu; Xiajiong Shen

doi:10.1007/978-3-540-85836-2_29

Conference proceeding

Peer reviewed

DATA WAREHOUSING AND KNOWLEDGE DISCOVERY, PROCEEDINGS, v 5182, pp 305-316

01 Jan 2008

DOI: https://doi.org/10.1007/978-3-540-85836-2_29

Computer Science

Computer Science, Artificial Intelligence

Computer Science, Information Systems

Computer Science, Theory & Methods

Science & Technology

Technology

Abstract

The ability to accurately judge the similarity between natural language sentences is critical to the performance of several applications such as text mining, question answering, and text summarization. Given two sentences, an effective similarity measure should be able to determine whether the sentences are semantically equivalent or not, taking into account the variability of natural language expression. That is, the correct similarity judgment should be made even if the sentences do not share a similar surface form. In this work, we evaluate fourteen existing text similarity measures which have been used to calculate similarity scores between sentences in many text applications. The evaluation is conducted on three different data sets: the TREC9 question variants, the Microsoft Research paraphrase corpus, and the third Recognizing Textual Entailment data set.

Metrics

15 Record Views

121 citations in Web of Science

186 citations in Scopus

Details

Title: The evaluation of sentence similarity measures
Creators: Palakorn Achananuparp - Drexel University
Xiaohua Hu - Drexel University
Xiajiong Shen - College of Computer and Information Engineering, Henan University, Henan, China
Contributors: I Y Song (Editor)
J Eder (Editor)
T M Nguyen (Editor)
Publication Details: DATA WAREHOUSING AND KNOWLEDGE DISCOVERY, PROCEEDINGS, v 5182, pp 305-316
Series: Lecture Notes in Computer Science
Publisher: Springer Nature
Number of pages: 12
Grant note: 240205; 240196 / PA Dept of Health Tobacco Settlement Formula IIS 0448023; CCF 0514679 / NSF Career; National Science Foundation (NSF); NSF - Office of the Director (OD) 239667 / PA Dept of Health
Resource Type: Conference proceeding
Language: English
Academic Unit: Information Science
Web of Science ID: WOS:000259488400029
Scopus ID: 2-s2.0-52949135206
Other Identifier: 991019170380104721

InCites Highlights

Data related to this publication, from the InCites Benchmarking & Analytics tool:

Collaboration types: International collaboration
Web of Science research areas: Computer Science, Artificial Intelligence; Computer Science, Information Systems; Computer Science, Theory & Methods
