Semantic Analysis for Focused Multi-Document Summarization (fMDS) of Text
Conference proceeding

Quinsulon Israel, Hyoil Han and Il-Yeol Song
30th Annual ACM Symposium on Applied Computing, Vols I and II, v 13-17-, pp 339-344
01 Jan 2015

Abstract

Computer Science; Computer Science, Interdisciplinary Applications; Science & Technology; Technology
Large amounts of unstructured data are quickly and easily accessible in digital form, yet a human reader cannot 'ingest and digest' them as quickly. This information overload places too heavy a burden on society for its analysis and execution needs. Focused (e.g., topic-, query-, question-, or category-driven) multi-document summarization is an information reduction solution that has reached a mature state of the art and now demands further exploration of other techniques that model human summarization activity. Existing techniques have been mainly extractive and rely on distributional information and complex machine learning over corpora to approach human performance. Consequently, the field needs to move toward more abstractive approaches that model human ways of summarizing. This work presents a simple, inexpensive, and domain-independent system architecture that adds semantic analysis to the summarization process. Our system is novel in two respects. First, it uses a semantic cue words feature and semantic class weighting as a new semantic analysis metric for identifying sentences with important information. Second, it uses semantic triples clustering to decompose natural language sentences into their most basic meaning, which reduces the complexity of processing sentences and captures more semantically related information. Evaluated with the standardized summarization metric ROUGE against the gold standard baseline system from the Text Analysis Conference, this work outperforms the baseline by more than ten rankings. This work shows that semantic analysis and lightweight, open-domain techniques have potential.
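
For readers unfamiliar with ROUGE, the sketch below illustrates the general idea behind ROUGE-N recall: the fraction of reference n-grams that also appear in a candidate summary, with counts clipped. It is a simplified illustration and not the evaluation setup used in the paper. The function names, the whitespace tokenization, and the example sentences are assumptions made for this sketch; the official ROUGE toolkit adds options such as stemming and stopword handling that are omitted here.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams in a token list (hypothetical helper for this sketch)."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n=1):
    """Simplified ROUGE-N recall: clipped n-gram overlap / reference n-gram total.

    Assumes lowercase whitespace tokenization, which is cruder than the
    preprocessing in the official ROUGE toolkit.
    """
    cand = ngram_counts(candidate.lower().split(), n)
    ref = ngram_counts(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Each reference n-gram is matched at most as often as it occurs in the candidate.
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


if __name__ == "__main__":
    candidate = "the cat sat on the mat"
    reference = "the cat lay on the mat"
    print(rouge_n_recall(candidate, reference, n=1))
    print(rouge_n_recall(candidate, reference, n=2))
```

In the example, five of the six reference unigrams and three of the five reference bigrams are matched by the candidate, so the bigram score is lower than the unigram score.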

Details

InCites Highlights

Data related to this publication, from InCites Benchmarking & Analytics tool:

Collaboration types: Domestic collaboration
Web of Science research areas: Computer Science, Interdisciplinary Applications