Book chapter
Sentiment Classification with Supervised Sequence Embedding
Machine Learning and Knowledge Discovery in Databases, pp 159-174
2012
Abstract
In this paper, we introduce a novel approach for modeling n-grams in a latent space learned from supervised signals. The proposed procedure uses only unigram features to model short phrases (n-grams) in the latent space. The phrases are then combined to form a document-level latent representation for a given text, where the position of an n-gram in the document is used to compute the corresponding combining weight. The resulting two-stage supervised embedding is then coupled with a classifier to form an end-to-end system that we apply to the large-scale sentiment classification task. The proposed model does not require feature selection to retain effective features during pre-processing, and its parameter space grows linearly with the size of the n-gram. We present comparative evaluations of this method using two large-scale datasets for sentiment classification in online reviews (Amazon and TripAdvisor). The proposed method outperforms standard baselines that rely on a bag-of-words representation populated with n-gram features.
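The two-stage pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the projection, the nonlinearity, or the weighting scheme, so the `tanh` projection, the bucketing of n-gram positions into bins, and all names (`embed_document`, `classify`, `E`, `W`, `bin_weights`, `C`) are hypothetical. In a real system every parameter would be trained jointly from the sentiment labels; here they are random placeholders.

```python
import numpy as np

def embed_document(token_ids, E, W, b, bin_weights, n=3):
    """Hypothetical two-stage embedding: n-grams -> latent phrases -> document."""
    # Stage 1a: only unigram features are used, via an embedding lookup.
    words = E[np.asarray(token_ids)]              # shape (num_tokens, d)
    num_ngrams = len(token_ids) - n + 1
    if num_ngrams < 1:
        raise ValueError("document is shorter than n tokens")
    # Stage 1b: concatenate n consecutive word vectors for each window.
    windows = np.stack([words[i:i + n].reshape(-1) for i in range(num_ngrams)])
    # Project every window into the latent phrase space (assumed tanh layer).
    # W has shape (n * d, m), so its size grows linearly with n.
    phrases = np.tanh(windows @ W + b)            # shape (num_ngrams, m)
    # Stage 2: position-dependent combining weights. Assumption: positions
    # are bucketed into len(bin_weights) equal-width bins, one positive
    # weight per bin, normalized to sum to 1 over the document.
    num_bins = len(bin_weights)
    bins = (np.arange(num_ngrams) * num_bins) // num_ngrams
    weights = bin_weights[bins]
    weights = weights / weights.sum()
    return weights @ phrases                      # shape (m,)

def classify(doc_vec, C, c):
    """Linear classifier on top of the document embedding (assumed form)."""
    scores = doc_vec @ C + c
    return int(np.argmax(scores))

# Toy usage with random, untrained parameters.
rng = np.random.default_rng(0)
vocab, d, m, n, num_classes = 50, 4, 5, 3, 2
E = rng.normal(size=(vocab, d))                   # unigram embedding table
W = rng.normal(size=(n * d, m))                   # n-gram projection
b = np.zeros(m)
bin_weights = np.array([1.0, 2.0])                # e.g. weight late n-grams more
C = rng.normal(size=(m, num_classes))
c = np.zeros(num_classes)

token_ids = rng.integers(0, vocab, size=10)
doc_vec = embed_document(token_ids, E, W, b, bin_weights, n=n)
label = classify(doc_vec, C, c)
```

In this sketch the only component whose size depends on n is `W`, with `n * d * m` entries, which matches the abstract's statement that the parameter space grows linearly with the n-gram size. A bag-of-n-grams baseline, by contrast, needs one feature per distinct n-gram.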
Metrics
17 citations in Scopus
Details
- Title
- Sentiment Classification with Supervised Sequence Embedding
- Creators
- Dmitriy Bespalov - Drexel University; Yanjun Qi - NEC Labs America, Princeton NJ; Bing Bai - NEC Labs America, Princeton NJ; Ali Shokoufandeh - Drexel University
- Publication Details
- Machine Learning and Knowledge Discovery in Databases, pp 159-174
- Series
- Lecture Notes in Computer Science
- Publisher
- Springer Berlin Heidelberg; Berlin, Heidelberg
- Resource Type
- Book chapter
- Language
- English
- Academic Unit
- Computer Science
- Scopus ID
- 2-s2.0-84866865051
- Other Identifier
- 991019173947104721