Learning Focused Hierarchical Topic Models with Semi-Supervision in Microblogs
Conference proceeding   Peer reviewed

Anton Slutsky, Xiaohua Hu and Yuan An
Advances in Knowledge Discovery and Data Mining, Part II, vol. 9078, pp. 598-609
01 Jan 2015

Abstract

Computer Science; Computer Science, Artificial Intelligence; Computer Science, Information Systems; Computer Science, Theory & Methods; Science & Technology; Technology
Topic modeling approaches, such as Latent Dirichlet Allocation (LDA) and Hierarchical LDA (hLDA), have been used extensively to discover topics in various corpora. Unfortunately, these approaches do not perform well when applied to collections of social media posts. Further, they do not allow users to focus topic discovery around subjectively interesting concepts. We propose a new model, Semi-Supervised Microblog-hLDA (SS-Micro-hLDA), to discover topic hierarchies in short, noisy microblog documents in a way that allows users to focus topic discovery around areas of interest. We test SS-Micro-hLDA on a large, public collection of messages from Twitter and the Reddit social blogging site and show that our model outperforms hLDA, Constrained-hLDA, Recursive-rCRP and TSSB in terms of Pointwise Mutual Information (PMI) score. Further, we evaluate our model in terms of information entropy of held-out data and show that the new approach produces highly focused topic hierarchies.
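
The abstract names the PMI score as the comparison metric but does not define it. As an illustration only, the sketch below computes one common variant: the mean pairwise PMI of a topic's top words, with probabilities estimated from document-level co-occurrence in a reference corpus. The function name `pmi_score`, the smoothing constant `eps`, and the document-level estimate are assumptions made for this example; the paper's exact formulation may differ.

```python
import math
from itertools import combinations


def pmi_score(top_words, documents, eps=1e-12):
    """Mean pairwise PMI of a topic's top words (illustrative variant).

    top_words: list of words representing one topic.
    documents: list of tokenized documents (each a list of words).
    eps: small smoothing constant (an assumption of this sketch)
         that keeps the logarithm finite when words never co-occur.
    """
    if len(top_words) < 2:
        raise ValueError("need at least two top words")
    if not documents:
        raise ValueError("need at least one document")

    doc_sets = [set(doc) for doc in documents]
    n_docs = len(doc_sets)

    def prob(*words):
        # Fraction of documents that contain all of the given words.
        return sum(all(w in d for w in words) for d in doc_sets) / n_docs

    scores = []
    for w1, w2 in combinations(top_words, 2):
        joint = prob(w1, w2)
        independent = prob(w1) * prob(w2)
        scores.append(math.log((joint + eps) / (independent + eps)))
    return sum(scores) / len(scores)
```

Under this variant, words that co-occur more often than chance would predict score above zero, and words that never share a document score strongly negative, so a higher mean suggests a more coherent topic.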

Metrics

8 Record Views
1 citation in Scopus

Details

InCites Highlights

Data related to this publication, from InCites Benchmarking & Analytics tool:

Web of Science research areas
Computer Science, Artificial Intelligence
Computer Science, Information Systems
Computer Science, Theory & Methods