Conference proceeding
Semantic pattern mining for text mining
2016 IEEE International Conference on Big Data (Big Data)
Dec 2016
Abstract
Pattern mining is a fundamental topic in data mining. Many pattern mining techniques, such as closed and maximal pattern mining, have been proposed for different applications. However, when calculating the frequency of a pattern, existing techniques treat every occurrence of a word equally. For example, although the word 'pie' in 'I love eating pie.' is quite different from 'pie' in 'american pie', the occurrence in 'american pie' is still added to the count of 'pie' when its frequency is calculated. This paper therefore aims to overcome this drawback and find valid patterns tailored to text mining. We approach pattern mining from a different perspective and introduce a novel problem, frequent semantic pattern mining. We then propose an algorithm that solves this problem via suffix array sorting. The algorithm can be implemented to run in linear time. Compared with traditional pattern representations, our results show that the extracted semantic patterns are more than 13% more compact. In addition, a classifier built on these features is at least as powerful.
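The abstract does not spell out the algorithm, so the following is only a rough sketch of the general idea of counting word patterns with a suffix array, not the authors' method. It assumes a token-level suffix array built by plain sorting (which is not linear time), and the function names `build_suffix_array` and `count_occurrences` are hypothetical. It reproduces the 'pie' example: a plain frequency count of 'pie' also includes the occurrence inside 'american pie'.

```python
from typing import List, Sequence


def build_suffix_array(tokens: Sequence[str]) -> List[int]:
    """Return the start positions of all token suffixes in lexicographic order.

    Naive construction by direct sorting, for illustration only.
    This is NOT the linear-time construction the abstract refers to.
    """
    return sorted(range(len(tokens)), key=lambda i: list(tokens[i:]))


def count_occurrences(tokens: Sequence[str], sa: List[int],
                      pattern: Sequence[str]) -> int:
    """Count occurrences of `pattern` via two binary searches on the suffix array."""
    target = list(pattern)
    m = len(target)

    def prefix(i: int) -> List[str]:
        # First m tokens of the suffix starting at position i.
        return list(tokens[i:i + m])

    # Lower bound: first suffix whose m-token prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(sa[mid]) < target:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose m-token prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(sa[mid]) <= target:
            lo = mid + 1
        else:
            hi = mid
    return lo - start


# Toy corpus (an assumption, loosely based on the abstract's example).
tokens = "i love eating pie . american pie is a film . pie is tasty .".split()
sa = build_suffix_array(tokens)

total_pie = count_occurrences(tokens, sa, ["pie"])
american_pie = count_occurrences(tokens, sa, ["american", "pie"])
# One possible (hypothetical) correction: discount occurrences of 'pie'
# that belong to the longer phrase 'american pie'.
standalone_pie = total_pie - american_pie
print(total_pie, american_pie, standalone_pie)
```

All suffixes that begin with a given pattern sit next to each other in the sorted order, which is why two binary searches are enough to count them. How the paper actually decides which occurrences count toward a semantic pattern is not stated in the abstract; the subtraction above is just one simple way to illustrate the problem.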
Metrics
3 citations in Web of Science
7 citations in Scopus
Details
- Title
- Semantic pattern mining for text mining
- Creators
- Xiaoli Song - Coll. of Comput. & Inf., Drexel Univ., Philadelphia, PA, USA; XiaoTong Wang - Nanyang Technological University; Xiaohua Hu - Drexel University
- Publication Details
- 2016 IEEE International Conference on Big Data (Big Data)
- Publisher
- IEEE
- Resource Type
- Conference proceeding
- Language
- English
- Academic Unit
- Information Science
- Scopus ID
- 2-s2.0-85015156368
- Other Identifier
- 991019170470604721