An effective framework for semistructured document classification via hierarchical attention model
Journal article   Open access   Peer reviewed

Weizhong Zhao, Dandan Fang, Jinyong Zhang, Yao Zhao, Xiaowei Xu, Xingpeng Jiang, Xiaohua Hu and Tingting He
International journal of intelligent systems, v 36(9), pp 5161-5183
01 Sep 2021
URL: https://doi.org/10.1002/int.22508
Published, Version of Record (VoR); Open Access (License Unspecified)

Abstract

Computer Science, Artificial Intelligence; Science & Technology; Computer Science; Technology
Recent years have witnessed rapid growth in the number of semistructured documents in real-world applications. Given the sheer size of real-world data, managing semistructured documents effectively is a major challenge for researchers. As a fundamental task in natural language processing, document classification is a feasible way to handle large-scale collections of semistructured documents. However, existing methods fail to explicitly exploit the hierarchical semantics in semistructured documents, even though these semantics are known to help in understanding such documents. Considering the hierarchical structure of a given semistructured document, we propose a semistructured document classification framework that explicitly uses a semantic hierarchical attention mechanism. More specifically, a hierarchical attention mechanism and a graph neural network are employed to model semistructured documents, so that multilevel semantic relationships and grammatical information are both taken into account. Moreover, we propose an adaptive class cost learning method to address data imbalance. Comprehensive experiments on two real-world data sets demonstrate that our framework outperforms the selected baselines for semistructured document classification.

Metrics

8 Record Views
12 citations in Scopus

Details

InCites Highlights

Data related to this publication, from the InCites Benchmarking & Analytics tool:

Collaboration types
Domestic collaboration
International collaboration
Web of Science research areas
Computer Science, Artificial Intelligence