Integrating Extra Knowledge into Word Embedding Models for Biomedical NLP Tasks
Conference proceeding

Yuan Ling, Yuan An, Mengwen Liu, Sadid A. Hasan, Yetian Fan and Xiaohua Hu
2017 International Joint Conference on Neural Networks (IJCNN), pp. 968-975
01 Jan 2017

Abstract

Subject categories: Computer Science, Artificial Intelligence; Computer Science, Hardware & Architecture; Engineering, Electrical & Electronic; Science & Technology; Computer Science; Engineering; Technology
Word embedding in the NLP area has attracted increasing attention in recent years. The continuous bag-of-words model (CBOW) and the continuous Skip-gram model (Skip-gram) have been developed to learn distributed representations of words from a large amount of unlabeled text data. In this paper, we explore the idea of integrating extra knowledge into the CBOW and Skip-gram models and applying the new models to biomedical NLP tasks. The main idea is to construct a weighted graph from knowledge bases (KBs) to represent structured relationships among words/concepts. In particular, we propose a GCBOW model and a GSkip-gram model by integrating such a graph into the original CBOW and Skip-gram models, respectively, via graph regularization. Our experiments on four standard general-domain datasets show encouraging improvements with the new models. Further evaluations on two biomedical NLP tasks (a biomedical similarity/relatedness task and a biomedical information retrieval (IR) task) show that our methods outperform the baselines.
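
As an illustration of what graph regularization over a weighted KB graph can look like, the sketch below adds a generic Laplacian-style penalty, the sum over edges of w_ab * ||v_a - v_b||^2, and takes one gradient step on it. This is an assumption made for illustration only: the abstract does not state the exact GCBOW or GSkip-gram objective, and the function names, example words, edge weight, `lam`, and `lr` are all hypothetical. In a full model this term would be added to the usual CBOW or Skip-gram loss, which is omitted here.

```python
from typing import Dict, List, Tuple

Vector = List[float]
Edge = Tuple[str, str, float]  # (word_a, word_b, weight); hypothetical KB-derived edge


def graph_penalty(emb: Dict[str, Vector], edges: List[Edge]) -> float:
    """Sum of w_ab * ||v_a - v_b||^2 over all edges of the weighted graph."""
    total = 0.0
    for a, b, w in edges:
        total += w * sum((x - y) ** 2 for x, y in zip(emb[a], emb[b]))
    return total


def regularization_step(emb: Dict[str, Vector], edges: List[Edge],
                        lam: float, lr: float) -> None:
    """One in-place gradient-descent step on lam * graph_penalty alone.

    The CBOW / Skip-gram loss itself is not modeled in this sketch.
    """
    grads = {word: [0.0] * len(vec) for word, vec in emb.items()}
    for a, b, w in edges:
        for k, (x, y) in enumerate(zip(emb[a], emb[b])):
            g = 2.0 * lam * w * (x - y)  # d/dv_a of lam * w * ||v_a - v_b||^2
            grads[a][k] += g
            grads[b][k] -= g             # the gradient w.r.t. v_b has opposite sign
    for word, vec in emb.items():
        for k in range(len(vec)):
            vec[k] -= lr * grads[word][k]


# Toy embeddings; the single edge stands in for a relation taken from a KB.
emb = {"aspirin": [1.0, 0.0], "ibuprofen": [0.0, 1.0], "table": [-1.0, -1.0]}
edges = [("aspirin", "ibuprofen", 1.0)]

before = graph_penalty(emb, edges)
regularization_step(emb, edges, lam=0.5, lr=0.1)
after = graph_penalty(emb, edges)
```

After the step, the two linked words move closer together and the penalty decreases, while the unlinked word is left unchanged. The weight `lam` controls how strongly the graph structure pulls on the embeddings relative to the text-based loss.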

Metrics

10 Record Views
22 citations in Scopus

Details

InCites Highlights

Data related to this publication, from InCites Benchmarking & Analytics tool:

Collaboration types
Industry collaboration
Domestic collaboration
International collaboration
Web of Science research areas
Computer Science, Artificial Intelligence
Computer Science, Hardware & Architecture
Engineering, Electrical & Electronic