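Constraint-enhanced Word Embedding Based on Domain Knowledge
Cite this article: WANG Hengsheng, LIU Tong, REN Jin. Constraint-enhanced Word Embedding Based on Domain Knowledge[J]. Journal of Chinese Information Processing (中文信息学报), 2019, 33(4): 37-47
Authors: WANG Hengsheng (王恒升)  LIU Tong (刘通)  REN Jin (任晋)
Affiliations: 1. College of Mechanical & Electrical Engineering, Central South University, Changsha, Hunan 410083, China;
2. State Key Laboratory for High Performance Complex Manufacturing, Central South University, Changsha, Hunan 410083, China
Funding: National Basic Research Program of China (973 Program) (2013CB035504)
Abstract: A word embedding is a numerical representation of a word. Embeddings are obtained by training a neural network model on a large corpus, with the contextual relationships between words in the corpus serving as the constraint. The ability of word embeddings to capture word semantics has raised high expectations, and embedding-based text classification, human-machine dialogue, and intelligent retrieval have been widely studied. For the specific application of campus information inquiry, this paper builds a classification ontology of the words involved. In addition to the contextual relationships between words in the corpus, the ontology knowledge is used as a further constraint when training the embeddings, which strengthens their semantic expressiveness. Based on the skip-gram model and a multi-task neural network training method, domain-specific word embeddings are trained on a self-collected corpus. Experiments show that constraint-enhanced word embeddings based on domain knowledge express the semantic information of words more accurately.

Keywords: constraint-enhanced word embedding  semantic expression  ontology knowledge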

Constraint-enhanced Word Embedding Based on Domain Knowledge
WANG Hengsheng,LIU Tong,REN Jin. Constraint-enhanced Word Embedding Based on Domain Knowledge[J]. Journal of Chinese Information Processing, 2019, 33(4): 37-47
Authors:WANG Hengsheng  LIU Tong  REN Jin
Affiliation:1.College of Mechanical & Electrical Engineering, Central South University, Changsha, Hunan 410083, China;
2.State Key Laboratory for High Performance Complex Manufacturing, Central South University, Changsha, Hunan 410083, China
Abstract: Targeting a specific natural-language dialog application, a campus information inquiry system, this paper proposes a method for improving the semantic expressiveness of word embeddings. In addition to the word contexts normally used to train word embeddings, domain-specific knowledge is introduced into model training. This knowledge is organized into an ontology, which is incorporated into the embeddings through multi-task training of a neural network model adapted from skip-gram; the ontology acts as both a constraint on and an enhancement of the embeddings. Experiments show the validity of the proposed embedding.
Keywords:constraint-enhanced word embedding    semantic expression    ontology knowledge  
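
The multi-task idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a tiny vocabulary trained with a full softmax, a flat word-to-category mapping standing in for the ontology, and a hypothetical weight `lam` that balances the two losses. The names `train`, `skipgram_pairs`, `word_category`, and the toy sentences are all invented for illustration. A shared embedding matrix feeds two output heads, one predicting context words (the skip-gram task) and one predicting the ontology category of the center word (the added constraint).

```python
import numpy as np

def softmax(z):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def skipgram_pairs(sentences, word2id, window=2):
    """Collect (center, context) id pairs inside a symmetric window."""
    pairs = []
    for sent in sentences:
        ids = [word2id[w] for w in sent]
        for i, center in enumerate(ids):
            lo, hi = max(0, i - window), min(len(ids), i + window + 1)
            pairs.extend((center, ids[j]) for j in range(lo, hi) if j != i)
    return np.array(pairs)

def train(sentences, word_category, dim=16, lam=1.0, lr=0.5, epochs=300, seed=0):
    """Jointly minimize context loss + lam * category loss (illustrative only)."""
    rng = np.random.default_rng(seed)
    vocab = sorted({w for s in sentences for w in s})
    word2id = {w: i for i, w in enumerate(vocab)}
    # Assumption: words missing from the ontology fall into an "other" class.
    word_cats = [word_category.get(w, "other") for w in vocab]
    cats = sorted(set(word_cats))
    labels = np.array([cats.index(c) for c in word_cats])

    pairs = skipgram_pairs(sentences, word2id)
    centers, contexts = pairs[:, 0], pairs[:, 1]
    cat_targets = labels[centers]
    n, rows = len(centers), np.arange(len(centers))

    E = rng.normal(0.0, 0.1, (len(vocab), dim))      # shared embeddings
    W_ctx = rng.normal(0.0, 0.1, (dim, len(vocab)))  # context-prediction head
    W_cat = rng.normal(0.0, 0.1, (dim, len(cats)))   # ontology-category head

    losses = []
    for _ in range(epochs):
        h = E[centers]
        p_ctx, p_cat = softmax(h @ W_ctx), softmax(h @ W_cat)
        loss = -(np.log(p_ctx[rows, contexts]).mean()
                 + lam * np.log(p_cat[rows, cat_targets]).mean())
        losses.append(loss)

        # Gradients of the two cross-entropy terms with respect to the logits.
        g_ctx = p_ctx.copy()
        g_ctx[rows, contexts] -= 1.0
        g_ctx /= n
        g_cat = p_cat.copy()
        g_cat[rows, cat_targets] -= 1.0
        g_cat *= lam / n

        # Both tasks push gradient into the same embedding rows.
        g_h = g_ctx @ W_ctx.T + g_cat @ W_cat.T
        W_ctx -= lr * (h.T @ g_ctx)
        W_cat -= lr * (h.T @ g_cat)
        np.subtract.at(E, centers, lr * g_h)  # accumulates over repeated ids
    return E, word2id, losses

if __name__ == "__main__":
    toy = [["library", "opens", "at", "eight"],
           ["canteen", "opens", "at", "seven"],
           ["library", "closes", "at", "ten"]]
    E, word2id, losses = train(toy, {"library": "place", "canteen": "place"})
    print(f"joint loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

Setting `lam` to 0 reduces the sketch to plain skip-gram with a full softmax, which gives a simple baseline against which to compare the effect of the category constraint.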
This article is indexed in VIP (维普) and other databases.