Domain-named entity recognition based on feedback K-nearest semantic transfer learning
Original title:反馈式K近邻语义迁移学习的领域命名实体识别
Citation:ZHU Yanhui, LI Fei, JI Xiangbing, ZENG Zhigao, XU Xiao. Domain-named entity recognition based on feedback K-nearest semantic transfer learning[J]. CAAI Transactions on Intelligent Systems (智能系统学报), 2019, 14(4): 820-830.
Authors:ZHU Yanhui (朱艳辉), LI Fei (李飞), JI Xiangbing (冀相冰), ZENG Zhigao (曾志高), XU Xiao (徐啸)
Affiliation:1. School of Computer, Hunan University of Technology, Zhuzhou 412008, China; 2. Hunan Key Laboratory of Intelligent Information Perception and Processing Technology, Zhuzhou 412008, China
Abstract:Domain named entity recognition is an important foundation for constructing domain knowledge graphs. To address the scarcity of corpora in specialized domains, this paper builds a deep-learning BiLSTM-CNN-CRFs network model and proposes a domain named entity recognition method based on feedback K-nearest-neighbor semantic transfer learning. First, document vectors are trained separately on the specialized-domain corpus and the general-domain corpus, and the semantic similarity between specialized-domain and general-domain samples is computed with the Mahalanobis distance. For each specialized-domain sample, the K most semantically similar general-domain samples are selected for semantic transfer learning, which yields multiple transfer corpus sets. Then, the BiLSTM-CNN-CRFs model performs domain named entity recognition on the transfer corpus sets, the recognition results are evaluated and fed back, and a suitable value of K is chosen from this feedback as the best threshold for semantic transfer learning. Experiments in the packaging and medical domains show that the method achieves good recognition results and can effectively alleviate the scarcity of specialized-domain corpora.

Keywords:domain named entity recognition; K-nearest neighbor; feedback K-nearest neighbor; semantic transfer learning; deep learning; convolutional neural network (CNN); document vector (Doc2Vec); Mahalanobis distance; packaging domain; medical domain
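
The selection-and-feedback procedure summarized in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the document vectors are random stand-ins for trained Doc2Vec vectors, the covariance matrix for the Mahalanobis distance d(x, y) = sqrt((x - y)^T S^-1 (x - y)) is estimated from the pooled vectors (the abstract does not say how S is obtained), and the names `mahalanobis_knn`, `build_transfer_corpus`, `select_k` and `toy_evaluate` are hypothetical. In the paper, the evaluation step would train the BiLSTM-CNN-CRFs model on each transfer corpus set and score its recognition results; here a toy scoring function takes its place.

```python
import numpy as np


def mahalanobis_knn(domain_vecs, general_vecs, k):
    """For each domain vector, return the indices of the k general-domain
    vectors with the smallest Mahalanobis distance."""
    # Assumption: covariance is estimated from the pooled document vectors.
    pooled = np.vstack([domain_vecs, general_vecs])
    cov = np.atleast_2d(np.cov(pooled, rowvar=False))
    inv_cov = np.linalg.pinv(cov)  # pseudo-inverse tolerates a singular covariance

    # diffs[i, j] = domain_vecs[i] - general_vecs[j], shape (n_dom, n_gen, dim)
    diffs = domain_vecs[:, None, :] - general_vecs[None, :, :]
    sq_dists = np.einsum("ijk,kl,ijl->ij", diffs, inv_cov, diffs)
    dists = np.sqrt(np.maximum(sq_dists, 0.0))  # clip tiny negative round-off
    return np.argsort(dists, axis=1)[:, :k]


def build_transfer_corpus(neighbor_idx):
    """Union of the selected general-domain sample indices (an assumed
    representation of one transfer corpus set)."""
    return sorted({int(i) for i in neighbor_idx.ravel()})


def select_k(domain_vecs, general_vecs, candidate_ks, evaluate):
    """Build one transfer corpus set per candidate K, score each with
    `evaluate`, and return the best-scoring K and its score."""
    best_k, best_score = None, float("-inf")
    for k in candidate_ks:
        corpus = build_transfer_corpus(mahalanobis_knn(domain_vecs, general_vecs, k))
        score = evaluate(corpus)
        if score > best_score:
            best_k, best_score = k, score
    return best_k, best_score


# Random stand-ins for trained document vectors (hypothetical data).
rng = np.random.default_rng(0)
domain_vecs = rng.normal(size=(5, 8))     # 5 specialized-domain documents
general_vecs = rng.normal(size=(40, 8))   # 40 general-domain documents

neighbors = mahalanobis_knn(domain_vecs, general_vecs, k=3)


def toy_evaluate(corpus_indices):
    # Placeholder for "train the NER model and measure its score";
    # it simply prefers transfer corpora of about 10 samples.
    return -abs(len(corpus_indices) - 10)


best_k, best_score = select_k(domain_vecs, general_vecs, [1, 2, 3, 5], toy_evaluate)
print("neighbors per domain sample:\n", neighbors)
print("selected K:", best_k)
```

Passing the scoring step in as a function keeps the K-selection loop independent of the recognition model, so the placeholder can be swapped for a real train-and-evaluate routine without changing the rest of the sketch.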