
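Design of Chinese Domain Named Entity Recognition Framework Based on BLSTM-CRF
Cite this article: ZHANG Jun-fei, BI Zhi-sheng, WANG Jing, WU Xiao-ling. Design of Chinese Domain Named Entity Recognition Framework Based on BLSTM-CRF[J]. Computing Technology and Automation, 2019, 38(3): 117-121.
Authors: ZHANG Jun-fei  BI Zhi-sheng  WANG Jing  WU Xiao-ling
Author affiliation: Department of Bioengineering, Guangzhou Medical University, Guangzhou, Guangdong 511436 (all four authors)
Funding: National Natural Science Foundation of China; Guangzhou University Innovation and Entrepreneurship Education Project; Education and Teaching Reform Project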
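Abstract: To improve Chinese domain named entity recognition without relying on feature engineering, a BLSTM-CRF neural network model is built. First, a CBOW model is trained recursively with negative sampling on the People's Daily corpus from January to June 1998, producing a lookup table of low-dimensional dense character vectors. Then, on the Boson named entity corpus, each character is looked up in this table to form its character vector, and Jieba word segmentation is used to obtain an information feature vector for each character in the corpus. Finally, the character vector and the character information feature vector are combined and fed into the BLSTM-CRF deep neural network. Experimental results show that the model recognizes Chinese domain named entities well, reaching an F1 score of 91.86%.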
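The abstract does not say what the per-character information feature contains or how the two vectors are combined. As a minimal sketch, assume the feature is the character's position inside its segmented word (B/M/E/S, one-hot encoded) and that "combining" means concatenation. The function names, the toy character table, and the zero-vector fallback for unknown characters are all hypothetical; the sentence is passed in already segmented (the kind of list a segmenter such as `jieba.lcut` returns) so the example needs no third-party packages.

```python
from typing import Dict, List

# Assumed feature scheme (not confirmed by the abstract): position of a
# character inside its word: Begin, Middle, End, or Single-character word.
POSITIONS = ["B", "M", "E", "S"]


def position_tags(words: List[str]) -> List[str]:
    """Map a segmented sentence to one B/M/E/S tag per character."""
    tags: List[str] = []
    for word in words:
        if not word:
            continue  # ignore empty tokens
        if len(word) == 1:
            tags.append("S")
        else:
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return tags


def one_hot(tag: str) -> List[float]:
    """Encode a position tag as a 4-dimensional one-hot vector."""
    return [1.0 if tag == p else 0.0 for p in POSITIONS]


def build_inputs(
    words: List[str], char_table: Dict[str, List[float]], dim: int
) -> List[List[float]]:
    """Concatenate each character's looked-up vector with its position one-hot."""
    unknown = [0.0] * dim  # assumed fallback for characters missing from the table
    chars = "".join(words)
    return [
        char_table.get(ch, unknown) + one_hot(tag)
        for ch, tag in zip(chars, position_tags(words))
    ]


# Hypothetical toy data: a 2-dimensional character table and one segmented sentence.
toy_table = {"广": [0.1, 0.2], "州": [0.3, 0.4], "很": [0.5, 0.6]}
toy_words = ["广州", "很", "美"]
inputs = build_inputs(toy_words, toy_table, dim=2)
```

Each row of `inputs` is one character's network input: the first `dim` entries come from the character vector table and the last four from the segmentation feature. In a real setup the table would hold the CBOW-trained vectors rather than toy values.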

Keywords: BLSTM-CRF  CBOW  Boson  named entity recognition

Design of Chinese Domain Named Entity Recognition Framework Based on BLSTM-CRF
ZHANG Jun-fei, BI Zhi-sheng, WANG Jing, WU Xiao-ling. Design of Chinese Domain Named Entity Recognition Framework Based on BLSTM-CRF[J]. Computing Technology and Automation, 2019, 38(3): 117-121.
Authors: ZHANG Jun-fei  BI Zhi-sheng  WANG Jing  WU Xiao-ling
Affiliation: (Department of Biomedical Engineering, Guangdong Medical University, Guangzhou, Guangdong 511436, China)
Abstract: A BLSTM-CRF neural network model is built to improve Chinese domain named entity recognition without relying on feature engineering. First, a CBOW model is trained recursively with negative sampling on the People's Daily corpus from January to June 1998, generating a lookup table of low-dimensional dense character vectors. Then, on the Boson named entity corpus, the character vector of each character is formed by querying this table, and an information feature vector for each character in the corpus is obtained using Jieba word segmentation. Finally, the character vector and the character information feature vector are combined and input into the BLSTM-CRF deep neural network. Experimental results show that the model recognizes Chinese domain named entities well, with an F1 score of 91.86%.
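The CRF layer of a BLSTM-CRF tagger typically chooses the final label sequence by Viterbi decoding over per-character scores from the BLSTM and learned tag-transition scores. The abstract gives no decoding details, so the following is only a minimal sketch under that assumption. The tag set and every score below are hypothetical toy values, not figures from the paper.

```python
from typing import List, Sequence


def viterbi_decode(
    emissions: Sequence[Sequence[float]],
    transitions: Sequence[Sequence[float]],
) -> List[int]:
    """Return the highest-scoring tag index sequence for one sentence.

    emissions[t][j]   : score of tag j at position t (the BLSTM output)
    transitions[i][j] : score of moving from tag i to tag j (CRF parameters)
    """
    if not emissions:
        return []
    num_tags = len(emissions[0])
    # best[j] = best score of any path that ends in tag j at the current position
    best = list(emissions[0])
    backpointers: List[List[int]] = []
    for t in range(1, len(emissions)):
        new_best: List[float] = []
        pointers: List[int] = []
        for j in range(num_tags):
            # Pick the previous tag that maximises the path score into tag j.
            prev = max(range(num_tags), key=lambda i: best[i] + transitions[i][j])
            pointers.append(prev)
            new_best.append(best[prev] + transitions[prev][j] + emissions[t][j])
        best = new_best
        backpointers.append(pointers)
    # Trace back from the best final tag to recover the whole path.
    tag = max(range(num_tags), key=lambda j: best[j])
    path = [tag]
    for pointers in reversed(backpointers):
        tag = pointers[tag]
        path.append(tag)
    path.reverse()
    return path


# Hypothetical toy setup: three tags and a three-character sentence.
TAGS = ["O", "B-LOC", "I-LOC"]
toy_emissions = [
    [1.0, 0.8, 0.0],  # position 0: "O" scores slightly higher than "B-LOC"
    [0.0, 0.0, 2.0],  # position 1: strongly "I-LOC"
    [1.5, 0.0, 0.2],  # position 2: "O"
]
toy_transitions = [
    [0.0, 0.0, -10.0],  # from O: moving straight to I-LOC is heavily penalised
    [0.0, 0.0, 1.0],    # from B-LOC: continuing with I-LOC is rewarded
    [0.0, 0.0, 0.5],    # from I-LOC
]
decoded = [TAGS[i] for i in viterbi_decode(toy_emissions, toy_transitions)]
```

Taking the highest-scoring tag independently at each position would give `O, I-LOC, O`, which is not a valid entity span. With the transition scores included, the decoded path starts the entity with `B-LOC` instead, which is the kind of sequence-level consistency a CRF output layer adds on top of a BLSTM.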
Keywords: BLSTM-CRF  CBOW  Boson  named entity recognition
This article is indexed in databases including VIP and Wanfang Data.