Named Entity Recognition in Food Field Based on BERT and Adversarial Training
Citation: DONG Zhe, SHAO Ruo-qi, CHEN Yu-liang, ZHAI Wei-feng. Named Entity Recognition in Food Field Based on BERT and Adversarial Training[J]. Computer Science, 2021, 48(5): 247-253.
Authors: DONG Zhe  SHAO Ruo-qi  CHEN Yu-liang  ZHAI Wei-feng
Affiliation: School of Electrical and Control Engineering, North China University of Technology, Beijing 100144, China
Funding: National Key R&D Program of China (2018YFC1602703); National Natural Science Foundation of China (61873006).
Abstract: To extract effective entity information from unstructured corpora in the food domain, a named entity recognition (NER) method based on BERT (Bidirectional Encoder Representations from Transformers) and adversarial training is proposed. NER is a typical sequence labeling problem. Deep learning methods have been widely applied to this task with remarkable results, but NER in specific domains such as food faces two problems: large annotated sample sets are hard to construct, and the boundaries of domain-specific terms are recognized inaccurately. To address these problems, BERT is used to obtain character vectors, which enrich the semantic representation. Adversarial training is then introduced: it exploits the information shared between Chinese word segmentation (CWS) and NER to improve the precision of entity boundary recognition, while preventing the task-private information of CWS from introducing noise. Experiments are conducted on corpora from two domains, Chinese food safety cases and People's Daily news. The food safety case corpus is used to train the NER task, and the People's Daily news corpus is used to train the CWS task. Adversarial training is used to improve recognition of five entity types (person name, location, organization, food name, and additive name). Experimental results show that the proposed method achieves a precision of 95.46%, a recall of 89.50%, and an F1 score of 92.38%, improving the F1 score on Chinese NER in the food domain, where entity boundaries are indistinct.

Keywords: Food field  Named entity recognition  BERT  BiLSTM  Adversarial training
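
The abstract treats NER as character-level sequence labeling and reports precision, recall, and F1. The sketch below illustrates how such scores can be computed and how a single boundary error affects them. It assumes a BIO tagging scheme, entity-level exact-match scoring, and hypothetical label names (`FOOD`, `ADD`) and example sentence; none of these details are stated in the abstract, so this is an illustration rather than the authors' evaluation code. The last line checks that the reported F1 is consistent with the reported precision and recall.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, entity_type)


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Convert character-level BIO tags into a set of entity spans."""
    spans: Set[Span] = set()
    start, etype = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" closes a trailing entity
        inside = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not inside:
            spans.add((start, i, etype))  # the open entity ends before position i
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]  # a new entity begins here
    return spans


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same scale for both inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def prf(gold: Set[Span], pred: Set[Span]) -> Tuple[float, float, float]:
    """Entity-level precision, recall, F1; a span counts only on exact match."""
    tp = len(gold & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    return precision, recall, f1_score(precision, recall)


# Hypothetical sentence: "面包含有苯甲酸钠" (bread contains sodium benzoate).
# "面包/含有" versus "面/包含/有" is the kind of boundary ambiguity at issue.
chars = list("面包含有苯甲酸钠")
gold_tags = ["B-FOOD", "I-FOOD", "O", "O", "B-ADD", "I-ADD", "I-ADD", "I-ADD"]
pred_tags = ["B-FOOD", "O", "O", "O", "B-ADD", "I-ADD", "I-ADD", "I-ADD"]

# The prediction cuts the food entity one character short, so it does not count.
print(prf(bio_to_spans(gold_tags), bio_to_spans(pred_tags)))  # (0.5, 0.5, 0.5)

# Consistency check on the figures reported in the abstract (in percent).
print(round(f1_score(95.46, 89.50), 2))  # 92.38
```

Under exact-match scoring, a one-character boundary error costs both a false positive and a false negative, which is why boundary accuracy has a direct effect on F1.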
This article is indexed in databases including VIP (维普) and Wanfang Data (万方数据).