
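Electric Power Named Entity Recognition Based on Topic Prompt
Cite this article: KANG Yu-Meng, HE Wei, ZHAI Qian-Hui, CHENG Ya-Meng, YU Yang. Electric Power Named Entity Recognition Based on Topic Prompt [J]. Computer Systems & Applications (计算机系统应用), 2022, 31(9): 272-279.
Authors: KANG Yu-Meng (康雨萌)  HE Wei (何玮)  ZHAI Qian-Hui (翟千惠)  CHENG Ya-Meng (程雅梦)  YU Yang (俞阳)
Affiliation: State Grid Jiangsu Marketing Service Center, Nanjing 210019, China
Funding: Science and Technology Project of State Grid Jiangsu Electric Power Co., Ltd. (J2021151)
Abstract: Traditional named entity recognition methods can achieve good results when sufficient supervised data are available. In named entity recognition for electric power texts, however, the reliance on domain expertise often makes it hard to obtain enough supervised data, so the task becomes a few-shot scenario. Moreover, because of the precision demanded by the electric power industry, the domain has more entity types than typical open-domain tasks, which makes the task harder. To address these challenges, this paper proposes a named entity recognition method based on topic prompts. The method treats each entity type as a topic and uses a topic model to obtain type-related topic words from the training corpus. It then enumerates entity spans, entity types, and topic words to fill a template and construct prompt sentences. A generative pre-trained language model ranks the prompt sentences to identify each entity and its type label. Experiments show that on a Chinese electric power named entity recognition dataset, the topic-prompt-based method outperforms several traditional named entity recognition methods.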

Keywords: named entity recognition  pre-trained model  prompt template  topic model  electric power corpus
Received: 2021-12-16
Revised: 2022-01-13

Electric Power Named Entity Recognition Based on Topic Prompt
KANG Yu-Meng, HE Wei, ZHAI Qian-Hui, CHENG Ya-Meng, YU Yang. Electric Power Named Entity Recognition Based on Topic Prompt [J]. Computer Systems & Applications, 2022, 31(9): 272-279.
Authors:KANG Yu-Meng  HE Wei  ZHAI Qian-Hui  CHENG Ya-Meng  YU Yang
Affiliation:State Grid Jiangsu Marketing Service Center, Nanjing 210019, China
Abstract: Traditional named entity recognition methods can achieve good results when sufficient supervised data are available. For named entity recognition on electric power texts, however, the dependence on professional knowledge often makes it difficult to obtain enough supervised data, which leads to a few-shot scenario. In addition, electric power named entity recognition is more challenging than general open-domain tasks, because the precision requirements of the electric power industry lead to a larger number of entity categories. To overcome these challenges, this study proposes a named entity recognition method based on topic prompts. The method regards each entity category as a topic and uses a topic model to obtain category-related topic words from the training corpus. It then enumerates entity spans, entity categories, and topic words to fill a template and construct prompt sentences. Finally, a generative pre-trained language model ranks the prompt sentences to identify each entity and its category label. Experimental results show that, on a Chinese electric power named entity recognition dataset, the proposed method achieves better results than several traditional named entity recognition methods.
Keywords:named entity recognition  pre-trained model  prompt template  topic model  electric power corpus
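
The pipeline described in the abstract (enumerate spans, categories, and topic words; fill a template; rank the resulting prompt sentences) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the template wording, the function names, the example topic words, and especially `toy_score`, which merely stands in for the score a generative pre-trained language model would assign to each prompt sentence, are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical template; the actual wording used in the paper is not given in the abstract.
TEMPLATE = "{span} is an entity of type {etype}, associated with {topic}."

Prompt = Tuple[Tuple[int, int], str, str]  # ((start, end), entity type, prompt sentence)


def enumerate_spans(tokens: List[str], max_len: int = 3) -> List[Tuple[int, int]]:
    """Return every (start, end) span, end exclusive, with at most max_len tokens."""
    return [(i, j)
            for i in range(len(tokens))
            for j in range(i + 1, min(i + max_len, len(tokens)) + 1)]


def build_prompts(tokens: List[str],
                  topic_words: Dict[str, List[str]],
                  max_len: int = 3) -> List[Prompt]:
    """Fill the template for every combination of span, entity type, and topic word."""
    prompts: List[Prompt] = []
    for start, end in enumerate_spans(tokens, max_len):
        span = " ".join(tokens[start:end])
        for etype, words in topic_words.items():
            for word in words:
                sentence = TEMPLATE.format(span=span, etype=etype, topic=word)
                prompts.append(((start, end), etype, sentence))
    return prompts


def rank_prompts(prompts: List[Prompt],
                 score: Callable[[str], float]) -> List[Prompt]:
    """Sort prompt sentences from highest to lowest score."""
    return sorted(prompts, key=lambda p: score(p[2]), reverse=True)


# Assumption: a hand-made association table replaces a real language model score.
TOY_ASSOC = {("transformer", "equipment"): 2.0, ("today", "time"): 1.5}


def toy_score(sentence: str) -> float:
    """Toy scorer: reward sentences whose span and type match a known association."""
    return sum(weight for (span, etype), weight in TOY_ASSOC.items()
               if sentence.startswith(f"{span} is an entity of type {etype},"))


tokens = ["replace", "the", "transformer", "today"]
# In the paper, topic words come from a topic model; these are invented examples.
topic_words = {"equipment": ["device", "transformer"], "time": ["date"]}

best_span, best_type, best_sentence = rank_prompts(
    build_prompts(tokens, topic_words), toy_score)[0]
print(best_span, best_type, best_sentence)
```

In a real system, `toy_score` would be replaced by a language model score for each filled template, and a rule for rejecting non-entity spans would also be needed; the abstract does not specify either detail.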