
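A Chinese distant supervised personal relationship extraction method based on TongYiCi CiLin and rules (基于同义词词林和规则的中文远程监督人物关系抽取方法)
Cite this article:XIE Ming-hong,RAN Qiang,WANG Hong-bin.A Chinese distant supervised personal relationship extraction method based on TongYiCi CiLin and rules[J].Computer Engineering & Science (计算机工程与科学),2021,43(9):1661-1667.
Author names:XIE Ming-hong (谢明鸿)  RAN Qiang (冉强)  WANG Hong-bin (王红斌)
Author affiliations:(1.Faculty of Information Engineering and Automation,Kunming University of Science and Technology,Kunming 650500,Yunnan; 2.Yunnan Key Laboratory of Artificial Intelligence,Kunming University of Science and Technology,Kunming 650500,Yunnan)
Funding:National Natural Science Foundation of China (61966020)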
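Abstract:Distant supervision is a method for labeling large-scale corpora by automatically aligning entities with a knowledge base, but its overly strong assumption means the resulting corpus contains a large amount of noise. To address this problem, a Chinese distant supervised personal relationship extraction method based on TongYiCi CiLin and rules is proposed. Following the idea of multi-instance learning, the method groups personal relationship sentences at the bag level, uses TongYiCi CiLin to count the frequencies of personal relationship trigger words, and determines the candidate relationship with the highest word frequency and the candidate relationship with the second-highest word frequency. It then applies specific personal relationship judgment rules to decide the relationship. Once a personal relationship has been determined for a bag, multi-relationship prediction is performed on that bag to obtain the final prediction. Experimental results on IPRE, a large-scale public data set for Chinese distant supervised personal relationship extraction, show that the proposed method achieves a good F1 value and can identify personal relationships that are not annotated in the labels of the distantly supervised test set.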

Keywords:TongYiCi CiLin (同义词词林)  rules  distant supervision  personal relationship  relationship extraction  
Received:2020-05-11
Revised:2020-07-21

A Chinese distant supervised personal relationship extraction method based on TongYiCi CiLin and rules
XIE Ming-hong,RAN Qiang,WANG Hong-bin.A Chinese distant supervised personal relationship extraction method based on TongYiCi CiLin and rules[J].Computer Engineering & Science,2021,43(9):1661-1667.
Authors:XIE Ming-hong  RAN Qiang  WANG Hong-bin
Affiliation:(1.Faculty of Information Engineering and Automation,Kunming University of Science and Technology,Kunming 650500; 2.Yunnan Key Laboratory of Artificial Intelligence,Kunming University of Science and Technology,Kunming 650500,China)
Abstract:Distant supervision labels large-scale corpora by automatically aligning entities with a knowledge base, but its overly strong assumption introduces a large amount of noise into the acquired corpus. To address this problem, this paper proposes a Chinese distant supervised personal relationship extraction method based on TongYiCi CiLin and rules. Following the idea of multi-instance learning, sentences expressing personal relationships are grouped into bags. TongYiCi CiLin is then used to count the frequencies of personal relationship trigger words, which determines the candidate relationships with the highest and the second-highest word frequencies. Specific personal relationship judgment rules are then applied to these candidates to decide the relationship. Once a personal relationship has been determined for a bag, multi-relationship prediction is performed on the bag to obtain the final result. Experimental results on IPRE, a large-scale public data set for Chinese distant supervised personal relationship extraction, show that the method achieves a good F1 value and can identify personal relationships that are not annotated in the labels of the distantly supervised test set.
Keywords:TongYiCi CiLin  rules  distant supervision  personal relationship  relationship extraction  
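
The pipeline outlined in the abstract (bag construction, trigger-word frequency counting, selection of the top two candidate relationships, then multi-relationship prediction) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the `TRIGGERS` lexicon is hand-written rather than derived from TongYiCi CiLin, the relationship names and the `"NA"` fallback label are placeholders, and the `multi_ratio` threshold stands in for the paper's judgment rules, which the abstract does not spell out.

```python
from collections import Counter, defaultdict

# Hypothetical trigger lexicon: relationship -> trigger words. In the paper
# such groups would come from TongYiCi CiLin; these entries are made up.
TRIGGERS = {
    "spouse": {"妻子", "丈夫", "夫人", "结婚"},
    "parent-child": {"父亲", "母亲", "儿子", "女儿"},
    "teacher-student": {"老师", "学生", "师从"},
}

def group_into_bags(instances):
    """Group (head, tail, sentence) triples into bags keyed by entity pair."""
    bags = defaultdict(list)
    for head, tail, sentence in instances:
        bags[(head, tail)].append(sentence)
    return dict(bags)

def trigger_counts(sentences):
    """Count trigger-word occurrences per relationship over one bag."""
    counts = Counter()
    for sentence in sentences:
        for relation, words in TRIGGERS.items():
            counts[relation] += sum(sentence.count(w) for w in words)
    return counts

def predict_bag(sentences, multi_ratio=0.5):
    """Predict one or two relationships for a bag.

    The highest-frequency relationship is the primary prediction. The
    runner-up is added when its count reaches multi_ratio times the top
    count; this threshold is an assumed stand-in for the paper's rules.
    """
    ranked = [(r, c) for r, c in trigger_counts(sentences).most_common() if c > 0]
    if not ranked:
        return ["NA"]  # placeholder label for "no relationship detected"
    predictions = [ranked[0][0]]
    if len(ranked) > 1 and ranked[1][1] >= multi_ratio * ranked[0][1]:
        predictions.append(ranked[1][0])
    return predictions

# Toy usage with invented sentences.
instances = [
    ("A", "B", "A的妻子B出席了活动"),
    ("A", "B", "A与B于去年结婚"),
    ("A", "B", "B曾是A的学生"),
    ("C", "D", "C和D一同出席会议"),
]
bags = group_into_bags(instances)
print(predict_bag(bags[("A", "B")]))  # ['spouse', 'teacher-student']
print(predict_bag(bags[("C", "D")]))  # ['NA']
```

Counting at the bag level rather than per sentence is what lets a single noisy sentence be outvoted by the rest of the bag, which is the motivation the abstract gives for the multi-instance framing.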
This article is indexed in 万方数据 (Wanfang Data) and other databases.
