
Open Relation Extraction Based on Unsupervised Ensemble Clustering
Cite this article: XIE Binhong, LI Yu, ZHAO Hongyan. Open Relation Extraction Based on Unsupervised Ensemble Clustering[J]. Journal of Chinese Information Processing, 2022, 36(5): 49-58.
Authors: XIE Binhong  LI Yu  ZHAO Hongyan
Affiliation: Department of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan, Shanxi 030024, China
Funding: Shanxi Province Key Research and Development Program (Key) High-Tech Field Project (201703D111027); Shanxi Province Key Research and Development Program Project (201803D121048); Shanxi Province Key Research and Development Program Project (201803D121055)
Abstract: Open relation extraction (OpenRE) aims to extract relational facts from open-domain corpora. Most OpenRE methods are unsupervised: they extract relation patterns between named entities and then cluster semantically equivalent patterns into relation clusters. Because supervision is absent and clustering accuracy is low, the final extraction quality suffers. To further improve clustering performance, this paper proposes an Unsupervised Ensemble Clustering (UEC) framework, which combines unsupervised ensemble learning with a multi-step clustering algorithm based on information measures to create high-quality pseudo-labels without manual annotation. These pseudo-labels serve as supervision to improve relation feature learning, which in turn guides the clustering process toward better labels. Finally, multiple rounds of iterative clustering discover the relation types in the text. Experimental results on the FewRel and NYT-FB datasets show that UEC outperforms other mainstream OpenRE baseline models, reaching F1 scores of 65.2% and 67.1%, respectively.

Keywords: open relation extraction  ensemble clustering  pseudo-label
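
The abstract does not spell out how the ensemble step combines base clusterings, so the following is only a minimal sketch of one generic consensus technique (a co-association matrix with a majority-agreement threshold), not the UEC algorithm itself. The function names `co_association` and `consensus_pseudo_labels`, the `threshold` parameter, and the toy labelings are all assumptions made for illustration.

```python
from itertools import combinations


def co_association(base_labelings):
    """For each pair (i, j), the fraction of base clusterings that
    place items i and j in the same cluster (diagonal left at 0)."""
    n = len(base_labelings[0])
    counts = [[0.0] * n for _ in range(n)]
    for labels in base_labelings:
        for i, j in combinations(range(n), 2):
            if labels[i] == labels[j]:
                counts[i][j] += 1
                counts[j][i] += 1
    m = len(base_labelings)
    return [[c / m for c in row] for row in counts]


def consensus_pseudo_labels(base_labelings, threshold=0.5):
    """Hypothetical consensus rule: link every pair whose agreement
    exceeds `threshold`, then treat each connected component as one
    pseudo-label."""
    n = len(base_labelings[0])
    agree = co_association(base_labelings)
    parent = list(range(n))  # union-find forest over the items

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        if agree[i][j] > threshold:
            parent[find(j)] = find(i)

    # Relabel components as 0, 1, 2, ... in order of first appearance.
    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(n)]


# Three toy base clusterings of six relation instances (made-up data).
base = [
    [0, 0, 1, 1, 2, 2],
    [1, 1, 0, 0, 2, 2],
    [0, 0, 0, 1, 1, 1],
]
print(consensus_pseudo_labels(base))
```

In an iterative scheme like the one the abstract describes, pseudo-labels of this kind would be fed back as supervision for feature learning before the next clustering round; that training loop is outside the scope of this sketch.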