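Chinese Open N-ary Entity Relation Extraction (中文开放式多元实体关系抽取)
Cite this article: 李颖,郝晓燕,王勇.中文开放式多元实体关系抽取[J].计算机科学,2017,44(Z6):80-83.
Authors: Li Ying (李颖), Hao Xiaoyan (郝晓燕), Wang Yong (王勇)
Affiliation (all authors): College of Computer Science and Technology, Taiyuan University of Technology, Jinzhong 030600
Funding: This work was supported by the project "Research on Chinese Discourse Anaphora Resolution Strategies Based on Frame Semantic Annotation" (2012011011-2).
Abstract: Traditional information extraction targets specific domains. Moving to a new domain requires manually writing new extraction rules and hand-labeling new training samples. Open information extraction removes this limitation. Most existing open information extraction systems target English; research on Chinese is comparatively scarce and mainly extracts triples, and no method exists for extracting n-ary tuples from Chinese text. This paper therefore proposes a dependency-parsing-based method for Chinese open n-ary entity relation extraction. First, the text collection is preprocessed and parsed for dependency relations. Next, each verb is treated as a candidate relation word, and each base noun phrase connected to that verb by a valid dependency path that satisfies the required conditions is treated as an entity word; a relation word linked to two or more entity words forms, together with those entity words, a candidate n-ary entity-relation group. Finally, a trained logistic regression classifier filters the candidate groups. Extraction results on a Baidu Baike dataset show that the proposed method reaches an accuracy of up to 81% when extracting large numbers of n-ary entity-relation tuples.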
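The candidate-generation step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state which parser, label set, or path conditions were used, so the `Token` class, the LTP-style labels (`SBV`, `VOB`, `IOB`, `FOB`, `POB`, `ADV`, `CMP`), the two path rules, and the hand-written parse of the sample sentence are all assumptions. Single noun tokens stand in for base noun phrases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    idx: int     # 1-based position in the sentence
    word: str
    pos: str     # POS tag; tags starting with "n" are nouns, "v" is a verb
    head: int    # index of the head token, 0 for the root
    deprel: str  # dependency label on the arc to the head


# Hypothetical path rules (assumptions, not taken from the paper):
# a noun counts as an entity of a verb if it attaches to the verb directly
# with one of DIRECT, or is the object (POB) of a preposition that attaches
# to the verb with one of VIA_PREP.
DIRECT = {"SBV", "VOB", "IOB", "FOB"}
VIA_PREP = {"ADV", "CMP"}


def entities_for(verb, tokens):
    """Return the noun tokens linked to `verb` by an accepted path."""
    by_idx = {t.idx: t for t in tokens}
    found = []
    for t in tokens:
        if not t.pos.startswith("n"):
            continue
        if t.head == verb.idx and t.deprel in DIRECT:
            found.append(t)
        elif t.deprel == "POB":
            prep = by_idx.get(t.head)
            if prep and prep.head == verb.idx and prep.deprel in VIA_PREP:
                found.append(t)
    return found


def candidate_groups(tokens):
    """Pair each verb with its entities; keep verbs with two or more."""
    groups = []
    for t in tokens:
        if t.pos == "v":
            ents = entities_for(t, tokens)
            if len(ents) >= 2:
                groups.append((t.word, tuple(e.word for e in ents)))
    return groups


# Hand-written parse of "李明 在 北京 创办 了 公司"
# ("Li Ming founded a company in Beijing"); illustrative only.
SENTENCE = [
    Token(1, "李明", "nh", 4, "SBV"),
    Token(2, "在", "p", 4, "ADV"),
    Token(3, "北京", "ns", 2, "POB"),
    Token(4, "创办", "v", 0, "HED"),
    Token(5, "了", "u", 4, "RAD"),
    Token(6, "公司", "n", 4, "VOB"),
]

if __name__ == "__main__":
    print(candidate_groups(SENTENCE))
```

Run on the sample parse, this prints `[('创办', ('李明', '北京', '公司'))]`: one verb tied to three entity words, which is a group a triple-only extractor would have to truncate.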

Keywords: Chinese open information extraction; dependency parsing; entity relation extraction; machine learning; OIE; word2vec

N-ary Chinese Open Entity-relation Extraction
LI Ying, HAO Xiao-yan and WANG Yong. N-ary Chinese Open Entity-relation Extraction[J]. Computer Science, 2017, 44(Z6): 80-83.
Authors: LI Ying, HAO Xiao-yan and WANG Yong
Affiliation (all authors): College of Computer Science and Technology, Taiyuan University of Technology, Jinzhong 030600, China
Abstract: Traditionally, information extraction (IE) has focused on satisfying precise, narrow, pre-specified requests from small homogeneous corpora. Shifting to a new domain requires the user to name the target relations and to manually create new extraction rules or hand-tag new training examples. Open information extraction (OIE) overcomes this limitation of traditional IE techniques, which train an individual extractor for every relation type. Existing studies have focused mostly on English OIE, and few have addressed OIE for Chinese. This paper presents an n-ary Chinese OIE system (N-COIE). N-COIE preprocesses the sentences using natural language processing tools and then extracts entity-relation groups from the preprocessed sentences. Finally, N-COIE filters the entity-relation groups using a trained logistic regression classifier. Empirical results show the effectiveness of the proposed system.
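The final filtering step can be illustrated with a small logistic regression written in plain Python. This is only a sketch under stated assumptions: the abstract does not list the classifier's features or training setup, so the two features used here (whether the group contains a subject-like entity, and a normalized distance between the verb and its farthest entity), the toy training data, and the 0.5 threshold are all hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train(samples, labels, lr=0.5, epochs=500):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_feat = len(samples[0])
    n = len(samples)
    w = [0.0] * n_feat
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_feat
        grad_b = 0.0
        for x, y in zip(samples, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def keep(model, x, threshold=0.5):
    """Return True if the candidate group should be kept."""
    w, b = model
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= threshold


# Hypothetical features per candidate group (assumptions, not from the paper):
# [has_subject_entity, normalized distance from verb to farthest entity]
X_TRAIN = [
    [1.0, 0.1], [1.0, 0.2], [1.0, 0.3],  # hand-labeled as correct groups
    [0.0, 0.7], [0.0, 0.8], [0.0, 0.9],  # hand-labeled as incorrect groups
]
Y_TRAIN = [1, 1, 1, 0, 0, 0]

MODEL = train(X_TRAIN, Y_TRAIN)

if __name__ == "__main__":
    print(keep(MODEL, [1.0, 0.15]))  # candidate resembling the correct groups
    print(keep(MODEL, [0.0, 0.85]))  # candidate resembling the incorrect groups
```

A real system would more likely use an established library and a richer feature set; the hand-rolled trainer is used here only to keep the example self-contained.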
Keywords: Chinese open information extraction; Dependency parsing; Entity-relation extraction; Machine learning; OIE; Word2vec