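实体关系自动抽取 (Automatic Entity Relation Extraction)
Cite this article: 车万翔, 刘挺, 李生. 实体关系自动抽取 [J]. 中文信息学报 (Journal of Chinese Information Processing), 2005, 19(2): 2-7
Authors: 车万翔 (CHE Wan-xiang), 刘挺 (LIU Ting), 李生 (LI Sheng)
Affiliation: School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang 150001
Abstract: Entity relation extraction is an important research topic in information extraction. This paper applies two feature-vector-based machine learning algorithms, Winnow and support vector machines (SVM), to entity relation extraction on the training data of the 2004 ACE (Automatic Content Extraction) evaluation. Appropriate feature selection is performed for both algorithms. The best extraction performance is reached when the two words to the left and right of each entity are selected as features, with weighted average F-scores of 73.08% for Winnow and 73.27% for SVM. This shows that, given the same feature set, different learning algorithms differ little in final performance on entity relation recognition. Automatic entity relation extraction should therefore concentrate on finding good features.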

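Keywords: computer application; Chinese information processing; entity relation extraction; ACE evaluation; feature selection
Article number: 1003-0077(2005)02-0001-06
Revised: 2004-06-20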
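
The abstract describes representing each entity pair by the two words to the left and right of each entity and feeding that feature vector to Winnow. The following is a minimal sketch of that idea, not the authors' implementation: the function `window_features`, the class `Winnow`, the feature-name format, and the parameters `alpha` and `theta` are all hypothetical, and the update rule shown is the textbook multiplicative Winnow variant (promote on a missed positive, demote on a false positive), which may differ from the variant used in the paper.

```python
from typing import Dict, Iterable, List, Set, Tuple

Span = Tuple[int, int]  # (start, end) token indices, end exclusive


def window_features(tokens: List[str], e1: Span, e2: Span, k: int = 2) -> Set[str]:
    """Collect up to k words on each side of both entity mentions."""
    feats: Set[str] = set()
    for name, (start, end) in (("E1", e1), ("E2", e2)):
        for i in range(1, k + 1):
            if start - i >= 0:  # i-th word to the left of the entity
                feats.add(f"{name}_L{i}={tokens[start - i]}")
            if end - 1 + i < len(tokens):  # i-th word to the right
                feats.add(f"{name}_R{i}={tokens[end - 1 + i]}")
    return feats


class Winnow:
    """Binary Winnow over sparse boolean features (textbook variant)."""

    def __init__(self, alpha: float = 2.0, theta: float = 3.0) -> None:
        self.alpha = alpha  # multiplicative update factor (assumed value)
        self.theta = theta  # decision threshold (assumed value)
        self.w: Dict[str, float] = {}  # unseen features default to weight 1.0

    def score(self, feats: Iterable[str]) -> float:
        return sum(self.w.get(f, 1.0) for f in feats)

    def predict(self, feats: Iterable[str]) -> bool:
        return self.score(feats) >= self.theta

    def fit(self, data: List[Tuple[Set[str], bool]], epochs: int = 5) -> None:
        for _ in range(epochs):
            for feats, label in data:
                if self.predict(feats) == label:
                    continue  # mistake-driven: update only on errors
                factor = self.alpha if label else 1.0 / self.alpha
                for f in feats:
                    self.w[f] = self.w.get(f, 1.0) * factor


# Toy usage with made-up sentences; entity spans are given by hand.
pos = window_features(["Smith", "works", "for", "Acme", "today"], (0, 1), (3, 4))
neg = window_features(["Smith", "visited", "near", "Acme", "today"], (0, 1), (3, 4))
clf = Winnow()
clf.fit([(pos, True), (neg, False)])
```

A multi-class relation extractor could be built from such binary classifiers in a one-vs-rest fashion, one per relation type; that arrangement is likewise an assumption rather than something stated in the abstract.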

Automatic Entity Relation Extraction
CHE Wan-xiang, LIU Ting, LI Sheng. Automatic Entity Relation Extraction[J]. Journal of Chinese Information Processing, 2005, 19(2): 2-7
Authors: CHE Wan-xiang, LIU Ting, LI Sheng
Affiliation: School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
Abstract: Entity relation extraction is an important research topic in information extraction. Two feature-vector-based machine learning algorithms, Winnow and Support Vector Machine (SVM), were used to extract entity relations automatically from the training data of the ACE (Automatic Content Extraction) Evaluation 2004. Both algorithms require appropriate feature selection. Both reached peak performance when the two words on each side of each entity were selected as features: the weighted average F-scores of Winnow and SVM were 73.08% and 73.27%, respectively. We conclude that, given the same feature set, different machine learning algorithms differ little in performance. Work on automatic entity relation extraction should therefore focus on finding better features.
Keywords:computer application  Chinese information processing  entity relation extraction  ACE evaluation  feature selection  
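
The abstract reports a weighted average F-score but does not spell out the weighting. A common reading, assumed here, is that the F-score is computed per relation type and the per-type scores are averaged with weights proportional to the number of gold instances of each type. The sketch below illustrates that reading only; the function name `weighted_f_score` and the labels `REL_A` and `REL_B` are hypothetical.

```python
from collections import Counter
from typing import Sequence


def weighted_f_score(gold: Sequence[str], pred: Sequence[str]) -> float:
    """Per-label F1, averaged with weights equal to each label's gold count."""
    if not gold or len(gold) != len(pred):
        raise ValueError("gold and pred must be non-empty and of equal length")
    support = Counter(gold)      # gold instances per label
    predicted = Counter(pred)    # predicted instances per label
    correct = Counter(g for g, p in zip(gold, pred) if g == p)
    total = 0.0
    for label, n in support.items():
        precision = correct[label] / predicted[label] if predicted[label] else 0.0
        recall = correct[label] / n
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        total += n * f1          # weight each label by its gold count
    return total / len(gold)


# Toy example with made-up labels.
gold = ["REL_A", "REL_A", "REL_A", "REL_B"]
pred = ["REL_A", "REL_A", "REL_B", "REL_B"]
score = weighted_f_score(gold, pred)
```

In this toy case `REL_A` has F1 of 0.8 with three gold instances and `REL_B` has F1 of 2/3 with one, so the weighted average is (3 × 0.8 + 1 × 2/3) / 4.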
This article is indexed in CNKI, 维普 (VIP), 万方数据 (Wanfang Data), and other databases.