
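Automated Entity Relation Tuple Extraction Using Web Mining
Cite this article: LI Wei-gang, LIU Ting, LI Sheng. Automated Entity Relation Tuple Extraction Using Web Mining [J]. Acta Electronica Sinica, 2007, 35(11): 2111-2116.
Authors: LI Wei-gang  LIU Ting  LI Sheng
Affiliation: Information Retrieval Laboratory, School of Computer Science, Harbin Institute of Technology, Harbin, Heilongjiang 150001
Abstract: Binary entity relation tuples can be applied in many areas, including knowledge base construction, data mining, and pattern extraction. Using one tuple and one keyword of a specific relation as the seed, and combining several low-level natural language processing techniques, this paper proposes a new method for extracting entity relation tuples from the Web, adopting an improved pattern acquisition method and a bootstrapping iteration strategy. The baseline method reaches an average precision of 78.12%; with filtering applied, the extraction method reaches an average precision of 98.42%. The experimental results show that entity relation tuples obtained by web mining serve information extraction applications well, and that further processing of the extracted tuples can yield more valuable information.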

Keywords: bootstrapping  entity relation  tuples  information extraction  web mining
Article ID: 0372-2112(2007)11-2111-06
Received: 2006-11-23
Revised: 2007-07-09

Automated Entity Relation Tuple Extraction Using Web Mining
LI Wei-gang, LIU Ting, LI Sheng. Automated Entity Relation Tuple Extraction Using Web Mining [J]. Acta Electronica Sinica, 2007, 35(11): 2111-2116.
Authors:LI Wei-gang  LIU Ting  LI Sheng
Affiliation: Information Retrieval Laboratory, School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
Abstract: Binary entity relation tuples can be applied in many fields, such as knowledge base construction, data mining, and pattern extraction. The method presented here takes one tuple and one keyword of a specific relation as its seed and extracts entity relation tuples from the Web. It combines multiple natural language processing (NLP) techniques, and adopts a novel pattern acquisition method and an improved bootstrapping iteration strategy to extract tuples. The baseline method achieves an average precision of 78.12%; with a filtering measure, the method achieves 98.42%. The experimental results show that the method serves information extraction applications well, and that further processing of the extracted tuples can yield more valuable information.
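The seed-and-bootstrap idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names (`learn_patterns`, `apply_patterns`, `bootstrap`), the "capitalized words" entity matcher, and the example corpus are all assumptions made for illustration, and the paper's NLP preprocessing and filtering step are not reproduced. The sketch only shows the loop: patterns are learned from known tuples (kept only if they contain the seed keyword), then applied to find new tuples, and the process repeats.

```python
import re

# Toy entity model (an assumption): one or more capitalized words.
ENTITY = r"([A-Z]\w*(?: [A-Z]\w*)*)"

def learn_patterns(sentences, tuples, keyword):
    """Collect the text between a known entity pair as a pattern,
    keeping only patterns that mention the seed keyword."""
    patterns = set()
    for s in sentences:
        for a, b in tuples:
            i, j = s.find(a), s.find(b)
            if i != -1 and j != -1 and i + len(a) <= j:
                middle = s[i + len(a):j]
                if keyword in middle:
                    patterns.add(middle)
    return patterns

def apply_patterns(sentences, patterns):
    """Match '<entity><pattern><entity>' in every sentence."""
    found = set()
    for p in patterns:
        rx = re.compile(ENTITY + re.escape(p) + ENTITY)
        for s in sentences:
            for m in rx.finditer(s):
                found.add((m.group(1), m.group(2)))
    return found

def bootstrap(sentences, seed_tuple, keyword, max_iters=3):
    """Alternate pattern learning and tuple extraction until no new
    tuples appear or max_iters is reached."""
    tuples = {seed_tuple}
    for _ in range(max_iters):
        patterns = learn_patterns(sentences, tuples, keyword)
        new = apply_patterns(sentences, patterns) - tuples
        if not new:
            break
        tuples |= new
    return tuples

if __name__ == "__main__":
    corpus = [
        "Bill Gates is the founder of Microsoft.",
        "Larry Ellison is the founder of Oracle.",
        "Larry Ellison, founder of Oracle, spoke today.",
        "Jeff Bezos, founder of Amazon, retired.",
    ]
    print(sorted(bootstrap(corpus, ("Bill Gates", "Microsoft"), "founder")))
```

In this toy corpus the second pattern (", founder of ") is only learned after the first iteration has added a new tuple, which is the point of bootstrapping: each round of extracted tuples can expose further patterns. Without a filtering step, noisy patterns would accumulate in the same way, which is consistent with the gap the abstract reports between the baseline and the filtered method.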
Keywords:bootstrapping  entity relation  tuples  information extraction  web mining
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.