
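Chinese patent attribute-value pair extraction technology and its application (中文专利属性值对抽取技术及应用)
Cite this article: SUN Dong-pu, ZHU Ming-hua, LIN Hong-fei. Chinese patent attribute-value pair extraction technology and its application[J]. Computer Engineering & Science (计算机工程与科学), 2016, 38(4): 800-806.
Authors: SUN Dong-pu (孙东普)  ZHU Ming-hua (朱鸣华)  LIN Hong-fei (林鸿飞)
Affiliation: School of Computer Science and Technology, Dalian University of Technology
Funding: National Natural Science Foundation of China (61202254, 61402075); Natural Science Foundation of Liaoning Province (201202031, 201402003)
Abstract: Patent information extraction is the foundation of patent analysis, and identifying and extracting attributes and attribute values is the key problem it must solve. At present, few studies in Chinese patent information extraction address the simultaneous extraction of attributes and attribute values. Using Chinese patent abstracts as the experimental corpus and drawing on statistical learning, this paper proposes an extraction method based on conditional random fields. The method treats attributes and attribute values as named entities and trains a conditional random field model on the corpus to extract them; it then uses mined association rules to match attributes with attribute values. The experiments yield precision, recall, and F-score of 80.8%, 81.2%, and 81.0%, respectively, indicating that the method can extract attributes and attribute values simultaneously and effectively. In addition, based on the extraction results, the paper analyzes patents and compares patents of the same category, demonstrating the practical value of the method.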

Keywords: attribute extraction  attribute value extraction  Chinese patent  conditional random fields
Received: 2015-01-27
Revised: 2016-04-25
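
The three scores reported in the abstract can be checked against each other. The sketch below assumes the reported F value is the balanced F1 score, that is, the harmonic mean of precision and recall; the abstract does not state the weighting, and the function name `f_score` is illustrative rather than taken from the paper.

```python
def f_score(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-beta score; beta = 1 gives the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Values reported in the abstract (assumed to be precision and recall).
reported_precision = 0.808
reported_recall = 0.812

f1 = f_score(reported_precision, reported_recall)
print(f"F1 = {f1:.3f}")  # prints "F1 = 0.810"
```

Under that assumption the computed value agrees with the reported 81.0%.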

Chinese patent attribute value extraction technology and its application
SUN Dong-pu, ZHU Ming-hua, LIN Hong-fei. Chinese patent attribute value extraction technology and its application[J]. Computer Engineering & Science, 2016, 38(4): 800-806.
Authors: SUN Dong-pu  ZHU Ming-hua  LIN Hong-fei
Affiliation: School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China
Abstract: Patent information extraction is the foundation of patent analysis, and extracting attributes and attribute values is its key problem. However, few studies on Chinese patent information extraction address the simultaneous extraction of attributes and their values. Using Chinese patent abstracts as the corpus, we propose a statistical learning method based on conditional random fields (CRFs). First, treating attributes and attribute values as named entities, we train a CRF model on the corpus and use it to extract attributes and attribute values. Second, we use mined association rules to match attributes with their values. Experimental results show precision, recall, and F-score of 80.8%, 81.2%, and 81.0%, respectively. Analysis of patents and comparison of similar patents based on the extraction results demonstrate the practical value of the method.
Keywords:attribute extraction  attribute value extraction  Chinese patent  conditional random fields (CRFs)  
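
The two-stage pipeline in the abstract (sequence labeling, then attribute-to-value matching) can be illustrated with a toy sketch. Everything here is an assumption for illustration: the abstract does not give the tag scheme, so BIO tags with the hypothetical labels `ATTR` and `VAL` are used; the tag sequence is hand-written rather than produced by a trained CRF; and the "nearest preceding attribute" heuristic is a stand-in for the mined association rules, whose form the abstract does not describe.

```python
from typing import List, Tuple

Span = Tuple[str, str, int]  # (label, text, start index)


def decode_bio(tokens: List[str], tags: List[str]) -> List[Span]:
    """Collect labeled spans from a BIO tag sequence (hypothetical tag scheme)."""
    spans: List[Span] = []
    label, start, buf = None, 0, []
    for i, (tok, tag) in enumerate(zip(tokens, tags)):
        if tag.startswith("B-"):
            if label is not None:
                spans.append((label, "".join(buf), start))
            label, start, buf = tag[2:], i, [tok]
        elif tag.startswith("I-") and label == tag[2:]:
            buf.append(tok)
        else:
            # "O" or an inconsistent I- tag closes any open span.
            if label is not None:
                spans.append((label, "".join(buf), start))
            label, buf = None, []
    if label is not None:
        spans.append((label, "".join(buf), start))
    return spans


def pair_nearest(spans: List[Span]) -> List[Tuple[str, str]]:
    """Pair each value with the closest preceding attribute (stand-in heuristic)."""
    pairs = []
    last_attr = None
    for label, text, _ in spans:
        if label == "ATTR":
            last_attr = text
        elif label == "VAL" and last_attr is not None:
            pairs.append((last_attr, text))
    return pairs


# Invented example phrase ("diameter is 5 mm"), tagged by hand per character.
tokens = list("直径为5毫米")
tags = ["B-ATTR", "I-ATTR", "O", "B-VAL", "I-VAL", "I-VAL"]

spans = decode_bio(tokens, tags)
pairs = pair_nearest(spans)
print(pairs)  # prints [('直径', '5毫米')]
```

In a real system the `tags` list would come from the trained CRF model, and the pairing step would apply rules learned from the corpus instead of a fixed positional heuristic.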