Chinese Comparative Sentence Identification Based on Multi-feature Fusion
Original title: 基于多特征融合的中文比较句识别算法
Citation: ZHANG Chen, FENG Chong, LIU Quanchao, SHI Chao, HUANG Heyan, ZHOU Haiyun. Chinese Comparative Sentence Identification Based on Multi-feature Fusion[J]. Journal of Chinese Information Processing (中文信息学报), 2013, 27(6): 110-117
Authors: ZHANG Chen, FENG Chong, LIU Quanchao, SHI Chao, HUANG Heyan, ZHOU Haiyun
Affiliation: Beijing Institute of Technology, Beijing 100081, China
Funding: National Key Basic Research Program of China (973 Program) (2013CB329605, 2013CB329303); National Natural Science Foundation of China (61201351); Key Program of the National Natural Science Foundation of China (61132009)
Abstract: Opinions carry important information in text, and comparative sentences are a common construction in opinionated reviews. To address the problem of identifying Chinese comparative sentences, this paper proposes and evaluates a method that combines rules with statistics. The method first normalizes the corpus and its word segmentation results, then performs a broad extraction by combining a lexicon of comparative feature words with syntactic structure templates and dependency relations. A CSR rule extraction algorithm is then designed to extract dependency relations, and a CRF is used to mine entity and semantic role information. Finally, an SVM classifier is run with different feature dimensions to find the feature combination that gives the best performance, which completes the accurate extraction.
Keywords: comparative sentence; rule; CRF; SVM
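
The lexicon-based part of the broad extraction stage can be illustrated with a minimal Python sketch. It is not the authors' implementation: the cue-word list `COMPARATIVE_CUES` and the function `broad_extract` are hypothetical names introduced here, the abstract does not give the actual dictionary, and the syntactic templates, dependency relations, CSR rules, CRF and SVM stages are all omitted. The sketch only shows the idea of keeping every sentence that contains a comparative cue word as a candidate for later, more precise filtering.

```python
# Hypothetical mini-lexicon of Chinese comparative cue words (an assumption;
# the paper's actual comparative feature word dictionary is not given here).
COMPARATIVE_CUES = ("比", "不如", "相比", "更", "最", "一样", "超过")


def broad_extract(sentences, cues=COMPARATIVE_CUES):
    """Return (sentence, matched_cues) pairs for sentences containing any cue word.

    This is a plain substring match, so it favours recall over precision:
    later stages would be needed to reject false candidates.
    """
    candidates = []
    for sentence in sentences:
        matched = [cue for cue in cues if cue in sentence]
        if matched:
            candidates.append((sentence, matched))
    return candidates


if __name__ == "__main__":
    sample = [
        "这款手机的屏幕比那款更清晰。",  # "This phone's screen is clearer than that one's."
        "今天天气很好。",                # "The weather is nice today."
        "这台电脑的性能不如上一代。",    # "This computer performs worse than the last generation."
    ]
    for sentence, matched in broad_extract(sample):
        print(sentence, matched)
```

A substring match like this will also accept non-comparative uses of words such as 最 or 更, which is consistent with the abstract's description of a broad first pass followed by a classifier-based accurate extraction.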