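基于特征加权的新闻主题句抽取 (News Topic Sentence Extraction via Weighted Features)
Cite this article: 万国, 张桂平, 白宇, 朱耀辉. 基于特征加权的新闻主题句抽取[J]. 中文信息学报, 2017, 31(5): 120-126.
Authors: 万国 (WAN Guo)  张桂平 (ZHANG Guiping)  白宇 (BAI Yu)  朱耀辉 (ZHU Yaohui)
Affiliation: Knowledge Engineering Research Center, Shenyang Aerospace University, Shenyang, Liaoning 110136, China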
Funding: Natural Science Foundation of Liaoning Province (20170540696); Shenyang Science and Technology Plan Project (17-231-1-82)
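Abstract: Based on the characteristics of news text, this paper analyzes news titles and bodies separately and proposes a feature-weighted topic sentence extraction method for news text. First, the distribution of topic sentences within news texts is analyzed, and a position feature is selected. Then, because the title hints at the main point of a news item, two title-based features are selected: the overlap between a sentence and the title, and the relevancy between a sentence and the title. The relevancy feature incorporates a maximum matching algorithm on a weighted bipartite graph. Finally, sentences are ranked by score and the topic sentences are extracted. Experiments show that the method achieves a P@1 of 75.9% and a P@3 of 92.4%.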

Keywords: feature weighting  overlap  relevancy  weighted bipartite graph
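
The pipeline described in the abstract (position, title overlap, and title relevancy combined into a weighted score, with relevancy computed by maximum matching on a weighted bipartite graph) can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the abstract gives no formulas, so the position decay `1 / (1 + i)`, the character-based `word_sim`, the normalization by title length, and the feature weights `(0.2, 0.4, 0.4)` are all assumptions, and the input is assumed to be already segmented into words.

```python
def word_sim(a, b):
    """Hypothetical word similarity: Jaccard overlap of the two character sets."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb)


def max_weight_matching(weights):
    """Total weight of a maximum-weight bipartite matching (Hungarian algorithm).

    `weights` is a rows x cols matrix of non-negative edge weights.
    """
    if not weights or not weights[0]:
        return 0.0
    if len(weights) > len(weights[0]):          # the algorithm needs rows <= cols
        weights = [list(col) for col in zip(*weights)]
    n, m = len(weights), len(weights[0])
    inf = float("inf")
    u, v = [0.0] * (n + 1), [0.0] * (m + 1)     # dual potentials
    p, way = [0] * (m + 1), [0] * (m + 1)       # p[j]: row matched to column j
    for i in range(1, n + 1):
        p[0], j0 = i, 0
        minv, used = [inf] * (m + 1), [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    # Negate weights: maximizing weight == minimizing cost.
                    cur = -weights[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0:                                # augment along the found path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    return sum(weights[p[j] - 1][j - 1] for j in range(1, m + 1) if p[j])


def score_sentence(index, sentence, title, w=(0.2, 0.4, 0.4)):
    """Weighted sum of position, overlap, and relevancy (weights are assumed)."""
    position = 1.0 / (1 + index)                            # assumed decay
    overlap = len(set(title) & set(sentence)) / len(set(title))
    sims = [[word_sim(t, s) for s in sentence] for t in title]
    relevancy = max_weight_matching(sims) / len(title)      # assumed normalization
    return w[0] * position + w[1] * overlap + w[2] * relevancy


def rank_sentences(title, sentences):
    """Sentence indices ordered from highest to lowest score."""
    scores = [score_sentence(i, s, title) for i, s in enumerate(sentences)]
    return sorted(range(len(sentences)), key=lambda i: -scores[i])


# Toy, made-up example: a two-word title and three pre-segmented sentences.
title = ["股市", "上涨"]
sentences = [["今天", "天气", "晴朗"], ["股市", "大幅", "上涨"], ["市场", "观望"]]
ranking = rank_sentences(title, sentences)
```

In this toy input the second sentence repeats both title words, so its overlap and relevancy terms outweigh the position advantage of the first sentence. In practice the feature weights would have to be tuned on annotated data, and a semantic word similarity would likely replace the character-based stand-in used here.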

News Topic Sentence Extraction via Weighted Features
WAN Guo,ZHANG Guiping,BAI Yu,ZHU Yaohui.News Topic Sentence Extraction via Weighted Features[J].Journal of Chinese Information Processing,2017,31(5):120-126.
Authors:WAN Guo  ZHANG Guiping  BAI Yu  ZHU Yaohui
Affiliation:Knowledge Engineering Research Center, Shenyang Aerospace University, Shenyang, Liaoning 110136, China
Abstract: A topic sentence extraction method for news text based on weighted features is proposed. First, a position feature is derived from the distribution of topic sentences in news texts. Then, because the news title is closely related to the theme, the overlap ratio between each sentence and the title is calculated. To better estimate the relevancy between the title and a candidate topic sentence, maximum matching on a weighted bipartite graph is applied. Finally, topic sentences are selected according to their ranked scores. Experimental results show that the proposed method reaches 75.9% in P@1 and 92.4% in P@3.
Keywords:feature weighted  overlap ratio  relevancy degree  weighted bipartite graph  
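
The abstract reports P@1 and P@3 without defining them. Because the reported P@3 is higher than P@1, one plausible reading is the share of documents for which at least one annotated topic sentence appears among the top-k ranked sentences. The sketch below computes the metric under that assumption only. The function name `hit_rate_at_k` and the toy rankings are hypothetical and are not data from the paper.

```python
def hit_rate_at_k(rankings, gold, k):
    """Share of documents with a gold topic sentence among the top-k ranked ones.

    rankings: one list of sentence indices per document, best first.
    gold: one set of annotated topic-sentence indices per document.
    """
    if not rankings or len(rankings) != len(gold):
        raise ValueError("rankings and gold must be non-empty and aligned")
    hits = sum(1 for ranked, g in zip(rankings, gold) if set(ranked[:k]) & set(g))
    return hits / len(rankings)


# Made-up toy data for four documents of three sentences each.
toy_rankings = [[1, 0, 2], [2, 1, 0], [0, 2, 1], [1, 2, 0]]
toy_gold = [{1}, {0}, {0}, {2}]
p_at_1 = hit_rate_at_k(toy_rankings, toy_gold, 1)
p_at_3 = hit_rate_at_k(toy_rankings, toy_gold, 3)
```

Under this reading the value can only stay the same or grow as k increases, which matches the direction of the reported figures.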