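News event detection driven by incremental sampling clustering
Cite this article: CHEN Xiaoqi, XIE Zhenping, LIU Yuan. News event detection driven by incremental sampling clustering[J]. CAAI Transactions on Intelligent Systems (智能系统学报), 2020, 15(6): 1175-1184.
Authors: CHEN Xiaoqi    XIE Zhenping    LIU Yuan
Affiliation: 1. School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, Jiangsu, China; 2. Jiangsu Key Laboratory of Media Design and Software Technology, Jiangnan University, Wuxi 214122, Jiangsu, China
Abstract: To achieve better event detection and representative news extraction, this work adopts the perspective of clustering by sampling representative points of a data set and implements an integrated method for event detection and representation. For a given news stream, an information supporting degree is first introduced to define the relationship weights between news items and between events. A two-layer affinity propagation algorithm is then applied iteratively to build a one-way event content support network over the whole time stream, which yields layered incremental sampling of representative news. A maximum similarity partition strategy is then used to cluster the whole news stream around the representative news. Experimental results show that, compared with existing related methods, the new method is markedly more computationally efficient on large-scale news stream data, extracts highly representative news from the stream, and achieves better news document clustering quality. Its hot event detection results agree closely with the major news selected by authoritative institutions.

Keywords: news stream data  event detection  representative news  incremental sampling  information supporting degree  affinity propagation  event network  hierarchical clustering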

News event detection driven by incremental sampling clustering
CHEN Xiaoqi, XIE Zhenping, LIU Yuan. News event detection driven by incremental sampling clustering[J]. CAAI Transactions on Intelligent Systems, 2020, 15(6): 1175-1184.
Authors:CHEN Xiaoqi    XIE Zhenping    LIU Yuan  
Affiliation:1. School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, China;2. Jiangsu Key Laboratory of Media Design and Software Technology, Jiangnan University, Wuxi 214122, China
Abstract: To improve event detection and representative news extraction, an integrated analysis method for event detection and representation is proposed, based on a sampling clustering strategy over news documents. Given news flow data, we first introduce an information supporting degree and use it to define two weights, one on the relationships between news items and one on the relationships between events. We then construct a one-way event content support network over the whole time flow by iterating a double-layer affinity propagation algorithm, which realizes layer-by-layer incremental sampling of representative news. Finally, the whole news flow is clustered around the representative news using a maximum similarity division strategy. According to our experimental results, compared with existing related methods, the new method is considerably more computationally efficient when processing large-scale news flow data. It can extract highly representative news from the news flow and obtains better clustering quality for news documents. Its hot event detection results are highly consistent with the major news selected by authoritative institutions.
Keywords:news flow data  event detection  representative news  incremental sampling  information supporting degree  affinity propagation  event network  hierarchical clustering
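
The two stages named in the abstract, incremental sampling of representative news and maximum similarity division, can be illustrated with a minimal sketch. This is not the authors' implementation. A simple similarity threshold stands in for the double-layer affinity propagation step, plain cosine similarity over word counts stands in for the information supporting degree, and every function name, the `threshold` parameter, and the sample headlines are assumptions made only for illustration.

```python
import math
from collections import Counter


def vectorize(text):
    """Turn a text into a bag-of-words term-frequency vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def sample_representatives(batches, threshold=0.5):
    """Incrementally collect representative documents, batch by batch.

    Assumed stand-in for affinity propagation: a document becomes a new
    representative only if every existing representative is less than
    `threshold` similar to it.
    """
    representatives = []  # list of (doc_id, vector) pairs
    for batch in batches:  # batches are expected in time order
        for doc_id, text in batch:
            vec = vectorize(text)
            if all(cosine(vec, rep_vec) < threshold
                   for _, rep_vec in representatives):
                representatives.append((doc_id, vec))
    return representatives


def assign_by_max_similarity(docs, representatives):
    """Assign every document to its most similar representative."""
    clusters = {rep_id: [] for rep_id, _ in representatives}
    for doc_id, text in docs:
        vec = vectorize(text)
        best_id, _ = max(representatives,
                         key=lambda rep: cosine(vec, rep[1]))
        clusters[best_id].append(doc_id)
    return clusters


# Hypothetical news stream split into two time-ordered batches.
batches = [
    [("n1", "earthquake hits coastal city"),
     ("n2", "strong earthquake hits city coast")],
    [("n3", "team wins football final"),
     ("n4", "earthquake rescue in coastal city")],
]
representatives = sample_representatives(batches, threshold=0.5)
all_docs = [doc for batch in batches for doc in batch]
clusters = assign_by_max_similarity(all_docs, representatives)
print(clusters)
```

In this toy run the items n1 and n3 become representatives, and the remaining earthquake stories are attached to n1. Because representatives are only ever appended as new batches arrive, earlier batches never need to be re-clustered, which is the property that makes a sampling-based approach attractive on large streams.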