Finding division points for a time series corpus based on structural change point detection
Authors: Hiroshi Kobayashi, Ryosuke Saga
Affiliation: Osaka Prefecture University, Sakai, Japan
Abstract: This paper describes a method for finding suitable points at which to divide a corpus with time series information, so that local and frequent keywords can be extracted. Previous work proposed a corpus-separation method for extracting keywords from a corpus, but that method divides the corpus at equal intervals and therefore cannot account for topic changes. The present paper draws on the idea of the topic model, using topics extracted through latent Dirichlet allocation (LDA) to track topic change. It then applies a structural change detection method to identify the points at which large topic changes occur and divides the corpus at those points. An experiment on newspaper articles spanning five years of topics confirms that the method detects the points at which document topics change and uses them as division points, and that it outperforms previous methods in terms of recall.
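The abstract does not state which change detection statistic is used, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that topic proportions per time period have already been obtained (for example from LDA), and it swaps in a simple least-squares binary segmentation as a stand-in for structural change point detection. The names `sse`, `best_split`, `division_points`, the `min_size` and `min_gain` parameters, and the toy data are all hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Row = Sequence[float]  # topic proportions for one time period (assumed input)


def sse(rows: Sequence[Row]) -> float:
    """Sum of squared deviations of each row from the segment's mean vector."""
    if not rows:
        return 0.0
    dims = len(rows[0])
    means = [sum(r[d] for r in rows) / len(rows) for d in range(dims)]
    return sum((r[d] - means[d]) ** 2 for r in rows for d in range(dims))


def best_split(rows: Sequence[Row], min_size: int) -> Optional[Tuple[int, float]]:
    """Return (split index, reduction in SSE) for the best single split, if any."""
    total = sse(rows)
    best: Optional[Tuple[int, float]] = None
    for k in range(min_size, len(rows) - min_size + 1):
        gain = total - (sse(rows[:k]) + sse(rows[k:]))
        if best is None or gain > best[1]:
            best = (k, gain)
    return best


def division_points(rows: Sequence[Row], min_size: int = 2,
                    min_gain: float = 0.1) -> List[int]:
    """Binary segmentation: keep splitting while a split reduces SSE by min_gain."""
    def recurse(lo: int, hi: int) -> List[int]:
        found = best_split(rows[lo:hi], min_size)
        if found is None or found[1] < min_gain:
            return []
        k = lo + found[0]
        return recurse(lo, k) + [k] + recurse(k, hi)

    return recurse(0, len(rows))


# Toy data (invented for illustration): two topics over eight time periods,
# with the dominant topic switching between period 3 and period 4.
topic_shares = [
    [0.8, 0.2], [0.9, 0.1], [0.8, 0.2], [0.9, 0.1],
    [0.2, 0.8], [0.1, 0.9], [0.2, 0.8], [0.1, 0.9],
]
points = division_points(topic_shares)
print(points)
```

In this sketch each returned index marks the first time period of a new segment, so the corpus would be divided just before that period instead of at fixed, equal intervals. The `min_gain` threshold is an arbitrary choice here and would need tuning on real data.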
Keywords:
This article is indexed in SpringerLink and other databases.