Affiliation: | 1.School of Computer Science and Engineering, Northeastern University, Shenyang, China ;2.School of Computer Science and Information Technology, RMIT University, Melbourne, Australia ;3.Key Laboratory of Medical Image Computing (Northeastern University), Ministry of Education, Shenyang, China ; |
Abstract: | Effectively and efficiently summarizing social media is crucial and non-trivial to analyze social media. On social streams, events which are the main concept of semantic similar social messages, often bring us a firsthand story of daily news. However, to identify the valuable news, it is almost impossible to plough through millions of multi-modal messages one by one with traditional methods. Thus, it is urgent to summarize events with a few representative data samples on the streams. In this paper, we provide a vivid textual-visual media summarization approach for microblog streams, which exploits the incremental latent semantic analysis (LSA) of detected events. Firstly, with a novel weighting scheme for keyword relationship, we can detect and track daily sub-events on a keyword relation graph (WordGraph) of microblog streams effectively. Then, to summarize the stream with representative texts and images, we use cross-modal fusion to analyze the semantics of microblog texts and images incrementally and separately, with a novel incremental cross-modal LSA algorithm. The experimental results on a real microblog dataset show that our method is at least 1.31% better and 23.67% faster than existing state-of-the-art methods, and cross-modal fusion can improve the summarization performance by 4.16% on average. |