首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
In this work, we propose a novel multimedia summarization technique from Online Social Networks (OSNs). In particular, we model each Multimedia Social Network (MSN)—i.e. an OSN focusing on the management and sharing of multimedia information—using an hypergraph based approach and exploit influence analysis methodologies to determine the most important multimedia objects with respect to one or more topics of interest. Successively, we obtain from the list of candidate objects a multimedia summary using a summarization model together with an heuristics that aims to generate summaries with priority (with respect to some user keywords), continuity, variety and not receptiveness features. The performed experiments on Flickr shows the effectiveness of proposed approach.  相似文献   

2.
The rise of online social media has led to an explosion of metadata-containing user generated content. The tracking of metadata distribution is essential to understand social media. This paper presents two statistical models that detect interpretable topics over time along with their hashtags distribution. A topic is represented by a cluster of words that frequently occur together, and a context is represented by a cluster of hashtags, i.e., the hashtag distribution. The models combine a context with a related topic by jointly modeling words with hashtags and time. Experiments with real-world datasets demonstrate that the proposed models discover topics over time with related contexts effectively.  相似文献   

3.
李山  沈浩 《计算机工程与设计》2021,42(10):2979-2987
微博这类高自我呈现社交媒体的话题热度预测方法,是否同样适用于知乎这类低自我呈现社交媒体平台,是有待检验的问题.对此从意见领袖和群体特征维度构建指标体系,采用基于树模型、核空间、线性模型、神经网络等4类(共11种)机器学习回归算法,构建话题热度预测模型进行对比计算分析.结果发现与高自我呈现平台不同,线性模型效果更好,其中对特征进行选择的弹性网回归算法的效果最佳;群体规模对话题热度影响较小,话题类型却对话题热度影响较大.  相似文献   

4.
In recent years, microblogs have become an important source for reporting real-world events. A real-world occurrence reported in microblogs is also called a social event. Social events may hold critical materials that describe the situations during a crisis. In real applications, such as crisis management and decision making, monitoring the critical events over social streams will enable watch officers to analyze a whole situation that is a composite event, and make the right decision based on the detailed contexts such as what is happening, where an event is happening, and who are involved. Although there has been significant research effort on detecting a target event in social networks based on a single source, in crisis, we often want to analyze the composite events contributed by different social users. So far, the problem of integrating ambiguous views from different users is not well investigated. To address this issue, we propose a novel framework to detect composite social events over streams, which fully exploits the information of social data over multiple dimensions. Specifically, we first propose a graphical model called location-time constrained topic (LTT) to capture the content, time, and location of social messages. Using LTT, a social message is represented as a probability distribution over a set of topics by inference, and the similarity between two messages is measured by the distance between their distributions. Then, the events are identified by conducting efficient similarity joins over social media streams. To accelerate the similarity join, we also propose a variable dimensional extendible hash over social streams. We have conducted extensive experiments to prove the high effectiveness and efficiency of the proposed approach.  相似文献   

5.
Social media data can be valuable in many ways. However, the vast amount of content shared and the linguistic variants of languages used on social media are making it very challenging for high-value topics to be identified. In this paper, we present an unsupervised multilingual approach for identifying highly relevant terms and topics from the mass of social media data. This approach combines term ranking, localised language analysis, unsupervised topic clustering and multilingual sentiment analysis to extract prominent topics through analysis of Twitter's tweets from a period of time. It is observed that each of the ranking methods tested has their strengths and weaknesses, and that our proposed ‘Joint’ ranking method is able to take advantage of the strengths of the ranking methods. This ‘Joint’ ranking method coupled with an unsupervised topic clustering model is shown to have the potential to discover topics of interest or concern to a local community. Practically, being able to do so may help decision makers to gauge the true opinions or concerns on the ground. Theoretically, the research is significant as it shows how an unsupervised online topic identification approach can be designed without much manual annotation effort, which may have great implications for future development of expert and intelligent systems.  相似文献   

6.
One-class learning and concept summarization for data streams   总被引:2,自引:2,他引:0  
In this paper, we formulate a new research problem of concept learning and summarization for one-class data streams. The main objectives are to (1) allow users to label instance groups, instead of single instances, as positive samples for learning, and (2) summarize concepts labeled by users over the whole stream. The employment of the batch-labeling raises serious issues for stream-oriented concept learning and summarization, because a labeled instance group may contain non-positive samples and users may change their labeling interests at any time. As a result, so the positive samples labeled by users, over the whole stream, may be inconsistent and contain multiple concepts. To resolve these issues, we propose a one-class learning and summarization (OCLS) framework with two major components. In the first component, we propose a vague one-class learning (VOCL) module for concept learning from data streams using an ensemble of classifiers with instance level and classifier level weighting strategies. In the second component, we propose a one-class concept summarization (OCCS) module that uses clustering techniques and a Markov model to summarize concepts labeled by users, with only one scanning of the stream data. Experimental results on synthetic and real-world data streams demonstrate that the proposed VOCL module outperforms its peers for learning concepts from vaguely labeled stream data. The OCCS module is also able to rebuild a high-level summary for concepts marked by users over the stream.  相似文献   

7.
Innovations in Systems and Software Engineering - The availability of images of events almost in real-time on social media has a prospect in many application developments. A humanitarian technology...  相似文献   

8.
提出了一种基于主题与子事件抽取的多文档自动文摘方法。该方法突破传统词频统计方法,除考虑词语频率、位置信息外,还将词语是否为描述文本集合的主题和子事件作为因素,提取出了8个基本特征,利用逻辑回归模型预测基本特征对词语权重的影响,计算词语权重。通过建立句子向量空间模型给句子打分,结合句子分数和冗余度产生文摘。对N-gram同现频率、主题词覆盖率和高频词覆盖率3种不同参数,分别在Coverage Baseline、Centroid-Based Summary和Word Mining based Summary(WMS)3种不同文摘系统下所产生的文摘质量,进行了对比实验,结果表明WMS系统在多方面具有优越的性能。  相似文献   

9.
Topics often transit among documents in a document collection. To improve the accuracy of the topic detection and tracking (TDT) algorithms in discovering topics or classifying documents, it is necessary to make full use of this kind of topic transition information. However, TDT algorithms usually find topics based on topic models, such as LDA, pLSI, etc., which are a kind of mixture model and make the topic transition difficult to be denoted and implemented. A topic transition model representation based on hidden Markov model is present, and learning the topic transition from documents is discussed. Based on the model, two TDT algorithms incorporating topic transition, i.e. topic discovering and document classifying, are provided to show the application of the proposed model. Experiments on two real-world document collections are done with the two algorithms, and performance comparison with other similar algorithm shows that the accuracy can achieve 93% for topic discovering in Reuters-21578, and 97.3% in document classifying. Furthermore, topic transition discovered by the algorithm on a dataset which was collected from a BBS website is consistent with the manual analysis results.  相似文献   

10.
Detecting topics from Twitter streams has become an important task as it is used in various fields including natural disaster warning, users opinion assessment, and traffic prediction. In this article, we outline different types of topic detection techniques and evaluate their performance. We categorize the topic detection techniques into five categories which are clustering, frequent pattern mining, Exemplar-based, matrix factorization, and probabilistic models. For clustering techniques, we discuss and evaluate nine different techniques which are sequential k-means, spherical k-means, Kernel k-means, scalable Kernel k-means, incremental batch k-means, DBSCAN, spectral clustering, document pivot clustering, and Bngram. Moreover, for matrix factorization techniques, we analyze five different techniques which are sequential Latent Semantic Indexing (LSI), stochastic LSI, Alternating Least Squares (ALS), Rank-one Downdate (R1D), and Column Subset Selection (CSS). Additionally, we evaluate several other techniques in the frequent pattern mining, Exemplar-based, and probabilistic model categories. Results on three Twitter datasets show that Soft Frequent Pattern Mining (SFM) and Bngram achieve the best term precision, while CSS achieves the best term recall and topic recall in most of the cases. Moreover, Exemplar-based topic detection obtains a good balance between the term recall and term precision, while achieving a good topic recall and running time.  相似文献   

11.
Statistical summaries of IP traffic are at the heart of network operation and are used to recover aggregate information on subpopulations of flows. It is therefore of great importance to collect the most accurate and informative summaries given the router's resource constraints. A summarization algorithm, such as Cisco's sampled NetFlow, is applied to IP packet streams that consist of multiple interleaving IP flows. We develop sampling algorithms and unbiased estimators which address sources of inefficiency in current methods. First, we design tunable algorithms whereas currently a single parameter (the sampling rate) controls utilization of both memory and processing/access speed (which means that it has to be set according to the bottleneck resource). Second, we make a better use of the memory hierarchy, which involves exporting partial summaries to slower storage during the measurement period.  相似文献   

12.
13.
With the rapid increase in social websites that has dramatically increased the volume of social media, which includes the use of images and videos, visual understanding has attracted great interest in several areas such as multimedia, computer vision, and pattern recognition. Valuable auxiliary resources available on social websites, such as user-provided tags, aid in the tasks of visual understanding. Therefore, several methods have been proposed for exploring the auxiliary resources for tag refinement, image retrieval, and media summarization. This work conducts a comprehensive survey of recent advances in visual understanding by mining social media in order to discuss their merits and limitations. We then analyze the difficulties and challenges of visual understanding followed by several possible future research directions.  相似文献   

14.
Modern Web search engines still have many limitations: search terms are not disambiguated, search terms in one query cannot be in different languages, the retrieved media items have to be in the same language as the search terms and search results are not integrated across a live stream of different media channels, including TV, online news and social media. The system described in this paper enables all of this by combining a media stream processing architecture with cross-lingual and cross-modal semantic annotation, search and recommendation. All those components were developed in the xLiMe project.  相似文献   

15.
Multimedia Tools and Applications - The era of video data has fascinated users into creating, processing, and manipulating videos for various applications. Voluminous video data requires higher...  相似文献   

16.
Multimedia Tools and Applications - Recently, with the widespread popularity of SNS (Social Network Service), such as Twitter, Facebook, people are increasingly accustomed to sharing feeling,...  相似文献   

17.
Microblogging services allow users to publish their thoughts, activities, and interests in the form of text streams and to share them with others in a social network. A user’s text stream in a microblogging service is temporally composed of the posts the user has written or republished from other socially connected users. In this context, most research on the microblogging service has primarily focused on social graph or topic extraction from the text streams, and in particular, several studies attempted to discover user’s topics of interests from a text stream since the topics play a crucial role in user search, friend recommendation, and contextual advertisement. Yet, they did not yet fully address unique properties of the stream. In this paper, we study a problem of detecting the topics of long-term steady interests to a user from a text stream, considering its dynamic and social characteristics, and propose a graph-based topic extraction model. Extensive experiments have been carried out to investigate the effects of the proposed approach by using a real-world dataset, and the proposed model is shown to produce better performance than the existing alternatives.  相似文献   

18.
Keyphrase extraction from social media is a crucial and challenging task. Previous studies usually focus on extracting keyphrases that provide the summary of a corpus. However, they do not take users’ specific needs into consideration. In this paper, we propose a novel three-stage model to learn a keyphrase set that represents or related to a particular topic. Firstly, a phrase mining algorithm is applied to segment the documents into human-interpretable phrases. Secondly, we propose a weakly supervised model to extract candidate keyphrases, which uses a few pre-specific seed keyphrases to guide the model. The model consequently makes the extracted keyphrases more specific and related to the seed keyphrases (which reflect the user’s needs). Finally, to further identify the implicitly related phrases, the PMI-IR algorithm is employed to obtain the synonyms of the extracted candidate keyphrases. We conducted experiments on two publicly available datasets from news and Twitter. The experimental results demonstrate that our approach outperforms the state-of-the-art baselines and has the potential to extract high-quality task-oriented keyphrases.  相似文献   

19.
Many social media facilitate paralinguistic digital affordances (PDAs): one-click tools for phatic communication to which senders and receivers alike ascribe meaning. This research explores the nature of social support perceived from the receipt of PDAs within social media, seeking to understand how individuals ascribe supportive meaning to PDAs based on (1) their goal in the post to which the PDA was used as a reply, (2) relational closeness with the PDA provider, and (3) the perceived automaticity of the PDA received. A national survey (N = 325) explored the receipt of PDAs across five social media, and facilitated cross-platform analysis. Analyses reveal both main and interaction effects among the three proposed antecedents, so that intentional PDAs from relationally close providers to messages seeking social support were perceived as most supportive. Findings reveal individuals heuristically make idiosyncratic sense of the same cue from different senders in different situations.  相似文献   

20.
Due to the increasing popularity of contents of social media platforms, the number of posts and messages is steadily increasing. A huge amount of data is generated daily as an outcome of the interactions between fans of the networking platforms. It becomes extremely troublesome to find the most relevant, interactive information for the subscribers. The aim of this work is to enable the users to get a powerful brief of comments without reading the entire list. This paper opens up a new field of short text summarization (STS) predicated on a hybrid ant colony optimization coming with a mechanism of local search, called ACO-LS-STS, to produce an optimal or near-optimal summary. Initially, the graph coloring algorithm, called GC-ISTS, was employed before to shrink the solution area of ants to small sets. Evidently, the main purpose of using the GC algorithm is to make the search process more facilitated, faster and prevents the ants from falling into the local optimum. First, the dissimilar comments are assembled together into the same color, at the same time preserving the information ratio as for an original list of comment. Subsequently, activating the ACO-LS-STS algorithm, which is a novel technique concerning the extraction of the most interactive comments from each color in a parallel form. At the end, the best summary is picked from the best color. This problem is formalized as an optimization problem utilizing GC and ACO-LS to generate the optimal solution. Eventually, the proposed algorithm was evaluated and tested over a collection of Facebook messages with their associated comments. Indeed, it was found that the proposed algorithm has an ability to capture a good solution that is guaranteed to be near optimal and had realized notable performance in comparison with traditional document summarization algorithms.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号