Similar literature
 20 similar documents found (search time: 46 ms)
1.
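With the development of digital media and related technologies, the danmaku (bullet-screen comment) system has emerged as a new mode of commenting and has become increasingly popular. It lets viewers post comments on a video's plot in real time and can also help viewers understand the video content. Danmaku text data provide new material for short-text processing and real-time data processing. Studying the characteristics of danmaku data and the sentiment they express can help us better understand video plots; studying the similarity between danmaku comments, and from it the relationships among users, not only gives deeper insight into the characteristics of danmaku users and uncovers latent connections between different videos, but can also provide more accurate guidance for choosing target audiences during video production. Danmaku text data are first collected and preprocessed, and the sentiment value of each text is then computed. To address the colloquial nature of danmaku text, a dictionary of commonly used online danmaku terms is built. By improving the traditional k-means clustering algorithm, all users who posted danmaku are classified based on sentiment value. This classification helps us understand the emotional similarities and differences among viewers of particular types of videos.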
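The clustering step in this abstract can be illustrated with a minimal sketch. It uses plain Lloyd-style k-means on one sentiment value per user, not the improved variant the paper proposes (whose details the abstract does not give). The per-user value is assumed here to be the mean of that user's comment scores, and the names `mean_sentiment_by_user`, `kmeans_1d` and `cluster_users` are hypothetical.

```python
import random


def mean_sentiment_by_user(scores_by_user):
    """Assumption: a user's sentiment value is the mean of their comment scores."""
    return {user: sum(s) / len(s) for user, s in scores_by_user.items() if s}


def kmeans_1d(values, k, iters=50, seed=0):
    """Plain k-means on scalar values; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(values, k)  # initial centroids drawn from the data
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: each value goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            # An empty cluster keeps its previous centroid.
            new_centroids.append(sum(members) / len(members) if members else centroids[c])
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, labels


def cluster_users(scores_by_user, k):
    """Map each user to a cluster label based on their mean sentiment value."""
    means = mean_sentiment_by_user(scores_by_user)
    users = list(means)
    _, labels = kmeans_1d([means[u] for u in users], k)
    return dict(zip(users, labels))


if __name__ == "__main__":
    # Illustrative scores only; not data from the paper.
    demo = {"u1": [-0.9, -0.7], "u2": [-0.8], "u3": [0.8, 1.0], "u4": [0.7]}
    print(cluster_users(demo, k=2))
```

With only one feature per user, the clusters are simply intervals on the sentiment axis; a real system would likely use richer features and the paper's modified algorithm.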

2.
Liu Bo, Ni Zeyang, Luo Junzhou, Cao Jiuxin, Ni Xudong, Liu Benyuan, Fu Xinwen. 《World Wide Web》, 2019, 22(6): 2953-2975

Social networking websites with microblogging functionality, such as Twitter or Sina Weibo, have emerged as popular platforms for discovering real-time information on the Web. Like most Internet services, these websites have become the targets of spam campaigns, which contaminate Web contents and damage user experiences. Spam campaigns have become a great threat to social network services. In this paper, we investigate crowd-retweeting spam in Sina Weibo, the counterpart of Twitter in China. We carefully analyze the characteristics of crowd-retweeting spammers in terms of their profile features, social relationships and retweeting behaviors. We find that although these spammers are likely to connect more closely than legitimate users, the underlying social connections of crowd-retweeting campaigns are different from those of other existing spam campaigns because of the unique features of retweets that are spread in a cascade. Based on these findings, we propose retweeting-aware link-based ranking algorithms to infer more suspicious accounts by using identified spammers as seeds. Our evaluation results show that our algorithms are more effective than other link-based strategies.


3.
This paper introduces a workload characterization study of YouTube, the most popular short video sharing service of Web 2.0. Based on a vast amount of data gathered over a five-month period, we analyzed the characteristics of around 250,000 popular and regular YouTube videos. In particular, we recursively collected the list of related videos for each video clip and analyzed their statistical behavior. Understanding the traffic of YouTube and similar Web 2.0 video sharing sites is crucial for developing synthetic workload generators. Workload simulators are required for evaluating methods that address the high bandwidth usage and scalability problems of Web 2.0 sites such as YouTube. The distribution models, in particular the Zipf-like behavior of popular YouTube video files, suggest that proxy caching of popular YouTube videos can reduce network traffic and increase the scalability of the YouTube Web site. The YouTube workload characteristics provided in this work enabled us to develop a workload generator to evaluate the effectiveness of this approach.
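The link between Zipf-like popularity and proxy caching can be made concrete with a small calculation. The sketch below assumes an idealized Zipf law in which the request probability of the video at popularity rank i is proportional to 1/i^s, and a cache that always holds the top-ranked videos. The exponent and catalogue size are illustrative assumptions, not values from the paper, and the function names are hypothetical.

```python
def zipf_probabilities(n_videos, exponent=1.0):
    """Request probability per popularity rank under an idealized Zipf law."""
    weights = [1.0 / (rank ** exponent) for rank in range(1, n_videos + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def ideal_hit_ratio(n_videos, cache_size, exponent=1.0):
    """Fraction of requests served by a cache holding the top `cache_size` videos."""
    probs = zipf_probabilities(n_videos, exponent)
    return sum(probs[:cache_size])


if __name__ == "__main__":
    # Caching 10% of a 1,000-video catalogue under exponent 1.0.
    print(round(ideal_hit_ratio(1000, 100), 3))
```

Under these assumptions, caching the top 10% of a 1,000-video catalogue serves roughly 69% of requests, which is the kind of skew that makes proxy caching attractive. Measured popularity distributions typically deviate from the ideal law, so this is an upper-level intuition rather than a prediction.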

4.
Online social networks have become immensely popular in recent years and are now major sources for tracking the reverberation of events and news throughout the world. However, the diversity and popularity of online social networks attract malicious users who inject new forms of spam. Spamming is a malicious activity in which a fake user spreads unsolicited messages in the form of bulk messages, fraudulent reviews, malware/viruses, hate speech, profanity, or advertising for marketing scams. In addition, spammers usually form a connected community of spam accounts and use them to spread spam to a large set of legitimate users. Consequently, it is highly desirable to detect such spammer communities in social networks. Even though a significant amount of work has been done on detecting spam messages and accounts, little research has addressed the detection of spammer communities and hidden spam accounts. In this work, an unsupervised approach called SpamCom is proposed for detecting spammer communities in Twitter. We model the Twitter network as a multilayer social network and exploit overlapping community-based features of users, represented in the form of hypergraphs, to identify spammers based on their structural behavior and URL characteristics. The use of community-based features, graph and URL characteristics of user accounts, and content similarity among users makes our technique very robust and efficient.

5.
Locating content in existing video archives is both time- and bandwidth-consuming, since users might have to download and manually watch large portions of superfluous videos. In this paper, we present two novel prototypes using an Internet-based video composition and streaming system with a keyword-based search interface that collects, converts, analyses, indexes, and ranks video content. At the user's request, the system can automatically extract portions of single videos or aggregate content from multiple videos to produce a single, personalized video stream on the fly.

6.
Concept detection is targeted at automatically labeling video content with semantic concepts appearing in it, like objects, locations, or activities. While concept detectors have become key components in many research prototypes for content-based video retrieval, their practical use is limited by the need for large-scale annotated training sets. To overcome this problem, we propose to train concept detectors on material downloaded from web-based video sharing portals like YouTube, such that training is based on tags given by users during upload, no manual annotation is required, and concept detection can scale up to thousands of concepts. On the downside, web video as training material is a complex domain, and the tags associated with it are weak and unreliable. Consequently, performance loss is to be expected when replacing high-quality state-of-the-art training sets with web video content. This paper presents a concept detection prototype named TubeTagger that utilizes YouTube content for autonomous training. In quantitative experiments, we compare the performance when training on web video and on standard datasets from the literature. It is demonstrated that concept detection in web video is feasible, and that, when testing on YouTube videos, the YouTube-based detector outperforms the ones trained on standard training sets. By applying the YouTube-based prototype to datasets from the literature, we further demonstrate that: (1) If training annotations on the target domain are available, the resulting detectors significantly outperform the YouTube-based tagger. (2) If no annotations are available, the YouTube-based detector achieves performance comparable to the ones trained on standard datasets (a moderate relative performance loss of 11.4% is measured) while offering the advantage of fully automatic, scalable learning. (3) By enriching conventional training sets with online video material, performance improvements of 11.7% can be achieved when generalizing to domains unseen in training.

7.
Currently, most video on-demand services offered over the Internet, including YouTube, do not exploit the idle resources available from end-users. We present a taxonomic analysis of user assistance in video on-demand systems, where users are both clients and servers and help with the task of video distribution. From a theoretical perspective, we develop a deterministic fluid model suitable for sequential systems. We mathematically prove that the Peer-to-Peer Sequential Fluid Model is globally stable in the Lyapunov sense, regardless of the network parameters of the cooperative system. We theoretically prove that cooperative systems always outperform non-cooperative solutions. From a practical point of view, a caching problem is proposed and discussed in order to tackle the technological concerns of massively distributing popular videos on demand. The goal is to distribute video items into repositories so as to minimize the waiting times of end-users. The caching problem belongs to the class of NP-Complete computational problems and is solved heuristically with a GRASP methodology enriched with a path-relinking technique. Predictions based on a statistical analysis of real-life YouTube traces suggest that the introduction of cooperation is both robust and economically attractive. These results highlight the harmony between our theoretical development and practice.

8.
9.
Accurate video tagging has become increasingly crucial for online video management and search. This article documents a novel framework called comprehensive video tagger (CVTagger) to facilitate accurate tag-based video annotation. The system applies both multimodal and temporal properties, combined with a novel hierarchical classification framework based on a multilayer concept model and regression analysis. The advanced architecture enables effective incorporation of both video concept dependency and temporal dynamics. Using a large-scale test collection containing 50,000 YouTube videos, a set of empirical studies has been carried out, and the experimental results demonstrate various advantages of CVTagger over state-of-the-art techniques.

10.
Online social video websites such as YouTube allow users to manually annotate their video documents with textual labels. These labels can be used as indexing keywords to facilitate search and organization of video data. However, manual video annotation is usually a labor-intensive and time-consuming process. In this work, we propose a novel social video annotation approach that combines multiple feature sets based on a tri-adaptation approach. The shots in each video are annotated by aggregating models that are learned from three complementary feature sets. Meanwhile, the models are collaboratively adapted by exploring unlabeled shots. In this sense, the method can be viewed as a novel semi-supervised algorithm that explores three complementary views. Our approach also exploits the temporal smoothness of video labels by applying a label correction strategy. Experiments on a web video dataset demonstrate the effectiveness of the proposed approach.

11.
With the rise of social networking services such as Facebook and Twitter, the problem of spam and content pollution has become more significant and intractable. Using social networking services, users are able to develop relationships and share messages with others in a very convenient manner; however, they are vulnerable to receiving spam messages. The automatic detection of spammers or content polluters on the network can effectively reduce the burden on the service provider in deciding on appropriate counteractions. Content polluters can be automatically identified by using supervised learning. To build a classification model with high accuracy automatically from the training data set, it is important to identify a set of useful features that can distinguish polluters from non-polluters. Moreover, because we deal with a huge amount of raw data in this process, the efficiency of data preparation and model creation is also a critical issue that needs to be addressed. In this paper, we present an efficient method for detecting content polluters on Twitter. Specifically, we propose a set of features that can be easily extracted from the messages and behaviors of Twitter users and construct a new breed of classifiers based on these features. The proposed approach requires only a minimal number of feature values per Twitter user and thus adds considerably less time to the overall mining process compared to other methods. Experiments confirm that the proposed approach outperforms previous approaches in both classification accuracy and processing time.

12.
With the rapid development of WiFi and 3G/4G, people tend to view videos on mobile devices. These devices are ubiquitous but have little memory for caching videos. As a result, in contrast to traditional computers, these devices aggravate the network pressure on content providers. Previous studies use CDNs to solve this problem. However, the static leasing mechanism of a CDN, in which the rental space cannot be dynamically adjusted, makes the operational cost soar and is incompatible with dynamic video delivery. In our study, based on a thorough analysis of user behavior from Tencent Video, a popular Chinese online video sharing platform, we identify two key user behaviors. Firstly, many users in the same region tend to watch the same video. Secondly, the popularity distribution of videos conforms to the Pareto principle, i.e., the top 20% most popular videos account for 80% of all video traffic. To exploit these observations, we propose and implement a novel cloud- and peer-assisted video on demand system (CPA-VoD). In the system, we group users in the same region into a peer swarm, and within a peer swarm, users can provide videos to other users by sharing their cached videos. In addition, we cache the 10% most popular videos in cloud servers to further alleviate the network pressure. We choose cloud servers to cache videos because the rental space can be dynamically adjusted. According to an evaluation on a real dataset from Tencent Video, CPA-VoD substantially reduces both the network pressure and the operational cost, with only 20.9% of traffic served by the content provider.

13.
Screencasts are used to capture a developer's screen while they narrate how a piece of software works or how the software can be extended. They have recently become a popular alternative to traditional text-based documentation. This paper describes our investigation into how developers produce and share developer-focused screencasts. In this study, we identified and analyzed a set of development screencasts from YouTube to explore what kinds of software knowledge are shared in video walkthroughs of code and what techniques are used for sharing software knowledge. We also interviewed YouTube screencast producers to understand their motivations for creating screencasts as well as to discover the challenges they face while producing code-focused videos. Finally, we compared YouTube screencasts to videos hosted on the professional RailsCasts website to better understand how the practices of this more curated ecosystem differ from those on the YouTube platform. Our three-phase study showed that video is a useful medium for communicating program knowledge between developers and that developers build their online persona and reputation by sharing videos through social channels. These findings led to a number of best practices for future screencast creators.

14.
The sharing and re-sharing of videos on social sites, blogs, e-mail, and other channels has given rise to the phenomenon of viral videos: videos that become popular through internet sharing. In this paper we seek to better understand viral videos on YouTube by analyzing sharing and its relationship to video popularity using millions of YouTube videos. The socialness of a video is quantified by classifying the referrer sources for video views as social (e.g. an emailed link, Facebook referral) or non-social (e.g. a link from related videos). We find that viewership patterns of highly social videos are very different from those of less social videos. For example, highly social videos rise to, and fall from, their peak popularity more quickly than less social videos. We also find that not all highly social videos become popular, and not all popular videos are highly social. Using our insights on viral videos, we are able to develop a method for ranking blogs and websites on their ability to spread viral videos.
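The socialness measure described in this abstract can be sketched as a simple ratio. The abstract only says that referrer sources are classified as social or non-social; treating socialness as the fraction of views arriving from social referrers is an assumption made here, and the referrer labels and the name `socialness` are hypothetical.

```python
# Hypothetical referrer labels treated as social; the paper's actual
# classification of referrer sources is not given in the abstract.
SOCIAL_REFERRERS = {"email", "facebook", "twitter", "blog"}


def socialness(views_by_referrer):
    """Fraction of a video's views that arrived via social referrers."""
    total = sum(views_by_referrer.values())
    if total == 0:
        return 0.0  # no views recorded, so no measurable socialness
    social = sum(n for ref, n in views_by_referrer.items() if ref in SOCIAL_REFERRERS)
    return social / total


if __name__ == "__main__":
    # Illustrative counts only; not data from the paper.
    views = {"facebook": 300, "email": 100, "related_videos": 500, "search": 100}
    print(socialness(views))
```

A ratio like this ignores total popularity by construction, which matches the abstract's observation that highly social videos are not necessarily popular ones.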

15.
Despite the large variety and wide adoption of techniques to detect and filter unsolicited messages (spam), the total amount of such messages on the Internet remains very large. Some reports indicate that around 80% of all emails are spam. As a consequence, significant amounts of network resources are still wasted, as filtering is usually performed only at the destination email server. Moreover, a considerable part of these unsolicited messages is sent by users who are unaware of their spamming activity and may thus inadvertently be classified as spammers. In this case, these oblivious users act as spambots, i.e., members of a spamming botnet. This paper proposes a new method for detecting spammers at the source network, whether they are individual malicious users or oblivious members of a spamming botnet. Our method, called SpaDeS, is based on a supervised classification technique and relies only on network-level metrics, thus not requiring inspection of message content. We evaluate SpaDeS using real datasets collected from a Brazilian broadband ISP. Our results show that our method is quite effective, correctly classifying the vast majority (87%) of the spammers while misclassifying only around 2% of the legitimate users.

16.
Online Social Networks act as a popular forum to promote and manage the reputation of aristocrats. Generally, big organizations, politicians, celebrities, and journalists require a huge number of followers/fans to promote and manage their reputation on Online Social Networks such as Twitter. This demand for more followers has given rise to the Twitter Followers Market, which deals with the sale and purchase of fake/compromised Twitter accounts. In this paper, an analysis of the merchants of this marketing industry has been conducted using graph- and content-based features. The study was conducted by collecting 15,750 tweets related to the sale or purchase of Twitter followers posted by around 995 users. The analysis infers that the merchants of this marketing industry fulfill the characteristics of spammers as stated by the rules and policies of Twitter. Further, a machine learning classification approach has been used to distinguish these merchants from genuine users based on their behavioral features. To the best of our knowledge, this is the first study to analyze and then categorize as spammers the merchants involved in the sale of Twitter accounts.

17.
Many video service sites, led by YouTube, know what content requires copyright protection. However, they lack a copyright protection system that automatically distinguishes whether uploaded videos contain legal or illegal content. Existing protection techniques use content-based retrieval methods that compare the features of videos. However, if the video encoding has changed in resolution, bit-rate or codec, these techniques do not perform well. Thus, this paper proposes a novel video matching algorithm that works even if the type of encoding has changed. We also suggest an intelligent copyright protection system using the proposed algorithm. This can serve to automatically prevent the uploading of illegal content. The proposed method achieved an accuracy of 97% with the searching algorithm in video-matching experiments and 98.62% with the automation algorithm in copyright-protection experiments. Therefore, this system could form a core technology with which video service sites identify illegal content and automatically block access to it.

18.
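YouTube is a world-famous video website that provides users with high-quality video and other services. Based on mobile streaming media service specifications, a system architecture for a YouTube media player client is designed. According to the player's functional design, it is divided into several modules: HTTP engine, audio/video buffering, audio/video decoding, audio/video synchronization, adaptation, and UI. Based on a comparative analysis of mobile streaming media transport protocols, the work focuses on an implementation scheme that uses the HTTP protocol to optimize the network engine. An H.264 decoder is used to optimize the transmitted data at both the algorithm level and the code level, enabling the media player to play videos from the YouTube website.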

19.
User-Generated Content has become very popular since new web services such as YouTube allow for the distribution of user-produced media content. YouTube-like services are different from existing traditional VoD services in that the service provider has only limited control over the creation of new content. We analyze how content distribution in YouTube is realized and then conduct a measurement study of YouTube traffic in a large university campus network. Based on these measurements, we analyze the duration and the data rate of streaming sessions, the popularity of videos, and access patterns for video clips from the clients in the campus network. The analysis of the traffic shows that trace statistics are relatively stable over short-term periods while long-term trends can be observed. We demonstrate how synthetic traces can be generated from the measured traces and show how these synthetic traces can be used as inputs to trace-driven simulations. We also analyze the benefits of alternative distribution infrastructures to improve the performance of a YouTube-like VoD service. The results of these simulations show that P2P-based distribution and proxy caching can reduce network traffic significantly and allow for faster access to video clips.

20.