Similar literature
1.
2.
3.
4.
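News segmentation based on multimodal text, audio, and video information   Total citations: 1, self-citations: 0, citations by others: 1
This paper proposes an automatic TV news segmentation scheme that fuses multimodal text, audio, and video features. Taking the characteristics of each medium into account, the scheme first pre-segments the text with a vector model and a GMM, the speech with spectrograms and an HMM, and the video with an improved histogram and an SVM classifier. Then, on the basis of time synchronization, it fuses the pre-segmented data with an ANN under a composite strategy to obtain video segments that carry semantic content. Experimental results show that the method is effective, that the resulting video segments retain relatively complete semantic information, and that over-fragmented segmentation is avoided.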

5.
Computer simulated avatars and humanoid robots have an increasingly prominent place in today's world. Acceptance of these synthetic characters depends on their ability to properly and recognizably convey basic emotion states to a user population. This study presents an analysis of the interaction between emotional audio (human voice) and video (simple animation) cues. The emotional relevance of the channels is analyzed with respect to their effect on human perception and through the study of the extracted audio-visual features that contribute most prominently to human perception. As a result of the unequal level of expressivity across the two channels, the audio was shown to bias the perception of the evaluators. However, even in the presence of a strong audio bias, the video data were shown to affect human perception. The feature sets extracted from emotionally matched audio-visual displays contained both audio and video features while feature sets resulting from emotionally mismatched audio-visual displays contained only audio information. This result indicates that observers integrate natural audio cues and synthetic video cues only when the information expressed is in congruence. It is therefore important to properly design the presentation of audio-visual cues as incorrect design may cause observers to ignore the information conveyed in one of the channels.

6.
Advances in the media and entertainment industries, including streaming audio and digital TV, present new challenges for managing and accessing large audio-visual collections. Current content management systems support retrieval using low-level features, such as motion, color, and texture. However, low-level features often have little meaning for naive users, who much prefer to identify content using high-level semantics or concepts. This creates a gap between systems and their users that must be bridged for these systems to be used effectively. To this end, in this paper, we first present a knowledge-based video indexing and content management framework for domain-specific videos (using basketball video as an example). We then provide a solution for exploring video knowledge by mining associations from video data. Explicit definitions and evaluation measures (e.g., temporal support and confidence) for video associations are proposed by integrating the distinct features of video data. Our approach uses video processing techniques to find visual and audio cues (e.g., court field, camera motion activities, and applause), introduces multilevel sequential association mining to explore associations among the audio and visual cues, classifies the associations by assigning each of them a class label, and uses their appearances in the video to construct video indices. Our experimental results demonstrate the performance of the proposed approach.
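
To make the support and confidence measures mentioned above more concrete, the following is a minimal sketch of the classical, order-aware versions computed over cue sequences. It is not the paper's temporal support and confidence, whose definitions are not given in the abstract. The cue names and the `sequences` data are hypothetical.

```python
def contains_in_order(sequence, pattern):
    """True if the items of pattern appear in sequence in the same order."""
    it = iter(sequence)
    # `item in it` advances the iterator, so later items must come later.
    return all(item in it for item in pattern)


def support(sequences, pattern):
    """Fraction of sequences that contain the pattern as an ordered subsequence."""
    return sum(contains_in_order(s, pattern) for s in sequences) / len(sequences)


def confidence(sequences, antecedent, consequent):
    """support(antecedent followed by consequent) / support(antecedent)."""
    base = support(sequences, antecedent)
    return support(sequences, antecedent + consequent) / base if base else 0.0


# Hypothetical cue sequences, one per video shot (made-up data).
sequences = [
    ["court", "fast_motion", "applause"],
    ["court", "applause"],
    ["court", "slow_motion"],
    ["close_up", "applause"],
]
```

Here `support(sequences, ["court", "applause"])` counts the shots in which a court view is later followed by applause, and `confidence(sequences, ["court"], ["applause"])` normalizes that count by the shots that contain a court view at all.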

7.
8.
9.
Video activity analysis is used in various video applications such as human action recognition, video retrieval, and video archiving. In this paper, we propose to apply 3D wavelet transform statistics to natural video signals and employ the resulting statistical attributes for video modeling and analysis. From the 3D wavelet transform, we investigate the marginal and joint statistics as well as the Mutual Information (MI) estimates. We show that marginal histograms are approximated quite well by Generalized Gaussian Density (GGD) functions, and that the MI between coefficients decreases when the activity level increases in videos. Joint statistics attributes are applied to scene activity grouping, leading to 87.3% accurate grouping of videos. Also, marginal and joint statistics features extracted from the video are used for human action classification employing Support Vector Machine (SVM) classifiers, and 93.4% of the human activities are properly classified.
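
As an illustration of what an MI estimate between coefficients can look like, here is a minimal sketch of the standard plug-in (histogram) estimator for discrete samples. The abstract does not say which estimator the authors use, so this choice is an assumption, as is the idea that the wavelet coefficients have already been quantized into integer bins.

```python
import math
from collections import Counter


def mutual_information(pairs):
    """Plug-in estimate of I(X;Y) in bits from a list of (x, y) samples."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2(p(x,y) / (p(x) * p(y))), written with raw counts.
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


# Made-up quantized coefficient pairs: fully dependent vs. independent.
dependent = [(0, 0), (1, 1)] * 50
independent = [(0, 0), (0, 1), (1, 0), (1, 1)] * 25
```

With these toy inputs the dependent pairs carry one bit of shared information and the independent pairs carry none, which matches the intuition that MI measures how much one coefficient tells you about the other.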

10.
Complex activities, e.g. pole vaulting, are composed of a variable number of sub-events connected by complex spatio-temporal relations, whereas simple actions can be represented as sequences of short temporal parts. In this paper, we learn hierarchical representations of activity videos in an unsupervised manner. These hierarchies of mid-level motion components are data-driven decompositions specific to each video. We introduce a spectral divisive clustering algorithm to efficiently extract a hierarchy over a large number of tracklets (i.e. local trajectories). We use this structure to represent a video as an unordered binary tree. We model this tree using nested histograms of local motion features. We provide an efficient positive definite kernel that computes the structural and visual similarity of two hierarchical decompositions by relying on models of their parent–child relations. We present experimental results on four recent challenging benchmarks: the High Five dataset (Patron-Perez et al., High five: recognising human interactions in TV shows, 2010), the Olympics Sports dataset (Niebles et al., Modeling temporal structure of decomposable motion segments for activity classification, 2010), the Hollywood 2 dataset (Marszalek et al., Actions in context, 2009), and the HMDB dataset (Kuehne et al., HMDB: A large video database for human motion recognition, 2011). We show that per-video hierarchies provide additional information for activity recognition. Our approach improves over unstructured activity models, baselines using other motion decomposition algorithms, and the state of the art.

11.
12.
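Realistic human action recognition based on accumulative edge images   Total citations: 2, self-citations: 0, citations by others: 2
To recognize human actions in realistic environments, this paper studies how to extract features that represent human actions from unconstrained videos. First, a morphological gradient operation is applied to the unconstrained video to remove part of the background and obtain the contour shape of the human body. Second, the edge features of the shape in every frame of a video segment are extracted and accumulated into a single image, called the accumulative edge image (AEI). Then, grid-based histograms of oriented gradients (HOG) are computed on the AEI to form a feature vector that represents the action, which is fed to a classifier. Experimental results on the YouTube dataset show that the method is more effective than other methods.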
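
The first two steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: frames are small grayscale grids stored as lists of lists, the morphological gradient uses a 3x3 window, and "accumulation" is taken to mean counting, per pixel, the frames in which an edge is present. The `threshold` parameter is hypothetical, and the HOG step is omitted.

```python
def morphological_gradient(frame):
    """Dilation minus erosion with a 3x3 window (borders use available pixels)."""
    h, w = len(frame), len(frame[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            window = [
                frame[rr][cc]
                for rr in range(max(0, r - 1), min(h, r + 2))
                for cc in range(max(0, c - 1), min(w, c + 2))
            ]
            out[r][c] = max(window) - min(window)
    return out


def accumulative_edge_image(frames, threshold=1):
    """Count, per pixel, the frames whose gradient reaches the threshold."""
    h, w = len(frames[0]), len(frames[0][0])
    aei = [[0] * w for _ in range(h)]
    for frame in frames:
        grad = morphological_gradient(frame)
        for r in range(h):
            for c in range(w):
                if grad[r][c] >= threshold:
                    aei[r][c] += 1
    return aei


# Made-up 5x5 frame with one bright pixel in the centre, repeated twice.
frame = [[0] * 5 for _ in range(5)]
frame[2][2] = 9
aei = accumulative_edge_image([frame, frame])
```

Pixels that lie on a moving contour in many frames end up with high counts, so a single image summarizes where edges appeared over the whole segment.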

13.
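By analyzing the visual characteristics of five typical classes of video, seven features that best reveal the differences between the classes are extracted, and an automatic video content classification algorithm based on one-versus-one support vector machines (1-1 SVM) is designed. It addresses the preliminary screening of video content in the management, on-demand delivery, and retrieval of online video media. Simulation experiments on a large number of real video clips demonstrate the algorithm's advantages in discriminative power and accuracy.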
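
The one-versus-one scheme can be sketched as a vote over all class pairs: with five classes there are 5 × 4 / 2 = 10 binary classifiers, and the class that wins the most pairwise contests is chosen. The sketch below shows only the voting step. The binary deciders are hypothetical nearest-centre stand-ins for trained SVMs, and the class names and centres are made up.

```python
from collections import Counter
from itertools import combinations


def one_vs_one_predict(x, classes, pairwise):
    """Return the class that wins the most pairwise contests for sample x.

    pairwise[(a, b)] is a binary decision function that returns a or b.
    """
    votes = Counter()
    for a, b in combinations(classes, 2):
        votes[pairwise[(a, b)](x)] += 1
    # Ties are broken by first-seen order here; that choice is an assumption.
    return votes.most_common(1)[0][0]


# Hypothetical stand-ins for trained binary SVMs: each pair is decided by
# whichever (made-up) 1-D class centre is closer to the sample.
centres = {"news": 0.0, "sports": 1.0, "ads": 2.0}
classes = list(centres)
pairwise = {
    (a, b): (lambda x, a=a, b=b: a if abs(x - centres[a]) <= abs(x - centres[b]) else b)
    for a, b in combinations(classes, 2)
}
```

Calling `one_vs_one_predict(0.9, classes, pairwise)` runs the three pairwise contests and returns the class with the most votes.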

14.
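With the development of Internet-centered information technology, the audience for popular science videos in China keeps growing, and both traditional television and Internet media have become important channels for distributing them. This paper analyzes the characteristics of popular science videos: television science programs are marked by professionalism, situatedness, restrictiveness, and entertainment value, whereas Internet popular science videos place more emphasis on scientific rigor, topicality, immediacy, and interactivity. It further points out that, under the "Internet Plus" development strategy, mutual borrowing and convergence between television and online popular science videos has become a new trend, which offers some reference for creating popular science videos, online ones in particular.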

15.
Web video categorization is a fundamental task for web video search. In this paper, we explore web video categorization from a new perspective, by integrating the model-based and data-driven approaches to boost the performance. The boost comes from two aspects. The first is the performance improvement for text classifiers through query expansion from related videos and user videos. The model-based classifiers are built on the text features extracted from title and tags. Related videos and user videos act as external resources that compensate for the limited and noisy text features. Query expansion is adopted to reinforce the classification performance of text features through related videos and user videos. The other improvement is derived from the integration of model-based classification and data-driven majority voting from related videos and user videos. From the data-driven viewpoint, related videos and user videos are treated as sources for majority voting from the perspective of video relevance and user interest, respectively. Semantic meaning from text, video relevance from related videos, and user interest induced from user videos are combined to robustly determine the video category. Their combination from semantics, relevance and interest further improves the performance of web video categorization. Experiments on YouTube videos demonstrate the significant improvement of the proposed approach compared to the traditional text-based classifiers.
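
One simple way to picture the integration of a text classifier with majority voting is a weighted sum of the classifier's scores and the vote shares from related videos and user videos. The abstract does not specify the fusion rule, so the linear combination, the weights `w_text`, `w_related`, and `w_user`, and the category names below are all assumptions made for illustration.

```python
from collections import Counter


def fuse_category(text_scores, related_labels, user_labels,
                  w_text=1.0, w_related=1.0, w_user=1.0):
    """Combine text-classifier scores with vote shares from two video sets."""
    combined = Counter()
    for category, score in text_scores.items():
        combined[category] += w_text * score
    for labels, weight in ((related_labels, w_related), (user_labels, w_user)):
        if labels:
            # Each source contributes its vote share, so sources of
            # different sizes are weighted comparably.
            for category, count in Counter(labels).items():
                combined[category] += weight * count / len(labels)
    return max(combined, key=combined.get)


# Made-up example: the text classifier is undecided, the votes are not.
undecided = {"music": 0.5, "sports": 0.5}
related = ["sports", "sports", "music"]
uploaded_by_user = ["sports"]
```

In this toy case the text scores alone cannot separate the two categories, and the votes from related and user videos settle the decision.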

16.
Based on a heuristic approach to information processing, thematic reference of YouTube’s sidebars and YouTuber’s linguistic style are viewed as cues that should impact viewers’ evaluation of videos. In this study, a 2 × 2 online experiment was conducted wherein these factors were varied systematically for a video about nutrition myths. 147 participants assessed the credibility of information, YouTuber’s trustworthiness, and the self-reported learning gain. Results showed that a sidebar referring to similar (vs. unrelated) videos increased participants’ perceived trustworthiness of a YouTuber who used a YouTube-typical (vs. formal) language. Moreover, participants judged the learning gain to be higher when YouTube’s sidebar referred to similar videos. However, thematic reference of sidebar and linguistic style did not impact participants’ credibility judgments. Since people seem to recognize YouTube’s sidebars when evaluating videos, YouTube’s recommendation criteria might not only mediate videos, but also influence people’s judgments of YouTubers and videos.

17.
18.
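Cigarette smoke is sensitive to environmental conditions, and multiple features are redundant with one another; both factors prevent accurate smoke recognition in video surveillance. An algorithm combining high-dimensional mutual information with Simba feature weighting (MI-Simba) is therefore proposed. First, video feature extraction is used to obtain statistical measure features, color layout features, and dynamic features of smoke, which form the initial feature vector. Next, the MI-Simba algorithm updates the features automatically to build the optimal feature combination for the given environment. Finally, a transductive support vector machine performs classification. For indoor and in-building scenes, a smoking video dataset for enclosed spaces was built, and 5-fold cross-validation was used for comparison. The experimental results confirm the effectiveness and superiority of the algorithm in both recognition rate and sensitivity.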

19.
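A new method for facial expression recognition in video is proposed. The method divides recognition into two parts, facial expression feature extraction and classification. First, a point-tracking-based active shape model (ASM) extracts geometric facial expression features from faces in video. Then, a new local support vector machine classifier classifies the expressions. Comparative experiments with four classifiers (KNN, SVM, KNN-SVM, and LSVM) on the Cohn-Kanade database verify the effectiveness of the proposed method.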

20.
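A new local spatio-temporal feature description method is proposed for recognizing and classifying video sequences. Spatio-temporal interest points in the images are detected by combining SURF and optical flow, and the interest points are represented with corresponding descriptors. The video data are represented with a bag-of-words model, and an SVM is used to train on and classify videos containing different actions. To test the effectiveness of this spatio-temporal feature, experiments were run on the UCF YouTube dataset. The results show that the proposed algorithm can effectively recognize human actions in a variety of scenes.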
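
The bag-of-words representation mentioned above can be sketched as follows: each local descriptor is assigned to its nearest codeword, and the video is described by the normalized histogram of those assignments. This is a generic illustration, not the paper's pipeline. The codebook would normally come from clustering training descriptors, and the two-dimensional descriptors and codewords here are made up.

```python
def bag_of_words(descriptors, codebook):
    """Normalized histogram of nearest-codeword assignments."""
    hist = [0] * len(codebook)
    for d in descriptors:
        # Squared Euclidean distance from this descriptor to every codeword.
        dists = [sum((a - b) ** 2 for a, b in zip(d, word)) for word in codebook]
        hist[dists.index(min(dists))] += 1
    total = sum(hist)
    return [count / total for count in hist] if total else hist


# Made-up codebook with two visual words and four descriptors from one video.
codebook = [(0.0, 0.0), (10.0, 10.0)]
descriptors = [(1.0, 0.0), (9.0, 9.0), (0.0, 1.0), (10.0, 11.0)]
video_histogram = bag_of_words(descriptors, codebook)
```

The resulting fixed-length histogram is what a classifier such as an SVM would take as input, regardless of how many interest points each video produced.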
