Similar literature
Found 20 similar documents (search time: 15 ms)
1.
With the rapid adoption of consumer digital video recorders and the growth of home video data, content analysis has become a key research issue in providing personalized experiences and services for both camcorder users and viewers. In this paper, we present a novel view of this issue, which aims at modeling and mining the capture intention of camcorder users. Based on the study of the intention mechanism in psychology, a set of domain-specific capture intention concepts is defined. A comprehensive and extensible scheme consisting of video structure decomposition, intention-oriented feature analysis, singular-value-decomposition-based intention segmentation, and learning-based intention classification is proposed to mine the users' capture intention. Experiments were carried out on home video sequences totaling 90 h, taken by 16 persons over the past 20 years. Both the user study and objective evaluations indicate that the proposed intention-based approach is an effective complement to existing home video content analysis schemes.

2.
Graph-Based Multiplayer Detection and Tracking in Broadcast Soccer Videos   (Total citations: 1; self-citations: 0; citations by others: 1)
In this paper, we propose a graph-based approach for detecting and tracking multiple players in broadcast soccer videos. In the first stage, the position of the players in each frame is determined by removing the non-player regions. The remaining pixels are then grouped using a region growing algorithm to identify probable player candidates. A directed weighted graph is constructed in which the probable player candidates correspond to the nodes, and each edge links a candidate in one frame with the candidates in the next two consecutive frames. Finally, dynamic programming is applied to find the trajectory of each player. Experiments with several sequences from broadcast videos of international soccer matches indicate that the proposed approach is able to track the players reasonably well even under varied illumination and ground conditions.
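The graph construction and dynamic-programming step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes candidates are given as (x, y) centroids, uses Euclidean distance as the edge weight, extracts only a single trajectory, and introduces a hypothetical `skip_penalty` for edges that jump over a frame.

```python
import math


def track_player(frames, skip_penalty=5.0):
    """Return the cheapest path as a list of (frame_index, candidate_index).

    frames: list of lists of (x, y) candidate positions, one list per frame.
    Each candidate is linked to the candidates of the next two frames;
    an edge that skips a frame costs an extra `skip_penalty` (assumed value).
    The path is required to start in frame 0 and end in the last frame.
    """
    if not frames or not frames[0] or not frames[-1]:
        return []
    n = len(frames)
    best = [[math.inf] * len(c) for c in frames]   # cheapest cost to reach a node
    prev = [[None] * len(c) for c in frames]       # back-pointer for the traceback
    for i in range(len(frames[0])):
        best[0][i] = 0.0
    for f in range(1, n):
        for i, p in enumerate(frames[f]):
            for gap in (1, 2):                     # edges from the two preceding frames
                g = f - gap
                if g < 0:
                    continue
                for j, q in enumerate(frames[g]):
                    extra = skip_penalty if gap == 2 else 0.0
                    cost = best[g][j] + math.dist(p, q) + extra
                    if cost < best[f][i]:
                        best[f][i] = cost
                        prev[f][i] = (g, j)
    end = min(range(len(frames[-1])), key=lambda i: best[-1][i])
    if math.isinf(best[-1][end]):
        return []
    path = [(n - 1, end)]
    while prev[path[-1][0]][path[-1][1]] is not None:
        path.append(prev[path[-1][0]][path[-1][1]])
    path.reverse()
    return path
```

Linking each node two frames ahead lets the path bridge a frame in which the player was not detected; tracking several players would need repeated extraction or a joint formulation, which this sketch leaves out.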

3.
Associating faces appearing in Web videos with names presented in the surrounding context is an important task in many applications. However, the problem has not been well investigated, particularly in large-scale realistic scenarios, mainly because datasets constructed under such conditions are scarce. In this paper, we introduce a Web video dataset of celebrities, named WebV-Cele, for name-face association. The dataset consists of 75 073 Internet videos totaling over 4 000 hours, covering 2 427 celebrities and 649 001 faces. This is, to our knowledge, the most comprehensive dataset for this problem. We describe the details of dataset construction, discuss several interesting findings from analyzing the dataset, such as celebrity community discovery, and provide experimental results of name-face association using five existing techniques. We also outline important and challenging research problems that could be investigated in the future.

4.
Video indexing requires the efficient segmentation of video into scenes. The video is first segmented into shots and a set of key-frames is extracted for each shot. Typical scene detection algorithms incorporate time distance in a shot similarity metric. In the method we propose, to overcome the difficulty of having prior knowledge of the scene duration, the shots are clustered into groups based only on their visual similarity and a label is assigned to each shot according to the group that it belongs to. Then, a sequence alignment algorithm is applied to detect when the pattern of shot labels changes, providing the final scene segmentation result. In this way shot similarity is computed based only on visual features, while ordering of shots is taken into account during sequence alignment. To cluster the shots into groups we propose an improved spectral clustering method that both estimates the number of clusters and employs the fast global k-means algorithm in the clustering stage after the eigenvector computation of the similarity matrix. The same spectral clustering method is applied to extract the key-frames of each shot and numerical experiments indicate that the content of each shot is efficiently summarized using the method we propose herein. Experiments on TV-series and movies also indicate that the proposed scene detection method accurately detects most of the scene boundaries while preserving a good tradeoff between recall and precision.

5.
Fast Nearest-Neighbor Query Processing in Moving-Object Databases   (Total citations: 4; self-citations: 1; citations by others: 4)
A desirable feature in spatio-temporal databases is the ability to answer future queries based on the current data characteristics (reference position and velocity vector). Given a moving query and a set of moving objects, a future query asks for the set of objects that satisfy the query in a given time interval. The difficulty in such a case is that both the query and the data objects change positions continuously, and therefore we cannot rely on a given fixed reference position to determine the answer. Existing techniques are based either on sampling or on repetitive application of time-parameterized queries to provide the answer. In this paper we develop an efficient method to process nearest-neighbor queries in moving-object databases. The basic advantage of the proposed approach is that only one query is issued per time interval. The time-parameterized R-tree structure is used to index the moving objects. An extensive performance evaluation, based on CPU and I/O time, shows that significant improvements are achieved compared to existing techniques.
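To make the setting of this abstract concrete, the sketch below shows why the nearest neighbor of a moving query can change inside a time interval. It assumes linear motion (position plus constant velocity, as in the abstract) and does a brute-force comparison of two objects; it is not the paper's algorithm and does not use an index such as the time-parameterized R-tree. All function names are hypothetical.

```python
import math


def dist_sq_coeffs(query, obj):
    """Coefficients (a, b, c) of d^2(t) = a*t^2 + b*t + c.

    query and obj are (position, velocity) pairs of equal-length tuples.
    """
    (q_pos, q_vel), (o_pos, o_vel) = query, obj
    dp = [o - q for o, q in zip(o_pos, q_pos)]   # relative position at t = 0
    dv = [o - q for o, q in zip(o_vel, q_vel)]   # relative velocity
    a = sum(v * v for v in dv)
    b = 2 * sum(p * v for p, v in zip(dp, dv))
    c = sum(p * p for p in dp)
    return a, b, c


def equidistant_times(query, obj_a, obj_b, t_start, t_end):
    """Times in [t_start, t_end] at which obj_a and obj_b are equally far from the query."""
    a1, b1, c1 = dist_sq_coeffs(query, obj_a)
    a2, b2, c2 = dist_sq_coeffs(query, obj_b)
    a, b, c = a1 - a2, b1 - b2, c1 - c2
    if a == 0:
        roots = [] if b == 0 else [-c / b]
    else:
        disc = b * b - 4 * a * c
        if disc < 0:
            roots = []
        else:
            r = math.sqrt(disc)
            roots = [(-b - r) / (2 * a), (-b + r) / (2 * a)]
    return sorted({t for t in roots if t_start <= t <= t_end})


def nearest_at(query, objects, t):
    """Index of the object nearest to the query at time t."""
    def d2(obj):
        a, b, c = dist_sq_coeffs(query, obj)
        return a * t * t + b * t + c
    return min(range(len(objects)), key=lambda i: d2(objects[i]))
```

Because the squared distance between two linearly moving points is a quadratic in t, the candidate change points between any two objects are the roots of a quadratic, which is what makes a single query per interval plausible in principle.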

6.
Identifying the active speaker in a video of a distributed meeting can be very helpful for remote participants to understand the dynamics of the meeting. A straightforward application of such analysis is to stream a high-resolution video of the speaker to the remote participants. In this paper, we present the challenges we met while designing a speaker detector for the Microsoft RoundTable distributed meeting device, and propose a novel boosting-based multimodal speaker detection (BMSD) algorithm. Instead of separately performing sound source localization (SSL) and multiperson detection (MPD) and subsequently fusing their individual results, the proposed algorithm fuses audio and visual information at the feature level by using boosting to select features from a combined pool of audio and visual features simultaneously. The result is a very accurate speaker detector with extremely high efficiency. In experiments that include hundreds of real-world meetings, the proposed BMSD algorithm reduces the error rate of the SSL-only approach by 24.6%, and that of the SSL and MPD fusion approach by 20.9%. To the best of our knowledge, this is the first real-time multimodal speaker detection algorithm that is deployed in commercial products.

7.
Gesture plays an important role in recognizing lecture activities in video content analysis. In this paper, we propose a real-time gesture detection algorithm that integrates visual, speech, and electronic-slide cues. In contrast to conventional “complete gesture” recognition, we emphasize detection by prediction from an “incomplete gesture”. Specifically, intentional gestures are predicted by a modified hidden Markov model (HMM) that can recognize incomplete gestures before the whole gesture paths are observed. The multimodal correspondence between speech and gesture is exploited to increase the accuracy and responsiveness of gesture detection. In lecture presentation, this algorithm enables on-the-fly editing of lecture slides by simulating appropriate camera motion to highlight the intention and flow of lecturing. We develop a real-time application, namely a simulated smartboard, and demonstrate the feasibility of our prediction algorithm using hand gestures and a laser pen with a simple setup that involves no expensive hardware.

8.
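To address insufficient facial feature extraction in deepfake video detection, an improved ResNet model (i_ResNet34) and three data augmentation methods based on information deletion are proposed. First, the ResNet network is optimized by replacing ordinary convolutions with grouped convolutions, extracting richer facial features without increasing the number of model parameters. Next, the shortcut branch of the model's dashed residual structure is improved so that downsampling is performed by a max-pooling layer, ...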
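The grouped-convolution point in this abstract can be made concrete with a small parameter-count sketch. The formula (a grouped convolution has 1/groups of the weights of a standard convolution with the same channel counts) is standard; the channel and group values below are hypothetical and are not taken from the i_ResNet34 model.

```python
def conv2d_weight_count(in_channels, out_channels, kernel_size, groups=1):
    """Number of weights (bias excluded) in a 2-D convolution layer.

    Each output channel only sees in_channels / groups input channels,
    so the count is (in_channels / groups) * out_channels * k * k.
    """
    if in_channels % groups or out_channels % groups:
        raise ValueError("channel counts must be divisible by groups")
    return (in_channels // groups) * out_channels * kernel_size * kernel_size


standard = conv2d_weight_count(64, 64, 3)             # ordinary 3x3 convolution
grouped = conv2d_weight_count(64, 64, 3, groups=4)    # same channels, 4 groups
wider = conv2d_weight_count(128, 128, 3, groups=4)    # twice the channels, 4 groups
```

With these illustrative numbers, the 4-group layer with 128 channels has the same weight count as the ordinary 64-channel layer, which is one way a network can gain width without gaining parameters.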

9.
10.
Recently, we have proposed a real-time tracker that simultaneously tracks the 3-D head pose and facial actions in monocular video sequences that can be provided by low-quality cameras. This paper has two main contributions. First, we propose an automatic 3-D face pose initialization scheme for the real-time tracker by adopting a 2-D face detector and an eigenface system. Second, we use the proposed methods (the initialization and the tracking) to enhance the human–machine interaction functionality of an AIBO robot. More precisely, we show how the orientation of the robot's camera (or any active vision system) can be controlled through the estimation of the user's head pose. Applications based on head-pose imitation such as telepresence, virtual reality, and video games can directly exploit the proposed techniques. Experiments on real videos confirm the robustness and usefulness of the proposed methods.

11.
MIS managers are inundated with various professional books and journals. As a result, the videotape market has exploded with products that attempt to tackle current information systems issues. This column reviews one such video series as well as one book on object-orientation, one book on hypertext, and the latest from Peter Drucker.

12.
MIS managers are inundated with various professional books and journals. As a result, the videotape market has exploded with products that attempt to tackle current information systems issues. This column reviews one such video series as well as one book on object-orientation, one book on hypertext, and the latest from Peter Drucker.

13.
14.
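This paper presents a binocular vision system that enables a home service robot to detect, track, and recognize faces. The system first detects faces using a skin-color model combined with a similarity measure, then tracks moving faces with the color-based CAMSHIFT algorithm, and finally recognizes faces using an embedded hidden Markov model. Experimental results show that the system detects, tracks, and recognizes faces automatically, with good real-time performance and robustness.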

15.
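Keyword Selection in Intelligent Network Intrusion Detection   (Total citations: 1; self-citations: 0; citations by others: 1)
The main drawback of traditional keyword-based intrusion detection is its high false-alarm rate. To overcome this, the authors combine neural network techniques with keyword matching and obtain good results. The study focuses on the principles for selecting the keyword list in intelligent network intrusion detection and on how that selection affects actual detection performance. Comparative experiments confirm the proposed keyword selection principles.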

16.
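Host security inspection aims to protect a host's programs, data, and devices from unauthorized access, use, or damage. The article examines the relevant check points according to their control objectives. It analyzes the typical check points and control points in current host security inspection and the relationships between them, and constructs an association matrix. This helps users clearly understand the purpose and methods of host security inspection and how security measures are targeted, raises users' security awareness, and improves the usability and efficiency of related security products.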

17.
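Video is often needed when creating multimedia courseware. This paper describes several methods for inserting video objects in PowerPoint; practice has shown these methods to be practical and feasible.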

18.
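Feature Selection Algorithm Based on Association Rules   (Total citations: 2; self-citations: 0; citations by others: 2)
Association rules can uncover relationships among attributes in a database, and preferring short rules when selecting relevant attributes may yield a minimal attribute subset. On this basis, this paper proposes a feature selection algorithm based on association rules. Experimental results show that it outperforms several feature selection methods in attribute-subset size and classification accuracy. The influence of support and confidence on the algorithm is also explored: the results show that high support and confidence do not lead to high classification accuracy or small feature subsets, whereas a sufficient number of rules is a necessary condition for the algorithm to be effective.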

19.
Feature Detection with Automatic Scale Selection   (Total citations: 49; self-citations: 4; citations by others: 49)
The fact that objects in the world appear in different ways depending on the scale of observation has important implications if one aims at describing them. It shows that the notion of scale is of utmost importance when processing unknown measurement data by automatic methods. In their seminal works, Witkin (1983) and Koenderink (1984) proposed to approach this problem by representing image structures at different scales in a so-called scale-space representation. Traditional scale-space theory building on this work, however, does not address the problem of how to select local appropriate scales for further analysis. This article proposes a systematic methodology for dealing with this problem. A framework is presented for generating hypotheses about interesting scale levels in image data, based on a general principle stating that local extrema over scales of different combinations of γ-normalized derivatives are likely candidates to correspond to interesting structures. Specifically, it is shown how this idea can be used as a major mechanism in algorithms for automatic scale selection, which adapt the local scales of processing to the local image structure. Support for the proposed approach is given in terms of a general theoretical investigation of the behaviour of the scale selection method under rescalings of the input pattern and by integration with different types of early visual modules, including experiments on real-world and synthetic data. Support is also given by a detailed analysis of how different types of feature detectors perform when integrated with a scale selection mechanism and then applied to characteristic model patterns. Specifically, it is described in detail how the proposed methodology applies to the problems of blob detection, junction detection, edge detection, ridge detection and local frequency estimation. In many computer vision applications, the poor performance of the low-level vision modules constitutes a major bottleneck. It is argued that the inclusion of mechanisms for automatic scale selection is essential if we are to construct vision systems to automatically analyse complex unknown environments.
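As a worked illustration of the scale selection principle in this abstract, consider a single 2-D Gaussian blob. The derivation below is a sketch under stated assumptions: the scale parameter t is the variance of the smoothing Gaussian, the normalization exponent is taken as γ = 1, and the feature detector is the normalized Laplacian evaluated at the blob centre. The symbols t_0 and \hat{t} are introduced here for illustration.

```latex
% Input: a Gaussian blob of variance t_0. Smoothing adds variances, so
f(x) = g(x; t_0), \qquad
L(x; t) = g(x; t_0 + t), \qquad
g(x; s) = \frac{1}{2\pi s}\, e^{-\lVert x \rVert^2 / (2s)}

% Normalized Laplacian (gamma = 1) at the blob centre x = 0:
t \, \nabla^2 L(0; t) = -\frac{t}{\pi\,(t_0 + t)^2}

% Its magnitude has an extremum over scale where the t-derivative vanishes:
\frac{\partial}{\partial t} \, \frac{t}{(t_0 + t)^2}
  = \frac{t_0 - t}{(t_0 + t)^3} = 0
\quad\Longrightarrow\quad \hat{t} = t_0
```

In this idealized case the selected scale equals the variance of the blob, so rescaling the input pattern rescales the selected scale accordingly.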

20.
Anomaly Detection Using Real-Valued Negative Selection   (Total citations: 23; self-citations: 0; citations by others: 23)
This paper describes a real-valued representation for the negative selection algorithm and its applications to anomaly detection. In many anomaly detection applications, only positive (normal) samples are available for training purposes. However, conventional classification algorithms need samples for all classes (e.g., normal and abnormal) during the training phase. This approach uses only normal samples to generate abnormal samples, which are used as input to a classification algorithm. This hybrid approach is compared against an anomaly detection technique that uses self-organizing maps to cluster the normal data sets (samples). Experiments are performed with different data sets and some results are reported.
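The core idea of negative selection with real-valued detectors can be sketched as follows. This is a simplified random-generation variant, not the algorithm from the paper: it assumes data normalized to the unit hypercube, a Euclidean matching rule, and a single hypothetical `radius` used both when rejecting candidates and when matching.

```python
import math
import random


def generate_detectors(self_samples, n_detectors, radius, dim, seed=0, max_tries=100000):
    """Draw random points in [0, 1)^dim and keep those that match no normal sample.

    A candidate is rejected when it lies closer than `radius` to any normal
    ("self") sample, so the surviving detectors cover only non-self space.
    """
    rng = random.Random(seed)
    detectors = []
    tries = 0
    while len(detectors) < n_detectors and tries < max_tries:
        tries += 1
        candidate = tuple(rng.random() for _ in range(dim))
        if all(math.dist(candidate, s) >= radius for s in self_samples):
            detectors.append(candidate)
    return detectors


def is_anomalous(x, detectors, radius):
    """Flag x as abnormal when it falls within `radius` of any detector."""
    return any(math.dist(x, d) < radius for d in detectors)
```

By construction no normal training sample is ever flagged. In the hybrid scheme the abstract describes, points generated in this way could serve as the abnormal class when training a conventional classifier, although how the paper does that is not shown here.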
