Similar articles
10 similar articles found.
1.
Exploring video content structure for hierarchical summarization   Total citations: 4, self-citations: 0, citations by others: 4
In this paper, we propose a hierarchical video summarization strategy that explores video content structure to provide the users with a scalable, multilevel video summary. First, video-shot-segmentation and keyframe-extraction algorithms are applied to parse video sequences into physical shots and discrete keyframes. Next, an affinity (self-correlation) matrix is constructed to merge visually similar shots into clusters (supergroups). Since high similarity between video shots does not necessarily imply that they belong to the same story unit, temporal information is adopted by merging temporally adjacent shots (within a specified distance) from the supergroup into each video group. A video-scene-detection algorithm is thus proposed to merge temporally or spatially correlated video groups into scenario units. This is followed by a scene-clustering algorithm that eliminates visual redundancy among the units. A hierarchical video content structure with increasing granularity is constructed from the clustered scenes, video scenes, and video groups to keyframes. Finally, we introduce a hierarchical video summarization scheme by executing various approaches at different levels of the video content hierarchy to statically or dynamically construct the video summary. Extensive experiments based on real-world videos have been performed to validate the effectiveness of the proposed approach.
Published online: 15 September 2004. Correspondence to: Xingquan Zhu.
This research has been supported by the NSF under grants 9972883-EIA, 9974255-IIS, 9983248-EIA, and 0209120-IIS, a grant from the state of Indiana 21st Century Fund, and by the U.S. Army Research Laboratory and the U.S. Army Research Office under grant DAAD19-02-1-0178.
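The two grouping steps in this abstract (merge visually similar shots into supergroups, then split each supergroup by temporal distance) can be illustrated with a minimal sketch. It is not the authors' algorithm: the cosine affinity, the connected-components merge, and the names `affinity_matrix`, `supergroups`, `video_groups`, `threshold`, and `max_gap` are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two keyframe feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def affinity_matrix(feats: List[Sequence[float]]) -> List[List[float]]:
    """Pairwise shot-to-shot similarity (assumed measure: cosine)."""
    return [[cosine(a, b) for b in feats] for a in feats]


def supergroups(affinity: List[List[float]], threshold: float) -> List[List[int]]:
    """Assumed merge rule: connected components of the thresholded affinity graph."""
    n = len(affinity)
    label = [-1] * n
    groups: List[List[int]] = []
    for start in range(n):
        if label[start] != -1:
            continue
        label[start] = len(groups)
        stack, members = [start], []
        while stack:
            i = stack.pop()
            members.append(i)
            for j in range(n):
                if label[j] == -1 and affinity[i][j] >= threshold:
                    label[j] = len(groups)
                    stack.append(j)
        groups.append(sorted(members))
    return groups


def video_groups(supergroup: List[int], max_gap: int) -> List[List[int]]:
    """Split a sorted supergroup wherever consecutive shots are more than max_gap apart."""
    out = [[supergroup[0]]]
    for shot in supergroup[1:]:
        if shot - out[-1][-1] <= max_gap:
            out[-1].append(shot)
        else:
            out.append([shot])
    return out
```

With shots indexed in temporal order, a supergroup such as `[0, 1, 6, 7]` and `max_gap=2` splits into two video groups, which reflects the abstract's point that visually similar shots far apart in time need not belong to the same story unit.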

2.
This paper presents a two-level queueing system for dynamic summarization and interactive searching of video content. Video frames enter the queueing system; some insignificant and redundant frames are removed; the remaining frames are pulled out of the system as top-level key frames. Using an energy-minimization method, the first queue removes the video frames that constitute the gradual transitions of video shots. The second queue measures the content similarity of video frames and reduces redundant frames. In the queueing system, all key frames are linked in a directed-graph index structure, allowing video content to be accessed at any level-of-detail. Furthermore, this graph-based index structure enables interactive video content exploration, and the system is able to retrieve the video key frames that complement the video content already viewed by users. Experimental results on four full-length videos show that our queueing system performs much better than two existing methods on video key frame selection at different compression ratios. The evaluation on video content search shows that our interactive system is more effective than other systems on eight video searching tasks. Compared with the regular media player, our system reduces the average content searching time by half.

3.
4.
This paper focuses on the integration of multimodal features for sport video structure analysis. The method relies on a statistical model which takes into account both the shot content and the interleaving of shots. This stochastic modelling is performed in the global framework of Hidden Markov Models (HMMs) that can be efficiently applied to merge audio and visual cues. Our approach is validated in the particular domain of tennis videos. The model integrates prior information about tennis content and editing rules. The basic temporal unit is the video shot. Visual features are used to characterize the type of shot view. Audio features describe the audio events within a video shot. Two sets of audio features are used in this study: the first one is extracted from a manual segmentation of the soundtrack and is more reliable. The second one is provided by an automatic segmentation and classification process. As a result of the overall HMM process, typical tennis scenes are simultaneously segmented and identified. The experiments illustrate the improvement of HMM-based fusion over indexing using only the best single media, when both media are of similar quality.
Ewa Kijak
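The statement that tennis scenes are "simultaneously segmented and identified" corresponds to decoding the most likely hidden-state sequence of an HMM. A minimal Viterbi sketch follows; the state names, observation symbols, and all probabilities used with it are hypothetical and are not taken from the paper. Each observation stands for one shot, and audio cues would be folded into the observation symbol.

```python
from typing import Dict, List


def viterbi(
    obs: List[str],
    states: List[str],
    start_p: Dict[str, float],
    trans_p: Dict[str, Dict[str, float]],
    emit_p: Dict[str, Dict[str, float]],
) -> List[str]:
    """Return the most likely state sequence for a list of per-shot observations."""
    # best[t][s]: probability of the best path that ends in state s at shot t
    best = [{s: start_p[s] * emit_p[s][obs[0]] for s in states}]
    back: List[Dict[str, str]] = [{}]
    for o in obs[1:]:
        scores, ptr = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] * trans_p[p][s])
            scores[s] = best[-1][prev] * trans_p[prev][s] * emit_p[s][o]
            ptr[s] = prev
        best.append(scores)
        back.append(ptr)
    # Backtrack from the best final state
    path = [max(states, key=lambda s: best[-1][s])]
    for ptr in reversed(back[1:]):
        path.append(ptr[path[-1]])
    return path[::-1]
```

Runs of identical labels in the decoded path give the segment boundaries, and the labels themselves give the scene types, so both come out of one decoding pass. For long sequences a log-probability version would be preferable to avoid underflow.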

5.
This paper proposes a framework to aid video analysts in detecting suspicious activity within the tremendous amounts of video data that exist in today’s world of omnipresent surveillance video. Ideas and techniques for closing the semantic gap between low-level machine-readable features of video data and high-level events seen by a human observer are discussed. An evaluation of the event classification and detection technique is presented and a future experiment to refine this technique is proposed. These experiments lead into a discussion of the optimal machine learning algorithm for learning the event representation scheme proposed in this paper.
Bhavani Thuraisingham

6.
7.
This paper focuses on learning recognition systems able to cope with sequential data for classification and segmentation tasks. It investigates the integration of discriminant power in the learning of generative models, which are usually used for such data. Based on a procedure that transforms a data sample into a generative model, learning is viewed as the selection of efficient component models in a mixture of generative models. This may be done through the learning of a support vector machine. We propose a few kernels for this and report experimental results for classification and segmentation tasks.

8.
Transient analysis of stochastic fluid models   Total citations: 5, self-citations: 0, citations by others: 5
We analyze the transient behavior of stochastic fluid flow models in which the input and output rates are controlled by a finite homogeneous Markov process. Such models are used in asynchronous transfer mode (ATM) to evaluate the performance of fast packet switching and in manufacturing systems for the performance of producers and consumers coupled by a buffer. The transient analysis of such models has already been considered in earlier works and solutions have been obtained using the Laplace transform. We derive in this paper a new transient solution based only on recurrence relations. We show that this solution is particularly interesting for its numerical properties. The limiting behavior of the solution is also considered. We empirically show that the algorithm for computing the transient solution can be stopped when some stationary behavior is detected.

9.
This paper addresses the problem of assessing a trainee’s performance during a simulated delivery training by employing automatic analysis of a video camera signal. We aim at providing objective statistics reflecting the trainee’s behavior, so that the instructor is able to give valuable suggestions after the training. The basic idea is to analyze the movement and location parameters of the trainee, from which the behavior of the trainee can be judged and also compared. Our system consists of three major steps. In the first step, we label specific pixels with a given color, based on a Gaussian model. In the second step, the mean shift (MS) algorithm is employed to find the densest region of a color, where the center of that region indicates the center of a medical cap worn by a trainee. To accelerate the convergence of the MS algorithm, we propose to combine the distribution sampling and on-line mode updating based on the pyramid sampling technique. In the last step, we assume that the cap’s position represents the position of a trainee. Therefore, several statistics, such as the moving trajectory and the total movement of each trainee, can be calculated. These statistics, combined with domain knowledge, help us to assess the trainees’ teamwork. Our system also gives instructors an interactive way to select an individual trainee of interest and then provides more detailed results for that trainee. Experimental evaluations using real delivery training videos demonstrate the effectiveness of the proposed work.
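The second and third steps of this abstract can be illustrated with a minimal sketch of plain mean shift on 2-D pixel coordinates, followed by a total-movement statistic. It assumes a flat (uniform) kernel and that the color-labeled pixels are already available as `(x, y)` pairs. It does not include the accelerated variant with pyramid sampling that the paper proposes, and the names `mean_shift_mode`, `total_movement`, and `bandwidth` are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def mean_shift_mode(
    points: List[Point],
    start: Point,
    bandwidth: float,
    tol: float = 1e-3,
    max_iter: int = 100,
) -> Point:
    """Move a window toward the densest region of `points` (flat kernel assumed)."""
    cx, cy = start
    for _ in range(max_iter):
        inside = [(x, y) for x, y in points if math.hypot(x - cx, y - cy) <= bandwidth]
        if not inside:
            break  # no labeled pixels in the window; keep the current estimate
        nx = sum(x for x, _ in inside) / len(inside)
        ny = sum(y for _, y in inside) / len(inside)
        shift = math.hypot(nx - cx, ny - cy)
        cx, cy = nx, ny
        if shift < tol:
            break  # converged: the window mean no longer moves
    return cx, cy


def total_movement(track: List[Point]) -> float:
    """Sum of distances between consecutive cap positions along a trajectory."""
    return sum(
        math.hypot(x2 - x1, y2 - y1)
        for (x1, y1), (x2, y2) in zip(track, track[1:])
    )
```

Running `mean_shift_mode` once per frame, seeded with the previous frame's result, would give a per-trainee trajectory that `total_movement` can summarize.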

10.
Automatic text segmentation and text recognition for video indexing   Total citations: 13, self-citations: 0, citations by others: 13
Efficient indexing and retrieval of digital video is an important function of video databases. One powerful index for retrieval is the text appearing in them. It enables content-based browsing. We present our new methods for automatic segmentation of text in digital videos. The algorithms we propose make use of typical characteristics of text in videos in order to enable and enhance segmentation performance. The unique features of our approach are the tracking of characters and words over their complete duration of occurrence in a video and the integration of the multiple bitmaps of a character over time into a single bitmap. The output of the text segmentation step is then directly passed to a standard OCR software package in order to translate the segmented text into ASCII. Also, a straightforward indexing and retrieval scheme is introduced. It is used in the experiments to demonstrate that the proposed text segmentation algorithms together with existing text recognition algorithms are suitable for indexing and retrieval of relevant video sequences in and from a video database. Our experimental results are very encouraging and suggest that these algorithms can be used in video retrieval applications as well as to recognize higher level semantics in videos.
