Found 10 similar documents (search time: 0 ms)
1.
Exploring video content structure for hierarchical summarization (total citations: 4; self-citations: 0; citations by others: 4)
In this paper, we propose a hierarchical video summarization strategy that explores video content structure to provide the users with a scalable, multilevel video summary. First, video-shot-segmentation and keyframe-extraction algorithms are applied to parse video sequences into physical shots and discrete keyframes. Next, an affinity (self-correlation) matrix is constructed to merge visually similar shots into clusters (supergroups). Since high similarity between video shots does not necessarily imply that they belong to the same story unit, temporal information is adopted by merging temporally adjacent shots (within a specified distance) from the supergroup into each video group. A video-scene-detection algorithm is thus proposed to merge temporally or spatially correlated video groups into scenario units. This is followed by a scene-clustering algorithm that eliminates visual redundancy among the units. A hierarchical video content structure with increasing granularity is constructed from the clustered scenes, video scenes, and video groups to keyframes. Finally, we introduce a hierarchical video summarization scheme by executing various approaches at different levels of the video content hierarchy to statically or dynamically construct the video summary. Extensive experiments based on real-world videos have been performed to validate the effectiveness of the proposed approach.
Published online: 15 September 2004
Correspondence to: Xingquan Zhu
This research has been supported by the NSF under grants 9972883-EIA, 9974255-IIS, 9983248-EIA, and 0209120-IIS, a grant from the state of Indiana 21st Century Fund, and by the U.S. Army Research Laboratory and the U.S. Army Research Office under grant DAAD19-02-1-0178.
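The grouping steps in the abstract above (affinity matrix, supergroups, temporally constrained groups) can be illustrated with a minimal sketch. It is not the authors' implementation: the histogram-intersection similarity, the connected-components merge, the threshold values, and all function names are assumptions chosen for illustration.

```python
def histogram_intersection(a, b):
    """Similarity of two normalised keyframe histograms (assumed feature)."""
    return sum(min(x, y) for x, y in zip(a, b))


def affinity_matrix(features):
    """Pairwise shot-to-shot similarity matrix."""
    n = len(features)
    return [[histogram_intersection(features[i], features[j]) for j in range(n)]
            for i in range(n)]


def supergroups(affinity, threshold):
    """Merge shots whose affinity reaches the threshold (connected components)."""
    n = len(affinity)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if affinity[i][j] >= threshold:
                parent[find(j)] = find(i)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


def split_by_time(cluster, max_gap):
    """Split a supergroup into groups of temporally adjacent shots."""
    groups = [[cluster[0]]]
    for shot in cluster[1:]:
        if shot - groups[-1][-1] <= max_gap:
            groups[-1].append(shot)
        else:
            groups.append([shot])
    return groups


# Hypothetical two-bin histograms for nine shots, indexed by shot order.
A, A2, B = [0.9, 0.1], [0.8, 0.2], [0.1, 0.9]
features = [A, A2, B, A, B, B, B, B, A]

clusters = supergroups(affinity_matrix(features), threshold=0.8)
video_groups = [g for c in clusters for g in split_by_time(c, max_gap=2)]
print(clusters)      # [[0, 1, 3, 8], [2, 4, 5, 6, 7]]
print(video_groups)  # [[0, 1, 3], [8], [2, 4, 5, 6, 7]]
```

Shot 8 looks like shots 0, 1, and 3 but is too far away in time, so it ends up in its own group, which is the effect the temporal constraint is meant to have.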
2.
This paper presents a two-level queueing system for dynamic summarization and interactive searching of video content. Video
frames enter the queueing system; some insignificant and redundant frames are removed; the remaining frames are pulled out
of the system as top-level key frames. Using an energy-minimization method, the first queue removes the video frames that
constitute the gradual transitions of video shots. The second queue measures the content similarity of video frames and reduces
redundant frames. In the queueing system, all key frames are linked in a directed-graph index structure, allowing video content
to be accessed at any level-of-detail. Furthermore, this graph-based index structure enables interactive video content exploration,
and the system is able to retrieve the video key frames that complement the video content already viewed by users. Experimental
results on four full-length videos show that our queueing system performs much better than two existing methods on video key
frame selection at different compression ratios. The evaluation on video content search shows that our interactive system
is more effective than other systems on eight video searching tasks. Compared with the regular media player, our system reduces
the average content searching time by half.
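As a rough illustration of the second queue described above, the sketch below keeps a frame only when it differs enough from the most recently kept frame. The L1 distance, the threshold, and the function names are assumptions; the abstract does not specify the similarity measure, and the energy-minimization stage of the first queue is not modelled here.

```python
def l1_distance(a, b):
    """Sum of absolute differences between two frame feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def drop_redundant(frames, threshold):
    """Return indices of frames that differ from the last kept frame."""
    kept = []
    for index, feature in enumerate(frames):
        if not kept or l1_distance(feature, frames[kept[-1]]) > threshold:
            kept.append(index)
    return kept


# Hypothetical two-bin colour histograms for five consecutive frames.
frames = [[0.5, 0.5], [0.52, 0.48], [0.9, 0.1], [0.88, 0.12], [0.2, 0.8]]
key_frames = drop_redundant(frames, threshold=0.2)
print(key_frames)  # [0, 2, 4]
```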
3.
4.
Ewa Kijak, Guillaume Gravier, Lionel Oisel, Patrick Gros. Multimedia Tools and Applications, 2006, 30(3): 289-311
This paper focuses on the integration of multimodal features for sport video structure analysis. The method relies on a statistical model which takes into account both the shot content and the interleaving of shots. This stochastic modelling is performed in the global framework of Hidden Markov Models (HMMs) that can be efficiently applied to merge audio and visual cues. Our approach is validated in the particular domain of tennis videos. The model integrates prior information about tennis content and editing rules. The basic temporal unit is the video shot. Visual features are used to characterize the type of shot view. Audio features describe the audio events within a video shot. Two sets of audio features are used in this study: the first one is extracted from a manual segmentation of the soundtrack and is more reliable. The second one is provided by an automatic segmentation and classification process. As a result of the overall HMM process, typical tennis scenes are simultaneously segmented and identified. The experiments illustrate the improvement of HMM-based fusion over indexing using only the best single media, when both media are of similar quality.
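The HMM-based fusion in the abstract above can be sketched with standard Viterbi decoding over a sequence of shots. Everything specific here is an assumption for illustration: the two states, the visual and audio labels, the probabilities, and the choice to fuse the two modalities by multiplying their emission probabilities (that is, treating them as conditionally independent given the state). The paper's actual model may differ.

```python
import math

STATES = ["play", "break"]
START = {"play": 0.5, "break": 0.5}
TRANS = {"play": {"play": 0.7, "break": 0.3},
         "break": {"play": 0.3, "break": 0.7}}
VISUAL = {"play": {"global": 0.9, "close": 0.1},
          "break": {"global": 0.2, "close": 0.8}}
AUDIO = {"play": {"ball_hit": 0.8, "applause": 0.2},
         "break": {"ball_hit": 0.1, "applause": 0.9}}


def joint_emission(state, obs):
    """Fuse the two cues by multiplying their probabilities (assumption)."""
    view, sound = obs
    return VISUAL[state][view] * AUDIO[state][sound]


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely state sequence, computed in log space."""
    # Each column maps state -> (best log score, predecessor state).
    best = [{s: (math.log(start_p[s]) + math.log(emit_p(s, observations[0])), None)
             for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            prev, score = max(
                ((p, best[-1][p][0] + math.log(trans_p[p][s])) for p in states),
                key=lambda item: item[1],
            )
            column[s] = (score + math.log(emit_p(s, obs)), prev)
        best.append(column)

    # Backtrack from the best final state.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for column in reversed(best[1:]):
        state = column[state][1]
        path.append(state)
    path.reverse()
    return path


# One (view type, audio event) pair per shot.
shots = [("global", "ball_hit"), ("global", "ball_hit"),
         ("close", "applause"), ("close", "applause")]
decoded = viterbi(shots, STATES, START, TRANS, joint_emission)
print(decoded)  # ['play', 'play', 'break', 'break']
```

Decoding segments the shot sequence and labels each segment in a single pass, which mirrors the abstract's statement that scenes are simultaneously segmented and identified.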
5.
This paper proposes a framework to aid video analysts in detecting suspicious activity within the tremendous amounts of video
data that exist in today’s world of omnipresent surveillance video. Ideas and techniques for closing the semantic gap between
low-level machine-readable features of video data and high-level events seen by a human observer are discussed. An evaluation
of the event classification and detection technique is presented and a future experiment to refine this technique is proposed.
These experiments lead to a discussion of the most suitable machine learning algorithm for learning the event representation
scheme proposed in this paper.
Bhavani Thuraisingham
6.
7.
This paper focuses on learning recognition systems able to cope with sequential data for classification and segmentation tasks. It investigates the integration of discriminant power in the learning of generative models, which are usually used for such data. Based on a procedure that transforms a data sample into a generative model, learning is viewed as the selection of efficient component models in a mixture of generative models. This may be done through the learning of a support vector machine. We propose a few kernels for this and report experimental results for classification and segmentation tasks.
8.
Transient analysis of stochastic fluid models (total citations: 5; self-citations: 0; citations by others: 5)
B. Sericola. Performance Evaluation, 1998, 32(4): 245-263
We analyze the transient behavior of stochastic fluid flow models in which the input and output rates are controlled by a finite homogeneous Markov process. Such models are used in asynchronous transfer mode (ATM) to evaluate the performance of fast packet switching and in manufacturing systems for the performance of producers and consumers coupled by a buffer. The transient analysis of such models has already been considered in earlier works, and solutions have been obtained using the Laplace transform. We derive in this paper a new transient solution based only on recurrence relations. We show that this solution is particularly interesting for its numerical properties. The limiting behavior of the solution is also considered. We empirically show that the algorithm for computing the transient solution can be stopped when some stationary behavior is detected.
9.
Jungong Han, Peter H.N. de With, Ashley Merien, Guid Oei. Pattern Recognition Letters, 2012, 33(4): 453-461
This paper addresses the problem of assessing a trainee’s performance during a simulated delivery training by employing automatic analysis of a video camera signal. We aim at providing objective statistics reflecting the trainee’s behavior, so that the instructor is able to give valuable suggestions after the training. The basic idea is to analyze the movement and location parameters of the trainee, on which the behavior of the trainee can be judged and also compared. Our system consists of three major steps. In the first step, we label specific pixels with a given color, based on a Gaussian model. In the second step, the mean shift (MS) algorithm is employed to find the densest region of a color, where the center of that region indicates the center of a medical cap worn by a trainee. To accelerate the convergence of the MS algorithm, we propose to combine the distribution sampling and on-line mode updating based on the pyramid sampling technique. In the last step, we assume that the cap’s position represents the position of a trainee. Therefore, several statistics, such as the moving trajectory and the total movement of each trainee, can be calculated. These statistics, combined with domain knowledge, help us to assess the trainees’ teamwork. Our system also lets instructors interactively select an individual trainee of interest and then provides more detailed results for that trainee. Experimental evaluations using real delivery training videos demonstrate the effectiveness of the proposed work.
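The second step above, locating the densest region of color-labelled pixels, can be illustrated with a plain mean shift iteration. This is a minimal sketch that assumes a flat (uniform) kernel, a fixed bandwidth, and made-up pixel coordinates. It does not include the distribution sampling, on-line mode updating, or pyramid sampling that the paper proposes to speed up convergence.

```python
import math


def mean_shift_mode(points, start, bandwidth, max_iter=100, tol=1e-3):
    """Move a window centre to the mean of the points inside it until it settles."""
    cx, cy = start
    for _ in range(max_iter):
        inside = [(x, y) for x, y in points
                  if (x - cx) ** 2 + (y - cy) ** 2 <= bandwidth ** 2]
        if not inside:
            break  # no labelled pixels in the window; keep the current centre
        nx = sum(x for x, _ in inside) / len(inside)
        ny = sum(y for _, y in inside) / len(inside)
        shift = math.hypot(nx - cx, ny - cy)
        cx, cy = nx, ny
        if shift < tol:
            break
    return cx, cy


# Hypothetical pixel coordinates labelled with the cap colour:
# a dense blob around (10, 10) plus two stray pixels.
cap_pixels = [(9, 9), (9, 11), (11, 9), (11, 11), (10, 10), (30, 30), (0, 25)]
centre = mean_shift_mode(cap_pixels, start=(12, 12), bandwidth=5)
print(centre)  # (10.0, 10.0)
```

Repeating this for each frame would give a sequence of cap centres, from which a trajectory and total movement could be accumulated as the abstract describes.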
10.
Automatic text segmentation and text recognition for video indexing (total citations: 13; self-citations: 0; citations by others: 13)
Efficient indexing and retrieval of digital video is an important function of video databases. One powerful index for retrieval
is the text appearing in them. It enables content-based browsing. We present our new methods for automatic segmentation of
text in digital videos. The algorithms we propose make use of typical characteristics of text in videos in order to enable
and enhance segmentation performance. The unique features of our approach are the tracking of characters and words over their
complete duration of occurrence in a video and the integration of the multiple bitmaps of a character over time into a single
bitmap. The output of the text segmentation step is then directly passed to a standard OCR software package in order to translate
the segmented text into ASCII. Also, a straightforward indexing and retrieval scheme is introduced. It is used in the experiments
to demonstrate that the proposed text segmentation algorithms together with existing text recognition algorithms are suitable
for indexing and retrieval of relevant video sequences in and from a video database. Our experimental results are very encouraging
and suggest that these algorithms can be used in video retrieval applications as well as to recognize higher level semantics
in videos.