Similar literature
 20 similar documents found (search time: 15 ms)
1.
Audiovisual integration for tennis broadcast structuring   Total citations: 2 (self-citations: 2, citations by others: 0)
This paper focuses on the integration of multimodal features for sport video structure analysis. The method relies on a statistical model which takes into account both the shot content and the interleaving of shots. This stochastic modelling is performed in the global framework of Hidden Markov Models (HMMs) that can be efficiently applied to merge audio and visual cues. Our approach is validated in the particular domain of tennis videos. The model integrates prior information about tennis content and editing rules. The basic temporal unit is the video shot. Visual features are used to characterize the type of shot view. Audio features describe the audio events within a video shot. Two sets of audio features are used in this study: the first one is extracted from a manual segmentation of the soundtrack and is more reliable. The second one is provided by an automatic segmentation and classification process. As a result of the overall HMM process, typical tennis scenes are simultaneously segmented and identified. The experiments illustrate the improvement of HMM-based fusion over indexing using only the best single media, when both media are of similar quality.
Ewa Kijak
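
The HMM-based fusion described in this abstract can be illustrated with a small Viterbi decoder. This is a minimal sketch, not the authors' model: the two states ("rally", "break"), the view and audio labels, all probabilities, and the assumption that visual and audio cues are conditionally independent given the state are hypothetical and chosen only for illustration.

```python
import math

def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return the most likely state sequence for a list of observations."""
    # best[t][s] is the best log-probability of any path ending in state s at step t.
    best = [{s: log_start[s] + log_emit[s][observations[0]] for s in states}]
    back = [{}]
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][obs]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for pointers in reversed(back[1:]):
        path.append(pointers[path[-1]])
    return path[::-1]

# Hypothetical per-shot cue models (illustrative numbers only).
states = ["rally", "break"]
p_view = {"rally": {"global": 0.9, "closeup": 0.1},
          "break": {"global": 0.2, "closeup": 0.8}}
p_audio = {"rally": {"ball_hit": 0.8, "applause": 0.2},
           "break": {"ball_hit": 0.1, "applause": 0.9}}
p_trans = {"rally": {"rally": 0.6, "break": 0.4},
           "break": {"rally": 0.5, "break": 0.5}}

# Fusion step: the emission score of a shot is the product of its visual and
# audio likelihoods (assumed independent given the state), i.e. a sum of logs.
log_emit = {s: {(v, a): math.log(p_view[s][v]) + math.log(p_audio[s][a])
                for v in p_view[s] for a in p_audio[s]}
            for s in states}
log_trans = {p: {s: math.log(p_trans[p][s]) for s in states} for p in states}
log_start = {s: math.log(0.5) for s in states}

# One (view, audio) observation per video shot, the basic temporal unit.
shots = [("global", "ball_hit"), ("global", "ball_hit"),
         ("closeup", "applause"), ("global", "ball_hit")]
path = viterbi(shots, states, log_start, log_trans, log_emit)
print(path)
```

Decoding labels and segments the shot sequence in one pass, which is the sense in which scenes are "simultaneously segmented and identified".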

2.
This paper presents a framework that explicitly detects events in broadcast baseball videos and facilitates the development of many practical applications. Three phases of contributions are included in this work: reliable shot classification, explicit event detection, and elaborate applications. At the shot classification stage, color and geometric information are utilized to classify video shots into several canonical views. To explicitly detect semantic events, rule-based decision and model-based decision methods are developed. We emphasize that this system efficiently and exactly identifies what happened in baseball games rather than roughly finding some interesting parts. On the basis of explicit event detection, many accurate and practical applications such as automatic box score generation and game summarization could be built. The reported results show the effectiveness of the proposed framework and point to research opportunities in bridging the semantic gap for sports videos.
Ja-Ling Wu

3.
Grouping video content into semantic segments and classifying semantic scenes into different types are crucial steps in content-based video organization, management and retrieval. In this paper, a novel approach to automatically segmenting scenes and representing them semantically is proposed. Firstly, video shots are detected using a rough-to-fine algorithm. Secondly, key-frames within each shot are selected adaptively with hybrid features, and redundant key-frames are removed by template matching. Thirdly, spatio-temporally coherent shots are clustered into the same scene based on the temporal constraint of video content and the visual similarity between shot activities. Finally, based on a full analysis of the typical characteristics of continuously recorded videos, scene content is semantically represented to meet human demands in video retrieval. The proposed algorithm has been tested on various genres of films and TV programs. Promising experimental results show that the proposed method supports efficient retrieval of interesting video content.
Yuncai Liu

4.
In this paper, we address the problem of video frame rate up-conversion (FRC) in the compressed domain. FRC is often recognized as video temporal interpolation. This problem is very challenging when targeted for video sequences with inconsistent camera and object motion, such as sports videos. A novel compressed domain motion compensation scheme is presented and applied in this paper, aiming at up-sampling frame rates in sports videos. MPEG-2 encoded motion vectors (MVs) are utilized as inputs in the proposed algorithm. The decoded MVs undergo a cumulative spatiotemporal interpolation. An iterative rejection scheme based on the dense motion vector field (MVF) and the generalized affine motion model is exploited to detect global camera motion. Subsequently, the foreground object separation is performed by additionally examining the temporal consistency of the output of iterative rejections. This consistency check process helps coalesce the resulting foreground blocks and weed out the unqualified blocks. Finally, different compensation strategies for the camera and object motions are applied to interpolate the new frames. Illustrative examples are provided to demonstrate the efficacy of the proposed approach. Experimental results are compared with the popular block and non-block based frame interpolation approaches.
Jinsong Wang

5.
6.
In this paper, we propose an innovative architecture to segment a news video into so-called “stories” using both the video and the audio information it contains. Segmentation of news into stories is one of the key issues for achieving efficient treatment of news-based digital libraries. While the relevance of this research problem is widely recognized in the scientific community, there are few established solutions in the field. In our approach, the segmentation is performed in two steps: first, shots are classified by combining three different anchor shot detection algorithms using video information only. Then, the shot classification is improved by using a novel anchor shot detection method based on features extracted from the audio track. Tests on a large database confirm that the proposed system outperforms each single video-based method as well as their combination.
Mario Vento

7.
This paper proposes a new approach to shot-based retrieval by optimal matching (OM), which provides an effective mechanism for measuring and ranking shot similarity by one-to-one matching. In the proposed approach, a weighted bipartite graph is constructed to model the color similarity between two shots. OM based on the Kuhn–Munkres algorithm is then employed to compute the maximum weight of the constructed bipartite graph, which serves as the shot similarity value under one-to-one matching among frames. To speed up OM, two improved algorithms are also proposed: bipartite graph construction based on subshots and bipartite graph construction based on the same number of keyframes. Besides color similarity, a motion feature is also employed for the shot similarity measure. A motion histogram is constructed for each shot, and the motion similarity between two shots is then measured by the intersection of their motion histograms. Finally, the shot similarity is computed as a linear combination of color and motion similarity. Experimental results indicate that the proposed approach achieves better performance than other methods in terms of ranking and retrieval capability.
Jianguo Xiao
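
The three ingredients named in this abstract (one-to-one optimal matching of frames, histogram intersection for motion, and a linear combination of the two scores) can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the matching is done by brute force over permutations as a stand-in for the Kuhn–Munkres algorithm (same optimum, but only practical for a handful of keyframes), colour similarity between frames is also measured by histogram intersection, both shots are assumed to have the same number of keyframes, and the weight `alpha` is hypothetical.

```python
from itertools import permutations

def histogram_intersection(h1, h2):
    """Similarity of two normalized histograms: the sum of bin-wise minima."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def optimal_matching_similarity(frames_a, frames_b):
    """Maximum-weight one-to-one matching between two equal-size keyframe sets.

    Brute force over all permutations (a stand-in for Kuhn-Munkres); the
    result is divided by the number of keyframes so it lies in [0, 1].
    """
    n = len(frames_a)
    if n != len(frames_b):
        raise ValueError("this sketch assumes the same number of keyframes")
    best = max(
        sum(histogram_intersection(frames_a[i], frames_b[j])
            for i, j in enumerate(perm))
        for perm in permutations(range(n))
    )
    return best / n

def shot_similarity(color_a, color_b, motion_a, motion_b, alpha=0.6):
    """Linear combination of colour (matching) and motion (intersection) scores."""
    color_sim = optimal_matching_similarity(color_a, color_b)
    motion_sim = histogram_intersection(motion_a, motion_b)
    return alpha * color_sim + (1.0 - alpha) * motion_sim

# Two shots whose keyframes show the same content in a different order.
color_a = [[1.0, 0.0], [0.0, 1.0]]
color_b = [[0.0, 1.0], [1.0, 0.0]]
motion_a = [0.5, 0.5]
motion_b = [0.25, 0.75]

color_sim = optimal_matching_similarity(color_a, color_b)
motion_sim = histogram_intersection(motion_a, motion_b)
similarity = shot_similarity(color_a, color_b, motion_a, motion_b)
print(color_sim, motion_sim, similarity)
```

Because the matching is one-to-one, the reordered keyframes still pair up perfectly, which a frame-by-frame comparison in temporal order would miss.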

8.
We introduce a cinematographic video production system to create attractive, movie-like footage from our indoor daily life. Since the system is designed for ordinary users in non-studio environments, it is composed of standard hardware components, provides a simple interface, and runs in near real time at 5–6 frames/sec. The proposed system reconstructs a visual hull from multiple acquired videos and then generates final videos from the model by referring to the camera shots used in film-making. The proposed method utilizes “Reliability” to compensate for errors that may have occurred in non-studio environments and to produce the most natural scene from the reconstructed model. By using a virtual camera control system, even non-experts can easily convert the 3D model to movies that look as if they were created by experienced filmmakers.
Kiyoshi Kogure

9.
This paper proposes a new play segmentation algorithm using a local adaptive model for each sports game, in which the play start shots (PSS) that represent the start of each play segment are detected by comparing all keyframes with the PSS model. The PSS model is calculated on the fly using a generic clustering algorithm and the repetitive nature of the PSS. The end of each play segment, the play end shot (PES), is determined by detecting close-up shots using the field color extracted from the play start shots, since the camera will focus on the players or the audience in close-up view. Experimental results on 28 baseball videos show that the proposed algorithm performs well compared to other algorithms.
Jinguk Jeong

10.
Indexing and segmentation of video content by motion, color and texture have been intensively explored, leading to the commonly used storyboard representation. In this paper, a novel method of visualization of video content is proposed. First of all, the content is segmented into shots, and then a spatio-temporal color signature of shots, based on color distribution in the frames, is proposed. This spatio-temporal color signature serves as a basis for graph clustering and graph visualization tools. Integrated into Tulip, a platform for the visualization of huge graphs, these tools provide an exciting graph-based navigation interface for multimedia content. The results obtained on feature documentaries are promising.
Jenny Benois-Pineau

11.
Efficient and robust shot change detection   Total citations: 6 (self-citations: 0, citations by others: 6)
In this article, we deal with the problem of shot change detection which is of primary importance when trying to segment and abstract video sequences. Contrary to recent experiments, our aim is to elaborate a robust but very efficient (real-time even with uncompressed data) method to deal with the remaining problems related to shot change detection: illumination changes, context and data independency, and parameter settings. To do so, we have considered some adaptive threshold and derivative measures in a hue-saturation colour space. We illustrate our robust and efficient method by some experiments on news and football broadcast video sequences.
Nicole Vincent
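
The idea of an adaptive threshold for shot change detection can be sketched in a few lines. This is a generic illustration, not the method of the article: it assumes a frame-to-frame difference has already been computed for each frame (for example from hue-saturation histograms), and the sliding-window rule "mean plus k standard deviations" as well as the parameters `window`, `k` and `floor` are hypothetical choices.

```python
from statistics import mean, pstdev

def detect_cuts(diffs, window=5, k=3.0, floor=0.05):
    """Flag frame indices whose difference exceeds a locally adapted threshold.

    diffs[t] is a precomputed dissimilarity between frame t-1 and frame t.
    The threshold for frame t is derived from the previous `window` values,
    so it follows slow changes such as gradual illumination drift.
    """
    cuts = []
    for t in range(window, len(diffs)):
        recent = diffs[t - window:t]
        # `floor` keeps the threshold from collapsing on very static content.
        threshold = max(mean(recent) + k * pstdev(recent), floor)
        if diffs[t] > threshold:
            cuts.append(t)
    return cuts

# Small, noisy differences with one abrupt change at index 6.
diffs = [0.02, 0.03, 0.02, 0.03, 0.02, 0.03, 0.80, 0.02, 0.03, 0.02, 0.03, 0.02]
cuts = detect_cuts(diffs)
print(cuts)
```

A fixed global threshold would need retuning per video; deriving it from recent frames is one simple way to reduce that dependence on parameter settings.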

12.
13.
In applications, such as post-production and archiving of audiovisual material, users are confronted with large amounts of redundant unedited raw material, called rushes. Viewing and organizing this material are crucial but time consuming tasks. Typically, multiple but slightly different takes of the same scene can be found in the rushes video. We propose a method for detecting and clustering takes of one scene shot from the same or very similar camera positions. An important subproblem is to determine the similarity of video segments. We propose a distance measure based on the Longest Common Subsequence (LCSS) model. Two variants of the proposed approach, one with a threshold parameter and one with automatically determined threshold, are compared against the Dynamic Time Warping (DTW) distance measure on six videos from the TRECVID 2007 BBC rushes summarization data set. We also evaluate the influence of the applied temporal segmentation method at the input on the results. Applications of the proposed method to automatic skimming and interactive browsing of rushes video are described.
Georg Thallinger
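
A distance based on the Longest Common Subsequence (LCSS) model, as mentioned in this abstract, can be sketched with standard dynamic programming. The sketch assumes each video segment has been reduced to a sequence of scalar per-frame features, that two values match when they differ by at most a threshold `eps`, and that the distance is normalized by the shorter sequence length; these choices are assumptions for illustration and may differ from the paper's definition.

```python
def lcss_length(a, b, eps):
    """Length of the longest common subsequence, matching values within eps."""
    rows, cols = len(a), len(b)
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            if abs(a[i - 1] - b[j - 1]) <= eps:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[rows][cols]

def lcss_distance(a, b, eps):
    """0.0 when the shorter sequence is fully contained, 1.0 when nothing matches."""
    return 1.0 - lcss_length(a, b, eps) / min(len(a), len(b))

# Two takes of the same scene; take_1 contains one extra outlier frame.
take_1 = [0.1, 0.2, 0.9, 0.4, 0.5]
take_2 = [0.1, 0.2, 0.4, 0.5]
other_scene = [0.9, 0.8, 0.7, 0.6]

same = lcss_distance(take_1, take_2, eps=0.05)
different = lcss_distance(take_2, other_scene, eps=0.05)
print(same, different)
```

Unlike DTW, which must align every element, LCSS can skip the outlier frame in `take_1`, which makes it tolerant of short insertions between otherwise similar takes.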

14.
Caching collaboration and cache allocation in peer-to-peer video systems   Total citations: 1 (self-citations: 1, citations by others: 0)
Providing scalable video services in a peer-to-peer (P2P) environment is challenging. Since videos are typically large and require high communication bandwidth for delivery, many peers may be unwilling to cache them in whole to serve others. In this paper, we address two fundamental research problems in providing scalable P2P video services: (1) how a host can find enough video pieces, which may be scattered across the whole system, to assemble a complete video; and (2) given a limited buffer size, what part of a video a host should cache and what existing data should be expunged to make the necessary space. We address these problems with two new ideas: Cell caching collaboration and Controlled Inverse Proportional (CIP) cache allocation. The Cell concept allows cost-effective caching collaboration in a fully distributed environment and can dramatically reduce video lookup cost. On the other hand, CIP cache allocation challenges the conventional caching wisdom by caching unpopular videos with higher priority. Our approach allows the system to retain many copies of popular videos to avoid creating hot spots and, at the same time, prevents unpopular videos from being quickly evicted from the system. We have implemented a Gnutella-like simulation network and use it as a testbed to evaluate the proposed technique. Our extensive study shows convincingly the performance advantage of the new scheme.
Wallapak Tavanapong

15.
Efficient data broadcasting is independent of request arrivals, and is thus highly promising when transmitting popular videos. A conventionally adopted broadcasting method is periodic broadcasting, which divides a popular video into segments, which are then simultaneously broadcast on different data channels. Once clients want to watch the video, they download the segments from these channels. The skyscraper broadcasting (SkB) scheme supports clients with small bandwidths. An SkB client requires only two-channel bandwidths to receive video segments. This work proposes a reverse SkB (RSkB) scheme, which extends SkB by reducing buffering spaces. The RSkB is mathematically shown to achieve on-time video delivery and two-channel client bandwidths. A formula for determining the maximum number of segments buffered by an RSkB client is presented. Finally, an analysis of RSkB reveals that its client buffer requirements are usually 25–37% lower than those of SkB. Extensive simulations of RSkB further demonstrate that RSkB yields lower client buffer demand than other proposed systems.
Hsiang-Fu Yu
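
The abstract does not state how a video is divided into segments. As background only, the sketch below generates the relative segment sizes usually quoted for the original SkB scheme (1, 2, 2, 5, 5, 12, 12, 25, 25, ...), optionally capped at a maximum width. The recurrence and the cap are taken from the general SkB literature rather than from this abstract, the function and parameter names are hypothetical, and RSkB itself is not modeled.

```python
def skyscraper_series(num_segments, width=None):
    """Relative segment sizes for skyscraper broadcasting (illustrative sketch).

    Segment n is broadcast on channel n; sizes are in units of the first
    segment. `width`, if given, caps the size of the largest segments.
    """
    raw = []
    for n in range(1, num_segments + 1):
        if n == 1:
            raw.append(1)
        elif n in (2, 3):
            raw.append(2)
        elif n % 4 == 0:
            raw.append(2 * raw[-1] + 1)
        elif n % 4 == 2:
            raw.append(2 * raw[-1] + 2)
        else:
            # n % 4 is 1 or 3: the size of the previous segment is repeated.
            raw.append(raw[-1])
    if width is None:
        return raw
    return [min(size, width) for size in raw]

series = skyscraper_series(11)
capped = skyscraper_series(8, width=12)
print(series)
print(capped)
```

The sizes grow in pairs, which is what lets a client follow the broadcast with only two channels at a time; the buffer needed to do so is the quantity RSkB is reported to reduce.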

16.
In the conventional wavelet coding scheme based on motion compensated temporal filtering, the group of picture structure and the low-pass frame position are fixed, so variations in the motion activity of video sequences are not considered. In this paper, we propose an adaptive group of picture structure selection scheme in which the group of picture size and the low-pass frame position are selected based on mutual information. Furthermore, the temporal decomposition process is determined adaptively according to the selected group of picture structure. Extensive experiments compare the compression performance of the proposed method with the conventional motion compensated temporal filtering encoding scheme and with the adaptive group of picture structure in the standard scalable video coding model. The proposed low-pass frame selection improves the compression quality by about 0.3–0.5 dB over the conventional scheme in video sequences with high motion activity. In scenes with uneven variation of motion activity, e.g. frequent shot cuts, the proposed adaptive group of picture size achieves better compression than the conventional scheme. Compared with the adaptive group of picture in the standard scalable video coding model, the proposed group of picture structure scheme yields improvements of about 0.2–0.8 dB in sequences with high motion activity or shot cuts.
Zhao-Guang Liu
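
Mutual information between two frames, the quantity this abstract uses to choose the group of picture structure, can be computed from a joint histogram of co-located pixel values. The sketch below shows only that computation; it assumes frames are given as flat lists of already quantized intensities of equal length, and it does not reproduce how the paper turns the values into a group of picture size or a low-pass frame position.

```python
import math
from collections import Counter

def mutual_information(frame_a, frame_b):
    """Mutual information in bits between two equally sized frames.

    Each frame is a flat list of quantized pixel intensities; pixels at the
    same index are treated as co-located.
    """
    if len(frame_a) != len(frame_b):
        raise ValueError("frames must have the same number of pixels")
    n = len(frame_a)
    joint = Counter(zip(frame_a, frame_b))
    marginal_a = Counter(frame_a)
    marginal_b = Counter(frame_b)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        p_x = marginal_a[x] / n
        p_y = marginal_b[y] / n
        mi += p_xy * math.log2(p_xy / (p_x * p_y))
    return mi

frame = [0, 0, 1, 1]
identical = mutual_information(frame, [0, 0, 1, 1])   # same content
unrelated = mutual_information(frame, [0, 1, 0, 1])   # statistically independent
print(identical, unrelated)
```

High mutual information between neighbouring frames suggests low motion activity, while a sharp drop is the kind of signal one would expect at a shot cut.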

17.
This paper describes security and privacy issues for multimedia database management systems. Multimedia data includes text, images, audio and video. It describes access control for multimedia database management systems and describes security policies and security architectures for such systems. Privacy problems that result from multimedia data mining are also discussed.
Bhavani Thuraisingham

18.
Nowadays data mining plays an important role in decision making. Since many organizations do not possess in-house data mining expertise, it is beneficial to outsource data mining tasks to external service providers. However, most organizations hesitate to do so because they are concerned about losing business intelligence and customer privacy. In this paper, we present a Bloom filter based solution that enables organizations to outsource the mining of association rules while protecting their business intelligence and customer privacy. Our approach can achieve high precision in data mining at the cost of higher storage requirements. This research was supported by the USA National Science Foundation Grants CCR-0310974 and IIS-0546027.
Ling Qiu (Corresponding author)
Yingjiu Li
Xintao Wu
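
For readers unfamiliar with the data structure this abstract builds on, here is a minimal Bloom filter. It is a generic sketch, not the paper's outsourcing protocol: the class name, the use of seeded SHA-256 digests as hash functions, and the sizes in the example are assumptions made for illustration.

```python
import hashlib

class BloomFilter:
    """Fixed-size bit array with several hash positions per item.

    Membership tests can give false positives but never false negatives,
    and the stored bits do not reveal the items directly.
    """

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes positions by hashing the item with different seeds.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))

# Encode the items of one transaction; item names are placeholders.
transaction = BloomFilter(num_bits=1024, num_hashes=3)
for item in ("bread", "milk"):
    transaction.add(item)

print("bread" in transaction, "milk" in transaction)
```

A larger bit array lowers the false-positive rate, which matches the trade-off the abstract describes between mining precision and storage.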

19.
Automatic personalized video abstraction for sports videos using metadata   Total citations: 1 (self-citations: 1, citations by others: 0)
Video abstraction is defined as creating a video abstract which includes only important information in the original video streams. There are two general types of video abstracts, namely the dynamic and static ones. The dynamic video abstract is a 3-dimensional representation created by temporally arranging important scenes while the static video abstract is a 2-dimensional representation created by spatially arranging only keyframes of important scenes. In this paper, we propose a unified method of automatically creating these two types of video abstracts that considers the semantic content and especially targets broadcast sports videos. For both types of video abstracts, the proposed method first determines the significance of scenes. A play scene, which corresponds to a play, is considered as a scene unit of sports videos, and the significance of every play scene is determined based on the play ranks, the time the play occurred, and the number of replays. This information is extracted from the metadata, which describes the semantic content of videos and enables us to consider not only the types of plays but also their influence on the game. In addition, users' preferences are considered to personalize the video abstracts. For dynamic video abstracts, we propose three approaches for selecting the play scenes of the highest significance: the basic criterion, the greedy criterion, and the play-cut criterion. For static video abstracts, we also propose an effective display style where a user can easily access target scenes from a list of keyframes by tracing the tree structures of sports games. We experimentally verified the effectiveness of our method by comparing our results with man-made video abstracts as well as by conducting questionnaires.
Noboru Babaguchi

20.
Streaming of scalable H.264 videos over the Internet   Total citations: 1 (self-citations: 0, citations by others: 1)
To investigate the benefits of scalable codecs for the rate adaptation problem, a streaming system for scalable H.264 videos has been implemented. The system considers the congestion level in the network and the buffer status at the client during the adaptation process. The rate adaptation algorithm is content adaptive. It selects an appropriate substream from the video file by taking into account the motion dynamics of the video. The performance of the system has been tested under congestion-free and congestion scenarios. The performance results indicate that the system reacts to congestion properly and can be used for Internet video streaming where losses occur unpredictably.
Aylin Kantarcı
