Similar literature
 Found 20 similar articles; search took 78 ms
1.
The paper presents an automatic video summarization technique based on graph theory and the dominant sets clustering algorithm. The large size of the video data set is handled by exploiting the connectivity information of prototype frames extracted from a down-sampled version of the original video sequence. The connectivity information for the prototypes, which is obtained from the whole data set, improves video representation and reveals its structure. The optimal number of clusters, and hence of keyframes, is selected automatically in a subsequent step by the dominant sets clustering algorithm. The method is free of user-specified modeling parameters and is evaluated with several metrics that quantify its content representational ability. The proposed summarization technique is compared with the Open Video storyboard, the Adaptive clustering algorithm and the Delaunay clustering approach.
D. Besiris
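The abstract above does not say how the dominant sets are computed. The following is a minimal sketch of the usual replicator-dynamics formulation, assuming a symmetric, non-negative frame affinity matrix with a zero diagonal. The function names, the `cutoff` support threshold and the peel-off loop are illustrative choices, not taken from the paper.

```python
def extract_dominant_set(affinity, max_iters=1000, tol=1e-9, cutoff=1e-4):
    """Run replicator dynamics from the barycenter and return the support.

    affinity: square list of lists, symmetric, non-negative, zero diagonal.
    """
    n = len(affinity)
    x = [1.0 / n] * n  # start with equal weight on every frame
    for _ in range(max_iters):
        ax = [sum(affinity[i][j] * x[j] for j in range(n)) for i in range(n)]
        total = sum(x[i] * ax[i] for i in range(n))
        if total == 0.0:  # no similarity left (e.g. a single frame)
            break
        new_x = [x[i] * ax[i] / total for i in range(n)]
        change = sum(abs(a - b) for a, b in zip(new_x, x))
        x = new_x
        if change < tol:
            break
    return [i for i in range(n) if x[i] > cutoff]


def cluster_all(affinity):
    """Peel off dominant sets until every frame is assigned.

    The number of clusters is not given in advance; it falls out of the loop.
    """
    remaining = list(range(len(affinity)))
    clusters = []
    while remaining:
        sub = [[affinity[i][j] for j in remaining] for i in remaining]
        members = extract_dominant_set(sub)
        if not members:  # guard (assumption): keep the rest together
            members = list(range(len(sub)))
        cluster = [remaining[k] for k in members]
        clusters.append(cluster)
        remaining = [i for i in remaining if i not in cluster]
    return clusters
```

With a toy affinity matrix containing one tight group of three frames and one of two, `cluster_all` returns two clusters without being told how many to look for. A keyframe could then be chosen per cluster, for instance the member with the largest final weight.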

2.
Grouping video content into semantic segments and classifying semantic scenes into different types are crucial processes for content-based video organization, management and retrieval. In this paper, a novel approach to automatically segment scenes and semantically represent scenes is proposed. Firstly, video shots are detected using a rough-to-fine algorithm. Secondly, key-frames within each shot are selected adaptively with hybrid features, and redundant key-frames are removed by template matching. Thirdly, spatio-temporally coherent shots are clustered into the same scene based on the temporal constraint of video content and visual similarity between shot activities. Finally, based on a full analysis of the typical characteristics of continuously recorded videos, scene content is semantically represented to satisfy human demand on video retrieval. The proposed algorithm has been tested on various genres of films and TV programs. Promising experimental results show that the proposed method supports efficient retrieval of interesting video content.
Yuncai Liu

3.
This paper addresses the problem of ensuring the integrity of a digital video and presents a scalable signature scheme for video authentication based on cryptographic secret sharing. The proposed method detects spatial cropping and temporal jittering in a video, yet is robust against frame dropping in the streaming video scenario. In our scheme, the authentication signature is compact and independent of the size of the video. Given a video, we identify the key frames based on differential energy between the frames. Considering video frames as shares, we compute the corresponding secret at three hierarchical levels. The master secret is used as the digital signature to authenticate the video. The proposed signature scheme is scalable to three hierarchical levels of signature computation based on the needs of different scenarios. We provide extensive experimental results to show the utility of our technique in three different scenarios: streaming video, video identification and face tampering.
Mohan S. Kankanhalli

4.
Real-time 2D to 3D video conversion   Total citations: 1, self-citations: 0, citations by others: 1
We present a real-time implementation of 2D to 3D video conversion using compressed video. In our method, compressed 2D video is analyzed by extracting motion vectors. Using the motion vector maps, depth maps are built for each frame and the frames are segmented to provide object-wise depth ordering. These data are then used to synthesize stereo pairs. 3D video synthesized in this fashion can be viewed using any stereoscopic display. In our implementation, anaglyph projection was selected as the 3D visualization method, because it is mostly suited to standard displays.
Ianir Ideses
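The last step named in the abstract above, anaglyph display of a synthesized stereo pair, can be sketched as follows. The abstract does not state which anaglyph colour scheme is used, so this assumes the common red-cyan layout (red from the left view, green and blue from the right view). Images are plain nested lists of `(r, g, b)` tuples, and the function name is hypothetical.

```python
def red_cyan_anaglyph(left, right):
    """Combine a stereo pair into one red-cyan anaglyph image.

    left, right: equally sized images given as rows of (r, g, b) tuples.
    Assumption: red channel from the left view, green/blue from the right.
    """
    if len(left) != len(right) or any(
        len(lrow) != len(rrow) for lrow, rrow in zip(left, right)
    ):
        raise ValueError("left and right views must have the same size")
    return [
        [(lpix[0], rpix[1], rpix[2]) for lpix, rpix in zip(lrow, rrow)]
        for lrow, rrow in zip(left, right)
    ]
```

Viewed through red-cyan glasses, each eye then sees mostly its own view. The depth-map estimation from motion vectors that precedes this step is not reproduced here.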

5.
In this paper, we propose a new real-time content filtering framework for live broadcasts in TV terminals. Content filtering in TV terminals is a necessary provision of personalized broadcasting services in that it enables a TV viewer to obtain desired scenes from multiple channel broadcasts. In this paper, a stable and reliable filtering structure and an algorithm for multiple inputs are proposed. Moreover, real-time filtering requirements such as frame sampling rate per channel, number of input channels, and buffer condition are analyzed to achieve real-time processing in terminals with limited computing power. Based on queueing theory, we model the system and resolve the filtering requirements. To verify the proposed system and analysis, a filtering algorithm for soccer videos is applied which is modified for real-time processing. Through analysis of visual features (e.g., dominant color and edge components) and detection of spatial objects (e.g., a score board), it recognizes a temporal pattern between successive video frames and filters desired scenes. Experiments on soccer videos have been performed and the results validate the effectiveness of the proposed approach and system.
Yong Man Ro (Corresponding author)

6.
The paper presents a real-time algorithm that compensates for image distortions due to atmospheric turbulence in video sequences while leaving the real moving objects in the video unharmed. The algorithm involves (1) generation of a “reference” frame; (2) estimation, for each incoming video frame, of a local image displacement map with respect to the reference frame; (3) segmentation of the displacement map into two classes, stationary and moving objects; and (4) turbulence compensation of stationary objects. Experiments with both simulated and real-life sequences have shown that the restored videos, generated in real time using standard computer hardware, exhibit excellent stability for stationary objects while retaining real motion.
Barak Fishbain

7.
Online updating appearance generative mixture model for meanshift tracking   Total citations: 1, self-citations: 0, citations by others: 1
This paper proposes an appearance generative mixture model based on key frames for meanshift tracking. The meanshift tracking algorithm tracks an object by maximizing the similarity between the histogram in the tracking window and a static histogram acquired at the beginning of tracking. The tracking could therefore fail if the appearance of the object varies substantially. In this paper, we assume that the key appearances of the object can be acquired before tracking and that the manifold of the object appearance can be approximated by a piece-wise linear combination of these key appearances in histogram space. The generative process is described by a Bayesian graphical model. An online EM algorithm is proposed to estimate the model parameters from the observed histogram in the tracking window and to update the appearance histogram. We applied this approach to track human head motion and to infer the head pose simultaneously in videos. Experiments verify that our online histogram generative model constrained by key appearance histograms alleviates the drifting problem often encountered in tracking with online updating, that the enhanced meanshift algorithm is capable of tracking objects of varying appearance more robustly and accurately, and that our tracking algorithm can infer additional information such as the object poses. Electronic supplementary material: The online version of this article (doi:) contains supplementary material, which is available to authorized users.
Jilin Tu (Corresponding author)
Hai Tao
Thomas Huang
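Two ingredients of the abstract above can be illustrated briefly: a histogram similarity score, and a target histogram built as a weighted combination of key-appearance histograms. The abstract does not name the similarity measure. The sketch assumes the Bhattacharyya coefficient, which is commonly used in meanshift tracking. The online EM update of the weights is not shown, and all names are hypothetical.

```python
import math


def normalize(hist):
    """Scale a non-negative histogram so that its bins sum to 1."""
    total = sum(hist)
    if total <= 0:
        raise ValueError("histogram must have positive mass")
    return [h / total for h in hist]


def bhattacharyya(p, q):
    """Similarity of two normalized histograms: 1 if identical, 0 if disjoint."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))


def mix_histograms(key_hists, weights):
    """Convex combination of key-appearance histograms (weights sum to 1)."""
    bins = len(key_hists[0])
    return [
        sum(w * h[u] for w, h in zip(weights, key_hists)) for u in range(bins)
    ]
```

A tracker in this style would compare the histogram observed in the tracking window against `mix_histograms(...)` rather than against a single static histogram, which is how the key appearances constrain the model.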

8.
In applications such as post-production and archiving of audiovisual material, users are confronted with large amounts of redundant unedited raw material, called rushes. Viewing and organizing this material are crucial but time-consuming tasks. Typically, multiple but slightly different takes of the same scene can be found in the rushes video. We propose a method for detecting and clustering takes of one scene shot from the same or very similar camera positions. An important subproblem is to determine the similarity of video segments. We propose a distance measure based on the Longest Common Subsequence (LCSS) model. Two variants of the proposed approach, one with a threshold parameter and one with an automatically determined threshold, are compared against the Dynamic Time Warping (DTW) distance measure on six videos from the TRECVID 2007 BBC rushes summarization data set. We also evaluate the influence of the temporal segmentation method applied at the input on the results. Applications of the proposed method to automatic skimming and interactive browsing of rushes video are described.
Georg Thallinger
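A distance based on the LCSS model, as mentioned in the abstract above, can be sketched with the standard dynamic program. This version assumes each video segment is reduced to a sequence of scalar per-frame features and that two frames match when their features differ by at most a threshold `eps`. The feature choice and the normalization by the shorter length are illustrative assumptions, not the paper's exact definition.

```python
def lcss_length(a, b, eps):
    """Length of the longest common subsequence under an eps-match test."""
    n, m = len(a), len(b)
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if abs(a[i - 1] - b[j - 1]) <= eps:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[n][m]


def lcss_distance(a, b, eps):
    """Distance in [0, 1]: 0 when the shorter sequence matches completely."""
    if not a or not b:
        return 1.0
    return 1.0 - lcss_length(a, b, eps) / min(len(a), len(b))
```

Unlike DTW, which must align every frame, LCSS can skip frames that have no counterpart in the other take, which suits takes that differ by a few inserted or missing frames.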

9.
Efficient video encryption scheme based on advanced video coding   Total citations: 1, self-citations: 0, citations by others: 1
A video encryption scheme combined with advanced video coding (AVC) is presented and analyzed in this paper; it differs from the schemes used in MPEG1/2 video encryption. In the proposed scheme, the intra-prediction mode and motion vector difference are encrypted with the length-kept encryption algorithm (LKE) in order to preserve format compliance, and the residue data of the macroblocks are encrypted with the residue data encryption algorithm (RDE) in order to keep the cost low. Additionally, a key distribution scheme is proposed to maintain robustness to transmission errors; it assigns sub-keys to different frames or slices independently. The encryption scheme’s security, time efficiency and error robustness are analyzed in detail. Experimental results show that the encryption scheme keeps the file format unchanged, is secure against replacement attacks, is computationally efficient, and is robust to some transmission errors. These properties make it a suitable choice for real-time applications such as secure IPTV, secure videoconferencing or mobile/wireless multimedia.
Shiguo Lian

10.
In this paper, we address the problem of video frame rate up-conversion (FRC) in the compressed domain. FRC is often recognized as video temporal interpolation. This problem is very challenging when targeted for video sequences with inconsistent camera and object motion, such as sports videos. A novel compressed domain motion compensation scheme is presented and applied in this paper, aiming at up-sampling frame rates in sports videos. MPEG-2 encoded motion vectors (MVs) are utilized as inputs in the proposed algorithm. The decoded MVs undergo a cumulative spatiotemporal interpolation. An iterative rejection scheme based on the dense motion vector field (MVF) and the generalized affine motion model is exploited to detect global camera motion. Subsequently, the foreground object separation is performed by additionally examining the temporal consistency of the output of iterative rejections. This consistency check process helps coalesce the resulting foreground blocks and weed out the unqualified blocks. Finally, different compensation strategies for the camera and object motions are applied to interpolate the new frames. Illustrative examples are provided to demonstrate the efficacy of the proposed approach. Experimental results are compared with the popular block and non-block based frame interpolation approaches.
Jinsong Wang

11.
This paper presents an FPGA-based architecture for local tone mapping of gray scale high dynamic range images. The architecture is described in VHDL and has been synthesized using Altera Quartus tools. It achieves an operating frequency consistent with a video rate of 60 frames per second for a frame of 1,024 × 768 pixels. The proposed architecture is a modification of the nine-scale Reinhard operator. Approximations to the original Reinhard operator ensure that the operator is amenable to implementation in hardware. A peak signal-to-noise ratio study shows that our fixed-point hardware approximation produces results similar to a floating-point original.
Joan E. Carletta
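For orientation, here is a sketch of the much simpler global form of the Reinhard operator, not the nine-scale local operator or the fixed-point hardware approximation described in the abstract above. It scales luminance by a key value over the log-average luminance and then compresses with L / (1 + L). The default `key=0.18` and the small `delta` that guards against log(0) are conventional choices assumed here.

```python
import math


def reinhard_global(luminance, key=0.18, delta=1e-6):
    """Map HDR luminance (rows of non-negative floats) into [0, 1).

    Global variant only: every pixel uses the same scene-wide scale.
    """
    flat = [v for row in luminance for v in row]
    if not flat:
        raise ValueError("image must contain at least one pixel")
    # Log-average luminance of the whole image.
    log_avg = math.exp(sum(math.log(delta + v) for v in flat) / len(flat))
    out = []
    for row in luminance:
        scaled = [key * v / log_avg for v in row]
        out.append([s / (1.0 + s) for s in scaled])
    return out
```

The local operator replaces the denominator's `s` with a neighbourhood average chosen per pixel across several scales, which is the part that makes a hardware implementation demanding.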

12.
The objective measurement of blocking artifacts plays an important role in the design, optimization, and assessment of image and video compression. In this paper, we propose a novel measurement algorithm for blocking artifacts. Computer simulation results indicate that the proposed method accurately measures the blocking artifacts without using the original image. Moreover, the proposed algorithm can be easily implemented in both pixel and DCT domains.
Chun-Su Park

13.
Multimedia applications often have performance requirements that make them intensive in computing resources; e.g., the number of video frames displayed to the user must be about 25 frames per second. A user hint is an indication of the interest that a user has in an application. Examples of user hints include a screen saver being invoked or a window being covered by another window. If the user is running video, the occurrence of these hints implies that the user is no longer viewing the video. However, the resource usage of the application has not changed. This paper describes an architecture that makes use of user hints to reduce the resource consumption of an application. The emphasis is on network traffic and CPU usage. Experimental results are presented.
Hanan Lutfiyya

14.
Video is an information-intensive medium with much redundancy. Therefore, it is desirable to be able to mine the structure or semantics of video data for efficient browsing, summarization and highlight extraction. In this paper, we propose a mosaic-based approach to key-event and structure mining, which is regarded as a complementary view for sports video analysis. A mosaic is generated for each shot by a novel efficient mosaicing scheme, which constructs a global motion path and selects the best subset of frames for mosaicing. These improved mosaics are then used as the representative images of shot content. Based on the mosaics, the structure and events in sports video are mined by methods with and without prior knowledge. Without prior knowledge, our system is able to locate global view shots taken by the dominant camera. If prior knowledge is available, the events in these global view shots are detected using robust features extracted from the mosaics. For global view mining, experiments comparing against a key-frame-based scheme have demonstrated that the mosaic-based scheme gives better results in several kinds of sports videos; for event mining, the detection of key-plays and key-events in the specific domain of soccer videos has proved its effectiveness.
Xian-Sheng Hua

15.
Using multiple reference frame compensation in the H.264 coder improves the coding efficiency for sequences that contain uncovered backgrounds, repetitive motions and highly textured areas. Unfortunately, this technique requires excessive memory and computation resources. In this article, we proposed and implemented a technique based on a Markov random field algorithm relying on robust moving pixel segmentation. By introducing this technique, we were able to decrease the number of reference frames from five to three while keeping similar video coding performance. The coding time decreased by 35% and the sequence quality was preserved. After validating our idea, we evaluated the processing time of the Markov algorithm on architectures intended for embedded multimedia applications. Both DSP and FPGA implementations were explored. We were able to process 50 frames (128 × 128)/s on the EP1S10 FPGA platform and 35 frames (128 × 128)/s on the ADSP BF533.
Patrick Garda

16.
Quantitative usability requirements are a critical but challenging, and hence often neglected, aspect of a usability engineering process. A case study is described where quantitative usability requirements played a key role in the development of a new user interface of a mobile phone. Within the practical constraints of the project, existing methods for determining usability requirements and evaluating the extent to which these are met could not be applied as such; therefore, tailored methods had to be developed. These methods and their applications are discussed.
Timo Jokela (Corresponding author)
Jussi Koivumaa
Jani Pirkola
Petri Salminen
Niina Kantola

17.
This paper presents an efficient VLSI architecture for fast implementation of sub-pixel interpolation in H.264/AVC. Several optimization techniques at different design levels, such as parallel processing, vector registers, a pipelined architecture, and in-place computation, are used to reduce the number of memory accesses and accelerate the interpolation computations. The proposed application-specific processor can meet the real-time constraint of the sub-pixel interpolation algorithm for the 16:9 video format (4,096 × 2,304) at 30 frames per second (fps) at a 100 MHz clock rate.
Philip P. Dang
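The arithmetic that such an architecture accelerates can be shown in a few lines. In H.264/AVC, luma half-sample values are computed with the 6-tap filter (1, -5, 20, 20, -5, 1), rounded by adding 16 and shifting right by 5, then clipped to the sample range. The sketch below applies this along one row of 8-bit samples. The edge replication at the row borders and the function names are simplifying assumptions, and nothing here reflects the paper's hardware design.

```python
TAPS = (1, -5, 20, 20, -5, 1)


def half_pel(window):
    """Half-sample value between window[2] and window[3] (8-bit samples)."""
    acc = sum(t * s for t, s in zip(TAPS, window))
    return max(0, min(255, (acc + 16) >> 5))  # round, scale by 1/32, clip


def half_pel_row(row):
    """Half-sample values between each pair of neighbours in a row.

    Assumption for illustration: samples outside the row repeat the edge.
    """
    last = len(row) - 1
    result = []
    for i in range(last):
        window = [row[min(max(i + k, 0), last)] for k in (-2, -1, 0, 1, 2, 3)]
        result.append(half_pel(window))
    return result
```

Each output needs six reads and six multiply-accumulates, which is why techniques such as vector registers and in-place computation pay off at high resolutions.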

18.
Television produces massive amounts of video every day. Digital video is unfortunately an unstructured document in which it is very difficult to find any information. Television streams nevertheless have a strong and stable but hidden structure, which we want to discover by detecting repeating objects in the video stream. This paper shows that television streams are actually highly redundant and that detecting repeats can be an effective way to detect the underlying structure of the video. A method for detecting these repetitions is presented here with an emphasis on the efficiency of the search in a large video corpus. Very good results are obtained in terms of both effectiveness (98% recall and precision) and efficiency, since one day of video is queried against a 3-week dataset in only 1 s.
Patrick Gros

19.
In the conventional motion compensated temporal filtering based wavelet coding scheme, where the group of picture structure and low-pass frame position are fixed, variations in the motion activity of video sequences are not considered. In this paper, we propose an adaptive group of picture structure selection scheme, in which the group of picture size and low-pass frame position are selected based on mutual information. Furthermore, the temporal decomposition process is determined adaptively according to the selected group of picture structure. Extensive experiments compare the compression performance of the proposed method with the conventional motion compensated temporal filtering encoding scheme and with the adaptive group of picture structure in the standard scalable video coding model. The proposed low-pass frame selection improves the compression quality by about 0.3–0.5 dB compared to the conventional scheme in video sequences with high motion activity. In scenes with uneven variation of motion activity, e.g. frequent shot cuts, the proposed adaptive group of picture size achieves better compression than the conventional scheme. Compared to the adaptive group of picture structure in the standard scalable video coding model, the proposed group of picture structure scheme yields improvements of about 0.2–0.8 dB in sequences with high motion activity or shot cuts.
Zhao-Guang Liu
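The abstract above selects the group of picture structure from mutual information but does not say how it is computed. As a minimal sketch, the mutual information between two frames can be estimated from the joint histogram of co-located pixel intensities. Here frames are assumed to be flattened lists of integer intensities, and the function name is hypothetical.

```python
import math
from collections import Counter


def mutual_information(frame_a, frame_b):
    """Mutual information in bits between two equally sized frames.

    frame_a, frame_b: flat sequences of pixel intensities at the same
    positions. High values suggest the frames share content; a sharp drop
    between consecutive frames hints at a shot cut or strong motion.
    """
    if len(frame_a) != len(frame_b) or not frame_a:
        raise ValueError("frames must be non-empty and of equal size")
    n = len(frame_a)
    joint = Counter(zip(frame_a, frame_b))
    count_a = Counter(frame_a)
    count_b = Counter(frame_b)
    mi = 0.0
    for (x, y), c in joint.items():
        p_xy = c / n
        mi += p_xy * math.log2(p_xy / ((count_a[x] / n) * (count_b[y] / n)))
    return mi
```

A selection rule could, for example, end a group of pictures where the mutual information with the preceding frame falls below a threshold. That rule is an illustration only and is not claimed to be the paper's criterion.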

20.
The purposes of this study are (a) to establish a measurement for evaluating conversational impressions of group discussions, and (b) to make an exploratory investigation of the interactional processes that may affect the formation of those impressions. The impression rating and factor analysis undertaken first yield four factors concerning conversational impressions of “focus group interviews (FGIs)”: conversational activeness, conversational sequencing, the attitudes of participants and the relationships of participants. In relation to the factors of conversational activeness and conversational sequencing in particular, the microanalysis of four selected topical scenes from our database further shows that the behavior of the moderator and the interviewees is organized not independently but with reference to each other. The study thus emphasizes the importance of integrating quantitative and qualitative approaches to human interaction.
Kana Suzuki (Corresponding author)
Ikuyo Morimoto
Etsuo Mizukami
Hiroko Otsuka
Hitoshi Isahara
