
1.

Sudden scene change detection in compressed video using interpolated macroblocks in B-frames

W. A. C. Fernando 《Multimedia Tools and Applications》2006,28(3):301-320

This paper addresses an important area in video processing, namely compressed domain processing. For video indexing, video scene transition detection is an essential step to segment the video. Current techniques for scene change detection tend to suffer from a major limitation: most of them cannot identify scene transitions in the compressed domain. Since most video is expected to be stored in the compressed domain, scene transition detection in this domain is highly desirable. In this paper an algorithm for video scene change detection is proposed to overcome this limitation. The scheme uses properties of the B-frames, which can measure the correlation between two adjacent reference frames. The results show that this scheme performs better than schemes based on P-frames. The proposed scheme can be applied directly to compressed data with minimum decompression; hence it is computationally efficient and makes real-time implementations possible. Results show that video scene transitions can be identified satisfactorily with the proposed scheme.
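The abstract does not state the detection rule, so the following Python sketch is only one plausible reading of the idea, not the author's algorithm. It assumes that a B-frame lying across a scene cut contains very few interpolated (bidirectionally predicted) macroblocks, because its two reference frames are no longer correlated. The `BFrameStats` container, the function names, and the 0.1 threshold are all hypothetical.

```python
from typing import List, NamedTuple


class BFrameStats(NamedTuple):
    """Macroblock-type counts for one B-frame (hypothetical container)."""
    forward: int       # predicted from the previous reference frame only
    backward: int      # predicted from the next reference frame only
    interpolated: int  # predicted from both reference frames
    intra: int         # coded without prediction


def interpolated_ratio(stats: BFrameStats) -> float:
    """Fraction of macroblocks in the frame that are interpolated."""
    total = stats.forward + stats.backward + stats.interpolated + stats.intra
    return stats.interpolated / total if total else 0.0


def detect_cuts(frames: List[BFrameStats], threshold: float = 0.1) -> List[int]:
    """Return indices of B-frames whose interpolated ratio is below threshold.

    The threshold is an assumed value and would need tuning on real data.
    """
    return [i for i, s in enumerate(frames) if interpolated_ratio(s) < threshold]
```

Only macroblock-type counts are needed here, which a decoder can read from the bitstream headers without reconstructing pixels; that is the sense in which such a test runs with minimum decompression.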

2.

A hierarchical video summary generation method

Cheng Wengang, Xu De 《Journal of Image and Graphics (中国图象图形学报)》2004,9(1):118-123

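A video summary is a compressed representation of video content. To support better video browsing, this paper proposes building video summaries at two levels (shot level and scene level) according to the granularity of browsing or retrieval, and presents a method for generating them. First, key frames within each shot are extracted automatically according to content change. Next, an improved time-adaptive algorithm groups shots into scenes. Finally, representative frames are extracted at the scene level using a minimum spanning tree method. Because key frames and representative frames capture the main content of their shots and scenes respectively, their sequences constitute the video summary. Experiments on several movie clips show that the method provides good summaries of video content at both coarse and fine granularity.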

3.

Statistical sequential analysis for real-time video scene change detection on compressed multimedia bitstream (total citations: 1, self-citations: 0, citations by others: 1)

Lelescu D. Schonfeld D. 《Multimedia, IEEE Transactions on》2003,5(1):106-117

The increased availability and usage of multimedia information have created a critical need for efficient multimedia processing algorithms. These algorithms must offer capabilities related to browsing, indexing, and retrieval of relevant data. A crucial step in multimedia processing is that of reliable video segmentation into visually coherent video shots through scene change detection. Video segmentation enables subsequent processing operations on video shots, such as video indexing, semantic representation, or tracking of selected video information. Since video sequences generally contain both abrupt and gradual scene changes, video segmentation algorithms must be able to detect a large variety of changes. While existing algorithms perform relatively well for detecting abrupt transitions (video cuts), reliable detection of gradual changes is much more difficult. A novel one-pass, real-time approach to video scene change detection based on statistical sequential analysis and operating on a compressed multimedia bitstream is proposed. Our approach models video sequences as stochastic processes, with scene changes being reflected by changes in the characteristics (parameters) of the process. Statistical sequential analysis is used to provide a unified framework for the detection of both abrupt and gradual scene changes.
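To illustrate what a one-pass sequential test looks like, here is a minimal Python sketch of a one-sided CUSUM detector. CUSUM is a classic sequential procedure chosen here for illustration; the abstract does not say which statistic the authors use, and the function name, the `baseline`, `drift`, and `threshold` parameters, and the scalar per-frame feature are all assumptions.

```python
from typing import List, Optional


def cusum_change_point(values: List[float], baseline: float,
                       drift: float, threshold: float) -> Optional[int]:
    """One-sided CUSUM over a stream of per-frame feature values.

    Accumulates positive deviations from `baseline` (minus a `drift`
    allowance) and returns the first index at which the accumulated sum
    exceeds `threshold`, or None if it never does.
    """
    s = 0.0
    for i, x in enumerate(values):
        # Reset to zero whenever the evidence for a change turns negative.
        s = max(0.0, s + (x - baseline - drift))
        if s > threshold:
            return i
    return None
```

Because the statistic accumulates evidence over several frames, the same loop responds to a single large jump (an abrupt cut) and to a slow ramp (a gradual transition), which is the appeal of a sequential formulation.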

4.

ViBE: a compressed video database structured for active browsing and search

Taskiran C. Jau-Yuen Chen Albiol A. Torres L. Bouman C.A. Delp E.J. 《Multimedia, IEEE Transactions on》2004,6(1):103-118

In this paper, we describe a unique new paradigm for video database management known as ViBE (video indexing and browsing environment). ViBE is a browseable/searchable paradigm for organizing video data containing a large number of sequences. The system first segments video sequences into shots by using a new feature vector known as the Generalized Trace obtained from the DC-sequence of the compressed data. Each video shot is then represented by a hierarchical structure known as the shot tree. The shots are then classified into pseudo-semantic classes that describe the shot content. Finally, the results are presented to the user in an active browsing environment using a similarity pyramid data structure. The similarity pyramid allows the user to view the video database at various levels of detail. The user can also define semantic classes and reorganize the browsing environment based on relevance feedback. We describe how ViBE performs on a database of MPEG sequences.

5.

InsightVideo: toward hierarchical video content organization for efficient browsing, summarization and retrieval (total citations: 2, self-citations: 0, citations by others: 2)

Xingquan Zhu Elmagarmid A.K. Xiangyang Xue Lide Wu Catlin A.C. 《Multimedia, IEEE Transactions on》2005,7(4):648-666

Hierarchical video browsing and feature-based video retrieval are two standard methods for accessing video content. Very little research, however, has addressed the benefits of integrating these two methods for more effective and efficient video content access. In this paper, we introduce InsightVideo, a video analysis and retrieval system, which joins video content hierarchy, hierarchical browsing and retrieval for efficient video access. We propose several video processing techniques to organize the content hierarchy of the video. We first apply a camera motion classification and key-frame extraction strategy that operates in the compressed domain to extract video features. Then, shot grouping, scene detection and pairwise scene clustering strategies are applied to construct the video content hierarchy. We introduce a video similarity evaluation scheme at different levels (key-frame, shot, group, scene, and video). By integrating the video content hierarchy and the video similarity evaluation scheme, hierarchical video browsing and retrieval are seamlessly integrated for efficient content access. We construct a progressive video retrieval scheme to refine user queries through the interactions of browsing and retrieval. Experimental results and comparisons of camera motion classification, key-frame extraction, scene detection, and video retrieval are presented to validate the effectiveness and efficiency of the proposed algorithms and the performance of the system.

6.

Video parsing and browsing using compressed data (total citations: 16, self-citations: 0, citations by others: 16)

Hongjiang Zhang Chien Yong Low Stephen W. Smoliar 《Multimedia Tools and Applications》1995,1(1):89-111

Parsing video content is an important first step in the video indexing process. This paper presents algorithms to automate the video parsing task, including partitioning a source video into clips and classifying those clips according to camera operations, using compressed video data. We have developed two algorithms and a hybrid approach to partitioning video data compressed according to the JPEG and MPEG standards. The algorithms utilize both the video content encoded in DCT (Discrete Cosine Transform) coefficients and the motion vectors between frames. The hybrid approach integrates the two algorithms and incorporates multi-pass strategies and motion analyses to improve both accuracy and processing speed. Also, we present content-based video browsing tools which utilize the information, particularly about the shot boundaries and key frames, obtained from parsing.

7.

A novel compact yet rich key frame creation method for compressed video summarization

Mengjuan Fei Wei Jiang Weijie Mao 《Multimedia Tools and Applications》2018,77(10):11957-11977

Video summarization has great potential to enable rapid browsing and efficient video indexing in many applications. In this study, we propose a novel compact yet rich key frame creation method for compressed video summarization. First, we directly extract the DC coefficients of I-frames from a compressed video stream, and DC-based mutual information is computed to segment the long video into shots. Then, we select shots with a static background and moving objects according to the intensity and range of the motion vectors in the video stream. After detecting moving object outliers in each selected shot, we select the optimal object set by importance ranking and by solving an optimum programming problem. Finally, we apply an improved KNN matting approach to the optimal object outliers to automatically and seamlessly splice them into the final key frame, which serves as the video summary. Previous video summarization methods typically select one or more frames from the original video as the summary. However, these existing key frame representation approaches eliminate the time axis and lose the dynamic aspect of the video scene. The proposed video summary is compact yet preserves considerably richer information than previous video summaries. Experimental results indicate that the proposed key frame representation not only includes abundant semantics but also looks natural, which satisfies user preferences.
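The shot segmentation step relies on mutual information between DC images. As a rough illustration, the Python sketch below computes mutual information between two equally sized DC images from their quantized joint histogram. It assumes each image is a flat sequence of values scaled to [0, 256); the function name, the 8-bin quantization, and that scaling are assumptions, not details from the paper.

```python
import math
from collections import Counter
from typing import Sequence


def mutual_information(a: Sequence[int], b: Sequence[int],
                       bins: int = 8, max_value: int = 256) -> float:
    """Mutual information in bits between two flat DC images.

    Values are assumed to lie in [0, max_value) and are quantized into
    `bins` equal-width bins before the joint histogram is built.
    """
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    width = max_value / bins
    qa = [min(int(x / width), bins - 1) for x in a]
    qb = [min(int(y / width), bins - 1) for y in b]
    n = len(a)
    joint = Counter(zip(qa, qb))  # joint histogram of co-located values
    pa = Counter(qa)              # marginal histogram of image a
    pb = Counter(qb)              # marginal histogram of image b
    mi = 0.0
    for (i, j), c in joint.items():
        p_ij = c / n
        # p_ij / (p_i * p_j), written with counts to avoid extra divisions
        mi += p_ij * math.log2(p_ij * n * n / (pa[i] * pb[j]))
    return mi
```

Consecutive I-frames within one shot should share much of their structure and give high mutual information, while a sharp drop between two I-frames is a natural candidate for a shot boundary. The drop threshold would have to be chosen empirically.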

8.

An efficient video cut detection algorithm (total citations: 4, self-citations: 0, citations by others: 4)

Lu Haibin, Zhang Yujin 《Journal of Image and Graphics (中国图象图形学报)》1999,4(10):805-810

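This paper presents an algorithm that detects video cuts in real time in both the compressed and the uncompressed domain. The algorithm uses two windows, one large and one small: a global threshold over the large window extracts candidate cut positions, and then, within a small window centered on each candidate, two-sided and one-sided tests are combined to locate the true cuts. The algorithm effectively avoids the false detections and missed detections caused by violent camera and object motion. In experiments on five video sequences it achieved 100% recall and 96% precision.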
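The two-window idea can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the authors' algorithm: the global threshold is taken as mean plus `k` standard deviations over each large window, only a two-sided dominance test is applied in the small window (the one-sided test mentioned in the abstract is omitted), and all parameter values are hypothetical.

```python
from statistics import mean, pstdev
from typing import List


def detect_cuts_dual_window(diffs: List[float], large: int = 20,
                            small: int = 2, k: float = 2.0,
                            ratio: float = 2.0) -> List[int]:
    """Find cut positions in a sequence of inter-frame differences.

    diffs[i] is the difference between frame i and frame i + 1.
    """
    cuts = []
    for start in range(0, len(diffs), large):
        window = diffs[start:start + large]
        # Large window: a global threshold selects candidate positions.
        global_thr = mean(window) + k * pstdev(window)
        for offset, d in enumerate(window):
            i = start + offset
            if d <= global_thr:
                continue  # not a candidate
            # Small window: neighbours on both sides of the candidate.
            left = diffs[max(0, i - small):i]
            right = diffs[i + 1:i + 1 + small]
            # Two-sided test: the candidate must dominate every neighbour.
            if all(d >= ratio * n for n in left + right):
                cuts.append(i)
    return cuts
```

The local test is what suppresses false alarms from strong motion: a burst of several large differences in a row passes the global threshold but fails the dominance check, whereas an isolated spike passes both.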

9.

On fast microscopic browsing of MPEG-compressed video (total citations: 1, self-citations: 0, citations by others: 1)

Boon-Lock Yeo 《Multimedia Systems》1999,7(4):269-281

MPEG has been established as a compression standard for efficient storage and transmission of digital video. However, users are limited to VCR-like (and tedious) functionalities when viewing MPEG video. The usefulness of MPEG video is presently limited by the lack of tools available for fast browsing, manipulation and processing of MPEG video. In this paper, we first address the problem of rapid access to individual shots and frames in MPEG video. We build upon the compressed-video-processing framework proposed in [1, 8], and propose new and fast algorithms based on an adaptive mixture of approximation techniques for extracting spatially reduced image sequences of uniform quality from MPEG video across different frame types and also under different motion activities in the scenes. The algorithms execute faster than real time on a Pentium personal computer. We demonstrate how the reduced images facilitate fast and convenient shot- and frame-level video browsing and access, shot-level editing and annotation, without the need for frequent decompression of MPEG video. We further propose methods for reducing the auxiliary data size associated with the reduced images through exploitation of spatial and temporal redundancy. We also address how the reduced images lead to computationally efficient algorithms for video analysis based on intra- and inter-shot processing for video database and browsing applications. The algorithms, tools for browsing and techniques for video processing presented in this paper have been used by many in IBM Research on more than 30 h of MPEG-1 video for video browsing and analysis.

10.

A hierarchical method for generating movie video summaries (total citations: 1, self-citations: 0, citations by others: 1)

Zhao Yaqin, Zhou Xianzhong, He Xin 《Journal of Image and Graphics (中国图象图形学报)》2007,12(8):1412-1417

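Organizing video data sensibly is important for content-based video analysis and retrieval. This paper proposes a method for generating movie video summaries based on a motion attention model. First, a clustering algorithm based on a sliding shot window groups similar shots into shot classes. Then, following the way scene content develops in movies, three temporal relations between two shot classes are defined, and on that basis a scene detection method is proposed that uses the spatio-temporal constraints between shot classes. Finally, the motion attention model is used to select the important shots and representative frames in each scene, and a hierarchical video summary (scene level and shot level) is built from the set of selected representative frames and the set of key frames of the important shots. The method covers the video content fairly comprehensively while highlighting its important parts, and it is well suited to fast browsing and retrieval of movie videos.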

11.

Techniques used and open challenges to the analysis, indexing and retrieval of digital video

Alan F. Smeaton 《Information Systems》2007

Video in digital format is now commonplace and widespread, both in professional use and in domestic consumer products from camcorders to mobile phones. Video content is growing in volume and while we can capture, compress, store, transmit and display video with great facility, editing videos and manipulating them based on their content is still a non-trivial activity. In this paper, we give a brief review of the state of the art of video analysis, indexing and retrieval and we point to research directions which we think are promising and could make searching and browsing of video archives based on video content as easy as searching and browsing (text) web pages. We conclude the paper with a list of grand challenges for researchers working in the area.

12.

Automatic text segmentation and text recognition for video indexing (total citations: 13, self-citations: 0, citations by others: 13)

Rainer Lienhart Wolfgang Effelsberg 《Multimedia Systems》2000,8(1):69-81

Efficient indexing and retrieval of digital video is an important function of video databases. One powerful index for retrieval is the text appearing in them. It enables content-based browsing. We present our new methods for automatic segmentation of text in digital videos. The algorithms we propose make use of typical characteristics of text in videos in order to enable and enhance segmentation performance. The unique features of our approach are the tracking of characters and words over their complete duration of occurrence in a video and the integration of the multiple bitmaps of a character over time into a single bitmap. The output of the text segmentation step is then directly passed to a standard OCR software package in order to translate the segmented text into ASCII. Also, a straightforward indexing and retrieval scheme is introduced. It is used in the experiments to demonstrate that the proposed text segmentation algorithms together with existing text recognition algorithms are suitable for indexing and retrieval of relevant video sequences in and from a video database. Our experimental results are very encouraging and suggest that these algorithms can be used in video retrieval applications as well as to recognize higher level semantics in videos.

13.

A comparison of AR full motion video traffic models in B-ISDN

A. Alheraish 《Computers & Electrical Engineering》2005,31(1):1-22

Full motion video traffic is envisaged to be a major source for Internet and broadband integrated services digital networks (B-ISDN). Accurate traffic models of full motion video are needed to design networks and improve video services. The auto-regressive (AR) process has proved to be a viable approach to modeling full motion video. A considerable amount of work on AR video modeling has been reported in recent studies, and it needs to be thoroughly investigated. The aims of this paper are: (1) to survey a number of AR models for full motion video; (2) to classify the models according to their properties and framework; (3) to compare and contrast the models based on their attributes: residual, coding scheme, capturing scene changes, number of parameters, level of modeling, and complexity; (4) to show the ability of these models to predict accurately different aspects of network performance; (5) to give recommendations that might be helpful in determining the appropriate model for full motion video based on the target application; (6) to give direction for future work on this important modeling scheme.

14.

Particle Video: Long-Range Motion Estimation Using Point Trajectories (total citations: 5, self-citations: 0, citations by others: 5)

Peter Sand Seth Teller 《International Journal of Computer Vision》2008,80(1):72-91

This paper describes a new approach to motion estimation in video. We represent video motion using a set of particles. Each particle is an image point sample with a long-duration trajectory and other properties. To optimize particle trajectories we measure appearance consistency along the particle trajectories and distortion between the particles. The resulting motion representation is useful for a variety of applications and cannot be directly obtained using existing methods such as optical flow or feature tracking. We demonstrate the algorithm on challenging real-world videos that include complex scene geometry, multiple types of occlusion, regions with low texture, and non-rigid deformations.

15.

Implementation of a DVB-based video key-frame browsing system

Zheng Peng, Xue Haifeng 《Journal of Image and Graphics (中国图象图形学报)》2003,8(12):1462-1466

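With the development of computer and television technology, the number of digital video programs keeps growing. To let viewers grasp the content of a video program quickly, this paper proposes a video key-frame browsing system based on digital video broadcasting (DVB). The system first uses the compression parameters available in MPEG-compressed video to partition the video sequence directly into shot-based segments. It then takes the first I-frame of each shot as the key frame and reconstructs its DC image. Finally, following the DVB standard, it extends the SI tables to implement a data structure that encapsulates these key frames, so that a television station can transmit them and receivers can receive them. The paper gives a schematic of the front-end system structure and an example of fast browsing based on key-frame DC images. Because the system uses the compression parameters directly, it reduces decompression overhead and offers low computational cost and fast browsing.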

16.

VSUMM: A mechanism designed to produce static video summaries and a novel evaluation method (total citations: 2, self-citations: 0, citations by others: 2)

Sandra Eliza Fontes de Avila Ana Paula Brandão Lopes Antonio da Luz Jr. Arnaldo de Albuquerque Araújo 《Pattern Recognition Letters》2011,32(1):56-68

The fast evolution of digital video has brought many new multimedia applications and, as a consequence, has increased the amount of research into new technologies that aim at improving the effectiveness and efficiency of video acquisition, archiving, cataloging and indexing, as well as increasing the usability of stored videos. Among possible research areas, video summarization is an important topic that potentially enables faster browsing of large video collections and also more efficient content indexing and access. Essentially, this research area consists of automatically generating a short summary of a video, which can either be a static summary or a dynamic summary. In this paper, we present VSUMM, a methodology for the production of static video summaries. The method is based on color feature extraction from video frames and the k-means clustering algorithm. As an additional contribution, we also develop a novel approach for the evaluation of static video summaries. In this evaluation methodology, video summaries are manually created by users. Then, several user-created summaries are compared both to our approach and also to a number of different techniques in the literature. Experimental results show, with a confidence level of 98%, that the proposed solution provided static video summaries with superior quality relative to the approaches to which it was compared.
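The general recipe of clustering per-frame color features with k-means and keeping one frame per cluster can be sketched in plain Python as below. This is a generic illustration, not the VSUMM implementation: feature extraction is left out (each frame is assumed to arrive as a numeric vector such as a normalized color histogram), and the function name, the evenly spaced initialization, and the fixed iteration count are assumptions.

```python
from typing import List, Sequence

Vector = Sequence[float]


def _dist2(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_keyframes(features: List[Vector], k: int,
                     iterations: int = 20) -> List[int]:
    """Cluster frames with k-means and return one key-frame index per
    non-empty cluster: the frame closest to that cluster's centroid."""
    if not 0 < k <= len(features):
        raise ValueError("k must be between 1 and the number of frames")
    # Deterministic initialization from evenly spaced frames (an assumption).
    step = len(features) / k
    centroids = [list(features[int(i * step)]) for i in range(k)]
    assignment = [0] * len(features)
    for _ in range(iterations):
        # Assignment step: each frame joins its nearest centroid.
        assignment = [min(range(k), key=lambda c: _dist2(f, centroids[c]))
                      for f in features]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [f for f, a in zip(features, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    keyframes = []
    for c in range(k):
        idx = [i for i, a in enumerate(assignment) if a == c]
        if idx:
            keyframes.append(min(idx, key=lambda i: _dist2(features[i], centroids[c])))
    return sorted(keyframes)
```

Returning the indices in temporal order keeps the static summary readable as a storyboard. Choosing `k`, which fixes the summary length, is left to the caller in this sketch.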

17.

Keyframe-based video summarization using Delaunay clustering (total citations: 1, self-citations: 0, citations by others: 1)

Padmavathi Mundur Yong Rao Yelena Yesha 《International Journal on Digital Libraries》2006,6(2):219-232

Recent advances in technology have made tremendous amounts of multimedia information available to the general population. An efficient way of dealing with this new development is to develop browsing tools that distill multimedia data as information-oriented summaries. Such an approach will not only suit resource-poor environments such as wireless and mobile, but also enhance browsing on the wired side for applications like digital libraries and repositories. Automatic summarization and indexing techniques will give users an opportunity to browse and select multimedia documents of their choice for complete viewing later. In this paper, we present a technique by which we can automatically gather the frames of interest in a video for purposes of summarization. Our proposed technique is based on using Delaunay Triangulation for clustering the frames in videos. We represent the frame contents as multi-dimensional point data and use Delaunay Triangulation for clustering them. We propose a novel video summarization technique by using Delaunay clusters that generates good quality summaries with fewer frames and less redundancy when compared to other schemes. In contrast to many of the other clustering techniques, the Delaunay clustering algorithm is fully automatic with no user-specified parameters and is well suited for batch processing. We demonstrate these and other desirable properties of the proposed algorithm by testing it on a collection of videos from the Open Video Project. We provide a meaningful comparison between results of the proposed summarization technique with Open Video storyboard and K-means clustering. We evaluate the results in terms of metrics that measure the content representational value of the proposed technique.

18.

Movie scene segmentation using background information

Liang-Hua Chen Yu-Chun Lai Hong-Yuan Mark Liao 《Pattern Recognition》2008,41(3):1056-1065

Scene extraction is the first step toward semantic understanding of a video. It also provides improved browsing and retrieval facilities to users of video databases. This paper presents an effective approach to movie scene extraction based on the analysis of background images. Our approach exploits the fact that shots belonging to one particular scene often have similar backgrounds. Although part of the video frame is covered by foreground objects, the background scene can still be reconstructed by a mosaic technique. The proposed scene extraction algorithm consists of two main components: determination of the shot similarity measure and a shot grouping process. In our approach, several low-level visual features are integrated to compute the similarity measure between two shots, while the rules of film-making are used to guide the shot grouping process. Experimental results show that our approach is promising and outperforms some existing techniques.

19.

Constructing table-of-content for videos (total citations: 15, self-citations: 0, citations by others: 15)

Yong Rui Thomas S. Huang Sharad Mehrotra 《Multimedia Systems》1999,7(5):359-368

A fundamental task in video analysis is to extract structures from the video to facilitate user's access (browsing and retrieval). Motivated by the important role that the table of content (ToC) plays in a book, in this paper, we introduce the concept of ToC in the video domain. Some existing approaches implicitly use the ToC, but are mainly limited to low-level entities (e.g., shots and key frames). The drawbacks are that low-level structures (1) contain too many entries to be efficiently presented to the user; and (2) do not capture the underlying semantic structure of the video based on which the user may wish to browse/retrieve. To address these limitations, in this paper, we present an effective semantic-level ToC construction technique based on intelligent unsupervised clustering. It has the characteristics of better modeling the time locality and scene structure. Experiments based on real-world movie videos validate the effectiveness of the proposed approach. Examples are given to demonstrate the usage of the scene-based ToC in facilitating user's access to the video.

20.

Combining 3D flow fields with silhouette-based human motion capture for immersive video

Christian Theobalt Joel Carranza Marcus A. Magnor Hans-Peter Seidel 《Graphical Models》2004,66(6):333-351

In recent years, the convergence of computer vision and computer graphics has put forth a new field of research that focuses on the reconstruction of real-world scenes from video streams. To make immersive 3D video a reality, the whole pipeline spanning from scene acquisition over 3D video reconstruction to real-time rendering needs to be researched. In this paper, we describe the latest advancements of our system to record, reconstruct and render free-viewpoint videos of human actors. We apply a silhouette-based non-intrusive motion capture algorithm making use of a 3D human body model to estimate the actor’s parameters of motion from multi-view video streams. A renderer plays back the acquired motion sequence in real-time from any arbitrary perspective. Photo-realistic physical appearance of the moving actor is obtained by generating time-varying multi-view textures from video. This work shows how the motion capture sub-system can be enhanced by incorporating texture information from the input video streams into the tracking process. 3D motion fields are reconstructed from optical flow and used in combination with silhouette matching to estimate pose parameters. We demonstrate that a high visual quality can be achieved with the proposed approach and validate the enhancements caused by the motion field step.