Similar literature (20 results)
1.
Video indexing requires the efficient segmentation of video into scenes. The video is first segmented into shots and a set of key-frames is extracted for each shot. Typical scene detection algorithms incorporate time distance in a shot similarity metric. In the method we propose, to overcome the difficulty of having prior knowledge of the scene duration, the shots are clustered into groups based only on their visual similarity and a label is assigned to each shot according to the group that it belongs to. Then, a sequence alignment algorithm is applied to detect when the pattern of shot labels changes, providing the final scene segmentation result. In this way shot similarity is computed based only on visual features, while ordering of shots is taken into account during sequence alignment. To cluster the shots into groups we propose an improved spectral clustering method that both estimates the number of clusters and employs the fast global k-means algorithm in the clustering stage after the eigenvector computation of the similarity matrix. The same spectral clustering method is applied to extract the key-frames of each shot and numerical experiments indicate that the content of each shot is efficiently summarized using the method we propose herein. Experiments on TV-series and movies also indicate that the proposed scene detection method accurately detects most of the scene boundaries while preserving a good tradeoff between recall and precision.

2.
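A survey of shot segmentation methods in video retrieval   Total citations: 22, self-citations: 0, citations by others: 22
Shot segmentation of video sequences, also called shot change detection, is one of the key techniques in video retrieval. A shot change is a change of scene content in the video sequence. This paper reviews the shot segmentation methods in common use, including gray-level segmentation, edge-based segmentation, color histogram segmentation, segmentation of MPEG video, block-matching shot segmentation, statistical-decision shot segmentation, clustering-based shot segmentation, and the detection of gradual shot transitions. It identifies the main current research directions as the representation of scene content, feature extraction methods, the detection scale of features, and robust, reliable shot segmentation methods for practical use.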
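
As an illustration of the histogram-based family of shot segmentation methods listed in the survey, the following is a minimal sketch of hard-cut detection by comparing gray-level histograms of consecutive frames. It is not taken from the surveyed paper. The frame representation (a flat sequence of 8-bit gray levels), the function names, the bin count, and the threshold are all assumptions made for this example, and gradual transitions are not handled.

```python
from typing import List, Sequence

# Hypothetical frame representation: a flat sequence of 8-bit gray levels.
Frame = Sequence[int]


def gray_histogram(frame: Frame, bins: int = 16) -> List[float]:
    """Return the normalized gray-level histogram of one frame."""
    hist = [0.0] * bins
    for pixel in frame:
        # Map a gray level in 0..255 to one of `bins` equal-width bins.
        hist[min(pixel * bins // 256, bins - 1)] += 1.0
    total = float(len(frame)) or 1.0  # avoid division by zero on empty frames
    return [count / total for count in hist]


def histogram_distance(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Half the L1 distance between two normalized histograms, in [0, 1]."""
    return 0.5 * sum(abs(a - b) for a, b in zip(h1, h2))


def detect_cuts(frames: Sequence[Frame], threshold: float = 0.5,
                bins: int = 16) -> List[int]:
    """Return every index i where a cut is declared between frames i-1 and i."""
    hists = [gray_histogram(frame, bins) for frame in frames]
    return [i for i in range(1, len(hists))
            if histogram_distance(hists[i - 1], hists[i]) > threshold]
```

A fixed global threshold is the simplest choice and is sensitive to lighting changes and fast motion; the survey's remark about robust, reliable methods for practical use points at exactly this weakness.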

3.
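A video stream description method based on temporal structure graphs   Total citations: 1, self-citations: 0, citations by others: 1
Decomposing a video stream yields a representation based on a set of key frames, but this representation does not reflect the story development relationships hidden in the stream. To reveal them, a fast clustering algorithm for video streams is proposed that performs correlation analysis on the decomposed units of the stream. By detecting the similarity and continuity between video shots, the algorithm merges shots from the same camera into the same video class, and from this obtains [...]; it also offers a new approach to fast browsing and retrieval of video streams.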

4.
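Research on shot-based video scene construction   Total citations: 3, self-citations: 0, citations by others: 3
Because the granularity of its content is too small, retrieval at the shot level cannot meet the needs of video content use. A scene is a video content structure unit one level above the shot, and can to some extent alleviate the problem of overly fine shot granularity. A "scene" is a set of shots that contain similar objects or similar backgrounds. This paper proposes an approach to constructing video scenes from shots, comprising three stages: shot boundary detection, shot feature extraction, and shot clustering.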

5.
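A video summary is a compressed representation of video content. To support better video browsing, the idea of building video summaries at two levels (shot level and scene level) according to the granularity of browsing or retrieval is proposed, together with a method for generating them. First, key frames are extracted within each shot by a method that selects them automatically according to content change; next, an improved time-adaptive algorithm groups shots into scenes; finally, representative frames are extracted at the scene level with a minimum spanning tree method. Because key frames and representative frames capture the main content of their shots and scenes respectively, their sequences constitute the video summary. Experiments on several movie clips show that the method provides useful summaries of video content at both coarse and fine granularity.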

6.
Shot clustering techniques for story browsing   Total citations: 1, self-citations: 0, citations by others: 1
Automatic video segmentation is the first and necessary step for organizing a long video file into several smaller units. The smallest basic unit is a shot. Relevant shots are typically grouped into a high-level unit called a scene. Each scene is part of a story. Browsing these scenes unfolds the entire story of a film, enabling users to locate their desired video segments quickly and efficiently. Existing scene definitions are rather broad, making it difficult to compare the performance of existing techniques and to develop a better one. This paper introduces a stricter scene definition for narrative films and presents ShotWeave, a novel technique for clustering relevant shots into a scene using the stricter definition. The crux of ShotWeave is its feature extraction and comparison. Visual features are extracted from selected regions of representative frames of shots. These regions capture essential information needed to maintain viewers' thought in the presence of shot breaks. The new feature comparison is developed based on common continuity-editing techniques used in film making. Experiments were performed on full-length films with a wide range of camera motions and a complex composition of shots. The experimental results show that ShotWeave outperforms two recent techniques utilizing global visual features in terms of segmentation accuracy and time.

7.
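A video scene detection method based on mean shift   Total citations: 1, self-citations: 1, citations by others: 0
An efficient video scene detection method is proposed. First, shots within a sliding shot window are clustered using mean shift, and the corresponding cluster centers are obtained. Next, the temporal distance between two shot classes is computed according to the development patterns of movie scenes. Scenes are then detected on the basis of spatio-temporal relations, and scene key frames are obtained from the corresponding cluster centers. Finally, over-segmented scenes are post-processed. Experiments confirm that the method clusters quickly and detects scenes and scene key frames effectively.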

8.
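An effective video scene detection method   Total citations: 3, self-citations: 2, citations by others: 3
Organizing video data sensibly is important for content-based video analysis and applications. Existing shot-based video analysis methods cannot reflect the semantic relationships in a video because the granularity of shot information is too small, so video content needs to be organized by a higher-level semantic unit, the scene. A fast and effective video scene detection method is proposed. Based on the principles of film editing, the development patterns of scene content are classified and principles for scene construction are given. A new grouping method based on a sliding shot window organizes shots with similar content into shot classes. A shot-class correlation function is defined to measure the correlation between shot classes and to generate the scenes. Experimental results demonstrate that the method is fast and effective.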
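
The abstract above does not give the details of its sliding-shot-window grouping, so the following is only a rough sketch of the general idea under stated assumptions: each shot is represented by a normalized histogram, similarity is histogram intersection, and a shot joins the class of its most similar predecessor inside the window if the similarity reaches a threshold. The function names, the window size, and the threshold are hypothetical, and the paper's shot-class correlation function is not reproduced.

```python
from typing import List, Optional, Sequence


def histogram_intersection(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Similarity of two normalized histograms, in [0, 1]."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def group_shots(shot_features: Sequence[Sequence[float]],
                window: int = 3, threshold: float = 0.7) -> List[int]:
    """Assign a shot-class label to every shot, in temporal order.

    Each shot is compared only with the `window` shots that precede it.
    It takes the label of the most similar one if the similarity is at
    least `threshold`; otherwise it starts a new shot class.
    """
    labels: List[int] = []
    next_label = 0
    for i, feature in enumerate(shot_features):
        best_label: Optional[int] = None
        best_similarity = threshold
        for j in range(max(0, i - window), i):
            similarity = histogram_intersection(feature, shot_features[j])
            if similarity >= best_similarity:
                best_label, best_similarity = labels[j], similarity
        if best_label is None:
            best_label = next_label
            next_label += 1
        labels.append(best_label)
    return labels
```

With an alternating dialogue pattern such as A, B, A, B and a window of at least 2, the sketch yields two interleaved shot classes; with a window of 1 every shot starts its own class, which shows why the window must span the typical editing pattern.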

9.
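A hierarchical method for generating movie video summaries   Total citations: 1, self-citations: 0, citations by others: 1
Organizing video data sensibly is important for content-based video analysis and retrieval. A movie video summary generation method based on a motion attention model is proposed. First, a clustering algorithm based on a sliding shot window organizes similar shots into shot classes. Then, following the development patterns of movie scene content and building on a definition of three temporal relations between two shot classes, a scene detection method based on the spatio-temporal constraints between shot classes is proposed. Finally, the motion attention model is used to select the important shots and representative frames in each scene, and a hierarchical video summary (scene level and shot level) is built from the set of selected representative frames and the set of key frames of the important shots. The method covers the video content fairly comprehensively while highlighting its important parts, and is well suited to fast browsing and retrieval of movie videos.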

10.
In this paper we introduce VideoGraph, a novel non-linear representation for the scene structure of a video. Unlike classical linear sequential organization, VideoGraph concentrates the video content across the time line by structuring scenes and materializes it as a two-dimensional graph, which enables non-linear exploration of the scenes and their transitions. To construct VideoGraph, we adopt a sub-shot induced method to evaluate the spatio-temporal similarity between shot segments of video. Then, scene structure is derived by grouping similar shots and identifying the valid transitions between scenes. The final stage is to represent the scene structure using a graph with respect to scene transition topology. Our VideoGraph can provide a condensed representation at the scene level and facilitate non-linear browsing of videos. Experimental results are presented to demonstrate the effectiveness and efficiency of using VideoGraph to explore and access the video content.

11.
Recognizing scene information in images or videos, such as locating the objects and answering "Where am I?", has attracted much attention in the computer vision research field. Many existing scene recognition methods focus on static images, and cannot achieve satisfactory results on videos, which contain more complex scene features than images. In this paper, we propose a robust movie scene recognition approach based on panoramic frames and representative feature patches. More specifically, the movie is first efficiently segmented into video shots and scenes. Secondly, we introduce a novel key-frame extraction method using panoramic frames, and a local feature extraction process is applied to get the representative feature patches (RFPs) in each video shot. Thirdly, a Latent Dirichlet Allocation (LDA) based recognition model is trained to recognize the scene within each individual video scene clip. The correlations between video clips are considered to enhance the recognition performance. When our proposed approach is applied to recognize the scenes in realistic movies, the experimental results show that it can achieve satisfactory performance.

12.
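Mining the hierarchical structure of video   Total citations: 3, self-citations: 0, citations by others: 3
The key to video processing is structuring video information. The basic structure of video is a hierarchy of frames, shots, scenes, and video programs. A simple framework for mining this hierarchy consists of shot segmentation, shot feature extraction, and video scene construction. Building on shot segmentation, this paper proposes two scene construction methods, multi-feature shot cluster analysis and shot-based scene boundary detection, and thereby mines the hierarchical structure of video. Experiments show that shot-based scene boundary detection outperforms multi-feature shot cluster analysis.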

13.
Scene extraction is the first step toward semantic understanding of a video. It also provides improved browsing and retrieval facilities to users of video databases. This paper presents an effective approach to movie scene extraction based on the analysis of background images. Our approach exploits the fact that shots belonging to one particular scene often have similar backgrounds. Although part of the video frame is covered by foreground objects, the background scene can still be reconstructed by a mosaic technique. The proposed scene extraction algorithm consists of two main components: determination of the shot similarity measure and a shot grouping process. In our approach, several low-level visual features are integrated to compute the similarity measure between two shots. On the other hand, the rules of film-making are used to guide the shot grouping process. Experimental results show that our approach is promising and outperforms some existing techniques.

14.
Spatio-temporal segmentation based on region merging   Total citations: 2, self-citations: 0, citations by others: 2
This paper proposes a technique for spatio-temporal segmentation to identify the objects present in the scene represented in a video sequence. This technique processes two consecutive frames at a time. A region-merging approach is used to identify the objects in the scene. Starting from an oversegmentation of the current frame, the objects are formed by iteratively merging regions together. Regions are merged based on their mutual spatio-temporal similarity. We propose a modified Kolmogorov-Smirnov test for estimating the temporal similarity. The region-merging process is based on a weighted, directed graph. Two complementary graph-based clustering rules are proposed, namely, the strong rule and the weak rule. These rules take advantage of the natural structures present in the graph. Experimental results on different types of scenes demonstrate the ability of the proposed technique to automatically partition the scene into its constituent objects.
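
The abstract does not describe its modification of the Kolmogorov-Smirnov test, so the sketch below shows only the standard two-sample KS statistic as background: the largest gap between the empirical distribution functions of two samples. Treating the samples as values drawn from two regions (for example, intensity differences between the two frames) is an assumption made for this example, and the function name is hypothetical.

```python
from typing import List, Sequence


def ks_statistic(sample_a: Sequence[float], sample_b: Sequence[float]) -> float:
    """Standard two-sample Kolmogorov-Smirnov statistic.

    Returns the maximum absolute difference between the empirical
    cumulative distribution functions of the two samples. Values near 0
    suggest similar distributions; values near 1 suggest different ones.
    """
    a: List[float] = sorted(sample_a)
    b: List[float] = sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    i = j = 0
    largest_gap = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both pointers past every value <= x, so that i/len(a)
        # and j/len(b) are the two empirical CDFs evaluated at x.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        largest_gap = max(largest_gap, abs(i / len(a) - j / len(b)))
    return largest_gap
```

In a region-merging setting, two neighboring regions with a small statistic would be candidates for merging; how the statistic is thresholded or combined with spatial similarity is left open here because the abstract does not specify it.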

15.
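Video clustering is an important part of video indexing and retrieval. This paper addresses how to extract scenes at a higher semantic level from video that has already been segmented into shots. It uses a combined similarity measure between frame images that joins a local likelihood-ratio feature computed on frame blocks with a global edge feature from the wavelet transform, and, drawing on a common feature of video editing and on a principle for selecting representative shots, presents a new algorithm for extracting semantic scenes. Numerical experiments show that the algorithm extracts scenes well from dialogue-based video; compared with the WBS (Window-based Sweep Algorithm) algorithm, recall and precision improve by 8.7% and 28.4%, respectively.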
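
Recall and precision figures such as those quoted above are commonly computed by matching detected scene boundaries against ground-truth boundaries. The following is a minimal sketch of one such computation, not the evaluation protocol of the paper: boundaries are assumed to be shot indices, a detection counts as correct if it lies within a tolerance of an unmatched true boundary, and matching is greedy. The function name and the tolerance parameter are hypothetical.

```python
from typing import Sequence, Tuple


def boundary_scores(detected: Sequence[int], truth: Sequence[int],
                    tolerance: int = 0) -> Tuple[float, float]:
    """Return (recall, precision) for detected scene boundaries.

    Each true boundary can be matched by at most one detection, and a
    detection matches the first unmatched true boundary (in ascending
    order) that lies within `tolerance` shots of it.
    """
    detected_set = sorted(set(detected))
    unmatched = sorted(set(truth))
    true_count = len(unmatched)
    hits = 0
    for d in detected_set:
        match = next((t for t in unmatched if abs(t - d) <= tolerance), None)
        if match is not None:
            unmatched.remove(match)
            hits += 1
    recall = hits / true_count if true_count else 0.0
    precision = hits / len(detected_set) if detected_set else 0.0
    return recall, precision
```

For example, with detections at shots 5, 12, and 30, true boundaries at 5, 13, 20, and 31, and a tolerance of one shot, three of the four true boundaries are found and all three detections are correct.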

16.
17.
In video processing, a common first step is to segment the videos into physical units, generally called shots. A shot is a video segment that consists of one continuous action. In general, these physical units need to be clustered to form more semantically significant units, such as scenes, sequences, programs, etc. This is the so-called story-based video structuring. Automatic video structuring is of great importance for video browsing and retrieval. The shots or scenes are usually described by one or several representative frames, called key-frames. Viewed from a higher level, key frames of some shots might be redundant in terms of semantics. In this paper, we propose automatic solutions to the problems of: (i) video partitioning, (ii) key frame computing, (iii) key frame pruning. For the first problem, an algorithm called “net comparison” is devised. It is accurate and fast because it uses both statistical and spatial information in an image and does not have to process the entire image. For the last two problems, we develop an original image similarity criterion, which considers both spatial layout and detail content in an image. For this purpose, coefficients of wavelet decomposition are used to derive parameter vectors accounting for the above two aspects. The parameters exhibit (quasi-) invariant properties, thus making the algorithm robust for many types of object/camera motions and scaling variances. The novel “seek and spread” strategy used in key frame computing allows us to obtain a large representative range for the key frames. Inter-shot redundancy of the key-frames is suppressed using the same image similarity measure. Experimental results demonstrate the effectiveness and efficiency of our techniques.

18.
19.
Grouping video content into semantic segments and classifying semantic scenes into different types are crucial processes for content-based video organization, management and retrieval. In this paper, a novel approach to automatically segmenting scenes and semantically representing them is proposed. Firstly, video shots are detected using a rough-to-fine algorithm. Secondly, key-frames within each shot are selected adaptively with hybrid features, and redundant key-frames are removed by template matching. Thirdly, spatio-temporally coherent shots are clustered into the same scene based on the temporal constraint of video content and the visual similarity between shot activities. Finally, based on a full analysis of the typical characteristics of continuously recorded videos, scene content is semantically represented to satisfy human demands on video retrieval. The proposed algorithm has been tested on various genres of films and TV programs. Promising experimental results show that the proposed method supports efficient retrieval of interesting video content.
Yuncai Liu

20.
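Content-based shot query techniques and implementation for DVD movies   Total citations: 2, self-citations: 1, citations by others: 1
Content-based retrieval, as distinct from text-based retrieval, is widely used in multimedia systems, especially in image and video databases. As DVD technology develops, a single disc holds more and more information, and finding what the user needs becomes increasingly difficult, particularly for movies with long story lines or for large-capacity DVDs that hold several movies; content-based retrieval of such material is therefore essential. This paper proposes a content-based retrieval method for DVD-VIDEO discs. Shot detection, representative frame extraction, and clustering of similar shots are used to build a scene browsing map of the movie, and the transitions between consecutive shots or scenes are linked with navigation keys to support shot lookup, making it faster and more convenient for users to retrieve and find relevant shots in a movie.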
