Similar Documents
20 similar documents found; search time: 31 ms
1.
We analyze the autocorrelation structure for a class of scene-based MPEG video models at the groups-of-pictures (GOP) (coarse grain) and frame (fine grain) levels assuming an arbitrary scene-length distribution. At the GOP level, we establish the relationship between the scene-length statistics and the short-range/long-range dependence (SRD/LRD) of the underlying model. We formally show that when the intrascene dynamics exhibit SRD, the overall model exhibits LRD if and only if the second moment of the scene length is infinite. Our results provide the theoretical foundation for several empirically derived scene-based models. We then study the impact of traffic correlations on the packet loss performance at a video buffer. Two popular families of scene-length distributions are investigated: Pareto and Weibull. In the case of Pareto distributed scene lengths, it is observed that the performance is rather insensitive to changes in the buffer size even as the video model enters the SRD regime. For Weibull distributed scene lengths, we observe that for small buffers the loss performance under a frame-level model can be larger than its GOP-level counterpart by orders of magnitude. In this case, reliance on GOP-level models will yield very optimistic results.
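The LRD criterion above can be illustrated with a small simulation: Pareto scene lengths with shape alpha <= 2 have an infinite second moment, which is exactly the regime in which the result predicts LRD. The sketch below draws Pareto scene lengths by inverse-CDF sampling and builds a piecewise-constant GOP-level trace; the scene-mean statistics (100 and 20, in hypothetical kbit units) are illustrative, not taken from the paper.

```python
import random

def pareto_scene_lengths(alpha, n, seed=0):
    """Draw n scene lengths (in GOPs) from a Pareto distribution with
    shape alpha via inverse-CDF sampling.  For alpha <= 2 the second
    moment is infinite -- the LRD regime identified above."""
    rng = random.Random(seed)
    return [max(1, round((1.0 - rng.random()) ** (-1.0 / alpha))) for _ in range(n)]

def gop_trace(scene_lengths, seed=0):
    """Piecewise-constant GOP-level trace: each scene holds one mean GOP
    size for its whole duration; the intrascene SRD dynamics are
    deliberately omitted in this coarse-grain sketch."""
    rng = random.Random(seed)
    trace = []
    for length in scene_lengths:
        level = rng.gauss(100.0, 20.0)  # illustrative scene mean (kbit)
        trace.extend([level] * length)
    return trace
```

With alpha near 1.5 the empirical second moment of the scene lengths keeps growing with the sample size, which is the fingerprint of the infinite-second-moment regime.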

2.
The analysis and mining of traffic video sequences to discover important but previously unknown knowledge such as vehicle identification, traffic flow, queue detection, incident detection, and the spatio-temporal relations of the vehicles at intersections, provide an economic approach for daily traffic monitoring operations. To meet such demands, a multimedia data mining framework is proposed in this paper. The proposed multimedia data mining framework analyzes the traffic video sequences using background subtraction, image/video segmentation, vehicle tracking, and modeling with the multimedia augmented transition network (MATN) model and multimedia input strings, in the domain of traffic monitoring over traffic intersections. The spatio-temporal relationships of the vehicle objects in each frame are discovered and accurately captured and modeled. Such an additional level of sophistication enabled by the proposed multimedia data mining framework in terms of spatio-temporal tracking generates a capability for automation. This capability alone can significantly influence and enhance current data processing and implementation strategies for several problems vis-à-vis traffic operations. Three real-life traffic video sequences obtained from different sources and with different weather conditions are used to illustrate the effectiveness and robustness of the proposed multimedia data mining framework by demonstrating how the proposed framework can be applied to traffic applications to answer the spatio-temporal queries.

3.
For MPEG-4 video sources encoded at both low and high quality, this paper presents a hybrid model that captures frame-size variation on multiple time scales: scene changes and the bit-rate fluctuation within a scene. Scene changes are described by a geometric distribution, and within-scene fluctuation by an AR(2) model; I-frames are modeled with this composite model, while P- and B-frames are modeled with geometric distributions. Finally, the model is compared against several alternatives that ignore scene changes and their queueing performance is analyzed, showing that the composite model captures the characteristics of real video streams well.
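A minimal sketch of the hybrid structure described above: scene durations are geometric (a scene change occurs at each frame with probability p_change), the scene mean jumps at every change, and an AR(2) process models the bit-rate fluctuation around the mean within a scene. All numeric values are illustrative, not fitted parameters from the paper.

```python
import random

def hybrid_iframe_trace(n_frames, p_change=0.05, a1=0.6, a2=0.2, seed=1):
    """Generate a synthetic I-frame size trace: geometric scene changes
    plus AR(2) within-scene fluctuation (a1 + a2 < 1 keeps it stable)."""
    rng = random.Random(seed)
    mean = rng.uniform(80.0, 120.0)   # mean I-frame size of current scene
    x1 = x2 = 0.0                     # AR(2) state (deviation from mean)
    trace = []
    for _ in range(n_frames):
        if rng.random() < p_change:   # geometric scene change
            mean = rng.uniform(80.0, 120.0)
            x1 = x2 = 0.0
        x = a1 * x1 + a2 * x2 + rng.gauss(0.0, 5.0)  # AR(2) step
        x2, x1 = x1, x
        trace.append(mean + x)
    return trace
```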

4.
A Video Traffic Model Based on MPEG-4 FGS (cited 1 time: 1 self-citation, 0 by others)
何伟  王晖 《计算机仿真》2004,21(10):107-110
MPEG-4 Fine Granularity Scalability (FGS) coded video is becoming a major traffic type in video streaming services, so modeling MPEG-4 FGS video traffic is of great importance for network performance simulation and communication network design. This paper first reviews the MPEG-4 FGS coding principle, then analyzes the statistical characteristics of MPEG-4 FGS video traffic, and on this basis proposes a video traffic model for MPEG-4 FGS. Experimental results show that the model fits the original frame-size sequence well and can adapt its rate allocation to dynamic changes in network bandwidth.

5.
We present an approach for MPEG variable bit rate (VBR) video modeling and classification using fuzzy techniques. We demonstrate that a type-2 fuzzy membership function, i.e., a Gaussian MF with uncertain variance, is most appropriate to model the log-value of I/P/B frame sizes in MPEG VBR video. The fuzzy c-means (FCM) method is used to obtain the mean and standard deviation (std) of I/P/B frame sizes when the frame category is unknown. We propose to use type-2 fuzzy logic classifiers (FLCs) to classify video traffic using compressed data. Five fuzzy classifiers and a Bayesian classifier are designed for video traffic classification, and the fuzzy classifiers are compared against the Bayesian classifier. Simulation results show that a type-2 fuzzy classifier in which the input is modeled as a type-2 fuzzy set and antecedent membership functions are modeled as type-2 fuzzy sets performs the best of the five classifiers when the testing video product is not included in the training products and a steepest descent algorithm is used to tune its parameters.
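The FCM step mentioned above can be sketched in one dimension: given log frame sizes with unknown I/P/B labels, fuzzy c-means recovers the per-category centres. This is a generic textbook FCM with a deterministic quantile initialisation, not the paper's exact implementation.

```python
def fcm_1d(data, c=3, m=2.0, iters=50):
    """Minimal 1-D fuzzy c-means: returns the c cluster centres, sorted.
    m is the fuzzifier (m = 2 is the common default)."""
    s = sorted(data)
    # deterministic quantile initialisation of the c centres
    centres = [s[(len(s) * (2 * i + 1)) // (2 * c)] for i in range(c)]
    for _ in range(iters):
        # membership of each point in each cluster
        u = []
        for x in data:
            d = [abs(x - v) + 1e-12 for v in centres]
            u.append([1.0 / sum((d[i] / d[j]) ** (2.0 / (m - 1.0)) for j in range(c))
                      for i in range(c)])
        # update centres as membership-weighted means
        for i in range(c):
            w = [row[i] ** m for row in u]
            centres[i] = sum(wi * x for wi, x in zip(w, data)) / sum(w)
    return sorted(centres)
```

On well-separated log frame sizes the three recovered centres approximate the I-, P-, and B-frame means without any category labels, which is what the classifier design above relies on.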

6.
MPEG VBR Video Streams: Statistical Characteristics and Models (cited 3 times: 0 self-citations, 3 by others)
黄天云  孙世新 《计算机学报》2001,24(9):1002-1008
This paper analyzes existing MPEG VBR video stream models and their shortcomings, and on that basis studies the statistical characteristics of VBR video streams. Experimental results show that by partitioning the whole stream into scene clusters, modeling the transitions between clusters with a Markov-modulated chain, and modeling the independent scenes within each cluster with a TES model at the GOP (Group of Pictures) level rather than the frame level, the approach avoids both an excessively large state space and the periodic frame-level autocorrelation, and thus better fits the first- and second-order statistics of VBR video sequences. Moreover, the GOP distribution of an independent scene can be fitted with a Gamma function, and its autocorrelation function is better fitted by a double-exponential function.
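The Gamma fit mentioned at the end can be sketched with a method-of-moments estimator: matching the sample mean (k·θ) and variance (k·θ²) of a scene's GOP sizes yields the shape and scale directly. This is a generic estimator, not necessarily the fitting procedure used in the paper.

```python
def gamma_mom_fit(samples):
    """Method-of-moments fit of a Gamma(k, theta) distribution to a list
    of GOP sizes.  Returns (shape k, scale theta) so that
    k * theta = sample mean and k * theta**2 = sample variance."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / n
    theta = var / mean          # scale
    k = mean / theta            # shape
    return k, theta
```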

7.
This paper presents a frame-level hybrid framework for modeling MPEG-4 and H.264 multi-layer variable bit rate (VBR) video traffic. To accurately capture long-range-dependent and short-range-dependent properties of VBR sequences, we use wavelets to model the distribution of I-frame sizes and a simple time-domain model for P/B frame sizes. However, unlike previous studies, we analyze and successfully model both inter-GOP (group of pictures) and intra-GOP correlation in VBR video and build an enhancement-layer model using cross-layer correlation. Simulation results demonstrate that our model effectively preserves the temporal burstiness and captures important statistical features (e.g., the autocorrelation function and the frame-size distribution) of original traffic. We also show that our model possesses lower complexity and has better performance than the previous methods in both single- and multi-layer sequences.

8.
This paper proposes a video scene segmentation method that finds frequently occurring shot sets in a video sequence using global scene features and then locates scene boundaries precisely using local semantic features. The video is first segmented into shots with high accuracy and a representative key frame is selected for each shot. Global scene features and local features are then extracted from each key frame, and the key frames are semantically annotated with visual words obtained by clustering the local features. Next, inter-shot correlation is computed from the global scene features and, guided by the concept and properties of a video scene, locally frequent sets of highly correlated shots are found in the key-frame sequence to roughly locate scenes. Finally, the semantic annotations of the key frames are used to locate scene boundaries precisely. Experiments show that the method detects and locates most video scenes accurately and effectively.

9.
《Real》2000,6(5):347-357
In this paper, we present a new approach to modeling variable bit rate (VBR) coded video sources in asynchronous transfer mode (ATM) networks. Unlike the existing methods which model the number of cells generated by the coder for a sequence of video frames, the new approach improves modeling accuracy by considering the characteristics of different cells generated by the coder, and modeling the number of cells in each type of macroblock of a frame separately. The model is tested by comparing the cell loss rate and mean queue size in simulation of an ATM switch, with the same statistics produced when traces generated by the model are used as the source. Comparisons with the existing models are made by measuring both the quality of their predictions for network performance and the quality of service experienced by a user.

10.
Several scene-detection algorithms based only on bit-rate fluctuations have been proposed. All of them rely on fixed thresholds obtained from empirical records of the video characteristics. Because these methods are sensitive to the accuracy of the records, which are generally obtained by repeatedly testing several values, they may perform poorly in actual scene detection, especially for real-time video traffic. In this paper, we review previous work in this area, study the correlation between scene duration and scene change at the frame level, and investigate the local statistical characteristics of scenes such as variance and peak bit rate. Based on this analysis, an effective decision function is first constructed for scene segmentation. Then, we propose a scene-detection algorithm using a dynamic threshold model that captures the statistical properties of scene changes. Experimental results using 15 variable bit rate MPEG video traces indicate good performance of the proposed algorithm, with significantly improved scene-detection accuracy.
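A minimal sketch of the dynamic-threshold idea: flag a scene change when a frame's size deviates from the mean of a sliding window by more than kappa local standard deviations, so the threshold adapts to the local statistics (e.g., variance) the abstract mentions. The window length and kappa are illustrative, not the paper's decision function.

```python
def detect_scene_changes(frame_sizes, window=10, kappa=3.0):
    """Return frame indices flagged as scene changes.  The threshold is
    dynamic: it is recomputed from the preceding `window` frames."""
    changes = []
    for t in range(window, len(frame_sizes)):
        recent = frame_sizes[t - window:t]
        mu = sum(recent) / window
        sd = (sum((x - mu) ** 2 for x in recent) / window) ** 0.5
        # deviation beyond kappa local std-devs signals a scene change
        if abs(frame_sizes[t] - mu) > kappa * sd + 1e-9:
            changes.append(t)
    return changes
```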

11.
Modeling video sources for real-time scheduling (cited 1 time: 0 self-citations, 1 by others)
What is the impact of the autocorrelation of variable-bit-rate (VBR) sources on real-time scheduling algorithms? Our results show that the impact of long term, or interframe, autocorrelation is negligible, while the impact of short term, or intraframe, autocorrelation can be significant. Such results are essentially independent of the video coding scheme employed. To derive these results, video sequences are modeled as a collection of stationary subsequences called scenes. Within a scene, a statistical model is derived for both the sequence of frames and of slices. The model captures the distribution and the autocorrelation function of real-time video data. In previous work, the pseudoperiodicity of the slice-level autocorrelation function made it difficult to develop a simple yet accurate model. We present a generalization of previous methods that can easily capture this pseudoperiodicity and is suited for modeling a greater variety of autocorrelation functions. By simply tuning a few parameters, the model reproduces the statistical behavior of sources with different types and levels of correlation on both the frame and the slice level.

12.
A key characteristic of video data is the associated spatial and temporal semantics. It is important that a video model capture the characteristics of objects and their relationships in time and space. J.F. Allen's (1983) 13 temporal relationships are often used in formulating queries that contain the temporal relationships among video frames. For the spatial relationships, most of the approaches are based on projecting objects on a two- or three-dimensional coordinate system. However, very few attempts have been made to formally represent the spatio-temporal relationships of objects contained in the video data and to formulate queries with spatio-temporal constraints. The purpose of this work is to design a model representation for the specification of the spatio-temporal relationships among objects in video sequences. The model describes the spatial relationships among objects for each frame in a given video scene and the temporal relationships (for this frame) of the temporal intervals measuring the duration of these spatial relationships. It also models the temporal composition of an object, which reflects the evolution of the object's spatial relationships over the subsequent frames in the video scene and in the entire video sequence. Our model representation also provides an effective and expressive way for the complete and precise specification of distances among objects in digital video. This model is a basis for the annotation of raw video.
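Allen's 13 temporal relationships referenced above can be computed directly from interval endpoints. A compact sketch, where an interval is a (start, end) pair with start < end:

```python
def allen_relation(a, b):
    """Return the Allen interval relation of interval a relative to b,
    one of the 13 relations used in temporal queries over video frames."""
    (s1, e1), (s2, e2) = a, b
    if e1 < s2:  return "before"
    if e2 < s1:  return "after"
    if e1 == s2: return "meets"
    if e2 == s1: return "met-by"
    if s1 == s2 and e1 == e2: return "equals"
    if s1 == s2: return "starts" if e1 < e2 else "started-by"
    if e1 == e2: return "finishes" if s1 > s2 else "finished-by"
    if s2 < s1 and e1 < e2: return "during"
    if s1 < s2 and e2 < e1: return "contains"
    # only the strict-overlap cases remain at this point
    return "overlaps" if s1 < s2 else "overlapped-by"
```

With interval endpoints taken as frame numbers, a spatio-temporal query such as "relation R holds during interval X before interval Y" reduces to calls of this function.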

13.
14.
The abnormal visual event detection is an important subject in Smart City surveillance, where a lot of data can be processed locally in an edge computing environment. Real-time operation and detection effectiveness are critical in such an edge environment. In this paper, we propose an abnormal event detection approach based on multi-instance learning and the autoregressive integrated moving average model for video surveillance of crowded scenes in urban public places, focusing on real-time operation and detection effectiveness. We propose an unsupervised method for abnormal event detection by combining multi-instance visual feature selection and the autoregressive integrated moving average model. In the proposed method, each video clip is modeled as a visual feature bag containing several subvideo clips, each of which is regarded as an instance. The time-transform characteristics of the optical flow characteristics within each subvideo clip are considered as a visual feature instance, and time-series modeling is carried out for multiple visual feature instances related to all subvideo clips in a surveillance video clip. The abnormal events in each surveillance video clip are detected using the multi-instance fusion method. This approach is verified on publicly available urban surveillance video datasets and compared with state-of-the-art alternatives. Experimental results demonstrate that the proposed method has better abnormal event detection performance for crowded scenes in urban public places in an edge environment.
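A minimal sketch of the multi-instance fusion idea, with a one-step AR(1) prediction error standing in for the full ARIMA model: each instance is a time series of optical-flow statistics for one sub-clip, and the bag (the surveillance clip) is scored by the maximum instance score, so one anomalous sub-clip marks the whole clip.

```python
def bag_anomaly_score(bag):
    """Score a bag (list of instances, each a list of floats) by the
    worst one-step AR(1) prediction error over its instances."""
    def instance_score(series):
        if len(series) < 3:
            return 0.0
        # least-squares AR(1) coefficient for x[t] ~ a * x[t-1]
        num = sum(series[t] * series[t - 1] for t in range(1, len(series)))
        den = sum(x * x for x in series[:-1]) or 1.0
        a = num / den
        errs = [abs(series[t] - a * series[t - 1]) for t in range(1, len(series))]
        return max(errs)
    # multi-instance fusion: max over instances
    return max(instance_score(inst) for inst in bag)
```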

15.
李慧然  彭强  陈睿 《计算机应用》2008,28(2):385-388
To address the inefficiency of JVT-G012, the classic H.264/AVC rate-control algorithm, on pictures with intense motion, a rate-control algorithm based on the degree of picture motion is proposed. For pictures with intense motion, within the same complexity region, the target bit rate of the current frame is adjusted by the difference between the previous frame's actual coded bit rate and its target bit rate; during encoding, the MAD value is estimated from the SAD between the original and reconstructed frames of the minimum rate-distortion mode, and the quantization parameter estimated from the quadratic model is used to optimize the Lagrange parameter. Simulation experiments show that, compared with JVT-G012 and the improved H.264/AVC rate-control algorithm of Jiang et al., during intense motion or scene cuts the bit rate is slightly higher than JVT-G012 but lower than Jiang et al.'s algorithm, while the PSNR of rapidly changing frames increases markedly and the average PSNR also improves.

16.
Constructing table-of-content for videos (cited 15 times: 0 self-citations, 15 by others)
A fundamental task in video analysis is to extract structures from the video to facilitate user's access (browsing and retrieval). Motivated by the important role that the table of content (ToC) plays in a book, in this paper, we introduce the concept of ToC in the video domain. Some existing approaches implicitly use the ToC, but are mainly limited to low-level entities (e.g., shots and key frames). The drawbacks are that low-level structures (1) contain too many entries to be efficiently presented to the user; and (2) do not capture the underlying semantic structure of the video based on which the user may wish to browse/retrieve. To address these limitations, in this paper, we present an effective semantic-level ToC construction technique based on intelligent unsupervised clustering. It has the characteristics of better modeling the time locality and scene structure. Experiments based on real-world movie videos validate the effectiveness of the proposed approach. Examples are given to demonstrate the usage of the scene-based ToC in facilitating user's access to the video.

17.
Background Modeling Based on Long Video Sequences (cited 1 time: 0 self-citations, 1 by others)
To address the difficulty existing background modeling algorithms have with non-stationary scene changes, a background modeling method based on long video sequences is proposed. The method has three main steps: training, retrieval, and update. In training, the long video is cut into segments and the background image of each segment is computed; background descriptors are then obtained by image downsampling and dimensionality reduction, clustered, and stored as a background memory dictionary. In retrieval, the foreground-pixel ratio drives a non-stationarity test: when a non-stationary change occurs, the distance between the current image descriptor and every descriptor in the dictionary is computed, and the background image of the nearest descriptor is taken as the current background. In update, the foreground-pixel ratio drives an update test: if the foreground ratio stays too high, a new background is generated and the background dictionary and background library are updated. Under non-stationary changes (e.g., sudden illumination change), the algorithm turns the background-model recovery problem into a background retrieval problem, ensuring a stable background model. The framework is combined with short-term spatio-temporal background models (ViBe and MOG as examples) and tested mainly on background estimation and moving-object detection under non-stationary scene changes. Results on multiple video sequences show that the framework handles non-stationary changes effectively, improves object detection, and significantly reduces the false-detection rate.
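The retrieval step can be sketched as a nearest-neighbour lookup over the background dictionary; the distance measure (Euclidean here) and the dictionary layout (descriptor lists keyed by label) are assumptions for illustration, not the paper's exact data structures.

```python
def retrieve_background(descriptor, dictionary):
    """Return the key of the stored background descriptor nearest to the
    current image descriptor; its background image would become the new
    background model after a non-stationary change."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    return min(dictionary, key=lambda k: dist(descriptor, dictionary[k]))
```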

18.
Frame-rate up-conversion (FRUC) is a common video editing technique that periodically inserts new frames between the original frames to increase the frame rate; it is often used in forgeries that splice together two videos of different frame rates. To reduce visual artifacts, advanced FRUC methods usually use motion-compensated interpolation, which makes detecting such interpolation forgeries challenging. In this paper, we propose a new, simple but effective method that correctly detects this forgery and estimates the original frame rate of the video. The method exploits the periodic PSNR difference between frame pairs obtained by re-interpolating the sequence formed by the FRUC-interpolated frames and their adjacent original frames. Experimental results on test sequences show that the method achieves high detection accuracy, and results on lossy-compressed video sequences further confirm its practical value.
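The periodicity cue can be sketched as follows: under re-interpolation, interpolated frames reconstruct well while original frames do not, so original frames appear as periodic PSNR dips, and the most common gap between dips estimates the interpolation period. The drop threshold in dB is an illustrative assumption, not the paper's detector.

```python
def fruc_period(psnr, drop=5.0):
    """Estimate the FRUC interpolation period from a per-frame PSNR
    sequence: find frames more than `drop` dB below the peak (the
    presumed original frames) and return the most common gap between
    them, or None if no periodic dips are found."""
    peak = max(psnr)
    dips = [i for i, v in enumerate(psnr) if v < peak - drop]
    gaps = [b - a for a, b in zip(dips, dips[1:])]
    if not gaps:
        return None
    return max(set(gaps), key=gaps.count)
```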

19.
We present a fast and efficient non-rigid shape tracking method for modeling dynamic 3D objects from multiview video. Starting from an initial mesh representation, the shape of a dynamic object is tracked over time, both in geometry and topology, based on multiview silhouette and 3D scene flow information. The mesh representation of each frame is obtained by deforming the mesh representation of the previous frame towards the optimal surface defined by the time-varying multiview silhouette information with the aid of 3D scene flow vectors. The whole time-varying shape is then represented as a mesh sequence which can efficiently be encoded in terms of restructuring and topological operations, and small-scale vertex displacements along with the initial model. The proposed method has the ability to deal with dynamic objects that may undergo non-rigid transformations and topological changes. The time-varying mesh representations of such non-rigid shapes, which are not necessarily of fixed connectivity, can successfully be tracked thanks to restructuring and topological operations employed in our deformation scheme. We demonstrate the performance of the proposed method both on real and synthetic sequences.

20.
To address the high computational complexity of current deep-learning human pose estimation algorithms, a fast optical-flow-based human pose estimation algorithm is proposed. Building on an existing algorithm, it first exploits the temporal correlation between video frames by splitting the original sequence into key frames and non-key frames that are processed differently (the frames between two adjacent key frames, together with the preceding key frame, form a frame group, within which the frames are similar); the pose estimation algorithm is run only on key frames, and its results are propagated to the other, non-key frames through a lightweight optical-flow field. Second, to handle the dynamics of the motion field in the video, an adaptive key-frame detection algorithm based on the local optical-flow field is proposed, which places key frames according to the local temporal characteristics of the video. Experimental results on the OutdoorPose and HumanEva-I datasets show that, on video sequences with complex backgrounds and partial occlusion, the proposed algorithm slightly improves detection performance over the original algorithm while increasing detection speed by 89.6% on average.
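The adaptive key-frame placement can be sketched by accumulating per-frame motion magnitude and opening a new key frame once the accumulated motion exceeds a budget; non-key frames would then reuse the last key frame's pose via flow propagation. The motion measure and threshold are illustrative assumptions, not the paper's detector.

```python
def select_keyframes(motion, threshold=10.0):
    """Return indices of key frames given per-frame motion magnitudes
    (e.g., mean optical-flow magnitude).  A new key frame starts once
    the motion accumulated since the last key frame exceeds `threshold`,
    so fast segments get more key frames than static ones."""
    keys = [0]                      # first frame is always a key frame
    acc = 0.0
    for i, m in enumerate(motion[1:], start=1):
        acc += m
        if acc > threshold:
            keys.append(i)
            acc = 0.0
    return keys
```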


Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.) · 京ICP备09084417号