期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

A Novel Video Coding Framework with Tensor Representation for Efficient Video Streaming

Athisayamani Suganya Dejey Dharma 《Wireless Personal Communications》2019,109(4):2699-2717

Video compression is one among the pre-processes in video streaming. While capturing moving objects with moving cameras, more amount of redundant data is recorded along with dynamic change. In this paper, this change is identified using various geometric transformations. To register all these dynamic relations with minimal storage, tensor representation is used. The amount of similarity between the frames is measured using canonical correlation analysis (CCA). The key frames are identified by comparing the canonical auto-correlation analysis score of the candidate key frame with CCA score of other frames. In this method, coded video is represented using tensor which consists of intra-coded key frame, a vector of P frame identifiers, transformation of each variable sized block and information fusion that has three levels of abstractions: measurements, characteristics and decisions that combine all these factors into a single entity. Each dimension can have variable sizes which facilitates storing all characteristics without missing any information. In this paper, the proposed video compression method is applied to under-water videos that have more redundancy as both the camera and the underwater species are in motion. This method is compared with H.264, H.265 and some recent compression methods. Metrics like Peak Signal to Noise Ratio and compression ratio for various bit rates are used to evaluate the performance. From the results obtained, it is obvious that the proposed method performs compression with a high compression ratio, and the loss is comparatively less.

相似文献

2.

HEVC压缩域的视频摘要关键帧提取方法

下载免费PDF全文

朱树明王凤随程海鹰《信号处理》2019,35(3):481-489

为了提高关键帧提取的准确率，改善视频摘要的质量，提出了一种HEVC压缩域的视频摘要关键帧提取方法。首先，对视频序列进行编解码，在解码中统计HEVC帧内编码PU块的亮度预测模式数目。然后，特征提取是利用统计得到的模式数目构建成模式特征向量，并将其作为视频帧的纹理特征用于关键帧的提取。最后，利用融合迭代自组织数据分析算法(ISODATA）的自适应聚类算法对模式特征向量进行聚类，在聚类结果中选取每个类内中间向量对应的帧作为候选关键帧，并通过相似度对候选关键帧进行再次筛选，剔除冗余帧，得到最终的关键帧。实验结果表明，在Open Video Project数据集上进行的大量实验验证，该方法提取关键帧的精度为79.9%、召回率达到93.6%、F-score为86.2%，有效地改善了视频摘要的质量。相似文献

3.

基于压缩感知和熵计算的关键帧提取算法

潘磊束鑫程科张明《光电子．激光》2014,(10):1977-1982

针对关键帧提取问题,提出了一种基于压缩感知理论和熵计算的关键帧提取算法, 首先通过构造符合有限等距性质要求的稀疏随机投影矩阵,将高维多尺度帧图像特征变换为低维多尺度帧图像特征, 并形成视频镜头低维多尺度特征列向量组;然后通过随机权值向量与低维多尺度特征向量的阿达玛乘积运算生成各帧图像的匹配特征,并根据匹配特征的相似性度量完成镜头内部的子镜头分割;最后通过交叉熵计算在每个子镜头中得到可能的关键帧,并由图像熵计算确定最终的关键帧。实验表明,与传统方法相比,本文算法提取的关键帧能够更精确、更稳定描述视频镜头内容。相似文献

4.

Video key frame extraction through dynamic Delaunay clustering with a structural constraint

Sanjay K. Kuanar Rameswar Panda Ananda S. Chowdhury 《Journal of Visual Communication and Image Representation》2013,24(7):1212-1227

Key frame based video summarization has emerged as an important area of research for the multimedia community. Video key frames enable an user to access any video in a friendly and meaningful way. In this paper, we propose an automated method of video key frame extraction using dynamic Delaunay graph clustering via an iterative edge pruning strategy. A structural constraint in form of a lower limit on the deviation ratio of the graph vertices further improves the video summary. We also employ an information-theoretic pre-sampling where significant valleys in the mutual information profile of the successive frames in a video are used to capture more informative frames. Various video key frame visualization techniques for efficient video browsing and navigation purposes are incorporated. A comprehensive evaluation on 100 videos from the Open Video and YouTube databases using both objective and subjective measures demonstrate the superiority of our key frame extraction method. 相似文献

5.

Video frame deletion detection based on time–frequency analysis

《Journal of Visual Communication and Image Representation》2022

With the emergence of diverse multimedia editing software, a great number of edited or tampered video resources appear on the Internet, some of which can mix with the genuine ones. Digital video authenticity is an important step to make the best use of these video resources. As a common video forgery operation, frame tampering can change the video content and confuse viewers by removing or inserting some specific frames. In this paper, we explore the traces created by compression process and propose a new method to detect frame tampering based on the high-frequency features of reconstructed DCT coefficients in the tampered sequences. Experimental results demonstrate that our proposed method can effectively detect frame tampering operation, and accurately locate the breakpoint of frame tampering in the streams. 相似文献

6.

Two-dimensional mesh-based mosaic representation for manipulationof video objects with occlusion

Toklu C. Tanju Erdem A. Murat Tekalp A. 《IEEE transactions on image processing》2000,9(9):1617-1630

We present a two-dimensional (2-D) mesh-based mosaic representation, consisting of an object mesh and a mosaic mesh for each frame and a final mosaic image, for video objects with mildly deformable motion in the presence of self and/or object-to-object (external) occlusion. Unlike classical mosaic representations where successive frames are registered using global motion models, we map the uncovered regions in the successive frames onto the mosaic reference frame using local affine models, i.e., those of the neighboring mesh patches. The proposed method to compute this mosaic representation is tightly coupled with an occlusion adaptive 2-D mesh tracking procedure, which consist of propagating the object mesh frame to frame, and updating of both object and mosaic meshes to optimize texture mapping from the mosaic to each instance of the object. The proposed representation has been applied to video object rendering and editing, including self transfiguration, synthetic transfiguration, and 2-D augmented reality in the presence of self and/or external occlusion. We also provide an algorithm to determine the minimum number of still views needed to reconstruct a replacement mosaic which is needed for synthetic transfiguration. Experimental results are provided to demonstrate both the 2-D mesh-based mosaic synthesis and two different video object editing applications on real video sequences. 相似文献

7.

New architecture for MPEG video streaming system with backward playback support.

Chang-Hong Fu Yui-Lam Chan Tak-Piu Ip Wan-Chi Siu 《IEEE transactions on image processing》2007,16(9):2169-2183

MPEG digital video is becoming ubiquitous for video storage and communications. It is often desirable to perform various video cassette recording (VCR) functions such as backward playback in MPEG videos. However, the predictive processing techniques employed in MPEG severely complicate the backward-play operation. A straightforward implementation of backward playback is to transmit and decode the whole group-of-picture (GOP), store all the decoded frames in the decoder buffer, and play the decoded frames in reverse order. This approach requires a significant buffer in the decoder, which depends on the GOP size, to store the decoded frames. This approach could not be possible in a severely constrained memory requirement. Another alternative is to decode the GOP up to the current frame to be displayed, and then go back to decode the GOP again up to the next frame to be displayed. This approach does not need the huge buffer, but requires much higher bandwidth of the network and complexity of the decoder. In this paper, we propose a macroblock-based algorithm for an efficient implementation of the MPEG video streaming system to provide backward playback over a network with the minimal requirements on the network bandwidth and the decoder complexity. The proposed algorithm classifies macroblocks in the requested frame into backward macroblocks (BMBs) and forward/backward macroblocks (FBMBs). Two macroblock-based techniques are used to manipulate different types of macroblocks in the compressed domain and the server then sends the processed macroblocks to the client machine. For BMBs, a VLC-domain technique is adopted to reduce the number of macroblocks that need to be decoded by the decoder and the number of bits that need to be sent over the network in the backward-play operation. We then propose a newly mixed VLC/DCT-domain technique to handle FBMBs in order to further reduce the computational complexity of the decoder. With these compressed-domain techniques, the proposed architecture only manipulates macroblocks either in the VLC domain or the quantized DCT domain resulting in low server complexity. Experimental results show that, as compared to the conventional system, the new streaming system reduces the required network bandwidth and the decoder complexity significantly. 相似文献

8.

Adaptive key frame extraction for video summarization using an aggregation mechanism

Naveed Ejaz Tayyab Bin Tariq Sung Wook Baik 《Journal of Visual Communication and Image Representation》2012,23(7):1031-1040

Video summarization is a method to reduce redundancy and generate succinct representation of the video data. One of the mechanisms to generate video summaries is to extract key frames which represent the most important content of the video. In this paper, a new technique for key frame extraction is presented. The scheme uses an aggregation mechanism to combine the visual features extracted from the correlation of RGB color channels, color histogram, and moments of inertia to extract key frames from the video. An adaptive formula is then used to combine the results of the current iteration with those from the previous. The use of the adaptive formula generates a smooth output function and also reduces redundancy. The results are compared to some of the other techniques based on objective criteria. The experimental results show that the proposed technique generates summaries that are closer to the summaries created by humans. 相似文献

9.

Robust color histogram descriptors for video segment retrieval and identification 总被引：3，自引：0，他引：3

Ferman A.M. Tekalp A.M. Mehrotra R. 《IEEE transactions on image processing》2002,11(5):497-508

相似文献

10.

Interactive key frame selection model 总被引：1，自引：0，他引：1

Jian-quan Ouyang Jin-tao Li Huanrong Tang 《Journal of Visual Communication and Image Representation》2006,17(6):1145-1163

Video summarization can provide a fine representation of the content of video stream and reduce a large amount of data involved in video indexing, browsing, and retrieval. Moreover, Key frame selection is an important step in the research of content-based video analysis and retrieval. Although there exist a variety of methods for key frame selection, they are heuristic and closed systems, which cannot dynamically generate video summary with user’s preference. In this paper, an M-estimator and epipolar line distance constraint camera motion estimation algorithm is introduced as camera parameters is an important motion feature for key frame selection, and Broyden-Fletcher-Goldfarb-Shanno (BFGS) method is applied to optimize estimated parameters. Moreover, since Interactive Computing is a novel-computing model that represents the transition of algorithm to interaction, an interactive model of key frame selection (IKFS) is presented as a result of improving the model of key frame selection (KFS). The model of KFS and IKFS are proved to satisfy the criterion of induction and coinduction, respectively. Experimental results show that the processing scheme generates flexible and desirable summarizations whose distortion rate is lower than current method. Above all, IKFS is an extension to KFS. 相似文献

11.

Wyner–Ziv-based bidirectionally decodable video coding

Xiaopeng Fan Oscar C. Au Yan Chen Jiantao Zhou Mengyao Ma Peter H.W. Wong 《Journal of Visual Communication and Image Representation》2009,20(6):365-376

In this paper, we propose a novel Wyner–Ziv-based video compression scheme which supports encoding a new type of inter frame called ‘M-frame’. Different from traditional multi-hypothesis inter frames, the M-frame is specially compressed with its two neighbor frames as reference at the encoder, but can be identically reconstructed by using any one of them as prediction at the decoder. Based on this, the proposed Wyner–Ziv-based bidirectionally decodable video compression scheme supports decoding the frames in a video stream in both temporal order and reverse order. Unlike the other schemes which support reverse playback, our scheme achieves the reversibility with low extra cost of storage and bandwidth. In error-resilient test, our scheme outperforms H.264 based schemes up to 3.5 dB at same bit rate. The proposed scheme also provides more flexibility for stream switching. 相似文献

12.

探讨基于Tsallis熵的视频关键帧提取技术

杨振《电子测试》2017,(16)

关键帧是视频中的一组有限数量的帧的子集,一个视频的关键帧序列能够合理地概括该视频信息,从而减少过大的视频数据对生产生活带来的承载负重.本文讨论了基于Tsallis熵的Jensen距离公式——JTD在视频关键帧提取中的应用.根据得到的差异性距离值JTD,首先检查子镜头边界,进而从每个子镜头中抽取一帧作为该镜头的代表帧,最终得到该段视频的关键帧序列. 相似文献

13.

Video abstraction based on the visual attention model and online clustering

Qing-Ge Ji Zhi-Dang Fang Zhen-Hua Xie Zhe-Ming Lu 《Signal Processing: Image Communication》2013,28(3):241-253

With the fast evolution of digital video, research and development of new technologies are greatly needed to lower the cost of video archiving, cataloging and indexing, as well as improve the efficiency and accessibility of stored video sequences. A number of methods to respectively meet these requirements have been researched and proposed. As one of the most important research topics, video abstraction helps to enable us to quickly browse a large video database and to achieve efficient content access and representation. In this paper, a video abstraction algorithm based on the visual attention model and online clustering is proposed. First, shot boundaries are detected and key frames in each shot are extracted so that consecutive key frames in a shot have the same distance. Second, the spatial saliency map indicating the saliency value of each region of the image is generated from each key frame and regions of interest (ROI) is extracted according to the saliency map. Third, key frames, as well as their corresponding saliency map, are passed to a specific filter, and several thresholds are used so that the key frames containing less information are discarded. Finally, key frames are clustered using an online clustering method based on the features in ROIs. Experimental results demonstrate the performance and effectiveness of the proposed video abstraction algorithm. 相似文献

14.

基于图中心和自动阈值的关键帧提取方法 总被引：4，自引：0，他引：4

智敏张轶群蔡安妮《微电子学与计算机》2005,22(11):53-55

随着因特网的迅猛发展，迫切需要对越来越多的视频信息进行迅速且低成本的管理.而关键帧的使用可以大大减少视频索引的数据量，同时也为查询和检索视频提供了一个组织框架.文章中利用图与视频结构的相似性，将求关键帧的问题转化为求图的P中心问题;同时利用镜头之间的活动度不同，自动选取阈值.实验结果表明提取的关键帧集能够很好地表示镜头的主要内容，且有较低的计算复杂度。相似文献

15.

MPEG4视频流量预测自适应算法的研究

刘亚伟卢燕飞冯玉珉《现代传输》2004,4(4):95-99

为了保证用户的服务质量(QOS),宽带分组网在传送视频信息时需要进行动态带宽分配,而视频流量预测在动态带宽分配中发挥着重要的作用。本文从自相关性、自相似性的Hurst参数两个方面,阐明GOP时间尺度上的流量能够体现原始帧序列的流量特性,并在固定步长的LMS自适应算法(FSSA)的基础上提出的一种新的可变步长自适应算法(VSSA),在GOP的大时间尺度上预测MPEG4视频流量,通过大量的仿真实验表明,VSSA算法可以明显地改善预测性能。相似文献

16.

Rate-driven key frame selection using temporal variation of visualcontent

Hun-Cheol Lee Seong-Dae Kim 《Electronics letters》2002,38(5):217-218

A simple and fast method to select the assigned number of key frames in a video shot is presented. The algorithm selects the key frames so that the temporal variation of visual content within a video shot is equally distributed to each key frame. Simulation results on a real video sequence are shown to be in agreement with the human visual perception 相似文献

17.

分布视频编码中基于帧间相关性的自适应关键帧选取算法

张晓星《光电子．激光》2010,(10):1536-1541

针对分布式视频编码(DVC)系统中固定周期关键帧选取(PKFS)方法忽视了帧间相关性的缺陷,提出了一种自适应关键帧选取(AKFS)算法。利用图像特征点检测与匹配的方法,将相邻图像的非匹配点作为帧间相关性的近似,把累积或平均非匹配点数超过阈值的帧判定为关键帧。在此基础上,提出改进的帧内插方案,以适应不同长度序列组的边信息生成;将零运动强度的关联帧合并为一帧图像参与编解码,进一步提高了系统的压缩效率。实验结果表明,对于不同运动特性的序列,本文提出的算法可以明显提升边信息帧的重建质量,使系统的率失真性能提高0.9～2.0 dB,并有效降低了编码传输码率。相似文献

18.

Arbitrary Frame Rate Transcoding Through Temporal and Spatial Complexity 总被引：1，自引：0，他引：1

《Broadcasting, IEEE Transactions on》2009,55(4):767-775

In this paper, an arbitrary frame rate transcoding joint considering temporal and spatial complexity of frames in the adaptive length sliding window is proposed. The length of a sliding window can be adjusted according to bandwidth variation in order to decide the number of skipped frames. The proposed method preserves significant frames and drops non-significant ones using the complexity measurements. Moreover, the motion vector composition algorithm is proposed to reduce the computations of motion estimation process by adopting the coding feature of variable block sizes in H.264/AVC video transcoder. Experimental results show that the proposed method achieves higher visual quality compared to other existing methods. After combining with the proposed fast motion composition algorithm, our proposed algorithm reduces encoding time significantly with slight visual quality degradation. 相似文献

19.

Efficient visual attention based framework for extracting key frames from videos

Naveed Ejaz Irfan Mehmood Sung Wook Baik 《Signal Processing: Image Communication》2013,28(1):34-44

The huge amount of video data on the internet requires efficient video browsing and retrieval strategies. One of the viable solutions is to provide summaries of the videos in the form of key frames. The video summarization using visual attention modeling has been used of late. In such schemes, the visually salient frames are extracted as key frames on the basis of theories of human attention modeling. The visual attention modeling schemes have proved to be effective in video summarization. However, the high computational costs incurred by these techniques limit their applicability in practical scenarios. In this context, this paper proposes an efficient visual attention model based key frame extraction method. The computational cost is reduced by using the temporal gradient based dynamic visual saliency detection instead of the traditional optical flow methods. Moreover, for static visual saliency, an effective method employing discrete cosine transform has been used. The static and dynamic visual attention measures are fused by using a non-linear weighted fusion method. The experimental results indicate that the proposed method is not only efficient, but also yields high quality video summaries. 相似文献

20.

Key-frame selection for video summarization: an approach of multidimensional time series analysis

Zhen Gao Guoliang Lu Peng Yan 《Multidimensional Systems and Signal Processing》2018,29(4):1485-1505

This paper presents a novel method of key-frame selection for video summarization based on multidimensional time series analysis. In the proposed scheme, the given video is first segmented into a set of sequential clips containing a number of similar frames. Then the key frames are selected by a clustering procedure as the frames closest to the cluster centres in each resulting video clip. The proposed algorithm is implemented experimentally on a wide range of testing data, and compared with state-of-the-art approaches in the literature, which demonstrates excellent performance and outperforms existing methods on frame selection in terms of fidelity-based metric and subjective perception. 相似文献