Similar literature
 Found 20 similar documents (search time: 46 ms)
1.
Most studies in the literature on video quality assessment have focused on the evaluation of quantized video sequences at fixed, high spatial and temporal resolutions. Only limited work has been reported on assessing video quality under different spatial and temporal resolutions. In this paper, we consider a wider scope of video quality assessment by taking multiple dimensions into account. In particular, we address the problem of evaluating the perceptual visual quality of low bit-rate videos under different settings and requirements. Extensive subjective viewing tests for assessing the perceptual quality of low bit-rate videos have been conducted, covering 150 test scenarios and five distinct dimensions: encoder type, video content, bit rate, frame size, and frame rate. Based on the subjective test results, we perform a thorough statistical analysis to study the influence of the different dimensions on perceptual quality, and we point out some interesting observations. We believe such a study brings new knowledge to the topic of cross-dimensional video quality assessment and has immediate applications in perceptual video adaptation for scalable video over mobile networks.

2.
Mobile video quality assessment plays an essential role in multimedia systems and services. In scalable video coding, which enables dynamic adaptation to terminal capabilities and heterogeneous networks, variable resolution is one of the most prominent types of video distortion. In this paper, we propose a new hybrid spatial and temporal distortion metric for evaluating video streaming quality with variable spatio-temporal resolution. The key idea is to project the video sequence into a feature domain and calculate the distortion of content information from the projected principal component matrix and its eigenvectors. The metric measures the degree of content information degradation, especially in spatio-temporal resolution scalable video. The performance of the proposed metric is evaluated and compared with some state-of-the-art quality evaluation metrics in the literature. Our results show that the proposed metric achieves good correlations with the subjective evaluations of the EPFL scale video database.

3.
Video summarization and retrieval using singular value decomposition   Total citations: 2, self-citations: 0, citations by others: 2
In this paper, we propose novel video summarization and retrieval systems based on unique properties from singular value decomposition (SVD). Through mathematical analysis, we derive the SVD properties that capture both the temporal and spatial characteristics of the input video in the singular vector space. Using these SVD properties, we are able to summarize a video by outputting a motion video summary with the user-specified length. The motion video summary aims to eliminate visual redundancies while assigning equal show time to equal amounts of visual content for the original video program. On the other hand, the same SVD properties can also be used to categorize and retrieve video shots based on their temporal and spatial characteristics. As an extended application of the derived SVD properties, we propose a system that is able to retrieve video shots according to their degrees of visual changes, color distribution uniformities, and visual similarities.

4.
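Feng Xin (冯欣), Yang Dan (杨丹), Zhang Ling (张凌). Acta Automatica Sinica (自动化学报), 2011, 37(11): 1322-1331
This paper proposes a full-reference objective quality assessment method, based on changes in visual attention, for video impaired by network packet loss. Building on the application of visual saliency detection to video data, the method examines the spatial and temporal changes in visual attention between video distorted by packet loss and the standard reference video, and derives a set of objective quality measures from the corresponding spatial and temporal differences in visual saliency. Seventeen packet-loss-impaired video sequences were used for testing, and a subjective evaluation experiment was carried out to provide the evaluation standard. The method was compared with traditional quality assessment methods that ignore the saliency characteristics of human vision, and with current mainstream methods that weight distorted pixels by visually salient regions or regions of interest. Experimental results show that the attention-change-based method correlates better with subjective quality scores than either of these and assesses the quality of packet-loss-impaired video more effectively.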

5.
6.
The efficiency of a video coding process, as well as the accuracy of objective video quality evaluation, can be significantly improved by introducing characteristics of the human visual system (HVS). In this paper we analyze one of these characteristics, namely the reduction in visual acuity due to foveated vision and object movements in a video sequence. We propose a new video quality metric called Foveated Mean Squared Error (FMSE) that takes into account the variable resolution of the HVS across the visual field. The highest visual acuity is at the point of fixation, which falls on the fovea, the area of the retina with the highest density of photoreceptors. Visual acuity decreases rapidly for image regions farther from the fixation point. FMSE also exploits the additional reduction in spatial acuity due to motion in a video sequence. The quality measures calculated by FMSE have shown a high correlation with experimental results obtained by subjective video quality assessment.

7.
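Video image quality assessment incorporating HVS characteristics   Total citations: 1, self-citations: 0, citations by others: 1
Based on an analysis of basic properties of human vision, including visual nonlinearity, contrast sensitivity, and multi-channel structure, this paper proposes a new video image quality assessment method that incorporates characteristics of the human visual system (HVS). The method first applies a wavelet decomposition to the image signal and extracts visual feature parameters in the wavelet domain; it then builds an image quality index consistent with human visual perception and, on that basis, gives a concrete assessment algorithm. Simulation results show that, in video image compression coding, the proposed algorithm assesses image quality better than traditional objective methods.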

8.
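Stereoscopic image quality assessment plays a very important role in the development of stereoscopic video technology. The commonly used PSNR (peak signal-to-noise ratio) does not reflect the characteristics of human visual perception, nor can it be applied directly to stereoscopic image quality assessment. Because human vision perceives depth in stereoscopic images and concentrates on regions of interest, this paper proposes a quality assessment method based on regions of interest in the texture map and the depth map. First, a visual attention extraction tool is used to extract regions of interest from the texture map and the corresponding depth map. Then, during assessment, each region of interest is assigned pixel-based weighting coefficients according to its degree of interest. Finally, the weighting coefficients are applied to the individual regions of the stereoscopic image for evaluation. Experimental results show that the objective scores produced by this method are more consistent with subjective scores and agree with the perceptual characteristics of the human visual system.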
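The weighting idea in this abstract can be illustrated with a minimal sketch. It assumes the per-pixel interest weights are already available (the abstract obtains them from a visual attention tool that is not specified here), and it folds them into a weighted MSE before converting to PSNR. The function name `weighted_psnr` and the way the weights enter the formula are illustrative assumptions, not the authors' exact method.

```python
from math import inf, log10


def weighted_psnr(reference, distorted, weights, peak=255.0):
    """PSNR computed from a weighted MSE (hypothetical ROI weighting).

    All three arguments are equally sized 2-D lists; `weights` holds a
    non-negative interest weight for every pixel.
    """
    weighted_error = 0.0
    weight_sum = 0.0
    for ref_row, dis_row, w_row in zip(reference, distorted, weights):
        for ref, dis, w in zip(ref_row, dis_row, w_row):
            weighted_error += w * (ref - dis) ** 2
            weight_sum += w
    if weight_sum <= 0:
        raise ValueError("weights must sum to a positive value")
    wmse = weighted_error / weight_sum
    # Identical images have zero error, so PSNR is unbounded.
    return inf if wmse == 0 else 10.0 * log10(peak * peak / wmse)


reference = [[100, 100], [100, 100]]
distorted = [[110, 100], [100, 100]]          # one damaged pixel
uniform = [[1, 1], [1, 1]]                    # plain PSNR
roi = [[4, 1], [1, 1]]                        # damaged pixel lies in the ROI
psnr_uniform = weighted_psnr(reference, distorted, uniform)
psnr_roi = weighted_psnr(reference, distorted, roi)
```

With uniform weights the function reduces to ordinary PSNR; raising the weight of the damaged pixel lowers the score, which is the behavior an ROI-aware measure is meant to have.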

9.
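Video distortion stems mainly from the quality degradation caused by spatial and temporal distortion. To address both kinds of degradation, this paper proposes STPFVQA, a full-reference video quality assessment method that combines spatio-temporal features with visual perception. First, a ResNet50 convolutional network extracts spatial perceptual features from the reference and distorted videos. Second, the extracted features are fed into a transformer encoder-decoder, which models the sequential relationships in the video and, by comparing the reference and distorted videos, explores how distortion affects those relationships. Third, the transformer output is passed to a prediction head to produce frame-level scores. Finally, to simulate the hysteresis of human visual perception, the final video quality score is obtained by jointly considering short-term, long-term, and global memory effects. To verify feasibility, experiments were conducted on four public datasets: LIVE, IVC-IC, CSIQ, and IVPL. The results show that the proposed model agrees more closely with human visual perception. On IVC-IC and CSIQ, compared with the state-of-the-art serial dependence modeling (SDM) approach, SROCC is higher by 2.6% and 3.1%, KROCC by 6.1% and 7.9%, and PLCC by 2.3% and 5.5%, respectively.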
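For readers unfamiliar with the evaluation indices quoted above, the sketch below computes two of them, PLCC (Pearson linear correlation) and SROCC (Spearman rank-order correlation), between predicted scores and subjective scores. It uses only the standard library; the sample scores are invented for illustration and are not taken from any of the datasets named in the abstract.

```python
from math import sqrt


def plcc(x, y):
    """Pearson linear correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def srocc(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return plcc(average_ranks(x), average_ranks(y))


predicted = [1.0, 2.0, 3.0, 4.0, 5.0]       # made-up model outputs
subjective = [1.0, 4.0, 9.0, 16.0, 25.0]    # made-up subjective scores
linear = plcc(predicted, subjective)
monotonic = srocc(predicted, subjective)
```

In this example the relationship is monotonic but not linear, so SROCC reaches 1 while PLCC stays below 1, which is why quality assessment papers usually report both.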

10.
A coherent computational approach to model bottom-up visual attention   Total citations: 5, self-citations: 0, citations by others: 5
Visual attention is a mechanism which filters out redundant visual information and detects the most relevant parts of our visual field. Automatic determination of the most visually relevant areas would be useful in many applications such as image and video coding, watermarking, video browsing, and quality assessment. Many research groups are currently investigating computational modeling of the visual attention system. The first published computational models were based on some basic and well-understood human visual system (HVS) properties. These models feature a single perceptual layer that simulates only one aspect of the visual system. More recent models integrate complex features of the HVS and simulate hierarchical perceptual representation of the visual input. The bottom-up mechanism is the feature most commonly found in modern models. This mechanism refers to involuntary attention (i.e., salient spatial visual features that effortlessly or involuntarily attract our attention). This paper presents a coherent computational approach to the modeling of bottom-up visual attention. The model is mainly based on the current understanding of HVS behavior. Contrast sensitivity functions, perceptual decomposition, visual masking, and center-surround interactions are some of the features implemented in this model. The performance of the algorithm is assessed using natural images and experimental measurements from an eye-tracking system. Two well-known metrics (correlation coefficient and Kullback-Leibler divergence) are used to validate the model. A further metric is also defined. The results from this model are finally compared to those from a reference bottom-up model.

11.
In this paper, we study (normalized) disjoint information as a metric for image comparison and its applications to perceptual image quality assessment, image registration, and video tracking. Disjoint information is the joint entropy of random variables excluding the mutual information. This measure of statistical dependence and information redundancy satisfies more rigorous metric conditions than mutual information, including self-similarity, minimality, symmetry and triangle inequality. It is applicable to two or more random variables, and can be computed by vector histogramming, vector Parzen window density approximation, and upper bound approximation involving fewer variables. We show such a theoretic advantage does have implications in practice. In the domain of digital image and video, multiple visual features are extracted and (normalized) compound disjoint information is derived from a set of marginal densities of the image distributions, thus enriching the vocabulary of content representation. The proposed metric matching functions are applied to several domain applications to demonstrate their efficacy.
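The definition given in this abstract, joint entropy excluding mutual information, can be written as D(X, Y) = H(X, Y) − I(X; Y). The sketch below estimates it for two discrete sequences by simple histogramming. Dividing by the joint entropy to normalize is an assumption made here for illustration; the abstract does not state which normalization the authors use.

```python
from collections import Counter
from math import log2


def entropy(samples):
    """Shannon entropy in bits of the empirical distribution of `samples`."""
    n = len(samples)
    return -sum((c / n) * log2(c / n) for c in Counter(samples).values())


def disjoint_information(x, y, normalized=True):
    """Joint entropy minus mutual information of two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    h_xy = entropy(list(zip(x, y)))            # joint entropy H(X, Y)
    mutual = entropy(x) + entropy(y) - h_xy    # mutual information I(X; Y)
    disjoint = h_xy - mutual
    if not normalized:
        return disjoint
    # Assumed normalization: divide by H(X, Y) so the value lies in [0, 1].
    return disjoint / h_xy if h_xy > 0 else 0.0


a = [0, 0, 1, 1]
b = [0, 1, 0, 1]        # statistically independent of `a`
same = disjoint_information(a, a)
independent = disjoint_information(a, b)
```

Identical sequences give 0 (the self-similarity property mentioned in the abstract), and independent sequences give the maximum normalized value of 1.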

12.
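A content-based image quality assessment measure   Total citations: 3, self-citations: 1, citations by others: 3
Traditional image quality measures such as peak signal-to-noise ratio do not effectively reflect human visual perception of images. This paper therefore proposes a content-based image quality measure. Building on an improved structural similarity (SSIM) image quality measure, the image is divided according to its content into three parts: edge, texture, and smooth regions. Within each region, a fuzzy integral is used to incorporate quantity information about structural similarity, so that the similarity of image structural information, together with its fused positional and quantity information, is fully exploited to assess image quality comprehensively. Experimental results show that the quality scores given by this measure correlate well with subjective scores and reflect human subjective perception of image quality fairly accurately.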
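As background for the SSIM baseline that this abstract builds on, the sketch below evaluates the standard SSIM formula once over a whole patch, with the usual constants C1 = (0.01 L)^2 and C2 = (0.03 L)^2. It covers only plain SSIM; the region classification and fuzzy-integral fusion described in the abstract are not reproduced, and the single-window simplification is an assumption made to keep the example short.

```python
def global_ssim(x, y, dynamic_range=255.0):
    """SSIM over one window; `x` and `y` are flat lists of pixel values."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length patches with at least 2 pixels")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / (n - 1)
    var_y = sum((b - mean_y) ** 2 for b in y) / (n - 1)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    c1 = (0.01 * dynamic_range) ** 2   # stabilizes the luminance term
    c2 = (0.03 * dynamic_range) ** 2   # stabilizes the contrast/structure term
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov + c2)
    denominator = (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


patch = [10, 20, 30, 40]
noisy = [12, 18, 33, 41]
score_same = global_ssim(patch, patch)
score_noisy = global_ssim(patch, noisy)
```

Practical implementations compute this quantity in a sliding window and average the resulting map; a content-based variant such as the one in the abstract would pool the map differently for edge, texture, and smooth regions.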

13.
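A rate-distortion optimization strategy based on texture and luminance perceptual characteristics   Total citations: 1, self-citations: 1, citations by others: 0
The rate-distortion optimization (RDO) strategy has an important influence on coding efficiency in a video coding framework. Current mainstream RDO strategies describe distortion with MSE or similar measures, which do not reflect the subjective perception of the human eye well. To improve the subjective perceptual quality of video coding, a new perceptual distortion model is first established. The model exploits the sensitivity of the human eye to texture and luminance so that its results better reflect subjective quality. On this basis, a rate-distortion optimization strategy based on texture and luminance perceptual characteristics is proposed, abbreviated TL-RDO (texture and luminance based RDO). The strategy adaptively adjusts the Lagrange multiplier for different regions so that the coding result better matches the viewing characteristics of the human eye. Experimental results show that TL-RDO significantly improves coding efficiency over QP-RDO, currently the most widely used method; compared with some typical RDO strategies based on perceptual distortion, TL-RDO has lower computational complexity and is suitable for real-time coding systems.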
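The role of the Lagrange multiplier mentioned above can be shown with a minimal sketch of Lagrangian mode selection, where each candidate coding mode has a distortion D and a rate R and the encoder picks the mode minimizing J = D + λR. The `tolerance` factor that scales λ per region is a hypothetical stand-in for the texture and luminance model, which the abstract does not specify, and the candidate numbers are invented.

```python
def region_lambda(base_lambda, tolerance):
    """Scale the multiplier for one region (hypothetical adaptation rule).

    `tolerance` > 1 marks a region where distortion is assumed to be less
    visible, so rate is penalized more heavily there.
    """
    return base_lambda * tolerance


def best_mode(candidates, lam):
    """Return the (name, distortion, rate) tuple minimizing D + lam * R."""
    return min(candidates, key=lambda mode: mode[1] + lam * mode[2])


# Invented candidates: (mode name, distortion, rate in bits).
candidates = [("intra", 10.0, 100), ("skip", 50.0, 5)]
smooth_choice = best_mode(candidates, region_lambda(0.1, 1.0))[0]
textured_choice = best_mode(candidates, region_lambda(0.1, 10.0))[0]
```

With the small multiplier the low-distortion mode wins; with the larger multiplier assigned to the more tolerant region the cheap mode wins, which is how a region-adaptive λ shifts bits toward areas where distortion is easier to see.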

14.
As the demand for high-quality stereo images has grown in recent years, stereoscopic image quality assessment (SIQA) has become an important research area in modern image processing technology. In this paper, we propose a no-reference stereoscopic image quality assessment (NR-SIQA) model that applies heterogeneous ensemble learning to 'quality-aware' features extracted from the luminance image, chrominance image, disparity image, and cyclopean image via the quaternion wavelet transform (QWT). Firstly, the luminance and chrominance images are generated in the CIELAB color space to represent monocular perception, and the novel disparity and cyclopean images are used to complement the monocular information. Then, a number of 'quality-aware' features in the quaternion wavelet domain are extracted, including entropy, texture features, energy features, energy difference features, and MSCN coefficients of the high-frequency sub-bands. Finally, a heterogeneous ensemble model combining support vector regression (SVR), extreme learning machine (ELM), and random forest (RF) is proposed to predict the quality score, and bootstrap sampling and a rotated feature space are used to increase the diversity of the data distribution. Compared with state-of-the-art NR-SIQA models, experimental results on four public databases demonstrate the accuracy and robustness of the proposed model.

15.
Traditional video compression methods consider the statistical redundancy among pixels as the only adversary of compression, with the perceptual redundancy totally neglected. However, it is well known that no criterion is as eloquent as the visual quality of an image. To reach higher compression ratios without perceptually degrading the reconstructed signal, the properties of the human visual system (HVS) need to be better exploited. Recent research indicates that the HVS has different sensitivities to different image content; based on this, a novel perceptual video coding method is explored in this paper to achieve better perceptual coding quality while spending fewer bits. A new texture segmentation method exploiting the just noticeable distortion (JND) profile is first devised to detect and classify texture regions in video scenes. To effectively remove temporal redundancies while preserving high visual quality, an auto-regressive (AR) model is then applied to synthesize the texture regions, which are combined with the other regions encoded by the traditional hybrid coding scheme. To demonstrate the performance, the proposed scheme is integrated into the H.264/AVC video coding system. Experimental results show that, on various sequences with different types of texture regions, the bit rate can be reduced by 15% to 58% while maintaining good perceptual quality.

16.
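Song Jianfei (宋健飞), Gao Li (高莉). Journal of Computer Applications (计算机应用), 2015, 35(3): 826-829
Color image edge detection based on luminance and chrominance ignores the correlation between the two during detection, so some edges are not detected effectively. To address this problem, an improved smallest univalue segment assimilating nucleus (SUSAN) edge detection algorithm based on quaternions is proposed. First, the principle of quaternion vector rotation is used to map the three-dimensional information of the HSI color space onto a two-dimensional plane, reducing the dimensionality, and a scalar V is introduced to represent jointly the relationship among the H, S, and I channels. Then the scalar V is used as the kernel function of the operator. Finally, the improved SUSAN operator performs edge detection on the image. Experimental results show that, for color images with the same hue but different saturation and for those with the same saturation but different hue, the proposed algorithm reduces the edge localization error rate by 1.5%. In practical applications it captures target information in the image better and provides better prior knowledge for subsequent segmentation and recognition research.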

17.
Discovery of a perceptual distance function for measuring image similarity   Total citations: 3, self-citations: 0, citations by others: 3
For more than a decade, researchers have actively explored the area of image/video analysis and retrieval. Yet one fundamental problem remains largely unsolved: how to measure perceptual similarity between two objects. For this purpose, most researchers employ a Minkowski-type metric. Unfortunately, the Minkowski metric does not reliably find similarities in objects that are obviously alike. Through mining a large set of visual data, our team has discovered a perceptual distance function. We call the discovered function the dynamic partial function (DPF). When we empirically compare DPF to Minkowski-type distance functions in image retrieval and in video shot-transition detection using our image features, DPF performs significantly better. The effectiveness of DPF can be explained by similarity theories in cognitive psychology.
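The contrast between a Minkowski-type metric and a partial distance can be sketched as follows. The abstract does not define DPF; the form used here, which aggregates only the m smallest per-feature differences, is an assumption about the general idea and should not be read as the authors' exact formulation. The feature vectors are invented.

```python
def minkowski_distance(x, y, r=2.0):
    """Minkowski distance of order r over all feature dimensions."""
    return sum(abs(a - b) ** r for a, b in zip(x, y)) ** (1.0 / r)


def partial_distance(x, y, m, r=2.0):
    """Distance over the m most similar dimensions only (assumed DPF form)."""
    if len(x) != len(y):
        raise ValueError("feature vectors must have the same length")
    if not 1 <= m <= len(x):
        raise ValueError("m must be between 1 and the number of features")
    # Keep the m smallest per-feature differences and drop the rest.
    smallest = sorted(abs(a - b) for a, b in zip(x, y))[:m]
    return sum(d ** r for d in smallest) ** (1.0 / r)


# Two objects that agree on three features and differ wildly on one.
features_a = [1.0, 2.0, 3.0, 100.0]
features_b = [1.0, 2.0, 3.0, 0.0]
full = minkowski_distance(features_a, features_b)
partial = partial_distance(features_a, features_b, m=3)
```

The single outlying feature dominates the Minkowski distance, while the partial distance treats the two objects as identical on the features where they agree. Because the set of retained dimensions changes from pair to pair, the function is "dynamic" in the sense the name suggests.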

18.
A good objective metric for image quality assessment (IQA) should be consistent with the subjective judgment of human beings. In this paper, a four-stage perceptual approach to full-reference IQA is presented. In the first stage, visual features are extracted by a 2-D Gabor filter, which models the receptive fields of simple cells in the primary visual cortex very well. In the second stage, the extracted features are post-processed by the divisive normalization transform to reflect the nonlinear mechanisms of the human visual system. In the third stage, the mutual information between the visual features of the reference and distorted images is used to measure visual quality. In the final pooling stage, the mutual information is converted to the final objective quality score. Experimental results show that the proposed metric has a high correlation with subjective assessment and outperforms other state-of-the-art metrics.

19.
Space-time super-resolution   Total citations: 3, self-citations: 0, citations by others: 3
We propose a method for constructing a video sequence of high space-time resolution by combining information from multiple low-resolution video sequences of the same dynamic scene. Super-resolution is performed simultaneously in time and in space. By "temporal super-resolution," we mean recovering rapid dynamic events that occur faster than regular frame-rate. Such dynamic events are not visible (or else are observed incorrectly) in any of the input sequences, even if these are played in "slow-motion." The spatial and temporal dimensions are very different in nature, yet are interrelated. This leads to interesting visual trade-offs in time and space and to new video applications. These include: 1) treatment of spatial artifacts (e.g., motion-blur) by increasing the temporal resolution and 2) combination of input sequences of different space-time resolutions (e.g., NTSC, PAL, and even high quality still images) to generate a high quality video sequence. We further analyze and compare characteristics of temporal super-resolution to those of spatial super-resolution. These include: the video cameras needed to obtain increased resolution; the upper bound on resolution improvement via super-resolution; and, the temporal analogue to the spatial "ringing" effect.

20.
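Objective: Video action quality assessment aims to evaluate how well a specific action in a video is performed and completed. Automated action quality assessment can substantially reduce the consumption of human resources and evaluate video content more accurately and fairly. Traditional action quality assessment methods face three main problems: 1) the multi-scale spatio-temporal features of the action subject in the video; 2) the inherent ambiguity of labels caused by differences in cognition; 3) the redundancy of attention heads in the multi-head self-attention mechanism. To address these problems, an action quality assessment model, SALDL (self-attention and label distribution learning), is proposed that can perceive different spatio-temporal positions in a video sequence and generate fine-grained labels. Method: SALDL introduces the Attention-Inc (attention-inception) structure, which progressively integrates self-attention into the Inception structure through embedding, multi-head self-attention, and a multilayer perceptron, allowing the model to obtain contextual information across convolutional features of different scales. A positive-negative temporal attention module, PNTA (pos-neg temporal attention), is proposed; it mines temporal attention features through a PNTA loss, thereby reducing self-attention head redundancy and extracting attention features of different clips. SALDL generates fine-grained action quality labels through label enhancement and label distribution learning. Results: Extensive comparison and ablation experiments were carried out on datasets including MTL-AQA (multitask learning-action quality assessment) and JIGSAWS (JHU-ISI gesture and skill assessment working set); the Spearman rank correlation coefficients were 0.9416 and 0.8183, respectively. Conclusion: By fully mining spatio-temporal features at different scales, SALDL resolves the multi-scale spatio-temporal feature problem, and by introducing prior knowledge consistent with the label distribution for label enhancement, it addresses the inherent ambiguity of labels and the redundancy of attention heads.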
