期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

仲亚丽张净黄河《软件》2011,32(4):30-33

为了克服无线多媒体传感器网络中的视频传感节点的处理能力、存储空间和能量严重受限的问题,本文提出一种基于人的视觉系统的分布式视频编码的方案。由于人眼睛的潜在的感受性和视觉掩蔽性能,最小可觉差值以下的变化几乎不能被人的视觉系统感知到。因此,原始帧和边信息之间在感觉阀限之外的信号误差不需要得到纠正。本文基于这种思想,将JND模型应用到WZ编码器中,并对其仿真,实验结果表明,本文所提出的算法在不改变主观质量,甚至提高主观质量时可以较大程度对视频序列进行压缩,降低比特数,具有较好的率失真性能。相似文献

2.

Adaptively imperceptible video watermarking based on the local motion entropy

Zhi Li Xiao-Wei Chen Jianhua Ma 《Multimedia Tools and Applications》2015,74(8):2781-2802

This paper proposes an adaptively imperceptible video watermarking algorithm using the entropy model for local motion characterization. The algorithm firstly combines Human Visual System (HVS) with the block- matching techniques to get the motion-related information. Then it utilizes the entropy model to statistically analyze above motion-related information to obtain the motion entropy of frame. Successively, this algorithm divides every frame into local regions, and then local motion entropy can be obtained according to the motion-related information in local region. By combining the local motion entropy with the motion entropy of frame, the motion characteristics visual masking is adaptively calculated. Based on the motion characteristics visual masking and the contents of video frames, the maximum strength of watermarking is calculated in every block. Experiments indicate that using entropy to local motion characteristics can significantly improve the watermarking imperceptibility, effectively resist common attacks for video watermarking and consequently achieve higher robustness. 相似文献

3.

基于JND的小波域水印算法

杨琦李乔良刘娟梅《计算机与现代化》2008,(3):80-82

基于JND的小波域水印算法,本文利用人类视觉系统的对比感性和掩蔽现象对图像的分块进行分类,然后将水印以不同的强度系数进行嵌入,这有效地兼顾了稳健性和不可感知性的要求。实验表明该方法对常见的图像处理有良好的鲁棒性。相似文献

4.

A coherent computational approach to model bottom-up visual attention 总被引：5，自引：0，他引：5

Le Meur O Le Callet P Barba D Thoreau D 《IEEE transactions on pattern analysis and machine intelligence》2006,28(5):802-817

Visual attention is a mechanism which filters out redundant visual information and detects the most relevant parts of our visual field. Automatic determination of the most visually relevant areas would be useful in many applications such as image and video coding, watermarking, video browsing, and quality assessment. Many research groups are currently investigating computational modeling of the visual attention system. The first published computational models have been based on some basic and well-understood human visual system (HVS) properties. These models feature a single perceptual layer that simulates only one aspect of the visual system. More recent models integrate complex features of the HVS and simulate hierarchical perceptual representation of the visual input. The bottom-up mechanism is the most occurring feature found in modern models. This mechanism refers to involuntary attention (i.e., salient spatial visual features that effortlessly or involuntary attract our attention). This paper presents a coherent computational approach to the modeling of the bottom-up visual attention. This model is mainly based on the current understanding of the HVS behavior. Contrast sensitivity functions, perceptual decomposition, visual masking, and center-surround interactions are some of the features implemented in this model. The performances of this algorithm are assessed by using natural images and experimental measurements from an eye-tracking system. Two adequate well-known metrics (correlation coefficient and Kullbacl-Leibler divergence) are used to validate this model. A further metric is also defined. The results from this model are finally compared to those from a reference bottom-up model. 相似文献

5.

人类深度感知立体图像质量评价方法 总被引：1，自引：0，他引：1

下载免费PDF全文

东野升云王世刚韦健陈丽伟吕源治《中国图象图形学报》2011,16(9):1758-1762

立体图像质量评价是感知与显示的基础,也是立体视频系统设计的依据。由于图像的最终接受者是人,所以评价图像质量的关键在于其是否符合人类的视觉系统特性。通过对视觉非线性、对比敏感度、多通道结构和掩蔽效应等人类视觉特性的分析,提出一种符合人类主观感知的立体图像质量评价方法。该方法首先对图像进行5级小波分解,将图像的空间频率按视觉系统的掩盖效应特点分成6个频带分别进行滤波,以改变原始图像的空间频率,然后对每个频带进行相似度度量。根据对比敏感度特性对各个频带的质量评价结果进行加权平均,得到最终的质量评价尺度。实验结果表明,本文方法优于传统的客观质量评价方法,与人的主观感受有更好的一致性,能够反映图像的质量以及立体感。相似文献

6.

联合时空SIFT特征的同源视频检测 总被引：2，自引：1，他引：1

张瑞年于洪涛李邵梅《电子技术应用》2012,38(3):130-133

通过对视频帧序列时空特性的分析,采用"局部趋同,全局异化"的策略,提出了一种联合时域和空域SIFT点特征的特征提取方法。实验表明,基于该特征的同源视频检测方法对于一定的视频变化具有较好的鲁棒性和检测精度。相似文献

7.

符合人眼视觉特性的视频质量评价模型 总被引：12，自引：3，他引：12

下载免费PDF全文

王楠楠李桂苓《中国图象图形学报》2001,6(6):523-527

视频技术的发展为其质量评价的出了新的课题,但由于评价图像质量的关键在于所用视觉模型是否符合人的感知特性,因此评价图象质量必须考虑以视觉锐度,对比度敏感度,多通道结构和掩盖特性为基础的人眼视觉特性（HVS）,为了使人们对基于人眼视觉特性的视频质量评价模型研究现状有所了解,介绍了几种目前比较成功的基于HVS的视频质量评价模型,并分析和总结了它们的性能,最后展望了评价模型的发展。相似文献

8.

分形理论引导的图像临界差异感知阈值估计

下载免费PDF全文

郭嘉骏姜求平邵枫《中国图象图形学报》2022,27(11):3303-3315

目的图像的临界差异(just noticeable difference, JND)阈值估计对提升图像压缩比以及信息隐藏效率具有重要意义。亮度适应性和空域掩蔽效应是决定JND阈值大小的两大核心因素。现有的空域掩蔽模型主要考虑对比度掩蔽和纹理掩蔽两方面。然而,当前采用的纹理掩蔽模型不能有效地描述与纹理粗糙度相关的掩蔽效应对图像JND阈值的影响。对此,本文提出一种基于分形理论的JND阈值估计模型。方法首先,考虑到人眼视觉系统对具有粗糙表面的图像内容变化具有较低的分辨能力,通过经典的分形理论来计算图像局部区域的分形维数,并以此作为对纹理粗糙度的度量,并在此基础上提出一种新的基于纹理粗糙度的纹理掩蔽模型。然后,将提出的纹理掩蔽模型与传统的亮度适应性相结合估计得到初步的JND阈值。最后,考虑到人眼的视觉注意机制,进一步考虑图像内容的视觉显著性,对JND阈值进行感知一致性修正,估计得到最终的JND阈值。结果选取4种相关方法进行对比,结果表明,在注入相同甚至更多噪声的情况下,相较于对比方法中的最优结果,本文方法的平均VSI(visual saliency-induced index)和平均MO... 相似文献

9.

面向视频压缩的显著性协同检测JND模型

李承欣叶锋涂钦陈家祯许力《计算机系统应用》2016,25(11):208-215

为了更好的将人眼感知特性用于视频压缩系统,提出了一种改进的基于显著性协同检测的恰可察觉失真模型（Just Noticeable Distortion,JND）.该模型通过像素域和变换域下联合建模计算得到的最优JND模型,基于上下文感知的显著性算法得到相应的显著图,并将检测结果用于JND模型权值分配.提出的JND残差滤波器可以嵌入到HEVC视频编码框架中.实验结果表明:在全I帧配置下,提出的算法编码结果与HM16相比,在视觉主观感知质量一致的情况下,平均码率可节省10.7%. 相似文献

10.

Continuous frame motion sensitive self-supervised collaborative network for video representation learning

《Advanced Engineering Informatics》2023

Motion, as a feature of video that changes in temporal sequences, is crucial to visual understanding. The powerful video representation and extraction models are typically able to focus attention on motion features in challenging dynamic environments to complete more complex video understanding tasks. However, previous approaches discriminate mainly based on similar features in the spatial or temporal domain, ignoring the interdependence of consecutive video frames. In this paper, we propose the motion sensitive self-supervised collaborative network, a video representation learning framework that exploits a pretext task to assist feature comparison and strengthen the spatiotemporal discrimination power of the model. Specifically, we first propose the motion-aware module, which extracts consecutive motion features from the spatial regions by frame difference. The global–local contrastive module is then introduced, with context and enhanced video snippets being defined as appropriate positive samples for a broader feature similarity comparison. Finally, we introduce the snippet operation prediction module, which further assists contrastive learning to obtain more reliable global semantics by sensing changes in continuous frame features. Experimental results demonstrate that our work can effectively extract robust motion features and achieve competitive performance compared with other state-of-the-art self-supervised methods on downstream action recognition and video retrieval tasks. 相似文献

11.

Bottom-up spatiotemporal visual attention model for video analysis 总被引：3，自引：0，他引：3

Rapantzikos K. Tsapatsoulis N. Avrithis Y. Kollias S. 《Image Processing, IET》2007,1(2):237-248

The human visual system (HVS) has the ability to fixate quickly on the most informative (salient) regions of a scene and therefore reducing the inherent visual uncertainty. Computational visual attention (VA) schemes have been proposed to account for this important characteristic of the HVS. A video analysis framework based on a spatiotemporal VA model is presented. A novel scheme has been proposed for generating saliency in video sequences by taking into account both the spatial extent and dynamic evolution of regions. To achieve this goal, a common, image-oriented computational model of saliency-based visual attention is extended to handle spatiotemporal analysis of video in a volumetric framework. The main claim is that attention acts as an efficient preprocessing step to obtain a compact representation of the visual content in the form of salient events/objects. The model has been implemented, and qualitative as well as quantitative examples illustrating its performance are shown. 相似文献

12.

Wavelet-based image watermarking with visibility range estimation based on HVS and neural networks

Hung-Hsu Tsai^{Author Vitae} Chi-Chih Liu Author Vitae 《Pattern recognition》2011,44(4):751-763

This work proposes a wavelet-based image watermarking (WIW) technique, based on the human visible system (HVS) model and neural networks, for image copyright protection. A characteristic of the HVS, which is called the just noticeable difference (JND) profile, is employed in the watermark embedding to enhance the imperceptibility of the technique. First, we derive the allowable visibility ranges of the JND thresholds for all coefficients of a wavelet-transformed image. The WIW technique exploits the ranges to compute the adaptive strengths to be superimposed in the wavelet coefficients while embedding watermarks. An artificial neural network (ANN) is then used to memorize the relationships between the original wavelet coefficients and its watermark version. Consequently, the trained ANN is utilized for estimating the watermark without the original image. Many existing schemes require the original image to be involved in the calculation of the JND profile of the image. Finally, computer simulations demonstrate that both transparency and robustness of the WIW technique are superior to that of other proposed methods. 相似文献

13.

基于主观质量的JPEGXR量化参数选择

刘致远陈耀武《计算机工程》2014,(1):239-245

在JPEGXR图像标准的基础上,提出一种提高其压缩效率的编码方法。该方法利用人类视觉系统对图片的感知特点,设计基于图像内容的自适应量化参数选择算法。根据最小可觉差模型,以图像的局部纹理和局部亮度为参数,将图像压缩过程中的宏块分为6类,对每类宏块的直流、低频、高频系数赋予不同的量化参数,从而使得整幅图像的码率根据纹理复杂度和亮度合理分布,在保持主观质量不变的情况下,减小图像码率,最终提高压缩效率。实验结果表明,相对于固定量化参数算法,该算法可使图像压缩效率得到最高10％的提升。相似文献

14.

Disparity-based just-noticeable-difference model for perceptual stereoscopic video coding using depth of focus blur effect

《Displays》2016

Human 3D perception provides an important clue to the removal of redundancy in stereoscopic 3D (S3D) videos. Because objects outside the binocular fusion limit cannot be fused on retina, the human visual system (HVS) makes them blur according to the depth-of-focus (DOF) effect to increase the binocular fusion limit and suppress diplopia, i.e. double vision. Based on human depth perception, we propose a disparity-based just-noticeable-difference model (DJND) to save bit-rate and improve visual comfort in S3D videos. We combine the DOF blur effect with conventional JND models in the pixel domain into DJND. Firstly, we use disparity information to get the average disparity value of each block. Then, we integrate the DOF blur effect into luminance JND (LJND) by a selective low pass Gaussian filter to minimize the visual stimulus in S3D videos. Finally, we incorporate disparity information into the filtered JND models to obtain DJND. Experimental results demonstrate that the proposed method successfully improves both image quality and visual comfort in viewing S3D videos without increasing the bit-rate. 相似文献

15.

Compressed-domain-based no-reference video quality assessment model considering fast motion and scene change

Hong Zhang Fan Li Na Li 《Multimedia Tools and Applications》2017,76(7):9485-9502

Due to the variability of wireless channel state, video quality monitoring became very important for guaranteeing users’ Quality of Experience (QoE). QoE presents the overall perceptual quality of service from the subjective users’ perspective. However, because of diverse characteristics of video content, Human Visual System (HVS) cannot give the same attention to whole scene simultaneously when facing video sequence. In this paper, we proposed a video quality assessment model by considering the influence of fast motion and scene change. The motion change contribution factor and scene change contribution factor are defined to quantify the characteristics of video content, which is closely related to the users’ QoE. Based on G.1070, our proposed model considers the influential factors of loss nature of video coding, variability of practical network and video features. Also, the proposed model owns low computational complexity due to the compressed domain approach for the estimation of the model parameters. Therefore, the video quality is assessed without fully decoding the video stream. The performance of our proposed model has been compared with five existing models and the results also shown that our model has high prediction accuracy closing to human perception. 相似文献

16.

Perceptually adaptive spread transform image watermarking scheme using Hadamard transform

Santi P. Maity Malay K. Kundu 《Information Sciences》2011,181(3):450-465

The present paper proposes a digital image watermarking scheme using the characteristics of the human visual system (HVS), spread transform technique and statistical information measure. Spread transform (ST) scheme is implemented using the transform coefficients of both the host and the watermark signal. Watermark embedding strength is adaptively adjusted using frequency sensitivity, luminance, contrast and entropy masking of HVS model. The choice of Hadamard transform as watermark embedding domain offers several advantages, such as low loss in image information (higher image fidelity), greater reliability of watermark detection and higher data hiding capacity at high degree of compression. Performance of the proposed method is compared with a number of recently reported watermarking schemes based on spread spectrum (SS) and quantization index modulation (QIM). 相似文献

17.

A robust transformer GAN for unpaired data makeup transfer

Yan Chen Jiajian Xie Jiajun Xue Hui Wang 《Concurrency and Computation》2024,36(9):e7994

The objective of makeup transfer is to apply the makeup style of, thereby creating the similar appearance as if it was professionally done. This technique has significant practical applications in fashion, beauty, and video special effects industries. However, there are several challenges faced by current makeup transfer models: (1) Low-resolution images can only achieve partial makeup transfer in mainstream models. (2) Difficulty arises in obtaining paired data consisting of both makeup and non-makeup images. (3) Spatial displacement occurs due to differences in subject and pose between reference and source images, affecting corresponding feature regions. (4) Mainstream models primarily focus on local feature characteristics while lacking global feature perception. To address these challenges, this paper proposes a nonpaired data makeup transfer model based on swin transformer generative adversarial networks. Additionally, an improved progressive generative adversarial network model (PSC-GAN), incorporating semantic perception and channel attention mechanisms, is proposed to enhance the effectiveness of makeup transfer. 相似文献

18.

面向手术器械语义分割的半监督时空Transformer网络

李耀仟李才子刘瑞强司伟鑫金玥明王平安《软件学报》2022,33(4):1501-1515

基于内窥镜的微创手术机器人在临床上的应用日益广泛,为医生提供内窥镜视频中精准的手术器械分割信息,对提高医生操作的准确度、改善患者预后有重要意义.现阶段,深度学习框架训练手术器械分割模型需要大量精准标注的术中视频数据,然而视频数据标注成本较高,在一定程度上限制了深度学习在该任务上的应用.目前的半监督方法通过预测与插帧,可... 相似文献

19.

Perceptual auto-regressive texture synthesis for video coding

Zhihua Bao Chen Xu Chong Wang 《Multimedia Tools and Applications》2013,64(3):535-547

Traditional video compression methods consider the statistical redundancy among pixels as the only adversary of compression, with the perceptual redundancy totally neglected. However, it is well-known that none criterion is as eloquent as the visual quality of an image. To reach higher compression ratios without perceptually degrading the reconstructed signal, the properties of the human visual system (HVS) need to be better exploited. Recent research indicates that HVS has different sensitivities towards different image content, based on which a novel perceptual video coding method is explored in this paper to achieve better perceptual coding quality while spending fewer bits. A new texture segmentation method exploiting just noticeable distortion (JND) profile is first devised to detect and classify texture regions in video scenes. To effectively remove temporal redundancies while preserving high visual quality, an auto-regressive (AR) model is then applied to synthesize the texture regions and combine with other regions which are encoded by the traditional hybrid coding scheme. To demonstrate the performance, the proposed scheme is integrated into the H.264/AVC video coding system. Experimental results show that on various sequences with different types of texture regions, we can reduce the bit-rate for 15% to 58% while maintaining good perceptual quality. 相似文献

20.

Modelling salient visual dynamics in videos

Duan-Yu Chen 《Multimedia Tools and Applications》2011,53(1):271-284

Automatic video annotation is a critical step for content-based video retrieval and browsing. Detecting the focus of interest in video frames automatically can benefit the tedious manual labeling process. However, producing an appropriate extent of visually salient regions in video sequences is a challenging task. Therefore, in this work, we propose a novel approach for modeling dynamic visual attention based on spatiotemporal analysis. Our model first detects salient points in three-dimensional video volumes, and then uses the points as seeds to search the extent of salient regions in a novel motion attention map. To determine the extent of attended regions, we use the maximum entropy in the spatial domain to analyze the dynamics derived by spatiotemporal analysis. Our experiment results show that the proposed dynamic visual attention model achieves high precision value of 70% and reveals its robustness in successive video volumes. 相似文献