Similar Literature
19 similar documents found (search time: 171 ms)
1.
Style transfer applied to local image regions often suffers from style spillover, and stylization of small regions is barely noticeable. To address this, a style-transfer method for salient image regions is proposed. First, guided by the characteristics of the human visual attention mechanism, the salient regions in the training image dataset are annotated and a fast semantic segmentation model is trained on them, yielding a binary mask of each image's salient regions. Then, the network structure of the fast neural style transfer model is streamlined, and in the generator network...

2.
周莺  张基宏  梁永生  柳伟 《计算机科学》2015,42(11):118-122
To extract more accurately and effectively the salient regions that human eyes attend to when watching video, a spatio-temporal saliency-region extraction method based on visual motion characteristics is proposed. The method first obtains a spatial saliency map by analyzing the frequency-domain log spectrum of each video frame, and a temporal saliency map via global motion estimation and block matching. It then fuses the spatial and temporal maps dynamically, according to the visual characteristics of human video viewing and the subjective perception of videos with different motion characteristics. The experimental analysis covers both subjective and objective measures: visual inspection and quantitative metrics alike show that, compared with other classic methods, the salient regions extracted by the proposed method reflect human gaze regions more accurately.
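The frequency-domain log-spectrum step described above can be sketched roughly as follows (a minimal NumPy illustration in the spirit of spectral-residual saliency; the 3x3 log-spectrum smoothing and the normalization are assumptions, not details taken from the paper):

```python
import numpy as np

def spectral_saliency(frame):
    """Spatial saliency from the frequency-domain log spectrum of a frame.

    A minimal sketch only: the 3x3 averaging of the log spectrum and the
    max-normalization are assumptions, not the paper's exact recipe.
    """
    f = np.fft.fft2(frame)
    log_amp = np.log(np.abs(f) + 1e-8)
    phase = np.angle(f)
    # Spectral residual: log amplitude minus its local 3x3 average.
    pad = np.pad(log_amp, 1, mode='edge')
    h, w = log_amp.shape
    avg = sum(pad[i:i + h, j:j + w] for i in range(3) for j in range(3)) / 9.0
    residual = log_amp - avg
    # Back to the spatial domain: keep the phase, replace the amplitude.
    sal = np.abs(np.fft.ifft2(np.exp(residual + 1j * phase))) ** 2
    return sal / sal.max()
```

In the paper this spatial map would then be fused with a block-matching-based temporal map; only the spatial half is sketched here.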

3.
A moving-object tracking method based on the visual attention mechanism is proposed. Drawing on research into human visual attention, the method builds a computational attention model to measure the visual saliency of each part of a video and extracts the salient objects from the video frames accordingly. A color distribution model serves as the object's feature representation; tracking is achieved by matching this model against each salient object. Experiments on multiple video sequences are reported with results and analysis, showing that the proposed detection and tracking algorithm is correct and effective.

4.
In objective video quality assessment, agreement with subjective scores requires accounting for the dynamic nature of video and the characteristics of human vision. This paper therefore proposes a video quality metric weighted by salient regions and motion characteristics, built on and improving the traditional structural similarity (SSIM) index. Spatial saliency is obtained through spectral analysis; temporal saliency is obtained from a visual attention model combined with motion characteristics; the two are fused dynamically into a frame-level saliency measure. Weighting the SSIM index by frame-level saliency then yields a quality metric for the whole video. Experiments on the LIVE VQA benchmark show that the proposed metric agrees more closely with subjective human judgments of video quality.
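The frame-level saliency weighting of SSIM described above can be illustrated with a toy pooling function (the normalized linear weighting is an assumption; the paper's exact fusion rule and the SSIM computation itself are not reproduced here):

```python
import numpy as np

def saliency_weighted_quality(frame_ssim, frame_saliency):
    """Saliency-weighted pooling of per-frame SSIM scores into one video
    quality index.  A sketch of the weighting idea only: the normalized
    linear weighting is an assumption, not the paper's exact fusion rule.
    """
    ssim = np.asarray(frame_ssim, dtype=float)
    w = np.asarray(frame_saliency, dtype=float)
    w = w / w.sum()                # weights sum to 1
    return float(np.dot(ssim, w))  # salient frames count more
```

With uniform saliency this reduces to the plain per-frame mean, so the weighting only changes the score where saliency actually varies.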

5.
A video segmentation algorithm based on a spatio-temporal attention model
To address the mis-segmentation produced by existing video segmentation algorithms under complex dynamic backgrounds, this paper constructs spatio-temporal attention features from saliency maps and segments video with a hierarchical conditional random field, improving segmentation accuracy. The algorithm first extracts temporal and spatial features according to visual attention theory and builds a weighted mixture model. It then uses this mixture model to compute the probability distribution of the moving object's saliency map, effectively extracting the moving-object region. Finally, on top of this saliency distribution, Gaussian mixture models define foreground and background energy functions, and a hierarchical conditional random field built over these energies performs the segmentation, precisely extracting the moving object. Experimental results show that the algorithm yields stable segmentation even for videos with complex dynamic backgrounds, effectively removing the mis-segmentation caused by camera motion and similar factors.

6.
Objective: To study pedestrian detection across multiple scenes, a pedestrian detection method based on semantic features under a visual attention mechanism is proposed. Method: First, on top of low-level visual features, the semantic feature of pedestrian skin color is added, and bottom-up data-driven attention is combined with top-down task-driven attention to build a static spatial attention model. Next, using the semantic feature of motion information, motion saliency is computed from the entropy of motion vectors to build a dynamic temporal attention model. On this basis, a spatio-temporally fused attention model is constructed through weighted feature fusion, producing a visual saliency map; pedestrians are then detected by selecting the foci of attention. Results: Experiments on standard datasets and self-captured video, run on the Matlab R2012a platform and compared against other visual attention models, show good pedestrian detection performance, with a detection accuracy of 93% on the test videos. Conclusion: The method is robust across different scenes and can be used to improve the intelligence of existing video surveillance systems.
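The motion-vector-entropy cue used for the dynamic attention model above might look like this in outline (the 8-bin direction histogram is an illustrative assumption; uniform motion gives entropy 0, scattered motion gives high entropy):

```python
import numpy as np

def motion_direction_entropy(motion_vectors, bins=8):
    """Motion saliency cue from the entropy of motion-vector directions.

    `motion_vectors` is an (N, 2) array of (dx, dy); the 8-bin direction
    histogram is an illustrative assumption, not the paper's exact design.
    """
    mv = np.asarray(motion_vectors, dtype=float)
    angles = np.arctan2(mv[:, 1], mv[:, 0])             # directions in [-pi, pi]
    hist, _ = np.histogram(angles, bins=bins, range=(-np.pi, np.pi))
    p = hist / hist.sum()
    p = p[p > 0]                                        # ignore empty bins
    return float(-(p * np.log2(p)).sum())
```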

7.
罗晓林  罗雷 《计算机科学》2016,43(Z6):171-174, 183
For the multi-view video compression problem, a coding algorithm based on visual saliency analysis is proposed. Exploiting the fact that the human eye is more sensitive to distortion in salient regions, the algorithm improves multi-view coding efficiency by controlling the coding quality of salient versus non-salient regions. First, a video saliency filter that fuses color and motion information extracts pixel-level visual saliency maps for the multi-view video frames. The saliency maps of all views are then converted to macroblock-level saliency. Finally, following the principles of perceptual video coding, macroblock quality is adaptively controlled according to saliency. Experimental results show that the algorithm effectively improves the rate-distortion efficiency and subjective quality of multi-view video coding.
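The saliency-driven macroblock quality control described above can be caricatured as a QP-offset mapping (`saliency_qp_offset` is a hypothetical helper; the linear rule and the +/-6 range are assumptions, not the paper's rate-control scheme):

```python
def saliency_qp_offset(block_saliency, max_offset=6):
    """Map a macroblock saliency value in [0, 1] to a QP offset: salient
    blocks get a negative offset (finer quantization, higher quality),
    non-salient blocks a positive one.  The linear mapping and the +/-6
    range are illustrative assumptions, not the paper's control rule."""
    s = min(max(float(block_saliency), 0.0), 1.0)
    return int(round(max_offset * (1.0 - 2.0 * s)))
```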

8.
In remote sensing imagery, roads appear as narrow, elongated linear targets of similar color, texture, and shape, so the road network as a whole forms a highly salient pattern that readily attracts the human eye; we call such areas regions of interest. A region of interest is the part of a scene that most engages the user and best conveys the image content, and research in visual cognition shows that the visual attention mechanism can simulate the human observation process to find the salient regions of a remote sensing image. This paper proposes using the visual attention mechanism to assist road-network extraction from remote sensing imagery: the salient regions of the image are analyzed and processed to obtain the final road network. Comparative experiments show that the algorithm effectively improves the accuracy and completeness of road-network extraction.

9.
A selective visual attention model based on the pulsed cosine transform
A visual attention model based on the pulsed cosine transform is proposed that mimics the formation of bottom-up visual attention. The model has a simple structure and is fast to compute, making it suitable for real-time processing systems. In this model, visual saliency is represented as binary codes, consistent with the spiking behavior of neurons in the human brain, and motion saliency can also be generated from these binary codes. The model further generalizes to a neural network based on the Hebbian learning rule. Experimental results show that it outperforms other classic visual attention models at predicting human fixation points.
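The "sign of the DCT" idea behind the pulsed cosine transform can be sketched as follows (a NumPy-only illustration; the omitted post-smoothing and the normalization are assumptions, not the paper's exact pipeline):

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix (its inverse is simply the transpose)."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m

def pct_saliency(img):
    """Pulsed cosine transform saliency sketch: keep only the signs of the
    2D DCT coefficients (a binary 'spike' code), invert the transform, and
    square.  Post-smoothing is omitted; these details are assumptions."""
    h, w = img.shape
    dh, dw = dct_matrix(h), dct_matrix(w)
    coeff = dh @ img @ dw.T            # 2D DCT
    pulses = np.sign(coeff)            # binary-like pulse code
    recon = dh.T @ pulses @ dw         # inverse 2D DCT of the sign map
    sal = recon ** 2
    return sal / sal.max()
```

Discarding the coefficient magnitudes and keeping only their signs is what makes the representation binary, matching the spike-code interpretation in the abstract.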

10.
A deep-learning-based method for detecting salient image regions is proposed. The method separately extracts the low-level contrast features and high-level semantic features involved in the two kinds of visual attention, then trains a model on both feature types, yielding a classification-based saliency-region detection model, the SCS detection model. Comparative experiments show that the trained model has a clear advantage in detection accuracy.

11.
Objective: Classic fixation-prediction models usually fuse high- and low-level features via skip connections, which makes it difficult to balance the importance of features across levels, and they ignore the human tendency to look toward the center of an image. This paper proposes an image feature extraction method that incorporates attention mechanisms and optimizes the extracted features with a Gaussian learning module, improving fixation-prediction accuracy. Method: A new fixation-prediction model based on a multiple attention mechanism (MAM) applies three different attention mechanisms to weight, in space, channel, and layer respectively, the features extracted by a ResNet-50 augmented with dilated convolutions. The network consists of a feature extraction module, a multiple attention module, and a Gaussian learning optimization module. The dilated convolutions capture receptive fields of different sizes while keeping the feature-map resolution unchanged; the multiple attention module automatically balances rich low-level detail against high-level global semantics and fully exploits channel and spatial information in the feature maps, preventing over-reliance on the model's high-level features; the Gaussian learning module automatically selects a suitable Gaussian blur kernel for the saliency map, addressing the center bias of human viewing. Results: Experiments on the public SALICON (saliency in context) dataset show that, compared with...

12.

Saliency prediction models provide a probabilistic map of the relative likelihood of an image or video region to attract the attention of the human visual system. Over the past decade, many computational saliency prediction models have been proposed for 2D images and videos. Considering that the human visual system has evolved in a natural 3D environment, it is only natural to want to design visual attention models for 3D content. Existing monocular saliency models are not able to accurately predict the attentive regions when applied to 3D image/video content, as they do not incorporate depth information. This paper explores stereoscopic video saliency prediction by exploiting both low-level attributes such as brightness, color, texture, orientation, motion, and depth, as well as high-level cues such as face, person, vehicle, animal, text, and horizon. Our model starts with a rough segmentation and quantifies several intuitive observations such as the effects of visual discomfort level, depth abruptness, motion acceleration, elements of surprise, size and compactness of the salient regions, and emphasizing only a few salient objects in a scene. A new fovea-based model of spatial distance between image regions is adopted for local and global feature calculations. To efficiently fuse the conspicuity maps generated by our method into one single saliency map that is highly correlated with the eye-fixation data, a random-forest-based algorithm is utilized. The performance of the proposed saliency model is evaluated against the results of an eye-tracking experiment, which involved 24 subjects and an in-house database of 61 captured stereoscopic videos. Our stereo video database as well as the eye-tracking data are publicly available along with this paper. Experimental results show that the proposed saliency prediction method achieves competitive performance compared to the state-of-the-art approaches.


13.
Moving-object tracking using visual saliency and particle filtering
For the moving-object tracking problem, a tracking algorithm using visual saliency and particle filtering is proposed. Drawing on research into the human visual attention mechanism, a visual saliency feature is formed from the target's color, intensity, and motion, and together with the target's color distribution model serves as the target's feature representation; tracking is then performed with a particle filter. The algorithm overcomes the instability of tracking with a single color feature and effectively handles the difficulties caused by target deformation, illumination changes, and similar color distributions of target and background, giving it strong robustness. Experiments on multiple video sequences, with results and analysis, show that the algorithm tracks moving objects correctly and effectively.
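A standard way to score particles against a color distribution model in trackers of this kind is the Bhattacharyya coefficient between histograms (using it directly as the particle weight is an illustrative assumption, not necessarily the paper's exact likelihood):

```python
import numpy as np

def bhattacharyya_coeff(p, q):
    """Similarity between two color histograms, a common particle-scoring
    measure in color-based particle filters: 1 for identical normalized
    histograms, 0 for disjoint ones.  Its use as the raw particle weight
    here is an illustrative assumption."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sqrt(p * q).sum())
```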

14.
Objective: Stereoscopic video is increasingly popular for its immersive realism, and visual saliency detection can automatically predict, locate, and mine important visual information, helping machines filter massive amounts of multimedia data. To improve salient-region detection in stereoscopic video, a stereoscopic video saliency model fusing multi-dimensional binocular perception is proposed. Method: Saliency is computed along three dimensions of stereoscopic video: spatial, depth, and temporal. First, a 2D image saliency map is computed from spatial image features with a Bayesian model; next, a depth saliency map of the stereoscopic frames is obtained from binocular perception features; then the motion features of local regions between frames are computed with the Lucas-Kanade optical flow method to obtain a temporal saliency map. Finally, the three maps are fused with a global-regional difference-based fusion method to obtain the final distribution of salient regions in the stereoscopic video. Results: Experiments on stereoscopic video sequences of different types show that the model achieves 80% precision and 72% recall at relatively low computational complexity, outperforming existing saliency detection models. Conclusion: The model effectively finds the salient regions of stereoscopic video and can be applied to stereoscopic video/image coding and stereoscopic video/image quality assessment.

15.
Visual saliency is an important research topic in the field of computer vision due to its numerous possible applications. It helps to focus on regions of interest instead of processing the whole image or video data. Detecting visual saliency in still images has been widely addressed in the literature with several formulations. However, visual saliency detection in videos has attracted little attention, and is a more challenging task due to the additional temporal information. A common approach for obtaining a spatio-temporal saliency map is to combine a static saliency map and a dynamic saliency map. In our work, we model the dynamic textures in a dynamic scene with local binary patterns to compute the dynamic saliency map, and we use color features to compute the static saliency map. Both saliency maps are computed using a bio-inspired mechanism of the human visual system with a discriminant formulation known as center-surround saliency, and are fused in a proper way. The proposed model has been extensively evaluated on diverse publicly available datasets containing several videos of dynamic scenes, and comparison with state-of-the-art methods shows that it achieves competitive results.
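The local-binary-pattern texture codes used for the dynamic map above can be computed, in their most basic form, like this (an unoptimized sketch; the exact LBP variant used in the paper may differ):

```python
import numpy as np

def lbp8(img):
    """8-neighbour local binary pattern codes for the interior pixels of a
    grayscale frame: each pixel gets one bit per neighbour whose value is
    >= the center.  This basic, unoptimized LBP variant is an assumption;
    papers often use rotation-invariant or uniform variants instead."""
    c = img[1:-1, 1:-1]                       # centers
    neighbours = [img[0:-2, 0:-2], img[0:-2, 1:-1], img[0:-2, 2:],
                  img[1:-1, 2:],   img[2:,   2:],   img[2:,   1:-1],
                  img[2:,   0:-2], img[1:-1, 0:-2]]
    code = np.zeros(c.shape, dtype=np.int32)
    for bit, n in enumerate(neighbours):
        code |= (n >= c).astype(np.int32) << bit
    return code
```

Histograms of these codes over space and time characterize the dynamic texture of a region; temporal changes in the histograms drive the dynamic saliency map.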

16.
In this paper we propose a system for the analysis of user generated video (UGV). UGV often has a rich camera motion structure that is generated at the time the video is recorded by the person taking the video, i.e., the "camera person." We exploit this structure by defining a new concept known as camera view for temporal segmentation of UGV. The segmentation provides a video summary with unique properties that is useful in applications such as video annotation. Camera motion is also a powerful feature for the identification of keyframes and regions of interest (ROIs), since it is an indicator of the camera person's interests in the scene and can also attract the viewers' attention. We propose a new location-based saliency map which is generated based on camera motion parameters. This map is combined with other saliency maps generated using features such as color contrast, object motion, and face detection to determine the ROIs. In order to evaluate our methods we conducted several user studies. A subjective evaluation indicated that our system produces results that are consistent with viewers' preferences. We also examined the effect of camera motion on human visual attention through an eye-tracking experiment. The results showed a high dependency between the distribution of the viewers' fixation points and the direction of camera movement, which is consistent with our location-based saliency map.

17.
There is a need to detect regions of small defects against a large background when in-line product surface quality is inspected by machine vision systems. A computational model of visual attention was developed to solve this problem, inspired by the behavior and neuronal architecture of human visual attention. First, a global feature is extracted from the input image with Laws' texture masks; then local features are extracted and evaluated with an improved version of Itti's saliency map model. The local features are fused into a single topographical saliency map by a multi-feature fusion operator that differs from the Itti model in that better features receive higher weighting coefficients and contribute more to the fused feature images. Finally, the defect regions "pop out" in the map. Experimental results show that the model can locate regions of interest and exclude most background regions.

18.
杨凡  蔡超 《计算机应用》2016,36(11):3217-3221
To address the shortcomings of existing visual attention models in integrating object features, a new visual attention method combining high-level object features with low-level pixel features is proposed. First, the strong multi-class object understanding of a trained convolutional neural network (CNN) is used to obtain high-level object feature maps for the input image. Then, using real eye-tracking data, weighting coefficients for the multiple object feature maps are trained, producing an object-level conspicuity map. Next, a pixel-level conspicuity map is extracted and fused with the object-level map to obtain the saliency map. Finally, the method is validated on the OSIE and MIT datasets and compared with popular visual attention methods; the proposed algorithm achieves a relatively higher AUC on the OSIE dataset. The experimental results show that the method exploits the object information in images more fully and improves the accuracy of saliency prediction.

19.
This paper presents a spatio-temporal saliency model that predicts eye movement during free viewing of video. The model is inspired by the biology of the first stages of the human visual system. It extracts two signals from the video stream corresponding to the two main outputs of the retina: parvocellular and magnocellular. Both signals are then split into elementary feature maps by cortical-like filters. These feature maps are used to form two saliency maps: a static one and a dynamic one, which are then fused into a spatio-temporal saliency map. The model is evaluated by comparing the salient areas of each frame predicted by the spatio-temporal saliency map with the eye positions of different subjects during a free-viewing experiment on a large database (17,000 frames). In parallel, the static and dynamic pathways are analyzed to understand what is more or less salient and for which types of video our model is a good or a poor predictor of eye movement.
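The fusion of the static and dynamic maps into one spatio-temporal map can be sketched as a normalized blend (the fixed weight `alpha` is an assumption; the paper's fusion rule may be adaptive):

```python
import numpy as np

def fuse_static_dynamic(static_map, dynamic_map, alpha=0.5):
    """Fuse a static and a dynamic saliency map into one spatio-temporal
    map.  Each map is max-normalized to a common scale first; the fixed
    blend weight `alpha` is an assumption (the paper's fusion may adapt
    the weights to the video content)."""
    s = static_map / (static_map.max() + 1e-12)
    d = dynamic_map / (dynamic_map.max() + 1e-12)
    return alpha * s + (1.0 - alpha) * d
```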
