1.
2.
3.
Existing visual-substitution methods obtain their mapping features through object recognition in specific environments and therefore lack general applicability. To address this problem, a visual-substitution method based on an attention model is proposed. Exploiting the characteristics of human vision, regions of interest are extracted from the image, and, following the basic principles of auditory display, a PSC mapping method is proposed that maps the position, size, and color of each region of interest to the loudness, duration, and pitch of a musical note. Experimental results show that mapping the visually attended regions of an image to electronic notes conforms to the human visual cognition process, helps blind users obtain important information about the external environment, reduces the difficulty of training and learning for the blind, sounds pleasant, and does not cause auditory fatigue.
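The abstract does not give the exact PSC parameter ranges; the sketch below illustrates one plausible position/size/color-to-note mapping. The `psc_map` helper, the MIDI ranges, and the ROI fields are illustrative assumptions, not the paper's specification.

```python
def psc_map(roi, img_w, img_h):
    """Map one region of interest to a (velocity, duration, pitch) note.
    roi: dict with 'cx', 'cy' (center, pixels), 'area' (pixels), 'hue' (0-179)."""
    # Position -> loudness: regions nearer the image center are assumed more
    # important and are rendered louder (MIDI velocity 40..127).
    dx = abs(roi['cx'] - img_w / 2) / (img_w / 2)
    dy = abs(roi['cy'] - img_h / 2) / (img_h / 2)
    centrality = 1.0 - min(1.0, (dx ** 2 + dy ** 2) ** 0.5)
    velocity = int(40 + 87 * centrality)

    # Size -> note length: larger regions get longer notes (0.2 s .. 1.0 s).
    duration = 0.2 + 0.8 * min(1.0, roi['area'] / (0.25 * img_w * img_h))

    # Color -> pitch: hue mapped linearly onto one octave above middle C.
    pitch = 60 + int(12 * roi['hue'] / 180.0)
    return velocity, duration, pitch
```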
4.
A region-of-interest detection method based on a bottom-up computational model of visual attention is analyzed: color, intensity, and texture feature maps are extracted separately and then fused linearly into an overall saliency map. Because a salient object usually has similar gray levels within itself but differs in gray level from the background, the visual attention model is improved by combining it with a gray-level probability statistics method that exploits this property. Experimental results verify that the improved model simulates the visual attention process more faithfully while keeping the computational complexity low.
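A minimal sketch of the linear-fusion step, assuming simple per-pixel definitions of the three feature maps; the feature extraction details and the equal weights are illustrative, not the paper's exact formulation.

```python
import numpy as np

def saliency_map(img_rgb, w_color=1/3, w_gray=1/3, w_texture=1/3):
    """Toy bottom-up saliency: color, intensity, and texture maps fused linearly.
    Expects an 8-bit RGB image as a (H, W, 3) array."""
    img = img_rgb.astype(np.float64) / 255.0
    gray = img.mean(axis=2)

    # Color conspicuity: per-pixel deviation of each channel from the gray value.
    color = np.abs(img - gray[..., None]).sum(axis=2)

    # Intensity conspicuity: deviation from the global mean intensity.
    intensity = np.abs(gray - gray.mean())

    # Texture conspicuity: local gradient magnitude as a crude texture cue.
    gy, gx = np.gradient(gray)
    texture = np.hypot(gx, gy)

    def norm(m):
        rng = m.max() - m.min()
        return (m - m.min()) / rng if rng > 0 else m * 0.0

    return w_color * norm(color) + w_gray * norm(intensity) + w_texture * norm(texture)
```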
5.
《现代电子技术》2018,(10):183-186
To address the problem that, in the Itti visual selective attention model, the weights used when normalizing the sub-feature saliency maps cannot change with the task, and drawing on research on autonomous development in visual selective attention learning, a visual selective attention model with developable weights is proposed as the learning mechanism for image feature extraction. The algorithm combines a three-layer self-organizing neural network with the Itti visual selective attention model for decision optimization, and obtains the optimal weight updates by training the model. This preserves the completeness of the features extracted in the early stage, relaxes the constraints imposed by different task conditions, and improves the model's feature extraction capability. Experiments on region-of-interest feature extraction with the developable-weight visual selective attention model show that the method improves feature extraction accuracy, reduces computation time, and achieves good dynamic performance.
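The paper trains the fusion weights with a three-layer self-organizing network; the toy sketch below only illustrates the underlying idea of task-driven weight adaptation with a simple gradient-style update. The update rule, learning rate, and target map are assumptions made for illustration.

```python
import numpy as np

def fuse(conspicuity_maps, w):
    """Weighted fusion of Itti-style conspicuity maps into one saliency map."""
    s = sum(wi * m for wi, m in zip(w, conspicuity_maps))
    return s / (s.max() + 1e-12)

def update_weights(conspicuity_maps, target, w, lr=0.05):
    """One illustrative 'developmental' step: nudge the fusion weights toward
    values that make the fused map match a task-specific target map."""
    s = fuse(conspicuity_maps, w)
    err = s - target
    grad = np.array([np.mean(err * m) for m in conspicuity_maps])
    w = np.clip(w - lr * grad, 0.0, None)   # keep weights non-negative
    return w / (w.sum() + 1e-12)            # and normalized
```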
6.
7.
8.
9.
10.
During complex flight, atmospheric turbulence and the imaging characteristics of the optical equipment cause the infrared images acquired by a UAV to have low resolution. In addition, because the resolution differs from frame to frame, a pyramid model with a fixed number of decomposition levels produces inconsistent saliency maps for the same region, so vision techniques cannot be used for UAV target localization and autonomous navigation. An infrared-image region-of-interest extraction and super-resolution (SR) reconstruction algorithm based on an improved Itti model is proposed. The algorithm first introduces multiple features to build a dynamically layered pyramid model for the infrared image sequence; it then dynamically extracts regions of interest from the multi-frame infrared images at different resolutions, overcoming the shortcomings of the traditional Itti algorithm; finally, an infrared-image super-resolution reconstruction algorithm that minimizes an objective function with the conjugate gradient method is proposed, and spatial SR reconstruction is performed on the regions of interest to raise the spatial resolution of the targets they contain. Experiments verify the effectiveness and accuracy of the proposed algorithm.
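The SR step minimizes an objective function with the conjugate gradient method. A generic CG solver of the kind involved is sketched below; the normal-equations operator `A`, which would be built from an assumed blur/down-sampling model and its adjoint plus a regularization term, is left abstract and is not the paper's exact objective.

```python
import numpy as np

def conjugate_gradient(A, b, x0, n_iter=50, tol=1e-6):
    """Solve A x = b for symmetric positive-definite A given as a function A(x).
    For SR reconstruction, A(x) would be Ht(H(x)) + lam * x under an assumed
    degradation model H with adjoint Ht, and b = Ht(y) for the observed frame y."""
    x = np.asarray(x0, dtype=float).copy()
    r = b - A(x)
    p = r.copy()
    rs = np.dot(r.ravel(), r.ravel())
    for _ in range(n_iter):
        Ap = A(p)
        alpha = rs / np.dot(p.ravel(), Ap.ravel())
        x += alpha * p
        r -= alpha * Ap
        rs_new = np.dot(r.ravel(), r.ravel())
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x
```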
11.
Fast detection of regions of interest in high-resolution remote sensing images
Traditional methods for detecting regions of interest in high-resolution remote sensing images usually rely on a prior knowledge base to analyze and search the entire image globally, which is computationally very expensive. Starting from the characteristics of human vision, a new fast region-of-interest detection algorithm for high-resolution remote sensing images is proposed. A visual attention model is used to reduce the spatial dimensionality of the high-resolution image and locate the focus of visual attention; the corresponding region of interest is then delineated in the original image from the position of that focus. Experimental results show that the new method not only has low computational complexity but also avoids computationally expensive full-image search steps such as image segmentation and feature detection, improving the efficiency of region-of-interest detection in high-resolution remote sensing images.
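A minimal sketch of the coarse-to-fine idea, assuming simple strided down-sampling for the spatial dimensionality reduction and a single attention focus per image; the `scale`, `roi_size`, and window extraction are illustrative choices, and `saliency_fn` stands for any attention model.

```python
import numpy as np

def detect_roi(image, saliency_fn, scale=8, roi_size=512):
    """Compute saliency on a down-sampled copy, take the attention focus,
    map it back to full-resolution coordinates, and crop a window around it."""
    small = image[::scale, ::scale]              # cheap spatial down-sampling
    sal = saliency_fn(small)
    fy, fx = np.unravel_index(np.argmax(sal), sal.shape)
    cy, cx = fy * scale, fx * scale              # focus in original coordinates
    half = roi_size // 2
    y0, x0 = max(0, cy - half), max(0, cx - half)
    return image[y0:y0 + roi_size, x0:x0 + roi_size], (cy, cx)
```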
12.
In the near future, traditional narrow, fixed-viewpoint video services will be replaced by high-quality panoramic video services. This paper proposes a visual-attention-aware progressive region-of-interest (RoI) trick-mode streaming service (VA-PRTS) that prioritizes video data for transmission according to visual attention and transmits the prioritized data progressively. VA-PRTS enables the receiver to shorten the time to display without degrading perceptual quality. For the proposed VA-PRTS, the paper defines a cutoff visual-attention metric algorithm that determines the quality of each encoded video slice based on visual attention, together with a progressive streaming method based on the priority of RoI video data. Compared with conventional methods, VA-PRTS increases bitrate savings by over 57% and decreases interactive delay by over 66% while maintaining perceptual video quality. The experimental results show that VA-PRTS improves the quality of the viewer experience for interactive panoramic video streaming services, and the development results show that it is highly feasible in practical deployments.
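A toy prioritization sketch in the spirit of the described cutoff metric; the slice fields, the `cutoff` value, and the two-level quality assignment are assumptions, not the paper's scheduling algorithm.

```python
def schedule_slices(slices, cutoff):
    """Order slices for progressive transmission: slices whose visual-attention
    score exceeds the cutoff are sent first at high quality, the rest follow.
    Each slice is a dict with 'slice_id' and 'attention' (assumed fields)."""
    ranked = sorted(slices, key=lambda s: s['attention'], reverse=True)
    return [(s['slice_id'], 'high' if s['attention'] >= cutoff else 'low')
            for s in ranked]
```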
13.
Identifying visual attention plays an important role in understanding human behavior and in optimizing the relevant multimedia applications. In this paper, we propose a visual attention identification method based on random walks. In the proposed method, fixations recorded by the eye tracker are partitioned into clusters, where each cluster represents a particular area of interest (AOI). In each cluster, we estimate the transition probabilities between fixations from their point-to-point adjacency in spatial position, obtain initial coefficients for the fixations according to their density, and use random walks to iteratively update the coefficients until they converge. Finally, the center of the AOI is calculated from the convergent coefficients of the fixations. Experimental results demonstrate that the proposed method, which combines the spatial and temporal relations of the fixations, highlights fixations of higher density and eliminates errors inside the cluster; it is more robust and accurate than traditional methods.
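A minimal sketch of the random-walk estimate for one cluster of fixations, assuming a Gaussian proximity kernel for the transition probabilities and a density-based initialization; the kernel width `sigma` and stopping criterion are assumptions rather than the paper's settings.

```python
import numpy as np

def aoi_center(fixations, sigma=30.0, n_iter=200, tol=1e-8):
    """Estimate an AOI center from an (N, 2) array of fixation positions."""
    fix = np.asarray(fixations, dtype=float)
    d2 = ((fix[:, None, :] - fix[None, :, :]) ** 2).sum(-1)
    w = np.exp(-d2 / (2 * sigma ** 2))           # spatial proximity weights
    np.fill_diagonal(w, 0.0)
    P = w / w.sum(axis=1, keepdims=True)         # row-stochastic transitions

    c = w.sum(axis=1)                            # density-based initial coefficients
    c = c / c.sum()
    for _ in range(n_iter):                      # random walk until convergence
        c_new = c @ P
        c_new /= c_new.sum()
        if np.abs(c_new - c).max() < tol:
            break
        c = c_new
    return (c[:, None] * fix).sum(axis=0)        # coefficient-weighted center
```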
14.
This paper presents a new framework for capturing the intrinsic visual search behavior of different observers in image understanding by analyzing saccadic eye movements in feature space. The method is based on information theory for identifying the salient image features on which visual search is performed. We demonstrate how to obtain feature-space fixation density functions that are normalized to the image content along the scan paths. This allows reliable identification of salient image features that can be mapped back to spatial space to highlight regions of interest and attention selection. A two-color conjunction search experiment has been implemented to illustrate the theoretical framework of the proposed method, including feature selection, hot spot detection, and back-projection. The practical value of the method is demonstrated on a computed tomography image of centrilobular emphysema, and we discuss how the proposed framework can be used as a basis for decision support in medical image understanding.
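A toy version of the content-normalized feature-space density idea for a single scalar feature: histogram the feature values at fixation points, divide by the histogram of values encountered along the scan path, and flag over-represented bins as "hot". The bin count, threshold, and histogram normalization are assumptions, not the paper's information-theoretic formulation.

```python
import numpy as np

def salient_feature_bins(fixated_values, scanned_values, n_bins=32):
    """Return histogram bin edges and a boolean mask of 'hot' feature bins,
    i.e. bins where fixations are over-represented relative to image content."""
    fixated = np.asarray(fixated_values, dtype=float)
    scanned = np.asarray(scanned_values, dtype=float)
    lo, hi = min(fixated.min(), scanned.min()), max(fixated.max(), scanned.max())
    edges = np.linspace(lo, hi, n_bins + 1)
    h_fix, _ = np.histogram(fixated, bins=edges, density=True)
    h_img, _ = np.histogram(scanned, bins=edges, density=True)
    ratio = h_fix / (h_img + 1e-9)
    # Back-projection (not shown): mark pixels whose feature value falls in a hot bin.
    return edges, ratio > ratio.mean() + ratio.std()
```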
15.
To improve the accuracy of generating textual descriptions from images, this paper proposes a method that builds on the traditional encoder-decoder framework and incorporates visual attention mechanisms at both the encoder and the decoder: the encoder combines spatial attention with channel-wise attention over the image, while the decoder uses an adaptive visual attention mechanism obtained by adding an extra "visual sentinel" module to the traditional decoder. During caption generation, the proposed method automatically decides whether to rely on image features or on semantic features and routes the decision to the corresponding attention mechanism. Experiments show that, compared with a single visual attention mechanism, the method achieves higher accuracy in the generated captions and better overall image captioning performance.
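A minimal sketch of one adaptive-attention step with a visual sentinel, showing how a gate can trade off visual context against the sentinel (language) state. All projection matrices and the gating form are hypothetical learned parameters used for illustration, not the paper's exact architecture.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def adaptive_attention(V, h, s, Wv, Wh, Ws, w):
    """V: (k, d) image region features, h: (d,) decoder hidden state,
    s: (d,) visual sentinel vector; Wv, Wh, Ws: (d, d) projections, w: (d,)."""
    alpha = softmax(np.tanh(V @ Wv + h @ Wh) @ w)   # spatial attention over regions
    c_visual = alpha @ V                            # attended visual context
    beta = sigmoid(np.tanh(s @ Ws + h @ Wh) @ w)    # gate: rely on sentinel vs. image
    return beta * s + (1.0 - beta) * c_visual       # adaptive context vector
```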
16.
The huge amount of video data on the internet requires efficient video browsing and retrieval strategies. One viable solution is to provide summaries of videos in the form of key frames. Video summarization based on visual attention modeling has recently come into use: visually salient frames are extracted as key frames on the basis of theories of human attention modeling. Such schemes have proved effective for video summarization, but the high computational cost they incur limits their applicability in practical scenarios. In this context, this paper proposes an efficient key frame extraction method based on a visual attention model. The computational cost is reduced by using temporal-gradient-based dynamic visual saliency detection instead of the traditional optical flow methods; for static visual saliency, an effective method employing the discrete cosine transform is used. The static and dynamic visual attention measures are fused with a non-linear weighted fusion method. The experimental results indicate that the proposed method is not only efficient but also yields high-quality video summaries.
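A minimal sketch of the two saliency cues and their fusion, assuming a DCT-signature-style static map and a plain frame-difference temporal gradient; the exact formulations, the `gamma` exponent, and the normalization are illustrative assumptions rather than the paper's tuned method.

```python
import numpy as np
from scipy.fft import dctn, idctn

def static_saliency(gray):
    """DCT-based static saliency: keep only the sign of the DCT coefficients
    and square the reconstruction (one common DCT-signature formulation)."""
    sig = np.sign(dctn(gray.astype(float), norm='ortho'))
    return idctn(sig, norm='ortho') ** 2

def dynamic_saliency(prev_gray, gray):
    """Temporal-gradient dynamic saliency: absolute frame difference,
    which is what avoids the cost of optical flow."""
    return np.abs(gray.astype(float) - prev_gray.astype(float))

def fused_saliency(prev_gray, gray, gamma=2.0):
    """Non-linear weighted fusion of the static and dynamic cues."""
    s = static_saliency(gray)
    d = dynamic_saliency(prev_gray, gray)
    s /= s.max() + 1e-9
    d /= d.max() + 1e-9
    return (s ** gamma + d ** gamma) ** (1.0 / gamma)
```

Frames whose mean fused saliency is a local maximum over time could then be selected as key frames.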
17.
Qiuxia Wu, Zhiyong Wang, Feiqi Deng, Yong Xia, Wenxiong Kang, David Dagan Feng 《Journal of Visual Communication and Image Representation》2013,24(7):1064-1074
Constructing the bag-of-features model from space-time interest points (STIPs) has been used successfully for human action recognition. However, how to eliminate the large number of STIPs that are irrelevant to a specific action in realistic scenarios, and how to select discriminative codewords for an effective bag-of-features model, still need further investigation. In this paper, we propose to select more representative codewords based on our pruned interest-points algorithm, so as to reduce computational cost and improve recognition performance. Taking human perception into account, an attention-based saliency map is employed to keep only the interest points that fall into salient regions, since visual saliency provides strong evidence for the location of the acting subjects. After the salient interest points have been identified, each human action is represented with the bag-of-features model. To obtain more discriminative codewords, an unsupervised codeword selection algorithm is utilized. Finally, a Support Vector Machine (SVM) is employed to perform human action recognition. Comprehensive experimental results on the widely used and challenging Hollywood-2 Human Action (HOHA-2) dataset and the YouTube dataset demonstrate that the proposed method is computationally efficient while achieving improved performance in recognizing realistic human actions.
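A minimal sketch of the saliency-based pruning and hard-assignment bag-of-features steps; the point and codebook representations, the saliency threshold, and the final classifier choice are assumptions made for illustration.

```python
import numpy as np

def prune_interest_points(points, saliency, thresh=0.5):
    """Keep only space-time interest points whose (x, y) location falls in a
    salient region of the (normalized) per-frame saliency map.
    Each point is a dict with 'x', 'y', and a 'descriptor' vector."""
    return [p for p in points if saliency[p['y'], p['x']] >= thresh]

def bag_of_features(points, codebook):
    """Hard-assignment histogram over a (K, d) codebook of visual words,
    built from the descriptors of the (pruned) interest points."""
    hist = np.zeros(len(codebook))
    for p in points:
        idx = np.argmin(((codebook - p['descriptor']) ** 2).sum(axis=1))
        hist[idx] += 1
    return hist / max(1.0, hist.sum())

# A standard SVM (e.g. sklearn.svm.SVC) would then be trained on these
# histograms to classify actions, as the abstract describes.
```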