Similar Documents
19 similar documents retrieved.
1.
Objective: In image classification, the performance of the bag-of-words model is limited mainly by the quantization error of local features. To address this, a global-coding image classification method that fuses multi-scale codebooks is proposed to effectively reduce feature quantization error. Method: Multi-scale dense feature sampling is used to build multi-scale codebooks with a hierarchical structure; by fully exploiting the manifold structure of the image features, global codebook information is computed to achieve global coding, which yields smoother and more accurate coding coefficients. Finally, a multi-path scheme concatenates the feature representations of the different scales into the final image representation, which is scale-invariant to a certain degree. Results: On the two widely used benchmark datasets UIUC-8 and Caltech-101, classification accuracies of 88.0% and 83.2% are achieved, respectively. Conclusion: The experimental results show that, compared with local coding methods based on a fixed-scale codebook, the proposed method significantly improves classification accuracy.
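
As a rough illustration of the multi-scale, multi-path idea described above (a hypothetical sketch, not the paper's global-coding algorithm; the manifold-based global coding step is omitted, and in practice the per-scale codebooks would be learned from a training set rather than from one image's descriptors), the snippet below builds one codebook per sampling scale and concatenates the per-scale histograms:

```python
import numpy as np
from sklearn.cluster import KMeans

def multiscale_bow(descriptors_by_scale, n_words_per_scale=64, seed=0):
    """Build one codebook per sampling scale and concatenate the resulting
    per-scale histograms into a single image representation.

    descriptors_by_scale : dict mapping a scale id to an (N_s, D) array of
                           densely sampled descriptors for one image.
    """
    feats = []
    for scale, desc in sorted(descriptors_by_scale.items()):
        km = KMeans(n_clusters=n_words_per_scale, random_state=seed, n_init=10).fit(desc)
        words = km.predict(desc)
        hist = np.bincount(words, minlength=n_words_per_scale).astype(float)
        feats.append(hist / max(hist.sum(), 1.0))
    return np.concatenate(feats)

# Toy usage: three sampling scales of one image, random stand-in descriptors.
rng = np.random.default_rng(4)
desc = {s: rng.normal(size=(400, 128)) for s in (16, 24, 32)}
print(multiscale_bow(desc).shape)  # (3 * 64,) = (192,)
```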

2.
The bag-of-words model is a key technique in image retrieval: each image is represented as a frequency histogram of visual words over a codebook. Such retrieval ignores the spatial relationships between visual words, which are important for image representation. A new image retrieval method based on the longest common visual word string is proposed. The word strings are extracted from the topological relations between visual words and therefore carry much of the image's spatial information. Experimental results on the Holiday dataset show that the proposed method improves the retrieval performance of the bag-of-words model.
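
The baseline representation this entry starts from, a hard-assigned visual-word frequency histogram, can be sketched as follows. This is a generic illustration, not the paper's longest-common-word-string method; the local descriptors (e.g. SIFT) are assumed to be extracted elsewhere, and the function names are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans

def build_codebook(descriptors, n_words=200, seed=0):
    """Cluster local descriptors (N x D) into a visual codebook."""
    km = KMeans(n_clusters=n_words, random_state=seed, n_init=10)
    km.fit(descriptors)
    return km

def bow_histogram(image_descriptors, codebook):
    """Hard-assign each descriptor to its nearest visual word and
    return the L1-normalized frequency histogram."""
    words = codebook.predict(image_descriptors)
    hist = np.bincount(words, minlength=codebook.n_clusters).astype(float)
    return hist / max(hist.sum(), 1.0)

# Toy usage with random "descriptors" standing in for e.g. SIFT features.
rng = np.random.default_rng(0)
all_desc = rng.normal(size=(5000, 128)).astype(np.float32)
codebook = build_codebook(all_desc, n_words=64)
print(bow_histogram(all_desc[:300], codebook).shape)  # (64,)
```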

3.
Visual attention mechanisms are widely used in fine-grained image classification. Most existing methods build a single attention weight map and simply reweight the features with it. This paper proposes a multi-channel visual attention mechanism implemented with an end-to-end trainable deep neural network: multiple visual attention maps first describe different regions of the visual object, and the corresponding higher-order statistics are then extracted to obtain the visual representation. On several standard fine-grained image classification benchmarks, the multi-channel-attention-based representation outperforms recent mainstream methods.

4.
Huo Hua, Zhao Gang. 《计算机工程》 (Computer Engineering), 2012, 38(22): 276-278
To address the sensitivity of the traditional bag-of-visual-words model to changes in image scale, an image annotation method based on an improved bag-of-visual-words model is proposed. The method introduces multi-scale spatial information: the image is transformed at multiple scales, a multi-scale visual vocabulary is built, and the image is represented by features at different scales; multiple kernel learning is then used to optimize the weight of each scale's features and obtain the final representation. Experimental results confirm the method's effectiveness, with annotation accuracy 17.8% to 25.7% higher than the traditional BoVW model.

5.
Scene Classification Using Contextual Pyramid Features
To effectively express the semantic properties of scene images, a scene classification framework based on the contextual information of image blocks is proposed. The image is first partitioned with a regular grid and SIFT features are extracted from each block. The block features of the training images are then clustered with K-means to form a codebook of block types, and the image blocks are quantized against this codebook to obtain a visual-word representation of the image. On the resulting visual-word map, two kinds of visual-word models are built: a model of adjacent co-occurring pairs of different visual words, and a model of consecutively co-occurring groups of the same visual word. Finally, spatial pyramid matching is applied to build contextual pyramid features of the visual words, and an SVM classifier performs the classification. Experimental results on commonly used scene image databases show that the proposed method achieves better scene classification performance than existing typical methods.
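
The spatial-pyramid-matching step of the pipeline above can be sketched as follows. This is a simplified illustration: the per-level weights of standard SPM and the two co-occurrence models are omitted, and the visual-word map is assumed to have already been produced by the grid/SIFT/K-means steps.

```python
import numpy as np

def spatial_pyramid_histogram(word_map, n_words, levels=2):
    """Concatenate visual-word histograms over a spatial pyramid.

    word_map : 2-D array of visual-word indices, one per grid block.
    levels   : pyramid levels 0..levels (level l splits the map into 2^l x 2^l cells).
    """
    h, w = word_map.shape
    feats = []
    for level in range(levels + 1):
        cells = 2 ** level
        for i in range(cells):
            for j in range(cells):
                ys = slice(i * h // cells, (i + 1) * h // cells)
                xs = slice(j * w // cells, (j + 1) * w // cells)
                hist = np.bincount(word_map[ys, xs].ravel(), minlength=n_words)
                feats.append(hist.astype(float))
    feat = np.concatenate(feats)
    return feat / max(feat.sum(), 1.0)

# Toy usage: a 16x16 map of word indices drawn from a 50-word codebook.
rng = np.random.default_rng(1)
word_map = rng.integers(0, 50, size=(16, 16))
print(spatial_pyramid_histogram(word_map, n_words=50).shape)  # (50 * (1+4+16),) = (1050,)
```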

6.
A method based on class-specific codebooks is proposed: one codebook is generated for each class, and when training or testing the classifier between any two classes, only the image vectors formed on the codebooks of those two classes are considered. This preserves the diversity of the class-specific codebooks while reducing the dimensionality of the image vectors fed to the classifier, effectively avoiding the curse of dimensionality. Experimental results show that the method achieves better classification performance than the traditional approach based on a single global codebook.

7.
Traditional hypergraph-based image classification methods do not consider the relationships between hyperedges when constructing the hypergraph, which limits the final classification performance. This paper quantifies the correlation between hyperedges by combining the visual information of images with their annotation information, and proposes a hyperedge-correlation-based image classification method that effectively introduces image-related annotations as an indicator of image category, leading to more accurate classification. Experiments on the LabelMe and UIUC datasets verify the effectiveness of the method.

8.
A Multiple Visual Phrase Learning Method for Image Classification
To address the limited semantic discriminability and descriptive power of the bag-of-words image representation, and the fact that the performance of traditional bag-of-words classification methods is easily affected by background, occlusion, and similar factors, this paper proposes a multiple visual phrase learning method for image classification. Visual phrases that are semantically discriminative and spatially correlated are constructed to replace visual words, improving the accuracy of the bag-of-words representation of the image. On this basis, a multiple visual phrase learning method is developed by incorporating the idea of multiple-instance learning, so that the final classification model reflects the regional characteristics of image categories. Experiments on standard benchmarks such as Caltech-101 [1] and Scene-15 [2] verify the effectiveness of the proposed method, with relative improvements in classification performance of about 9% and 7%, respectively.

9.
Detecting and recognizing aircraft in high-resolution remote sensing images has important military and civilian value. Since previous methods are sensitive to gray-level distribution, shape variation, and camouflage, a new aircraft detection method for high-resolution remote sensing images based on the bag-of-visual-words model is proposed. To prune the aircraft visual codebook down to the most discriminative visual words, relevance and redundancy analysis is used to remove irrelevant, weakly relevant, and redundant visual words and to select those most important for aircraft detection, which reduces computational complexity and improves detection performance.

10.
Liang Ye, Liu Hongzhe, Yu Jian. 《计算机科学》 (Computer Science), 2014, 41(8): 281-285
Bag-of-features (BoF) is currently the most widely used image representation. To address the simplicity of BoF feature encoding and its lack of spatial information, the encoding and pooling stages of the traditional BoF pipeline are improved and a new image representation for image classification is proposed. First, pooling regions are selected using a multi-ring partition of the image, embedding more spatial information. Second, given that densely sampled descriptors follow a long-tailed distribution and that features are fairly evenly distributed in scene images, a hard-assignment encoding to multiple visual words suited to scene classification is proposed. The new representation keeps the advantages of the BoF paradigm while being more compact and carrying richer spatial information. Experimental results demonstrate the effectiveness of the proposed method.

11.
The problem of object category classification by committees or ensembles of classifiers, each of which is based on one diverse codebook, is addressed in this paper. Two methods of constructing visual codebook ensembles are proposed. The first introduces diverse individual visual codebooks using different clustering algorithms; the second uses visual codebooks of different sizes to construct an ensemble with high diversity. Codebook ensembles are trained to capture and convey image properties from different aspects, so different types of image representations can be acquired from them. A classifier ensemble can then be trained on the different representations derived from the same training image set, and using this classifier ensemble to categorize new images can lead to improved performance. Detailed experimental analysis on a Pascal VOC challenge dataset reveals that the present ensemble approach performs well, consistently improves the performance of visual object classifiers, and results in state-of-the-art categorization performance.

12.
The Bag of Words (BoW) model is one of the most popular and effective image representation methods and has drawn increasing interest in the computer vision field. However, little attention has been paid to it in visual tracking. In this paper, a visual tracking method based on a Bag of Superpixels (BoS) is proposed. In BoS, the training samples are over-segmented to generate enough superpixel patches. The K-means algorithm is then run on the collected patches to form visual words of the target and construct a superpixel codebook. Finally, tracking is accomplished by searching for the highest likelihood between candidates and the codebook within a Bayesian inference framework. In this process, an effective updating scheme is adopted to help the tracker resist occlusions and deformations. Experimental results demonstrate that the proposed method outperforms several state-of-the-art trackers.
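
A minimal sketch of the matching step such a bag-of-superpixels tracker might use is given below. The per-word scores, the Gaussian weighting, and the omission of the Bayesian update and codebook refresh are simplifying assumptions rather than the paper's exact formulation.

```python
import numpy as np

def candidate_likelihood(candidate_feats, codebook_words, word_scores, sigma=0.5):
    """Score a candidate region by matching its superpixel features to the
    codebook and accumulating the (foreground vs. background) word scores,
    weighted by a Gaussian of the matching distance."""
    # Pairwise distances between candidate superpixels and codebook words.
    d = np.linalg.norm(candidate_feats[:, None, :] - codebook_words[None, :, :], axis=2)
    nearest = d.argmin(axis=1)
    weight = np.exp(-d[np.arange(len(nearest)), nearest] ** 2 / (2 * sigma ** 2))
    return float(np.sum(weight * word_scores[nearest]))

# Toy usage: 30 candidate superpixels, a 20-word codebook with +/- scores
# indicating how strongly each word belongs to the target vs. the background.
rng = np.random.default_rng(2)
cand = rng.normal(size=(30, 8))
words = rng.normal(size=(20, 8))
scores = rng.uniform(-1, 1, size=20)
print(candidate_likelihood(cand, words, scores))
```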

13.
A Fast Vector Quantization Coding Method Based on a Wavelet Tree Structure
A vector quantization image coding method is proposed that exploits properties of the human visual system and a wavelet tree structure for fast image coding, referred to as fast tree-structured vector quantization. After analyzing the vector quantization characteristics of this approach, a statistical method for generating the codebook is designed and a fast vector quantization encoding algorithm is presented.

14.
Typical competitive learning algorithms are studied and analyzed, and a probability-sensitive competitive learning (PSCL) algorithm based on each neuron's winning probability is proposed. Unlike traditional competitive learning, where only the single winning neuron is updated, PSCL adjusts the distortion distance according to each neuron's winning probability so that every neuron receives a different amount of learning, effectively overcoming the neuron under-utilization problem.
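
A generic frequency/probability-sensitive competitive learning loop in the spirit of this entry is sketched below. The abstract does not give the exact PSCL update rule, so the distortion scaling used here (winning frequency times distance) is an assumption, not the paper's formulation.

```python
import numpy as np

def fscl_codebook(data, n_units=16, epochs=20, lr=0.05, seed=0):
    """Frequency/probability-sensitive competitive learning (a generic sketch).

    The distance of each unit to the input is scaled by how often that unit
    has already won, so rarely winning units get a chance to learn and the
    under-utilized-neuron problem is reduced."""
    rng = np.random.default_rng(seed)
    units = data[rng.choice(len(data), n_units, replace=False)].astype(float).copy()
    wins = np.ones(n_units)                         # win counts (start at 1 to avoid zeros)
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            d = np.linalg.norm(units - x, axis=1)
            winner = np.argmin(wins / wins.sum() * d)   # probability-sensitive distortion
            units[winner] += lr * (x - units[winner])
            wins[winner] += 1
    return units, wins

# Toy usage: cluster 2-D points into a 16-word codebook.
rng = np.random.default_rng(5)
data = rng.normal(size=(500, 2))
codebook, wins = fscl_codebook(data)
print(codebook.shape, wins.astype(int))
```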

15.
When images are described with visual words based on vector quantization of low-level color, texture, and edge-related visual features of image regions, the result is usually referred to as a “bag-of-visual-words (BoVW)” representation. Although it has proved effective for image representation, analogous to document representation in text retrieval, the hard encoding approach based on a one-to-one mapping of regions to visual words is not expressive enough to characterize image contents at a higher semantic level and is prone to quantization error. Each word is also treated as independent of all other words in this model, yet the words are in fact related, and the similarity of their occurrence in documents can reflect the underlying semantic relations between them. To account for this, a soft image representation scheme is proposed that spreads each region's membership values, through a local fuzzy membership function, to all the words in a neighborhood of a codebook generated by a self-organizing map (SOM). The topology-preserving property of the SOM map is exploited to define the local membership function. A systematic evaluation of retrieval results of the proposed soft representation on two different image collections (natural photographic and medical) shows significant improvement in precision at different recall levels compared with various low-level and BoVW-based features that consider only the probability of occurrence (or presence/absence) of a word.
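
The soft-assignment idea can be illustrated with the sketch below, which assumes a codebook whose words lie on a 2-D grid (as a trained SOM would provide) and spreads each region's membership over the grid neighborhood of its best-matching word with a Gaussian. The SOM training itself and the paper's specific membership function are not reproduced here.

```python
import numpy as np

def soft_bovw_histogram(region_feats, som_weights, grid_shape, sigma=1.0):
    """Soft bag-of-visual-words histogram over a SOM-style codebook.

    som_weights : (n_words, D) codebook vectors assumed to lie on a 2-D grid
                  of shape grid_shape (as produced by a trained SOM).
    Each region votes not only for its best-matching word but also, via a
    Gaussian membership over grid distance, for that word's neighbors."""
    rows, cols = grid_shape
    # Grid coordinates of every codebook word.
    coords = np.stack(np.unravel_index(np.arange(rows * cols), grid_shape), axis=1).astype(float)
    hist = np.zeros(rows * cols)
    for x in region_feats:
        bmu = np.linalg.norm(som_weights - x, axis=1).argmin()      # best-matching unit
        grid_d = np.linalg.norm(coords - coords[bmu], axis=1)       # topological distance
        membership = np.exp(-grid_d ** 2 / (2 * sigma ** 2))        # local fuzzy membership
        hist += membership / membership.sum()
    return hist / max(hist.sum(), 1.0)

# Toy usage: an 8x8 SOM-like codebook over 16-D region features.
rng = np.random.default_rng(3)
weights = rng.normal(size=(64, 16))
regions = rng.normal(size=(25, 16))
print(soft_bovw_histogram(regions, weights, (8, 8)).shape)  # (64,)
```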

16.
17.
An image compression technique is proposed that aims to achieve both robustness to the transmission bit errors common in wireless image communication and sufficient visual quality of the reconstructed images. Error robustness is achieved with biorthogonal wavelet subband image coding combined with multistage gain-shape vector quantization (MS-GS VQ), which uses three stages of signal decomposition to reduce the effect of transmission bit errors by distributing image information among many blocks. Good visual quality of the reconstructed images is obtained by applying genetic algorithms (GAs) to codebook generation, producing reconstruction capabilities superior to conventional techniques; the proposed decomposition scheme also supports the use of GAs because it reduces the problem size. Simulations evaluating the coding scheme with respect to both transmission bit errors and distortion of the reconstructed images show that the proposed MS-GS VQ with good codebooks designed by GAs provides not only better robustness to transmission bit errors but also higher peak signal-to-noise ratio, even under high bit-error-rate conditions.

18.
Recently, image representation based on the bag-of-visual-words (BoW) model has been widely applied in the image and vision domains. In BoW, a visual codebook of visual words is defined, usually by clustering local features, and any novel image is represented by the occurrences of the visual words it contains. Given a set of images, we argue that the significance of each image is determined by the significance of its contained visual words. Traditionally, the significance of visual words is defined by term frequency-inverse document frequency (tf-idf), which cannot necessarily capture the intrinsic visual context. In this paper, we propose a new scheme of latent visual context learning (LVCL). The visual context among images and visual words is formulated from latent semantic context and visual link graph analysis. With LVCL, the importance of visual words and images can be distinguished, which facilitates image-level applications such as image re-ranking and canonical image selection. We validate our approach on text-query-based search results returned by Google Image. Experimental results demonstrate the effectiveness and potential of LVCL for image re-ranking and canonical image selection over state-of-the-art approaches.
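
The tf-idf weighting that this entry uses as its baseline (and improves upon with latent visual context) can be sketched as follows; the smoothing used here is one common convention and is not necessarily the one assumed in the paper.

```python
import numpy as np

def tfidf_weights(bow_counts):
    """Standard tf-idf weighting of a visual-word count matrix.

    bow_counts : (n_images, n_words) raw occurrence counts.
    Returns the tf-idf weighted, L2-normalized image descriptors."""
    counts = np.asarray(bow_counts, dtype=float)
    tf = counts / np.maximum(counts.sum(axis=1, keepdims=True), 1.0)
    df = np.count_nonzero(counts > 0, axis=0)                  # document frequency per word
    idf = np.log((1.0 + counts.shape[0]) / (1.0 + df)) + 1.0   # smoothed inverse document frequency
    tfidf = tf * idf
    norms = np.linalg.norm(tfidf, axis=1, keepdims=True)
    return tfidf / np.maximum(norms, 1e-12)

# Toy usage: 4 images over a 6-word codebook.
counts = np.array([[3, 0, 1, 0, 0, 2],
                   [0, 4, 0, 0, 1, 0],
                   [1, 1, 1, 1, 1, 1],
                   [0, 0, 5, 0, 0, 0]])
print(tfidf_weights(counts).round(2))
```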

19.
Dong Jian. 《计算机应用》 (Journal of Computer Applications), 2014, 34(4): 1172-1176
To address the quantization error introduced when the visual dictionary of the traditional bag-of-visual-words model quantizes low-level features, as well as the limited applicability of visual words, an image retrieval model based on a visual dictionary weighted with feature-space information is proposed. Starting from the clustering algorithms commonly used to generate visual dictionaries, the characteristics of these algorithms are analyzed, and the statistics of the feature distribution in feature space during clustering are taken into account; different weighting schemes are compared experimentally, and the best-performing mean-based weighting scheme is used to weight the importance of visual words, improving the descriptive power of the dictionary. Comparative experiments on the ImageNet dataset show that, relative to a homologous visual dictionary, a non-homologous dictionary has less influence on the partitioning of the visual space, and that the visual dictionary weighted with feature-space information is more effective on large datasets.
