Similar literature
 Found 20 similar articles (search time: 46 ms)
1.
Typically, k-means clustering or sparse coding is used for codebook generation in the bag-of-visual-words (BoW) model. Local features are then encoded by calculating their similarities with visual words. However, some useful information is lost during this process. To make use of this information, in this paper we propose a novel image representation method that goes one step beyond visual word ambiguity and considers the governing regions of visual words. For each visual application, the weights of local features are determined by the corresponding visual application classifiers. Each weighted local feature is then encoded by considering not only its similarities with visual words but also the visual words’ governing regions. A locality constraint is also imposed for efficient encoding. A weighted feature-sign search algorithm is proposed to solve the problem. We conduct image classification experiments on several public datasets to demonstrate the effectiveness of the proposed method.
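
For orientation, the baseline encoding step that this abstract starts from (assigning each local feature to its most similar visual word) can be sketched as follows. This is a minimal hard-assignment sketch, not the weighted, governing-region encoding the authors propose; the function name `encode_bow`, the use of Euclidean distance, and the L1 normalization are assumptions made for illustration.

```python
from math import dist


def encode_bow(features, codebook):
    """Hard-assignment BoW: histogram of nearest visual words, L1-normalized.

    `features` and `codebook` are sequences of equal-length numeric tuples.
    Euclidean distance is an assumption; the abstract only says "similarities".
    """
    hist = [0.0] * len(codebook)
    for f in features:
        # Index of the visual word closest to this local feature.
        nearest = min(range(len(codebook)), key=lambda k: dist(f, codebook[k]))
        hist[nearest] += 1.0
    total = sum(hist)
    # Normalize so images with different feature counts are comparable.
    return [h / total for h in hist] if total else hist


codebook = [(0.0, 0.0), (10.0, 10.0)]
features = [(1.0, 0.0), (0.0, 1.0), (9.0, 9.0), (10.0, 11.0)]
histogram = encode_bow(features, codebook)
```

Hard assignment keeps only the identity of the nearest word, which is one way to see the information loss the abstract refers to.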

2.
The bag of visual words (BOW) model is an efficient image representation technique for image categorization and annotation tasks. Building good visual vocabularies from automatically extracted image feature vectors produces discriminative visual words, which can improve the accuracy of image categorization tasks. Most approaches that use the BOW model to categorize images ignore useful information that can be obtained from image classes to build visual vocabularies. Moreover, most BOW models use intensity features extracted from local regions and disregard colour information, which is an important characteristic of any natural scene image. In this paper, we show that integrating visual vocabularies generated from each image category improves both the BOW image representation and the accuracy of natural scene image classification. We use a keypoint density-based weighting method to combine the BOW representation with image colour information on a spatial pyramid layout. In addition, we show that visual vocabularies generated from training images of one scene image dataset can plausibly represent another scene image dataset in the same domain. This helps reduce the time and effort needed to build new visual vocabularies. The proposed approach is evaluated on three well-known scene classification datasets with 6, 8 and 15 scene categories, respectively, using 10-fold cross-validation. The experimental results, using support vector machines with a histogram intersection kernel, show that the proposed approach outperforms baseline methods such as Gist features, rgbSIFT features and different configurations of the BOW model.
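
The histogram intersection kernel mentioned above has a simple closed form, K(h1, h2) = sum_i min(h1_i, h2_i). A minimal sketch follows; the function name and the example histograms are illustrative assumptions, and the abstract gives no details of how the authors plug the kernel into their support vector machines.

```python
def histogram_intersection(h1, h2):
    """Histogram intersection kernel: sum of bin-wise minima."""
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    return sum(min(a, b) for a, b in zip(h1, h2))


# Two hypothetical 3-bin BOW histograms, each L1-normalized.
similarity = histogram_intersection([0.5, 0.3, 0.2], [0.2, 0.3, 0.5])
```

For L1-normalized histograms the value lies between 0 and 1, and equals 1 only when the two histograms are identical.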

3.
4.
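An image retrieval method based on global and local color features is proposed. First, global color features are extracted in the Lab color space, which matches the characteristics of visual perception. The image is then divided into sub-blocks, which are weighted with Gaussian weighting coefficients that reflect human visual characteristics, and the color bitmap obtained by binarization is used as the local color feature. Directionality is also taken into account by projecting the image sub-blocks vertically and horizontally. Finally, the similarities of the global and local color features are fused to perform image retrieval. Experimental results on the Corel image database show that the algorithm achieves good retrieval efficiency.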

5.
6.
7.
8.
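To address the shortcoming that the bag-of-visual-words (BOVW) model discards the spatial structure of an image, this paper proposes an image retrieval algorithm based on Hessian sparse coding. First, an n-words model is built to obtain a representation of local image features. The n-words model is obtained from a series of consecutive visual words and is a high-level description of image features. Experiments are run from n=1 to n=5 to find the most suitable value of n. Second, the second-order Hessian energy function is incorporated into the objective function of standard sparse coding, which yields the Hessian sparse coding formulation. Finally, with the obtained n-words sequences as coding features, the feature-sign search algorithm is used to solve for the optimal Hessian coefficients, similarities are computed, and the retrieval results are returned. Experiments on two types of datasets show that, compared with the BOVW model and existing algorithms, the new algorithm greatly improves image retrieval accuracy.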

9.
Yan Wen (颜文), Jin Wei (金炜), Fu Randi (符冉迪). 《电信科学》 (Telecommunications Science), 2016, 32(12): 80-85
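To achieve fast and accurate image retrieval, an image retrieval method that combines VLAD (vector of locally aggregated descriptors) features with sparse representation is proposed. First, because images are rich in structural detail and show marked differences in local visual features, locally rotation-invariant SURF features are extracted, and the VLAD method is used to construct rotation-invariant VLAD features of the image. The VLAD features are then combined with sparse representation to design a sparse-representation-based similarity measure, which is used to perform image query and retrieval. Experimental results show that the proposed method outperforms several other typical methods on metrics such as precision and average normalized modified retrieval rank, and that it is computationally efficient.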
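
The VLAD aggregation step referred to above can be sketched in its commonly described form: each local descriptor is assigned to its nearest codebook centroid, the residuals (descriptor minus centroid) are summed per centroid, and the concatenated vector is L2-normalized. The code below is an illustrative sketch under those assumptions; it does not reproduce the SURF extraction or the sparse-representation similarity measure of the paper, and the name `vlad` and the toy data are hypothetical.

```python
from math import dist, sqrt


def vlad(descriptors, centroids):
    """Aggregate local descriptors into one L2-normalized VLAD vector."""
    k, d = len(centroids), len(centroids[0])
    residuals = [[0.0] * d for _ in range(k)]
    for x in descriptors:
        # Nearest centroid under Euclidean distance.
        j = min(range(k), key=lambda i: dist(x, centroids[i]))
        for t in range(d):
            residuals[j][t] += x[t] - centroids[j][t]
    # Concatenate the per-centroid residual sums into a k*d vector.
    flat = [v for row in residuals for v in row]
    norm = sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat


centroids = [(0.0, 0.0), (10.0, 0.0)]
descriptors = [(1.0, 0.0), (0.0, 1.0), (11.0, 0.0)]
signature = vlad(descriptors, centroids)
```

The output length is the number of centroids times the descriptor dimension, so the signature size is fixed regardless of how many local descriptors an image produces.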

10.
11.
12.
13.
In this paper, we assess three standard approaches to building irregular pyramid partitions for image retrieval in the bag-of-bags of words model that we recently proposed. These three approaches are kernel \(k\)-means to optimize multilevel weighted graph cuts, normalized cuts, and graph cuts. The bag-of-bags of words (BBoW) model is an approach based on irregular pyramid partitions over the image. An image is first represented as a connected graph of local features on a regular grid of pixels. Irregular partitions (subgraphs) of the image are then built using graph partitioning methods. Each subgraph in the partition is represented by its own signature. With the aid of graphs, the BBoW model extends the classical bag-of-words model by embedding color homogeneity and limited spatial information through irregular partitions of an image. Compared with existing methods for image retrieval, such as spatial pyramid matching (SPM), the BBoW model does not assume that similar parts of a scene always appear at the same location in images of the same category. Extending the proposed model to a pyramid gives rise to a method we name irregular pyramid matching. Experiments on the Caltech-101 benchmark demonstrate that applying kernel \(k\)-means to the graph clustering process produces better retrieval results than other graph partitioning methods, such as graph cuts and normalized cuts, for BBoW. Moreover, the proposed method achieves comparable results and outperforms SPM in 19 object categories on the whole Caltech-101 dataset.

14.
Task-dependent visual-codebook compression   (total citations: 1; self-citations: 0; citations by others: 1)

15.
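Feature subspace learning is one of the key techniques for image recognition and classification. Traditional feature subspace learning models face two main problems. The first is how to effectively preserve the local structure and discriminability of samples after they are projected into the feature space. The second is that traditional learning models fail when the samples are noisy. To address these two problems, this paper proposes a discriminative feature subspace learning model based on low-rank representation (LRR). The main contributions are as follows: low-rank representation is used to explore the local structure of the samples, and the representation coefficients serve as a similarity constraint on the samples in the projected space, so that the projected subspace better preserves the local neighborhood relations of the samples; to improve noise robustness, a discriminative feature learning constraint term is built on the low-rank reconstructed samples, which strengthens both the discriminability and the robustness of the model; and an iterative numerical solution scheme based on alternating optimization is designed to ensure that the algorithm converges. Comparative classification experiments on several visual datasets show that the proposed algorithm outperforms traditional feature learning methods in both classification accuracy and robustness.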

16.
17.
Several deep supervised hashing techniques have been proposed to allow for extracting compact and efficient neural network representations for various tasks. However, many deep supervised hashing techniques ignore several information-theoretic aspects of the process of information retrieval, often leading to sub-optimal results. In this paper, we propose an efficient deep supervised hashing algorithm that optimizes the learned compact codes using an information-theoretic measure, the Quadratic Mutual Information (QMI). The proposed method is adapted to the needs of efficient image hashing and information retrieval, leading to a novel information-theoretic measure, the Quadratic Spherical Mutual Information (QSMI). Apart from demonstrating the effectiveness of the proposed method under different scenarios and outperforming existing state-of-the-art image hashing techniques, this paper provides a structured way to model the process of information retrieval and to develop novel methods adapted to the needs of different applications.

18.
Representation of image content is an important part of image annotation and retrieval, and it has become a hot issue in computer vision. As an efficient and accurate image content representation model, bag-of-words (BoW) has attracted increasing attention in recent years. After segmentation, BoW treats all image regions equally. In fact, some regions of an image, such as salient objects or regions of interest, are more important than others for image retrieval. In this paper, a novel region-of-interest-based bag-of-words model (RoI-BoW) for image representation is proposed. First, the difference of Gaussians (DoG) is adopted to find key points in an image, and grids of different sizes are generated as RoIs to construct visual words with the BoW model. Furthermore, we analyze the influence of different segmentation sizes on image content representation through content-based image retrieval. Experiments on Corel 5K verify the effectiveness of RoI-BoW for image content representation and show that RoI-BoW significantly outperforms the BoW model. Moreover, extensive experiments illustrate the influence of different segmentation sizes on image representation based on the BoW model and the RoI-BoW model, respectively. This work helps in choosing an appropriate grid size in different situations when representing image content, and it is relevant to image classification and retrieval.

19.
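An object retrieval method based on a group of randomized visual dictionaries and query expansion   (total citations: 1; self-citations: 0; citations by others: 1)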
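In object retrieval, the current mainstream solution is the Bag of Visual Words (BoVW) method. However, the traditional BoVW method suffers from low time efficiency, high memory consumption, and the synonymy and ambiguity of visual words. To address these problems, this paper proposes an object retrieval method based on a group of randomized visual dictionaries and query expansion. First, the method uses Exact Euclidean Locality Sensitive Hashing (E2LSH) to cluster the local feature points of the training image set, generating a group of randomized visual dictionaries that supports dynamic expansion. Then, visual word distribution histograms and index files are built from this group of dictionaries. Finally, a query expansion strategy is introduced to complete object retrieval. Experimental results show that, compared with traditional methods, the proposed method effectively enhances the distinguishability of the target object, considerably improves object retrieval accuracy, and scales well to large databases.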
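
As background for the E2LSH step, the standard p-stable hash for Euclidean distance has the form h(v) = floor((a·v + b) / w), where a is drawn from a Gaussian, b is uniform on [0, w], and w is the bucket width; concatenating several such hashes gives a bucket key. The sketch below illustrates only that hash family. How the paper turns buckets into visual dictionaries is not described in the abstract, and the names `make_e2lsh`, `k`, `w`, and `seed` are assumptions chosen for this example.

```python
import random
from math import floor


def make_e2lsh(dim, k, w, seed=0):
    """Build a bucket function from k p-stable hashes of width w."""
    rng = random.Random(seed)
    # Each hash has a Gaussian projection vector a and an offset b in [0, w].
    params = [
        ([rng.gauss(0.0, 1.0) for _ in range(dim)], rng.uniform(0.0, w))
        for _ in range(k)
    ]

    def bucket(v):
        # Concatenate the k integer hash values into one bucket key.
        return tuple(
            floor((sum(ai * vi for ai, vi in zip(a, v)) + b) / w)
            for a, b in params
        )

    return bucket


bucket = make_e2lsh(dim=4, k=3, w=4.0, seed=1)
key = bucket([0.5, 1.0, -2.0, 3.0])
```

Points that are close in Euclidean distance tend to share a bucket key, so each distinct key can be treated as one cluster. Drawing several independent sets of hash parameters gives several different randomized partitions of the same feature space.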

20.
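To address affine distortion, an affine-invariant detector based on maximally stable extremal regions (MSER) is first constructed: extremal regions are extracted using a disjoint-set forest and the union-find algorithm, and MSERs are extracted by combining a component tree with the maximal-stability criterion. With MSERs as the low-level local feature regions, SIFT descriptors are generated and clustered into a visual vocabulary. Following the standard weighting scheme, the query object is selected with a box on the query image, and the retrieval results are ranked by the similarity between the database images and the query object. A spatial consistency measure based on search-unit region matching is then applied to obtain the final retrieval results. Experiments show that the extremal regions have reliable affine invariance and that the retrieval mechanism significantly improves the performance and reliability of the image retrieval system.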
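
The disjoint-set forest (union-find) structure named above is what makes it cheap to merge neighbouring pixels into connected regions as an intensity threshold changes. The following is a minimal sketch of that data structure with path compression and union by size, applied to a one-dimensional row of intensities for brevity. It is not the authors' MSER detector: the component tree and the stability criterion are omitted, and `component_sizes` is a hypothetical helper.

```python
class DisjointSet:
    """Disjoint-set forest with path compression and union by size."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n

    def find(self, x):
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every node on the path at the root.
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return ra
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra  # attach the smaller tree under the larger one
        self.size[ra] += self.size[rb]
        return ra


def component_sizes(row, threshold):
    """Sizes of connected runs of pixels with intensity <= threshold."""
    ds = DisjointSet(len(row))
    for i in range(len(row) - 1):
        if row[i] <= threshold and row[i + 1] <= threshold:
            ds.union(i, i + 1)
    roots = {ds.find(i) for i, v in enumerate(row) if v <= threshold}
    return sorted(ds.size[r] for r in roots)


sizes = component_sizes([1, 2, 9, 3, 1, 1], threshold=3)
```

Tracking how these component sizes change as the threshold sweeps across intensity levels is the information a stability test would inspect. A two-dimensional detector would union each pixel with its grid neighbours instead of only the next element in a row.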
