Similar Documents
 19 similar documents found (search time: 125 ms)
1.
Image classification with the bag-of-words model under triangular constraints   Total citations: 1 (self: 0, others: 1)
汪荣贵, 丁凯, 杨娟, 薛丽霞, 张清杨. Journal of Software (《软件学报》), 2017, 28(7): 1847-1861
The bag-of-visual-words model is widely used in image classification, image retrieval, and related fields. In the traditional model, the visual-word statistics ignore both the spatial relationships among visual words and the shape of the object being classified, so the resulting image representation lacks discriminative power. This paper proposes an improved bag-of-visual-words method that combines salient-region extraction with the topological structure of visual words; it not only produces more representative visual words but also, to some extent, resists interference from cluttered backgrounds and position changes. First, salient regions are extracted from the training images, and the bag-of-words model is built on those regions. Second, to describe image features more precisely and withstand varying positions and backgrounds, the method applies a visual-word topology strategy together with triangulation, fusing global and local information. Simulation experiments comparing the proposed method with the traditional bag-of-words model and other models show that it achieves higher classification accuracy.
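The core representation the abstract builds on can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: keypoints falling outside the salient region are dropped before quantization, and the descriptors, vocabulary, and saliency test below are toy data.

```python
# Minimal bag-of-visual-words sketch restricted to salient keypoints.
# All values are toy data for illustration only.

def nearest_word(desc, vocab):
    """Index of the visual word (vocabulary centre) closest to a descriptor."""
    best, best_d = 0, float("inf")
    for i, w in enumerate(vocab):
        d = sum((a - b) ** 2 for a, b in zip(desc, w))
        if d < best_d:
            best, best_d = i, d
    return best

def bow_histogram(keypoints, descriptors, vocab, salient):
    """Histogram of visual-word counts, keeping only salient keypoints."""
    hist = [0] * len(vocab)
    for (x, y), desc in zip(keypoints, descriptors):
        if salient(x, y):                      # drop background keypoints
            hist[nearest_word(desc, vocab)] += 1
    return hist

# Toy example: 2-D descriptors, a 3-word vocabulary, salient region x < 5.
vocab = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)]
kps   = [(1, 1), (8, 2), (3, 4)]
descs = [(0.1, 0.0), (1.1, 0.9), (1.9, 0.1)]
hist = bow_histogram(kps, descs, vocab, lambda x, y: x < 5)
```

The keypoint at (8, 2) lies outside the salient region and contributes nothing to the histogram, which is exactly the background-suppression effect the abstract describes.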

2.
To address the low recognition rates of earlier scene-recognition work that extracts features from equally sized rectangular image regions, a scene-recognition method based on a superpixel spatial pyramid is proposed. The image is first segmented into superpixels at several resolutions, PACT features are extracted from each resulting sub-region, and K-means clustering is used to build the visual dictionary of the image set. For recognition, the PACT features of all segmented sub-regions of an image are concatenated into a single feature vector, bag-of-words features are appended, and the final scene classification is obtained with the LIBSVM support vector machine. Experimental results show that the algorithm effectively improves the recognition rate.
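The dictionary-building step above (cluster local features with K-means, then describe an image by a histogram of its features' nearest words) can be sketched as follows. This is a toy sketch under simplifying assumptions: features are scalars rather than PACT vectors, and superpixel segmentation is not reproduced.

```python
# Tiny K-means visual dictionary + word histogram, on toy scalar features.

def kmeans(points, k, iters=10):
    """Very small K-means on scalar features; returns the cluster centres."""
    centres = points[:k]                       # naive initialisation
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda i: abs(p - centres[i]))
            groups[j].append(p)
        centres = [sum(g) / len(g) if g else centres[i]
                   for i, g in enumerate(groups)]
    return centres

def histogram(features, centres):
    """Count how many features fall nearest to each visual word."""
    hist = [0] * len(centres)
    for f in features:
        hist[min(range(len(centres)), key=lambda i: abs(f - centres[i]))] += 1
    return hist

feats = [0.1, 0.2, 0.15, 5.0, 5.1, 4.9]        # two obvious clusters
centres = sorted(kmeans(feats, 2))
hist = histogram(feats, centres)
```

In the full method, one such histogram per segmented sub-region would be concatenated into the image's feature vector before classification with LIBSVM.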

3.
Because the spatial-pyramid bag-of-words model cannot express the semantic distribution relationships among local features, an image representation based on semantic phrases within a spatial-pyramid bag-of-words model is proposed. First, local features are mapped to visual words carrying some semantic information, and semantic phrases are constructed by counting the semantic distribution of related feature points in each local feature's neighbourhood. Second, the semantic phrases are quantized with sparse coding to generate a semantic dictionary, and each image is represented as a spatial-pyramid sparse statistical histogram over that dictionary. Finally, the representation vectors are fed to a classifier for training and testing. Experimental results show that the method considerably improves image classification accuracy.

4.
Automatic image annotation is an important problem in computer vision and pattern recognition. Existing models do not model the visual appearance of textual keywords, so their output often contains many annotation words unrelated to the visual content of the image. To address this, an automatic annotation model based on relevant visual keywords, VKRAM, is proposed. The model divides annotation words into non-abstract and abstract words. It first builds visual-keyword seeds for non-abstract words and introduces a new method to extract their corresponding visual-keyword sets; then, given the characteristics of abstract words, it uses a proposed region-subtraction algorithm to extract the visual-keyword seeds and sets for abstract words; next, an adaptive-parameter method and a fast solver determine the similarity thresholds of the different visual keywords; finally, these techniques are combined for automatic annotation. The model alleviates, to a degree, the problem of irrelevant words in annotation results. Experiments show that it improves on previous models by most metrics.

5.
The bag-of-visual-words model is widely used in content-based image retrieval, where features are traditionally extracted with the SIFT descriptor. Because SIFT is computationally expensive and slow to extract, this paper instead uses the faster binary descriptor ORB to extract features and build the visual dictionary, comparing image similarity by the distance between vectors to enable fast retrieval. Experimental results show that the proposed method markedly improves retrieval efficiency while remaining highly robust.
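The speed advantage of binary descriptors like ORB comes from comparing them with a Hamming distance (an XOR plus a popcount) instead of the Euclidean distance used for floating-point SIFT vectors. A minimal sketch, with toy integer descriptors standing in for real 256-bit ORB descriptors:

```python
# Hamming-distance ranking of binary descriptors; toy 8-bit values.

def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary descriptors."""
    return bin(a ^ b).count("1")

def rank_by_distance(query, database):
    """Database entries ranked by descriptor distance to the query."""
    return sorted(database, key=lambda item: hamming(query, item[1]))

db = [("img_a", 0b10110010), ("img_b", 0b10110011), ("img_c", 0b01001100)]
ranked = rank_by_distance(0b10110010, db)
```

Here `img_a` matches the query exactly (distance 0), `img_b` differs by one bit, and `img_c` is nearly the complement, so the ranking follows directly from the bit operations.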

6.
Because the bag-of-words model discards spatial position information entirely, a multi-direction spatial bag-of-words method for object recognition is proposed. The algorithm uses spatial-pyramid partitioning to form sub-region feature representations of the image; it projects the local feature vectors onto the horizontal, vertical, and ±45° directions to obtain the image's spatial structure in multiple directions; and it adopts a sample-based visual-dictionary method, which both reduces the redundancy introduced by samples from different object classes and lowers the feature dimensionality. Comparative experiments on the Caltech101 and Caltech256 object databases confirm the algorithm's effectiveness.
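The multi-direction idea can be illustrated by projecting keypoint coordinates onto each direction and histogramming the projections. This is a hedged sketch, not the paper's algorithm: the bin count, extent, and points are toy values, and real features would carry visual-word labels rather than bare positions.

```python
# Project 2-D points onto four directions and histogram each projection.
import math

def project(points, angle_deg, bins=4, extent=8.0):
    """Histogram of point positions projected onto a given direction."""
    th = math.radians(angle_deg)
    hist = [0] * bins
    for x, y in points:
        t = x * math.cos(th) + y * math.sin(th)       # signed 1-D position
        i = int((t + extent) / (2 * extent) * bins)   # shift into [0, bins)
        hist[max(0, min(bins - 1, i))] += 1
    return hist

pts = [(1, 1), (2, 6), (6, 2)]
descriptor = []
for angle in (0, 90, 45, -45):                        # the four directions
    descriptor += project(pts, angle)
```

Concatenating the four projection histograms yields a descriptor that distinguishes spatial layouts the plain bag-of-words histogram cannot, at the cost of four extra histograms per region.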

7.
Because current bag-of-features (BoF) compression algorithms ignore the spatial relationships among coding vectors, this paper presents an image classification pipeline that combines compression with the pyramid model, and reports a comparative study of typical BoF compression algorithms on several public image datasets. The results show that the compression algorithms are robust to the number of visual words and to the choice of coding method; among them, the subspace-based compression algorithm classifies best in the high-level image feature space, achieves the best classification performance on several datasets, and has the lowest time cost.

8.
The traditional bag-of-words model merely represents an image as a histogram of visual words, taking into account neither object shape nor the spatial layout of visual features. The pyramid model is therefore introduced into the bag-of-words model to build a pyramid bag-of-words representation, which is combined with a pyramid histogram model so that the two complementary kinds of information jointly characterize the image; an SVM is used as the classifier. Experiments on the Caltech 101 database validate the method and show that it substantially improves image classification performance.
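The pyramid representation recurring in these abstracts works by splitting the image at level l into 2^l × 2^l cells, computing a word histogram per cell, and concatenating everything. A minimal sketch with toy word assignments (the real method would also weight levels and add the complementary pyramid histogram features):

```python
# Spatial-pyramid bag-of-words: per-cell word histograms over all levels.

def spatial_pyramid(points, words, n_words, levels, w, h):
    """Concatenated per-cell word histograms for levels 0..levels."""
    feat = []
    for l in range(levels + 1):
        cells = 2 ** l
        hists = [[0] * n_words for _ in range(cells * cells)]
        for (x, y), word in zip(points, words):
            cx = min(cells - 1, int(x / w * cells))   # column of the cell
            cy = min(cells - 1, int(y / h * cells))   # row of the cell
            hists[cy * cells + cx][word] += 1
        for hcell in hists:
            feat += hcell
    return feat

pts   = [(10, 10), (90, 90), (60, 20)]   # keypoint positions in a 100x100 image
words = [0, 1, 1]                        # visual word assigned to each keypoint
feat = spatial_pyramid(pts, words, n_words=2, levels=1, w=100, h=100)
```

Level 0 reproduces the plain bag-of-words histogram; level 1 adds four cell histograms that record where each word occurs, which is exactly the spatial information the plain model loses.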

9.
In recent years, image representations based on the bag-of-words model have discarded the spatial relationships among visual words and carried redundant information, so they cannot represent such images effectively. To address the underuse of the relative positions of visual words in the traditional model, and the ambiguity of their semantics, visual phrases based on support regions are proposed for image representation. Support-region detection finds the regions of the image that matter most for classification, and the visual words within them are spatially modelled into visual phrases used for classification. Comparative experiments on the standard UIUC-Sports8 and Scene-15 datasets show that the algorithm delivers good image classification performance.

10.
A pornographic-image filtering model based on a high-level semantic bag of visual words   Total citations: 1 (self: 0, others: 1)
Current pornographic-image filtering algorithms have excessive false-positive rates on bikini and skin-coloured images and cannot effectively filter multi-person pornographic images containing obscene actions. A filtering model based on a high-level semantic bag of visual words is therefore proposed. The model first extracts local feature points of pornographic scenes with an improved SURF algorithm, then fuses the contextual and spatially correlated high-level semantic features of the visual words to build a high-level semantic dictionary of pornographic images. Experimental results show that the model detects multi-person pornographic images with obscene actions with an accuracy of 87.6%, clearly higher than existing bag-of-visual-words filtering algorithms.

11.
Previous work on incorporating spatial information into the traditional bag-of-visual-words (BOVW) model mainly considers the spatial arrangement of an image, ignoring the rich textural information in land-use remote-sensing images. This article therefore presents a BOVW model based on 2-D wavelet decomposition (WD) for land-use scene classification, since 2-D wavelet decomposition performs well not only in textural feature extraction but also in multi-resolution image representation, which favours the use of both spatial arrangement and textural information in land-use images. The proposed method exploits the textural structure of an image after converting its colour information to greyscale. It works by first decomposing the greyscale image into sub-images with the 2-D discrete wavelet transform (DWT), then extracting local features of the greyscale image and all decomposed images over dense regions, sampling each image evenly on a regular grid with a specified spacing. The method then generates the corresponding visual vocabularies and computes histograms of visual-word occurrences for the local features found in each image. In particular, the soft-assignment or multi-assignment (MA) technique is employed to account for the clustering effect whereby two similar image patches may be assigned to different clusters as the visual vocabulary grows. The proposed method is evaluated on a ground-truth dataset of 21 land-use classes manually extracted from high-resolution remote-sensing images. Experimental results demonstrate that it significantly outperforms previous methods, such as the traditional BOVW model, the spatial-pyramid-representation-based BOVW method, and the multi-resolution-representation-based BOVW method, and even exceeds the best result reported by the creators of the land-use dataset. The proposed approach is therefore very suitable for land-use scene classification tasks.
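The decomposition step the article relies on can be illustrated with a one-level 2-D Haar wavelet transform, the simplest DWT. This is a hand-rolled sketch (a library such as PyWavelets would normally be used): rows are filtered first, then columns, producing the approximation (LL) and detail (LH, HL, HH) sub-images; the 4×4 input is toy data.

```python
# One-level 2-D Haar wavelet decomposition on a toy greyscale patch.

def haar_1d(row):
    """One Haar step: pairwise averages (low-pass) then differences (high-pass)."""
    avg = [(row[i] + row[i + 1]) / 2 for i in range(0, len(row), 2)]
    dif = [(row[i] - row[i + 1]) / 2 for i in range(0, len(row), 2)]
    return avg + dif

def haar_2d(img):
    rows = [haar_1d(r) for r in img]                  # filter every row
    cols = list(zip(*rows))                           # transpose
    out  = [haar_1d(list(c)) for c in cols]           # filter every column
    return [list(r) for r in zip(*out)]               # transpose back

img = [[4, 4, 2, 2],
       [4, 4, 2, 2],
       [1, 1, 3, 3],
       [1, 1, 3, 3]]
coeffs = haar_2d(img)
ll = [row[:2] for row in coeffs[:2]]                  # approximation sub-image
```

In the article's pipeline, local features would then be extracted densely from the original image and from each sub-image, giving the multi-resolution vocabularies described above.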

12.
13.
To address the low efficiency of video detection, a video scene classification method based on multi-feature fusion and feature thresholds, called threshold-decision classification, is proposed. First, the average key frame of the scene video is extracted. Then, according to its structural features and the contribution of different spatial structures to scene recognition, the average key frame is partitioned and recomposed into a region of interest and a secondary region of interest. Next, scene features are extracted from these two regions, and multi-feature fusion yields a composite feature for each. Finally, scenes are classified dynamically using the composite features and the feature thresholds. Experimental results show that the method makes full use of the structural features of video and reaches an accuracy of 80%, which demonstrates its effectiveness to a certain extent.

14.
Feature grouping and local soft match for mobile visual search   Total citations: 1 (self: 0, others: 1)
More powerful mobile devices have made mobile visual search a popular and distinctive image retrieval application. Such applications face a number of challenges arising from appearance variations in mobile images. State-of-the-art image retrieval systems improve performance with bag-of-words approaches; however, for visual search with mobile images exhibiting large variations, at least two critical issues remain unsolved: (1) the loss of discriminative power in features due to quantization, and (2) the underuse of spatial relationships among visual words. To address both issues, this paper presents a novel visual search method based on feature grouping and local soft matching, which takes the properties of mobile images into account and couples visual and spatial information consistently. First, features of the query image are grouped using both matched visual features and their spatial relationships; the grouped features are then softly matched to alleviate quantization loss. An efficient scoring scheme is devised to exploit the inverted file index and is compared with vocabulary-guided pyramid kernels. Finally, experiments on the Stanford mobile visual search database and a collected database of more than one million images show that the proposed method achieves a promising improvement over the vocabulary-tree approach, especially when query images contain large variations.
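The inverted file index mentioned above is what makes word-based scoring scale: each visual word maps to the images containing it, so only images sharing at least one word with the query are ever touched. A minimal sketch with a toy shared-word score (the paper's actual scoring scheme is more elaborate):

```python
# Inverted-file scoring over bag-of-words image signatures; toy data.
from collections import defaultdict

def build_index(db):
    """Map each visual word to (image_id, count) postings."""
    index = defaultdict(list)
    for img_id, words in db.items():
        counts = defaultdict(int)
        for w in words:
            counts[w] += 1
        for w, c in counts.items():
            index[w].append((img_id, c))
    return index

def score(query_words, index):
    """Accumulate shared-word counts over only the relevant postings lists."""
    scores = defaultdict(int)
    for w in query_words:
        for img_id, c in index.get(w, []):
            scores[img_id] += c
    return dict(scores)

db = {"img_a": [1, 2, 2], "img_b": [3, 4], "img_c": [2, 3]}
idx = build_index(db)
s = score([2, 3], idx)
```

Only the postings lists of words 2 and 3 are scanned; an image with no overlap with the query never appears in the score table at all.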

15.
A survey of bag-of-visual-words methods for image scene classification   Total citations: 1 (self: 1, others: 0)
Objective: Survey articles on bag-of-visual-words methods for image scene classification remain rare in domestic and international journals; to give researchers at home and abroad a fairly complete picture of these methods, this work systematically summarizes the research. Method: Drawing on a large body of domestic and international literature, the various bag-of-visual-words methods in image scene classification (mainly classification of single image scenes) are summarized and compared along several dimensions: the choice of low-level features and the generation of local patch features, the construction of the visual dictionary, the histogram representation of bag-of-visual-words features, and visual-word optimization. Results: The development of the bag-of-visual-words model is reviewed, the existing variants are categorized and the strengths and weaknesses of the common methods compared, performance-evaluation methods are summarized, and the standard scene datasets in common use are compiled together with the best accuracy reported on each. Conclusion: Research on bag-of-visual-words methods for image scene classification is a burgeoning topic in computer vision that has made considerable progress at home and abroad; work in the field no longer simply applies the model to describe image content directly, but increasingly considers the differences between images and text. Although many problems remain to be solved in applying the model to scene classification, this in no way diminishes the significance of the research.

16.
17.
We recognize regions of scene images for image recognition. First, the proposed segmentation method partitions images into several segments without using the Euclidean distance. Several features are needed to recognize regions, but they differ for chromatic and achromatic colours, so the regions are divided into three categories (black, achromatic, and chromatic). In this article, we focus on the achromatic category. The average intensity and the fractal-dimension features of the regions in this category are calculated, and the achromatic regions are recognized with a neural network using suitable features. Region-recognition experiments demonstrate the effectiveness of the proposed method. This work was presented in part at the 10th International Symposium on Artificial Life and Robotics, Oita, Japan, February 4–6, 2005.

18.
The visual vocabulary representation has been successfully applied to many multimedia and vision applications, including visual recognition, image retrieval, and scene modeling/categorization. The idea behind it is that an image can be represented by visual words, a collection of local image features. In this work, we develop a new scheme for constructing a visual vocabulary based on the analysis of visual-word content. By considering the content homogeneity of visual words, we design a visual vocabulary containing macro-sense and micro-sense visual words; the two types are then combined appropriately to describe an image effectively. We also apply the visual vocabulary to build image retrieval and categorization systems, and the performance evaluation of the two systems indicates that the proposed vocabulary achieves promising results.

19.
