Similar literature
 20 similar articles found (search time: 15 ms)
1.
Recently, image representation based on the bag-of-visual-words (BoW) model has been widely applied in the image and vision domains. In BoW, a visual codebook of visual words is defined, usually by clustering local features, to represent any novel image by the occurrences of the visual words it contains. Given a set of images, we argue that the significance of each image is determined by the significance of its contained visual words. Traditionally, the significance of visual words is defined by term frequency-inverse document frequency (tf-idf), which does not necessarily capture the intrinsic visual context. In this paper, we propose a new scheme of latent visual context learning (LVCL). The visual context among images and visual words is formulated from latent semantic context and visual link graph analysis. With LVCL, the importance of visual words and images can be distinguished from one another, which facilitates image-level applications such as image re-ranking and canonical image selection. We validate our approach on text-query-based search results returned by Google Image. Experimental results demonstrate the effectiveness and potential of LVCL in image re-ranking and canonical image selection over state-of-the-art approaches.
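
The tf-idf baseline that this abstract contrasts with LVCL can be sketched as follows. This is a minimal illustration, not the authors' code: the toy `codebook`, the two-dimensional features, and the function names are all assumptions made for the example, and LVCL itself is not implemented here.

```python
import math
from collections import Counter


def nearest_word(feature, codebook):
    """Index of the codebook entry closest to `feature` (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((f - c) ** 2 for f, c in zip(feature, codebook[k])),
    )


def bow_counts(features, codebook):
    """Count how often each visual word occurs among one image's local features."""
    counts = Counter(nearest_word(f, codebook) for f in features)
    return [counts.get(k, 0) for k in range(len(codebook))]


def tfidf(count_vectors):
    """Weight raw counts by tf-idf: tf = count / total, idf = log(N / df)."""
    n_images = len(count_vectors)
    n_words = len(count_vectors[0])
    # Document frequency: number of images in which each word appears.
    df = [sum(1 for v in count_vectors if v[k] > 0) for k in range(n_words)]
    weighted = []
    for v in count_vectors:
        total = sum(v) or 1
        weighted.append([
            (v[k] / total) * math.log(n_images / df[k]) if df[k] else 0.0
            for k in range(n_words)
        ])
    return weighted


# Hypothetical toy data: three visual words, two images.
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
images = [
    [(0.1, 0.0), (0.9, 1.1), (1.0, 0.8)],
    [(0.0, 0.2), (4.8, 5.1)],
]
counts = [bow_counts(img, codebook) for img in images]
weights = tfidf(counts)
```

In this toy case the first visual word appears in both images, so its idf is zero and it carries no weight. That illustrates why tf-idf reflects only occurrence statistics and not the visual context the abstract argues for.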

2.
When images are described with visual words obtained by vector quantization of low-level color, texture, and edge-related visual features of image regions, the result is usually referred to as a “bag-of-visual-words (BoVW)”-based representation. Although it has proved effective for image representation, much like document representation in text retrieval, the hard encoding approach, which maps each region to exactly one visual word, is not expressive enough to characterize image content with higher-level semantics and is prone to quantization error. In this model each word is considered independent of all other words. However, the words are in fact related, and the similarity of their occurrence across documents can reflect the underlying semantic relations between them. To take this into account, a soft image representation scheme is proposed that spreads each region’s membership values, through a local fuzzy membership function over a neighborhood, to the words in a codebook generated by a self-organizing map (SOM). The topology-preserving property of the SOM is exploited to generate the local membership function. A systematic evaluation of retrieval results on two different image collections (natural photographic and medical) shows that the proposed soft representation significantly improves precision at different recall levels compared with various low-level and “BoVW”-based features that consider only the probability of occurrence (or presence/absence) of a word.
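
The difference between hard and soft assignment over a SOM codebook can be sketched as below. This is only an illustration under stated assumptions: the Gaussian neighborhood function, the `sigma` and `radius` parameters, and the dictionary layout of `som` are hypothetical choices for the example, since the abstract does not give the actual membership function.

```python
import math


def soft_bow(features, som, sigma=1.0, radius=1):
    """Soft bag-of-visual-words histogram over a SOM codebook.

    `som` maps a grid position (row, col) to a codeword vector. Each feature
    contributes a total mass of 1, spread over the best-matching unit and its
    grid neighbours (assumed Gaussian falloff in grid distance).
    """
    positions = sorted(som)
    hist = {p: 0.0 for p in positions}
    for f in features:
        # Best-matching unit: the codeword closest to the feature.
        bmu = min(
            positions,
            key=lambda p: sum((a - b) ** 2 for a, b in zip(f, som[p])),
        )
        # Membership of every unit within `radius` grid steps of the BMU.
        memberships = {}
        for p in positions:
            dr, dc = p[0] - bmu[0], p[1] - bmu[1]
            if max(abs(dr), abs(dc)) <= radius:
                memberships[p] = math.exp(-(dr * dr + dc * dc) / (2 * sigma ** 2))
        total = sum(memberships.values())
        for p, m in memberships.items():
            hist[p] += m / total
    return hist


# Hypothetical 1x3 SOM with one-dimensional codewords.
som = {(0, 0): (0.0,), (0, 1): (1.0,), (0, 2): (2.0,)}
hist = soft_bow([(0.1,)], som)
```

Hard assignment would put the whole feature into unit `(0, 0)`. Here part of its mass also reaches the adjacent unit `(0, 1)`, relying on the SOM property that neighbouring grid units hold similar codewords.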

3.
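A fuzzy texture image segmentation algorithm based on the dual-tree complex wavelet transform is proposed. The method has two stages: texture feature extraction and texture classification. Features are extracted on the basis of the dual-tree complex wavelet transform. Texture classification can be performed directly by clustering with the fuzzy C-means algorithm to complete the segmentation, but the membership function in that algorithm is designed from the distance between a sample and the cluster center, which is unreasonable for data with a non-spherical distribution. To address this problem, a sample-to-sample compactness measure is introduced to describe the relationships among the samples within a cluster, the membership function is corrected accordingly, and the corrected function is used for texture classification. Experimental results show that, with a running time close to that of fuzzy C-means, the improved method clearly improves segmentation accuracy, edge accuracy, and region consistency.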
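
For reference, the standard fuzzy C-means membership that the abstract criticizes depends only on sample-to-center distances. The sketch below shows that textbook form, with fuzzifier `m` and hypothetical toy data. It does not implement the compactness-based correction, whose formula the abstract does not give.

```python
import math


def fcm_memberships(points, centers, m=2.0):
    """Standard FCM membership: u_ij = 1 / sum_k (d_ij / d_kj) ** (2 / (m - 1))."""
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for x in points:
        d = [math.sqrt(sum((a - b) ** 2 for a, b in zip(x, c))) for c in centers]
        if any(di == 0.0 for di in d):
            # A point that coincides with a center belongs fully to that center.
            hits = [di == 0.0 for di in d]
            row = [1.0 / sum(hits) if h else 0.0 for h in hits]
        else:
            row = [
                1.0 / sum((d[i] / d[k]) ** exponent for k in range(len(centers)))
                for i in range(len(centers))
            ]
        memberships.append(row)
    return memberships


# Hypothetical toy data: three points on a line, two cluster centers.
points = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]
centers = [(0.0, 0.0), (4.0, 0.0)]
U = fcm_memberships(points, centers)
```

Because each membership is a function of distance to the centers alone, the level sets are roughly spherical around each center. This is the limitation for non-spherical clusters that motivates the correction described in the abstract.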

4.
5.
6.
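The effectiveness of bag-of-words image representations is limited mainly by the quantization error of local features. This paper proposes an image representation based on multiple visual codebooks that improves on both codebook construction and the encoding method. Specifically: 1) multiple visual codebook construction, in which several compact and complementary visual codebooks are built iteratively; 2) image representation, in which, for the multi-codebook case, the corresponding visual words are first selected from each codebook in turn and the coding coefficients are estimated by linear regression, and the final image representation is then formed in combination with the spatial pyramid structure of the image. Image classification results on several standard test sets verify the effectiveness of the proposed method.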

7.
An Image Retrieval Method Using DCT Features   Total citations: 1, self-citations: 0, citations by others: 1

8.
This paper proposes an efficient technique for learning a discriminative codebook for scene categorization. A state-of-the-art approach to scene categorization is the Bag-of-Words (BoW) framework, in which codebook generation plays an important role in determining system performance. Traditionally, the codebook generation methods adopted in BoW techniques are designed to minimize the quantization error rather than to optimize classification accuracy. This paper addresses the issue by carefully designing the codewords so that the resulting image histograms for each category retain strong discriminating power, while online categorization of a test image remains as efficient as in the baseline BoW. The codewords are refined iteratively offline to improve their discriminative power. The proposed method is validated on the UIUC Scene-15 and NTU Scene-25 datasets and is shown to outperform other state-of-the-art codebook generation methods in scene categorization.

9.
10.
The relationship between visual words and local features (words structure) and the distribution among images (images structure) are important in feature encoding for approximating the intrinsically discriminative structure of images in the Bag-of-Words (BoW) model. However, in most recent methods, the intrinsic invariance of intra-class images is difficult to capture using words structure or images structure alone when classifying images with large variability. To overcome this limitation, we propose a local visual feature coding based on heterogeneous structure fusion (LVFC-HSF) that explores the nonlinear relationship between words structure and images structure in feature space, as follows. First, we use high-order topology to describe the dependence among visual words, and a distance measure based on local features to represent the distribution of images. Then, we construct a joint optimization framework according to the relevance between words structure and images structure to solve for the projection matrix of local features and the weight coefficients, which exploits the nonlinear relationship of the heterogeneous structures to balance their interaction. Finally, we adopt the improved Fisher kernel (IFK) to fit the distribution of the projected features and obtain the image feature. Experimental results on ORL, 15 Scenes, Caltech 101, and Caltech 256 demonstrate that heterogeneous structure fusion significantly enhances the construction of the intrinsic structure and consequently improves classification performance on these data sets.

11.
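Edge Detection Based on the Fuzzy Gradient Method   Total citations: 7, self-citations: 0, citations by others: 7
Based on the gradient variation of gray levels at image edges, a fuzzy matrix of image gray levels and a membership function describing edge points are constructed. A genetic algorithm is used to optimize the parameters of the membership function, and image edge points are extracted according to the output membership degree, thereby achieving image edge detection. Experiments show that the method effectively describes the transition across an edge and can improve the detection results.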

12.
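To address the robustness and accuracy of 3D palmprint feature representation, a 3D palmprint recognition method that fuses the geometric and orientation features of the surface is proposed. Building on the existing surface-type coding for extracting palmprint geometric features, a shape-index-based coding is proposed to jointly express the geometric features of the 3D palmprint, which effectively reduces the loss of accuracy caused by threshold-induced coding errors. In addition, a multi-scale improved competitive code is proposed to express the orientation features of the palmprint. At the decision level, a multi-dictionary collaborative representation framework fuses these geometric and orientation features to perform palmprint recognition. Extensive experiments on a public 3D palmprint dataset show that the proposed method achieves the best recognition accuracy while keeping computational complexity low.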

13.
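Zhang Tiangang (张天刚), Zhang Jing'an (张景安), Kang Suming (康苏明). Software (《软件》), 2012, (8): 28-31, 50
Because the relationship between facial appearance features and gender involves a degree of uncertainty, gender recognition from face images based on fuzzy membership is proposed. Facial features are extracted with the Local Binary Pattern (LBP), which is robust to changes in illumination and gray level. The face is first divided evenly into several sub-windows, an LBP histogram is extracted from each sub-window, and these histograms are then concatenated in order to describe the face. A fuzzy function suited to gender recognition from face images is derived in detail, and the gender of a face is determined by the maximum membership principle. Experiments on the FG-NET face database and a self-built FID face database achieved a highest recognition rate of 96%.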
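
The sub-window LBP description in this abstract can be sketched as follows, assuming the basic 8-neighbour LBP operator and a 2×2 grid of sub-windows. The abstract does not state which LBP variant or grid size the authors used, so both are assumptions for the example, as are the function names.

```python
def lbp_map(img):
    """Basic 8-neighbour LBP code for every interior pixel of a grayscale image."""
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    height, width = len(img), len(img[0])
    codes = []
    for r in range(1, height - 1):
        row = []
        for c in range(1, width - 1):
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # Set the bit when the neighbour is at least as bright as the center.
                if img[r + dr][c + dc] >= img[r][c]:
                    code |= 1 << bit
            row.append(code)
        codes.append(row)
    return codes


def block_histograms(codes, grid=(2, 2), bins=256):
    """Split the LBP map into grid sub-windows and concatenate their histograms."""
    block_h = len(codes) // grid[0]
    block_w = len(codes[0]) // grid[1]
    descriptor = []
    for gr in range(grid[0]):
        for gc in range(grid[1]):
            hist = [0] * bins
            for r in range(gr * block_h, (gr + 1) * block_h):
                for c in range(gc * block_w, (gc + 1) * block_w):
                    hist[codes[r][c]] += 1
            descriptor.extend(hist)
    return descriptor


# Hypothetical 6x6 constant image: every neighbour equals the center pixel.
flat = [[7] * 6 for _ in range(6)]
descriptor = block_histograms(lbp_map(flat))
```

Concatenating the per-window histograms in a fixed order keeps coarse spatial layout, which a single global histogram would discard. The fuzzy membership classifier described in the abstract would then operate on this descriptor.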

14.
The sharp boundary problem is widespread in image classification algorithms that use traditional association rules, and it makes classification more difficult and less accurate. In addition, massive image data produce many redundant association rules, which greatly reduce the accuracy and efficiency of image classification. To mitigate these two problems, this paper proposes a novel approach that integrates fuzzy association rules and a decision tree to perform automatic image annotation. The approach first applies membership functions to the original features to obtain fuzzy feature vectors, which can describe the ambiguity and vagueness of images. Fuzzy association rules are then generated from the fuzzy feature vectors using fuzzy support and fuzzy confidence. These rules can capture correlations between low-level visual features and high-level semantic concepts of images. Finally, to handle the large size of the fuzzy association rule base, we adopt a decision tree to remove unnecessary rules. As a result, the algorithm complexity is reduced to a large extent. We conduct experiments on two baseline datasets, Corel5k and IAPR-TC12. The evaluation measures include precision, recall, F-measure, and rule number. The experimental results show that our approach performs better than many state-of-the-art automatic image annotation approaches.

15.
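Traditional fastener inspection methods are inefficient and unreliable and cannot meet the needs of modern railway maintenance, so an automatic method for detecting missing fasteners based on computer vision is proposed. After Canny edge detection is applied to the grayscale image, a cross-intersection positioning method locates the fastener, yielding a fastener region of 120×200 pixels, from which 20 edge feature values of the fastener image are extracted. Finally, the fuzzy C-means clustering algorithm is used to cluster the feature values of the two classes, and the fastener state is classified by computing the membership degree of the object under diagnosis with respect to the standard patterns. Application tests show that the image processing method and the recognition and classification algorithm effectively detect missing track fasteners, with fast detection, good robustness, and a detection rate of 96%.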

16.
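Research on Image Coding Based on Fuzzy Vector Quantization   Total citations: 4, self-citations: 0, citations by others: 4
The principle of fuzzy vector quantization (FVQ) image coding is analyzed, and the three elements of FVQ design are given. An exponential fuzzy vector quantization algorithm (FVQE) for image coding is proposed. Experimental results show that the image coding performance of FVQE is comparable to that of FVQ, while its convergence is slightly faster than that of the FVQ algorithm.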

17.
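Scene understanding is a prerequisite for a robot to carry out tasks autonomously in diverse environments, and scene discovery is an important part of scene understanding. Because a specific scene is continuous in space and time, it can be assumed that a mobile robot stays in the same scene over a given period and that the images belonging to the same scene are visually similar. Incremental outdoor scene discovery that requires no prior knowledge is therefore proposed. A hierarchical bag-of-words model links images to scenes, making the scene discovery process closer to human cognition. Each image acquired by the robot in real time is first divided into blocks; a dynamic clustering algorithm then incrementally builds the corresponding low-level dictionary, from which high-level bag-of-words features are extracted; next, another dynamic clustering algorithm incrementally performs scene discovery, deciding whether the current image belongs to a previously visited scene or to an unvisited one, until a new scene is found. Experimental results show that the method effectively performs autonomous scene discovery without prior knowledge.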

18.
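Label creation is a key step in deep learning applications. To overcome the difficulties in extracting and annotating object contours in remote sensing images caused by the complex motion of the unmanned aerial vehicle (UAV) platform, insufficient illumination, and complex object contours, an improved Live-Wire algorithm is proposed and applied to labeling typical ground objects in UAV remote sensing images. An improved fuzzy membership function overcomes the insufficient gray-level coverage of the Pal-King membership function and is combined with a double-threshold method to extract edge points; the improved Pal-King fuzzy edge detection method replaces the Laplacian edge extraction of the Live-Wire algorithm. The cost function is optimized by adding the variation of gradient magnitude between nodes, which improves the continuity of contour tracking in the Live-Wire algorithm. Extensive comparative experiments show that, compared with traditional methods, the improved Live-Wire method is more robust and more efficient in contour extraction and tracking.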

19.
With the rapid development of mobile terminal devices, landmark recognition applications on mobile devices have been widely researched in recent years. Because mobile users require fast response times, an accurate and efficient landmark recognition system is urgently needed for mobile applications. In this paper, we propose a landmark recognition framework that employs a novel discriminative feature selection method and an improved extreme learning machine (ELM) algorithm. The scalable vocabulary tree (SVT) is first used to generate a set of preliminary codewords for landmark images. An efficient codebook learning algorithm derived from word mutual information and the Visual Rank technique is proposed to filter out unimportant codewords. The selected visual words then serve as the codebook for image encoding and are used to produce a compact Bag-of-Words (BoW) histogram. The fast ELM algorithm and an ensemble approach using the ELM classifier are used for landmark recognition. Experiments on the Nanyang Technological University campus landmark database and the Fifteen Scene database illustrate the advantages of the proposed framework.

20.
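Because the relationship between facial appearance features and age involves considerable uncertainty, age estimation from face images based on fuzzy membership is proposed. Facial features are extracted with the Gabor wavelet transform, which is highly robust to changes in illumination and scale. To avoid the curse of dimensionality and reduce subsequent computation, principal component analysis is used to reduce the dimensionality of the extracted features. A fuzzy function suited to age estimation from face images is derived in detail, and the age of a face is estimated by the maximum membership principle. Experiments on the FG-NET face database and a self-built FAID face database achieved a highest recognition rate of 94%.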
