Similar Literature
 20 similar documents retrieved (search time: 15 ms)
1.
2.
A new object detection and tracking method based on a generative-discriminative model is proposed. Exploiting the invariance of the DAISY descriptor to illumination, deformation, viewpoint and scale, as well as its computational efficiency, stable feature points of the target are extracted and described to form the generative model; a Hough forest classifier serves as the discriminative model and is trained on target image patches. In subsequent video frames, the model is updated using the detection results and a similarity measure against the discriminative codebook, yielding a dynamically adaptive discriminative codebook. Experiments show that this combination of the fast and effective DAISY descriptor with the accurate and robust Hough forest classifier achieves high tracking accuracy and good real-time performance, tolerates partial occlusion of the target, and recognizes targets at different resolutions.
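For illustration only (not the authors' implementation): the sketch below extracts dense DAISY descriptors with scikit-image and trains a random forest to score patches as target versus background. A real Hough forest additionally stores offset votes to the object centre at each leaf, which is omitted here; all parameter values are assumed.

```python
# Minimal sketch: dense DAISY descriptors feed a random forest patch classifier.
import numpy as np
from skimage.feature import daisy
from sklearn.ensemble import RandomForestClassifier

def patch_descriptors(gray_image):
    # descs has shape (rows, cols, feature_dim); flatten to one descriptor per grid cell
    descs = daisy(gray_image, step=8, radius=15, rings=2, histograms=6, orientations=8)
    return descs.reshape(-1, descs.shape[-1])

# X_train: descriptors from labelled target / background patches, y_train: 1 = target, 0 = background
forest = RandomForestClassifier(n_estimators=50, max_depth=12)
# forest.fit(X_train, y_train)
# patch_scores = forest.predict_proba(patch_descriptors(frame))[:, 1]
```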

3.
An activity monitoring system enables many applications that assist in caring for the elderly in their homes. In this paper we present a wireless sensor network for unobtrusive observation in the home and show the potential of generative and discriminative models for recognizing activities from such observations. Through a large number of experiments on four real-world datasets we show the effectiveness of the generative hidden Markov model and the discriminative conditional random field for activity recognition.
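As a concrete illustration of the generative side (assumed, not taken from the paper), the following sketch runs the forward pass of a discrete hidden Markov model over a sequence of sensor observation symbols; the numbers are toy values.

```python
import numpy as np

def hmm_forward(pi, A, B, obs):
    """pi: (K,) initial state probs, A: (K, K) transitions,
    B: (K, M) emission probs, obs: sequence of observation symbol indices."""
    alpha = pi * B[:, obs[0]]
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
        alpha /= alpha.sum()          # rescale to avoid numerical underflow
    return alpha                      # filtered distribution over activities

# Toy example with 2 activities and 3 sensor patterns
pi = np.array([0.6, 0.4])
A  = np.array([[0.9, 0.1], [0.2, 0.8]])
B  = np.array([[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]])
print(hmm_forward(pi, A, B, [0, 0, 2, 2, 1]))
```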

4.
Activity recognition in smart homes enables the remote monitoring of elderly people and patients. In healthcare systems, the reliability of a recognition model is of high importance. A limited amount of training data and an imbalanced number of activity instances lead to over-fitting and make recognition models inconsistent. In this paper, we propose an activity recognition approach that integrates distance minimization (DM) and probability estimation (PE) to improve the reliability of recognition. DM assigns labels using the distances of instances from the mean representation of each activity class; it helps avoid biasing decisions towards the activity class with the majority of instances, but it can over-fit. PE, on the other hand, generalizes well: it estimates the probability of a correct assignment from the obtained distances, but it requires a large amount of training data. We apply data oversampling to improve the representation of classes with fewer instances. A support vector machine (SVM) combines the outputs of DM and PE, since SVMs perform well on imbalanced data and further improve the generalization ability of the approach. The proposed approach is evaluated on five publicly available smart home datasets. The results demonstrate better performance than state-of-the-art activity recognition approaches.
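A hedged sketch of the general idea, with my own names and parameters: distances to per-class mean vectors (DM) and probability estimates derived from those distances (PE) are stacked into one feature vector and handed to an SVM for the final assignment.

```python
import numpy as np
from sklearn.svm import SVC

def dm_pe_features(X, class_means):
    # class_means: (C, d) mean feature vector of each activity class
    dists = np.linalg.norm(X[:, None, :] - class_means[None, :, :], axis=2)  # (n, C)
    probs = np.exp(-dists)
    probs /= probs.sum(axis=1, keepdims=True)       # crude probability estimate
    return np.hstack([dists, probs])

# class_means = np.vstack([X_train[y_train == c].mean(axis=0) for c in classes])
# svm = SVC(kernel="rbf").fit(dm_pe_features(X_train, class_means), y_train)
# y_pred = svm.predict(dm_pe_features(X_test, class_means))
```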

5.
This paper proposes an efficient technique for learning a discriminative codebook for scene categorization. A state-of-the-art approach to scene categorization is the Bag-of-Words (BoW) framework, where codebook generation plays an important role in determining the performance of the system. Traditionally, the codebook generation methods adopted in BoW techniques are designed to minimize quantization error rather than optimize classification accuracy. In view of this, this paper designs the codewords so that the resulting image histograms for each category retain strong discriminating power, while online categorization of a test image remains as efficient as in the baseline BoW. The codewords are refined iteratively offline to improve their discriminative power. The proposed method is validated on the UIUC Scene-15 and NTU Scene-25 datasets and is shown to outperform other state-of-the-art codebook generation methods in scene categorization.
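For reference, a minimal baseline BoW pipeline (a sketch, not the paper's discriminative refinement): k-means provides the codebook and each image becomes a normalized codeword histogram; the paper then perturbs these codewords offline to make the histograms more discriminative while test-time coding stays unchanged.

```python
import numpy as np
from sklearn.cluster import KMeans

def build_codebook(all_descriptors, k=200):
    # all_descriptors: (N, d) local descriptors pooled from the training images
    return KMeans(n_clusters=k, n_init=5).fit(all_descriptors)

def bow_histogram(image_descriptors, codebook):
    words = codebook.predict(image_descriptors)
    hist = np.bincount(words, minlength=codebook.n_clusters).astype(float)
    return hist / max(hist.sum(), 1.0)   # normalized codeword histogram for one image
```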

6.
Recently, hybrid generative-discriminative approaches have emerged as an efficient engine for knowledge representation and data classification. However, little attention has been devoted to modeling and classifying non-Gaussian, and especially proportional, vectors. Our main goal in this paper is to discover the true structure of this kind of data by building probabilistic kernels from generative mixture models based on the Liouville family, from which we develop the Beta-Liouville distribution, which includes the well-known Dirichlet as a special case. The Beta-Liouville has a more general covariance structure than the Dirichlet, which makes it more practical and useful. Our learning technique follows a principled, purely Bayesian approach, and the resulting models are used to generate support vector machine (SVM) probabilistic kernels based on information divergence. In particular, we show the existence of closed-form expressions for the Kullback-Leibler and Rényi divergences between two Beta-Liouville distributions, and hence between two Dirichlet distributions as a special case. Through extensive simulations and a number of experiments on synthetic data, visual scenes, and texture image classification, we demonstrate the effectiveness of the proposed approaches.
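The Dirichlet special case mentioned above has a well-known closed-form Kullback-Leibler divergence; the sketch below (my own code, not the authors') computes it and hints at how such a divergence could be turned into an SVM kernel.

```python
import numpy as np
from scipy.special import gammaln, digamma

def kl_dirichlet(alpha, beta):
    """Closed-form KL( Dir(alpha) || Dir(beta) )."""
    a0, b0 = alpha.sum(), beta.sum()
    return (gammaln(a0) - gammaln(alpha).sum()
            - gammaln(b0) + gammaln(beta).sum()
            + np.dot(alpha - beta, digamma(alpha) - digamma(a0)))

# A symmetrized divergence can then be exponentiated into a kernel, e.g.
# K(p, q) = exp(-(kl_dirichlet(p, q) + kl_dirichlet(q, p)) / 2)
print(kl_dirichlet(np.array([2.0, 3.0, 4.0]), np.array([1.0, 1.0, 1.0])))
```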

7.
8.
This paper proposes a method for scene categorization that integrates region contextual information into the popular Bag-of-Visual-Words approach. The Bag-of-Visual-Words approach describes an image as a bag of discrete visual words whose frequency distribution is used for image categorization. However, traditional visual words struggle with patches that have similar appearance but distinct semantic concepts. The drawback stems from constructing each visual word independently. This paper introduces a Region-Conditional Random Fields model that learns each visual word depending on the other visual words in the same region. Compared with the traditional Conditional Random Fields model, there are two areas of novelty. First, the initial label of each patch is defined automatically from its visual features rather than manually assigned semantic labels. Second, a novel potential function is built under the region contextual constraint. Experimental results on three well-known datasets show that Region Contextual Visual Words indeed improve categorization performance compared to traditional visual words.

9.
Objective: Because of the "semantic gap" between low-level features and high-level semantics in image retrieval, automatic image annotation has become a key problem. To narrow this gap, an automatic image annotation method that hybridizes generative and discriminative models is proposed. Method: In the generative learning stage, images are modeled with a continuous probabilistic latent semantic analysis model, which yields the model parameters and a topic distribution for each image. Using this topic distribution as an intermediate representation vector, automatic annotation is cast as a multi-label classification problem. In the discriminative learning stage, an ensemble of classifier chains is learned over the intermediate representation vectors; building the chains also integrates contextual information between annotation keywords, leading to higher annotation precision and better retrieval results. Results: Experiments on two benchmark datasets show that the method achieves average precision and average recall of 0.28 and 0.32 on Corel5k and 0.29 and 0.18 on IAPR-TC12, outperforming most state-of-the-art automatic annotation methods. The precision-recall curves also show that it outperforms several representative annotation methods. Conclusion: The proposed annotation method, based on a hybrid learning strategy, integrates the respective strengths of generative and discriminative models and shows good effectiveness and robustness in semantic image retrieval. With suitable adaptation, the method can also play a role in cross-media retrieval and data mining beyond image retrieval and recognition.
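A hedged sketch of the discriminative stage using scikit-learn's ClassifierChain; the generative PLSA stage is assumed to have already produced a topic-distribution vector per image, and the variable names are illustrative.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import ClassifierChain

# X_topics: (n_images, n_topics) topic distributions from the generative model
# Y_keywords: (n_images, n_keywords) binary keyword indicator matrix
chain = ClassifierChain(LogisticRegression(max_iter=1000), order="random", random_state=0)
# chain.fit(X_topics, Y_keywords)
# predicted = chain.predict(X_topics_test)
# Each link in the chain sees the earlier keyword decisions, which injects
# keyword co-occurrence context into the multi-label prediction.
```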

10.
纪野, 戴亚平, 廣田薰, 邵帅. 控制与决策 (Control and Decision), 2024, 39(4): 1305-1314
To address image deblurring in dynamic scenes, a dual learning generative adversarial network (DLGAN) is proposed. Under a dual-learning training scheme, the network performs deblurring with unpaired blurry and sharp images, so the training set no longer needs to consist of blurry images paired with their corresponding sharp images. DLGAN exploits the duality between the deblurring task and the re-blurring task to build a feedback signal, and uses this signal to constrain the two tasks so that they learn from and update each other from two different directions until convergence. Experimental results show that, in terms of structural similarity and visual quality, DLGAN outperforms nine image deblurring methods trained on paired datasets.
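A minimal sketch of the dual-learning feedback signal, assuming PyTorch and two generator networks named deblur_net and reblur_net (my own naming, not the paper's code): deblurring and re-blurring are trained as a round trip on unpaired images.

```python
import torch.nn.functional as F

def dual_feedback_loss(deblur_net, reblur_net, blurry, sharp):
    # unpaired blurry image: deblur, then re-blur; the round trip should reproduce it
    rec_blurry = reblur_net(deblur_net(blurry))
    # unpaired sharp image: re-blur, then deblur
    rec_sharp = deblur_net(reblur_net(sharp))
    return F.l1_loss(rec_blurry, blurry) + F.l1_loss(rec_sharp, sharp)

# This term would be added to the adversarial losses of both generators, so the two
# tasks constrain and update each other until convergence, as described above.
```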

11.
Objective: Text in images is everywhere in daily life; while it conveys information, it can also leak information. Text removal algorithms address this, but existing methods leave residual text and fill the removed regions with visually unconvincing content. This paper therefore proposes an image text removal model based on gated recurrent units (GRU) that removes text from images with high quality and high efficiency. Method: A stroke-level binary mask detection module built from GRUs accurately extracts a stroke-level binary mask from the input image. This mask is fed as auxiliary information into a text removal module based on a generative adversarial network, which erases the text and fills in the background color. A proposed text loss and a luminance loss further improve removal quality, and inverted residual blocks replace ordinary convolutions to make removal efficient. Results: On a real dataset of 1,080 manually processed image pairs and a synthetic dataset of 1,000 pairs produced by text synthesis, comparisons with three other text removal methods show that the proposed method performs better on image quality metrics such as peak signal-to-noise ratio and structural similarity, as well as in visual quality. Conclusion: Compared with the baselines, the proposed GRU-based model not only removes text more cleanly and keeps the filled regions consistent with the background, but also reduces the model's parameter count and computation, cutting the overall computational cost by 72.0%.

12.
On combining classifier mass functions for text categorization
Experience shows that different text classification methods can give different results. We look here at a way of combining the results of two or more different classification methods using an evidential approach. The specific methods we have been experimenting with in our group include the support vector machine, kNN (nearest neighbors), the kNN model-based approach (kNNM), and Rocchio, but the analysis and methods apply to any classifiers. We review these learning methods briefly, and then we describe our method for combining the classifiers. In a previous study, we suggested that the combination could be done using evidential operations and that using only two focal points in the mass functions gives good results. However, there are conditions under which we should choose to use more focal points. We assess some aspects of this choice from a reasoning perspective and suggest a refinement of the approach.
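As an illustration of the evidential operation involved (my own code, not the paper's), the sketch below applies Dempster's rule of combination to two mass functions that each have only two focal points: the classifier's predicted class and the whole frame of discernment.

```python
from itertools import product

def dempster_combine(m1, m2):
    # m1, m2: dicts mapping frozenset focal elements to mass values
    combined, conflict = {}, 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb
    return {s: w / (1.0 - conflict) for s, w in combined.items()}

# Two classifiers, each with two focal points: its predicted class and the whole frame
frame = frozenset({"sports", "politics"})
m_svm = {frozenset({"sports"}): 0.8, frame: 0.2}
m_knn = {frozenset({"politics"}): 0.3, frame: 0.7}
print(dempster_combine(m_svm, m_knn))
```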

13.
Dou Jianfang, Qin Qin, Tu Zimei. Multimedia Tools and Applications, 2017, 76(14): 15839-15866
An effective object appearance model is one of the key issues for the success of visual tracking. Since the appearance of a target and the environment changes...

14.
陈雷, 陈启军. 控制与决策 (Control and Decision), 2012, 27(9): 1320-1324
For robot scene recognition, the correlation between consecutive scenes is described by a contextual model based on a hidden Markov model. Departing from the traditional practice of learning the contextual scene recognition model with a generative method, a sparse Bayesian learning machine is first introduced to model the posterior probability of the image features in the contextual model; the sparse Bayesian model is then combined with the hidden Markov model through Bayes' rule, yielding a discriminative learning method for the contextual scene recognition model. Experiments on a real-world scene database show that the resulting contextual scene recognition system has strong recognition ability and good generalization.
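A hedged sketch of the hybrid idea with illustrative names: per-frame class posteriors from a discriminative model (assumed to come from the sparse Bayesian learner) are converted to scaled likelihoods and decoded with HMM transitions over consecutive scenes.

```python
import numpy as np

def decode_scene_sequence(posteriors, priors, A, pi):
    """posteriors: (T, K) p(scene | frame), priors: (K,) class priors,
    A: (K, K) scene transition matrix, pi: (K,) initial scene probabilities."""
    likelihoods = posteriors / priors              # p(frame | scene) up to a constant
    T, K = posteriors.shape
    delta = np.log(pi) + np.log(likelihoods[0])
    back = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        scores = delta[:, None] + np.log(A)        # rows: previous scene, cols: current
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + np.log(likelihoods[t])
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):                  # backtrack the best scene sequence
        path.append(int(back[t][path[-1]]))
    return path[::-1]
```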

15.
In this work, we study the problem of cross-domain video concept detection, where the distributions of the source and target domains differ. Active learning can be used to iteratively refine a source-domain classifier by querying labels for a few samples in the target domain, which reduces the labeling effort. However, the traditional active learning approach, which often uses a discriminative query strategy that queries the samples most ambiguous to the source-domain classifier, fails when the distribution difference between the two domains is too large. In this paper, we tackle this problem with a joint active learning approach that combines a novel generative query strategy with the existing discriminative one. The approach adapts to the distribution difference and is more robust than approaches using a single strategy. Experimental results on two synthetic datasets and the TRECVID video concept detection task highlight the effectiveness of our joint active learning approach.
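A hedged sketch (my own naming and scoring choices, not the paper's strategy): a discriminative score based on classifier ambiguity and a generative score based on fit to the target-domain density are mixed into one query score; the mixing weight would be adapted to the domain difference. The classifier is assumed to expose predict_proba.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def joint_query_scores(clf, X_target, lam=0.5):
    # discriminative part: 1 - margin between the two most probable classes
    proba = np.sort(clf.predict_proba(X_target), axis=1)
    ambiguity = 1.0 - (proba[:, -1] - proba[:, -2])
    # generative part: how representative a sample is of the target domain
    gmm = GaussianMixture(n_components=3).fit(X_target)
    density = np.exp(gmm.score_samples(X_target))
    density /= density.max()
    return lam * ambiguity + (1.0 - lam) * density  # query the highest-scoring samples
```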

16.
17.
Twitter is a worldwide social media platform where millions of people frequently express ideas and opinions on any topic. This widespread success makes the analysis of tweets an interesting and possibly lucrative task: tweets are rarely objective and are a natural target for large-scale analysis. In this paper, we explore the idea of integrating two fundamental aspects of a tweet, its textual content and its underlying structural information, when addressing the tweet categorization task. Thus, we analyze not only the textual content of tweets but also the structural information provided by the relationships between tweets and users, and we propose different methods for effectively combining the feature models extracted from these knowledge sources. To test our approach, we address the specific task of determining the political opinion of Twitter users within their political context, and we observe that our most refined knowledge-integration approach performs remarkably better (about 5 points higher) than the classic text-based model.
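A hedged sketch of combining the two feature models (illustrative names, assuming scikit-learn): TF-IDF text features and structural relationship features are concatenated into one sparse matrix before a single classifier.

```python
from scipy.sparse import hstack, csr_matrix
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

def build_features(tweet_texts, structural_features, vectorizer=None):
    # structural_features: (n_tweets, n_struct) e.g. user/retweet relationship counts
    if vectorizer is None:
        vectorizer = TfidfVectorizer(min_df=2)
        text = vectorizer.fit_transform(tweet_texts)
    else:
        text = vectorizer.transform(tweet_texts)
    return hstack([text, csr_matrix(structural_features)]), vectorizer

# X_train, vec = build_features(train_texts, train_struct)
# clf = LinearSVC().fit(X_train, train_labels)
# X_test, _ = build_features(test_texts, test_struct, vectorizer=vec)
```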

18.
Breakthrough performance has been achieved in computer vision with deep neural networks. In this paper we propose to use a random forest to classify image representations obtained by concatenating multiple layers of learned features from deep convolutional neural networks for scene classification. Specifically, we first use deep convolutional neural networks pre-trained on the large-scale image database Places to extract features from scene images. Then, we concatenate multiple layers of features from the deep neural networks as image representations. After that, we use a random forest as the classifier for scene classification. Moreover, to reduce feature redundancy in the image representations, we derive a novel feature selection method for selecting features that suit random forest classification. Extensive experiments are conducted on two benchmark datasets, MIT-Indoor and UIUC-Sports, and the results demonstrate the effectiveness of the proposed method. The contributions of the paper are as follows. First, by extracting multiple layers of deep neural network features, we can exploit more information about image content when determining categories. Second, we propose a novel feature selection method that reduces redundancy in features obtained from deep neural networks for random-forest-based classification. In particular, since deep learning methods can augment expert systems by having the systems essentially train themselves, and since the proposed framework is general and easily extended to other intelligent systems that use deep learning, the proposed method offers a potential way to improve the performance of other expert and intelligent systems.
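A hedged sketch with the CNN feature extraction assumed done elsewhere (e.g. with a network pre-trained on Places); the importance-based selection here is only a stand-in for the paper's own feature selection method.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def classify_scenes(layer_features_train, y_train, layer_features_test, keep=2000):
    # layer_features_*: list of (n_images, d_l) arrays, one per CNN layer
    X_train = np.hstack(layer_features_train)
    X_test = np.hstack(layer_features_test)
    # probe forest ranks feature dimensions; keep the most informative ones
    probe = RandomForestClassifier(n_estimators=100).fit(X_train, y_train)
    top = np.argsort(probe.feature_importances_)[-keep:]
    rf = RandomForestClassifier(n_estimators=300).fit(X_train[:, top], y_train)
    return rf.predict(X_test[:, top])
```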

19.
20.
We propose a new supervised object retrieval method based on the selection of local visual features learned with the BLasso algorithm. BLasso is a boosting-like procedure that efficiently approximates the Lasso path through backward regularization steps. The advantage over a classical boosting strategy is that it produces a sparser selection of visual features. This improves the efficiency of retrieval and, as discussed in the paper, facilitates human visual interpretation of the generated models. We carried out our experiments on the Caltech-256 dataset with state-of-the-art local visual features. We show that our method outperforms AdaBoost in effectiveness while significantly reducing model complexity and prediction time. We discuss the evaluation of the obtained visual models in terms of human interpretability.
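As a stand-in illustration (not BLasso itself), the sketch below uses scikit-learn's lasso_path to rank features by the order in which they enter the model along the regularization path, which yields the same kind of sparse selection of visual features.

```python
import numpy as np
from sklearn.linear_model import lasso_path

def sparse_feature_selection(X, y, n_alphas=50):
    alphas, coefs, _ = lasso_path(X, y, n_alphas=n_alphas)
    # coefs has shape (n_features, n_alphas); alphas go from strong to weak
    # regularization, so we record the order in which features first become non-zero
    selected = []
    for j in range(coefs.shape[1]):
        for f in np.flatnonzero(coefs[:, j]):
            if f not in selected:
                selected.append(int(f))
    return selected  # features ranked by entry into the model; truncate for sparsity
```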
