Similar Documents
Found 20 similar documents (search time: 15 ms)
2.
Probabilistic Models of Appearance for 3-D Object Recognition   (total citations: 6; self-citations: 0; citations by others: 6)
We describe how to model the appearance of a 3-D object using multiple views, learn such a model from training images, and use the model for object recognition. The model uses probability distributions to describe the range of possible variation in the object's appearance. These distributions are organized on two levels. Large variations are handled by partitioning training images into clusters corresponding to distinctly different views of the object. Within each cluster, smaller variations are represented by distributions characterizing uncertainty in the presence, position, and measurements of various discrete features of appearance. Many types of features are used, ranging in abstraction from edge segments to perceptual groupings and regions. A matching procedure uses the feature uncertainty information to guide the search for a match between model and image. Hypothesized feature pairings are used to estimate a viewpoint transformation taking account of feature uncertainty. These methods have been implemented in an object recognition system, OLIVER. Experiments show that OLIVER is capable of learning to recognize complex objects in cluttered images, while acquiring models that represent those objects using relatively few views.

4.
Many conventional design techniques can only be applied to ideal controlled objects without parameter variations and disturbances. This paper proposes a new method by which these techniques can be applied to practical objects with parameter variations and disturbances. The principle is to construct a model-following system which makes the practical object's output follow the model output, thereby converting the practical object into an ideal object with very low sensitivity to parameter variations and disturbances.

6.
To track objects in video sequences, many studies have characterized the target by its color distribution. Most often, the Gaussian mixture model (GMM) is used to represent the object's color density. In this paper, we propose to extend the normality assumption to more general families of distributions from Pearson's system. Specifically, we propose a method called the Pearson mixture model (PMM), used in conjunction with a Gaussian copula, which is dynamically updated to adapt to changes in the object's appearance during the sequence. This model is combined with Kalman filtering to predict the position of the object in the next frame. Experimental results on gray-level and color video sequences show tracking improvements over the classical GMM. In particular, the PMM appears robust to illumination variations, pose and scale changes, and partial occlusions, though its computation time is higher than that of the GMM.
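The Kalman prediction step mentioned above, which advances the object's position estimate to the next frame, can be sketched with a constant-velocity model. This is a minimal illustration with made-up state and noise values, not the paper's implementation:

```python
import numpy as np

# Constant-velocity Kalman predict step for a 2-D object position.
# State x = [px, py, vx, vy]; F advances position by velocity each frame.
F = np.array([[1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], dtype=float)

def kalman_predict(x, P, Q):
    """Predict next state and covariance (before any measurement update)."""
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    return x_pred, P_pred

x = np.array([10.0, 20.0, 2.0, -1.0])   # at (10, 20), moving (2, -1) px/frame
P = np.eye(4)                            # state covariance (illustrative)
Q = 0.01 * np.eye(4)                     # process noise (illustrative)
x_pred, P_pred = kalman_predict(x, P, Q)
print(x_pred[:2])                        # predicted position: [12. 19.]
```

The predicted position then seeds the color-model search window in the next frame.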

7.
In this paper, we propose an approach for learning appearance models of moving objects directly from compressed video. The appearance of a moving object changes dynamically in video due to varying object poses, lighting conditions, and partial occlusions. Efficiently mining the appearance models of objects is a crucial and challenging technology for supporting content-based video coding, clustering, indexing, and retrieval at the object level. The proposed approach learns the appearance models of moving objects in the spatio-temporal dimension of video data by taking advantage of the MPEG video compression format. It detects a moving object and recovers the trajectory of each macroblock covered by the object using the motion vectors present in the compressed stream. The appearances are then reconstructed in the DCT domain along the object's trajectory and modeled as a mixture of Gaussians (MoG) using DCT coefficients. We prove that, under certain assumptions, the MoG model learned from the DCT domain can achieve pixel-level accuracy when transformed back to the spatial domain, and has better band selectivity than an MoG model learned in the spatial domain. We finally cluster the MoG models to merge the appearance models of the same object for object-level content analysis.
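The mixture-of-Gaussians fit at the heart of this approach can be sketched with a tiny expectation-maximization loop on one coefficient band. The data here is a synthetic stand-in for DCT coefficients, not the paper's actual features:

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy stand-in for one DCT coefficient gathered along a trajectory:
# two appearance modes -> a 2-component mixture of Gaussians.
data = np.concatenate([rng.normal(-3, 1, 200), rng.normal(4, 1, 200)])

def fit_mog2(x, iters=50):
    """Tiny EM for a 2-component 1-D mixture of Gaussians (illustrative)."""
    mu = np.array([x.min(), x.max()])          # spread-out initialization
    var = np.ones(2)
    pi = np.full(2, 0.5)
    for _ in range(iters):
        # E-step: responsibility of each component for each sample
        dens = pi * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) \
               / np.sqrt(2 * np.pi * var)
        r = dens / dens.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and variances
        nk = r.sum(axis=0)
        pi = nk / len(x)
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk
    return pi, mu, var

pi, mu, var = fit_mog2(data)
print(np.sort(mu))   # estimated means land near the true -3 and 4
```

In the paper the same kind of model is fit per macroblock trajectory in the DCT domain rather than on a single synthetic band.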

8.
Robust visual tracking remains a technical challenge in real-world applications, as an object may undergo many appearance variations. In existing tracking frameworks, objects in an image are often represented as vector observations, which discards the intrinsic 2-D structure of the image. By treating an image in its actual form as a matrix, we construct a 3rd-order tensor-based object representation that preserves the spatial correlation within the 2-D image and fully exploits the useful temporal information. We perform incremental updates of the object template using the N-mode SVD to model appearance variations, which reduces the influence of template drift and object occlusions. The proposed scheme efficiently learns a low-dimensional tensor representation by adaptively updating the eigenbasis of the tensor. Tensor-based Bayesian inference in the particle filter framework is then used to realize tracking. We validate the proposed tracking system through real-time facial expression recognition on video data and a live camera. Evaluation on challenging benchmark image sequences undergoing appearance variations demonstrates the significance and effectiveness of the proposed algorithm.
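The N-mode SVD used for the template's eigenbasis amounts to taking the left singular vectors of each mode unfolding of the tensor. A minimal sketch with a random stand-in tensor (sizes and ranks are illustrative):

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding: bring `mode` to the front, flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def n_mode_svd(tensor, ranks):
    """Truncated left singular basis of each unfolding (the per-mode eigenbases)."""
    return [np.linalg.svd(unfold(tensor, m), full_matrices=False)[0][:, :r]
            for m, r in enumerate(ranks)]

# A toy 3rd-order tensor: 8x8 image patches stacked over 5 frames.
T = np.random.default_rng(1).standard_normal((8, 8, 5))
bases = n_mode_svd(T, ranks=(4, 4, 3))
print([U.shape for U in bases])   # [(8, 4), (8, 4), (5, 3)]
```

An incremental tracker would update these bases as new frames arrive instead of recomputing the SVDs from scratch.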

11.
A Multiple Variable Precision Rough Set Model   (total citations: 1; self-citations: 0; citations by others: 1)
陆秋琴, 和涛, 黄光球. Journal of Computer Applications (计算机应用), 2011, 31(6): 1634-1637
To address the fact that the universe partition in Ziarko's variable precision rough set model cannot overlap, this paper extends the universe of Ziarko's model on the basis of multisets and proposes a multiset-based multiple variable precision rough set model. The complete definition of the model, related theorems, and important properties are given, including the definition of the multiple universe, the definition of multiple variable precision approximation sets and proofs of their properties, and the relationship to Ziarko's variable precision rough sets. These definitions, theorems, and properties both differ from and relate to Ziarko's model. Multiple variable precision rough sets can fully capture the overlap between knowledge granules, the differences in object importance, and object multiplicity, which helps rough set theory discover relevant knowledge in relational-database data with one-to-many and many-to-many dependencies that would otherwise be regarded as unrelated.
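The variable precision lower approximation described above can be sketched directly: a granule belongs to the β-lower approximation of a set X when its inclusion degree in X reaches β, and unlike a partition, the granules here are allowed to overlap. The granules and X below are made-up toy data:

```python
# Overlapping granules over universe {1..8} (a partition would forbid this).
granules = [{1, 2, 3}, {3, 4, 5}, {5, 6}, {6, 7, 8}]
X = {1, 2, 3, 4, 6, 7, 8}

def lower_approx(granules, X, beta):
    """Union of granules whose inclusion degree |B ∩ X| / |B| is at least beta."""
    keep = [B for B in granules if len(B & X) / len(B) >= beta]
    return set().union(*keep) if keep else set()

print(sorted(lower_approx(granules, X, beta=0.9)))  # [1, 2, 3, 6, 7, 8]
print(sorted(lower_approx(granules, X, beta=0.6)))  # [1, 2, 3, 4, 5, 6, 7, 8]
```

Lowering β admits granules that are only mostly contained in X, which is exactly the precision/coverage trade-off the variable precision model controls.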

12.
Research on 3-D Modeling for Man-Made Target Recognition in Remote Sensing Images   (total citations: 2; self-citations: 0; citations by others: 2)
This paper studies 3-D modeling methods for recognizing man-made ground targets in remote sensing images. It analyzes the characteristics of the recognition task, compares common modeling methods, introduces a geometric representation based on the generalized-cone idea, and uses object-oriented techniques to represent the model's internal data and operations.

13.
This paper presents a probabilistic framework for discovering objects in video. The video can switch between different shots, the unknown objects can leave or enter the scene at multiple times, and the background can be cluttered. The framework consists of an appearance model and a motion model. The appearance model exploits the consistency of object parts in appearance across frames. We use maximally stable extremal regions as observations in the model and hence provide robustness to object variations in scale, lighting and viewpoint. The appearance model provides location and scale estimates of the unknown objects through a compact probabilistic representation. The compact representation contains knowledge of the scene at the object level, thus allowing us to augment it with motion information using a motion model. This framework can be applied to a wide range of different videos and object types, and provides a basis for higher level video content analysis tasks. We present applications of video object discovery to video content analysis problems such as video segmentation and threading, and demonstrate superior performance to methods that exploit global image statistics and frequent itemset data mining techniques.

16.
An autoregressive model approach to two-dimensional shape classification   (total citations: 8; self-citations: 0; citations by others: 8)
In this paper, a method of classifying objects is reported that is based on the use of autoregressive (AR) model parameters which represent the shapes of boundaries detected in digitized binary images of the objects. The object identification technique is insensitive to object size and orientation. Three pattern recognition algorithms that assign object names to unlabelled sets of AR model parameters were tested and the results compared. Isolated object tests were performed on five sets of shapes, including eight industrial shapes (mostly taken from the recognition literature), and recognition accuracies of 100 percent were obtained for all pattern sets at some model order in the range 1 to 10. Test results indicate the ability of the technique developed in this work to recognize partially occluded objects. Processing-speed measurements show that the method is fast in the recognition mode. The results of a number of object recognition tests are presented. The recognition technique was realized with Fortran programs, Imaging Technology, Inc. image-processing boards, and a PDP 11/60 computer. The computer algorithms are described.
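The size insensitivity of AR model parameters can be seen in a small sketch: fit AR coefficients to a closed boundary's centroid-to-boundary distance signature by least squares, and note that uniformly scaling the shape leaves the coefficients unchanged. The ellipse signature and model order below are illustrative, not the paper's test data:

```python
import numpy as np

def ar_coefficients(r, p):
    """Least-squares AR(p) fit to a circular boundary signature r[t]."""
    # Column k holds r shifted by k+1 (circular, since the boundary is closed).
    X = np.column_stack([np.roll(r, k) for k in range(1, p + 1)])
    a, *_ = np.linalg.lstsq(X, r, rcond=None)
    return a

# Boundary signature of a 2:1 ellipse: radius as a function of angle.
theta = np.linspace(0, 2 * np.pi, 64, endpoint=False)
r = 1.0 / np.sqrt((np.cos(theta) / 2.0) ** 2 + np.sin(theta) ** 2)

a_small = ar_coefficients(r, p=2)
a_big = ar_coefficients(2.5 * r, p=2)   # same shape, 2.5x the size
print(np.allclose(a_small, a_big))      # scaling r rescales both sides of the fit
```

Rotating the shape only circularly shifts the signature, so with a suitable starting-point normalization the coefficients are insensitive to orientation as well.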

17.
Objective: To address the degraded tracking performance and accuracy for multiple moving objects under a moving background, this paper proposes a method based on OPTICS clustering and an object-region probability model. Method: First, Harris-Sift feature point detection is introduced to match feature points between adjacent frames, improving feature point tracking precision and robustness. Then, exploiting the fact that each moving object's motion vector differs from the background's, an improved OPTICS clustering algorithm is applied to the constructed optical-flow map, accurately separating out the background and yielding an estimated region for each moving object. An independent object-region probability model (OPM) is built for each moving object and updated iteratively over the detected frames to obtain each object's accurate region. Results: The method effectively resolves the degraded tracking performance and accuracy for multiple moving objects under a moving background. Harris-Sift feature extraction and matching take only 17% of the time required by Sift features. In complex outdoor environments, the method's average accuracy is 14% higher than that of traditional background-compensation methods, and it accurately separates moving objects from the moving background. Conclusion: Experimental results show that the algorithm meets real-time requirements, accurately separates moving-object regions from background regions, and is robust to camera motion and rotation, scene brightness changes, and other disturbances.
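The background/object separation step relies on the dominant cluster of motion vectors belonging to the background. As a minimal numpy-only stand-in for the OPTICS clustering used in the paper (the vectors, epsilon, and cluster layout below are made up), a greedy single-link sweep already shows the idea:

```python
import numpy as np

def group_vectors(v, eps=0.5):
    """Greedy single-link grouping of 2-D motion vectors (stand-in for OPTICS)."""
    labels = -np.ones(len(v), dtype=int)
    current = 0
    for i in range(len(v)):
        if labels[i] != -1:
            continue
        labels[i] = current
        changed = True
        while changed:
            # Absorb every unlabeled vector within eps of any cluster member.
            member = v[labels == current]
            near = (np.linalg.norm(v[:, None] - member[None], axis=2) < eps).any(1)
            new = near & (labels == -1)
            changed = new.any()
            labels[new] = current
        current += 1
    return labels

rng = np.random.default_rng(2)
background = rng.normal([0.0, 0.0], 0.05, (40, 2))  # dominant camera motion
target = rng.normal([3.0, 1.0], 0.05, (10, 2))      # one moving object
labels = group_vectors(np.vstack([background, target]))
print(np.bincount(labels))   # cluster sizes: [40 10]
```

The largest cluster is taken as the background; the remaining clusters seed the per-object region estimates.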

18.
Scale-Invariant Visual Language Modeling for Object Categorization   (total citations: 2; self-citations: 0; citations by others: 2)
In recent years, "bag-of-words" models, which treat an image as a collection of unordered visual words, have been widely applied in the multimedia and computer vision fields. However, because they ignore the spatial structure among visual words, they cannot discriminate between objects with similar word frequencies but different spatial word distributions. In this paper, we propose a visual language modeling method (VLM), which incorporates the spatial context of local appearance features into a statistical language model. To represent the object categories, models with different orders of statistical dependency are exploited. In addition, a multilayer extension makes the VLM more resistant to scale variations of objects. The model is effective and applicable to large-scale image categorization. We train scale-invariant visual language models on images grouped by Flickr tags and use these models for object categorization. Experimental results show they achieve better performance than single-layer visual language models and "bag-of-words" models. They also achieve performance comparable to 2-D MHMM and SVM-based methods, at a much lower computational cost.
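The statistical language model underlying the VLM can be sketched with bigram counts over visual words read in spatial order. The "words" below are made-up labels standing in for codebook indices:

```python
from collections import Counter

# A visual "sentence": codebook labels of local patches read in spatial order.
# A bigram model keeps the adjacency information that a bag-of-words
# histogram throws away.
words = ["sky", "sky", "roof", "wall", "wall", "door", "wall"]

unigrams = Counter(words)
bigrams = Counter(zip(words, words[1:]))

def bigram_prob(w_prev, w, alpha=1.0, vocab_size=4):
    """Laplace-smoothed P(w | w_prev), as in a statistical language model."""
    return (bigrams[(w_prev, w)] + alpha) / (unigrams[w_prev] + alpha * vocab_size)

print(round(bigram_prob("wall", "door"), 3))
```

Higher-order dependencies (trigrams and beyond) extend this in the same way, and the multilayer extension repeats the counting at several image scales.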

19.
Many current recognition systems use variations on constrained tree search to locate objects in cluttered environments. If the system is simply finding instances of an object known to be in the scene, then previous formal analysis has shown that the expected amount of search is quadratic in the number of model and data features when all the data is known to come from a single object, but is exponential when spurious data is included. If one can group the data into subsets likely to have come from a single object, then terminating the search once a “good enough” interpretation is found reduces the expected search to cubic. Without successful grouping, terminated search is still exponential. These results apply to finding instances of a known object in the data. What happens when the object is not present? In this article, we turn to the problem of selecting models from a library, and examine the combinatorial cost of determining that an incorrectly chosen candidate object is not present in the data. We show that the expected search is again exponential, implying that naive approaches to library indexing are likely to carry an expensive overhead, since an exponential amount of work is needed to weed out each incorrect model. The analytic results are shown to be in agreement with empirical data for cluttered object recognition.

20.
In this paper we represent the object with multiple attentional blocks, which reflect findings on selective visual attention in human perception. The attentional blocks are extracted using a branch-and-bound search on the saliency map, and the weight of each block is determined at the same time. Independent particle filter tracking is applied to each attentional block, and the tracking results of all the blocks are then combined in a linear weighting scheme to obtain the location of the entire target object. The attentional blocks are propagated to the object location found in each new frame, and the state of the most likely particle in each block is updated with the newly propagated position. In addition, to avoid error accumulation caused by appearance variations, the object template and the positions of the attentional blocks are adaptively updated during tracking. Experimental results show that the proposed algorithm efficiently tracks salient objects and handles partial occlusions and large appearance variations better.
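The linear weighting scheme that fuses the per-block tracking results is a weighted average of the block position estimates. The positions and saliency weights below are illustrative values, not from the paper:

```python
import numpy as np

# Each attentional block's particle filter reports a position estimate;
# the object location is their saliency-weighted average.
block_pos = np.array([[101.0, 52.0],   # block 1 estimate (x, y)
                      [98.0, 50.0],    # block 2 estimate
                      [120.0, 80.0]])  # block 3 estimate (e.g. occluded, low weight)
weights = np.array([0.5, 0.4, 0.1])    # saliency-derived block weights

location = (weights[:, None] * block_pos).sum(axis=0) / weights.sum()
print(location)   # [101.7  54. ]
```

Down-weighting unreliable blocks, as with block 3 here, is what lets the combined estimate ride out partial occlusions.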


Copyright © Beijing Qinyun Technology Development Co., Ltd. (北京勤云科技发展有限公司)  京ICP备09084417号