Similar literature
Found 20 similar documents (search time: 21 ms)
1.
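Classical sparse-representation tracking algorithms inevitably become unstable on complex video and tend to drift when the target is occluded. To address this problem, a sparse-representation tracking algorithm based on sub-region matching is proposed. First, the initial target template is divided into several sub-regions, and the LK (Lucas-Kanade) image registration algorithm is used to build an observation model that predicts the target's motion state in the next frame. Then, the predicted target region is divided in the same way, and the optimal sub-region is sought during matching. Finally, a new template correction mechanism is introduced into template updating, which effectively overcomes drift. The algorithm is compared with several target tracking algorithms on different video sequences. The experimental results show that it tracks well under occlusion, motion, illumination changes and complex backgrounds, and that it performs better than the classical sparse-representation tracking algorithm.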

2.
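To address the limited ability to describe the target in visual tracking, and to exploit the effectiveness of local sparse representation models, a structured sparse tracking method based on importance weighting is proposed. The method models target appearance with a structured sparse representation and weights each local image patch according to the role it plays in representing that appearance. Within a particle filter framework, the target state is estimated by maximum a posteriori estimation. The target template is updated online by a template update strategy with an occlusion detection mechanism, to avoid tracking drift. Experiments show that the method effectively reduces the influence of target appearance changes on the model and tracks robustly under occlusion, illumination change and target pose change in video sequences.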

3.
王杰  蒋明敏  花晓慧  鲁守银  李金屏 《智能系统学报》2015,10(5):775-782
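To track a target and precisely locate its spatial position in the binocular visual servo system of a robot manipulator, a target tracking method using projection histogram matching and the epipolar geometry constraint is proposed. The target is marked manually in each of the two views, and its horizontal and vertical projection histograms in multiple color spaces are extracted as matching templates. In one view, the target is searched for and tracked using the motion consistency principle and projection histogram matching. In the other view, the search range is restricted according to the epipolar geometry of the binocular vision system, and the target is searched for and located. The method describes the structural information of the target with horizontal and vertical projection histograms and performs both target tracking and registration in the binocular vision system, which benefits precise target localization and visual measurement. Experimental results show that the method tracks the target effectively in a binocular vision system, with high computational efficiency and strong robustness.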
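The horizontal and vertical projection histograms mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the single-channel patch, the histogram-intersection similarity and the function names (`projection_histograms`, `match_score`) are assumptions made for illustration, and the multi-color-space and epipolar-constraint steps are omitted.

```python
def projection_histograms(patch):
    """Return (horizontal, vertical) projections of a 2D intensity patch.

    horizontal[i] is the sum of row i; vertical[j] is the sum of column j.
    """
    horizontal = [sum(row) for row in patch]
    vertical = [sum(col) for col in zip(*patch)]
    return horizontal, vertical


def normalize(hist):
    """Scale a histogram so its bins sum to 1 (all zeros if it is empty)."""
    total = sum(hist)
    return [v / total for v in hist] if total else [0.0] * len(hist)


def intersection(h1, h2):
    """Histogram intersection of two normalized histograms (1.0 = identical)."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def match_score(template, candidate):
    """Average the row-projection and column-projection similarities.

    Assumes template and candidate patches have the same size.
    """
    th, tv = projection_histograms(template)
    ch, cv = projection_histograms(candidate)
    return 0.5 * (intersection(normalize(th), normalize(ch))
                  + intersection(normalize(tv), normalize(cv)))
```

In a tracker, a score of this kind would be evaluated for candidate windows around the previous target position, and the highest-scoring window kept. Because the projections retain where intensity mass lies along each axis, they encode some structure that a plain intensity histogram discards.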

4.
张伟俊  钟胜  徐文辉  WU Ying 《自动化学报》2021,47(7):1572-1588
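Mainstream target tracking algorithms build the visual representation of the tracked object as a rectangular template, which cannot effectively distinguish target pixels from background pixels. Under challenging factors such as complex backgrounds, non-rigid target deformation and complex motion, they are prone to model drift, which causes tracking failure. Meanwhile, pixel-level saliency information and motion prior information, which are important cues by which the human visual system separates target from background and recognizes moving objects, have not been effectively integrated into mainstream tracking algorithms. To address these problems, a pixel-level probabilistic representation model of the target is proposed, together with a corresponding pixel-level target probability inference method, which can effectively use pixel-level saliency and motion observations and can be fused with mainstream correlation filter tracking algorithms. A saliency-based observation model is proposed which, through a background prior and the proposed background distance model, yields highly discriminative pixel-level image observations against complex backgrounds. The continuity of target and camera motion is used to compute the motion patterns of target and background, and on this basis a motion-estimation-based image observation model is built. Experimental results show that the proposed target representation model and fusion method effectively integrate these pixel-level image observations, and that the proposed tracking method achieves overall tracking accuracy superior to several current state-of-the-art trackers and is more robust to challenging factors such as complex backgrounds, target deformation and in-plane rotation.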

5.
The aim of the work is to build self-growing architectures to support visual surveillance and human–computer interaction systems. The objectives include identifying and tracking persons or objects in the scene and interpreting user gestures for interaction with services, devices and systems implemented in the digital home. The system must address multiple vision tasks at various levels, such as segmentation, representation or characterization, and analysis and monitoring of movement, to allow the construction of a robust representation of the environment and the interpretation of the elements of the scene. It is also necessary to integrate the vision module into a global system that operates in a complex environment, receiving images from acquisition devices at video frequency and offering results to higher-level systems that monitor and take decisions in real time; it must meet a set of requirements such as time constraints, high availability, robustness, high processing speed and re-configurability. Based on our previous work with neural models to represent objects, in particular the Growing Neural Gas (GNG) model and the study of topology preservation as a function of parameter selection, it is proposed to extend the capabilities of this self-growing model to track objects and represent their motion in image sequences under temporal restrictions. These neural models have various interesting features, such as their ability to readjust to new input patterns without restarting the learning process, their adaptability to represent deformable objects and even objects that are divided into different parts, and the intrinsic resolution of the feature-matching problem for sequence analysis and monitoring of movement. It is proposed to build an architecture based on the GNG, called GNG-Seq, to represent and analyze the motion in image sequences.
Several experiments are presented that demonstrate the validity of the architecture for solving problems of target tracking, motion analysis and human–computer interaction.

6.
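Objective: To address degraded tracking performance and low accuracy for multiple moving targets against a moving background, this paper proposes a method based on OPTICS clustering and a target region probability model. Method: First, Harris-Sift feature point detection is introduced to match feature points between adjacent frames, improving the accuracy and robustness of feature point tracking. Next, because the motion vectors of each moving target differ from those of the background, an improved, annotated OPTICS algorithm is introduced to cluster on the constructed optical flow map, accurately separating the background and yielding an estimated region for each moving target. An independent target region probability model (OPM) is built for each moving target and updated iteratively as more frames are processed, to obtain the accurate region of each moving target. Result: The proposed method resolves the problem of degraded tracking performance and low accuracy for multiple moving targets against a moving background; Harris-Sift feature extraction and matching takes only 17% of the time required by Sift features. In complex outdoor environments, the average accuracy of the proposed method is 14% higher than that of the traditional background compensation method, and the method accurately separates moving targets from a moving background. Conclusion: Experimental results show that the algorithm meets real-time requirements, accurately separates moving target regions from background regions, and is robust to factors such as camera motion and rotation and changes in scene brightness.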

7.
In this paper, we address the multiple target tracking problem as a maximum a posteriori problem. We adopt a graph representation of all observations over time. To make full use of the visual observations from the image sequence, we introduce both motion and appearance likelihoods. The multiple target tracking problem is formulated as finding multiple optimal paths in the graph. Due to noisy foreground segmentation, an object may be represented by several foreground regions, and similarly one foreground region may correspond to multiple objects. To deal with this problem, we propose merge, split and mean shift operations to generate new hypotheses for the measurement graph. The proposed approach uses a sliding window framework that aggregates information across a fixed number of frames. Experimental results on both indoor and outdoor data sets are reported. Furthermore, we provide a comparison between the proposed approach and existing methods that do not merge/split detected blobs.

8.
Intelligent visual surveillance - A survey   Total citations: 3, self-citations: 0, citations by others: 3
Detection, tracking, and understanding of moving objects of interest in dynamic scenes have been active research areas in computer vision over the past decades. Intelligent visual surveillance (IVS) refers to an automated visual monitoring process that involves analysis and interpretation of object behaviors, as well as object detection and tracking, to understand the visual events of the scene. The main tasks of IVS include scene interpretation and wide-area surveillance control. Scene interpretation aims at detecting and tracking moving objects in an image sequence and understanding their behaviors. In the wide-area surveillance control task, multiple cameras or agents are controlled in a cooperative manner to monitor tagged objects in motion. This paper reviews recent advances and future research directions for these tasks. The article consists of two parts: the first part surveys image enhancement, moving object detection and tracking, and motion behavior understanding. The second part reviews wide-area surveillance techniques based on the fusion of multiple visual sensors, camera calibration and cooperative camera systems.

9.
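Research and implementation of a visual localization method for mobile robots   Total citations: 1, self-citations: 0, citations by others: 1
The local visual localization problem for mobile robots is studied. First, the position time series of the target's centroid feature point is obtained by the mobile robot's visual localization and target tracking system. Then, after analyzing the shortcomings of obtaining target depth information by the secondary imaging method, a method for obtaining the target's spatial position and motion information is proposed. The method uses image sequences and the extended Kalman filter; target acquisition uses the HIS model. Provided the mobile robot performs sufficient maneuvers, the target's spatial position and motion information are obtained fairly accurately. Simulation results verify the effectiveness and feasibility of the method.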

10.
This paper presents an object-based image retrieval method based on visual-pattern matching. A visual pattern is obtained by detecting the line edge in a square block using the moment-preserving edge detector. Querying multimedia data by finding an object inside a target image is desirable, yet it remains a challenge. Given an object model, an added difficulty is that the object might be translated, rotated, and scaled inside a target image. Object segmentation and recognition is the primary computer vision step in applying higher-level image analysis to image retrieval. However, automatic segmentation and recognition of objects via object models is a difficult task without a priori knowledge about the shape of objects. Instead of segmentation and detailed object representation, the objective of this research is to develop and apply computer vision methods that explore the structure of an image object by visual-pattern detection to retrieve images from a database. A voting scheme based on the generalized Hough transform is proposed to provide an object search method that is invariant to translation, rotation and scaling of the image data, and hence invariant to orientation and position. Computer simulation results show that the proposed method gives good performance in terms of retrieval accuracy and robustness.

11.
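Objective: To overcome the sensitivity of a single color feature to illumination changes and the sensitivity of spatial structure features to target deformation, a hierarchical structure histogram combined with color attributes is proposed. Method: First, because layering an image by pixel gray value is easily affected by illumination changes, the image is layered by color attributes: the input color image is mapped from RGB space to the color attribute space, giving 11 probability layer maps. Then each pixel is projected only onto the layer in which its probability is highest, so that the layers are pairwise disjoint and their union is the whole image. For each processed layer, the pixel distribution is counted using defined structure primitives, giving the spatial distribution information of that layer. Finally, the per-layer spatial distribution information is concatenated to form the hierarchical structure histogram of the input image, which is used to represent the image. Result: To demonstrate the effectiveness of the feature, it is applied to image matching and visual tracking. Compared with the reference feature, the mean peak-to-sidelobe ratio in image matching improves by 1.3479; in visual tracking with a particle filter as the tracking framework, the success rate rises by 4% and the precision by 4.6% in relative terms. Conclusion: The feature combines the color features of an image with its spatial structure information, effectively addressing the poor discriminability of a single feature. Compared with the reference feature it is more discriminative and robust, and is therefore better suited to image processing applications.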
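The hard-assignment step in this abstract, in which each pixel goes only to its highest-probability layer so that the layers partition the image, can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the per-layer probability maps are taken as given (the RGB-to-color-attribute mapping is not reproduced), ties go to the lowest layer index, and the function names are hypothetical.

```python
def hard_assign_layers(prob_maps):
    """Label each pixel with the index of its highest-probability layer.

    prob_maps[k][y][x] is the probability that pixel (y, x) belongs to
    layer k. Ties are resolved in favour of the lowest layer index.
    """
    n_layers = len(prob_maps)
    height = len(prob_maps[0])
    width = len(prob_maps[0][0])
    return [
        [max(range(n_layers), key=lambda k: prob_maps[k][y][x])
         for x in range(width)]
        for y in range(height)
    ]


def layer_masks(labels, n_layers):
    """Turn a label map into one binary mask per layer.

    Each pixel is set in exactly one mask, so the masks are pairwise
    disjoint and together cover the whole image.
    """
    return [
        [[1 if label == k else 0 for label in row] for row in labels]
        for k in range(n_layers)
    ]
```

With the 11 color-attribute layers described in the abstract, `prob_maps` would hold 11 maps; the spatial statistics would then be computed on each mask separately and concatenated into the final histogram.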

12.
Recovery of nonrigid motion and structure   Total citations: 6, self-citations: 0, citations by others: 6
The authors introduce a physically correct model of elastic nonrigid motion. This model is based on the finite element method, but decouples the degrees of freedom by breaking down object motion into rigid and nonrigid vibration or deformation modes. The result is an accurate representation for both rigid and nonrigid motion that has greatly reduced dimensionality, capturing the intuition that nonrigid motion is normally coherent and not chaotic. Because of the small number of parameters involved, this representation is used to obtain accurate overconstrained estimates of both rigid and nonrigid global motion. It is also shown that these estimates can be integrated over time by use of an extended Kalman filter, resulting in stable and accurate estimates of both three-dimensional shape and three-dimensional velocity. The formulation is then extended to include constrained nonrigid motion. Examples of tracking single nonrigid objects and multiple constrained objects are presented.

13.
This paper presents a new visual aggregation model for representing visual information about moving objects in video data. Based on available automatic scene segmentation and object tracking algorithms, the proposed model provides eight operations to calculate object motions at various levels of semantic granularity. It represents the trajectory, color and dimensions of a single moving object and the directional and topological relations among multiple objects over a time interval. Each representation of a motion can be normalized to reduce computational cost and improve storage utilization. To facilitate query processing, two optimal approximate matching algorithms are designed to match time-series visual features of moving objects. Experimental results indicate that the proposed algorithms substantially outperform conventional subsequence matching methods in measuring the similarity between two trajectories. Finally, the visual aggregation model is integrated into a relational database system, and a prototype content-based video retrieval system has been implemented as well.

14.
We have developed a prototype for a miniaturized, active vision system with a sensor architecture based on a logarithmically structured, space-variant pixel geometry. The central part of the image has a high resolution, and the periphery has a smoothly falling resolution. The human visual system uses a similar image architecture. Our system integrates a miniature CCD-based camera, a novel pan-tilt actuator/controller, general purpose processors, a video-telephone modem and a display. Due to the ability of space-variant sensors to cover large work spaces, yet provide high acuity with an extremely small number of pixels, architectures with space-variant, active vision systems provide a potential for reductions in system size and cost of several orders of magnitude. Cortex-I takes up less than a third of a cubic foot, including camera, actuators, control, computers, and power supply, and was built for a (one-off) parts cost of roughly US $2000. In this paper, we describe several applications that we have developed for Cortex-I such as tracking moving objects, visual attention, pattern recognition (license plate reading), and video-telephone communications (teleoperation). We report here on the design of the camera and optics (8 × 8 × 8 mm), a method to convert the uniform image to a space-variant image, and a new miniature pan-tilt actuator, the spherical pointing motor (SPM), (4 × 5 × 6 cm). Finally, we discuss applications for motion tracking and license plate reading. Potential application domains for systems of this type include vision systems for mobile robots and robot manipulators, traffic monitoring systems, security and surveillance, telerobotics, and consumer video communications. The long-range goal of this project is to demonstrate that major new applications of robotics will become feasible when small, low-cost, machine-vision systems can be mass produced.
We use the term commodity robotics to express the expected impact of the possibilities for opening up new application niches in robotics and machine vision, for what has until now been an expensive, and therefore limited, technology.

15.
《Advanced Robotics》2013,27(6):495-514
This paper presents an active method for locating target objects in images, which is aimed at improving the performance of detecting object boundaries by enhancing the behavioral characteristics of an active contour. The proposed active contour model simulates a mechanical system consisting of two main parts: the first is a rigid fixture, called the 'core', specifying the expected shape of target boundaries, while the second is an elastic rod attached to the rigid fixture. The elastic rod deforms or moves relative to the rigid core according to the classical laws of the mechanical system. When the initial contour is applied to image data, it is attracted towards the dominant image features, but tries to keep its home shape and simultaneously make the deformation smooth if a deformation is more natural for force equilibrium. This mechanism significantly improves the performance of detecting object boundaries in the presence of some disturbing image features. The active contour is scale invariant, thereby significantly relieving the difficulty in selecting proper values for the model parameters. The values for the model parameters can be selected to make the contour have the desired behaviors around the equilibrium position through the analysis of the vibration mode of the mechanical system. The performance of the proposed method is validated through a series of experiments, which include detection of heavily degraded objects, tracking of objects under non-rigid motion and comparisons with the original snake models.

16.
《Real》1997,3(6):415-432
Real-time motion capture plays a very important role in various applications, such as 3D interfaces for virtual reality systems, digital puppetry, and real-time character animation. In this paper we address the problem of estimating and recognizing the motion of articulated objects using the optical motion capture technique. In addition, we present an effective method to control the articulated human figure in real time. The heart of this problem is the estimation of the 3D motion and posture of an articulated, volumetric object using feature points from a sequence of multiple perspective views. Under some moderate assumptions such as smooth motion and known initial posture, we develop a model-based technique for the recovery of the 3D location and motion of a rigid object using a variation of the Kalman filter. The posture of the 3D volumetric model is updated by the 2D image flow of the feature points for all views. Two novel concepts – the hierarchical Kalman filter (HKF) and the adaptive hierarchical structure (AHS) incorporating the kinematic properties of the articulated object – are proposed to extend our formulation for the rigid object to the articulated one. Our formulation also allows us to avoid two classic problems in 3D tracking: the multi-view correspondence problem, and the occlusion problem. By adding more cameras and placing them appropriately, our approach can deal with the motion of the object in a very wide area. Furthermore, multiple objects can be handled by managing multiple AHSs and processing multiple HKFs. We show the validity of our approach using synthetic data acquired simultaneously from multiple virtual cameras in a virtual environment (VE) and real data derived from a moving light display with walking motion. The results confirm that the model-based algorithm works well on the tracking of multiple rigid objects.

17.
Optical flow in log-mapped image plane - a new approach   Total citations: 1, self-citations: 0, citations by others: 1
Foveating vision sensors are important in both machine and biological vision. The term space-variant or foveating vision refers to sensor architectures based on smooth variation of resolution across the visual field, like that of the human visual system. Traditional image processing techniques do not hold when applied directly to such an image representation, since the translation symmetry and the neighborhood structure in the spatial domain are broken by the space-variant properties of the sensor. Unfortunately, there has been little systematic development of image processing tools that are explicitly designed for foveated vision. The author proposes a novel approach to compute the optical flow directly on log-mapped images. We propose the use of a generalized dynamic image model (GDIM) based method for computing the optical flow, as opposed to the brightness constancy model (BCM) based method. We introduce a new notion of "variable window" and use the space-variant form of the gradient operator while computing the spatio-temporal gradient in log-mapped images, for better accuracy and to ensure that the local neighborhood is preserved. We emphasize that the proposed method must be numerically accurate, provide a consistent interpretation, and be capable of computing the peripheral motion. Experimental results on both synthetic and real images are presented to show the efficacy of the proposed method.
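The log-mapped image plane referred to in this abstract is commonly modelled as a log-polar transform about the fovea. The Python sketch below illustrates only that coordinate mapping, under the assumption of an ideal continuous log-polar model; it is not the paper's optical-flow method, and the function names are hypothetical.

```python
import math


def to_log_polar(x, y, cx=0.0, cy=0.0):
    """Map a Cartesian point to (rho, theta) about the fovea centre (cx, cy).

    rho is the natural log of the distance from the centre and theta is the
    polar angle in radians. The centre itself has no finite image.
    """
    dx, dy = x - cx, y - cy
    r = math.hypot(dx, dy)
    if r == 0.0:
        raise ValueError("the fovea centre cannot be log-mapped")
    return math.log(r), math.atan2(dy, dx)


def from_log_polar(rho, theta, cx=0.0, cy=0.0):
    """Inverse mapping from (rho, theta) back to Cartesian coordinates."""
    r = math.exp(rho)
    return cx + r * math.cos(theta), cy + r * math.sin(theta)
```

Under this model, scaling a point about the centre by a factor s shifts rho by log(s) and leaves theta unchanged, whereas a plain translation in the image plane does not become a uniform shift in (rho, theta). That loss of translation symmetry is why standard uniform-grid operators cannot be applied directly to such images.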

18.
《Knowledge》2002,15(1-2):111-118
We introduce a robotic-vision system which is able to extract object representations autonomously, utilising a tight interaction of visual perception and robotic action within a perception action cycle [Ecological Psychology 4 (1992) 121; Algebraic Frames for the Perception and Action Cycle, 1997, 1]. Controlled movement of the object grasped by the robot enables us to compute the transformations of entities which are used to represent aspects of objects and to find correspondences of entities within an image sequence. A general accumulation scheme makes it possible to acquire robust information from the partly missing information extracted from single frames of an image sequence. Here we use this scheme with a preprocessing stage in which 3D line segments are extracted from stereo images. However, the accumulation scheme can be used with any kind of preprocessing as long as the entities used to represent objects can be brought into correspondence by certain equivalence relations such as ‘rigid body motion’. We show that an accumulated representation can be applied within a tracking algorithm. The accumulation scheme is an important module of a vision-based robot system on which we are currently working. In this system, objects are planned to be represented by different visual and tactile entities. The object representations are going to be learned autonomously. We discuss the accumulation scheme in the context of this project.

19.

This work presents the design of a real-time system to model visual objects with the use of self-organising networks. The architecture of the system addresses multiple computer vision tasks such as image segmentation, optimal parameter estimation and object representation. We first develop a framework for building non-rigid shapes using the growth mechanism of the self-organising maps, and then we define an optimal number of nodes without overfitting or underfitting the network based on the knowledge obtained from information-theoretic considerations. We present experimental results for hands and faces, and we quantitatively evaluate the matching capabilities of the proposed method with the topographic product. The proposed method is easily extensible to 3D objects, as it offers similar features for efficient mesh reconstruction.


20.
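Visual question answering is a task that combines computer vision and natural language processing; it requires understanding the scene in an image, particularly the interactions among different target objects. Research on visual question answering has advanced considerably in recent years, but traditional methods use holistic feature representations that largely ignore the structure of the given image and cannot effectively pinpoint the targets in the scene. Graph networks, which rely on high-level image representations, can capture semantic and spatial relations, but previous graph-network-based visual question answering methods have ignored the role that the association between relations and the question plays in answering. Accordingly, a visual question answering model based on an equal-attention graph network, EAGN, is proposed. Its equal-attention mechanism gives relation edges the same importance as target nodes, and combining the two provides fuller grounds for answering the question. Experiments show that, compared with other related methods, the EAGN model performs well and is more competitive, and that it provides a basis for subsequent related research.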
