Similar Literature
20 similar documents retrieved.
1.
In this paper, we present a new algorithm that uses low-quality red, green, blue and depth (RGB-D) data from the Kinect sensor for face recognition under challenging conditions. The algorithm extracts multiple features and fuses them at the feature level. A Finer Feature Fusion technique is developed that removes redundant information and retains only the meaningful features for maximum class separability. We also introduce a new 3D face database acquired with the Kinect sensor, which has been released to the research community. The database contains over 5,000 facial images (RGB-D) of 52 individuals under varying pose, expression, illumination and occlusion. Under the first three variations and using only the noisy depth data, the proposed algorithm achieves a 72.5% recognition rate, significantly higher than the 41.9% achieved by the baseline LDA method. Combined with the texture information, a 91.3% recognition rate is achieved under illumination, pose and expression variations. These results suggest the feasibility of low-cost 3D sensors for real-time face recognition.
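
The feature-level fusion step can be illustrated with a minimal sketch: each modality's feature vector is normalized and the vectors are concatenated. This is only a generic illustration, not the paper's Finer Feature Fusion technique; the function names (`zscore`, `fuse_features`) are hypothetical.

```python
# Generic feature-level fusion sketch (NOT the paper's method): each
# modality's feature vector is z-score normalized, then the normalized
# vectors are concatenated into one fused descriptor.
import math

def zscore(v):
    """Normalize a feature vector to zero mean, unit variance."""
    mean = sum(v) / len(v)
    var = sum((x - mean) ** 2 for x in v) / len(v)
    std = math.sqrt(var) or 1.0  # guard against a constant vector
    return [(x - mean) / std for x in v]

def fuse_features(depth_feat, rgb_feat):
    """Feature-level fusion: normalize per modality, then concatenate."""
    return zscore(depth_feat) + zscore(rgb_feat)

fused = fuse_features([0.2, 0.9, 0.4], [10.0, 30.0, 20.0])
print(len(fused))  # 6 dimensions: 3 depth + 3 RGB
```

A real system would follow the concatenation with a redundancy-removal step (e.g., a discriminant projection) to keep only the class-separating dimensions.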

2.
Improved gait recognition by gait dynamics normalization
Potential sources for gait biometrics can be seen to derive from two aspects: gait shape and gait dynamics. We show that improved gait recognition can be achieved after normalization of dynamics and focusing on the shape information. We normalize for gait dynamics using a generic walking model, as captured by a population Hidden Markov Model (pHMM) defined for a set of individuals. The states of this pHMM represent gait stances over one gait cycle and the observations are the silhouettes of the corresponding gait stances. For each sequence, we first use Viterbi decoding of the gait dynamics to arrive at one dynamics-normalized, averaged, gait cycle of fixed length. The distance between two sequences is the distance between the two corresponding dynamics-normalized gait cycles, which we quantify by the sum of the distances between the corresponding gait stances. Distances between two silhouettes from the same generic gait stance are computed in the linear discriminant analysis space so as to maximize the discrimination between persons, while minimizing the variations of the same subject under different conditions. The distance computation is constructed so that it is invariant to dilations and erosions of the silhouettes. This helps us handle variations in silhouette shape that can occur with changing imaging conditions. We present results on three different, publicly available, data sets. First, we consider the HumanID Gait Challenge data set, which is the largest gait benchmarking data set that is available (122 subjects), exercising five different factors, i.e., viewpoint, shoe, surface, carrying condition, and time. We significantly improve the performance across the hard experiments involving surface change and briefcase carrying conditions. Second, we also show improved performance on the UMD gait data set that exercises time variations for 55 subjects. Third, on the CMU Mobo data set, we show results for matching across different walking speeds.
It is worth noting that there was no separate training for the UMD and CMU data sets.
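
The Viterbi decoding step this entry relies on can be sketched generically. The two-state model below is invented for the demo and is not the trained pHMM; it only shows how each frame would be assigned to its most likely gait stance.

```python
# Generic Viterbi decoder: assigns each observation the most likely
# hidden state under a toy 2-state HMM (invented values, not a pHMM).

def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most likely state path for an observation sequence."""
    V = [{s: start_p[s] * emit_p[s][obs[0]] for s in states}]
    path = {s: [s] for s in states}
    for t in range(1, len(obs)):
        V.append({})
        new_path = {}
        for s in states:
            prob, prev = max(
                (V[t - 1][p] * trans_p[p][s] * emit_p[s][obs[t]], p)
                for p in states)
            V[t][s] = prob
            new_path[s] = path[prev] + [s]
        path = new_path
    best = max(states, key=lambda s: V[-1][s])
    return path[best]

# Toy model: two stances; observations are coarse silhouette widths.
states = ("stance0", "stance1")
start = {"stance0": 0.6, "stance1": 0.4}
trans = {"stance0": {"stance0": 0.7, "stance1": 0.3},
         "stance1": {"stance0": 0.3, "stance1": 0.7}}
emit = {"stance0": {"narrow": 0.8, "wide": 0.2},
        "stance1": {"narrow": 0.2, "wide": 0.8}}
print(viterbi(["narrow", "narrow", "wide"], states, start, trans, emit))
# ['stance0', 'stance0', 'stance1']
```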

3.
Advances in deep learning based monocular depth estimation
Monocular depth estimation, which recovers scene depth information from a single image, is widely used in intelligent vehicles, robot localization and related fields, and is therefore of considerable research value. With the development of deep learning, many deep-learning-based monocular depth estimation studies have emerged and performance has advanced substantially. This paper surveys recent deep-learning-based monocular depth estimation methods according to the type of training data the model uses, in three groups: models trained on single images, models trained on multiple images, and models whose training is refined with auxiliary information. After reviewing the datasets and performance metrics commonly used in monocular depth estimation research, the paper compares and analyzes the performance of classic monocular depth estimation models. Models trained on single images have simple network structures but generalize poorly. Depth estimation networks trained on multiple images generalize better, but have many parameters, converge slowly and take long to train. Introducing auxiliary information further improves depth estimation accuracy, at the cost of more complex network structures and slower convergence. Many open problems and challenges remain; exploiting the latent information in multi-image input and domain-specific constraints to improve monocular depth estimation performance is gradually becoming the trend of this research.
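
The performance comparisons such surveys report typically use a small set of standard depth metrics. A minimal sketch of two of them, RMSE and the δ<1.25 threshold accuracy, assuming depth maps flattened to lists of positive depths in meters:

```python
# Two standard monocular depth evaluation metrics, sketched for
# flattened depth maps (lists of positive depths in meters).
import math

def rmse(pred, gt):
    """Root-mean-square error between predicted and ground-truth depth."""
    return math.sqrt(sum((p - g) ** 2 for p, g in zip(pred, gt)) / len(gt))

def delta_accuracy(pred, gt, thresh=1.25):
    """Fraction of pixels with max(pred/gt, gt/pred) below thresh."""
    ok = sum(1 for p, g in zip(pred, gt) if max(p / g, g / p) < thresh)
    return ok / len(gt)

pred = [1.0, 2.2, 3.9]
gt = [1.1, 2.0, 4.0]
print(rmse(pred, gt), delta_accuracy(pred, gt))
```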

4.
Nowadays, object recognition is a fundamental capability for an autonomous robot interacting with the physical world. New sensing technologies that provide RGB-D data have increased object recognition capabilities dramatically. Object recognition has been well studied; however, known object classifiers usually generalize poorly and therefore adapt only to a limited extent to different application domains. Although some domain adaptation approaches have been presented for RGB data, little work has been done on understanding the effects of applying object classification algorithms using RGB-D across domains. Addressing this problem, we propose and comprehensively investigate an approach for object recognition in RGB-D data that uses adaptive Support Vector Machines (aSVM) and thereby achieves impressive robustness in cross-domain adaptivity. For evaluation, two datasets from different application domains were used. Moreover, a study of state-of-the-art RGB-D feature extraction techniques and object classification methods was performed to identify which combinations (object representation and classification algorithm) are least affected in performance when switching between application domains.

5.
Ren Bo, Wu Jia-Cheng, Lv Ya-Lei, Cheng Ming-Ming, Lu Shao-Ping. Journal of Computer Science and Technology, 2019, 34(3): 581-593

The Iterative Closest Point (ICP) scheme has been widely used for the registration of surfaces and point clouds. However, when working on depth image sequences containing large geometric planes with few (or even no) details, existing ICP algorithms are prone to tangential drifting and erroneous rotational estimation due to input device errors. In this paper, we propose a novel ICP algorithm that overcomes such drawbacks and provides significantly more stable registration estimates for simultaneous localization and mapping (SLAM) tasks on RGB-D camera inputs. In our approach, the tangential drift and the rotational estimation error are reduced by: 1) updating the conventional Euclidean distance term with local geometry information, and 2) introducing a new camera stabilization term that prevents improper camera movement in the calculation. Our approach is simple, fast, effective, and readily integrable with previous ICP algorithms. We test our new method with the TUM RGB-D SLAM dataset on state-of-the-art real-time 3D dense reconstruction platforms, i.e., ElasticFusion and Kintinuous. Experiments show that our new strategy outperforms previous ones on various RGB-D data sequences under different combinations of registration systems and solutions.
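
The role of local geometry in a modified distance term can be illustrated by contrasting a point-to-point residual with a point-to-plane residual: sliding along a plane costs nothing under the latter, which is why incorporating the surface normal tames tangential drift. This is a generic sketch with illustrative vectors, not the paper's exact formulation.

```python
# Point-to-point vs point-to-plane ICP residuals (generic sketch).

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def point_to_point(p, q):
    """Squared Euclidean distance between corresponding points."""
    d = sub(p, q)
    return dot(d, d)

def point_to_plane(p, q, n):
    """Squared distance from p to the plane through q with unit normal n."""
    return dot(sub(p, q), n) ** 2

# A tangential slide on the plane z=0: large point-to-point error,
# zero point-to-plane error.
p, q, n = [1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 1.0]
print(point_to_point(p, q), point_to_plane(p, q, n))  # 1.0 0.0
```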


6.
The multimodal perception of intelligent robots is essential for collision-free and efficient navigation. Autonomous navigation is enormously challenging when perception relies on vision or LiDAR sensor data alone, because the complementary information from different sensors is missing. This paper proposes a simple yet efficient deep reinforcement learning (DRL) approach with sparse rewards and hindsight experience replay (HER) to achieve multimodal navigation. By adopting the depth images and pseudo-LiDAR data generated by an RGB-D camera as input, a multimodal fusion scheme enhances perception of the surrounding environment compared to a single sensor. To prevent dense rewards from misleading the agent during navigation, sparse rewards are used to specify its tasks, and the HER technique is introduced to address the resulting sparse-reward navigation issue and accelerate optimal policy learning. The results show that the proposed model achieves state-of-the-art performance in success, crash and timeout rates, as well as generalization capability.
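
The dense-versus-sparse reward distinction can be sketched as follows; the threshold and reward values are invented for illustration and are not the paper's settings.

```python
# Dense (shaped) vs sparse reward for a navigation agent (illustrative
# values only). Sparse rewards score only terminal outcomes, which is
# why they are usually paired with hindsight experience replay.

GOAL_RADIUS = 0.5  # meters; assumed success threshold for the demo

def dense_reward(dist_to_goal, prev_dist):
    """Shaped reward: positive when the agent moves toward the goal."""
    return prev_dist - dist_to_goal

def sparse_reward(dist_to_goal, collided):
    """Sparse reward: only success (+1) and collision (-1) are scored."""
    if collided:
        return -1.0
    return 1.0 if dist_to_goal < GOAL_RADIUS else 0.0

print(sparse_reward(0.3, False), sparse_reward(2.0, False))  # 1.0 0.0
```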

7.
To fuse light detection and ranging (LiDAR) data with camera images when an unmanned ground vehicle (UGV) detects the target vehicle it autonomously follows, a joint LiDAR-camera calibration method based on a trapezoidal checkerboard calibration board is proposed. First, the LiDAR's mounting pitch angle and mounting height are obtained from its scan lines on the trapezoidal board. Then, the camera's extrinsic parameters relative to the vehicle body are calibrated with the black-and-white checkerboard on the board. Next, the two sensors are jointly calibrated using the correspondence between LiDAR data points and image pixel coordinates. Finally, combining the LiDAR and camera calibration results, the LiDAR data and camera image are fused at the pixel level. The entire calibration requires only placing the trapezoidal board in front of the vehicle and collecting the image and LiDAR data once. Experimental results show an average position deviation of 3.5691 pixels, equivalent to 13 μm, i.e., high calibration accuracy. Judged by the fused LiDAR/image output, the proposed method effectively accomplishes the spatial alignment of LiDAR and camera, fuses well, and is robust to moving objects.
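
The final pixel-level fusion relies on projecting each LiDAR point into the image. Below is a generic pinhole-projection sketch, assuming the joint calibration has already produced the extrinsics (R, t) and camera intrinsics; all numeric values are made up for the example.

```python
# Project a 3D LiDAR point into pixel coordinates with a pinhole model.
# R, t: assumed extrinsics from joint calibration; fx, fy, cx, cy:
# assumed camera intrinsics. All values are illustrative.

def project(point_lidar, R, t, fx, fy, cx, cy):
    """Map a 3D LiDAR point to (u, v) pixel coordinates."""
    # Transform into the camera frame: Xc = R @ X + t
    xc = [sum(R[i][j] * point_lidar[j] for j in range(3)) + t[i]
          for i in range(3)]
    u = fx * xc[0] / xc[2] + cx
    v = fy * xc[1] / xc[2] + cy
    return u, v

# Identity extrinsics for the demo: a point 2 m ahead, 0.5 m right.
R = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
t = [0.0, 0.0, 0.0]
u, v = project([0.5, 0.0, 2.0], R, t, fx=500, fy=500, cx=320, cy=240)
print(u, v)  # 445.0 240.0
```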

8.
Gait recognition has been considered an emerging biometric technology for identifying humans by their walking behavior. The major challenge addressed in this article is that significant variation caused by covariate factors such as clothing, carrying conditions and view angle will undesirably affect gait recognition performance. In recent years, deep learning has delivered phenomenal accuracy on various challenging classification problems. Given the enormous amount of data in the real world, a convolutional neural network can approximate complex nonlinear functions, and we develop a generalized deep convolutional neural network (DCNN) architecture for gait recognition. The DCNN can handle relatively large multiview datasets with or without data augmentation and fine-tuning techniques. This article proposes a color-mapped contour gait image as the gait feature for addressing the variations caused by the cofactors and for gait recognition across views. We also compare various edge detection algorithms for gait template generation and choose the best among them. The databases considered in our work include the widely used CASIA-B dataset and the OULP database. Our experiments show significant improvement in gait recognition for fixed-view, cross-view and multiview settings compared with recent methodologies.

9.
Global security concerns have driven a proliferation of video surveillance devices. Intelligent surveillance systems seek to discover possible threats automatically and raise alerts. Being able to identify the surveyed object helps determine its threat level. The current generation of devices provides digital video data to be analysed for time-varying features to assist in the identification process. Commonly, people queue up to access a facility and approach a video camera in full frontal view. In this environment, a variety of biometrics are available, for example gait, which includes temporal features like stride period. Gait can be measured unobtrusively at a distance. The video data will also include face features, which are short-range biometrics. In this way, one can combine biometrics naturally using one set of data. In this paper we survey current techniques of gait recognition and modelling, together with the environments in which the research was conducted. We also discuss in detail the issues arising from deriving gait data, such as perspective and occlusion effects, together with the associated computer vision challenges of reliably tracking human movement. After highlighting these issues and challenges related to gait processing, we discuss frameworks combining gait with other biometrics. We then motivate a novel paradigm in biometrics-based human recognition: using the fronto-normal view of gait as a far-range biometric combined with biometrics operating at a near distance.

10.

This paper proposes a novel complete navigation system for autonomous flight of small unmanned aerial vehicles (UAVs) in GPS-denied environments. The hardware platform used to test the proposed algorithm is a small, custom-built UAV platform equipped with an onboard computer, RGB-D camera, 2D light detection and ranging (LiDAR), and altimeter. An error-state Kalman filter (ESKF) based on the dynamic model for low-cost IMU-driven systems is proposed, and visual odometry from the RGB-D camera and height measurements from the altimeter are fed into the measurement update process of the ESKF. The pose output of the ESKF is then integrated into an open-source simultaneous localization and mapping (SLAM) algorithm for pose-graph optimization and loop closing. In addition, a computationally efficient collision-free path planning algorithm is proposed and verified through simulations. The software modules run onboard in real time despite limited onboard computational capability. The indoor flight experiment demonstrates that the proposed system for small UAVs with low-cost devices can navigate without collision in fully autonomous missions while building accurate surrounding maps.
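
The altimeter update in the measurement step can be illustrated with a scalar Kalman update. A real ESKF corrects a full error state driven by IMU dynamics; this 1-D version and its noise values are purely illustrative.

```python
# One scalar Kalman measurement update: fuse an altimeter height
# reading into the current height estimate (illustrative 1-D sketch,
# not a full error-state Kalman filter).

def kalman_update(x, P, z, R):
    """Fuse measurement z (variance R) into estimate x (variance P)."""
    K = P / (P + R)          # Kalman gain
    x_new = x + K * (z - x)  # corrected estimate
    P_new = (1 - K) * P      # reduced uncertainty
    return x_new, P_new

# Predicted height 10.0 m (variance 1.0), altimeter reads 10.4 m
# (variance 1.0): the update lands halfway, with halved variance.
x, P = kalman_update(10.0, 1.0, 10.4, 1.0)
print(x, P)  # 10.2 0.5
```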


11.
Since gait recognition has almost always been studied with monocular vision, a gait recognition method based on stereo vision is proposed. First, stereo matching is used to obtain 3D information of the human silhouette, from which a 3D human silhouette descriptor is constructed to capture gait features. Noise is then suppressed by preprocessing such as smoothing and denoising, and manifold learning is used to build a low-dimensional manifold for feature dimensionality reduction. Finally, nearest-neighbor and nearest-neighbor-template classifiers are applied for recognition. Experiments with this method on the PRLAB II stereo gait database and the irregular test set ExN achieve high recognition rates. The results show that the proposed method is independent of the distance between the pedestrian's walking path and the camera, and is robust to incomplete gait sequences, changes in walking posture, carried items, and clothing variation.

12.
13.
We describe a novel probabilistic framework for real-time tracking of multiple objects from combined depth-colour imagery. Object shape is represented implicitly using 3D signed distance functions. Probabilistic generative models based on these functions are developed to account for the observed RGB-D imagery, and tracking is posed as a maximum a posteriori problem. We present first a method suited to tracking a single rigid 3D object, and then generalise this to multiple objects by combining distance functions into a shape union in the frame of the camera. This second model accounts for similarity and proximity between objects, and leads to robust real-time tracking without recourse to bolt-on or ad-hoc collision detection.
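
The implicit shape representation can be illustrated with the simplest signed distance function, a sphere, plus the min-based shape union commonly used to combine several objects into one implicit surface. All parameters are illustrative.

```python
# Signed distance function (SDF) of a sphere: negative inside the
# surface, zero on it, positive outside; and a shape union as the
# pointwise minimum of several SDFs.
import math

def sdf_sphere(p, center, radius):
    """Signed distance from point p to a sphere's surface."""
    d = math.sqrt(sum((a - b) ** 2 for a, b in zip(p, center)))
    return d - radius

def sdf_union(p, sdfs):
    """Shape union of several objects: the minimum signed distance."""
    return min(f(p) for f in sdfs)

c = (0.0, 0.0, 0.0)
print(sdf_sphere((2.0, 0.0, 0.0), c, 1.0))  # 1.0 (outside)
print(sdf_sphere((0.0, 0.0, 0.0), c, 1.0))  # -1.0 (inside)
```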

14.
Zhao Hong, Liu Xiang-Dong, Yang Yong-Juan. Journal of Computer Applications, 2020, 40(12): 3637-3643
Simultaneous localization and mapping (SLAM) is the key technology for autonomous robot navigation in unknown environments. To address the poor real-time performance and low accuracy of commonly used RGB-D SLAM systems, a new RGB-D SLAM system is proposed. First, the ORB algorithm detects image feature points, the extracted keypoints are distributed uniformly with a quadtree-based strategy, and feature matching is performed with a bag-of-words (BoW) model. Then, in the initial camera pose estimation stage, PnP combined with nonlinear optimization provides the back end with an initial value closer to the optimum; in back-end optimization, bundle adjustment (BA) iteratively refines the initial camera poses to obtain the optimal poses. Finally, using the correspondence between camera poses and per-frame point cloud maps, all point cloud data are registered into one coordinate system to obtain a dense point cloud map of the scene, which is recursively compressed with an octree into a 3D map usable for robot navigation. On the TUM RGB-D dataset, the constructed RGB-D SLAM system is compared with RGB-D SLAMv2 and ORB-SLAM2; experimental results show that the proposed system performs better overall in real-time performance and accuracy.

16.
The Kinect depth sensor released by Microsoft in 2010 provides synchronized scene depth and color information, and object recognition is one of its key application areas. Traditional object recognition has mostly been restricted to special cases such as gesture recognition and face recognition, whereas large-scale object recognition has become the research trend in recent years. The RGB-D datasets captured with Kinect, mostly multi-scene, multi-view, category-labeled data acquired in indoor and office environments, provide a learning basis for designing large-scale object recognition algorithms. Kinect's depth information also offers a strong cue for recognition: methods that exploit depth achieve markedly higher accuracy than earlier ones. This article first describes Kinect's depth acquisition technique in detail, then surveys existing 3D object recognition methods, analyzes and compares available 3D test datasets, and concludes with a brief discussion of future trends in 3D object recognition algorithms and 3D test datasets.

17.
This paper presents a novel approach for human identification at a distance using gait recognition. Recognizing a person from their gait is a biometric of increasing interest. The proposed work introduces a nonlinear machine learning method, kernel Principal Component Analysis (PCA), to extract gait features from silhouettes for individual recognition. The binarized silhouette of a moving object is first represented by four 1-D signals, the basic image features called distance vectors. A Fourier transform is performed to achieve translation invariance for the gait patterns accumulated from silhouette sequences extracted under different circumstances. Kernel PCA is then used to extract higher-order relations among the gait patterns for subsequent recognition. A fusion strategy is finally executed to produce the final decision. The experiments are carried out on the CMU and USF gait databases, and the results are presented for different numbers of training gait cycles.
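
The translation invariance obtained from the Fourier transform can be verified directly: a circular shift of a 1-D distance-vector signal changes only the phase of its DFT, not the magnitudes. A naive pure-Python sketch:

```python
# The DFT magnitude spectrum is invariant to circular shifts of the
# input signal, which is the property the entry exploits for the 1-D
# distance-vector gait signals. Naive O(n^2) DFT for a short signal.
import cmath

def dft_magnitudes(signal):
    """Magnitude spectrum of a 1-D signal."""
    n = len(signal)
    return [abs(sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t in range(n)))
            for k in range(n)]

sig = [1.0, 3.0, 2.0, 5.0]
shifted = sig[1:] + sig[:1]  # circular shift by one sample
mags_a = dft_magnitudes(sig)
mags_b = dft_magnitudes(shifted)
# The two magnitude spectra agree to numerical precision.
```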

18.
Compared with widely used biometrics such as face and fingerprint, gait recognition is a relatively new contactless identification method. A gait recognition method based on improved locality sensitive discriminant analysis is proposed. Experimental results on a real gait database show that the proposed gait recognition method is effective and feasible.

19.
In equipment-maintenance applications of augmented reality, the limited hardware resources and computing power of head-mounted devices lead to poor real-time 3D scene modeling and low loop-closure-detection robustness. An RGB-D camera is therefore used to optimize the computationally expensive steps of conventional visual SLAM. The SIFT descriptors used for inter-frame matching have their dimensionality reasonably reduced by extracting their principal components, optimizing the matching distance; the scene contour obtained from the RGB-D camera reduces the number of keyframes; and a coarse contour-matching step added to loop-closure detection reduces the number of feature descriptors clustered in the bag-of-words model, improving the speed and quality of loop-closure detection. An example validates the feasibility of the algorithm.

20.
Integrating face and gait for human recognition at a distance in video.
This paper introduces a new video-based recognition method to recognize noncooperating individuals at a distance who expose side views to the camera. Information from two biometric sources, side face and gait, is utilized and integrated for recognition. For the side face, an enhanced side-face image (ESFI), a higher-resolution image than one obtained directly from a single video frame, is constructed by integrating face information from multiple video frames. For gait, the gait energy image (GEI), a spatio-temporal compact representation of gait in video, is used to characterize human walking properties. Features of face and gait are obtained separately using the combined principal component analysis and multiple discriminant analysis method from ESFI and GEI, respectively. They are then integrated at the match score level using different fusion strategies. The approach is tested on a database of video sequences of 45 people collected over seven months. The different fusion methods are compared and analyzed. The experimental results show that: 1) constructing ESFI from multiple frames is promising for human recognition in video, and better face features are extracted from ESFI than from the original side-face images (OSFIs); 2) synchronization of face and gait is not necessary for the face template ESFI and the gait template GEI, since the synthetic match scores combine information from both; and 3) integrating information from side face and gait is effective for human recognition in video.
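
Match-score-level fusion with the sum rule, one standard strategy, can be sketched as follows; the normalization choice (min-max) and the score values are assumptions for the demo, not necessarily the paper's exact fusion.

```python
# Match-score-level fusion sketch: min-max normalize each biometric's
# per-candidate scores, then combine with the sum rule. Score values
# are invented for the demo.

def min_max(scores):
    """Normalize a list of match scores to [0, 1]."""
    lo, hi = min(scores), max(scores)
    span = (hi - lo) or 1.0  # guard against identical scores
    return [(s - lo) / span for s in scores]

def sum_rule(face_scores, gait_scores):
    """Fuse per-candidate scores from two biometrics by addition."""
    f = min_max(face_scores)
    g = min_max(gait_scores)
    return [a + b for a, b in zip(f, g)]

fused = sum_rule([0.9, 0.4, 0.1], [12.0, 30.0, 6.0])
best = fused.index(max(fused))  # index of the best-matching candidate
print(best)  # 1
```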
