Similar Articles
20 similar articles found
1.
Most existing approaches to multimodal 2D + 3D face recognition exploit the 2D and 3D information at the feature or score level and do not fully benefit from the dependency between modalities. Exploiting this dependency at an early stage is more effective than at a later stage, since early-fusion data contain richer information about the input biometric than compressed features or matching scores. We propose an image recombination scheme for face recognition that explores the dependency between modalities at the image level. Facial cues from the 2D and 3D images are recombined into more independent and discriminating data by finding transformation axes that account for the maximal amount of variance in the images. We also introduce a complete multimodal 2D + 3D face recognition framework that utilizes 2D and 3D facial information at the enrollment, image, and score levels. Experimental results on the NTU-CSP and Bosphorus 3D face databases show that our system using image recombination outperforms face recognition systems based on pixel- or score-level fusion.
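The core recombination step, finding transformation axes that account for maximal variance in the concatenated 2D and 3D data, can be sketched with a PCA-style decomposition in numpy. The data shapes and the function name below are illustrative assumptions, not the paper's actual pipeline:

```python
import numpy as np

def recombination_axes(x2d, x3d, k=2):
    """Find axes accounting for maximal variance in concatenated
    2D texture / 3D shape pixel vectors (a PCA-style sketch; the
    paper's exact recombination scheme may differ)."""
    X = np.hstack([x2d, x3d])            # one row per face sample
    X = X - X.mean(axis=0)               # center the data
    cov = np.cov(X, rowvar=False)        # covariance of the joint data
    vals, vecs = np.linalg.eigh(cov)     # eigen-decomposition
    order = np.argsort(vals)[::-1]       # largest variance first
    return vecs[:, order[:k]]

# Toy example: 6 samples, 4 "2D" + 4 "3D" pixel values each
rng = np.random.default_rng(0)
axes = recombination_axes(rng.normal(size=(6, 4)), rng.normal(size=(6, 4)), k=2)
print(axes.shape)  # (8, 2)
```

Projecting each concatenated sample onto these axes yields the recombined, decorrelated representation used for matching.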

2.
To help the visually impaired navigate unfamiliar environments such as public buildings, this paper presents a novel smart-phone, vision-based indoor localization and guidance system called Seeing Eye Phone. The system consists of a smart phone carried by the user and a server. The phone captures and transmits forward-facing images to the server, which detects and describes 2D features using SURF and matches them to the 2D features of stored map images that carry corresponding 3D information about the building. After features are matched, the Direct Linear Transform is run on a subset of correspondences to find a rough initial pose estimate, which the Levenberg-Marquardt algorithm then refines toward a more optimal solution. With the estimated pose and the camera's intrinsic parameters, the user's location and orientation are calculated using the 3D location correspondences stored for the features of each image. The positional information is transmitted back to the phone and communicated to the user via text-to-speech. The system combines efficient techniques, including SURF, homographies, multi-view geometry, and 3D-to-2D reprojection, to solve a unique problem that benefits the visually impaired. Experimental results demonstrate the feasibility of accomplishing this complex task with a simple machine-vision design and the potential for a commercial product based on it.
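The pose step described above (DLT on a subset of 2D-3D correspondences for a rough initial estimate, before Levenberg-Marquardt refinement) can be illustrated with a minimal numpy DLT. The projection matrix and points below are toy values, and the refinement stage is omitted:

```python
import numpy as np

def dlt_pose(pts3d, pts2d):
    """Direct Linear Transform: estimate a 3x4 projection matrix P
    from >= 6 exact 3D-2D correspondences (the rough initial pose,
    which Levenberg-Marquardt would then refine)."""
    A = []
    for (X, Y, Z), (u, v) in zip(pts3d, pts2d):
        A.append([X, Y, Z, 1, 0, 0, 0, 0, -u*X, -u*Y, -u*Z, -u])
        A.append([0, 0, 0, 0, X, Y, Z, 1, -v*X, -v*Y, -v*Z, -v])
    _, _, Vt = np.linalg.svd(np.asarray(A, float))
    P = Vt[-1].reshape(3, 4)             # null-space solution, up to scale
    return P / P[-1, -1]                 # fix scale and sign

# Toy check: project known points with a known P, then recover it
P_true = np.array([[800.0, 0, 320, 10], [0, 800, 240, 20], [0, 0, 1, 4]])
pts3d = np.random.default_rng(1).uniform(-1, 1, (8, 3))
h = (P_true @ np.c_[pts3d, np.ones(8)].T).T
pts2d = h[:, :2] / h[:, 2:]
P_est = dlt_pose(pts3d, pts2d)
print(np.allclose(P_est, P_true / P_true[-1, -1], atol=1e-6))  # True
```

With noisy real matches the DLT output is only approximate, which is why a nonlinear refinement such as Levenberg-Marquardt follows it.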

3.
To improve the recognition rate of mobile phone images in text-scanning recognition applications, this paper analyzes the vignetting degradation common to phone camera modules and proposes a method for extracting a vignetting degradation model from images that contain text. The input image is then restored according to this vignetting model, correcting the vignetting degradation introduced by the phone's camera module.
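A common way to model and undo vignetting is an even-order radial polynomial gain. The sketch below assumes such a model with illustrative coefficients; it is not the degradation model actually fitted in the paper:

```python
import numpy as np

def vignetting_gain(h, w, a=0.3, b=0.1):
    """Even-order polynomial vignetting model g(r) = 1 + a*r^2 + b*r^4,
    with r the normalized distance from the optical center (the
    coefficients here are illustrative, not fitted values)."""
    yy, xx = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2, (w - 1) / 2
    r2 = ((yy - cy) ** 2 + (xx - cx) ** 2) / (cy ** 2 + cx ** 2)
    return 1.0 + a * r2 + b * r2 ** 2

def correct_vignetting(img, a=0.3, b=0.1):
    """Restore a vignetted image by multiplying with the inverse falloff."""
    g = vignetting_gain(*img.shape, a, b)
    return img * g       # corners were attenuated by 1/g, so re-amplify

# A flat page darkened toward the corners should become flat again
flat = np.full((64, 64), 200.0)
vignetted = flat / vignetting_gain(64, 64)
restored = correct_vignetting(vignetted)
print(np.allclose(restored, flat))  # True
```

In practice the coefficients would be estimated from the text-bearing image itself, as the paper proposes, rather than fixed in advance.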

4.
A Microscopic Image Processing and Recognition System for Tumor Cells
This paper introduces a computer system for processing and recognizing microscopic images of tumor cells, covering preprocessing of tumor microscopic images, extraction of cell morphological feature parameters, data processing, and construction of a tumor-cell database. The system applies stereology to quantitatively infer three-dimensional structural information from two-dimensional structural information, providing a method for quantitative 3D study of biological morphology.

5.
Resting-state functional MRI data form a time series of 3D images. Existing 3D convolutions essentially process 3D volumes or 2D-image-plus-time data and cannot effectively fuse the temporal axis of resting-state fMRI. This paper therefore proposes a novel 4D convolutional neural network recognition model. Specifically, 4D convolution kernels are applied to the input fMRI, extracting features jointly in space and time from the functional images of autism patients and thereby capturing how the images change over the time series. The model generates multiple information channels from the input, and the final feature representation combines the information from all channels. Experimental results show that, while preserving generalization performance, the method fuses the global information of the functional images and captures their temporal trends, solving the problem of classifying time-varying 3D images with convolutional neural networks.
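A naive "valid" 4D convolution over a (time, depth, height, width) volume can be written directly in numpy. This didactic sketch shows the joint spatio-temporal weighting the model relies on; it is far slower than a real framework implementation and is not the paper's network:

```python
import numpy as np

def conv4d(volume, kernel):
    """Naive 'valid' 4D convolution (cross-correlation) of an fMRI
    volume of shape (T, D, H, W) with a 4D kernel, extracting
    spatio-temporal features jointly."""
    T, D, H, W = volume.shape
    t, d, h, w = kernel.shape
    out = np.zeros((T - t + 1, D - d + 1, H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            for k in range(out.shape[2]):
                for l in range(out.shape[3]):
                    patch = volume[i:i+t, j:j+d, k:k+h, l:l+w]
                    out[i, j, k, l] = np.sum(patch * kernel)
    return out

vol = np.random.default_rng(2).normal(size=(6, 5, 5, 5))   # toy fMRI: 6 time points
feat = conv4d(vol, np.ones((2, 3, 3, 3)) / 54.0)           # averaging kernel
print(feat.shape)  # (5, 3, 3, 3)
```

Because the kernel spans the time axis as well as the three spatial axes, each output value mixes information across consecutive time points, which plain 3D convolution cannot do.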

6.
Objective: Mobile augmented reality applications are developing rapidly in the mobile internet era. However, outdoor scenes contain many buildings with similar structures, and phones have limited storage and computing power, so most applications are confined to small indoor environments and adapt poorly to large, complex outdoor scenes. We therefore built a mobile augmented reality system based on cloud-side image recognition. Method: To reduce mismatches between similar features, gravity information is incorporated into the SURF and BRISK descriptors, yielding Gravity-SURF and Gravity-BRISK features. The cloud side manages the augmentation information and recognizes large-scale image sets with a VLAD method based on Gravity-SURF features. The application on the smart terminal presents the augmentation information for the recognized image, tracks the camera by combining the recognized image's Gravity-BRISK features with optical flow, and renders 3D models in real time with the Unity3D engine. Results: Experiments on a database of 4,000 outdoor images with gravity information show that the gravity-augmented descriptors better distinguish similar features and raise the matching accuracy. The image recognition rate exceeds 88% with a recognition time of about 420 ms; the RMS error of optical-flow tracking is below 1.2 pixels at frame rates up to 23 frames/s. Conclusion: The proposed image-recognition-based mobile augmented reality system for large, complex outdoor scenes conveniently manages augmented reality data for different applications. It has been deployed on Google Glass and in the news domain, and is not restricted to a single application area. The results show that the recognition and tracking-registration algorithms meet the system's accuracy and real-time requirements.

7.
Most interaction recognition approaches have been limited to single-person action classification in videos. For still images, where motion information is unavailable, the task becomes more complex. To this end, we propose an approach for multi-person human interaction recognition in images based on keypoint feature-image analysis. The proposed method is a three-stage framework. In the first stage, we propose a feature-based neural network (FCNN) for action recognition trained on feature images; a feature image encodes body features, namely effective distances between body-part pairs and angular relations between body-part triplets, rearranged into a 2D gray-scale image to learn effective representations of complex actions. In the second stage, we propose a voting-based direction-encoding method to anticipate probable motion in still images. Finally, our multi-person interaction recognition algorithm identifies which human pairs are interacting with each other using an interaction parameter. We evaluate the approach on two real-world datasets, UT-Interaction and SBU Kinect Interaction. The empirical experiments show results better than state-of-the-art methods, with recognition accuracies of 95.83% on UT-I set 1, 92.5% on UT-I set 2, and 94.28% on the SBU clean dataset.
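The feature-image construction, pairwise joint distances and triplet angles rearranged into a gray-scale image, might be sketched as follows. The joint count, image size, and layout below are assumptions for illustration, not the paper's exact arrangement:

```python
import numpy as np
from itertools import combinations

def feature_image(keypoints, size=16):
    """Build a 2D gray-scale 'feature image' from body keypoints:
    pairwise joint distances plus the angle at each joint triplet,
    rearranged into a fixed-size image (layout is illustrative)."""
    feats = []
    for i, j in combinations(range(len(keypoints)), 2):
        feats.append(np.linalg.norm(keypoints[i] - keypoints[j]))
    for i, j, k in combinations(range(len(keypoints)), 3):
        a, b = keypoints[i] - keypoints[j], keypoints[k] - keypoints[j]
        cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)
        feats.append(np.arccos(np.clip(cos, -1, 1)))
    feats = np.asarray(feats)
    feats = (feats - feats.min()) / (np.ptp(feats) + 1e-9)  # scale to [0, 1]
    img = np.zeros(size * size)
    n = min(len(feats), size * size)
    img[:n] = feats[:n]                                     # pad / truncate
    return img.reshape(size, size)

pose = np.random.default_rng(3).uniform(size=(15, 2))       # 15 toy joints
print(feature_image(pose).shape)  # (16, 16)
```

Such an image can then be fed to an ordinary 2D CNN, which is the appeal of the representation.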

8.
Audio-visual speech recognition (AVSR) has shown impressive improvements over audio-only speech recognition in the presence of acoustic noise. However, because visual speech information is typically obtained from planar video data, region-of-interest detection and feature extraction can limit recognition performance. In this paper, we depart from traditional visual speech information and propose an AVSR system integrating 3D lip information, with the Microsoft Kinect multi-sensory device adopted for data collection. Different feature extraction and selection algorithms are applied to the planar images and the 3D lip information, fusing them into a joint visual-3D lip feature. For automatic speech recognition (ASR), fusion methods are investigated and the audio-visual speech information is integrated into a state-synchronous two-stream Hidden Markov Model. Experimental results demonstrate that our AVSR system integrating 3D lip information improves the recognition performance of traditional ASR and AVSR systems in acoustically noisy environments.

9.
To meet the urgent needs for building information in urban planning, geographic national-condition updating, digital cities, and military reconnaissance, this paper applies Semi-supervised Discriminant Analysis (SDA) to building-area extraction from high-resolution SAR imagery, enabling fast extraction of built-up areas and improving recognition of urban targets. Using Radarsat-2 and TerraSAR-X images as experimental data, various texture features are computed from the gray-level co-occurrence matrix; SDA then performs feature extraction, the new features are fed to Otsu's method to extract building areas, and the classification result is post-processed. Compared with Linear Discriminant Analysis (LDA) and Locality Preserving Projection (LPP), SDA generalizes better: with little prior class information it is well suited to feature extraction from high-resolution SAR imagery and extracts building areas quickly and effectively.
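Otsu's method, which the pipeline feeds with the SDA-derived features, picks the threshold that maximizes the between-class variance of the gray-level histogram. A self-contained numpy sketch, with a synthetic bimodal image standing in for the real SAR feature map:

```python
import numpy as np

def otsu_threshold(img, bins=256):
    """Otsu's method: choose the threshold maximizing between-class
    variance of the histogram (here applied to a feature image to
    separate built-up areas from background)."""
    hist, edges = np.histogram(img, bins=bins)
    p = hist / hist.sum()
    omega = np.cumsum(p)                       # class-0 probability
    mu = np.cumsum(p * np.arange(bins))        # class-0 mean * omega
    mu_t = mu[-1]                              # global mean
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_t * omega - mu) ** 2 / (omega * (1 - omega))
    k = np.nanargmax(sigma_b)                  # best split index
    return edges[k + 1]

# Bimodal toy feature image: dark background + bright "building" patch
rng = np.random.default_rng(4)
img = rng.normal(50, 5, (64, 64))
img[16:48, 16:48] = rng.normal(180, 5, (32, 32))
t = otsu_threshold(img)
print(50 < t < 180)  # True
```

Thresholding the feature image at `t` then yields the binary building mask that the post-processing stage cleans up.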

10.
Screen defect detection is an important step in mobile phone manufacturing, and accurate, efficient detection matters for industrial throughput. In actual production, defect features in phone screen images are subtle and defect sizes vary widely, which makes detection difficult. To address these problems, a phone screen defect detection model called PU-Faster R-CNN (Preprocessing operations combined with U-Net and Faster R-CNN) is proposed. To handle the subtle feature information in screen images, a multi-layer feature enhancement module is introduced that effectively strengthens the target defect features, and a multi-scale feature extraction network captures defect information at multiple scales. To generate better-fitting anchor boxes, an adaptive region proposal network is proposed that uses a self-iterating clustering algorithm to produce anchor-box templates with more accurate sizes. Experimental results show that PU-Faster R-CNN outperforms mainstream screen defect detection frameworks on the phone-screen dataset.
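Fitting anchor-box templates by clustering ground-truth sizes under a 1 - IoU distance is one common reading of the "self-iterating clustering" idea. The sketch below is that generic k-means variant, with toy defect sizes and a deterministic initialization; it is not necessarily the paper's exact procedure:

```python
import numpy as np

def iou_wh(box, clusters):
    """IoU between a (w, h) box and cluster (w, h) templates,
    all anchored at the origin."""
    inter = np.minimum(box[0], clusters[:, 0]) * np.minimum(box[1], clusters[:, 1])
    union = box[0] * box[1] + clusters[:, 0] * clusters[:, 1] - inter
    return inter / union

def anchor_kmeans(boxes, k=2, iters=100):
    """Cluster ground-truth (w, h) pairs with distance 1 - IoU to
    obtain anchor templates matching the defect-size distribution."""
    order = np.argsort(boxes[:, 0] * boxes[:, 1])
    # deterministic init for this sketch: spread over the sorted areas
    clusters = boxes[order[np.linspace(0, len(boxes) - 1, k).astype(int)]]
    for _ in range(iters):
        dists = np.stack([1 - iou_wh(b, clusters) for b in boxes])
        assign = dists.argmin(axis=1)
        new = np.array([boxes[assign == j].mean(axis=0) if np.any(assign == j)
                        else clusters[j] for j in range(k)])
        if np.allclose(new, clusters):
            break
        clusters = new
    return clusters

# Toy defects: small scratches and large stains
boxes = np.vstack([np.random.default_rng(5).uniform(5, 15, (50, 2)),
                   np.random.default_rng(6).uniform(80, 120, (50, 2))])
anchors = anchor_kmeans(boxes, k=2)
print(anchors.shape)  # (2, 2)
```

The resulting templates replace hand-picked anchor sizes in the region proposal network.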

11.
In recent years, face recognition based on 3D techniques has emerged as a technology that demonstrates better results than conventional 2D approaches. Using texture (a 180° multi-view image) and depth maps is expected to increase robustness to the two main challenges in face recognition: pose and illumination. Nevertheless, 3D data must be acquired under highly controlled conditions and in most cases depends on the collaboration of the subject to be recognized. Thus, in applications such as surveillance or access control, such 3D data may not be available during recognition. This leads to a new paradigm of mixed 2D-3D face recognition systems, in which 3D data is used for training but either 2D or 3D information can be used for recognition depending on the scenario. Following this concept, in which only part of the information (the partial concept) is used in recognition, a novel method is presented in this work. It is called Partial Principal Component Analysis (P2CA) since it fuses the partial concept with the fundamentals of the well-known PCA algorithm. The strategy has proven very robust under pose variation, showing that the 3D training process retains all the spatial information of the face while the 2D picture effectively recovers the face information from the available data. Furthermore, this work presents a novel approach for the automatic creation of 180° aligned, cylindrically projected face images from nine different views, using a cylindrical approximation of the real object surface. The alignment applies first a global 2D affine transformation of the image and then a local transformation of the desired face features using a triangle mesh; this local alignment allows a closer look at the feature properties rather than the differences. Finally, these aligned face images are used to train a pose-invariant face recognition approach (P2CA).

12.
Automated human identification is a significant issue in real and virtual societies, and the iris is a suitable choice for meeting this goal. In this paper, we present an iris recognition system that uses images acquired in both near-infrared (NIR) and visible light (VL); the two types of images reveal different textural information of the iris tissue, and we demonstrate the necessity of processing both to recognize irises. The proposed system exploits two feature extraction algorithms: one based on the 1D log-Gabor wavelet, which gives a detailed representation of the iris region, and one based on the 1D Haar wavelet, which represents a coarse model of the iris. The Haar wavelet algorithm, proposed in this paper, produces smaller iris templates than the 1D log-Gabor approach while still achieving an appropriate recognition rate. We perform fusion at the match-score level and examine the performance of the system in both verification and identification modes, using the UTIRIS database for evaluation. Compared with other approaches, the method achieves better recognition accuracy even though no image enhancement is applied before the feature extraction stage. Furthermore, we demonstrate that fusion can compensate for a lack of input-image information, which can help reduce computational complexity and handle non-cooperative iris images.
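The 1D Haar decomposition behind the compact template can be sketched as repeated pairwise averages (approximation) and differences (detail); keeping only the coarse coefficients yields a much smaller template than a log-Gabor code. The row length and level count below are illustrative:

```python
import numpy as np

def haar_1d(signal, levels=3):
    """1D Haar wavelet decomposition of an unwrapped iris row:
    split into pairwise averages and differences, then recurse on
    the averages."""
    approx, details = np.asarray(signal, float), []
    for _ in range(levels):
        a = (approx[0::2] + approx[1::2]) / np.sqrt(2)   # approximation
        d = (approx[0::2] - approx[1::2]) / np.sqrt(2)   # detail
        details.append(d)
        approx = a
    return approx, details

row = np.random.default_rng(7).normal(size=64)   # one normalized iris row
coarse, _ = haar_1d(row, levels=3)
print(coarse.shape)  # (8,)
```

Because the transform is orthonormal, the coefficients preserve the signal's energy, and three levels shrink a 64-sample row to an 8-coefficient coarse template.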

13.
14.
Objective: To address expression variation in 3D face recognition, a method based on feature points in rigid facial regions is proposed. Method: Feature points are first extracted from the face texture image, and those in non-rigid regions are discarded. Using the sampling-point indices, the 3D geometric coordinates of the feature points are obtained from the facial geometry, and sub-regions centered on the feature points are built within the rigid region. Each sub-region serves as a local feature in recognition tests, yielding the contribution of each sub-region to recognition, which is then used to weight the final recognition statistics. Results: On the FRGC v2.0 3D face database the method achieves a recognition accuracy of 98.5% and a verification rate of 99.2% at a false acceptance rate (FAR) of 0.001, demonstrating good accuracy for 3D face recognition under non-neutral expressions. Conclusion: The method effectively overcomes the impact of expression variation on 3D face recognition and is robust to holes and spike noise in 3D data, which is significant for improving 3D face recognition performance.
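The weighted statistics over sub-region results can be sketched as a normalized weighted sum of per-region match scores. The scores and weights below are placeholders, not the contributions measured in the paper:

```python
import numpy as np

def weighted_region_score(region_scores, weights):
    """Combine per-sub-region match scores with weights proportional
    to each rigid region's measured recognition contribution
    (placeholder weights, not the paper's learned values)."""
    w = np.asarray(weights, float)
    w = w / w.sum()                       # normalize the contributions
    return float(np.dot(w, region_scores))

# Three toy sub-regions around rigid-area feature points
scores = [0.9, 0.7, 0.8]                  # per-region similarity to a gallery face
weights = [3.0, 1.0, 2.0]                 # expression-stable regions weigh more
print(round(weighted_region_score(scores, weights), 4))  # 0.8333
```

The gallery identity with the highest combined score is reported as the match.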

15.
This paper presents a prototype system for rooftop detection and 3D building modeling from aerial images. Without prior knowledge of the aerial vehicle's position and orientation, the camera pose and ground-plane parameters are first estimated through simple human-computer interaction. Next, after over-segmenting the aerial image with the Mean-Shift algorithm, rooftop regions are coarsely detected by integrating multi-scale SIFT-like feature vectors with SVM-based visual object recognition. 2D cues alone, however, are not always sufficient to separate regions such as parking lots from building roofs. To refine the roof-detection result and remove misclassified non-rooftop regions such as parking lots, we therefore resort to 3D depth information estimated from multi-view geometry: a candidate region is accepted or rejected as a rooftop according to its height relative to the ground plane, with the height obtained by a novel hierarchical, asymmetric correlation-based corner matching scheme. The output of the system is a watertight triangle-mesh 3D building model texture-mapped with the aerial images. We developed an interactive 3D viewer based on OpenGL and C++ that allows the user to navigate the reconstructed 3D scene with mouse and keyboard. Experimental results are shown on real aerial scenes.

16.
Target feature information in medical images is essential to image-based application research and is the basis for image analysis, target recognition, and 3D medical reconstruction. The most fundamental and critical step in 3D medical reconstruction is extracting the target to be reconstructed from the image sequence, providing the base data for subsequent reconstruction. This paper uses a new color-image feature description method to extract both the boundary and the corner points of the target from color medical images simultaneously, thereby extracting the target contour.

17.
This paper describes a method for geo-registering a sequence of panoramic images to a digital map by matching pixel information from the images with the building-footprint information contained in the map. Images captured at ground level by a Mobile Mapping System (MMS), such as the panoramic images displayed by Google Street View, have recently been considered a valuable resource for three-dimensional (3D) building modeling. However, the wide intervals between these panoramic images, together with locational and directional error from the related sensors, make the image data difficult to analyze. This paper demonstrates a formulation that connects pixels in panoramic images with the footprint vertices and building lines in a digital map. To keep pixel and footprint information consistent in 3D space, each panoramic image is tilt-corrected in pre-processing: the vehicle's pitch and roll are estimated and their effects removed from the panoramic pixels so that the image is upright. Through the proposed formulation, a single panoramic image can easily be geo-registered with simple user-provided constraints, and adjacent sequential images can then be geo-registered automatically using point-feature matching. Experimental results showed a significant reduction in the locational and directional error of sequential panoramic images, and the proposed vanishing-point (VP) based validation process successfully detected failure cases.

18.
This paper discusses a document image recognition system based on optical mark recognition, focusing on its design approach and implementation techniques. Taking mark recognition as the example, it describes in detail the key components of image digitization, image preprocessing, image information acquisition, and feature extraction. An efficient and practical edge-detection algorithm is presented for the image preprocessing stage, and statistical analysis is used for mark feature extraction, which greatly improves the accuracy of the mark recognition system.

19.
Due to the steady increase in heterogeneous location information on the internet, it is hard to organize a complete overview of geospatial information for knowledge-acquisition tasks related to specific geographic locations. Text- and photo-type geographical datasets contain numerous location data, such as location-based tourism information, and therefore define high-dimensional spaces of highly correlated attributes. In this work, we combine text- and photo-type location information through a novel information-fusion approach that exploits effective image annotation and location-based text mining to enhance identification of geographic locations and spatial cognition. We describe our feature extraction methods for annotating images and our text-mining approach for analyzing images and texts simultaneously, in order to carry out geospatial text mining and image classification. Photo images and textual documents are then projected into a unified feature space, generating a co-constructed semantic space for information fusion. We also employ text mining to classify documents into categories based on their geospatial features, with the aim of discovering relationships between documents and geographical zones. Experimental results show that the proposed method effectively enhances location-based knowledge discovery.

20.
To design a high-accuracy classifier for 3D face recognition, four mutually independent features, the global feature (facial contour) and three local features (eyes, nose, and mouth), are classified separately and then fused with Dempster-Shafer (D-S) evidence theory. An SVM classifier performs one-to-one single-feature recognition on each of the four features of the 3D face image; the results are normalized and used as the basic probability assignments (BPAs) of D-S evidence theory, and the global- and local-feature data are fused according to D-S theory to compute a more accurate recognition result. Analysis of the fused results shows that the algorithm is reliable and effective and greatly improves the efficiency of 3D face recognition.
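Dempster's rule of combination, which fuses the normalized classifier outputs used as BPAs, can be sketched directly. The masses below are toy values for two hypothetical feature classifiers (contour and nose), not values from the paper:

```python
def dempster_combine(m1, m2):
    """Dempster's rule of combination for two basic probability
    assignments (BPAs) over subsets of the frame of discernment.
    Keys are frozensets of hypotheses; mass assigned to conflicting
    (disjoint) pairs is renormalized away."""
    combined, conflict = {}, 0.0
    for a, wa in m1.items():
        for b, wb in m2.items():
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + wa * wb
            else:
                conflict += wa * wb            # disjoint evidence
    scale = 1.0 - conflict                     # renormalization factor
    return {k: v / scale for k, v in combined.items()}

theta = frozenset({"subjectA", "subjectB"})    # frame of discernment
# Toy BPAs from two SVM feature classifiers (e.g. contour and nose)
m_contour = {frozenset({"subjectA"}): 0.7, theta: 0.3}
m_nose = {frozenset({"subjectA"}): 0.6, frozenset({"subjectB"}): 0.2, theta: 0.2}
fused = dempster_combine(m_contour, m_nose)
print(round(fused[frozenset({"subjectA"})], 3))  # 0.86
```

Agreement between the two sources concentrates mass on subjectA (0.7 and 0.6 separately, 0.86 after fusion), which is exactly the accuracy gain the fusion step is after.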


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号