This paper describes the baseline corpus of a new multimodal biometric database, the MMU GASPFA (Gait–Speech–Face) database. The corpus in GASPFA is acquired using commercial off-the-shelf (COTS) equipment, including digital video cameras, a digital voice recorder, a digital camera, a Kinect camera and accelerometer-equipped smart phones. The corpus consists of frontal face images from the digital camera, speech utterances recorded using the digital voice recorder, gait videos with their associated data recorded simultaneously using both the digital video cameras and the Kinect camera, and accelerometer readings from the smart phones. A total of 82 participants had their biometric data recorded. MMU GASPFA is able to support both multimodal biometric authentication and gait action recognition. This paper describes the acquisition setup and protocols used in MMU GASPFA, as well as the content of the corpus. Baseline results from a subset of the participants are presented for validation purposes.

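A Yunnan Ethnic-Accented Mandarin Speech Database for Speech Recognition   Total citations: 2, self-citations: 0, citations by others: 2
This paper introduces a speech database of Mandarin (Putonghua) spoken with Yunnan ethnic-minority accents, built for speech recognition. For speech recognition technology to become practical, it must solve the robustness problem caused by the diversity of users, which is usually summarized as "men and women, old and young" (speaker variation) and "southern and northern accents" (accent variation). Yunnan, a province rich in ethnic cultures, is home to 25 ethnic minorities, whose members speak Mandarin with a noticeable local ethnic accent. Research on recognizing Yunnan ethnic-accented Mandarin is therefore an important part of research on user diversity, and building a Yunnan ethnic-accented Mandarin speech database is an important foundation and prerequisite for that research.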

Event classification is inherently sequential and multimodal. Therefore, deep neural models need to dynamically focus on the most relevant time window and/or modality of a video. In this study, we propose the Multimodal Attentive Fusion Network (MAFnet), an architecture that can dynamically fuse visual and audio information for event recognition. Inspired by prior studies in neuroscience, we couple both modalities at different levels of the visual and audio paths. Furthermore, the network dynamically highlights the modality that is relevant for classifying events at a given time window. Experimental results on the AVE (Audio-Visual Event), UCF51, and Kinetics-Sounds datasets show that the approach can effectively improve the accuracy of audio-visual event classification. Code is available at: https://github.com/numediart/MAFnet

A close relationship exists between the advancement of face recognition algorithms and the availability of face databases that vary the factors affecting facial appearance in a controlled manner. The CMU PIE database has been very influential in advancing research in face recognition across pose and illumination. Despite its success, the PIE database has several shortcomings: a limited number of subjects, a single recording session and only a few expressions captured. To address these issues we collected the CMU Multi-PIE database. It contains 337 subjects, imaged under 15 view points and 19 illumination conditions in up to four recording sessions. In this paper we introduce the database and describe the recording procedure. We furthermore present results from baseline experiments using PCA and LDA classifiers to highlight similarities and differences between PIE and Multi-PIE.

One of the major challenges encountered by current face recognition techniques lies in the difficulty of handling varying poses, i.e., recognition of faces in arbitrary in-depth rotations. The face image differences caused by rotations are often larger than the inter-person differences used in distinguishing identities. Face recognition across pose, on the other hand, has great potential in many applications dealing with uncooperative subjects, in which the full power of face recognition as a passive biometric technique can be implemented and utilised. Extensive efforts have been put into research on pose-invariant face recognition in recent years and many prominent approaches have been proposed. However, several issues in face recognition across pose still remain open, such as the lack of understanding of subspaces of pose-variant images, problem intractability in 3D face modelling, the complex face surface reflection mechanism, etc. This paper provides a critical survey of research on image-based face recognition across pose. The existing techniques are comprehensively reviewed and discussed. They are classified into different categories according to their methodologies for handling pose variations. Their strategies, advantages/disadvantages and performances are elaborated. By generalising different tactics in handling pose variations and evaluating their performances, several promising directions for future research are suggested.

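A Clustering-Based Large-Scale Face Retrieval System   Total citations: 4, self-citations: 0, citations by others: 4
Liu Yan, Zhang Xingming, Guo Yucong. Computer Engineering (《计算机工程》), 2005, 31(15): 162-164
This paper presents the design ideas and implementation techniques of a clustering-based large-scale face retrieval system. The system combines face localization, face recognition and clustering-based retrieval, adopts a client/server (C/S) architecture, maintains a large face database and provides a flexible query interface. Experimental results show that the system achieves satisfactory recognition rates and query speed, and therefore has broad application prospects.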

The problem of identifying the topology implied by wireframe drawings of polyhedral objects requires the identification of face loops, loops of edges which correspond to a face in the object the drawing portrays. In this paper, we survey the advantages and limitations of known approaches, and present and discuss test results which illustrate the successes and failures of a currently popular approach based on Dijkstra’s Algorithm. We conclude that the root cause of many failure cases is that the underlying algorithm assumes that the cost of traversing an edge is fixed. We propose a new polynomial-order algorithm for finding faces in wireframes. This algorithm could be adapted to any graph-theoretical least-cost circuit problem where the cost of traversing an edge is not fixed but context-dependent.
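
To make the fixed-cost assumption concrete, the following is a minimal sketch of a Dijkstra-based least-cost circuit search of the general kind the abstract criticises. It is not the authors' algorithm and not necessarily the exact variant they tested: the function name, the graph representation (a dict of dicts) and the edge costs are all hypothetical. Each edge has one fixed cost, so the search cannot express a cost that depends on which edges were traversed before.

```python
import heapq


def least_cost_circuit_through_edge(graph, u, v):
    """Return (path, cost) of the cheapest circuit that uses edge (u, v).

    graph: undirected, dict node -> dict neighbour -> fixed edge cost.
    Hypothetical helper: runs Dijkstra from v back to u while
    forbidding the edge (u, v) itself, then closes the loop with it.
    """
    dist = {v: 0.0}
    prev = {}
    heap = [(0.0, v)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        if node == u:
            break  # shortest return path found
        for nxt, cost in graph[node].items():
            if {node, nxt} == {u, v}:
                continue  # do not walk back along the starting edge
            nd = d + cost  # fixed cost: independent of the path so far
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                prev[nxt] = node
                heapq.heappush(heap, (nd, nxt))
    if u not in dist:
        return None  # edge (u, v) lies on no circuit
    path = [u]
    while path[-1] != v:
        path.append(prev[path[-1]])
    return path[::-1], dist[u] + graph[u][v]


# A unit-cost square a-b-c-d: the only circuit through (a, b) is the square.
square = {
    "a": {"b": 1, "d": 1},
    "b": {"a": 1, "c": 1},
    "c": {"b": 1, "d": 1},
    "d": {"c": 1, "a": 1},
}
result = least_cost_circuit_through_edge(square, "a", "b")
```

A context-dependent variant would have to compute `cost` from the partial path as well as from the edge, which is the limitation the abstract identifies.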

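A Novel Embedded Speech Recognition Robot System   Total citations: 1, self-citations: 1, citations by others: 0
This paper explores a novel speech-recognition industrial robot system based on an embedded system and a DSP. The embedded DSP design gives the robot a better balance of performance, cost, configurability and extensibility. For speech recognition, an improved MFCC method is used for speech feature extraction and an HMM based on K-means segmentation is used for real-time speech learning and recognition, which improves the real-time performance and portability of the algorithm.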

This paper investigates the enhancement of a speech recognition system that uses both audio and visual speech information in noisy environments by presenting contributions in two main system stages: front-end and back-end. The double use of Gabor filters is proposed as a feature extractor in the front-end stage of both modules to capture robust spectro-temporal features. The performance obtained from the resulting Gabor Audio Features (GAFs) and Gabor Visual Features (GVFs) is compared to the performance of other conventional features such as MFCC, PLP and RASTA-PLP audio features and DCT2 visual features. The experimental results show that a system utilizing GAFs and GVFs performs better, especially in a low-SNR scenario. To improve the back-end stage, a complete framework of synchronous Multi-Stream Hidden Markov Model (MSHMM) is used to solve the dynamic stream weight estimation problem for Audio-Visual Speech Recognition (AVSR). To demonstrate the usefulness of dynamic weighting for the overall performance of the AVSR system, we empirically show that Late Integration (LI) is preferable to Early Integration (EI), especially when one of the modalities is corrupted. Results confirm the superior recognition accuracy of the AVSR system with Late Integration at all SNR levels.
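
As an illustration of what stream weighting in late integration means, here is a minimal sketch, assuming the common formulation in which per-class log-likelihoods from the audio and visual streams are combined with a weight between 0 and 1. The function name, the class labels and the numbers are hypothetical, and the abstract's dynamic weight estimation within the MSHMM framework is not reproduced here: the weight is simply passed in.

```python
def late_integration(audio_loglik, visual_loglik, audio_weight):
    """Fuse per-class log-likelihoods from two streams (illustrative only).

    audio_loglik, visual_loglik: dict class label -> log-likelihood.
    audio_weight: weight in [0, 1] given to the audio stream; the
    visual stream receives 1 - audio_weight.
    Returns (best_label, fused_scores).
    """
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    fused = {
        label: audio_weight * audio_loglik[label]
        + (1.0 - audio_weight) * visual_loglik[label]
        for label in audio_loglik
    }
    return max(fused, key=fused.get), fused


# Hypothetical scores: the audio stream favours "yes", the visual stream "no".
audio = {"yes": -10.0, "no": -12.0}
visual = {"yes": -9.0, "no": -5.0}

clean_label, _ = late_integration(audio, visual, audio_weight=0.9)
noisy_label, _ = late_integration(audio, visual, audio_weight=0.2)
```

Lowering `audio_weight` when the audio is corrupted lets the visual stream dominate the decision, which is the motivation for estimating the weight dynamically.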

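Evaluation of Chinese Speech Recognition Systems. Wang Renhua, Ni Jinfu (University of Science and Technology of China, Hefei 230027). Keywords: speech recognition, performance evaluation, speech database. 1 Introduction. Evaluation of Chinese speech recognition systems means using scientific methods and technical means to assess the relative merits of different recognition systems and algorithms. This research, for improving and refining existing system designs and raising system performance, ...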

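Fast queries on native XML databases can be implemented with structural join algorithms based on XML document encoding. After surveying existing structural join algorithms, this paper proposes a new structural join algorithm for native XML databases: the structural join algorithm based on uniform depth partitioning (DRIAM). The algorithm does not require the input data AList and DList to be sorted or to have indexes built on their node codes, which avoids the extra overhead of sorting and indexing. It does not need AList and DList to be fully loaded into memory, so it can adapt to different memory-size limits, and its time complexity is very low.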

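Automatic face recognition is a research hotspot in pattern recognition, image processing and related disciplines, and has broad application prospects in commerce and law (for example, identity verification for ID cards, credit cards and passports, as well as smart residential-community management and video surveillance systems). Research on automatic face recognition has made great progress in recent years. However, these results are still far from a complete solution, and the topic remains one of the current research hotspots. This paper summarizes existing face detection and recognition methods and research, compares the advantages and disadvantages of the methods, and finally points out directions for further work.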

Database security plays an important role in the overall security of information systems and networks, both because of the nature of this technology and because of its widespread use today. The development of appropriate secure database design and implementation methodologies is therefore an important research problem and a necessary prerequisite for the successful development of such systems. This paper gives the general framework and requirements for database security and presents and discusses a number of parameters of the secure database design and implementation problem. A secure database system development methodology is then presented which could help overcome some of the problems currently encountered.

The globally integrated contemporary business environment poses new challenges to database architectures: organizations need to improve the performance, scalability, reliability and data privacy of database applications as they adapt to the evolving nature of business. Although a number of distributed database architectures are available, there is no in‐depth comparative understanding of their performance characteristics. In this paper, we report a performance study of three typical (centralized, partitioned and replicated) database architectures. We used TPC‐C as the evaluation benchmark to simulate a contemporary business environment, and a commercially available database management system that supports the three architectures. We compared the performance of the partitioned and replicated architectures against the centralized database, which yielded some interesting observations and practical experience. The findings and the practice presented in this paper provide useful information and experience for enterprise architects and database administrators in determining the appropriate database architecture when moving from centralized to distributed environments. Copyright © 2012 John Wiley & Sons, Ltd.

A new approach for estimating null value in relational database   Total citations: 1, self-citations: 0, citations by others: 1
In general, a database system will not operate properly if some attributes in the system contain null values. In this paper, we propose a new approach to estimating null values in relational databases, which uses a clustering algorithm to cluster the data and uses fuzzy correlation and distance similarity to calculate the correlation between different attributes. To verify our method, this paper uses the mean of absolute error rate (MAER) as the evaluation criterion for comparison with other methods; the results show that the proposed method performs better than the existing methods for estimating null values in relational database systems.
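
The abstract names MAER but does not define it. A minimal sketch follows, assuming the usual reading of the name: the absolute difference between each estimated and actual value, divided by the actual value, averaged over all estimated entries. The function name and the sample values are hypothetical.

```python
def mean_absolute_error_rate(actual, estimated):
    """Mean of |estimated - actual| / |actual| over paired values.

    Assumed definition, not taken from the paper. Actual values must
    be non-zero, since each error is expressed relative to them.
    """
    if not actual or len(actual) != len(estimated):
        raise ValueError("inputs must be non-empty and of equal length")
    rates = [abs(e - a) / abs(a) for a, e in zip(actual, estimated)]
    return sum(rates) / len(rates)


# Hypothetical attribute values that were nulled out and then estimated.
true_values = [100.0, 200.0]
estimates = [110.0, 180.0]
maer = mean_absolute_error_rate(true_values, estimates)  # each rate is 0.1
```

Under this definition a lower MAER means the estimated values are, on average, closer to the values that were removed.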

This paper proposes a face recognition system to overcome the problem of illumination variation. The proposed system first classifies the image's illumination as dark, normal or shadow and then, based on the illumination type, applies an appropriate technique for illumination normalization. By selecting a suitable illumination normalization technique, the proposed system ensures that no features are lost from the image during illumination compensation. Moreover, it saves the processing time of illumination normalization when an image is classified as normal, which makes the approach computationally efficient. Rough Set Theory is used to build rmf illumination classifier for illumination classification. The accuracy of classifying images as dark, normal or shadow is as high as 96%.

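Lü Chengguo, Han Jiqing, Gao Wen. Computer Engineering (《计算机工程》), 2005, 31(5): 34-35, 193
Speaking-rate variation is one kind of pronunciation variation. This paper builds a speech corpus of fast, slow and normal speaking rates, applies the difference subspace method to train on and recognize speech with varying speaking rates, improves the method, and proposes a multi-path difference subspace method. Experimental results show that the method recognizes speech with varying speaking rates well.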

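This paper proposes a generalized geometric projection method for coarse face detection. The method quickly screens candidate face regions from scene images with a high screening ratio, and at the same time predicts the scale range of the face at each screened position, which greatly increases the speed of the overall face detection.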

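Starting from the dual requirements of security and operability, this paper constructs an analysis framework for outsourced database models, proposes an outsourced database model based on the virtual server architecture (VSA), and designs a system implementation scheme whose core technique is a homomorphic ciphertext operation strategy, demonstrating the advantages of the VSA model over traditional models.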

Gabor filter banks constitute a very robust tool to extract discriminant information from a visual scene. After the now “classical” bank with 5 frequencies and 8 orientations proposed by Lades et al. and Wiskott et al., many other parametrizations of a Gabor filter bank have appeared. In order to find the optimal parametrization for a face recognition experiment, we have performed a 6-way analysis of variance of Gabor parameters using the FERET, FRAV2D, FRAV3D, FRGC and XM2VTS face databases, including frontal and turned poses, facial expressions, occlusions and changes of illumination. Considering independent criteria to find the optimal Gabor filter bank, the bank with the highest recognition rate was found to have 6 frequencies and narrower Gaussian widths in the space domain. These results were obtained with the Mahalanobis distance for a k-NN classifier, with analytical and holistic Gabor feature vectors. Moreover, about 20% of the banks studied here obtained on average a better performance than the classical bank. For most of the databases considered, the highest recognition rates have been achieved with analytical representations (frontal images, images with turns or occlusions), with a holistic preponderance for images with gestures or changes of illumination. The inferiority found for holistic Gabor representations versus their analytical counterparts can be explained by the intrinsic redundancy and the size of the feature vectors of this kind of representation.
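
For readers unfamiliar with the "classical" bank, the sketch below builds 5 frequencies by 8 orientations of complex Gabor kernels in plain Python. The abstract gives only the counts; the parameter values used here (maximum wave number π/2, frequency spacing √2, σ = 2π, and a 9×9 kernel) are assumptions based on the commonly cited parametrization, and the function names are hypothetical.

```python
import cmath
import math


def gabor_kernel(nu, mu, size=9, k_max=math.pi / 2,
                 f=math.sqrt(2), sigma=2 * math.pi, n_orient=8):
    """Complex Gabor kernel for frequency index nu and orientation index mu.

    size must be odd so the kernel has a centre pixel. All default
    parameter values are assumptions, not taken from the abstract.
    """
    k = k_max / f ** nu                # wave number for this frequency
    phi = mu * math.pi / n_orient      # orientation angle
    kx, ky = k * math.cos(phi), k * math.sin(phi)
    dc = math.exp(-sigma ** 2 / 2)     # DC-compensation term
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            envelope = (k * k / sigma ** 2) * math.exp(
                -k * k * (x * x + y * y) / (2 * sigma ** 2))
            row.append(envelope * (cmath.exp(1j * (kx * x + ky * y)) - dc))
        kernel.append(row)
    return kernel


def gabor_bank(n_freq=5, n_orient=8, size=9):
    """Dict mapping (frequency index, orientation index) to a kernel."""
    return {
        (nu, mu): gabor_kernel(nu, mu, size=size, n_orient=n_orient)
        for nu in range(n_freq)
        for mu in range(n_orient)
    }


bank = gabor_bank()  # 5 x 8 = 40 kernels
```

Calling `gabor_bank(n_freq=6)` gives a bank with 6 frequencies, as in the best-performing configuration reported above, although the narrower Gaussian widths the abstract mentions would also require changing `sigma`.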
