Similar documents
 Found 20 similar documents (search time: 31 ms)
1.
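In recent years, facial expression recognition has become one of the research hotspots in computer vision and pattern recognition. This paper presents a computer vision system comprising facial feature extraction and expression recognition. By tracking compatible facial motion features in video, the system extracts a sequence of facial motion feature vectors. Unlike previous methods, the extracted feature vector stream is split into two classes: an expression feature vector stream and a visual speech feature vector stream. Facial expressions are then recognized with a model based on the CHMM (Coupled Hidden Markov Model), which allows the two streams to be processed asynchronously according to their own temporal characteristics while preserving their natural temporal correlation. Experiments show that the method outperforms traditional single-channel processing.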

2.
In this paper we propose a new approach to real-time view-based pose recognition and interpolation. Pose recognition is particularly useful for identifying camera views in databases, video sequences, video streams, and live recordings. All of these applications require a fast pose recognition process, in many cases video real-time. It should further be possible to extend the database with new material, i.e., to update the recognition system online. The method that we propose is based on P-channels, a special kind of information representation which combines advantages of histograms and local linear models. Our approach is motivated by its similarity to information representation in biological systems but its main advantage is its robustness against common distortions such as clutter and occlusion. The recognition algorithm consists of three steps: (1) low-level image features for color and local orientation are extracted in each point of the image; (2) these features are encoded into P-channels by combining similar features within local image regions; (3) the query P-channels are compared to a set of prototype P-channels in a database using a least-squares approach. The algorithm is applied in two scene registration experiments with fisheye camera data, one for pose interpolation from synthetic images and one for finding the nearest view in a set of real images. The method compares favorably to SIFT-based methods, in particular concerning interpolation. The method can be used for initializing pose-tracking systems, either when starting the tracking or when the tracking has failed and the system needs to re-initialize. Due to its real-time performance, the method can also be embedded directly into the tracking system, allowing a sensor fusion unit to choose dynamically between the frame-by-frame tracking and the pose recognition.

3.
4.
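An online parallel DTW algorithm for embedded speech recognition systems*   Cited by: 2 (self-citations: 0, citations by others: 2)
To improve the real-time performance of speech recognition systems, an online parallel DTW (dynamic time warping) algorithm suited to embedded speech recognition systems is proposed, drawing on dynamic programming and parallel computing. After analyzing standard DTW and its main variants, the data structure of the DTW algorithm is modified to meet the requirements of an online algorithm: while searching for the best path, memory is either allocated and released dynamically and continuously, or a fixed-size block is allocated in advance. The DTW computations for multiple keywords are distributed across multiple processing units, and the results of the units are then combined to obtain the recognition result. Experiments show that, compared with classical DTW, the algorithm reduces memory usage and recognition time and brings the real-time factor of speech recognition to 1.17, giving good real-time performance.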
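As a rough illustration of the memory-saving idea in the abstract above, the sketch below computes a classic DTW cost while keeping only two rows of the cost matrix, then picks the keyword template with the lowest cost. It is a minimal sketch, not the paper's algorithm: the function names, the 1-D features, and the absolute-difference local distance are all assumptions, and the per-keyword loop runs serially here although each template's computation is independent and could be handed to a separate processing unit.

```python
def dtw_distance(query, template):
    """Classic DTW cost between two 1-D sequences, keeping only two rows."""
    inf = float("inf")
    # Row for the empty query prefix: only the (0, 0) cell is reachable.
    prev = [0.0] + [inf] * len(template)
    for q in query:
        curr = [inf] * (len(template) + 1)
        for j, t in enumerate(template, start=1):
            cost = abs(q - t)  # local distance (assumed: absolute difference)
            # Extend the cheapest of the three predecessor cells.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr  # the older row is no longer needed
    return prev[-1]


def recognize_keyword(query, templates):
    """Return the name of the template with the smallest DTW cost."""
    return min(templates, key=lambda name: dtw_distance(query, templates[name]))


if __name__ == "__main__":
    # Hypothetical toy templates; real systems would use feature-vector frames.
    templates = {"yes": [1, 2, 3], "no": [3, 2, 1]}
    print(recognize_keyword([1, 1, 2, 3], templates))
```

Memory use is proportional to the template length rather than to the product of the two sequence lengths, which is the property an online variant depends on.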

5.
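In the training stage of online handwriting recognition for Uyghur, words are segmented into letters, and feature extraction and clustering produce the feature vectors used as model input. Hidden Markov models (HMMs) with characters as the basic unit are constructed and embedded into the recognition dictionary network. An HMM-based classifier then produces the final recognition result. For the first time, the approach of removing delayed strokes and building dictionaries with and without delayed strokes is applied to Uyghur handwriting recognition, achieving a high recognition rate.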

6.
Despite their known weaknesses, hidden Markov models (HMMs) have been the dominant technique for acoustic modeling in speech recognition for over two decades. Still, the advances in the HMM framework have not solved its key problems: it discards information about time dependencies and is prone to overgeneralization. In this paper, we attempt to overcome these problems by relying on straightforward template matching. The basis for the recognizer is the well-known DTW algorithm. However, classical DTW continuous speech recognition results in an explosion of the search space. The traditional top-down search is therefore complemented with a data-driven selection of candidates for DTW alignment. We also extend the DTW framework with a flexible subword unit mechanism and a class sensitive distance measure, two components suggested by state-of-the-art HMM systems. The added flexibility of the unit selection in the template-based framework leads to new approaches to speaker and environment adaptation. The template matching system reaches a performance somewhat worse than the best published HMM results for the Resource Management benchmark, but thanks to complementarity of errors between the HMM and DTW systems, the combination of both leads to a decrease in word error rate of 17% compared to the HMM results.

7.
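A unified model of dynamic time warping and hidden Markov models in speech recognition   Cited by: 1 (self-citations: 0, citations by others: 1)
Based on the essential connection between the two techniques widely used in speech recognition today, dynamic time warping (DTW) and the hidden Markov model (HMM), a unified model of the two (DHUM, DTW and HMM Unified Model) is proposed, and the conversions from DTW and from HMM to DHUM are given. The paper also proposes using DHUM to address the excessive computational cost of speech recognition with higher-order HMMs, which are closer to the actual nature of speech. Recognition experiments on a medium-sized vocabulary show that the recognition performance of a recognizer built on DHUM is no lower than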

8.
This paper is concerned with the problem of recognition of dynamic hand gestures. We have considered gestures which are sequences of distinct hand poses. In these gestures hand poses can undergo motion and discrete changes. However, continuous deformations of the hand shapes are not permitted. We have developed a recognition engine which can reliably recognize these gestures despite individual variations. The engine also has the ability to detect start and end of gesture sequences in an automated fashion. The recognition strategy uses a combination of static shape recognition (performed using contour discriminant analysis), Kalman filter based hand tracking and an HMM based temporal characterization scheme. The system is fairly robust to background clutter and uses skin color for static shape recognition and tracking. A real time implementation on standard hardware is developed. Experimental results establish the effectiveness of the approach.

9.
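To improve the accuracy and real-time performance of gesture recognition from electromyographic signals, a gesture recognition method based on the dynamic time warping (DTW) algorithm is proposed; it uses electromyography (EMG) signals to recognize gestures across individuals. First, a moving-average energy method segments the raw EMG signal and detects valid movements. Second, the mean absolute value (MAV) is used to extract signal features from each segment. Finally, the DTW algorithm fuses the 8-dimensional EMG signal and computes the similarity between test samples and templates; the templates themselves are built by using DTW to find warping paths, which enables gesture recognition across individuals. Experimental results show that DTW-based gesture recognition from EMG signals reaches a recognition accuracy of 96.09%, and that the method is computationally fast with good real-time performance.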
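The first two steps of the abstract above (segmentation by moving-average energy, then MAV features) can be sketched as follows. This is a minimal single-channel sketch under stated assumptions: the window length, the threshold, and all function names are hypothetical, and the paper's actual parameters and its 8-channel fusion are not reproduced here.

```python
def moving_average_energy(signal, window):
    """Mean squared amplitude over a sliding window ending at each sample."""
    energies = []
    for i in range(len(signal)):
        chunk = signal[max(0, i - window + 1): i + 1]
        energies.append(sum(x * x for x in chunk) / len(chunk))
    return energies


def active_segments(signal, window, threshold):
    """Return (start, end) index pairs where the energy exceeds the threshold."""
    segments, start = [], None
    for i, energy in enumerate(moving_average_energy(signal, window)):
        if energy > threshold and start is None:
            start = i  # activity begins
        elif energy <= threshold and start is not None:
            segments.append((start, i))  # activity ends (end is exclusive)
            start = None
    if start is not None:
        segments.append((start, len(signal)))
    return segments


def mean_absolute_value(segment):
    """MAV feature of one segment."""
    return sum(abs(x) for x in segment) / len(segment)


if __name__ == "__main__":
    emg = [0, 0, 1, -1, 1, 0, 0, 0]  # toy samples, not real EMG data
    for start, end in active_segments(emg, window=2, threshold=0.4):
        print(start, end, mean_absolute_value(emg[start:end]))
```

Each detected segment would then be reduced to a feature sequence and compared against gesture templates with DTW.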

10.
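A bimodal speech recognition method based on the product HMM   Cited by: 3 (self-citations: 2, citations by others: 1)
For speech recognition in noisy environments, a product hidden Markov model (HMM) for bimodal speech recognition is proposed. An audio HMM and a video HMM are trained independently, and a two-dimensional training model is then built on them to represent the asynchrony between the audio and video streams. Weight coefficients are introduced so that the weights of the audio and video streams are adjusted adaptively to different noise environments. Experimental results show that the method achieves better recognition performance than other bimodal speech recognition methods.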
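The effect of a stream weight can be illustrated with a much simpler scheme than the product HMM described above: a weighted sum of per-stream log-likelihoods computed after each stream has been scored separately. This is an assumption-laden sketch, not the paper's model; the function names and the example scores are hypothetical, and how the weight is chosen for a given noise level is left open.

```python
def fused_log_likelihood(audio_loglik, video_loglik, audio_weight):
    """Weighted combination of two stream scores; audio_weight lies in [0, 1]."""
    return audio_weight * audio_loglik + (1.0 - audio_weight) * video_loglik


def pick_word(scores, audio_weight):
    """scores maps each word to (audio log-likelihood, video log-likelihood)."""
    return max(
        scores,
        key=lambda word: fused_log_likelihood(*scores[word], audio_weight),
    )


if __name__ == "__main__":
    # Hypothetical scores: audio favors "one", video favors "two".
    scores = {"one": (-10.0, -30.0), "two": (-20.0, -12.0)}
    print(pick_word(scores, audio_weight=0.9))  # clean audio: trust the audio stream
    print(pick_word(scores, audio_weight=0.1))  # noisy audio: trust the video stream
```

Lowering the audio weight as noise increases shifts the decision toward the visual stream, which is the behavior the adaptive weighting is meant to achieve.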

11.
Sequential Monte Carlo (SMC) represents a principal statistical method for tracking objects in video sequences by on-line estimation of the state of a non-linear dynamic system. The performance of individual stages of the SMC algorithm is usually data-dependent, making the prediction of the performance of a real-time capable system difficult and often leading to grossly overestimated and inefficient system designs. Also, the considerable computational complexity is a major obstacle when implementing SMC methods on purely CPU-based resource constrained embedded systems. In contrast, heterogeneous multi-cores present a more suitable implementation platform. We use hybrid CPU/FPGA systems, as they can efficiently execute both the control-centric sequential and the data-parallel parts of an SMC application. However, even with hybrid CPU/FPGA platforms, determining the optimal HW/SW partitioning is challenging in general, and even impossible with a design time approach. Thus, we need self-adaptive architectures and system software layers that are able to react autonomously to varying workloads and changing input data while preserving real-time constraints and area efficiency. In this article, we present a video tracking application modeled on top of a framework for implementing SMC methods on CPU/FPGA-based systems such as modern platform FPGAs. Based on a multithreaded programming model, our framework allows for an easy design space exploration with respect to the HW/SW partitioning. Additionally, the application can adaptively switch between several partitionings during run-time to react to changing input data and performance requirements. Our system utilizes two variants of an add/remove self-adaptation technique for task partitioning inside this framework that achieve soft real-time behavior while trying to minimize the number of active cores. To evaluate its performance and area requirements, we demonstrate the application and the framework on a real-life video tracking case study and show that partial reconfiguration can be effectively and transparently used for realizing adaptive real-time HW/SW systems.

12.
In this paper, we propose a robust visual tracking method by casting tracking as a sparse approximation problem in a particle filter framework. In this framework, occlusion, noise, and other challenging issues are addressed seamlessly through a set of trivial templates. Specifically, to find the tracking target in a new frame, each target candidate is sparsely represented in the space spanned by target templates and trivial templates. The sparsity is achieved by solving an l1-regularized least-squares problem. Then, the candidate with the smallest projection error is taken as the tracking target. After that, tracking is continued using a Bayesian state inference framework. Two strategies are used to further improve the tracking performance. First, target templates are dynamically updated to capture appearance changes. Second, nonnegativity constraints are enforced to filter out clutter which negatively resembles tracking targets. We test the proposed approach on numerous sequences involving different types of challenges, including occlusion and variations in illumination, scale, and pose. The proposed approach demonstrates excellent performance in comparison with previously proposed trackers. We also extend the method for simultaneous tracking and recognition by introducing a static template set which stores target images from different classes. The recognition result at each frame is propagated to produce the final result for the whole video. The approach is validated on a vehicle tracking and classification task using outdoor infrared video sequences.

13.
Motion trajectories provide rich spatio-temporal information about an object's activity. The trajectory information can be obtained using a tracking algorithm on data streams available from a range of devices including motion sensors, video cameras, haptic devices, etc. Developing view-invariant activity recognition algorithms based on this high dimensional cue is an extremely challenging task. This paper presents efficient activity recognition algorithms using novel view-invariant representation of trajectories. Towards this end, we derive two affine-invariant representations for motion trajectories based on curvature scale space (CSS) and centroid distance function (CDF). The properties of these schemes facilitate the design of efficient recognition algorithms based on hidden Markov models (HMMs). In the CSS-based representation, maxima of curvature zero crossings at increasing levels of smoothness are extracted to mark the location and extent of concavities in the curvature. The sequences of these CSS maxima are then modeled by continuous density HMMs. For the case of CDF, we first segment the trajectory into subtrajectories using CDF-based representation. These subtrajectories are then represented by their Principal Component Analysis (PCA) coefficients. The sequences of these PCA coefficients from subtrajectories are then modeled by continuous density HMMs. Different classes of object motions are modeled by one continuous HMM per class where state PDFs are represented by GMMs. Experiments using a database of around 1750 complex trajectories (obtained from UCI-KDD data archives) subdivided into five different classes are reported.
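For readers unfamiliar with the centroid distance function, the sketch below computes its basic form for a 2-D trajectory: the distance from each point to the trajectory's centroid. It is only a minimal sketch of the textbook definition. The function name and the tuple-based point format are assumptions, and it shows translation invariance only, not the paper's full affine-invariant representation or its segmentation into subtrajectories.

```python
import math


def centroid_distance_function(points):
    """Distance from each (x, y) trajectory point to the trajectory centroid."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    return [math.hypot(x - cx, y - cy) for x, y in points]


if __name__ == "__main__":
    square = [(0, 0), (2, 0), (2, 2), (0, 2)]  # toy trajectory
    shifted = [(x + 5, y - 3) for x, y in square]
    # Both calls yield the same sequence: shifting the trajectory moves the
    # centroid by the same offset, so the distances are unchanged.
    print(centroid_distance_function(square))
    print(centroid_distance_function(shifted))
```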

14.
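Video-based face recognition using Laplacianfaces and hidden Markov models   Cited by: 1 (self-citations: 2, citations by others: 1)
A video-based face recognition method using Laplacianfaces and hidden Markov models is proposed. During training, the Laplacianfaces method maps the face images of each video sequence into the Laplacian space; the dimension-reduced features serve as observations, and a hidden Markov model captures the statistical and temporal dynamic characteristics of each training video. During recognition, the hidden Markov model of each training video is used to analyze the temporal dynamics of the test video, and the probability that each trained model generated the sequence is computed; the model with the highest probability gives the class of the sequence to be recognized. Experimental results show that the method performs video-based face recognition well.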
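The decision rule in the abstract above (score the test sequence under every trained HMM and pick the most probable model) can be sketched with the scaled forward algorithm. This is a minimal sketch under simplifying assumptions: observations are discrete symbols rather than the continuous dimension-reduced features the paper uses, and the function names and toy model parameters are hypothetical.

```python
import math


def forward_log_likelihood(obs, start, trans, emit):
    """Log P(obs | model) for a discrete HMM via the scaled forward algorithm."""
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the transitions, then weight by the emission.
            alpha = [
                sum(alpha[r] * trans[r][s] for r in range(n)) * emit[s][obs[t]]
                for s in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the model cannot generate this sequence
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_lik


def classify(obs, models):
    """models maps a class name to its (start, trans, emit) parameters."""
    return max(models, key=lambda name: forward_log_likelihood(obs, *models[name]))


if __name__ == "__main__":
    uniform = [[0.5, 0.5], [0.5, 0.5]]
    models = {
        "person_a": ([0.5, 0.5], uniform, [[0.9, 0.1], [0.9, 0.1]]),
        "person_b": ([0.5, 0.5], uniform, [[0.1, 0.9], [0.1, 0.9]]),
    }
    print(classify([0, 0, 1, 0], models))
```

Working in the log domain with per-step rescaling keeps the computation stable for long video sequences, where raw probabilities would underflow.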

15.
Obtaining training material for rarely used English words and common given names from countries where English is not spoken is difficult due to excessive time, storage and cost factors. By considering pe...

16.
This paper studies some pattern recognition algorithms for on-line signature recognition: vector quantization (VQ), nearest neighbor (NN), dynamic time warping (DTW) and hidden Markov models (HMM). We have used a database of 330 users which includes 25 skilled forgeries performed by five different impostors. This database is larger than the typical ones found in the literature. Experimental results reveal that our first proposed combination of VQ and DTW (by means of score fusion) outperforms the other algorithms (DTW, HMM) and achieves a minimum detection cost function (DCF) value equal to 1.37% for random forgeries and 5.42% for skilled forgeries. In addition, we present another combined DTW-VQ scheme which enables improvement of privacy for remote authentication systems, avoiding the submission of the whole original dynamical signature information (using codewords, instead of feature vectors). This system achieves performance similar to that of DTW.

17.
Building a large vocabulary continuous speech recognition (LVCSR) system requires many hours of segmented and labelled speech data. Arabic, like many other low-resourced languages, lacks such data, but automatic segmentation has proved to be a good alternative for making these resources available. In this paper, we suggest the combination of hidden Markov models (HMMs) and support vector machines (SVMs) to segment and to label the speech waveform into phoneme units. HMMs generate the sequence of phonemes and their frontiers; the SVM refines the frontiers and corrects the labels. The obtained segmented and labelled units may serve as a training set for speech recognition applications. The HMM/SVM segmentation algorithm is assessed using both the hit rate and the word error rate (WER); the resulting scores were compared to those provided by the manual segmentation and to those provided by the well-known embedded learning algorithm. The results show that the speech recognizer built upon the HMM/SVM segmentation outperforms the one built upon the embedded learning segmentation by about 0.05% in terms of WER, even with a noisy background.

18.
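A face detection, tracking and recognition system based on video surveillance is designed; it detects and tracks face images in video in real time while performing identity recognition. To address the low detection efficiency of cascade classifiers built with the Gentle AdaBoost algorithm, a cascade classifier of progressively increasing complexity is proposed. To address the inaccuracy of the highest-weight particle in traditional particle filters, a mean-weight particle filter is proposed. To address sample degeneracy in traditional particle filters, a tracking correction strategy that combines face detection and face tracking is proposed. Detected and tracked faces are identified with a method based on the Gabor transform and HMMs. Experimental results show that the system detects faces accurately, tracks them in real time, and recognizes them quickly, making it an effective method for video surveillance systems.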

19.
Detecting and tracking human faces in video sequences is useful in a number of applications such as gesture recognition and human-machine interaction. In this paper, we show that online appearance models (holistic approaches) can be used for simultaneously tracking the head, the lips, the eyebrows, and the eyelids in monocular video sequences. Unlike previous approaches to eyelid tracking, we show that the online appearance models can be used for this purpose. Neither color information nor intensity edges are used by our proposed approach. More precisely, we show how the classical appearance-based trackers can be upgraded in order to deal with fast eyelid movements. The proposed eyelid tracking is made robust by avoiding eye feature extraction. Experiments on real videos show the usefulness of the proposed tracking schemes as well as their enhancement to our previous approach.
Javier Orozco

20.
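An online signature verification method is proposed that uses the segment-wise difference values of a signature as the observations of a hidden Markov model (HMM). First, a bidirectional backward-merging DTW algorithm determines the correspondence between key points in the signatures. Then classical DTW measures the various subtle differences between signatures, and these DTW difference values are used as observations to train the HMM. The meaning of a model state is defined as a degree of similarity, and the state transition structure is set to fully connected transitions. The effectiveness of the method is verified on the SVC2004 signature database.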

Copyright © Beijing Qinyun Technology Development Co., Ltd. (北京勤云科技发展有限公司)  ICP license: 京ICP备09084417号