首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
Investigating Speaker Verification in real-world noisy environments, a novel feature extraction process suitable for suppression of time-varying noise is compared with a fine-tuned spectral subtraction method. The proposed feature extraction process is based on approximating the clean speech and the noise spectral magnitude with a mixture of Gaussian probability density functions (pdfs) by using the Expectation-Maximization algorithm (EM). Subsequently, the Bayesian inference framework is applied to the degraded spectral coefficients, and by employing Minimum Mean Square Error Estimation (MMSE), a closed form solution for the spectral magnitude estimation task is derived. The estimated spectral magnitude finally is incorporated into the Mel-Frequency Cepstral Coefficients (MFCCs) front-end of a baseline text-independent speaker verification system, based on Probabilistic Neural Networks, which participated successfully in the 2002 NIST (National Institute of Standards and Technology of USA) Speaker Recognition Evaluation. A comparative study of the proposed technique for real-world noise types demonstrates a significant performance gain compared to the baseline speech features and to the spectral subtraction enhancement method. Improvements of the absolute speaker verification performance with more than 27% for 0 dB signal-to-noise ratio (SNR), compared to the MFCCs, and with more than 13% for –5 dB SNR, compared to the spectral subtraction version, were obtained in the case of a passing-by aircraft scenario.  相似文献   

2.
在文本无关的说话人识别中,韵律特征由于其对信道环境噪声不敏感等特性而被应用于话者识别任务中.本文对韵律参数采用基于高斯混合模型超向量的支持向量机建模方法,并将类内协方差特征映射方法应用于模型超向量上,单系统的性能比传统方法的混合高斯-通用背景模型(Gaussian mixture model-universal background model,GMM-UBM)基线系统有了40.19%的提升.该方法与本文的基于声学倒谱参数的确认系统融合后,能使整体系统的识别性能有9.25%的提升.在NIST(National institute of standards and technology mixture)2006说话人测试数据库上,融合后的系统能够取得4.9%的等错误率.  相似文献   

3.
论文介绍了一个基于DSP的说话人确认系统,该系统确认算法建立于高斯混合模型-全局背景模型(GMM-UBM)的基础上,并在特征空间采用一种新的基于信息熵特征融合的算法,实验结果表明在不影响识别率的情况下,该算法计算量比传统的特征关联融合的要减少以上,比归一化融合要少。硬件系统采用高速DSP芯片TMS320C6701,为确认算法的实时实现提供了保证。  相似文献   

4.
给出了一种基于多微商核函数(MDK)的结合高斯混合模型(GMM)和支持向量机(SVM)的方法,并应用于SVM文本无关话者确认。从GMM话者语音特征概率分布出发,用多阶微商描述GMM概率分布,将GMM和SVM结合的问题转化为用多阶微商建立SVM话者模型的问题。首先对说话人语音进行基于因子分析的参数域失配补偿,用GMM描述失配补偿后的话者语音特征的概率分布;然后对GMM求多阶微商;最后构建多微商核函数,建立多SVM话者模型。在NIST’01 2min-1min话者确认数据库上的实验表明,基于多微商核函数的SVM话者确认系统性能优于基于失配补偿的GMM系统,也比基于失配补偿的Fisher核函数SVM话者系统和基于失配补偿的Kullback-Leibler(KL)距离SVM话者系统有较大的提高。  相似文献   

5.
提出一种可用于较少语音数据量的文本无关的超音段信息提取方法.通过对基音和能量的轨迹动态分段,提取超音段信息,并使用异方差线性区分分析(HLDA)进行参数优化,克服超音段信息提取对数据量大小的依赖,同时采用混合高斯-统一背景(GMM-UBM)模型结构,建立文本无关话者识别系统.在NIST′01数据库上的实验表明,该系统性能优于基于短时帧的音源信息参数系统,更重要的是不需要大数据量的支持,且与基于短时帧倒谱参数的话者识别系统融合后,系统识别性能明显改善,等误识率相对下降10%.  相似文献   

6.
提出在与文本无关说话人确认中采用模型间马氏(Mahalanobis)距离的夹角作为测试算法,在混合高斯模型(Gaussian ixture Model)的情况下,采用这种算法在保持识别率与传统的对数似然度算法相近的前提下,可以大大降低运算量,对于说话人确认或识别的实时实现有很大的帮助.另外,推荐的算法与传统的对数似然度算法的结果可以融合,可以将说话人确认的等错误率降低12~15%.  相似文献   

7.
基于TZ Normalization规整的话者确认阈值选取   总被引:3,自引:0,他引:3  
针对说话人确认中,各目标话者模型输出评分分布不一致而导致系统确认阈值设置的困难,本文采取了通过评分规整确定系统最小检测代价函数(DCF)确认阈值的方法.在分析了已有的两种评分规整方法Z normalization和T normalization的基础上,提出了一种结合两者优点的组合规整方法——TZ normalization,并据此给出了一种阈值动态修正方法,有效地提高了系统的性能和阈值选取的鲁棒性.对历年的NIST(手机电话语音)评测语料库进行了实验,表明了该方法的有效性.  相似文献   

8.
The evolution of robust speech recognition systems that maintain a high level of recognition accuracy in difficult and dynamically-varying acoustical environments is becoming increasingly important as speech recognition technology becomes a more integral part of mobile applications. In distributed speech recognition (DSR) architecture the recogniser's front-end is located in the terminal and is connected over a data network to a remote back-end recognition server. The terminal performs the feature parameter extraction, or the front-end of the speech recognition system. These features are transmitted over a data channel to the remote back-end recogniser. DSR provides particular benefits for the applications of mobile devices such as improved recognition performance compared to using the voice channel and ubiquitous access from different networks with a guaranteed level of recognition performance. A feature extraction algorithm integrated into the DSR system is required to operate in real-time as well as with the lowest possible computational costs.In this paper, two innovative front-end processing techniques for noise robust speech recognition are presented and compared, time-domain based frame-attenuation (TD-FrAtt) and frequency-domain based frame-attenuation (FD-FrAtt). These techniques include different forms of frame-attenuation, improvement of spectral subtraction based on minimum statistics, as well as a mel-cepstrum feature extraction procedure. Tests are performed using the Slovenian SpeechDat II fixed telephone database and the Aurora 2 database together with the HTK speech recognition toolkit. The results obtained are especially encouraging for mobile DSR systems with limited sizes of available memory and processing power.  相似文献   

9.
This document describes the implementation of a free speech speaker verification solution in the Israel First Direct Bank's (FDI) call center. This implementation is, as far as we know, the world's first commercial free-speech SV implementation.Before implementing the system, FDI was using a manual speaker authentication method consisting of several personal questions that the calling customer had to answer correctly (e.g. What is the first letter of your mother's maiden name). Most call centers still use similar methods for authenticating callers. The FDI realized that this process was expensive and time-consuming and that it was also vulnerable to impostors since the answers for the authentication questions could be obtained quite easily.In this document we describe how free speech SV was implemented and integrated with the FDI systems in order to reduce the cost of customer authentication and to increase security. We discuss the type of expectations the FDI had from this system and how expectations should be adjusted to realistic levels. We also describe some of the major obstacles that were encountered when implementing the system in a real world environment and finally we present the current performance level of the system in terms of false acceptance and false rejection of callers.  相似文献   

10.
This paper presents the use of distance normalization techniques in order to improve speaker verification system performance. These techniques provide a dynamic threshold that compensates for the trial-to-trial variations and replaces the fixed threshold used in the classical speaker verification approach. Two methods are described: the cohort model normalization and a new and original hybrid cohort-world model normalization. These methods are compared from the point of view of storage space requirements and computational effort. Two algorithms are proposed: one uses existing user models, and the other creates new models. The algorithms were evaluated using the YOHO database and a proprietary database. The results showed that using these methods, the errors of false rejection are significantly reduced for a constant false acceptance error, when the cohort size is increasing. The algorithms also involve fewer computational resources than other algorithms, making them more suitable for commercial application.  相似文献   

11.
基于PCNN的语谱图特征提取在说话人识别中的应用   总被引:7,自引:1,他引:7  
该文首次提出了一种将有生物视觉依据的人工神经网络——脉冲耦合神经网络(PulseCoupledNeuralNetwork,以下简称为PCNN)用于说话人识别领域的语谱图特征提取的新方法。该方法将语谱图输入到PCNN后得到输出图像的时间序列及其熵序列作为说话人语音的特征,利用它的不变性实现说话人识别。实验结果表明,该方法可以快速有效地进行说话人识别。该文将PCNN引入到语音识别的应用研究中,开拓了信号处理中两个极为重要的部分———语音信号处理和图像信号处理结合的新领域,同时对于PCNN的理论研究和实际应用具有非常重要的现实意义。  相似文献   

12.
Modeling distributed computer systems is known to be a challenging enterprise. Typically, distributed systems are comprised of large numbers of components whose coordination may require complex interactions. Modeling such systems more often than not leads to the nominal intractability of the resulting state space. Various formal methods have been proposed to address the modeling of coordination among distributed systems components. For the most part, however, these methods do not support formal verification mechanisms. By way of contrast, the L-automata/L-processes model supports formal verification mechanisms which in many examples can successfully circumvent state space explosion problems, and allow verification proofs to be extended to an arbitrary number of components. After reviewing L-automata/L-processes formalisms, we present here the formal specification of a fault-tolerant algorithm for a distributed computer system. We also expose the L-automata/L-processes verification of the distributed system, demonstrating how various techniques such as homomorphic reduction, induction, and linearization, can be used to overcome various problems which surface as one models large, complex systems.  相似文献   

13.
A case study on the application of Communicating Sequential Processes (CSP) to the design and verification of fault-tolerant real-time systems is presented. The distributed recovery block (DRB) scheme is a design technique for the uniform treatment of hardware and software faults in real-time systems. Through a simple fault-tolerant real-time system design using the DRB scheme, the case study illustrates a paradigm for specifying fault-tolerant software and demonstrates how the different behavioural aspects of a fault-tolerant real-time system design can be separately and systematically specified, formulated, and verified using an integrated set of formal techniques based on CSP.  相似文献   

14.
This paper describes a set of tools that enables developers to log and analyze the run-time behavior of distributed control systems. A feature of the tools is that they can be applied to distributed systems. The logging tools enable developers to instrument C or C++ programs so that data indicating state changes can be logged automatically in a variety of formats. In particular, run-time data from distributed systems can be synchronized into a single relational database. Tools are also provided for visualizing the logged data. Analysis to verify correct program behavior is done using a new interval logic that is described in this paper. The logic enables system engineers to express temporal specifications for the autonomous control program that are then checked against the logged data. The data logging, visualization, and interval logic analysis tools are all fully implemented. Results are given from a NASA distributed autonomous control system application.  相似文献   

15.
16.
本文介绍了如何应用提升小波包变换对信号进行特征提取,并在此基础上提出了四条定量的评价标准,能够全面地对此类特征提取方法的有效性进行评价。通过这四个标准,就能更科学地选取合适的特征小波包,从而进一步提高原方法的效率,减少不必要的计算复杂度,使之更加适用于压缩机信号的实时监测。  相似文献   

17.
机械结构多机分布式优化的集成策略研究   总被引:3,自引:0,他引:3  
为了解决复杂机械结构的优化集成问题,提出了跨广义优化平台与有限元分析平台的多计算机分布式优化的集成策略及系统总体框架,详细探讨了实现集成的一些关键技术,包括在有限元前处理中的边界载荷条件的自动施加、网格自动修改、数据通讯问题、远程控制技术等,提出了相应的解决方案,并已有效地用于求解液压挖掘机的广义优化模型。  相似文献   

18.
基于工程特征类的预算工程量自适应提取方法研究   总被引:3,自引:2,他引:3  
提出一种基于施工图的建筑CAD与预算编制集成方法,通过对施工图逻辑结构和反映预算工程量的各类图形信息特点进行分析,提出了工程特征类信息的概念,采用模板和模式识别技术实现了多层构造类预算工程量的自动提取和定额自动套取,并在基于通用设计平台的概预算一体化系统中得到应用。  相似文献   

19.
We study a networking architecture model that is built on a distributed processing environment (DPE) for multimedia services suitable for high speed transport networks such as ATM networks. In this architecture, the applications are deployed as units of software building blocks. Each building block provides a layered view for the effective management and control of the multimedia network resources and services according to the concept of telecommunications management network (TMN) and telecommunications information networking architecture (TINA). For the purpose of flexible service provision to users and effective service introduction by service providers, this architecture proposes the adoption of ad hoc service building blocks such as a video on demand building block and a CSCW building block that have interactions with a general purpose building block. This paper also proposes a naming structure for the management of user profiles and session profiles using a directory service system, and an effective control model for multimedia logical device objects using a stream process approach. The proposed model is implemented on a DPE platform that provides various transparencies, ANSAware.  相似文献   

20.
This paper proposes a projection-based symmetrical factorisation method for extracting semantic features from collections of text documents stored in a Latent Semantic space. Preliminary experimental results demonstrate this yields a comparable representation to that provided by a novel probabilistic approach which reconsiders the entire indexing problem of text documents and works directly in the original high dimensional vector-space representation of text. The employed projection index is derived here from the a priori constraints on the problem. The principal advantage of this approach is computational efficiency and is obtained by the exploitation of the Latent Semantic Indexing as a preprocessing stage. Simulation results on subsets of the 20-Newsgroups text corpus in various settings are provided. This revised version was published online in August 2006 with corrections to the Cover Date.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号