Stochastic feature compensation methods for speaker verification in noisy environments
Affiliation:1. Science Applications International Corporation, Cleveland, OH 44135, United States;2. NASA Glenn Research Center, Cleveland, OH 44135, United States;1. Management College, Inner Mongolia University of Technology, Hohhot 010051, China;2. School of Economics and Management, Inner Mongolia University, Hohhot 010021, China;3. College of Science, Inner Mongolia University of Technology, Hohhot 010051, China;1. Department of Computer Science and Information Engineering, National Dong Hwa University, Hualien, Taiwan;2. Department of Electrical Engineering, National Dong Hwa University, Hualien, Taiwan;3. Institute of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan;1. School of Economics and Management, Tongji University, Shanghai 200092, China;2. School of Law, Tongji University, Shanghai 200092, China;1. Faculty of Information Science and Technology, Multimedia University, Jalan Ayer Keroh Lama, 75450 Melaka, Malaysia;2. Faculty of Engineering and Technology, Multimedia University, Jalan Ayer Keroh Lama, 75450 Melaka, Malaysia
Abstract: This paper explores the significance of stereo-based stochastic feature compensation (SFC) methods for robust speaker verification (SV) in mismatched training and test environments. Gaussian Mixture Model (GMM)-based SFC methods developed in the past have been restricted solely to speech recognition tasks. This paper proposes applying these algorithms in an SV framework to compensate for background noise. A priori knowledge of the test environment and the availability of stereo training data are assumed. During the training phase, Mel-frequency cepstral coefficient (MFCC) features extracted from a speaker's noisy and clean speech utterances (stereo data) are used to build front-end GMMs. During the evaluation phase, noisy test utterances are transformed using either a minimum mean squared error (MMSE) estimate or a maximum likelihood estimate (MLE), computed with the target speaker's GMMs. Experiments on the NIST-2003-SRE database, with clean speech utterances artificially degraded by different types of additive noise, show that the proposed SV systems outperform baseline SV systems in mismatched conditions across all noisy background environments.
Keywords: Speaker verification; Noisy environment; Minimum mean squared error; Maximum likelihood estimate; Expectation Maximization algorithm; Gaussian Mixture Models
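
The abstract does not give the compensation equations, so the following is only a minimal sketch of one well-known stereo-based MMSE scheme (a SPLICE-style per-component bias correction), not the authors' method. It assumes a diagonal-covariance GMM over noisy features has already been trained (for example with EM); the function names, the toy parameters, and the 1-D "features" are all hypothetical and chosen for illustration. Training estimates, for each mixture component, the posterior-weighted average offset between clean and noisy frames; evaluation adds the posterior-weighted offsets to each noisy test frame.

```python
import math

def log_gauss_diag(y, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at vector y."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (yi - m) ** 2 / v
        for yi, m, v in zip(y, mean, var)
    )

def posteriors(y, weights, means, variances):
    """Component posteriors p(k | y), normalised stably in the log domain."""
    logs = [math.log(w) + log_gauss_diag(y, m, v)
            for w, m, v in zip(weights, means, variances)]
    top = max(logs)
    exps = [math.exp(l - top) for l in logs]
    total = sum(exps)
    return [e / total for e in exps]

def train_biases(noisy, clean, weights, means, variances):
    """Per-component bias: posterior-weighted mean of (clean - noisy) frames."""
    n_comp, dim = len(weights), len(noisy[0])
    num = [[0.0] * dim for _ in range(n_comp)]
    den = [0.0] * n_comp
    for y, x in zip(noisy, clean):
        post = posteriors(y, weights, means, variances)
        for k in range(n_comp):
            den[k] += post[k]
            for d in range(dim):
                num[k][d] += post[k] * (x[d] - y[d])
    # Guard against components that received (almost) no frames.
    return [[n / max(den[k], 1e-12) for n in num[k]] for k in range(n_comp)]

def compensate_mmse(y, weights, means, variances, biases):
    """MMSE-style clean estimate: y plus the posterior-weighted bias."""
    post = posteriors(y, weights, means, variances)
    return [yd + sum(p * b[d] for p, b in zip(post, biases))
            for d, yd in enumerate(y)]

# Toy, assumed noisy-feature GMM with two well-separated 1-D components.
WEIGHTS = [0.5, 0.5]
MEANS = [[0.0], [10.0]]
VARIANCES = [[1.0], [1.0]]

# Toy stereo data: frames near 0 are offset by -1, frames near 10 by +2.
NOISY = [[-0.5], [0.0], [0.5], [9.5], [10.0], [10.5]]
CLEAN = [[-1.5], [-1.0], [-0.5], [11.5], [12.0], [12.5]]

BIASES = train_biases(NOISY, CLEAN, WEIGHTS, MEANS, VARIANCES)
ESTIMATE = compensate_mmse([0.2], WEIGHTS, MEANS, VARIANCES, BIASES)
```

In a real system the frames would be MFCC vectors and the compensated features would then be scored by the SV back end. A maximum-likelihood variant would typically apply only the bias of the most probable component rather than the posterior-weighted sum.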
This article is indexed in ScienceDirect and other databases.