Simplified supervised i-vector modeling with application to robust and efficient language identification and speaker verification
Affiliation: 1. Signal Analysis and Interpretation Laboratory (SAIL), University of Southern California, Los Angeles, CA 90089, USA; 2. Sun Yat-Sen University Carnegie Mellon University Joint Institute of Engineering, Sun Yat-Sen University, Guangzhou, China; 3. Sun Yat-Sen University Carnegie Mellon University Shunde International Joint Research Institute, Shunde, China
Abstract: This paper presents a simplified and supervised i-vector modeling approach with applications to robust and efficient language identification and speaker verification. First, the label vector is appended to the mean supervector and the linear regression matrix is appended to the i-vector factor loading matrix, which extends the traditional i-vectors to label-regularized supervised i-vectors. These supervised i-vectors are optimized not only to reconstruct the mean supervectors well but also to minimize the mean square error between the original and the reconstructed label vectors, which makes them more discriminative with respect to the label information. Second, factor analysis (FA) is performed on the pre-normalized centered GMM first-order statistics supervector so that each Gaussian component's statistics sub-vector is treated equally in the FA, which reduces the computational cost by a factor of 25 in the simplified i-vector framework. Third, since the entire matrix inversion term in the simplified i-vector extraction depends on a single variable (the total frame number), we build a global table of the resulting matrices indexed by the logarithm of the frame number. Using this lookup table, each utterance's simplified i-vector extraction is further sped up by a factor of 4 at the cost of only a small quantization error. Finally, a simplified version of the supervised i-vector modeling is proposed to enhance both robustness and efficiency. The proposed methods are evaluated on the DARPA RATS dev2 task, the NIST LRE 2007 general task and the NIST SRE 2010 female condition 5 task for noisy channel language identification, clean channel language identification and clean channel speaker verification, respectively.
For language identification on the DARPA RATS data, the simplified supervised i-vector modeling achieved 2%, 16%, and 7% relative equal error rate (EER) reductions on three different feature sets and ran more than 100 times faster than the baseline i-vector method on the 120 s task. Similar results were observed on the NIST LRE 2007 30 s task, with a 7% relative average cost reduction. Results also show that using Gammatone frequency cepstral coefficients, Mel-frequency cepstral coefficients and spectro-temporal Gabor features in conjunction with shifted-delta-cepstral features improves the overall language identification performance significantly. For speaker verification, the proposed supervised i-vector approach outperforms the i-vector baseline by 12% and 7% relative in terms of EER and norm old minDCF values, respectively.
Keywords: Language identification; Speaker verification; I-vector; Supervised i-vector; Simplified i-vector; Simplified supervised i-vector
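
The lookup-table step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact form of the matrix inversion term, so the code assumes, purely for illustration, that it has the shape (I + n·A)^-1, where n is the total frame number and A is a fixed symmetric positive semi-definite matrix. The names `build_inverse_table`, `lookup_inverse`, the grid size and the frame-count range are all hypothetical and are not taken from the paper.

```python
import numpy as np


def build_inverse_table(A, n_min, n_max, num_bins):
    """Precompute (I + n*A)^-1 on a grid of frame counts evenly spaced in log(n).

    Assumption: the inversion term depends on the utterance only through n.
    """
    dim = A.shape[0]
    log_grid = np.linspace(np.log(n_min), np.log(n_max), num_bins)
    table = np.stack(
        [np.linalg.inv(np.eye(dim) + np.exp(g) * A) for g in log_grid]
    )
    return log_grid, table


def lookup_inverse(log_grid, table, n_frames):
    """Return the stored inverse whose log frame count is nearest to log(n_frames)."""
    idx = int(np.argmin(np.abs(log_grid - np.log(n_frames))))
    return table[idx]


# Hypothetical fixed matrix standing in for the utterance-independent term.
rng = np.random.default_rng(0)
B = rng.standard_normal((8, 8))
A = B @ B.T / 8.0

log_grid, table = build_inverse_table(A, n_min=10, n_max=100000, num_bins=400)

# Compare the table entry with the exact inverse for one utterance length.
n_frames = 3000
approx = lookup_inverse(log_grid, table, n_frames)
exact = np.linalg.inv(np.eye(8) + n_frames * A)
rel_err = np.linalg.norm(approx - exact) / np.linalg.norm(exact)
print(f"relative quantization error: {rel_err:.4f}")
```

Spacing the grid evenly in log(n) keeps the relative error in n roughly constant across short and long utterances, which is one plausible reason for indexing the table by log values. A finer grid trades memory for a smaller quantization error.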
This article is indexed in ScienceDirect and other databases.