From single to multiple enrollment i-vectors: Practical PLDA scoring variants for speaker verification
Affiliation: 1. Speech and Image Processing Unit, School of Computing, University of Eastern Finland, Joensuu, Finland; 2. School of Computing and Electrical Engineering, Indian Institute of Technology Mandi, Himachal Pradesh, India; 1. Audio & Speech Processing Lab, School of Computer Engineering, Iran University of Science & Technology, Tehran, Iran; 2. Computer Engineering Department, Faculty of Engineering, Arak University, Arak, Iran; 3. Electrical & Computer Engineering Department, K.N. Toosi University of Technology, Tehran, Iran; 1. RT-RK d.o.o., Novi Sad, Serbia; 2. Faculty of Engineering, University of Novi Sad, Novi Sad, Serbia; 3. University of Rochester, Rochester, NY, USA; 1. INRS-EMT, University of Quebec, Montreal, Quebec, Canada; 2. CRIM, Montreal, Quebec, Canada; 1. Unitec Institute of Technology, 1 Carrington Rd., Mt Albert, Auckland, New Zealand; 2. The University of Auckland, Auckland, New Zealand
Abstract: The availability of multiple utterances (and hence, i-vectors) for speaker enrollment opens up several alternatives for using them with probabilistic linear discriminant analysis (PLDA). This paper provides an overview of their effective use from a practical viewpoint. We derive expressions for evaluating the likelihood ratio in the multi-enrollment case, with details on the computation of the required matrix inversions and determinants. The performance of five different scoring methods and the effect of i-vector length normalization are compared experimentally. We conclude that length normalization is a useful technique for all but one of the scoring methods considered, and that averaging i-vectors is the most effective of the methods compared. We also study the application of multicondition training to the PLDA model. Our experiments indicate that multicondition training is more effective in estimating PLDA hyperparameters than it is for likelihood computation. Finally, we look at the effect of the configuration of the enrollment data on PLDA scoring, studying the properties of conditional dependence and the number of enrollment utterances per target speaker. Our experiments indicate that these properties affect the performance of the PLDA model. These results further support the conclusion that i-vector averaging is a simple and effective way to process multiple enrollment utterances.
Keywords: i-vector; Probabilistic linear discriminant analysis; Multiple enrollment; Speaker verification
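
As a rough illustration of two operations named in the abstract, length normalization and i-vector averaging, the following minimal sketch collapses several enrollment i-vectors into a single vector that could then be passed to a single-enrollment PLDA scorer. The function names `length_normalize` and `average_ivectors`, the toy two-dimensional vectors, and the choice to normalize before averaging (without re-normalizing the average) are assumptions made for illustration; they are not taken from the paper, and no PLDA scoring is implemented here.

```python
import math


def length_normalize(ivector):
    """Scale an i-vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in ivector))
    if norm == 0.0:
        raise ValueError("cannot length-normalize a zero vector")
    return [x / norm for x in ivector]


def average_ivectors(ivectors, normalize=True):
    """Average several enrollment i-vectors into one vector.

    Assumption: each i-vector is length-normalized before averaging and
    the average itself is not re-normalized. The paper may order or
    combine these steps differently.
    """
    if not ivectors:
        raise ValueError("need at least one enrollment i-vector")
    dim = len(ivectors[0])
    if any(len(v) != dim for v in ivectors):
        raise ValueError("all i-vectors must have the same dimension")
    if normalize:
        ivectors = [length_normalize(v) for v in ivectors]
    count = len(ivectors)
    return [sum(v[i] for v in ivectors) / count for i in range(dim)]


# Toy enrollment set (hypothetical values; real i-vectors typically
# have several hundred dimensions).
enrollment = [[3.0, 4.0], [0.0, 2.0]]
speaker_model = average_ivectors(enrollment)
```

The resulting `speaker_model` stands in for the single enrollment i-vector, so any scoring routine written for the one-utterance case can be reused unchanged, which is one reason averaging is simple to deploy.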
This article is indexed in ScienceDirect and other databases.