From single to multiple enrollment i-vectors: Practical PLDA scoring variants for speaker verification

Affiliation:	1. Speech and Image Processing Unit, School of Computing, University of Eastern Finland, Joensuu, Finland; 2. School of Computing and Electrical Engineering, Indian Institute of Technology Mandi, Himachal Pradesh, India; 1. Audio & Speech Processing Lab, School of Computer Engineering, Iran University of Science & Technology, Tehran, Iran; 2. Computer Engineering Department, Faculty of Engineering, Arak University, Arak, Iran; 3. Electrical & Computer Engineering Department, K.N. Toosi University of Technology, Tehran, Iran; 1. RT-RK d.o.o., Novi Sad, Serbia; 2. Faculty of Engineering, University of Novi Sad, Novi Sad, Serbia; 3. University of Rochester, Rochester, NY, USA; 1. INRS-EMT, University of Quebec, Montreal, Quebec, Canada; 2. CRIM, Montreal, Quebec, Canada; 1. Unitec Institute of Technology, 1 Carrington Rd., Mt Albert, Auckland, New Zealand; 2. The University of Auckland, Auckland, New Zealand

Abstract:	The availability of multiple utterances (and hence multiple i-vectors) for speaker enrollment opens up several alternatives for using them with probabilistic linear discriminant analysis (PLDA). This paper provides an overview of their effective utilization from a practical viewpoint. We derive expressions for evaluating the likelihood ratio in the multi-enrollment case, with details on computing the required matrix inversions and determinants. The performance of five different scoring methods and the effect of i-vector length normalization are compared experimentally. We conclude that length normalization is a useful technique for all but one of the scoring methods considered, and that averaging i-vectors is the most effective of the methods compared. We also study the application of multicondition training to the PLDA model. Our experiments indicate that multicondition training is more effective in estimating PLDA hyperparameters than it is for likelihood computation. Finally, we look at the effect of the configuration of the enrollment data on PLDA scoring, studying the properties of conditional dependence and the number of enrollment utterances per target speaker. Our experiments indicate that these properties affect the performance of the PLDA model. These results further support the conclusion that i-vector averaging is a simple and effective way to process multiple enrollment utterances.
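
As a rough illustration of the i-vector averaging and length normalization mentioned in the abstract, the following minimal Python sketch averages several enrollment i-vectors into one and scores it against a test i-vector. It does not reproduce the paper's derivations or its five scoring variants. It assumes a simplified two-covariance PLDA model with centered i-vectors, a between-speaker covariance `B` and a within-speaker covariance `W`. All function names, the choice to re-normalize after averaging, and the toy covariances are hypothetical.

```python
import numpy as np


def length_normalize(x):
    """Scale each i-vector (last axis) to unit Euclidean norm."""
    x = np.asarray(x, dtype=float)
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def average_enrollment(ivectors, normalize=True):
    """Collapse several enrollment i-vectors into a single averaged one.

    Assumption: each i-vector is length-normalized before averaging and the
    mean is normalized again afterwards; the paper may order these differently.
    """
    ivectors = np.asarray(ivectors, dtype=float)
    if normalize:
        ivectors = length_normalize(ivectors)
    mean = ivectors.mean(axis=0)
    return length_normalize(mean) if normalize else mean


def gaussian_logpdf(x, cov):
    """Log-density of a zero-mean Gaussian with covariance `cov` at `x`."""
    d = x.shape[0]
    _, logdet = np.linalg.slogdet(cov)
    quad = x @ np.linalg.solve(cov, x)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + quad)


def plda_llr(enroll, test, B, W):
    """Log-likelihood ratio: same speaker versus different speakers.

    Assumed model: i-vector = speaker mean + noise, with the speaker mean
    drawn from N(0, B) and the noise from N(0, W), so the total covariance
    is T = B + W. Under the same-speaker hypothesis the two i-vectors
    share a speaker mean, which adds the cross-covariance B.
    """
    T = B + W
    zeros = np.zeros_like(B)
    pair = np.concatenate([enroll, test])
    cov_same = np.block([[T, B], [B, T]])
    cov_diff = np.block([[T, zeros], [zeros, T]])
    return gaussian_logpdf(pair, cov_same) - gaussian_logpdf(pair, cov_diff)


# Toy usage with made-up 4-dimensional i-vectors and covariances.
enroll = average_enrollment([[1.0, 0.0, 0.0, 0.0], [0.8, 0.6, 0.0, 0.0]])
B = np.eye(4)
W = 0.1 * np.eye(4)
score = plda_llr(enroll, length_normalize([0.9, 0.3, 0.1, 0.0]), B, W)
```

Treating the averaged i-vector as if it were a single enrollment observation keeps the scoring cost independent of the number of enrollment utterances, which is one reason averaging is attractive in practice. In a real system `B` and `W` would be estimated from training data, not set by hand.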

Keywords:	i-vector; Probabilistic linear discriminant analysis; Multiple enrollment; Speaker verification
This article is indexed in ScienceDirect and other databases.
