Speaker identification using multi-step clustering algorithm with transformation-based GMM |
| |
Authors: | Limin Xu, Zhenmin Tang |
| |
Affiliation: | (1) Department of Electronic Commerce, School of International Economics and Business, Nanjing University of Finance and Economics, 210094 Nanjing, Jiangsu, China; (2) School of Computer Science, Nanjing University of Science and Technology, 210094 Nanjing, Jiangsu, China |
| |
Abstract: | To improve the performance of speaker recognition, an embedded linear transformation is used to integrate both the transformation
and the diagonal-covariance Gaussian mixture into a unified framework. In this case, the mixture number of the GMM must be fixed in
model training. The cluster expectation-maximization (EM) algorithm is a well-known technique in which the mixture number
is regarded as an estimated parameter. This paper presents a new model structure that integrates a multi-step cluster algorithm
into the estimation process of the GMM with the embedded transformation. In this approach, the transformation matrix, the mixture
number and the model parameters are simultaneously estimated according to a maximum likelihood criterion. The proposed method
is demonstrated on a database of three data sessions for text-independent speaker identification. The experiments show that
this method outperforms the traditional GMM with the cluster EM algorithm.
This text was submitted by the authors in English. |
| |
Keywords: | speaker identification; Gaussian mixture model; multi-step cluster algorithm; linear transformation; expectation-maximization algorithm |
This article has been indexed by SpringerLink and other databases. |
|