

Voiced/non-voiced speech classification using adaptive thresholding with bivariate EMD
Authors: Md. Khademul Islam Molla, Keikichi Hirose, Md. Kamrul Hasan
Affiliation: 1. Graduate School of Information Science and Technology, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8656, Japan
2. Department of Computer Science and Engineering, The University of Rajshahi, Rajshahi, Bangladesh
3. Department of Electrical and Electronics Engineering, Bangladesh University of Engineering and Technology, Dhaka, Bangladesh
Abstract: This paper introduces a robust voiced/non-voiced (VnV) speech classification method using bivariate empirical mode decomposition (bEMD). Fractional Gaussian noise (fGn) is employed as the reference signal to derive a data-adaptive threshold for VnV discrimination. The speech signal under analysis and the fGn are combined into a complex signal, which bEMD decomposes into a finite number of complex-valued intrinsic mode functions (IMFs). The real and imaginary parts of these IMFs represent the IMFs of the observed speech and of the fGn, respectively. The log-energies of both sets of IMFs are calculated. The IMF log-energy representation of fGn is similar to that of unvoiced speech. Hence, the upper confidence limit of the fGn IMF log-energies is used as the data-adaptive threshold for VnV classification. If the subband log-energy of a speech segment exceeds the threshold, the segment is classified as voiced; otherwise it is classified as unvoiced. The experimental results show that the proposed algorithm outperforms recently reported methods over a wide range of SNRs, without requiring any training data.
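The thresholding step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the complex-valued IMFs have already been produced by some bEMD routine (not shown here), and it stands in for the upper confidence limit with a fixed offset above the fGn log-energies. The names `imf_log_energies`, `classify_segment`, and `margin`, the base-2 logarithm, and the "any subband exceeds" decision rule are all illustrative assumptions, since the abstract does not specify them.

```python
import numpy as np


def imf_log_energies(imfs, eps=1e-12):
    """Return log2 of the mean-square energy of each IMF.

    imfs: array of shape (n_imfs, n_samples).
    eps guards against log(0) for an all-zero IMF.
    """
    imfs = np.asarray(imfs, dtype=float)
    return np.log2(np.mean(imfs ** 2, axis=1) + eps)


def classify_segment(complex_imfs, margin=1.0):
    """Label one segment from its complex-valued IMFs.

    Real parts are treated as the speech IMFs and imaginary parts as the
    fGn IMFs, as stated in the abstract. `margin` is an ASSUMED stand-in
    for the upper confidence limit; the paper's actual limit is not
    given in the abstract.
    """
    complex_imfs = np.asarray(complex_imfs)
    speech_le = imf_log_energies(complex_imfs.real)
    noise_le = imf_log_energies(complex_imfs.imag)
    threshold = noise_le + margin  # data-adaptive: follows the fGn reference
    # ASSUMPTION: a segment counts as voiced if any subband exceeds its threshold.
    return "voiced" if np.any(speech_le > threshold) else "unvoiced"
```

A caller would pass one array of complex IMFs per analysis segment and tune `margin` (or replace it with a properly estimated confidence limit) for the data at hand.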
This article is indexed in SpringerLink and other databases.