Improved voice activity detection algorithm using wavelet and support vector machine
Authors:Shi-Huang Chen  Rodrigo Capobianco Guido  Trieu-Kien Truong  Yaotsu Chang
Affiliation:1. Department of Computer Science and Information Engineering, Shu-Te University, Kaohsiung County, 824, Taiwan, ROC;2. University of São Paulo (USP), Institute of Physics at São Carlos (IFSC), Department of Physics and Informatics (FFI), Avenida Trabalhador São-carlense 400, 13566-590 São Carlos, SP, Brazil;3. Department of Information Engineering, I-Shou University, Kaohsiung County, 840, Taiwan, ROC;4. Department of Applied Mathematics, I-Shou University, Kaohsiung County, 840, Taiwan, ROC
Abstract:This paper proposes an improved voice activity detection (VAD) algorithm using wavelets and a support vector machine (SVM) for the European Telecommunications Standards Institute (ETSI) adaptive multi-rate (AMR) narrow-band (NB) and wide-band (WB) speech codecs. First, based on the wavelet transform, the original IIR filter bank and pitch/tone detector are reimplemented, respectively, as a wavelet filter bank and a wavelet-based pitch/tone detection algorithm. The wavelet filter bank divides the input speech signal into several frequency bands so that the signal power level in each sub-band can be calculated. In addition, the background noise level in each sub-band is estimated using the wavelet de-noising method. The wavelet filter bank is also used to detect correlated complex signals such as music. The proposed algorithm then applies an SVM to train an optimized non-linear VAD decision rule whose inputs are the sub-band power, noise level, pitch period, tone flag, and complex-signal warning flag of the input speech signal. With the trained SVM, the proposed VAD algorithm produces more accurate detection results. Experiments on the Aurora speech database under different noise conditions show that the proposed algorithm outperforms AMR-NB VAD Options 1 and 2 and the AMR-WB VAD.
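The sub-band power computation described in the abstract can be illustrated with a minimal sketch. It assumes a Haar wavelet and a three-level dyadic decomposition; the abstract does not state which wavelet or how many levels the authors use, and the names `haar_step` and `subband_powers` are hypothetical, chosen only for this illustration.

```python
import math


def haar_step(x):
    """One level of the orthonormal Haar transform.

    Returns (approximation, detail): the low-pass and high-pass halves.
    """
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def subband_powers(frame, levels=3):
    """Mean power per wavelet sub-band of one frame.

    The result has levels + 1 entries, ordered from the highest-frequency
    detail band down to the final low-frequency approximation band.
    Assumption: Haar filters stand in for the paper's unspecified wavelet.
    """
    if levels < 1 or len(frame) == 0 or len(frame) % (2 ** levels) != 0:
        raise ValueError("frame length must be a positive multiple of 2**levels")
    powers = []
    approx = list(frame)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        powers.append(sum(d * d for d in detail) / len(detail))
    powers.append(sum(a * a for a in approx) / len(approx))
    return powers
```

In a VAD of the kind the abstract describes, a vector like this would be computed per frame and combined with the other features (noise level, pitch period, tone flag, complex-signal flag) before being passed to the classifier; that later stage is not shown here.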
This article is indexed in ScienceDirect and other databases.