     


Spoken emotion recognition through optimum-path forest classification using glottal features
Authors: Alexander I. Iliev, Michael S. Scordilis, João P. Papa, Alexandre X. Falcão
Affiliation: 1. Department of Electrical and Computer Engineering, University of Miami, Coral Gables, FL, USA; 2. Institute of Computing, University of Campinas, Campinas, São Paulo, Brazil
Abstract: A new method for the recognition of spoken emotions is presented, based on features of the glottal airflow signal. Its effectiveness is tested on the new optimum-path forest (OPF) classifier as well as on six previously established classification methods: the Gaussian mixture model (GMM), support vector machine (SVM), artificial neural network – multi-layer perceptron (ANN-MLP), the k-nearest neighbor rule (k-NN), the Bayesian classifier (BC), and the C4.5 decision tree. The speech database used in this work was collected in an anechoic environment with ten speakers (5 male, 5 female), each speaking ten sentences in four different emotions: Happy, Angry, Sad, and Neutral. The glottal waveform was extracted from fluent speech via inverse filtering. The investigated features included the glottal symmetry and MFCC vectors of various lengths, computed both for the glottal signal and for the corresponding speech signal. Experimental results indicate that the best performance is obtained with glottal-only features, with SVM and OPF generally providing the highest recognition rates; performance with GMM, or with the combination of glottal and speech features, was comparatively lower. For this text-dependent, multi-speaker task, the top-performing classifiers achieved perfect recognition rates in the case of 6th-order glottal MFCCs.
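The pipeline described in the abstract (inverse-filter the speech to approximate the glottal waveform, compute low-order MFCCs on it, classify per utterance) can be illustrated with a minimal sketch. This is not the authors' implementation: it substitutes a crude LPC residual for a proper glottal inverse-filtering method, and the file names, labels, LPC order, and SVM settings are assumptions made only for illustration.

```python
# Illustrative sketch only: LPC-residual "glottal" estimate + 6th-order MFCCs + SVM.
# File paths, labels, lpc_order, and classifier settings are assumed, not from the paper.
import numpy as np
import librosa
from scipy.signal import lfilter
from sklearn.svm import SVC

def glottal_mfcc(path, sr=16000, lpc_order=16, n_mfcc=6):
    """Approximate the glottal excitation by LPC inverse filtering, then average MFCCs."""
    y, sr = librosa.load(path, sr=sr)
    a = librosa.lpc(y, order=lpc_order)      # all-pole vocal-tract model A(z)
    residual = lfilter(a, [1.0], y)          # inverse filter: residual ~ glottal excitation
    mfcc = librosa.feature.mfcc(y=residual, sr=sr, n_mfcc=n_mfcc)
    return mfcc.mean(axis=1)                 # one fixed-length vector per utterance

# Hypothetical (path, emotion) pairs; the real corpus has 10 speakers x 10 sentences x 4 emotions.
dataset = [("happy_01.wav", "Happy"), ("angry_01.wav", "Angry"),
           ("sad_01.wav", "Sad"), ("neutral_01.wav", "Neutral")]

X = np.array([glottal_mfcc(p) for p, _ in dataset])
y = np.array([label for _, label in dataset])

clf = SVC(kernel="rbf", C=1.0).fit(X, y)     # SVM is one of the classifiers compared in the paper
print(clf.predict(X))                        # sanity check; real evaluation needs held-out test data
```

Any of the other classifiers mentioned (GMM, k-NN, decision tree, etc.) could be swapped in for the SVM on the same feature vectors; the OPF classifier itself is not part of scikit-learn and would require a separate implementation.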
Keywords:
This article has been indexed in ScienceDirect and other databases.