A new approach of audio emotion recognition
Affiliation: 1. Department of Computer Science & Networked System, Sunway University, 46150 Petaling Jaya, Malaysia; 2. School of Engineering, Edith Cowan University, WA 6027, Australia; 3. Intel Microelectronics (M) Sdn. Bhd., 11900 Pulau Pinang, Malaysia
Abstract: This paper proposes a new architecture for intelligent audio emotion recognition that uses both prosodic and spectral features. The architecture has two main parallel paths and can recognize six emotions. Path 1 is designed from an intensive analysis of different prosodic features, in which the prosodic features that significantly differentiate emotions are identified. Path 2 is designed from an analysis of spectral features: extraction of the Mel-Frequency Cepstral Coefficient (MFCC) feature is followed by Bi-directional Principal Component Analysis (BDPCA), Linear Discriminant Analysis (LDA) and Radial Basis Function (RBF) neural classification. This path consists of three parallel BDPCA + LDA + RBF sub-paths, each handling two emotions. Fusion modules are also proposed for weight assignment and decision making. The performance of the proposed architecture is evaluated on the eNTERFACE’05 and RML databases. Simulation results and comparisons indicate good performance of the proposed recognizer.
Keywords: Audio emotion recognition; RBF neural network; Prosodic features; MFCC feature
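
The BDPCA step of the spectral path can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give matrix sizes or the number of retained components, so the function names (`fit_bdpca`, `transform_bdpca`), the 13 x 40 matrix shape, the values of `k_row` and `k_col`, and the random data standing in for MFCC matrices are all assumptions made for illustration. The sketch follows the standard BDPCA formulation, in which each sample is a 2D matrix that is projected along both its row and column directions.

```python
import numpy as np


def fit_bdpca(mats, k_row, k_col):
    """Fit BDPCA on a stack of 2D feature matrices of shape (N, m, n).

    Returns the mean matrix, the column-direction projector (m x k_col)
    and the row-direction projector (n x k_row).
    """
    mats = np.asarray(mats, dtype=float)
    mean = mats.mean(axis=0)
    centered = mats - mean
    count = len(mats)

    # Row-direction scatter: average of C^T C over samples, shape (n, n).
    s_row = np.einsum("nij,nik->jk", centered, centered) / count
    # Column-direction scatter: average of C C^T over samples, shape (m, m).
    s_col = np.einsum("nji,nki->jk", centered, centered) / count

    # eigh returns eigenvalues in ascending order, so reverse the columns
    # to put the leading eigenvectors first, then keep the top k.
    _, vec_row = np.linalg.eigh(s_row)
    _, vec_col = np.linalg.eigh(s_col)
    w_row = vec_row[:, ::-1][:, :k_row]
    w_col = vec_col[:, ::-1][:, :k_col]
    return mean, w_col, w_row


def transform_bdpca(mats, mean, w_col, w_row):
    """Project each matrix X to W_col^T (X - mean) W_row, shape (k_col, k_row)."""
    centered = np.asarray(mats, dtype=float) - mean
    return w_col.T @ centered @ w_row


# Hypothetical stand-in data: 20 utterances, 13 coefficients x 40 frames each.
rng = np.random.default_rng(0)
fake_mfcc = rng.normal(size=(20, 13, 40))

mean, w_col, w_row = fit_bdpca(fake_mfcc, k_row=6, k_col=4)
reduced = transform_bdpca(fake_mfcc, mean, w_col, w_row)
print(reduced.shape)  # one (k_col, k_row) matrix per utterance
```

In a pipeline like the one the abstract describes, each reduced matrix would presumably be flattened and passed to LDA and then to an RBF classifier; those stages are omitted here because the abstract does not specify their configuration.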
This article is indexed in ScienceDirect and other databases.