Vocal fatigue induced by prolonged oral reading: Analysis and detection
Affiliation: 1. LIPADE, Paris Descartes University, 45 rue des Saints Pères, 75006 Paris, France; 2. STIH, Paris Sorbonne University, 28 rue Serpente, 75006 Paris, France
Abstract: This article uses corpora of prolonged oral reading to analyze and detect vocal fatigue. Vocal fatigue is a particular concern for voice professionals, including teachers, telemarketing operators, users of automatic speech recognition technology, and actors. We conducted three main experiments: (1) a prosodic analysis whose results can be compared with those of related work; (2) a two-class Support Vector Machine (SVM) classifier that separates Fatigue from Non-Fatigue states using a large set of audio features; and (3) a comparison function that estimates the difference in fatigue level between two speech segments by combining multiple phoneme-based comparison functions. The prosodic analysis showed that vocal fatigue was not associated with an increase in fundamental frequency and voice intensity. The two-class SVM classifier, using the Paralinguistic Challenge 2010 audio feature set, reached an unweighted accuracy of 94.1% on the training set (10-fold cross-validation) and 68.2% on the test set. These results show that vocal fatigue can be modeled and detected. The comparison function was assessed on the task of detecting an increase in fatigue level between two speech segments. Its Equal Error Rate (EER) was 31% using all phonetic segments, 21% after filtering phonetic segments, and 19% after filtering both phonetic segments and cepstral features. These results show that some phonemes are more sensitive than others to vocal fatigue. Together, the experiments show that the fatigued voice has specific characteristics in prolonged oral reading and suggest that vocal fatigue detection is feasible.
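The two metrics quoted in the abstract can be illustrated with a minimal Python sketch. It is not the authors' evaluation code: the function names, the label convention (1 for Fatigue or for a pair whose fatigue level increased, 0 otherwise), and the threshold sweep used to approximate the EER are all assumptions made for illustration. Unweighted accuracy is taken here as the mean of the per-class recalls, and the EER as the error rate at the threshold where false acceptances and false rejections are closest.

```python
from typing import Sequence


def unweighted_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of per-class recalls, so each class counts equally whatever its size."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == c]
        hits = sum(1 for i in idx if y_pred[i] == c)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)


def equal_error_rate(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Approximate EER by sweeping thresholds over the observed scores.

    Assumed convention (hypothetical): a higher score means "more fatigued",
    label 1 marks a positive trial and label 0 a negative one.
    """
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    best_gap, best_eer = float("inf"), 1.0
    for thr in sorted(set(scores)):
        far = sum(1 for s in neg if s >= thr) / len(neg)  # false acceptance rate
        frr = sum(1 for s in pos if s < thr) / len(pos)   # false rejection rate
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer
```

With imbalanced classes, unweighted accuracy differs from plain accuracy, which is presumably why it is the figure reported for the Fatigue versus Non-Fatigue classifier.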
This article is indexed in ScienceDirect and other databases.