On robustness of speech based biometric systems against voice conversion attack
Affiliation:1. Département génie électrique, Ecole Mohamamdia d’Ingénieurs (EMI), Université Mohammed V Agdal, Rabat, Morocco;2. Laboratoire de Recherche en Economie de l’Energie, Environnement et Ressources, Département d’Economie, University Caddy Ayyad, Marrakech, Morocco;1. Simsoft Computer Technologies, Middle East Technical University, Teknokent Bolgesi, 06800 Ankara, Turkey;2. Microsoft, 1 Microsoft Way, Redmond, WA 98052, United States;3. Computer Engineering, Middle East Technical University, 06800 Ankara, Turkey;1. College of Finance, Nanjing Agricultural University, Nanjing 210095, Jiangsu, China;2. School of Economics and Management, Southeast University, Nanjing 210096, Jiangsu, China;1. Virtual Systems Research Centre, University of Skövde, Skövde 54128, Sweden;2. Department of Electrical and Computer Engineering, Michigan State University, East Lansing, MI 48824, USA;3. Amazon Development Centre (India) Pvt. Ltd., Bengaluru 560055, India;4. Department of Mechanical Engineering, Walchand College of Engineering, Sangli, Maharashtra 416415, India;5. General Motors R&D Center, Warren, MI 48090, USA
Abstract: Voice conversion (VC), which morphs the voice of a source speaker so that it is perceived as spoken by a specified target speaker, can be used intentionally to deceive speaker identification (SID) and speaker verification (SV) systems that rely on speech biometrics. Voice conversion spoofing attacks that imitate a particular speaker pose a potential threat to such systems. In this paper, we first present an experimental study that evaluates the robustness of these systems against voice conversion disguise. We use Gaussian mixture model (GMM) based SID systems, GMM with universal background model (GMM-UBM) based SV systems, and GMM supervector with support vector machine (GMM-SVM) based SV systems. Voice conversion is carried out with three techniques: a GMM-based VC technique, a weighted frequency warping (WFW) based conversion method, and a variant of WFW in which energy correction is disabled (WFW−). The evaluation uses intra-gender and cross-gender voice conversions between fifty male and fifty female speakers taken from the TIMIT database. The effect of an attack is measured as the degradation in the percentage of correct identification (POC) for SID systems and the degradation in equal error rate (EER) for SV systems. Experimental results show that GMM-SVM SV systems are more resilient against voice conversion spoofing attacks than GMM-UBM SV systems, and that all SID and SV systems are more vulnerable to GMM-based conversion than to WFW-based or WFW−-based conversion. The results also indicate that, in general terms, all SID and SV systems are slightly more robust to voices produced by cross-gender conversion than to those produced by intra-gender conversion. We then extend the study to examine the relationship between the VC objective score and SV system performance on the CMU ARCTIC database, which is a parallel corpus. The results of this experiment suggest a way of quantifying a voice conversion objective score that can be related to the ability to spoof an SV system.
Keywords: Speaker identification; Speaker verification; Voice conversion; GMM; WFW; SVM
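
The abstract reports attack impact through two metrics, POC for identification and EER for verification. As a rough illustration of how such metrics can be computed from trial outcomes, here is a minimal Python sketch. The function names `equal_error_rate` and `percent_correct` and all score values are hypothetical and chosen only for illustration; this is not the paper's evaluation code, and the EER here is a simple threshold-sweep approximation rather than an interpolated value.

```python
def equal_error_rate(genuine_scores, impostor_scores):
    """Approximate the EER by trying every observed score as a threshold."""
    thresholds = sorted(set(genuine_scores) | set(impostor_scores))
    best_gap, eer = None, None
    for thr in thresholds:
        # False acceptance: an impostor trial scores at or above the threshold.
        far = sum(s >= thr for s in impostor_scores) / len(impostor_scores)
        # False rejection: a genuine trial scores below the threshold.
        frr = sum(s < thr for s in genuine_scores) / len(genuine_scores)
        gap = abs(far - frr)
        # Keep the operating point where the two error rates are closest.
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


def percent_correct(predicted_ids, true_ids):
    """Percentage of identification trials whose predicted speaker is correct."""
    hits = sum(p == t for p, t in zip(predicted_ids, true_ids))
    return 100.0 * hits / len(true_ids)


# Toy verification scores (assumed values, not data from the paper).
genuine = [0.9, 0.8, 0.7, 0.6]
plain_impostors = [0.1, 0.2, 0.3, 0.4]
converted_impostors = [0.1, 0.2, 0.65, 0.85]  # converted voices score higher

baseline_eer = equal_error_rate(genuine, plain_impostors)
spoofed_eer = equal_error_rate(genuine, converted_impostors)
eer_degradation = spoofed_eer - baseline_eer
```

In this toy setup the gap between `spoofed_eer` and `baseline_eer` plays the role of the EER degradation described in the abstract; a larger gap would indicate a system that is easier to spoof under these assumptions.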
This article is indexed in ScienceDirect and other databases.