1.

Stochastic feature compensation methods for speaker verification in noisy environments

《Applied Soft Computing》2014

This paper explores the significance of stereo-based stochastic feature compensation (SFC) methods for robust speaker verification (SV) in mismatched training and test environments. Gaussian mixture model (GMM)-based SFC methods developed in the past have been restricted solely to speech recognition tasks. This paper proposes applying these algorithms in an SV framework for background noise compensation. A priori knowledge of the test environment and the availability of stereo training data are assumed. During the training phase, Mel frequency cepstral coefficient (MFCC) features extracted from a speaker's noisy and clean speech utterances (stereo data) are used to build front-end GMMs. During the evaluation phase, noisy test utterances are transformed on the basis of a minimum mean squared error (MMSE) or maximum likelihood (MLE) estimate, using the target speaker GMMs. Experiments conducted on the NIST-2003-SRE database, with clean speech utterances artificially degraded by different types of additive noise, reveal that the proposed SV systems consistently outperform baseline SV systems in mismatched conditions across all noisy background environments.
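As a rough illustration of the MMSE transformation described above, the following sketch uses a SPLICE-style estimator on scalar features: a GMM over noisy features supplies component posteriors, and each component carries a bias learned from stereo (noisy, clean) pairs. The abstract does not give the exact estimator, so the function names, the scalar features, and the bias-correction form are assumptions made for illustration, not the paper's method.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def component_posteriors(y, weights, means, variances):
    """Posterior p(k | y) of each component of a GMM over noisy features."""
    likes = [w * gaussian_pdf(y, m, v) for w, m, v in zip(weights, means, variances)]
    total = sum(likes)
    return [like / total for like in likes]


def train_biases(noisy, clean, weights, means, variances):
    """Posterior-weighted mean of (clean - noisy) per component, from stereo data."""
    num = [0.0] * len(weights)
    den = [0.0] * len(weights)
    for y, x in zip(noisy, clean):
        for k, p in enumerate(component_posteriors(y, weights, means, variances)):
            num[k] += p * (x - y)
            den[k] += p
    return [n / d if d > 0.0 else 0.0 for n, d in zip(num, den)]


def mmse_compensate(y, weights, means, variances, biases):
    """Assumed SPLICE-style estimate: noisy feature plus posterior-weighted bias."""
    post = component_posteriors(y, weights, means, variances)
    return y + sum(p * r for p, r in zip(post, biases))
```

In practice the features would be MFCC vectors with diagonal or full covariances. The scalar case only keeps the structure of the estimator visible.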

2.

Supervector-based approaches in a discriminative framework for speaker verification in noisy environments

Sourjya Sarkar K. Sreenivasa Rao 《International Journal of Speech Technology》2017,20(2):387-416

This paper explores the robustness of supervector-based speaker modeling approaches for speaker verification (SV) in noisy environments. Speaker modeling is carried out in two different frameworks: (i) the combined Gaussian mixture model-support vector machine (GMM-SVM) method and (ii) the total variability modeling method. In the GMM-SVM method, supervectors obtained by concatenating the means of adapted speaker GMMs are used to train speaker-specific SVMs during the training/enrollment phase of SV. During the evaluation/testing phase, noisy test utterances transformed into supervectors are subjected to SVM-based pattern matching and classification. In the total variability modeling method, large supervectors are reduced to a low-dimensional, channel-robust vector (i-vector) prior to SVM training and subsequent evaluation. Special emphasis is placed on the significance of an utterance partitioning technique for mitigating data imbalance and utterance duration mismatches. An adaptive boosting algorithm is proposed in the total variability modeling framework to enhance the accuracy of SVM classifiers. Experiments performed on the NIST-SRE-2003 database, with training and test utterances corrupted by additive noise, indicate that the aforementioned modeling methods outperform the standard GMM-universal background model (GMM-UBM) framework for SV. The use of utterance partitioning and adaptive boosting in the speaker modeling frameworks results in substantial performance improvements under degraded conditions.
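The supervector construction and the idea of utterance partitioning can be sketched in a few lines. This is a minimal illustration, assuming the adapted GMM means are already available as lists. The helper names are hypothetical, and the contiguous split is a simplification that may differ from the partitioning scheme used in the paper.

```python
def mean_supervector(component_means):
    """Concatenate per-component mean vectors into one flat supervector."""
    return [value for mean in component_means for value in mean]


def partition_frames(frames, num_parts):
    """Split a frame sequence into num_parts contiguous, near-equal segments.

    Each segment can then be adapted and turned into its own supervector,
    giving the SVM more target-speaker training examples.
    """
    size, extra = divmod(len(frames), num_parts)
    parts, start = [], 0
    for i in range(num_parts):
        end = start + size + (1 if i < extra else 0)
        parts.append(frames[start:end])
        start = end
    return parts
```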

3.

A novel text-independent speaker verification method based on the global speaker model

Yiying Zhang Zhang D. Xiaoyan Zhu 《IEEE transactions on systems, man, and cybernetics. Part A, Systems and humans : a publication of the IEEE Systems, Man, and Cybernetics Society》2000,30(5):598-602

This correspondence introduces a new text-independent speaker verification method, derived from the basic idea in pattern recognition that the discriminating ability of a classifier can be improved by removing the information common to all classes. By looking for the common speech characteristics of a group of speakers, a global speaker model can be established. Subtracting the score obtained from this model normalizes the conventional likelihood score, yielding a more compact score distribution and lower equal error rates. Several experiments are carried out to demonstrate the effectiveness of the proposed method.
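The normalization described above amounts to a difference of log-likelihood scores. The sketch below assumes per-frame log-likelihoods have already been computed under the claimed speaker's model and under the global speaker model. The function names and the default threshold are illustrative choices, not taken from the paper.

```python
def average_loglik(frame_logliks):
    """Average per-frame log-likelihood of an utterance under one model."""
    return sum(frame_logliks) / len(frame_logliks)


def normalized_score(speaker_frame_logliks, global_frame_logliks):
    """Speaker score minus the score from the global speaker model."""
    return average_loglik(speaker_frame_logliks) - average_loglik(global_frame_logliks)


def verify(speaker_frame_logliks, global_frame_logliks, threshold=0.0):
    """Accept the identity claim when the normalized score reaches the threshold."""
    return normalized_score(speaker_frame_logliks, global_frame_logliks) >= threshold
```

For example, `normalized_score([-1.0, -3.0], [-4.0, -4.0])` gives `2.0`, so the claim is accepted at the default threshold.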

4.

Sample-specific late classifier fusion for speaker verification

Hasheminejad Mohammad Farsi Hassan 《Multimedia Tools and Applications》2018,77(12):15273-15289

Due to the mismatch between training and test conditions, speaker verification in real environments continues to be a challenging problem. An effective way of...

5.

Three-stage speaker verification architecture in emotional talking environments

Ismail Shahin Ali Bou Nassif 《International Journal of Speech Technology》2018,21(4):915-930

6.

A kernel trick for sequences applied to text-independent speaker verification systems

Johnny Mariéthoz Samy Bengio 《Pattern recognition》2007,40(8):2315-2324

This paper presents a principled SVM-based speaker verification system. We propose a new framework and a new sequence kernel that can make use of any Mercer kernel at the frame level. An extension of the sequence kernel based on the Max operator is also proposed. The new system is compared to state-of-the-art GMM and other SVM-based systems found in the literature on the Banca and Polyvar databases. In most cases the new system outperforms the other systems by a statistically significant margin. Finally, the proposed framework clarifies previous SVM-based systems and suggests interesting future research directions.
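One way to build a sequence kernel from a frame-level kernel is to average the frame kernel over all frame pairs, with a Max variant that keeps only each frame's best match. The sketch below follows that idea. The abstract does not state the exact formulas, so both definitions, the RBF frame kernel, and the `gamma` parameter are assumptions for illustration.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Frame-level RBF kernel (a Mercer kernel) between two feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def mean_sequence_kernel(seq_x, seq_y, frame_kernel=rbf_kernel):
    """Average of the frame kernel over every pair of frames."""
    total = sum(frame_kernel(x, y) for x in seq_x for y in seq_y)
    return total / (len(seq_x) * len(seq_y))


def max_sequence_kernel(seq_x, seq_y, frame_kernel=rbf_kernel):
    """Symmetrized average of each frame's best match in the other sequence."""
    best_x = sum(max(frame_kernel(x, y) for y in seq_y) for x in seq_x) / len(seq_x)
    best_y = sum(max(frame_kernel(x, y) for x in seq_x) for y in seq_y) / len(seq_y)
    return 0.5 * (best_x + best_y)
```

Both functions accept sequences of different lengths, which is what allows an SVM to compare whole utterances rather than single frames.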

7.

A novel biometric system for signature verification based on score level fusion approach

Dhieb Thameur Boubaker Houcine Njah Sourour Ben Ayed Mounir Alimi Adel M. 《Multimedia Tools and Applications》2022,81(6):7817-7845

The active modality of handwriting is broadly related to signature verification in the context of biometric user authentication systems. Signature verification...

8.

A verification approach to applied system security

Achim D. Brucker Burkhart Wolff 《International Journal on Software Tools for Technology Transfer (STTT)》2005,7(3):233-247

We present a method for the security analysis of realistic models over off-the-shelf systems and their configuration by formal, machine-checked proofs. The presentation follows a large case study based on a formal security analysis of a CVS-Server architecture. The analysis is based on an abstract architecture (enforcing a role-based access control), which is refined to an implementation architecture (based on the usual discretionary access control provided by the POSIX environment). Both architectures serve as a skeleton to formulate access control and confidentiality properties. Both the abstract and the implementation architecture are specified in the language Z. Based on a logical embedding of Z into Isabelle/HOL, we provide formal, machine-checked proofs for consistency properties of the specification, for the correctness of the refinement, and for security properties.

9.

Application of an improved PSO-SVM to speaker verification

Jing Xinxing, Yang Yimin, Liu Tao 《Computer Engineering and Applications》2011,47(33):106-108

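To address the tendency of particle swarm optimization (PSO) to converge prematurely, an improved PSO algorithm is proposed. When the current particles become trapped in a local optimum, the algorithm mutates a subset of the particles with a certain probability according to the average particle distance, thereby enlarging the swarm's global search capability. The improved PSO algorithm is used to train a support vector machine, which is then applied in a speaker recognition system. Experiments show that the improved PSO algorithm improves both convergence speed and recognition accuracy.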
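The mutation step can be sketched as follows: measure how tightly the swarm has collapsed, and if it is tighter than a threshold, perturb each particle with some probability. The abstract gives no formulas, so the distance measure (mean Euclidean distance to the centroid), the threshold test, and the Gaussian perturbation are all assumptions, and the function and parameter names are hypothetical.

```python
import math
import random


def average_particle_distance(positions):
    """Mean Euclidean distance of the particles from the swarm centroid."""
    dims = len(positions[0])
    centroid = [sum(p[d] for p in positions) / len(positions) for d in range(dims)]
    return sum(math.dist(p, centroid) for p in positions) / len(positions)


def mutate_if_stagnant(positions, threshold, probability, sigma, rng):
    """Return a new swarm; perturb particles only when the swarm has collapsed."""
    if average_particle_distance(positions) >= threshold:
        return [list(p) for p in positions]  # still diverse: leave unchanged
    mutated = []
    for p in positions:
        if rng.random() < probability:
            # Assumed mutation: add zero-mean Gaussian noise to every coordinate.
            mutated.append([x + rng.gauss(0.0, sigma) for x in p])
        else:
            mutated.append(list(p))
    return mutated
```

A call such as `mutate_if_stagnant(swarm, 0.5, 0.3, 1.0, random.Random(0))` would sit inside the usual PSO velocity and position update loop, which is omitted here.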

10.

Multiple ellipses detection in noisy environments: A hierarchical approach

Zhi-Yong Liu Hong Qiao 《Pattern recognition》2009,42(11):2421-2433

Detection of multiple ellipses in noisy environments is a basic yet challenging task in many vision-related problems. The key difficulty lies in distinguishing the pixels pertaining to each target in the presence of noise. To tackle this issue, we propose a hierarchical approach motivated by the fact that any segment of an ellipse can identify itself in ellipse reconstruction. First, we find all the neat edges without any branches and fit an ellipse to each of them. Second, target candidates are estimated from the neat edges by a proposed grouping strategy. Finally, the targets are detected from the candidates by a proposed selective competitive algorithm that distinguishes the true pixels of each target. A real application of the proposed method is illustrated, in addition to some other demonstrative experiments.

11.

Gammatone filterbank and symbiotic combination of amplitude and phase-based spectra for robust speaker verification under noisy conditions and compression artifacts

Fedila M. Bengherabi M. Amrouche A. 《Multimedia Tools and Applications》2018,77(13):16721-16739

The main novelty of this work resides in incorporating a Gammatone filter-bank as a substitute of the Mel filter-bank in the extraction pipeline of the Product...

12.

Feature extraction for speaker recognition with short utterances in noisy environments

Gao Huixian, Ma Quanfu, Zheng Xiaoshi 《Journal of Computer Applications》2010,30(10):2712-2714

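To enable a speaker recognition system to achieve a high recognition rate even with short utterances and in noisy environments, the extracted feature parameters are studied on the basis of a vector quantization recognition algorithm. The wavelet transform is combined with the extraction of Mel frequency cepstral coefficients (MFCC), and the resulting improved feature is combined with the spectral centroid feature, giving a new combined feature parameter: Mel frequency wavelet transform coefficients plus spectral centroid (MFWTC+SC). Experiments show that this combined feature can effectively improve the performance of the speaker recognition system.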
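The spectral centroid part of the combined feature is simple to compute: it is the magnitude-weighted mean frequency of a frame's spectrum. The sketch below shows that calculation and how the value might be appended to an existing per-frame feature vector. The helper names are hypothetical, and the cepstral part of the feature (MFWTC) is assumed to be computed elsewhere.

```python
def spectral_centroid(frequencies, magnitudes):
    """Magnitude-weighted mean frequency of one spectral frame."""
    total = sum(magnitudes)
    if total == 0.0:
        return 0.0  # silent frame: no meaningful centroid
    return sum(f * m for f, m in zip(frequencies, magnitudes)) / total


def append_centroid(feature_vector, frequencies, magnitudes):
    """Combine an existing per-frame feature vector with the spectral centroid."""
    return list(feature_vector) + [spectral_centroid(frequencies, magnitudes)]
```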

13.

Noise robust speaker verification via the fusion of SNR-independent and SNR-dependent PLDA

Xiaomin Pang Man-Wai Mak 《International Journal of Speech Technology》2015,18(4):633-648

14.

Novel hybrid DNN approaches for speaker verification in emotional and stressful talking environments

Shahin Ismail Nassif Ali Bou Nemmour Nawel Elnagar Ashraf Alhudhaif Adi Polat Kemal 《Neural computing & applications》2021,33(23):16033-16055

In this work, we conducted an empirical comparative study of the performance of text-independent speaker verification in emotional and stressful environments....

15.

Two-space variability compensation technique for speaker verification in short length and reverberant environments

Flavio J. Reyes-Díaz Gabriel Hernández-Sierra José R. Calvo de Lara 《International Journal of Speech Technology》2017,20(3):475-485

The performance of state-of-the-art speaker verification in uncontrolled environments is affected by different sources of variability. Short-duration variability is very common in these scenarios and causes speaker verification performance to drop quickly as the duration of the verification utterances decreases. Linear discriminant analysis (LDA) is the most common session variability compensation algorithm; nevertheless, it has shortcomings when trained with insufficient data. In this paper we introduce two methods for session variability compensation to deal with short utterances in the i-vector space. The first method incorporates the short-duration variability information into the within-class variance estimation process. The second compensates the session and short-duration variabilities in two different spaces with LDA algorithms (2S-LDA). First, we analyze the behavior of the within-class and between-class scatters in the first proposed method. Then, both proposed methods are evaluated on telephone sessions from NIST SRE-08 for different durations of the evaluation utterances: full (average 2.5 min), 20, 15, 10 and 5 s. The 2S-LDA method obtains good results across the short-utterance conditions in the evaluations, with an average relative EER improvement of 1.58% over the best baseline (WCCN[LDA]). Finally, we apply the 2S-LDA method to speaker verification in reverberant environments, using different reverberant conditions from the Reverb challenge 2013, obtaining improvements of 8.96% and 23% under matched and mismatched reverberant conditions, respectively.

16.

A nonlinear autoregressive model for speaker verification

Sundararajan Srinivasan Tao Ma Georgios Lazarou Joseph Picone 《International Journal of Speech Technology》2014,17(1):17-25

Gaussian Mixture Models (GMM) have been the most popular approach in speaker recognition and verification for over two decades. The inefficiencies of this model for signals such as speech are well documented and include an inability to model temporal dependencies that result from nonlinearities in the speech signal. The resulting models are often complex and overdetermined, which leads to a lack of generalization. In this paper, we present a nonlinear mixture autoregressive model (MixAR) that attempts to directly model nonlinearities in the trajectories of the speech features. We apply this model to the problem of speaker verification. Experiments with synthetic data demonstrate the viability of the model. Evaluations on standard speech databases, including TIMIT, NTIMIT, and NIST-2001, demonstrate that MixAR, using only half the number of parameters and only static features, can achieve a lower equal error rate when compared to GMMs, particularly in the presence of previously unseen noise. Performance as a function of the duration of both the training and evaluation utterances is also analyzed.

17.

SVM speaker verification based on prosodic features

Huang Xiaozhong, Li Hui, Xu Dongxing, Guo Wei 《Computer Engineering and Applications》2011,47(15):148-151

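A text-independent speaker verification system based on prosodic features and SVM is proposed. Using wavelet analysis, suprasegmental prosodic features are extracted from the MFCC, F0, and energy trajectories of the speech signal. The best complementary feature-level fusion of the three sets of prosodic features is studied experimentally, yielding the prosodic feature PMFCCFE. GMM mean supervectors of the prosodic features are used as parameters to train the SVM model of the target speaker, so as to distinguish target speakers from impostors more effectively. Experiments on the NIST06 8side-1side database show that, relative to a baseline GMM-UBM system using short-time cepstral parameters, the GMM-SVM system with suprasegmental prosodic features achieves a relative EER reduction of 57.9% and a relative MinDCF reduction of 41.4%.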

18.

Significance of duration modification for speaker verification under mismatch speech tempo condition

Rohan Kumar Das Bidisha Sharma S. R. Mahadeva Prasanna 《International Journal of Speech Technology》2018,21(3):401-408

This work explores the scope of duration modification for speaker verification (SV) under mismatched speech tempo conditions. SV performance is found to depend on the speaking rate of a speaker. A mismatch in speaking rate can degrade the performance of a system, which makes it a crucial issue for deployable systems. In this work, SV performance is analyzed by varying the speaking rate of the training and test speech. Based on these studies, a framework is proposed to compensate for the mismatch in speech tempo. The framework changes the duration of the test speech, in terms of speaking rate, according to a mismatch factor derived between the training and test speech. This in turn matches the speech tempo of the test speech to that of the claimed speaker model. The proposed approach is found to have a significant impact on SV performance under mismatched conditions. A set of practical data with mismatched speech tempo is also used to cross-validate the framework.

19.

Symbolic problem solving in noisy, novel and uncertain task environments

《International journal of man-machine studies》1992,36(2):145

20.

A novel approach for multimodal medical image fusion

《Expert systems with applications》2014,41(16):7425-7435

Fusion of multimodal medical images increases robustness and enhances accuracy in biomedical research and clinical diagnosis, and it has attracted much attention over the past decade. In this paper, an efficient multimodal medical image fusion approach based on compressive sensing is presented to fuse computed tomography (CT) and magnetic resonance imaging (MRI) images. The significant sparse coefficients of the CT and MRI images are acquired via a multi-scale discrete wavelet transform. A proposed weighted fusion rule is used to fuse the high-frequency coefficients of the source medical images, while the pulse coupled neural network (PCNN) fusion rule is used to fuse the low-frequency coefficients. A random Gaussian matrix is used for encoding and measurement. The fused image is reconstructed via the Compressive Sampling Matched Pursuit (CoSaMP) algorithm. To show the efficiency of the proposed approach, several comparative experiments are conducted. The results reveal that the proposed approach achieves better fused image quality than existing state-of-the-art methods. Furthermore, the approach offers high stability, good flexibility, and low time consumption.