Similar Literature
 Found 20 similar documents (search time: 15 ms)
1.
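To improve the recognition rate and robustness of speaker verification systems under channel variability, a sparse representation classification algorithm based on i-vectors and weighted linear discriminant analysis (WLDA) is proposed. First, the channel compensation and dimensionality reduction properties of WLDA are used to remove channel interference from the i-vectors and to reduce their dimensionality. Next, an overcomplete dictionary matrix of training speech samples is built on the i-vector set, and the MAP algorithm is used to solve for the sparse coefficient vector of the test utterance over the dictionary. Finally, the test sample is reconstructed from the sparse coefficient vector, and the target speaker is determined from the reconstruction error. Simulation experiments verify the effectiveness and feasibility of the algorithm.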
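The final step of the abstract above, picking the speaker whose dictionary atoms reconstruct the test vector with the smallest error, can be sketched as follows. This is only an illustration under stated assumptions: the sparse solver is a plain orthogonal matching pursuit swapped in for the MAP solver named in the abstract, and the names `omp`, `src_classify`, and `n_nonzero` are hypothetical.

```python
import numpy as np

def omp(D, y, n_nonzero):
    """Greedy orthogonal matching pursuit (an assumed stand-in solver)."""
    residual = y.copy()
    support = []
    coef = np.zeros(D.shape[1])
    for _ in range(n_nonzero):
        # Pick the atom most correlated with the current residual.
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k not in support:
            support.append(k)
        # Re-fit all selected atoms by least squares.
        sol, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        coef[:] = 0.0
        coef[support] = sol
        residual = y - D @ coef
    return coef

def src_classify(D, labels, y, n_nonzero=3):
    """Return the label with the smallest class-wise reconstruction error."""
    coef = omp(D, y, n_nonzero)
    errors = {}
    for lab in np.unique(labels):
        mask = labels == lab
        # Reconstruct y using only the coefficients of this speaker's atoms.
        errors[lab] = np.linalg.norm(y - D[:, mask] @ coef[mask])
    return min(errors, key=errors.get), errors
```

Here the columns of `D` would hold unit-norm training vectors (for example, compensated i-vectors) and `labels` the speaker of each column; a verification system would additionally compare the winning error against a threshold.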

2.
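This paper proposes a recognition approach based on a Cohort similarity measure. The M speakers most similar to the target speaker are selected from outside the training set and used to compute an (M+1)-dimensional Gaussian mixture, the Cohort model, which describes the speaker model and can largely eliminate the mismatch present in existing systems. Experiments show that the proposed Cohort-based method improves performance by about 15%, demonstrating its feasibility and applicability.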

3.
In this work we develop a speaker recognition system based on excitation source information and demonstrate its significance by comparing it with a system based on vocal tract information. The speaker-specific excitation information is extracted by subsegmental, segmental and suprasegmental processing of the LP residual. The speaker-specific information from each level is modeled independently using Gaussian mixture model-universal background model (GMM-UBM) modeling and then combined at the score level. The significance of the proposed system is demonstrated by speaker verification experiments on the NIST-03 database. Two tests are conducted, namely a Clean test and a Noisy test. In the Clean test, the test speech signal is used as is for verification. In the Noisy test, the test speech is corrupted by factory noise (9 dB) before verification. Although the proposed source based system performs worse than the vocal tract based system in the Clean test, it performs better in the Noisy test. Finally, in both the clean and noisy cases, by providing different and robust speaker-specific evidence, the proposed system helps the vocal tract system to further improve overall performance.

4.
Security is a major problem in web based or remote access to databases. In the present study, the technique of committee neural networks was developed for speech based speaker verification. Speech data from the designated speaker and several imposters were obtained. Several parameters were extracted in the time and frequency domains and fed to neural networks. Several neural networks were trained, and the five best performing networks were recruited into the committee. The committee decision was based on majority voting of the member networks. The committee opinion was evaluated with further testing data. The committee correctly identified the designated speaker in 100% of the cases (50 out of 50) and rejected imposters in 100% of the cases (150 out of 150). The committee decision was not unanimous in the majority of the cases tested.
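The majority voting rule described above can be sketched in a few lines. The function name `committee_decision` and the string labels are assumptions made for illustration; the abstract does not say how the member networks encode their outputs.

```python
from collections import Counter

def committee_decision(votes):
    """Return (majority label, True if every member agreed)."""
    counts = Counter(votes)
    label, top = counts.most_common(1)[0]
    return label, top == len(votes)

# Five member networks, as in the abstract; an odd committee with two
# possible labels cannot tie.
decision, unanimous = committee_decision(
    ["accept", "accept", "reject", "accept", "reject"]
)
```

Returning the unanimity flag alongside the label makes it possible to count how often the committee disagreed internally, which is the quantity the last sentence of the abstract reports.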

5.
Our initial speaker verification study exploring the impact of mismatch between training and test conditions found that mismatch in sensor and acoustic environment causes significant performance degradation compared to other mismatches such as language and style (Haris et al. in Int. J. Speech Technol., 2012). In this work we present a method to suppress the mismatch between training and test speech, specifically that due to sensor and acoustic environment. The method is based on identifying and emphasizing vowel-like regions (VLRs), which are more speaker specific and less affected by mismatch than the other speech regions. VLRs are separated from the speech regions (detected using voice activity detection (VAD)) using the VLR onset point (VLROP) and are processed independently during training and testing of the speaker verification system. Finally, the scores are combined, with more weight given to the score generated by the VLRs because they are relatively more speaker specific and less affected by mismatch. Speaker verification studies are conducted using mel-frequency cepstral coefficients (MFCCs) as feature vectors. Speaker modeling is done using the Gaussian mixture model-universal background model and the state-of-the-art i-vector based approach. The experimental results show that, for both systems, the proposed approach provides consistent performance improvement over the conventional approach with and without different channel compensation techniques. For instance, on the IITG-MV Phase-II dataset with headphone-trained and voice-recorder test speech, the proposed approach provides a relative improvement of 25.08% (in EER) for the i-vector based speaker verification system with LDA and WCCN compared to the conventional approach.
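As a rough illustration of the score combination and of how a relative EER improvement is computed, consider the sketch below. The weight value, the function names, and the EER figures in the usage lines are assumptions for illustration only; the abstract states only that the VLR score receives more weight.

```python
def fuse_scores(vlr_score, other_score, vlr_weight=0.7):
    """Weighted sum of two scores; the VLR score must get the larger weight."""
    if not 0.5 < vlr_weight <= 1.0:
        raise ValueError("vlr_weight must be in (0.5, 1.0]")
    return vlr_weight * vlr_score + (1.0 - vlr_weight) * other_score

def relative_improvement(baseline_eer, proposed_eer):
    """Relative EER reduction in percent (lower EER is better)."""
    return 100.0 * (baseline_eer - proposed_eer) / baseline_eer

fused = fuse_scores(2.0, 1.0, vlr_weight=0.75)   # 0.75*2.0 + 0.25*1.0
gain = relative_improvement(8.0, 6.0)            # made-up EER values
```

A relative improvement is measured against the baseline EER, so the same absolute reduction counts for more when the baseline error is already small.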

6.
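Speaker verification based on PCA and kernel Fisher discriminant   Total citations: 1, self-citations: 0, citations by others: 1
To address the poor real-time performance of the kernel Fisher discriminant in speaker verification, a speaker verification method based on PCA and the kernel Fisher discriminant is proposed. PCA reduces the dimensionality of the feature vectors and removes redundancy, which lowers the complexity of subsequent computation and speeds up verification. A kernel-based Fisher discriminant then verifies the speaker, improving the real-time performance of the system as a whole. Experiments verify the effectiveness of the method.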
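The PCA stage that precedes the discriminant can be sketched with NumPy as below. This is a minimal sketch assuming features are stored one vector per row; the function names are hypothetical, and the kernel Fisher discriminant stage is not shown.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return the data mean and the top principal directions (as rows)."""
    mean = X.mean(axis=0)
    # SVD of the centered data; rows of Vt are directions sorted by variance.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

def pca_transform(X, mean, components):
    """Project feature vectors onto the retained principal directions."""
    return (X - mean) @ components.T
```

Every later kernel evaluation then operates on `n_components` values per vector instead of the full feature dimension, which is where the speed-up claimed in the abstract would come from.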

7.
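Since the I-Vector (identity vector) was proposed, I-Vector based speaker verification systems have rapidly replaced systems based on GMM supervectors and become popular. The I-Vector-SVM system, one of them, has a distinct advantage in speaker verification, where training samples are usually scarce, but its performance depends heavily on the kernel function. Therefore, following the idea of multiple kernel learning (MKL), an I-Vector based multiple kernel learning SVM speaker verification system is built and compared with an I-Vector-SVM baseline system. Experiments on the NIST corpus show that the I-Vector based multiple kernel learning system achieves some performance improvement over the baseline.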

8.
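In the speaker space, speech features vary with the sentence and over time. This variation is mainly caused by changes in the phonetic information and the speaker information contained in the speech data. If these two kinds of information can be separated from each other, robust speaker recognition can be achieved. Assuming that the space with large speaker variation is the "phonetic space" and the space with small speaker variation is the "speaker space", phonetic and speaker information are separated by a subspace method, and speaker identification and speaker verification methods are proposed. Comparative experiments against conventional methods show that a robust speaker model can be built from a small amount of training data.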

9.
This study analyzes the effect of degradation on human and automatic speaker verification (SV) tasks. The perceptual test is conducted with subjects who have knowledge of speaker verification. An automatic SV system is developed using Mel-frequency cepstral coefficients (MFCC) and a Gaussian mixture model (GMM). Human and automatic speaker verification performance is compared for clean training and different degraded test conditions. Speech signals are reconstructed in clean and degraded conditions by highlighting different speaker specific information and compared through a perceptual test. The perceptual cues that the human subjects used as speaker specific information are investigated, and their importance in degraded conditions is highlighted. The difference in the nature of human and automatic SV tasks is investigated in terms of falsely accepted and falsely rejected speech pairs. A discussion of human versus automatic speaker verification is presented, and the possibility of improving automatic speaker verification under degraded conditions is suggested.

10.
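This article analyzes two anti-models, the universal background model and the cohort model. Building on an improved speaker verification model based on both, a Bayesian adaptation algorithm is introduced into a speaker verification system based on the Gaussian mixture universal background model, which resolves the model mismatch problem in speaker verification. Experiments and analysis on a text-independent test speech corpus show that the improved algorithm achieves better recognition results.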
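The abstract does not spell out its Bayesian adaptation step. A common form in GMM-UBM systems is relevance-factor MAP adaptation of the component means, sketched below on the assumption that this is the kind of update intended. The function name, the `relevance` value, and the restriction to mean-only adaptation are all assumptions.

```python
import numpy as np

def map_adapt_means(ubm_means, responsibilities, X, relevance=16.0):
    """Shift UBM means toward speaker data, in proportion to the soft counts.

    ubm_means:        (n_components, dim) background model means
    responsibilities: (n_frames, n_components) component posteriors per frame
    X:                (n_frames, dim) speaker feature vectors
    """
    n_k = responsibilities.sum(axis=0)                  # soft frame counts
    # Posterior-weighted mean of the speaker data for each component.
    E_k = (responsibilities.T @ X) / np.maximum(n_k, 1e-10)[:, None]
    alpha = (n_k / (n_k + relevance))[:, None]          # data-dependent weight
    return alpha * E_k + (1.0 - alpha) * ubm_means
```

Components that see little speaker data keep their background means, while well-observed components move toward the speaker statistics.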

11.
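Sparse representation has become a focus of speaker verification research because of its excellent classification performance. Constructing the overcomplete dictionary is the key step and directly affects performance. To improve the robustness of speaker verification systems and to address the noise and channel interference present in the overcomplete dictionary, a principal component sparse representation dictionary learning algorithm based on i-vectors is proposed. The algorithm extracts speaker i-vectors on the basis of a Gaussian universal background model and applies within-class covariance normalization (WCCN) to them for channel compensation. It then estimates the channel offset space from the mean vectors of the channel-compensated speaker i-vectors, extracts low-dimensional principal components of the channel offset in that space by principal component analysis, and uses them to recompute the speaker i-vectors, further suppressing channel interference. The new i-vectors serve as dictionary atoms of a highly robust overcomplete sparse representation dictionary. In the test stage, the sparse coefficient vector of the test utterance's i-vector is found over this dictionary, and the target speaker is determined from the error with which the coefficient vector reconstructs the test i-vector. Simulation experiments show that the algorithm achieves good recognition performance.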
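The WCCN step mentioned above can be sketched as follows, using the usual construction in which the projection `B` satisfies `B @ B.T = inv(W)` for the average within-speaker covariance `W`. The function names and the small `ridge` term added for numerical stability are assumptions; the later PCA-based channel offset step is not shown.

```python
import numpy as np

def wccn_matrix(X, speakers, ridge=1e-6):
    """Estimate the WCCN projection from row vectors X and speaker labels."""
    dim = X.shape[1]
    W = np.zeros((dim, dim))
    ids = np.unique(speakers)
    for s in ids:
        Xs = X[speakers == s]
        centered = Xs - Xs.mean(axis=0)
        W += centered.T @ centered / len(Xs)   # covariance of one speaker
    W = W / len(ids) + ridge * np.eye(dim)     # average over speakers
    # Lower-triangular B with B @ B.T == inv(W).
    return np.linalg.cholesky(np.linalg.inv(W))

def apply_wccn(X, B):
    """Map each row vector x to B.T @ x."""
    return X @ B
```

After the projection, the average within-speaker covariance of the transformed vectors is approximately the identity, so directions dominated by session or channel variation no longer dominate distance computations.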

12.
Finger-vein verification has drawn increasing attention because it is a highly secure and private biometric in practical applications. However, as the imaging environment is affected by many factors, the captured image contains not only the vein pattern but also noise and irregular shadowing, which can decrease verification accuracy. To address this problem, in this paper we propose a new finger-vein extraction approach that detects valley-like structures using curvatures in Radon space. Firstly, given a pixel, we obtain eight patches centered on it by rotating a window along eight different orientations and project the resulting patches into Radon space using the Radon transform. Secondly, the vein patches create prominent valleys in Radon space, and the vein patterns are enhanced according to the curvature values of the valleys. Finally, the vein network is extracted from the enhanced image by a binarization scheme and matched for personal verification. The experimental results on both contact and contactless finger-vein databases show that our approach can significantly improve the accuracy of the finger-vein verification system.

13.
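The strand space model is a practical, intuitive, and rigorous formal method for analyzing security protocols. This paper outlines the architecture of AVSP, a security protocol verification tool developed on this model by combining theorem proving and model checking, and proposes several pruning rules for the state search space. Verification of the weak agreement authentication property of the Needham-Schroeder protocol shows that these pruning rules effectively shrink the state search space and prevent state space explosion.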

14.
In this paper, a curve fitting space (CFS) is presented to map non-linearly separable data to linearly separable data. A linear or quadratic transformation maps data into a new space for better classification, provided the transformation method is properly chosen. This new CFS space can be of high or low dimensionality; the number of dimensions is generally low and equals the number of classes. The CFS method is based on fitting a hyperplane or curve to the learning data or enclosing them in a hypersurface. In the proposed method, the hyperplanes, curves, or cortex become the axes of the new space. In the new space, a linear multi-class support vector machine classifier is applied to classify the learning data.

15.
Methods of data system analysis for designing collective neural network classifiers are considered. Methods of local balancing of sign graphs and algorithms for selecting system behavior stereotypes are proposed for constructing the competence areas of local classifiers. The connection graph is formed on the basis of statistical dependences between variables of the feature space. Decisions of local classifiers are integrated by voting. Experimental results for a real database with a high degree of non-homogeneity are shown.

16.
Multimedia Tools and Applications - In this paper, we analyze the application of sparse representation of speech signal frames to speaker verification. It has recently been shown that...

17.
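To improve the performance of Chinese handwritten signature verification systems, a hierarchical classification method combining the extreme learning machine and sparse representation is proposed. First, the strong generalization ability and robustness of the extreme learning machine are used to classify forgeries that are easier to identify, such as random forgeries. Next, the precise descriptive power of sparse representation classification is used to design a signature data dictionary and classify forgeries that are harder to identify, such as skilled forgeries. Experimental results show that the hierarchical signature verification method achieves the highest overall accuracy, 95.53%, compared with two state-of-the-art methods.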

18.
We present a study of sera derived from the malaria medical analysis of 189 subjects. The feature space is 18-dimensional and each serum is represented by a binary number. The subjects are divided into three groups: no malaria, clinical malaria and asymptomatic subjects. We studied the main characteristics of the data and selected 7 of the 18 antigens as the most important for group discrimination. We propose a novel representation of the data in the so-called relational space, where the coded data of pairs of patients are plotted. We are able to separate the groups with 58% accuracy, about 15 percentage points better than several conventional methods with which we compare our results.

19.
On the classification of toppoints in scale space   Total citations: 2, self-citations: 0, citations by others: 2
An algebraic classification scheme for toppoints in scale space is proposed. A critical point is a point whose spatial derivatives are zero, and a toppoint is a critical point in which the Hessian does not have full rank. A critical curve is a curve consisting of critical points. It is proposed that toppoints be classified according to the number of critical curves that intersect at the toppoint. Toppoints are analyzed further when one or two critical curves pass through the toppoint. If the Hessian is of rank one, a single critical curve passes through the toppoint; an extremum and a saddle point meet along a curve with a horizontal tangent at the toppoint. Two possibilities exist in the generic case. Either an extremum and a saddle point approach each other with increasing scale and disappear at the toppoint, or an extremum and a saddle are created at the toppoint and diverge from each other at increasing scale. If two critical curves intersect at a toppoint, there are again two different cases when Gaussian blurring is used. In the first case the Hessian has rank zero and the two curves intersect with horizontal perpendicular tangents. Two subcases occur, depending on the root structure of the cubic form of the Taylor expansion of the image at the toppoint. If this cubic form has a single real root, two extrema and two saddles approach the toppoint from below and disappear at the toppoint. The two extrema approach each other along a tangent from opposite directions. Similarly, the two saddles approach each other along the perpendicular tangent. If the cubic form has three distinct real roots, two saddles approach each other from below along a tangent from opposite directions. Above the toppoint the two saddles separate from each other in the perpendicular direction. In the second case the Hessian has rank one and there also exists another algebraic constraint on the coefficients of the cubic form of the Taylor expansion of the image at the toppoint. Two critical curves intersect at the toppoint. Again, two subcases occur, depending on whether the curves are real or complex. If the curves are real, there are two tangents at the intersection point but they are not horizontal, as in the previous subcase. An extremum and a saddle approach each other from below. Above the toppoint an extremum and a saddle separate from each other. The saddle above the toppoint moves in the same direction as the extremum below the toppoint, and the extremum above the toppoint moves in the same direction as the saddle below the toppoint. If the curves are complex, the toppoint is an isolated point in scale space.

20.
Speaker verification is a challenging problem in speaker recognition where the objective is to determine whether a segment of speech in fact comes from a specific individual. In supervised machine learning terms this is a challenging problem because, while examples belonging to the target class are easy to gather, the set of counter-examples is completely open. This makes it difficult to cast the task as a supervised classification problem, since a representative set of counter-examples is hard to construct. We therefore cast it as a one-class classification problem and evaluate a variety of state-of-the-art one-class classification techniques on a benchmark speech recognition dataset. We construct this as a two-level classification process whereby, at the lower level, speech segments of 20 ms in length are classified, and then a decision on a complete speech sample is made by aggregating these component classifications. We show that, of the one-class classification techniques we evaluate, Gaussian Mixture Models show the best performance on this task.
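The two-level decision described above can be illustrated with a small sketch. The abstract does not state the aggregation rule, so the fraction-of-accepted-segments rule here is an assumption, as are the function name, both thresholds, and the convention that a higher score means a better match to the target model.

```python
def verify_utterance(frame_scores, frame_threshold=0.0, accept_ratio=0.5):
    """Accept the sample if enough 20 ms segments pass the one-class test."""
    if len(frame_scores) == 0:
        raise ValueError("need at least one segment score")
    # Lower level: classify each segment against the target model.
    accepted = sum(1 for s in frame_scores if s > frame_threshold)
    # Upper level: aggregate the segment decisions into one verdict.
    return accepted / len(frame_scores) > accept_ratio
```

Averaging the raw segment scores before thresholding would be an equally plausible aggregation rule; voting on hard decisions is less sensitive to a few extreme segments.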
