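Gao Xinjian (高新建), Qu Dan (屈丹), Li Bicheng (李弼程). Journal of Computer Applications (《计算机应用》), 2007, 27(10): 2602-2604
In speaker verification, the score distributions of target speakers and impostors are bimodal, and the score distributions differ across target speaker models, which makes it difficult to set a single threshold for all speakers and degrades system performance. Score normalization adjusts the threshold by adjusting the impostor score distribution. The paper briefly introduces the two most commonly used normalization methods, zero normalization (Z-Norm) and test normalization (T-Norm), and then focuses on D-Norm, a new normalization method based on the KL distance. Combining the advantages of Z-Norm and D-Norm, a further method, ZD-Norm, is proposed. The performance of the four normalization methods is compared. Experiments show that ZD-Norm improves speaker verification performance more effectively than either Z-Norm or D-Norm.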
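
The two baseline methods named in this abstract, Z-Norm and T-Norm, both shift and scale a raw verification score by the mean and standard deviation of a set of impostor scores; they differ only in where those impostor scores come from. The following is a minimal sketch under that standard definition, not the authors' implementation. The function and parameter names are illustrative assumptions, and D-Norm and ZD-Norm are not shown because the abstract does not give their formulas.

```python
from statistics import mean, pstdev


def normalize_score(raw_score, reference_scores):
    """Shift and scale a raw score by the statistics of reference impostor scores."""
    mu = mean(reference_scores)
    sigma = pstdev(reference_scores)
    if sigma == 0:
        raise ValueError("reference scores have zero spread")
    return (raw_score - mu) / sigma


def z_norm(raw_score, impostor_scores_against_model):
    """Z-Norm: the reference scores come from impostor utterances scored
    against the target speaker model (estimated offline, once per model)."""
    return normalize_score(raw_score, impostor_scores_against_model)


def t_norm(raw_score, test_scores_against_cohort):
    """T-Norm: the reference scores come from the same test utterance scored
    against a cohort of impostor models (estimated at test time)."""
    return normalize_score(raw_score, test_scores_against_cohort)
```

After either normalization the impostor scores are roughly zero-mean and unit-variance for every target model, which is what makes a single global threshold workable.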

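In text-independent speaker identification, to improve robustness under telephone speech conditions, the score normalization techniques commonly used in speaker verification are applied to speaker identification: the scores of the test utterance against the different speaker models are each normalized, and the closest speaker model is then selected as the system output, which effectively improves system performance. Speaker identification experiments on the NIST'03 1spk database demonstrate the effectiveness of score normalization for speaker identification.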

This paper presents a principled SVM based speaker verification system. We propose a new framework and a new sequence kernel that can make use of any Mercer kernel at the frame level. An extension of the sequence kernel based on the Max operator is also proposed. The new system is compared to state-of-the-art GMM and other SVM based systems found in the literature on the Banca and Polyvar databases. In most cases the new system outperforms the other systems, and the difference is statistically significant. Finally, the proposed framework clarifies previous SVM based systems and suggests interesting directions for future research.

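Wang Ming (王明), Xiao Xi (肖熙). Journal of Computer Applications (《计算机应用》), 2007, 27(8): 2051-2052
A new MFCC extraction method is proposed from the perspective of variable frame length and variable frame rate. The method first constrains both the frame length and the frame rate to integer multiples of the pitch period (a pitch-synchronous algorithm); then, following the principle of variable frame rate algorithms, it drops some frames where the speech features change slowly in order to lower the frame rate. Speaker verification experiments on the NIST 99 speaker evaluation show that the method not only improves system performance but also lowers the frame rate and saves storage space for feature files.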
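
The frame-dropping step can be illustrated with a generic variable frame rate rule: keep a frame only when it differs enough from the last frame that was kept. This is a sketch of the general idea, not the paper's algorithm; the Euclidean distance measure, the `threshold` parameter, and the function name are assumptions, and the pitch-synchronous framing stage is not shown.

```python
import math


def select_frames(frames, threshold):
    """Return indices of frames to keep.

    A frame is kept when its Euclidean distance to the most recently kept
    frame is at least `threshold`; frames in slowly changing regions are
    therefore dropped, which lowers the effective frame rate.
    """
    if not frames:
        return []
    kept = [0]  # always keep the first frame
    for i in range(1, len(frames)):
        if math.dist(frames[i], frames[kept[-1]]) >= threshold:
            kept.append(i)
    return kept
```

Raising `threshold` drops more frames and shrinks the stored feature file, at the cost of coarser coverage of slowly varying speech.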

In the i-vector/probabilistic linear discriminant analysis (PLDA) technique, the PLDA backend classifier is modelled on i-vectors. PLDA defines an i-vector subspace that compensates for unwanted variability and helps to discriminate among speaker-phrase pairs. The channel or session variability manifested in i-vectors is known to be nonlinear in nature. PLDA training, however, assumes the variability to be linearly separable, thereby causing loss of important discriminating information. Besides, i-vector estimation itself is known to be poor for short utterances. This paper attempts to address these issues using a simple hierarchy-based system. A modified fuzzy-clustering technique is employed to divide the feature space into more characteristic feature subspaces using vocal source features. Thereafter, a separate i-vector/PLDA model is trained for each of the subspaces. The sparser alignment owing to the subspace-specific universal background model and the relatively reduced dimensions of variability in individual subspaces help to train more effective i-vector/PLDA models. Also, vocal source features are complementary to mel frequency cepstral coefficients, which are transformed into i-vectors using a mixture model technique. As a consequence, vocal source features and i-vectors tend to carry complementary information. Thus using vocal source features for classification in a hierarchy tree may help to differentiate some of the speaker-phrase classes that are otherwise not easily discriminable based on i-vectors. The proposed technique has been validated on Part 1 of the RSR2015 database, and it shows a relative equal error rate reduction of up to 37.41% with respect to the baseline i-vector/PLDA system.

This paper describes a speaker verification system based on multi-resolution classifiers, designed to cope with the performance degradation caused by natural variations of the excitation source and of the vocal tract. The different resolution representations of the speaker are obtained by considering multiple frame lengths in the feature extraction process, and from these representations a single Pseudo‐Multi Parallel Branch (P‐MPB) Hidden Markov Model is obtained. In the verification process, different resolution representations of the speech signal are classified by multiple P‐MPB systems, and the final decision is obtained by means of different combination techniques. The system based on the Weighted Majority Vote technique considerably outperforms baseline systems, with improvements between 15% and 38%. The execution time of the verification process is also evaluated and proves to be acceptable, allowing the approach to be used in real-time systems.
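
The combination step can be pictured with a plain weighted majority vote over the accept/reject decisions of the individual classifiers. This is a minimal sketch under assumed conventions: the abstract does not say how the weights are obtained or how ties are resolved, so the per-classifier `weights` and the reject-on-tie rule here are hypothetical.

```python
def weighted_majority_vote(decisions, weights):
    """Combine boolean accept/reject decisions from several classifiers.

    `decisions[i]` is True when classifier i accepts the claimed identity and
    `weights[i]` is that classifier's (assumed, non-negative) reliability weight.
    Ties are resolved as a rejection.
    """
    if len(decisions) != len(weights):
        raise ValueError("decisions and weights must have the same length")
    accept = sum(w for d, w in zip(decisions, weights) if d)
    reject = sum(w for d, w in zip(decisions, weights) if not d)
    return accept > reject
```

With equal weights this reduces to an ordinary majority vote; unequal weights let one trusted resolution override several weaker ones.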

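Speaker Recognition Combining Composite Features and a Two-Level Decision Model
To address the problems of extracting speaker-specific features and of impostor speakers in current speaker recognition, a method combining composite feature extraction with a two-level decision model is proposed. In the feature extraction stage, MFCC cepstral features, Delta-Delta features, and the pitch period extracted by the average magnitude difference function (AMDF) are combined into a composite feature. In the recognition stage, the normalized score is compared with a single unified threshold to reject some of the impostors, after which the two-level decision model is applied for recognition. Experimental results show that the method effectively improves the recognition rate.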
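
Pitch extraction by the average magnitude difference function (AMDF) relies on the fact that a periodic signal nearly cancels itself when shifted by one period, so the AMDF has a deep minimum at that lag. The sketch below shows the textbook form only; the lag search range and function names are assumptions, and the voicing decisions and smoothing a practical pitch tracker needs are omitted.

```python
import math


def amdf(signal, lag):
    """Average magnitude difference between the signal and itself shifted by `lag`."""
    n = len(signal) - lag
    if n <= 0:
        raise ValueError("lag must be smaller than the signal length")
    return sum(abs(signal[i] - signal[i + lag]) for i in range(n)) / n


def pitch_period(signal, min_lag, max_lag):
    """Return the lag (in samples) in [min_lag, max_lag] with the smallest AMDF."""
    return min(range(min_lag, max_lag + 1), key=lambda lag: amdf(signal, lag))
```

The search range should exclude multiples of the true period where possible, since the AMDF is also near zero at twice and three times the period.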

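In speaker verification systems, a mismatch between the training and test acoustic environments causes a sharp drop in performance. This paper proposes compensating for it through both feature normalization and score normalization. First, the segmental cepstral mean and variance normalization (SCMVN) method is improved so that the cepstral coefficients are all normalized to the same within-segment Gaussian distribution, which improves feature matching across environmental conditions. Second, to address the changes in the output score distribution caused by different speakers and different test environments, two-stage score normalization is proposed, namely zero normalization followed by test normalization (TZnorm) and test normalization followed by zero normalization (ZTnorm); these score transformations make the speaker-independent decision threshold more robust under mismatched conditions. Experiments on the NIST 2002 speaker recognition evaluation corpus show that SCMVN feature normalization combined with ZTnorm score normalization clearly improves system performance. Compared with a baseline system using cepstral mean subtraction and zero normalization, the equal error rate and the minimum detection cost are reduced by 20.3% and 18.1%, respectively.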
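
As a point of reference for the feature-domain step, plain segmental cepstral mean and variance normalization rescales each cepstral dimension to zero mean and unit variance within every segment. The sketch below shows only this basic form, not the improved SCMVN the abstract describes; the fixed, non-overlapping `segment_len` and the handling of constant dimensions are assumptions.

```python
from statistics import mean, pstdev


def segmental_cmvn(frames, segment_len):
    """Normalize each cepstral dimension to zero mean and unit variance per segment.

    `frames` is a list of equal-length feature vectors; segments are consecutive,
    non-overlapping blocks of `segment_len` frames (the last may be shorter).
    """
    normalized = []
    for start in range(0, len(frames), segment_len):
        segment = frames[start:start + segment_len]
        dims = list(zip(*segment))  # one tuple of values per cepstral dimension
        means = [mean(d) for d in dims]
        # A constant dimension has zero spread; divide by 1.0 to avoid a zero division.
        stds = [pstdev(d) or 1.0 for d in dims]
        for frame in segment:
            normalized.append([(x - m) / s for x, m, s in zip(frame, means, stds)])
    return normalized
```

Because the statistics are recomputed per segment, slowly varying channel effects are removed locally rather than once over the whole utterance.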

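To address the problem of low speech recognition rates, a hierarchical speaker verification method based on PCS-PCA and support vector machines is proposed. Principal component analysis is first used to reduce the dimensionality of the speaker feature vectors and, at the same time, to obtain their principal component space. A PCS-PCA classifier is constructed in this space to screen for likely target speakers, and a support vector machine then makes the final verification decision. Simulation results show that the method achieves a high recognition rate and fast training.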

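To improve the recognition rate and robustness of speaker verification systems under channel variation, a sparse representation classification algorithm based on i-vectors and weighted linear discriminant analysis is proposed. First, the channel compensation and dimensionality reduction properties of weighted linear discriminant analysis are used to remove channel interference from the i-vectors and reduce their dimensionality. Next, an overcomplete dictionary matrix of training speech samples is built on the i-vector set, and the MAP algorithm is used to solve for the sparse coefficient vector of the test utterance over the dictionary. Finally, the sparse coefficient vector is used to reconstruct the test speech sample, and the target speaker is determined from the reconstruction error. Simulation results verify the effectiveness and feasibility of the algorithm.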

An intelligent verification platform based on a structured analysis model is presented. Using an abstract model mechanism with specific signal interfaces for user callback, the unified structured analysis data, shared by the electronic system level design, functional verification, and performance evaluation, enables efficient management review, auto-generation of code, and modeling in the transaction level. We introduce the class tree, flow parameter diagram, structured flow chart, and event-driven finite state machine as structured analysis models. As a sand table to carry maps from different perspectives and levels via an engine, this highly reusable platform provides the mapping topology to search for unintended consequences and the graph theory for comprehensive coverage and smart test cases. Experimental results show that the engine generates efficient test sequences, with a sharp increase in coverage for the same vector count compared with a random test.

This paper presents a simplified and supervised i-vector modeling approach with applications to robust and efficient language identification and speaker verification. First, by concatenating the label vector and the linear regression matrix at the end of the mean supervector and the i-vector factor loading matrix, respectively, the traditional i-vectors are extended to label-regularized supervised i-vectors. These supervised i-vectors are optimized not only to reconstruct the mean supervectors well but also to minimize the mean square error between the original and the reconstructed label vectors, which makes the supervised i-vectors more discriminative in terms of the label information. Second, factor analysis (FA) is performed on the pre-normalized centered GMM first order statistics supervector to ensure each Gaussian component's statistics sub-vector is treated equally in the FA, which reduces the computational cost by a factor of 25 in the simplified i-vector framework. Third, since the entire matrix inversion term in the simplified i-vector extraction only depends on one single variable (total frame number), we make a global table of the resulting matrices against the frame numbers' log values. Using this lookup table, each utterance's simplified i-vector extraction is further sped up by a factor of 4 and suffers only a small quantization error. Finally, the simplified version of the supervised i-vector modeling is proposed to enhance both the robustness and efficiency. The proposed methods are evaluated on the DARPA RATS dev2 task, the NIST LRE 2007 general task and the NIST SRE 2010 female condition 5 task for noisy channel language identification, clean channel language identification and clean channel speaker verification, respectively.
For language identification on the DARPA RATS, the simplified supervised i-vector modeling achieved 2%, 16%, and 7% relative equal error rate (EER) reduction on three different feature sets and sped up by a factor of more than 100 against the baseline i-vector method for the 120 s task. Similar results were observed on the NIST LRE 2007 30 s task with 7% relative average cost reduction. Results also show that the use of Gammatone frequency cepstral coefficients, Mel-frequency cepstral coefficients and spectro-temporal Gabor features in conjunction with shifted-delta-cepstral features improves the overall language identification performance significantly. For speaker verification, the proposed supervised i-vector approach outperforms the i-vector baseline by a relative 12% and 7% in terms of EER and norm old minDCF values, respectively.

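Text-to-image generation is a very challenging task in machine learning. Although major breakthroughs have been made, problems such as unstable model training and vanishing gradients remain. To address these shortcomings, a text-to-image model that combines spectral normalization with a perceptual loss function is proposed on the basis of the stacked generative adversarial network (StackGAN). First, the model applies spectral normalization to the discriminator network, constraining the gradients of each layer to a fixed range and somewhat slowing the convergence of the discriminator, which improves training stability. Second, a perceptual loss function is added to the generator network to strengthen the consistency between the text semantics and the image content. The Inception score is used to evaluate the quality of the generated images. Experimental results show that, compared with the original StackGAN, the model is more stable and generates more realistic images.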
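
Spectral normalization divides a layer's weight matrix by its largest singular value, which bounds how much the layer can amplify its input. The sketch below estimates that singular value by power iteration on a small dense matrix in pure Python; it illustrates the general technique only and is not the model described in the abstract. The function names and the fixed iteration count are assumptions, and a real discriminator would apply this to its layer weights through a deep learning framework.

```python
import math


def _matvec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def _normalize(vector):
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def spectral_norm(weight, n_iters=50):
    """Estimate the largest singular value of `weight` by power iteration.

    Assumes n_iters >= 1 and that the all-ones start vector is not mapped to zero.
    """
    weight_t = [list(col) for col in zip(*weight)]
    v = _normalize([1.0] * len(weight[0]))
    for _ in range(n_iters):
        u = _normalize(_matvec(weight, v))    # left singular vector estimate
        v = _normalize(_matvec(weight_t, u))  # right singular vector estimate
    return sum(a * b for a, b in zip(u, _matvec(weight, v)))


def spectrally_normalize(weight, n_iters=50):
    """Return `weight` scaled so that its largest singular value is about 1."""
    sigma = spectral_norm(weight, n_iters)
    return [[x / sigma for x in row] for row in weight]
```

For the diagonal matrix `[[3.0, 0.0], [0.0, 1.0]]` the estimate converges to 3, so the normalized matrix has diagonal entries of about 1 and 1/3.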

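Time constraints on combat tasks often have to be considered when drawing up operational plans. Existing methods for analyzing these time constraints support few constraint types, and their verification methods have a narrow range of application. A business-flow-based method for modeling combat task time constraints is therefore proposed: a combat task flow model is constructed and used to describe the relative and absolute time constraints of combat tasks. A formal verification method for combat task time constraints is also proposed, an algorithm for converting the combat task model to the NuSMV language is designed, and a description of the basic time constraints of combat tasks is given based on temporal logic. Finally, taking an island landing operation as an example, some properties of its relative and absolute constraints are verified, and the model is revised according to the feedback.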

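Speaker Recognition Using Unsupervised Adaptation of Models and Scores
In speaker recognition research, information from previous test utterances can be used to dynamically update the model parameters or the test scores, so that the model more accurately reflects the relationship between the test utterance and the speaker model. This update strategy is called the unsupervised mode, and research on it is very important for practical speaker recognition systems. In addition to an unsupervised adaptive update of the speaker model, this paper proposes an unsupervised score-domain adaptation algorithm: a prior score model is first built with a two-Gaussian function, and the score normalization model is then adjusted using the maximum a posteriori criterion. During testing, the unsupervised score-domain and model-domain algorithms complement each other and improve the recognition rate. On the NIST SRE 2006 database (one training segment, one test segment), the system using unsupervised adaptation in both the model and score domains achieves an equal error rate of 4.3% and a detection cost function of 0.021.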

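Research on Verification Methods for Virtual Maintenance Models
The relationship between virtual maintenance and modeling and simulation is introduced, the characteristics of virtual maintenance model verification are analyzed, and verification methods are studied. The confidence interval method, hypothesis testing, the Bayes method, the TIC method, and spectral analysis are applied to verify the model in terms of both static and dynamic performance, providing a fairly reliable means of validating the accuracy and real-time performance of virtual maintenance models. This offers a systematic body of theory and methods for verifying virtual maintenance models and supports their wider application.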

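To address the control and safety problems that arise when multiple UAVs track trajectories, this paper proposes a bounded tracking control method for quadrotor unmanned aerial vehicles. The method ensures that multiple UAVs keep their motion within a specified range while tracking trajectories. A controller of specific form is designed through energy analysis, and the boundedness and convergence of the system error are proved by Lyapunov stability analysis. On this basis, an indoor multi-UAV experimental platform was designed and built, and real-time multi-UAV trajectory tracking experiments were carried out to verify the practical performance of the controller. The experimental results show that the bounded tracking method not only has good dynamic characteristics but also effectively prevents UAVs from crossing the safety boundary and colliding, giving good safety and robustness.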

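Lidar-based pedestrian detection and tracking for indoor robots is easily affected by complex backgrounds. To address this, a pedestrian detection, tracking, and following system based on likelihood-field background subtraction is proposed. A simultaneous localization and mapping algorithm is used to obtain a 2D occupancy grid map of an unfamiliar environment, Monte Carlo localization gives the posterior pose of the robot in the map, and the likelihood field model is used to segment the lidar data corresponding to the foreground, after which pedestrians are detected, tracked, and followed. Experimental results show that the system raises pedestrian detection accuracy by 3.49%, shortens the average detection time by nearly 32%, effectively reduces the influence of complex backgrounds on multi-pedestrian detection and tracking, and enables the robot to follow the target pedestrian in real time.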
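
Likelihood-field background subtraction can be pictured as follows: a lidar endpoint that lands close to an occupied map cell is well explained by the static map and treated as background, while an endpoint far from every occupied cell is kept as foreground. The sketch below is an illustration under simplifying assumptions, not the system in the abstract: endpoints are assumed to be already transformed into the map frame using the localized robot pose, the likelihood is an unnormalized Gaussian of the distance to the nearest occupied cell, and `sigma` and `min_likelihood` are hypothetical tuning parameters.

```python
import math


def foreground_points(endpoints, occupied_cells, sigma=0.1, min_likelihood=0.5):
    """Return the lidar endpoints that the static map explains poorly.

    `endpoints` and `occupied_cells` are lists of (x, y) positions in the map
    frame, in metres. A brute-force nearest-cell search is used for clarity;
    a precomputed distance transform would be the usual choice in practice.
    """
    foreground = []
    for x, y in endpoints:
        nearest = min(math.hypot(x - ox, y - oy) for ox, oy in occupied_cells)
        likelihood = math.exp(-0.5 * (nearest / sigma) ** 2)
        if likelihood < min_likelihood:
            foreground.append((x, y))
    return foreground
```

The remaining foreground points would then be clustered and passed to the pedestrian detector and tracker.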

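To build an accurate and reliable three-dimensional geometric model of the knee joint, acquired CT and MRI two-dimensional slice data of the knee were imported into Mimics 19.0 for image segmentation and 3D reconstruction, producing a complete 3D knee model that includes bone, ligaments, articular soft tissue, and the menisci. Material properties, boundary conditions, and load constraints were then defined in ANSYS Workbench for biomechanical analysis. The stress analysis results agree with data from many published studies, which validates the model.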

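To strengthen the ability of workflow models to describe business processes, an extended directed-graph workflow model and a verification method for it are proposed. To address the shortcomings of directed-graph-based workflow models, the extended directed-graph workflow model is introduced, together with its definition and graphical notation. Building on an exact description of the model in the Pi-calculus, a method is given for analyzing and verifying the correctness of the model with the Pi-calculus. Finally, the model and the verification method are analyzed in detail through experiments, and the results demonstrate their effectiveness and correctness.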

