1.
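Audio-visual bimodal speaker recognition based on dynamic Bayesian networks. Total citations: 6 (self-citations: 2; citations by others: 4)
Dynamic Bayesian networks (DBNs) are well suited to describing complex stochastic processes with multiple channels. This work applies DBNs to audio-visual bimodal speaker recognition. It analyzes the hierarchical structure of joint audio-visual modeling, uses DBNs to model the audio-visual correlations at different levels, and runs speaker recognition experiments on the resulting models. Analysis of the modeling process at each level and of the experimental results shows that DBNs provide an effective way to model the temporal and feature correlations between audio and video, and that they improve speaker recognition performance across different speech signal-to-noise ratios.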
2.
3.
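This paper analyzes two anti-models, the universal background model and the cohort model. Building on an improved speaker verification model that uses both, it introduces a Bayesian adaptation algorithm into a speaker verification system based on the Gaussian mixture universal background model (GMM-UBM), which addresses the model mismatch problem in speaker verification. Experiments and analysis on a text-independent test speech corpus show that the improved algorithm achieves better recognition performance.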
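As a rough illustration of Bayesian (MAP) adaptation in a GMM-UBM system, the sketch below adapts only the component means of a one-dimensional background model toward a speaker's frames. The abstract does not give its update rule, so the relevance-factor form used here, the function name `map_adapt_means`, the `relevance` value, and the toy data are all assumptions made for illustration.

```python
import math

def gauss(x, mu, var):
    """Density of a 1-D Gaussian with mean mu and variance var."""
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2 * math.pi * var)

def map_adapt_means(frames, weights, means, variances, relevance=16.0):
    """Adapt UBM means toward speaker data (hypothetical helper).

    Each adapted mean is a convex combination of the data mean assigned to
    that component and the original UBM mean; the mixing weight grows with
    the amount of data the component explains.
    """
    k_count = len(means)
    n = [0.0] * k_count   # soft frame counts per component
    s = [0.0] * k_count   # responsibility-weighted sums of frames
    for x in frames:
        lik = [w * gauss(x, m, v) for w, m, v in zip(weights, means, variances)]
        total = sum(lik)
        for k in range(k_count):
            r = lik[k] / total
            n[k] += r
            s[k] += r * x
    adapted = []
    for k in range(k_count):
        alpha = n[k] / (n[k] + relevance)            # data-dependent mixing weight
        data_mean = s[k] / n[k] if n[k] > 0 else means[k]
        adapted.append(alpha * data_mean + (1 - alpha) * means[k])
    return adapted

# Toy UBM with two components; all numbers are illustrative only.
ubm_weights, ubm_means, ubm_vars = [0.5, 0.5], [0.0, 10.0], [1.0, 1.0]
speaker_frames = [1.0] * 16
adapted = map_adapt_means(speaker_frames, ubm_weights, ubm_means, ubm_vars)
```

Components that see little speaker data stay close to the background model, which is one way such adaptation can reduce mismatch between a sparsely trained speaker model and the UBM.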
4.
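As more industrial control devices are exposed to the Internet, the security threats they face keep growing, and active defense has become a necessary means of protection. Honeypots are an effective active defense technique. Because attackers aim to attack real asset devices, researchers have begun to study methods for identifying honeypots. Accurately identifying a honeypot involves many uncertain factors, and Bayesian networks, which are designed to handle uncertainty, fit the honeypot identification problem well. Based on the characteristics of honeypot identification and of Bayesian networks, this paper proposes an industrial control honeypot identification method that uses a Bayesian network whose parameters are learned with the EM algorithm. It first introduces the theoretical basis of Bayesian networks and their advantages for honeypot identification. It then describes the algorithms used for parameter modeling and for predictive inference, completing the Bayesian network model for identifying honeypots. Finally, comparison experiments against SVM, KNN, random forest, and Naive Bayes show that the Bayesian network trained with the EM algorithm performs better. The model uses the Bayesian junction tree inference algorithm for prediction and is verified through a case analysis. The experimental results show that the model trained with the EM algorithm is effective for identifying honeypots.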
5.
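Based on the characteristics of software projects and the way their schedules are arranged, this paper proposes a Bayesian network model for software project schedule management. The Bayesian network is built on top of a PERT chart, and its probability parameters are set from expert judgment and engineering experience. The model supports monitoring and control of project progress, identifies uncertain factors that affect the project during development, and performs backward parameter learning, so that an unreasonable development schedule can be adjusted in time and the schedule optimized. Simulation results show that the model is consistent with actual conditions and gives good results when applied to real project development.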
6.
7.
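For the multiple identification features of air targets acquired by multiple sensors, a target recognition model based on a Bayesian noisy-OR gate network is proposed. The model accounts for the influence of unknown factors and builds the recognition network from binary nodes, one per identification feature. From the recognition results of individual features it computes the result for any combination of features, so the number of conditional probabilities to be specified drops from 2^n to 2n. Simulation results show that the method simplifies knowledge acquisition, saves storage space, propagates evidence promptly, and offers good real-time performance, providing a new approach to target classification and recognition.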
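The parameter saving of a noisy-OR gate can be sketched as follows: one link probability per binary feature is enough to generate every row of the full conditional probability table. This is a generic noisy-OR sketch, not the paper's model; the function names and the probability values are illustrative assumptions, and the sketch stores one number per feature rather than both outcomes of each link.

```python
from itertools import product

def noisy_or(link_probs, active):
    """P(target = true | parents) under a noisy-OR gate.

    link_probs[i] is the probability that feature i alone triggers the
    target; active[i] says whether feature i is present.
    """
    p_not = 1.0
    for p, on in zip(link_probs, active):
        if on:
            p_not *= 1.0 - p   # each present feature fails independently
    return 1.0 - p_not

def full_cpt(link_probs):
    """Expand n link probabilities into all 2**n rows of the table."""
    n = len(link_probs)
    return {combo: noisy_or(link_probs, combo)
            for combo in product((False, True), repeat=n)}

link_probs = [0.8, 0.6, 0.5]   # illustrative values, one per feature
cpt = full_cpt(link_probs)
```

With three features the table has eight rows, yet only three numbers had to be elicited; the gap widens exponentially as features are added.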
8.
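Threat identification based on Bayesian networks. Total citations: 6 (self-citations: 0; citations by others: 6)
Accurate threat identification is an important part of threat assessment and involves many uncertain factors. Bayesian networks are an effective tool for handling uncertain knowledge. Based on the characteristics of threat identification and of Bayesian networks, this paper proposes a threat identification method based on Bayesian networks. It first briefly introduces Bayesian networks and their advantages, then builds a Bayesian network model for threat identification from a concrete example and describes the inference procedure used. The computed results for the example show that Bayesian networks can identify threats accurately and handle uncertain information effectively.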
9.
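A method for constructing Bayesian network parameters for situation assessment. Total citations: 2 (self-citations: 0; citations by others: 2)
Based on the characteristics and requirements of situation assessment, network parameters are constructed with the leaky noisy-OR gate model, which is then applied throughout the Bayesian network. The method works on sets of discrete variables. Because these satisfy the conditions of the leaky noisy-OR gate model, node parameters can be constructed from partial statistical information. Experiments were carried out with Bayesian network inference software. The results show that constructing network parameters with the leaky noisy-OR gate model is feasible.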
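A minimal sketch of how a leaky noisy-OR gate can be filled in from partial statistics, assuming the common parametrization in which a leak term stands for all unmodeled causes. Only the leak and the probability of the effect when a single cause is present need to be observed; the function names and the numbers are hypothetical and are not taken from the paper.

```python
def link_from_observed(p_only_i, leak):
    """Recover the link strength of cause i.

    p_only_i is the observed P(effect | only cause i present), which
    already includes the leak, so the leak is divided out.
    """
    return 1.0 - (1.0 - p_only_i) / (1.0 - leak)

def leaky_noisy_or(links, leak, active):
    """P(effect = true | causes) under a leaky noisy-OR gate."""
    p_not = 1.0 - leak            # effect absent unless the leak fires
    for c, on in zip(links, active):
        if on:
            p_not *= 1.0 - c      # each present cause fails independently
    return 1.0 - p_not

leak = 0.1                        # P(effect | no modeled cause), illustrative
observed = [0.55, 0.82]           # P(effect | only cause i), illustrative
links = [link_from_observed(p, leak) for p in observed]
p_both = leaky_noisy_or(links, leak, (True, True))
```

The probability for the combination of both causes was never observed directly; the gate derives it from the single-cause statistics and the leak.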
10.
11.
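Speaker recognition has become an important means of identity authentication in daily life and work thanks to its convenience, low cost, and accuracy. In real application scenarios, however, the accuracy, robustness, transferability, and real-time performance of speaker recognition systems face major challenges. In recent years deep learning has performed strongly in feature representation and pattern classification, offering a new direction for speaker recognition. In contrast to traditional techniques (such as GMM-UBM, GMM-SVM, JFA, and i-vector), this survey focuses on speaker recognition methods within deep learning frameworks. According to the role deep learning plays, it divides current research into three categories: deep-learning-based feature representation, deep-learning-based back-end modeling, and end-to-end joint optimization. It analyzes and summarizes the characteristics and network structures of typical algorithms in each category and compares their performance. Finally, it summarizes the characteristics and advantages of deep learning in speaker recognition, analyzes the problems and challenges that current research faces, and discusses the prospects for speaker recognition under deep learning frameworks, with the aim of advancing the field.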
12.
Hampshire J.B. II, Waibel A. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1992, 14(7): 751-769
The authors present the Meta-Pi network, a multinetwork connectionist classifier that forms distributed low-level knowledge representations for robust pattern recognition, given random feature vectors generated by multiple statistically distinct sources. They illustrate how the Meta-Pi paradigm implements an adaptive Bayesian maximum a posteriori classifier. They also demonstrate its performance in the context of multispeaker phoneme recognition in which the Meta-Pi superstructure combines speaker-dependent time-delay neural network (TDNN) modules to perform multispeaker /b,d,g/ phoneme recognition with speaker-dependent error rates of 2%. Finally, the authors apply the Meta-Pi architecture to a limited source-independent recognition task, illustrating its discrimination of a novel source. They demonstrate that it can adapt to the novel source (speaker), given five adaptation examples of each of the three phonemes.
13.
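Speaker recognition extracts a speaker's individual characteristics from a segment of speech and analyzes them in order to identify or verify the speaker. A neural network is a distributed, parallel processing model based on nonlinear theory. It has strong pattern classification ability and is robust to incomplete information, which makes it a distinctive approach to speaker recognition. BP (back-propagation) is a training algorithm for acyclic multilayer networks composed of an input layer, an output layer, and N hidden layers. The paper first gives an overview of speech recognition technology, then introduces the seven steps of the BP neural network training process, the corresponding model, and how to build a BP neural network model. It also covers the extraction of the relevant feature parameters and the training and recognition processes of the network. Finally, speaker identity recognition is implemented in a program running on Linux.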
14.
Input variability caused by speaker changes is one of the most crucial factors influencing the effectiveness of speech recognition systems. One solution is to adapt or normalize the input so that all parameters of the input representation are adapted to those of a single speaker, normalizing the input pattern against speaker changes before recognition. This paper proposes three such methods, in which some effects of speaker changes on the speech recognition process are compensated. In all three methods, a feed-forward neural network is first trained to map the input into codes representing the phonetic classes and speakers. Then, among the 71 speakers used in training, the one with the highest phone recognition accuracy is selected as the reference speaker, so that the representation parameters of the other speakers are converted to the corresponding speech uttered by the reference speaker. In the first method, the error back-propagation algorithm is used to find the optimal point of every decision region in the input space for each phone of each speaker. The distances between these points and the corresponding points of the reference speaker are used to offset the effects of speaker changes and to adapt the input signal to the reference speaker. In the second method, using the error back-propagation algorithm and keeping the reference speaker as the desired speaker output, all speech signal frames, i.e., the training and test datasets, are corrected so that they coincide with the corresponding speech of the reference speaker. In the third method, another feed-forward neural network is applied inversely to map the phonetic classes and speaker information to the input representation. The phonetic output retrieved from the direct network, together with the reference speaker data, is given to the inverse network, which yields an estimate of the input representation adapted to the reference speaker. In all three methods, the final speech recognition model is trained on the adapted training data and tested on the adapted test data. Implementing these methods and combining the final network results with those of the unadapted network based on the highest confidence level yields increases of 2.1%, 2.6%, and 3% in phone recognition accuracy on clean speech for the three methods, respectively.
15.
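An EM training algorithm for speaker identification. Total citations: 2 (self-citations: 0; citations by others: 2)
A probabilistic mapping network (PMN) classifier for speaker identification is proposed, with its parameters trained by the EM (Expectation-Maximization) algorithm. The PMN is a four-layer feed-forward network that forms a Bayesian classifier and implements Bayesian discrimination for multi-class classification, transforming the model parameters of the input speech data into a speaker decision at the output. Its nodes correspond to the variables in the Bayesian posterior probability formula. The PMN uses Gaussian kernel functions as density functions, and its parameters are trained with the EM algorithm, with supervised learning between classes and unsupervised learning within each class. Experimental results show that this classification network and its learning algorithm are effective for speaker identification.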
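The forward pass of a Bayes classifier with Gaussian kernel class densities can be sketched in four stages that loosely mirror a four-layer network: input, kernel evaluation, per-class summation, and posterior normalization. This is an illustrative one-dimensional sketch only. The abstract does not specify the PMN's layout at this level of detail, EM training is not shown, and every name and number below is an assumption.

```python
import math

def gaussian_kernel(x, mu, sigma):
    """1-D Gaussian density used as a kernel."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def class_posteriors(x, classes):
    """Return P(class | x) for every class via Bayes' rule.

    classes maps a label to (prior, kernels), where kernels is a list of
    (mixing_weight, mu, sigma) tuples describing that class's density.
    """
    joint = {}
    for label, (prior, kernels) in classes.items():
        density = sum(w * gaussian_kernel(x, mu, s) for w, mu, s in kernels)
        joint[label] = prior * density          # prior times likelihood
    evidence = sum(joint.values())
    return {label: j / evidence for label, j in joint.items()}

# Two hypothetical speakers with one kernel each; values are illustrative.
speakers = {
    "A": (0.5, [(1.0, 0.0, 1.0)]),
    "B": (0.5, [(1.0, 4.0, 1.0)]),
}
post_near_a = class_posteriors(0.0, speakers)
post_midway = class_posteriors(2.0, speakers)
decision = max(post_near_a, key=post_near_a.get)
```

In a trained system the kernel weights, centers, and widths within each class would come from EM, while the class labels supply the supervision between classes.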
16.
This paper presents a new hybrid method for automatic speaker recognition (ASR) from speech signals based on an artificial neural network (ANN). Improving ASR performance is regarded as the foremost challenge. This work focuses on resolving ASR problems and on improving the accuracy of speaker prediction. Mel-frequency cepstral coefficients (MFCCs) are used for signal feature extraction. The input samples are created from the extracted features, and their dimensionality is reduced using a self-organizing feature map (SOFM). Finally, recognition is performed on the reduced input samples using a multilayer perceptron (MLP) with Bayesian regularization. The network is trained and verified on real speech data from the Multivariability speaker recognition database for 10 speakers. The proposed method is validated by performance estimation and by comparing its classification accuracy with that of other models. It gives a better recognition rate and attains 93.33% accuracy.
17.
18.
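A speaker recognition system combining a genetic algorithm with a BP neural network. Total citations: 2 (self-citations: 0; citations by others: 2)
Speaker recognition systems based on BP neural networks are one of the main models in current speaker recognition, but with a BP network it is usually hard to determine the number of hidden-layer units, and convergence is slow. To address this, an optimization scheme for speaker recognition BP networks based on a genetic algorithm (GA) is proposed. The scheme uses a GA with hybrid encoding to optimize both the connection weights and the structure of the network, which effectively removes redundant nodes and redundant connection weights. It combines the parallelism of the BP network with the global search ability of the GA and markedly improves the network's processing capability. Experiments show that a BP neural network based on the hybrid-encoded GA learns network weights quickly and achieves a high recognition rate, making it an effective and feasible new scheme for speaker recognition.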