Found 20 similar articles (search time: 15 ms)
Phonotactic and acoustic systems are currently the most widely used approaches to spoken language recognition (SLR). Parallel phone recognition followed by vector space modeling (PPRVSM) is a typical phonotactic SLR system. To improve performance, researchers have sought to extract more complementary information from the training data by using multiple language-specific phone recognizers, different acoustic models, and different acoustic features. These methods perform well but usually incur a high computational cost and exploit only the complementary information in the training data. In this paper, we explore a novel approach to discriminative vector space model (VSM) training that uses a boosting framework, in which an ensemble of VSMs is trained sequentially, to exploit the discriminative information in the test data effectively. The effectiveness of our boosting variant comes from its emphasis on high-confidence test data when training the discriminative models. Our variant also uses the original training data in VSM training. The discriminative boosting algorithm (DBA) is applied to the National Institute of Standards and Technology (NIST) Language Recognition Evaluation (LRE) 2009 task and shows performance improvements. The experimental results demonstrate that the proposed DBA achieves relative reductions in equal error rate (EER) of 1.8%, 11.72% and 15.35% for 30 s, 10 s and 3 s test utterances, respectively, compared with the baseline system.

Facial expression recognition (FER) is an active research area that has attracted much attention from both academics and practitioners in different fields. In this paper, we investigate an interesting and challenging issue in FER, where the training and testing samples come from a cross-domain dictionary. In this setting, the data and feature distributions are inconsistent, so most existing recognition methods may not perform well. We therefore propose an effective dynamic constraint representation approach based on cross-domain dictionary learning for expression recognition. The proposed approach dynamically represents testing samples from the source and target domains, thereby fully accounting for feature elasticity in a cross-domain dictionary. This allows the approach to predict the class of unlabeled testing samples. Comprehensive experiments on several public datasets confirm that the proposed approach outperforms several state-of-the-art methods.

He Q.H., Kwong S., Man K.F., Tang K.S. 《Electronics Letters》, 1999, 35(10): 783-785
A new approach based on the maximum model distance (MMD) criterion, termed IMMD, is proposed for HMM speech recognition systems. It adopts a more realistic definition of model distance for HMM training and uses the limited training data more effectively. Theoretical and practical issues concerning this approach are investigated. Experimental results show that the new approach achieves a significant reduction in errors compared with the MMD criterion.

Recent advances in automatic speech recognition have been achieved by designing a plug-in maximum a posteriori decision rule in which the forms of the acoustic and language model distributions are specified and the parameters of the assumed distributions are estimated from a collection of speech and language training corpora. Maximum-likelihood point estimation is by far the most prevalent training method. However, because of unknown speech distributions, sparse training data, high spectral and temporal variability in speech, and possible mismatch between training and testing conditions, a dynamic training strategy is needed. To cope with changing speakers and speaking conditions in real operational environments for high-performance speech recognition, such paradigms incorporate a small amount of speaker- and environment-specific adaptation data into the training process. Bayesian adaptive learning is an optimal way to combine prior knowledge in an existing collection of general models with a new set of condition-specific adaptation data. In this paper, the mathematical framework for Bayesian adaptation of acoustic and language model parameters is first described. Maximum a posteriori point estimation is then developed for hidden Markov models and for a number of useful parameter densities commonly used in automatic speech recognition and natural language processing.
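
As a rough illustration of how maximum a posteriori estimation combines prior knowledge with a small amount of adaptation data, the sketch below handles the simplest textbook case: the mean of a one-dimensional Gaussian with known variance and a Gaussian prior on that mean. This is not the paper's formulation for hidden Markov models; the function name `map_mean` and its parameters are assumptions made for this example.

```python
def map_mean(samples, prior_mean, prior_var, noise_var):
    """MAP estimate of a Gaussian mean (illustrative, hypothetical helper).

    Assumes x_i ~ N(mu, noise_var) with noise_var known, and a
    conjugate prior mu ~ N(prior_mean, prior_var).
    """
    n = len(samples)
    # Posterior mean = precision-weighted blend of prior mean and data.
    numerator = noise_var * prior_mean + prior_var * sum(samples)
    denominator = noise_var + n * prior_var
    return numerator / denominator


# With no adaptation data the estimate stays at the prior mean;
# as more data arrive it moves toward the sample mean.
no_data = map_mean([], prior_mean=1.5, prior_var=2.0, noise_var=1.0)
two_points = map_mean([2.0, 2.0], prior_mean=0.0, prior_var=1.0, noise_var=1.0)
```

The estimate interpolates between the general model (the prior) and the condition-specific data, which is the behaviour the abstract describes for speaker and environment adaptation.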

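To address the problem that radar high-resolution range profile (HRRP) recognition systems require too many training samples, this paper proposes a small-sample target recognition method based on a linear dynamic model. The statistical properties of the range-profile spectrum are analyzed first. Then, building on its wide-sense stationarity, a linear dynamic model is used to model the spectrum amplitude of the range profile, and the model parameters are estimated with the expectation-maximization algorithm. Experiments on measured data show that the method achieves a high correct recognition rate and good rejection performance even with very few training samples.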

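Most traditional statistical recognition methods for high-resolution range profiles (HRRPs) use only the amplitude of the radar target's high-resolution echoes and need a large number of training samples to learn the statistical model parameters accurately. To make full use of the phase information in the echoes, and to maintain recognition performance when the radar sampling rate is limited and training samples are scarce, this paper proposes a multi-task learning (MTL) complex factor analysis (CFA) model. The model extends the data description to the complex domain, treats the statistical modeling of the training samples in each aspect frame as a single learning task, shares the loading matrix across tasks, and uses a Beta-Bernoulli sparse prior to select adaptively the factors each task needs, so that all tasks are learned jointly. Recognition experiments on measured data show that, compared with the traditional single-task learning (STL) factor analysis model, the proposed multi-task factor analysis model has lower model complexity and significantly improves recognition performance under small-sample conditions.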

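Segmentation-free Uyghur document recognition effectively avoids character segmentation errors, but for new sample types with scarce data the original model often fails to achieve high recognition performance. To address this, this paper proposes sharing the relatively stable character structure information across commonly used Uyghur fonts and using the Bootstrap method to improve sample utilization efficiency. Experiments on samples from real books show that, using new-type samples amounting to only about 1/5 of the original training set, the method reaches an average character recognition accuracy of 95.05% on the test set. Compared with the commonly used maximum a posteriori estimation method, it also reduces the recognition error rate by 55.76% to 63.84% in relative terms. The method therefore effectively solves the Uyghur character modeling problem under low-resource conditions and achieves high-performance recognition of new sample types.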

Radar HRRP Statistical Recognition: Parametric Model and Model Selection   Total citations: 3, self-citations: 0, citations by others: 3
Statistical modeling of radar high-resolution range profiles (HRRPs) is a challenging task in radar HRRP statistical recognition. Theoretical analysis and experimental results show that the elements of an HRRP sample are statistically correlated and non-Gaussian distributed. First, this paper introduces three joint-Gaussian models into radar HRRP statistical recognition: the subspace approximation model, the probabilistic principal component analysis (PPCA) model, and the factor analysis (FA) model. The experimental results lead to the conclusion that the jointly non-Gaussian distributed HRRP samples approximately follow the joint-Gaussian distribution described by the FA model. The FA model can therefore be applied to radar HRRP statistical recognition in place of a joint-Gaussian mixture model such as a PPCA mixture or FA mixture model. A mixture model describes non-Gaussian correlations in multidimensional data more accurately, but at the price of high learning complexity and a heavy computational burden, so using the FA model greatly reduces the difficulty of statistical modeling for HRRP samples. Second, this paper addresses model selection for the FA model in radar HRRP statistical recognition, which involves two issues: the partition of target-aspect frames and the determination of the number of factors in each frame. Based on the Akaike information criterion (AIC) and the Bayesian information criterion (BIC), an iterative model selection algorithm is proposed that automatically finds the optimal aspect-frame boundaries and the optimal number of factors in each aspect frame. Recognition experiments on measured data show that the proposed adaptive partition approach further improves recognition performance while also raising recognition efficiency.
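
The criterion-based part of this model selection can be sketched as follows. The sketch only scores candidate numbers of factors for a single frame using the standard AIC and BIC formulas; it does not reproduce the paper's iterative aspect-frame partitioning. The helper names, the parameter count (loadings, diagonal noise variances and mean, minus the rotational redundancy) and the log-likelihood values in the usage lines are assumptions for illustration only.

```python
import math


def fa_num_params(dim, n_factors):
    """Free parameters of a factor analysis model (assumed standard count):
    loading matrix + diagonal noise variances + mean, minus the
    rotational redundancy of the loadings."""
    return dim * n_factors + dim + dim - n_factors * (n_factors - 1) // 2


def aic(log_lik, n_params):
    """Akaike information criterion (lower is better)."""
    return 2 * n_params - 2 * log_lik


def bic(log_lik, n_params, n_samples):
    """Bayesian information criterion (lower is better)."""
    return n_params * math.log(n_samples) - 2 * log_lik


def select_num_factors(log_liks, dim, n_samples, criterion="bic"):
    """Pick the number of factors with the lowest criterion value.

    log_liks maps each candidate number of factors to the maximized
    log-likelihood of the fitted model (supplied by the caller).
    """
    scores = {}
    for k, log_lik in log_liks.items():
        p = fa_num_params(dim, k)
        scores[k] = bic(log_lik, p, n_samples) if criterion == "bic" else aic(log_lik, p)
    return min(scores, key=scores.get)


# Hypothetical log-likelihoods for 1, 2 and 3 factors on 200 samples of dimension 10.
candidate_log_liks = {1: -500.0, 2: -400.0, 3: -399.0}
best_k = select_num_factors(candidate_log_liks, dim=10, n_samples=200)
```

Both criteria trade goodness of fit against model size, so a third factor that barely raises the likelihood is rejected in this made-up example.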

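In recent years, convolutional neural networks (CNNs) have been widely applied to synthetic aperture radar (SAR) target recognition. Because training datasets of SAR targets are usually small, CNN-based SAR image target recognition is prone to overfitting. A generative adversarial network (GAN) is an unsupervised network in which a generator and a discriminator compete until the discriminator can hardly tell generated images from real ones. This paper proposes a SAR target recognition method based on an improved convolutional neural network (ICNN) and an improved generative adversarial network (IGAN). The IGAN is first pre-trained on the training samples without supervision; the parameters of the trained IGAN discriminator are then used to initialize the ICNN; the ICNN is fine-tuned on the training samples; and finally the trained ICNN classifies the test samples. Experiments on MSTAR show that the proposed method achieves a recognition rate of up to 96.37% even when the number of training samples is reduced to 30% of the original, and that it is more robust to noise than using the ICNN directly.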

Linear Regression Classification (LRC) is a recently proposed pattern recognition method that formulates recognition as class-specific linear regression and assumes sufficient training samples per class. In this paper, we extend LRC with an intraclass variant dictionary and SVD to undersampled face recognition, where there are very few training samples per class, or even only one. The intraclass variant dictionary is used in the undersampled setting to represent the possible variation between the training and testing samples. Three methods, namely the quasi-inverse, ridge regularization, and Singular Value Decomposition (SVD), are designed to solve the low-rank problem of the data matrix. The complete algorithm, named Extended LRC (ELRC), is then presented for face recognition. Experimental results on three well-known face databases show that the proposed ELRC generalizes better and classifies more robustly than many state-of-the-art methods in the undersampled setting.
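
The basic class-specific regression step can be sketched as below, using ridge regularization (one of the three remedies named above) to keep the normal equations solvable when a class has few samples. This is a minimal sketch of plain LRC with a ridge term, not the ELRC algorithm: it omits the intraclass variant dictionary, and the function name, the `ridge` value and the toy data are assumptions.

```python
import numpy as np


def lrc_predict(class_samples, y, ridge=1e-3):
    """Assign y to the class whose training samples reconstruct it best.

    class_samples maps a label to a (dim x n_samples) matrix whose
    columns are that class's training vectors. `ridge` is an assumed
    regularization weight for illustration.
    """
    best_label, best_dist = None, np.inf
    for label, X in class_samples.items():
        # Ridge-regularized least squares: (X^T X + ridge * I) beta = X^T y
        gram = X.T @ X + ridge * np.eye(X.shape[1])
        beta = np.linalg.solve(gram, X.T @ y)
        # Distance between y and its projection onto the class subspace.
        dist = np.linalg.norm(y - X @ beta)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


# Toy data: class "a" spans the first two axes, class "b" the third.
toy_classes = {
    "a": np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]),
    "b": np.array([[0.0], [0.0], [1.0]]),
}
label_for_probe = lrc_predict(toy_classes, np.array([0.9, 0.1, 0.0]))
```

The probe lies almost entirely in the subspace spanned by class "a", so that class gives the smaller reconstruction residual.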

Li Wanghua (李汪华), Zhang Zhenkai (张贞凯). 《电讯技术》 (Telecommunication Engineering), 2023, 63(12): 1918-1924
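To address target recognition in synthetic aperture radar (SAR) images, a recognition method based on an ensemble of convolutional neural networks (CNNs) is proposed. First, the original dataset is preprocessed with data augmentation to enlarge the training set. Next, different training subsets are drawn from the training samples by resampling, and Dropout and Padding operations are introduced when training each base classifier, which effectively strengthens the generalization ability of the network. The network is then optimized by combining the Adadelta algorithm with Nesterov momentum, which improves convergence speed and recognition accuracy. Finally, the classification results of the base classifiers are combined by plurality voting. Experiments on the MSTAR dataset show that the ensemble model reaches a recognition accuracy of 99.30%, outperforms a single convolutional neural network, and has strong generalization ability and good robustness.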
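
The final combination step, plurality voting over the base classifiers' outputs, can be sketched in a few lines. The function names are assumptions, the class labels in the usage lines are only illustrative, and ties are broken in favour of the label seen first, which the abstract does not specify.

```python
from collections import Counter


def plurality_vote(labels):
    """Return the label predicted by the most base classifiers.

    Ties go to the label that appears first in `labels`
    (an assumed tie-breaking rule).
    """
    return Counter(labels).most_common(1)[0][0]


def ensemble_predict(per_classifier_predictions):
    """Combine predictions from several classifiers sample by sample.

    per_classifier_predictions[i][j] is classifier i's label for sample j.
    """
    return [plurality_vote(votes) for votes in zip(*per_classifier_predictions)]


# Three hypothetical base classifiers labelling three samples.
combined = ensemble_predict([
    ["T72", "BMP2", "BTR70"],
    ["T72", "T72", "BTR70"],
    ["BMP2", "BMP2", "BTR70"],
])
```

Each sample takes the label chosen by the largest number of base classifiers, even when that number is not an absolute majority.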

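Air-ground radiotelephony communication is crucial to civil aviation flight safety, but because it follows a special grammatical structure and pronunciation style, acoustic models built for everyday speech recognition cannot handle it effectively. For this special context, this paper proposes a speech recognition method for civil aviation air-ground communication based on a bidirectional long short-term memory (BiLSTM) network. First, FBANK features are extracted from the air-ground communication speech as input, and a BiLSTM network is trained with connectionist temporal classification (CTC) as the objective function to obtain a BiLSTM/CTC model. Then the acoustic model, a language model, and an air-ground communication lexicon are combined to recognize the speech, and the model is further trained with data augmentation and data transfer to improve recognition performance. Experimental results show that the proposed method is suitable for speech recognition of civil aviation air-ground communication and that the data-augmented model effectively reduces the word error rate.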
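
To show how a CTC-trained network's frame-level outputs turn into a label sequence, the sketch below implements greedy (best-path) CTC decoding: take the most likely symbol per frame, merge consecutive repeats, then drop blanks. It is a simplified stand-in, since the system described above also uses a language model and a lexicon during decoding. The function name, the blank index of 0 and the toy scores are assumptions.

```python
def ctc_greedy_decode(frame_scores, blank=0):
    """Greedy CTC decoding (illustrative sketch).

    frame_scores is a list of per-frame score lists, one score per
    symbol; index `blank` is assumed to be the CTC blank symbol.
    """
    # Most likely symbol index in every frame.
    best_path = [max(range(len(scores)), key=scores.__getitem__)
                 for scores in frame_scores]
    decoded = []
    previous = None
    for symbol in best_path:
        # Keep a symbol only when it starts a new run and is not blank.
        if symbol != previous and symbol != blank:
            decoded.append(symbol)
        previous = symbol
    return decoded


# Toy scores over {blank, 1, 2}; the per-frame best path is 1 1 0 1 2 2 0.
toy_scores = [
    [0.1, 0.8, 0.1],
    [0.1, 0.8, 0.1],
    [0.9, 0.05, 0.05],
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
    [0.2, 0.1, 0.7],
    [0.9, 0.05, 0.05],
]
decoded_labels = ctc_greedy_decode(toy_scores)
```

The blank between the two runs of symbol 1 is what lets the decoder emit the same symbol twice in a row.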

Xu Xiong (徐雄). 《电讯技术》 (Telecommunication Engineering), 2019, 59(9): 1048-1053
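To address the difficulty of automatically identifying aircraft types in air-combat target recognition, an intelligent target recognition method based on flight-track features is proposed. Exploiting the ability of a convolutional neural network (CNN) to learn features hierarchically, a CNN model is trained to learn useful features from track data automatically and to classify them. Track data for 15 classes of aircraft, collected in the field along the coast, are preprocessed and used as the training and test data for the recognition algorithm. The validation experiments describe the network configuration and compare the recognition results of the CNN with those of other classifiers. The results show that the CNN achieves very good recognition performance.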

Radio signal recognition based on image deep learning   Total citations: 1, self-citations: 0, citations by others: 1
A new technical idea is proposed that uses image deep learning to solve the problem of radio signal recognition. First, the radio signal is transformed into a two-dimensional picture, which turns radio signal recognition into an object detection problem in the field of image recognition. Advanced results from image recognition are then used to make radio signal recognition more intelligent and more capable in complex electromagnetic environments. Based on this idea, a novel radio signal recognition algorithm named RadioImageDet is proposed. The experimental results show that the algorithm can effectively identify the waveform types and the time/frequency coordinates of radio signals. After training and testing on a self-collected dataset with 12 types and 4,740 samples, the accuracy reaches 86.04% and the mAP reaches 77.72, while the detection time is only 33 ms on a mid-range desktop computer.

In this paper, we present a novel and effective feature extraction technique for face recognition. The proposed technique, which we call Kernel Discriminant Embedding (KDE), combines the kernel trick with Graph Embedding and Fisher's criterion. It projects the original face samples onto a low-dimensional subspace in which, following Fisher's criterion, the within-class scatter of the face samples is minimized and the between-class scatter is maximized. Applying the kernel trick and the Graph Embedding criterion reveals the underlying structure of the data. Our experimental results on face recognition using the ORL, FRGC and FERET databases validate the effectiveness of KDE for face feature extraction.

Dimension reduction is an important research area in pattern recognition when dealing with high-dimensional data. In this paper, a novel supervised dimension reduction approach is introduced for classification. The advantages of using local as well as global pattern information are examined within the maximum margin criterion framework. Comparative experimental results in object recognition, handwritten digit recognition, and hyperspectral image classification are presented. According to these results, the proposed method can be a valuable choice for dimension reduction in applications where training samples are difficult to obtain.

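To handle non-rigid variations in face images, such as rotation, pose, and expression changes, this paper proposes a sparse-representation face recognition algorithm based on Dense SIFT Feature Alignment (DSFA), where SIFT stands for scale-invariant feature transform. The algorithm has two steps: the DSFA method first aligns the training and testing samples, and an improved sparse representation model then performs face recognition. To speed up the DSFA step, a coarse-to-fine hierarchical alignment mechanism is also designed. Experimental results show that the proposed method achieves the highest recognition accuracy on three typical datasets: ORL, AR and LFW. Compared with the traditional sparse representation method, it improves recognition accuracy by 4.3% on average and raises recognition efficiency by a factor of about 6.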

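A modified hidden Markov model training method based on the maximum relative margin is proposed. To overcome the limitations of the traditional Baum-Welch training algorithm for hidden Markov models in acoustic target recognition, as well as the insufficient generalization ability of existing discriminative training algorithms, a relative margin is defined using the classical hidden Markov model as the initial model. An optimization problem is then formed by maximizing the minimum relative margin and solved iteratively by gradient descent, yielding the relative-margin-based hidden Markov model...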

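To address inaccurate boundary detection and poor robustness in biomedical entity recognition, this paper proposes a named entity recognition model that combines the pre-trained language model BERT with a span labeling network. The model uses BERT to capture the contextual information of the text and the span labeling network to classify entities and determine their boundaries, which significantly improves recognition accuracy. To strengthen robustness, an adversarial training strategy is introduced that optimizes the model parameters by iteratively training on normal and adversarial samples. Experiments on the CCKS2019 evaluation dataset show that precision, recall and F1 all improve after adversarial training is applied, confirming that adversarial training improves the model's predictive ability and robustness.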

Ma Xiao (马啸), Shao Limin (邵利民), Jin Xin (金鑫), Xu Guanlei (徐冠雷). 《电讯技术》 (Telecommunication Engineering), 2019, 59(8): 869-874
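To address the high resource consumption, low accuracy and reliability, and weak generalization of traditional target recognition methods, a ship target recognition method based on an improved YOLO (You Only Look Once) model is proposed. The YOLO model is simplified into a 10-layer convolutional neural network for automatic feature extraction and classification of ship targets, and transfer learning is introduced during training to prevent overfitting and speed up the training of the model parameters. Experiments on a self-built test set of ship images show that the method correctly identifies three classes of ship targets (aircraft carriers, other warships, and civilian ships) with a recognition accuracy of 93.7% and high recognition efficiency, which verifies the effectiveness of the proposed method.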
