Similar literature
 18 similar documents found (search time: 171 ms)
1.
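To improve the recognition rate and robustness of speaker verification systems under channel variation, a sparse representation classification algorithm based on i-vectors and weighted linear discriminant analysis (WLDA) is proposed. First, the channel compensation and dimensionality reduction provided by WLDA are used to remove channel interference from the i-vectors and to reduce their dimensionality. Next, an over-complete dictionary matrix of training speech samples is built on the i-vector set, and the MAP algorithm is used to solve for the sparse coefficient vector of the test utterance over the dictionary. Finally, the test sample is reconstructed from the sparse coefficient vector, and the target speaker is determined from the reconstruction error. Simulation results verify the effectiveness and feasibility of the algorithm.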
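
The final step of this abstract, choosing the speaker whose dictionary atoms give the smallest reconstruction error, can be sketched as follows. This is only an illustration: the abstract solves for the sparse coefficients with a MAP algorithm, whereas the sketch swaps in a plain greedy orthogonal matching pursuit, and the names `omp`, `src_classify` and the toy dictionary are hypothetical.

```python
import numpy as np

def omp(D, y, n_nonzero):
    """Greedy orthogonal matching pursuit (a stand-in solver, not the paper's MAP step)."""
    residual = y.astype(float).copy()
    support, coef = [], np.zeros(0)
    for _ in range(n_nonzero):
        # Pick the atom most correlated with the current residual.
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k in support:
            break
        support.append(k)
        # Re-fit all selected atoms jointly by least squares.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x = np.zeros(D.shape[1])
    x[support] = coef
    return x

def src_classify(D, labels, y, n_nonzero=2):
    """Return the class whose atoms reconstruct y with the smallest error."""
    x = omp(D, y, n_nonzero)
    errors = {}
    for c in np.unique(labels):
        mask = labels == c
        # Keep only the coefficients that belong to class c.
        errors[int(c)] = float(np.linalg.norm(y - D[:, mask] @ x[mask]))
    return min(errors, key=errors.get), errors

# Toy dictionary: two unit-norm atoms per class (columns), purely illustrative.
s = np.sqrt(1.04)
D = np.array([[1.0, 1.0 / s, 0.0, 0.0],
              [0.0, 0.0,     1.0, 1.0 / s],
              [0.0, 0.2 / s, 0.0, 0.2 / s]])
labels = np.array([0, 0, 1, 1])
label, errors = src_classify(D, labels, np.array([0.05, 1.0, 0.2]))
```

In a real system the columns of `D` would be the channel-compensated training i-vectors, and `labels` would hold the speaker identity of each column.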

2.
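In recent years, as the theory of signal sparsity has attracted growing attention, the sparse representation classifier has been applied to speaker recognition systems as a new classification algorithm. The basic idea of the model is that, as long as the over-complete dictionary is large enough, any test sample can be represented linearly over it. Based on signal sparsity theory, the coefficient vector of the unknown speaker, i.e. the sparse solution, can be obtained by L1-norm minimization. The over-complete dictionary can be regarded as a large database obtained by MAP adaptation of speech feature vectors on a Gaussian mixture model-universal background model (GMM-UBM). With the sparse representation model used as the classifier for speaker identification, experiments on the TIMIT corpus show that the adopted method can greatly improve the performance of a speaker recognition system.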
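
As a rough illustration of obtaining a sparse solution by L1-norm minimization, the sketch below uses iterative soft-thresholding (ISTA) on the penalized form 0.5·||y − Dx||² + λ·||x||₁. The abstract does not say which solver the authors used, so ISTA, the function names and the value of `lam` are all assumptions made for this example.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1: shrink every entry towards zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(D, y, lam=0.1, n_iter=500):
    """Minimise 0.5 * ||y - D x||^2 + lam * ||x||_1 by iterative soft-thresholding."""
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ x - y)           # gradient of the quadratic term
        x = soft_threshold(x - grad / L, lam / L)
    return x

# With an orthonormal dictionary the solution is simply soft_threshold(y, lam).
x_hat = ista(np.eye(3), np.array([1.0, 0.05, -0.5]), lam=0.1)
```

Small entries of `y` are driven exactly to zero, which is the sparsity-promoting behaviour that the L1 penalty is chosen for.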

3.
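To improve the performance of sparse-domain steganography, a sparse-domain steganography algorithm based on image components is proposed. First, two dictionaries are constructed to sparsely represent the piecewise-smooth component (cartoon component) and the texture component of an image, and two ways of constructing the dictionaries are given: one uses existing mathematical models, the other uses adaptive learning with the K-SVD algorithm. The two dictionaries are then combined to sparsely decompose the R, G and B channels of a color image, yielding sparse representation coefficients for both image components. Finally, the secret information is embedded in the non-zero representation coefficients of two of the channels, with the texture-component coefficients chosen preferentially, while the remaining channel is used to store the decomposition path. Experimental results show that the algorithm achieves high visual quality and, compared with other sparse-domain steganography algorithms, offers stronger resistance to steganalysis and better robustness.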

4.
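To improve the speed of sparse-representation-based face recognition and its robustness to image noise, occlusion and corruption, a face recognition algorithm based on an extended sparse representation model and D-KSVD (Discriminative K-SVD) is proposed. A residual vector is added to the original sparse representation model as a coefficient correction vector, which makes the extended model more robust. To address the problem that dictionary learning captures only representational power and no class information, constraint terms on the sparse codes and the classifier parameters are added to dictionary learning, and both are updated simultaneously during learning, so that the dictionary has both good representational power and good discriminative power. Using its sparse coding coefficients for face classification then gives better recognition performance.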

5.
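External factors such as palm position, illumination and the acquisition device affect the recognition rate of palmprint images, and traditional sparse-reconstruction classification methods have high computational complexity. To address these problems, a palmprint recognition method that fuses two-directional two-dimensional principal component analysis ((2D)²PCA) with compressed sensing is proposed, in which the L1-norm minimization reconstruction algorithm is replaced by the classified orthogonal matching pursuit (COMP) algorithm to reduce complexity. First, (2D)²PCA reduces the dimensionality of the palmprint images in both the row and column directions and extracts feature matrices, which serve as the over-complete dictionary for the compressed sensing algorithm. Then COMP solves for the sparse representation of an image over the over-complete dictionary, giving a set of optimal sparse coefficients with which each image is reconstructed. Finally, the classification result is given by the minimum residual between the test image and the reconstructed images of each class. Experiments on the Beijing Jiaotong University palmprint database show that the method combining principal component analysis and compressed sensing effectively reduces computational complexity, is reasonably robust to palmprints with uneven illumination and position changes, and achieves a high palmprint recognition rate.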
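
The row-and-column dimensionality reduction mentioned in this abstract can be sketched with the standard (2D)²PCA construction: one projection matrix from the row-direction scatter, one from the column-direction scatter, and a feature matrix Zᵀ·A·X per image. The function names and the random stand-in images are hypothetical, and the paper's exact preprocessing is not given in the abstract.

```python
import numpy as np

def two_directional_2dpca(images, q, d):
    """Return (Z, X): column- and row-direction projection matrices."""
    A = np.asarray(images, dtype=float)            # shape (M, m, n)
    centered = A - A.mean(axis=0)
    # Row-direction scatter: average of (A_k - mean)^T (A_k - mean), size n x n.
    G_row = np.einsum('kij,kil->jl', centered, centered) / len(A)
    # Column-direction scatter: average of (A_k - mean)(A_k - mean)^T, size m x m.
    G_col = np.einsum('kij,klj->il', centered, centered) / len(A)
    # eigh returns eigenvalues in ascending order, so reverse to get the largest first.
    X = np.linalg.eigh(G_row)[1][:, ::-1][:, :d]   # n x d
    Z = np.linalg.eigh(G_col)[1][:, ::-1][:, :q]   # m x q
    return Z, X

def project(image, Z, X):
    """Reduce an m x n image to a q x d feature matrix."""
    return Z.T @ image @ X

rng = np.random.default_rng(0)
images = rng.normal(size=(10, 8, 6))               # stand-in for palmprint images
Z, X = two_directional_2dpca(images, q=3, d=2)
feature = project(images[0], Z, X)                 # shape (3, 2)
```

Each flattened feature matrix could then serve as one column of the dictionary over which the sparse representation is computed.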

6.
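The application of electrocardiogram (ECG) signals to identity recognition is studied, and a finger-ECG identity recognition and authentication algorithm based on sparse coding over an over-complete dictionary is proposed. In the preprocessing stage, the ECG signal is denoised to remove noise, baseline drift and interference from heart rate variability. In the feature extraction stage, single-cycle ECG signals are extracted to form feature vectors and build a dictionary model, which is trained into a redundant dictionary with K-singular value decomposition (KSVD); each part of the feature vectors is then sparsely coded to obtain its sparse representation over that dictionary. In the classification stage, the resulting sparse coefficient matrix is used to build feature template vectors as feature parameters. Individual identity is output by Euclidean distance matching, which achieves identity recognition and authentication. The algorithm was tested on two finger-ECG databases and achieved a high recognition rate.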

7.
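Sparse image representation over an over-complete dictionary is a new theory of image representation: the redundancy of the over-complete dictionary can effectively capture the various structural features of an image and thus represent it efficiently. SAR image compression is implemented with a sparse-representation method based on an over-complete dictionary. To retain the information needed to represent the image, only the coefficients of the sparse decomposition and their corresponding coordinates need to be stored, which achieves compression. The over-complete dictionary is constructed with the K-SVD algorithm. K-SVD is a learning-based algorithm; because the training samples all come from the image itself, the dictionary approximates the structure of the image more closely and yields a sparse representation. Simulations show that the algorithm is effective for SAR image compression and outperforms the DCT-based JPEG algorithm and the wavelet-based EZW and SPIHT algorithms.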

8.
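For super-resolution reconstruction of a single noisy image, a super-resolution reconstruction model is built on the sparse representation of images over over-complete dictionaries. In this model the low-resolution dictionary is trained directly with the K-SVD algorithm, while the high-resolution dictionary is obtained by approximation from high-resolution image patches and the same-structured representation coefficients under the low-resolution dictionary. Approximate high-resolution patches are obtained by multiplying the high-resolution dictionary by the representation coefficients. To make the reconstruction robust to noise, the final result is computed from the overlapping approximate high-resolution patches using a sparse-representation-based noisy image restoration method. Experimental results show that the model performs well in both subjective visual quality and objective evaluation metrics, and they verify the effectiveness of the model and algorithm.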

9.
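To improve the recognition rate, speed and robustness of face recognition, a face recognition algorithm based on an extended sparse representation model and LC-KSVD (Label Consistent K-SVD) is proposed. To address the problem that dictionary learning captures only representational power and no class information, two changes are made. A residual vector is added to the original sparse representation model as a coefficient correction vector, which makes the extended model more robust. Constraint terms on the sparse codes and the classifier parameters are added to dictionary learning, and both are updated simultaneously during learning, so that the dictionary has both good representational power and good discriminative power. Experimental results show that face recognition based on the extended sparse representation model and LC-KSVD achieves a high recognition rate, a short recognition time and good robustness.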

10.
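This article introduces an image denoising algorithm based on sparse representation that combines a DCT over-complete dictionary with the MOD algorithm. First, the noisy image is divided into small patches, and orthogonal matching pursuit (OMP) is used to sparsely decompose the patches over the initial DCT over-complete dictionary. Then the MOD dictionary learning algorithm is used to update the DCT over-complete dictionary. Finally, this process is repeated to obtain the sparse representation of the image and reconstruct it. Experimental results show that the method denoises the image and outperforms traditional methods in denoising performance.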
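
The dictionary update step can be illustrated with the usual closed form of the Method of Optimal Directions: with the sparse codes X held fixed, the dictionary that minimizes ||Y − D·X||_F is D = Y·X⁺, after which the atoms are rescaled to unit norm. The sketch below assumes the sparse codes have already been computed (for example by OMP); the name `mod_update` and the toy data are hypothetical.

```python
import numpy as np

def mod_update(Y, X):
    """One MOD dictionary update for data Y (dim x samples) and codes X (atoms x samples)."""
    D = Y @ np.linalg.pinv(X)                 # least-squares minimiser of ||Y - D X||_F
    norms = np.linalg.norm(D, axis=0)
    norms[norms == 0] = 1.0                   # leave unused (all-zero) atoms untouched
    # Atoms are renormalised; the next sparse-coding pass recomputes X accordingly.
    return D / norms

# Toy check: data generated from a known unit-norm dictionary and 2-sparse codes.
rng = np.random.default_rng(0)
D_true = rng.normal(size=(5, 4))
D_true /= np.linalg.norm(D_true, axis=0)
X = np.zeros((4, 8))
for j in range(8):
    X[j % 4, j] = 1.0 + j                     # first non-zero coefficient of sample j
    X[(j + 1) % 4, j] = 0.5                   # second non-zero coefficient of sample j
Y = D_true @ X
D_new = mod_update(Y, X)                      # recovers D_true in this noiseless case
```

In a denoising loop of the kind the abstract describes, this update would alternate with the sparse-coding step until the representation stops improving.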

11.
This paper presents a simplified and supervised i-vector modeling approach with applications to robust and efficient language identification and speaker verification. First, by concatenating the label vector and the linear regression matrix at the end of the mean supervector and the i-vector factor loading matrix, respectively, the traditional i-vectors are extended to label-regularized supervised i-vectors. These supervised i-vectors are optimized not only to reconstruct the mean supervectors well but also to minimize the mean square error between the original and the reconstructed label vectors, which makes the supervised i-vectors more discriminative in terms of the label information. Second, factor analysis (FA) is performed on the pre-normalized centered GMM first-order statistics supervector to ensure that each Gaussian component's statistics sub-vector is treated equally in the FA, which reduces the computational cost by a factor of 25 in the simplified i-vector framework. Third, since the entire matrix inversion term in the simplified i-vector extraction depends on only a single variable (the total frame number), we build a global table of the resulting matrices against the log values of the frame numbers. Using this lookup table, each utterance's simplified i-vector extraction is further sped up by a factor of 4 and suffers only a small quantization error. Finally, the simplified version of the supervised i-vector modeling is proposed to enhance both robustness and efficiency. The proposed methods are evaluated on the DARPA RATS dev2 task, the NIST LRE 2007 general task and the NIST SRE 2010 female condition 5 task for noisy channel language identification, clean channel language identification and clean channel speaker verification, respectively.
For language identification on the DARPA RATS task, the simplified supervised i-vector modeling achieved 2%, 16%, and 7% relative equal error rate (EER) reductions on three different feature sets and was more than 100 times faster than the baseline i-vector method on the 120 s task. Similar results were observed on the NIST LRE 2007 30 s task, with a 7% relative average cost reduction. Results also show that the use of Gammatone frequency cepstral coefficients, Mel-frequency cepstral coefficients and spectro-temporal Gabor features in conjunction with shifted-delta-cepstral features improves the overall language identification performance significantly. For speaker verification, the proposed supervised i-vector approach outperforms the i-vector baseline by 12% and 7% relative in terms of EER and norm old minDCF values, respectively.
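
The lookup-table idea in this abstract can be sketched as follows. The abstract only states that the matrix inversion term depends on the total frame number; the specific form (I + n·B)⁻¹ used here, the fixed matrix `B`, the grid limits and the function names are all assumptions made for illustration, not the authors' formulation.

```python
import numpy as np

def build_table(B, n_min, n_max, n_bins):
    """Precompute (I + n * B)^-1 on a grid that is uniform in log(n)."""
    log_grid = np.linspace(np.log(n_min), np.log(n_max), n_bins)
    eye = np.eye(B.shape[0])
    table = np.stack([np.linalg.inv(eye + np.exp(g) * B) for g in log_grid])
    return log_grid, table

def lookup(log_grid, table, n_frames):
    """Return the precomputed inverse whose log frame count is closest to log(n_frames)."""
    idx = int(np.argmin(np.abs(log_grid - np.log(n_frames))))
    return table[idx]

B = np.diag([1.0, 2.0])                    # hypothetical fixed matrix
log_grid, table = build_table(B, n_min=10, n_max=1000, n_bins=3)
approx = lookup(log_grid, table, 120)      # quantised to the grid point at n = 100
```

The table replaces a matrix inversion per utterance with an index lookup, at the price of the quantization error the abstract mentions; a finer grid reduces that error.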

12.
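In speaker verification systems based on total variability factors (i-Vectors), speaker-related discriminative information needs to be further extracted from the i-Vector representation of a speech segment to improve system performance. Drawing on the idea of anchor models, this paper proposes a modeling method based on deep belief networks. The method analyzes and models the complex variability information contained in the i-Vector layer by layer, and extracts the speaker-related information in the form of nonlinear transformations. On the telephone-train/telephone-test data of the NIST SRE 2008 core test, the equal error rates for male and female speech are 4.96% and 6.18%, respectively. Further fusion with a system based on linear discriminant analysis reduces the equal error rates to 4.74% and 5.35%.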

13.
Speaker verification (SV) using the i-vector concept has become the state of the art. In this technique, speakers are projected onto the total variability space and represented by vectors called i-vectors. During testing, the i-vectors of the test speech segment and the claimant are conditioned to compensate for session variability before scoring. An i-vector system can therefore be viewed as two processing blocks: the total variability space and the post-processing module. Several questions arise: (i) which part of the i-vector system plays the major role in speaker verification, the total variability space or the post-processing task; and (ii) is the post-processing module intrinsic to the total variability space? The motivation of this paper is to partially answer these questions by proposing several simpler speaker characterization systems for speaker verification, in which speakers are represented by their speaker characterization vectors (SCVs). The SCVs are obtained by uniform segmentation of the speakers' Gaussian mixture model (GMM) and maximum likelihood linear regression (MLLR) super-vectors. We consider two adaptation approaches for the GMM super-vector: maximum a posteriori and MLLR. As with i-vectors, SCVs are post-processed for session variability compensation during testing. The proposed system shows promising performance compared with the classical i-vector system, which indicates that the post-processing task plays a major role in i-vector-based SV systems and is not intrinsic to the total variability space. All experimental results are reported on the NIST 2008 SRE core condition.

14.
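Sparse representation of vibration signals over an over-complete dictionary is a new research focus for rolling bearing signals. An improved MOD dictionary learning algorithm is proposed and applied to the sparse representation of rolling bearing vibration signals. Building on the MOD (Method of Optimal Directions) training process, the method constructs a segmented, overlapping training matrix and thereby obtains sparser transform coefficients. Compared with DCT, FFT and the unimproved method, the transform coefficients it produces are sparser. When the method is applied to compressed-sensing-based processing of rolling bearing vibration signals, it needs fewer measurements and less computation within the same reconstruction error range.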
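
One plausible reading of a segmented, overlapping training matrix is sketched below: the 1-D vibration signal is cut into fixed-length segments whose start points advance by less than the segment length, and each segment becomes one training column. The abstract does not give the segment length or overlap, so `seg_len`, `hop` and the function name are hypothetical.

```python
import numpy as np

def overlapping_segments(signal, seg_len, hop):
    """Stack overlapping segments of a 1-D signal as columns of a training matrix."""
    signal = np.asarray(signal, dtype=float)
    if len(signal) < seg_len:
        raise ValueError("signal is shorter than one segment")
    starts = range(0, len(signal) - seg_len + 1, hop)
    # Consecutive columns share seg_len - hop samples when hop < seg_len.
    return np.stack([signal[s:s + seg_len] for s in starts], axis=1)

Y = overlapping_segments(np.arange(10), seg_len=4, hop=2)   # shape (4, 4)
```

Compared with non-overlapping segmentation (`hop == seg_len`), overlap yields more training columns from the same recording, which gives the dictionary learner more examples to fit.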

15.
In the i-vector/probabilistic linear discriminant analysis (PLDA) technique, the PLDA backend classifier is modelled on i-vectors. PLDA defines an i-vector subspace that compensates for unwanted variability and helps to discriminate among speaker-phrase pairs. The channel or session variability manifested in i-vectors is known to be nonlinear in nature. PLDA training, however, assumes the variability to be linearly separable, which causes a loss of important discriminating information. In addition, the i-vector estimation itself is known to be poor in the case of short utterances. This paper attempts to address these issues using a simple hierarchy-based system. A modified fuzzy-clustering technique is employed to divide the feature space into more characteristic feature subspaces using vocal source features. A separate i-vector/PLDA model is then trained for each of the subspaces. The sparser alignment owing to the subspace-specific universal background model and the relatively reduced dimensions of variability in individual subspaces help to train more effective i-vector/PLDA models. Vocal source features are also complementary to mel frequency cepstral coefficients, which are transformed into i-vectors using the mixture model technique. As a consequence, vocal source features and i-vectors tend to carry complementary information. Using vocal source features for classification in a hierarchy tree may therefore help to differentiate some of the speaker-phrase classes that are otherwise not easily discriminable based on i-vectors. The proposed technique has been validated on Part 1 of the RSR2015 database, and it shows a relative equal error rate reduction of up to 37.41% with respect to the baseline i-vector/PLDA system.

16.
Constructing a good dictionary is the key to a successful image fusion technique in sparsity-based models. An efficient dictionary learning method based on joint patch clustering is proposed for multimodal image fusion. To construct an over-complete dictionary with a sufficient number of useful atoms for representing a fused image, which conveys image information from different sensor modalities, all patches from the different source images are clustered together according to their structural similarities. To keep the dictionary compact but informative, only a few principal components that effectively describe each joint patch cluster are selected and combined to form the over-complete dictionary. Finally, sparse coefficients are estimated by a simultaneous orthogonal matching pursuit algorithm to represent the multimodal images with the common dictionary learned by the proposed method. Experimental results with various pairs of source images validate the effectiveness of the proposed method for the image fusion task.

17.
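Yang Meng, Zhang Gong. Journal of Image and Graphics (中国图象图形学报), 2012, 17(11): 1439-1443
A SAR image filtering algorithm based on a sparse optimization model is proposed. The algorithm is built on sparse representation over an over-complete dictionary and has strong data sparsity and robust modeling assumptions. First, according to the structural characteristics of SAR images, a multi-objective sparse optimization model is established using regularization. Then the transform coefficients are sparsely optimized over a redundant dictionary, with the over-complete dictionary constructed from the redundant dictionary together with wavelets, which capture point singularities, and shearlets, which capture line singularities. Finally, by solving the optimization problem, the mean intensity of the resolution cells of the SAR scene is reconstructed, which realizes SAR image filtering. Experimental results show that the algorithm suppresses SAR speckle noise well and also enhances texture detail in the filtered image.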

18.
The use of sparse representation in signal and image processing has gradually increased over the past few years. Obtaining an over-complete dictionary from a set of signals allows us to represent these signals as a sparse linear combination of dictionary atoms. By considering the correlation among multi-polarimetric synthetic aperture radar (SAR) images, a new compression scheme for multi-polarimetric SAR images based on sparse representation is proposed. The multilevel dictionary is learned iteratively in the 9/7 wavelet domain using a single-channel SAR image, and the other channels are compressed by sparse approximation, also in the 9/7 wavelet domain, followed by entropy coding of the sparse coefficients. The experimental results are compared with two state-of-the-art compression methods: SPIHT (set partitioning in hierarchical trees) and JPEG2000. Because of the efficiency of the coding scheme, our method outperforms both SPIHT and JPEG2000 in terms of peak signal-to-noise ratio (PSNR) and edge preservation index (EPI).
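
For reference, the PSNR figure used in this comparison follows the standard definition PSNR = 10·log10(peak² / MSE). The sketch below assumes 8-bit imagery with a peak value of 255, which the abstract does not state; the function name is hypothetical.

```python
import numpy as np

def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    reference = np.asarray(reference, dtype=float)
    test = np.asarray(test, dtype=float)
    mse = np.mean((reference - test) ** 2)
    if mse == 0:
        return float("inf")                 # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

# An image that differs from the reference by one grey level everywhere has MSE = 1.
value = psnr(np.zeros((4, 4)), np.ones((4, 4)))   # about 48.13 dB
```

Higher values mean the decompressed image is closer to the original, so a codec that outperforms another in PSNR at the same bit rate loses less signal energy.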
