Similar Literature
20 similar results found (search time: 15 ms)
1.
The use of a speech recognition system over telephone channels, or with different microphones, requires channel equalisation. In speech recognition, the speech model provides a bank of statistical information that can be used in the channel identification and equalisation process. The authors consider HMM-based channel equalisation, and present results demonstrating that substantial improvement can be obtained through the equalisation process. An alternative, for speech recognition, is to use a feature set that is more robust to channel distortion. Channel distortions result in an amplitude tilt of the speech cepstrum, so differential cepstral features provide a measure of immunity to them. In particular, the cepstral-time feature matrix, in addition to providing a framework for representing speech dynamics, can be made robust to channel distortions. The authors present results demonstrating that a major advantage of cepstral-time matrices is their channel-insensitive character.
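A constant channel offset in the log-spectral domain appears as an additive constant on every cepstral frame, which a differential (delta) feature cancels. The following is a minimal NumPy sketch of the standard delta-cepstral regression (a generic illustration, not the authors' cepstral-time matrix itself); `delta_features` and its window half-width `K` are illustrative names:

```python
import numpy as np

def delta_features(cepstra, K=2):
    """Delta (differential) cepstral features: a regression over +/-K
    neighbouring frames. A constant per-utterance channel offset added
    to every frame cancels out in the differences."""
    T, D = cepstra.shape
    num = np.zeros((T, D))
    den = 2 * sum(k * k for k in range(1, K + 1))
    padded = np.pad(cepstra, ((K, K), (0, 0)), mode="edge")
    for k in range(1, K + 1):
        # frames t+k minus frames t-k, weighted by k
        num += k * (padded[K + k:K + k + T] - padded[K - k:K - k + T])
    return num / den

# A fixed channel offset shifts the static cepstra but not the deltas
rng = np.random.default_rng(0)
c = rng.normal(size=(50, 12))
offset = np.full(12, 3.0)
d_clean = delta_features(c)
d_shift = delta_features(c + offset)
assert np.allclose(d_clean, d_shift)
```

The final assertion is the channel-immunity property the abstract describes: the simulated channel offset leaves the differential features unchanged.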

2.
The speech signal is decomposed through adapted local trigonometric transforms. The decomposed signal is partitioned into M uniform sub-bands for each subinterval, and the energy of each sub-band is used as a speech feature. This feature is applied to vector quantisation and the hidden Markov model. The new speech feature shows a slightly better recognition rate than the cepstrum for speaker-independent speech recognition, and also a lower standard deviation between speakers.
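As a rough illustration of sub-band energy features, the sketch below splits a frame's magnitude spectrum into M uniform bands and sums the energy in each. The paper's adapted local trigonometric transform is replaced here by a plain FFT, so this is only a stand-in for the decomposition, not the proposed method:

```python
import numpy as np

def subband_energies(frame, M=8):
    """Energy in M uniform sub-bands of a frame's power spectrum,
    used as a speech feature vector (one value per band)."""
    spec = np.abs(np.fft.rfft(frame)) ** 2
    bands = np.array_split(spec, M)
    return np.array([b.sum() for b in bands])

# A low-frequency tone should concentrate its energy in band 0
frame = np.sin(2 * np.pi * 5 * np.arange(256) / 256)
e = subband_energies(frame, M=8)
assert e.shape == (8,)
assert np.argmax(e) == 0
```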

3.
Fast feature extraction method for robust face verification
Sanderson, C., Paliwal, K.K. Electronics Letters, 2002, 38(25): 1648-1650.
A feature extraction technique for face verification is proposed. It utilises polynomial coefficients derived from 2D discrete cosine transform (DCT) coefficients of neighbouring blocks. Experimental results suggest that the technique is more robust against illumination direction changes than 2D Gabor wavelets, 2D DCT and eigenface methods. Moreover, compared to Gabor wavelets, the proposed technique is over 80 times quicker to compute.
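A hedged sketch of the block-DCT stage: each non-overlapping 8x8 block is transformed with an orthonormal 2D DCT and a low-frequency corner of the coefficients kept as the block's feature. The polynomial fit over neighbouring blocks described in the abstract is omitted, and all names are illustrative:

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix (NumPy-only, no SciPy needed)."""
    k = np.arange(n)
    C = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    C[0] /= np.sqrt(n)
    C[1:] *= np.sqrt(2 / n)
    return C

def block_dct_features(image, block=8, n_coeffs=15):
    """Low-frequency 2D DCT coefficients of each non-overlapping block.
    The paper additionally fits polynomials over coefficients of
    neighbouring blocks; that step is not reproduced here."""
    C = dct_matrix(block)
    h, w = image.shape
    feats = []
    for i in range(0, h - block + 1, block):
        for j in range(0, w - block + 1, block):
            coeffs = C @ image[i:i + block, j:j + block] @ C.T
            feats.append(coeffs[:4, :4].ravel()[:n_coeffs])
    return np.array(feats)

img = np.random.default_rng(1).random((64, 64))
f = block_dct_features(img)
assert f.shape == (64, 15)  # 8x8 blocks, 15 coefficients each
```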

4.
To exploit the supra-segmental nature of Mandarin tones, this article proposes a feature extraction method for hidden Markov model (HMM) based tone modelling. The method uses linear transforms to project F0 (fundamental frequency) features of neighbouring syllables as compensations, and adds them to the original F0 features of the current syllable. The transforms are discriminatively trained using an objective function termed "minimum tone error", a smooth approximation of tone recognition accuracy. Experiments show that the new tonal features achieve a 3.82% improvement in tone recognition rate over the baseline, a maximum-likelihood-trained HMM on the normal F0 features. Further experiments show that discriminative HMM training on the new features is 8.78% better than the baseline.

5.
The robustness against noise, outliers, and corruption is a crucial issue in image feature extraction. To address this concern, this paper proposes a discriminative low-rank embedding image feature extraction algorithm. Firstly, to enhance the discriminative power of the extracted features, a discriminative term is introduced using label information, obtaining global discriminative information and learning an optimal projection matrix for data dimensionality reduction. Secondly, manifold constraints are incorporated, unifying low-rank embedding and manifold constraints into a single framework to capture the geometric structure of local manifolds while considering both local and global information. Finally, test samples are projected into a lower-dimensional space for classification. Experimental results demonstrate that the proposed method achieves classification accuracies of 95.62%, 95.22%, 86.38%, and 86.54% on the ORL, CMUPIE, AR, and COIL20 datasets, respectively, outperforming dimensionality reduction-based image feature extraction algorithms.

6.
Discriminative metric design for robust pattern recognition
Motivated by the development of discriminative feature extraction (DFE), many researchers have come to realize the importance of designing a front-end feature extraction unit with an appropriate link to back-end classification. This paper proposes an advanced formalization of DFE, which we call discriminative metric design (DMD), and elaborates on an exemplar implementation using a simple, linear feature transformation matrix. The resulting DMD implementation is shown to have a close relationship to various discriminative pattern recognizers, including artificial neural networks. The utility of the proposed method is clearly demonstrated in speech pattern recognition experiments.

7.
At present, in computer vision, many supervised learning methods have achieved good results on person re-identification, an important branch of the field. However, such methods require manual annotation of the training data; for large datasets the annotation cost is high, and fully pairwise-labelled data are hard to obtain, so unsupervised learning becomes the necessary choice. In addition, global features emphasise the discriminability of the feature space as a whole, while local features help highlight the discriminability of individual body parts. Therefore, an unsupervised learning framework based on global and local features is adopted: a global loss function and a local repulsion loss function jointly drive discriminative feature learning, and a ResNet-50 convolutional neural network (CNN) is optimised jointly with the relations between samples, finally achieving person re-identification. Extensive experiments verify the superiority of the proposed method on the person re-identification task.

8.
Luo Zongyu, Yan Hualin. Laser & Infrared, 2020, 50(11): 1313-1321.
The micro-Doppler effect in radar echoes reflects a target's geometric structure and motion characteristics; as a signature unique to the target, it can be used to determine the target's class and attributes. Owing to its shorter wavelength, lidar offers better micro-Doppler measurement precision than microwave radar. To address the problem that the micro-Doppler modulation energy in echoes from airborne aircraft targets (helicopters, propeller aircraft) is weak and easily corrupted by noise, a noise-robust laser micro-Doppler feature extraction method based on PCA-CLEAN is proposed. First, PCA is used to suppress noise in the echo signal; then the CLEAN algorithm separates the fuselage component from the micro-motion components; finally, three-dimensional features reflecting the micro-motion differences between targets are extracted for classification. Experimental results on simulated and measured data show that the proposed method achieves good classification performance, along with good noise suppression at low signal-to-noise ratios.
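PCA noise suppression can be illustrated by truncating the SVD of a matrix of echo snapshots, keeping only the dominant components. This is a generic sketch of rank-truncation denoising, not the paper's exact PCA-CLEAN pipeline; the rank and signal model below are illustrative:

```python
import numpy as np

def pca_denoise(X, rank):
    """Suppress noise by keeping only the top `rank` principal
    components of the data matrix (rows = echo snapshots)."""
    mean = X.mean(axis=0)
    U, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    s[rank:] = 0
    return U @ np.diag(s) @ Vt + mean

# Sinusoidal echoes with random phase span a rank-2 subspace
rng = np.random.default_rng(2)
t = np.linspace(0, 1, 200)
clean = np.array([np.sin(2 * np.pi * 3 * t + p)
                  for p in rng.uniform(0, 2 * np.pi, 40)])
noisy = clean + 0.5 * rng.normal(size=clean.shape)
den = pca_denoise(noisy, rank=2)
err_noisy = np.linalg.norm(noisy - clean)
err_den = np.linalg.norm(den - clean)
assert err_den < err_noisy  # truncation removed most of the noise
```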

9.
Objective: Feature extraction and matching for blurred images currently suffer from difficult feature extraction, low matching rates, and weak robustness to noise and scale change. Methods: A high-accuracy matching algorithm is proposed that combines the SIFT algorithm with an improved centre-symmetric local binary pattern (CS-LBP). First, SIFT extracts features and generates high-dimensional descriptors. The improved CS-LBP then reduces the dimensionality of these descriptors; feature detection is performed on local feature regions of the reduced descriptors, texture feature images and information-distribution histograms are generated, the information content of feature points in each region is tallied, and a detection threshold is set. Feature points meeting the information requirement are retained, coarse matching is performed with the Hausdorff distance, and finally RANSAC removes mismatches to improve accuracy and robustness. Results: Tests show the proposed algorithm is effective: it discriminates blurred images well, is robust to scale change, and strongly suppresses noise and illumination change. Conclusion: The proposed robust feature matching algorithm for blurred images takes account of the strengths and weaknesses of traditional matching algorithms, offers a new direction for improvement, and is markedly more stable and accurate than the SIFT and LBP algorithms.
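The centre-symmetric LBP that this abstract builds on compares only the four centre-symmetric neighbour pairs of the 8-neighbourhood, yielding a 4-bit (16-value) code per pixel instead of classic LBP's 8-bit code, which is what makes it useful for descriptor dimensionality reduction. A minimal sketch of the plain (unimproved) CS-LBP, with illustrative names and a zero threshold:

```python
import numpy as np

def cs_lbp(image, threshold=0.0):
    """Centre-symmetric LBP: compare the 4 centre-symmetric neighbour
    pairs of each pixel's 8-neighbourhood, giving a 4-bit code."""
    p = np.pad(image.astype(float), 1, mode="edge")
    h, w = image.shape
    # offsets into the padded image: (0,1)=N, (2,1)=S, etc.
    pairs = [((0, 1), (2, 1)),   # N vs S
             ((1, 2), (1, 0)),   # E vs W
             ((0, 2), (2, 0)),   # NE vs SW
             ((0, 0), (2, 2))]   # NW vs SE
    code = np.zeros((h, w), dtype=int)
    for bit, ((r1, c1), (r2, c2)) in enumerate(pairs):
        a = p[r1:r1 + h, c1:c1 + w]
        b = p[r2:r2 + h, c2:c2 + w]
        code |= ((a - b) > threshold).astype(int) << bit
    return code

img = np.arange(25).reshape(5, 5).astype(float)
codes = cs_lbp(img)
assert codes.shape == (5, 5)
assert codes.max() < 16  # only 4 bits per pixel
```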

10.
With the development of multimedia technology, fine-grained image retrieval has gradually become a new hot topic in computer vision, but its accuracy and speed are limited by the weak discriminative power of high-dimensional real-valued embeddings. To solve this problem, we propose an end-to-end framework named DFMH (Discriminative Feature Mining Hashing), which consists of DFEM (Discriminative Feature Extracting Module) and SHCM (Semantic Hash Coding Module). Specifically, DFEM explores more discriminative local regions by attention drop and obtains finer local feature expression by attention re-sampling. SHCM generates high-quality hash codes by combining a quantization loss and a bit-balance loss. Validated by extensive experiments and ablation studies, our method consistently outperforms both state-of-the-art generic retrieval methods and fine-grained retrieval methods on three datasets: CUB Birds, Stanford Dogs and Stanford Cars.

11.
Transforming an original image into a high-dimensional (HD) feature has been proven to be effective in classifying images. This paper presents a novel feature extraction method utilizing the HD feature space to improve the discriminative ability for face recognition. We observed that the local binary pattern can be decomposed into bit-planes, each of which has scale-specific directional information of the face image. Each bit-plane not only has the inherent local-structure of the face image but also has an illumination-robust characteristic. By concatenating all the decomposed bit-planes, we generate an HD feature vector with an improved discriminative ability. To reduce the computational complexity while preserving the incorporated local structural information, a supervised dimension reduction method, the orthogonal linear discriminant analysis, is applied to the HD feature vector.  Extensive experimental results show that existing classifiers with the proposed feature outperform those with other conventional features under various illumination, pose, and expression variations.
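The bit-plane idea can be sketched directly: compute classic 8-neighbour LBP codes, split them into 8 binary planes, and concatenate the planes into one high-dimensional vector. Because LBP compares each neighbour to the centre pixel, the feature is unchanged by a constant illumination offset, which the last assertion checks; the function names are illustrative and the supervised dimension-reduction step is omitted:

```python
import numpy as np

def lbp_codes(image):
    """Classic 8-neighbour LBP code per pixel (edge-padded borders)."""
    p = np.pad(image.astype(float), 1, mode="edge")
    h, w = image.shape
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2),
               (2, 2), (2, 1), (2, 0), (1, 0)]  # 8 neighbours
    code = np.zeros((h, w), dtype=np.uint8)
    for bit, (r, c) in enumerate(offsets):
        code |= (p[r:r + h, c:c + w] >= image).astype(np.uint8) << bit
    return code

def bitplane_feature(image):
    """Decompose the LBP codes into 8 bit-planes and concatenate them
    into one high-dimensional binary feature vector."""
    code = lbp_codes(image)
    planes = [(code >> b) & 1 for b in range(8)]
    return np.concatenate([pl.ravel() for pl in planes])

img = np.random.default_rng(3).integers(0, 256, (16, 16))
f = bitplane_feature(img)
assert f.shape == (16 * 16 * 8,)
# illumination-robust: a constant brightness offset leaves it unchanged
assert np.array_equal(f, bitplane_feature(img + 10))
```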

12.
Constructing the bag-of-features model from space-time interest points (STIPs) has been successfully utilized for human action recognition. However, how to eliminate the large number of irrelevant STIPs when representing a specific action in realistic scenarios, as well as how to select discriminative codewords for an effective bag-of-features model, still need to be further investigated. In this paper, we propose to select more representative codewords based on our pruned interest points algorithm so as to reduce computational cost as well as improve recognition performance. By taking human perception into account, an attention-based saliency map is employed to choose salient interest points which fall into salient regions, since visual saliency can provide strong evidence for the location of acting subjects. After salient interest points are identified, each human action is represented with the bag-of-features model. In order to obtain more discriminative codewords, an unsupervised codeword selection algorithm is utilized. Finally, the Support Vector Machine (SVM) method is employed to perform human action recognition. Comprehensive experimental results on the widely used and challenging Hollywood-2 Human Action (HOHA-2) dataset and YouTube dataset demonstrate that our proposed method is computationally efficient while achieving improved performance in recognizing realistic human actions.
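The bag-of-features stage itself is standard: learn a codebook over local descriptors (plain k-means here, with random vectors standing in for STIP descriptors), then represent an action clip as the normalised histogram of nearest codewords. The saliency-based pruning and unsupervised codeword selection from the paper are not reproduced in this sketch:

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Plain k-means to learn a codebook from local descriptors."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        d = ((X[:, None] - centers[None]) ** 2).sum(-1)
        labels = d.argmin(1)
        for j in range(k):
            pts = X[labels == j]
            if len(pts):                 # guard against empty clusters
                centers[j] = pts.mean(0)
    return centers

def bof_histogram(descriptors, centers):
    """Bag-of-features: quantise each descriptor to its nearest
    codeword and return the normalised codeword histogram."""
    d = ((descriptors[:, None] - centers[None]) ** 2).sum(-1)
    hist = np.bincount(d.argmin(1), minlength=len(centers)).astype(float)
    return hist / hist.sum()

rng = np.random.default_rng(6)
desc = rng.normal(size=(500, 16))        # stand-in for STIP descriptors
codebook = kmeans(desc, k=8)
h = bof_histogram(rng.normal(size=(100, 16)), codebook)
assert h.shape == (8,)
assert abs(h.sum() - 1) < 1e-9
```

The resulting histogram `h` is the fixed-length vector that would be fed to the SVM classifier mentioned in the abstract.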

13.
There are many factors to consider in carrying out a hyperspectral data classification. Perhaps chief among them are class training sample size, dimensionality, and distribution separability. The intent of this study is to design a classification procedure that is robust and maximally effective, but which provides the analyst with significant assists, thus simplifying the analyst's task. The result is a quadratic mixture classifier based on Mixed-LOOC2 regularized discriminant analysis and nonparametric weighted feature extraction. This procedure has the advantage of providing improved classification accuracy compared to typical previous methods but requires minimal need to consider the factors mentioned above. Experimental results demonstrating these properties are presented.

14.
Unsupervised feature learning has drawn increasing attention, especially for visual representation, in recent years. Traditional feature learning approaches assume that there is little noise in the training set and that the number of samples is large relative to their dimensionality. Unfortunately, these assumptions are violated in most visual representation scenarios, and many feature learning approaches then fail to extract the important features. To this end, we propose a Robust Elastic Net (REN) approach to handle these problems. Our contributions are twofold. First, a novel feature learning approach is proposed that extracts features by a weighted elastic net: a distribution-induced weight function leverages the importance of different samples, reducing the effect of outliers. Moreover, the REN feature learning approach can handle High Dimension, Low Sample Size (HDLSS) issues. Second, a REN classifier is proposed for object recognition, and can be used for generic visual representation, including features from REN feature extraction. By doing so, we reduce the effect of outliers in the samples. We validate the proposed REN feature learning and classifier on face recognition and background reconstruction. The experimental results show the robustness of the proposed approach to both corrupted/occluded samples and HDLSS issues.

15.
A new cepstrum normalisation method is proposed which can be used to compensate for distortion caused by additive noise. Conventional methods only compensate for the deviation of the cepstral mean and/or variance. However, deviations of higher order moments also exist in noisy speech signals. The proposed method normalises the cepstrum up to its third-order moment, providing closer probability density functions between clean and noisy cepstra than is possible using conventional methods. From the speaker-independent isolated-word recognition experiments, it is shown that the proposed method gives improved performance compared with that of conventional methods, especially in heavy noise environments.
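One simple way to normalise a cepstral track beyond its mean and variance is rank-based Gaussianisation, which maps each coefficient dimension onto sorted standard-normal draws and so matches all moments of N(0, 1), including the third. This is a stand-in illustrating moment normalisation, not a reproduction of the paper's method; all names are illustrative:

```python
import numpy as np

def moment_normalise(ceps, rng):
    """Per-dimension rank-based Gaussianisation: each cepstral
    coefficient track is mapped onto sorted standard-normal samples,
    matching mean, variance and higher-order moments of N(0, 1)."""
    out = np.empty_like(ceps, dtype=float)
    for d in range(ceps.shape[1]):
        ranks = np.argsort(np.argsort(ceps[:, d]))
        out[:, d] = np.sort(rng.normal(size=ceps.shape[0]))[ranks]
    return out

rng = np.random.default_rng(4)
noisy = rng.exponential(size=(5000, 12))   # strongly skewed, like noisy cepstra
result = moment_normalise(noisy, rng)
m, v = result.mean(axis=0), result.var(axis=0)
skew = ((result - m) ** 3).mean(axis=0) / v ** 1.5
assert np.all(np.abs(m) < 0.1)        # mean normalised
assert np.all(np.abs(v - 1) < 0.1)    # variance normalised
assert np.all(np.abs(skew) < 0.2)     # third-order moment normalised
```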

16.
In this paper, a feature extraction scheme for a general type of non-stationary time series is described. A non-stationary time series is one in which the statistics of the process are a function of time; this time dependency makes it impossible to utilize standard globally derived statistical attributes such as autocorrelations, partial correlations, and higher order moments as features. In order to overcome this difficulty, the time series vectors are considered within a finite-time interval and are modeled as time-varying autoregressive (AR) processes. The AR coefficients that characterize the process are functions of time that may be represented by a family of basis vectors. A novel Bayesian formulation is developed that allows the model order of a time-varying AR process, as well as the form of the family of basis vectors used in the representation of each of the AR coefficients, to be determined. The corresponding basis coefficients are then invariant over the time window and, since they directly relate to the time-varying AR coefficients, are suitable features for discrimination. Results illustrate the effectiveness of the method.
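A minimal sketch of time-varying AR fitting with a basis expansion: each AR coefficient a_i(t) is written as a sum of polynomial basis functions of time, and the time-invariant basis weights (the features) are found by least squares. This omits the paper's Bayesian model-order selection; a stationary AR(1) signal is used as a sanity check, and all names are illustrative:

```python
import numpy as np

def tvar_fit(x, p=2, q=2):
    """Fit a time-varying AR(p) model whose coefficients are linear
    combinations of q polynomial basis functions of time. The basis
    weights b[i, k] are time-invariant and serve as the features."""
    n = len(x)
    t = np.linspace(0, 1, n)
    basis = np.vstack([t ** k for k in range(q)])       # q x n
    rows, y = [], []
    for m in range(p, n):
        # regressors: x[m-i] * f_k(t[m]) for each lag i and basis k
        rows.append([x[m - i] * basis[k, m]
                     for i in range(1, p + 1) for k in range(q)])
        y.append(x[m])
    b, *_ = np.linalg.lstsq(np.array(rows), np.array(y), rcond=None)
    return b.reshape(p, q)   # the invariant basis-weight features

# Sanity check: a stationary AR(1) signal with coefficient 0.8 should
# recover ~0.8 in the constant part of a_1(t)
rng = np.random.default_rng(5)
x = np.zeros(2000)
for m in range(1, 2000):
    x[m] = 0.8 * x[m - 1] + rng.normal()
b = tvar_fit(x, p=1, q=2)
assert abs(b[0, 0] - 0.8) < 0.1
```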

17.
Feature extraction has been an important research topic in pattern classification and has been studied extensively by many researchers. Most conventional feature extraction methods are performed using a criterion function defined between two classes or a global function. Although these methods work relatively well in most cases, they are generally not optimal in any sense for multiclass problems. In order to address this problem, the authors propose a method to optimize feature extraction for multiclass problems. The authors first investigate the distribution of classification accuracies of multiclass problems in the feature space and find that there exist much better feature sets that the conventional feature extraction algorithms fail to find. Then the authors propose an algorithm that finds such features. Experiments with remotely sensed data show that the proposed algorithm consistently provides better performance compared with the conventional feature extraction algorithms.

18.
Nonparametric weighted feature extraction for classification
In this paper, a new nonparametric feature extraction method is proposed for high-dimensional multiclass pattern recognition problems. It is based on a nonparametric extension of scatter matrices. There are at least two advantages to using the proposed nonparametric scatter matrices. First, they are generally of full rank. This provides the ability to specify the number of extracted features desired and to reduce the effect of the singularity problem. This is in contrast to parametric discriminant analysis, which usually only can extract L-1 (number of classes minus one) features. In a real situation, this may not be enough. Second, the nonparametric nature of scatter matrices reduces the effects of outliers and works well even for nonnormal datasets. The new method provides greater weight to samples near the expected decision boundary. This tends to provide for increased classification accuracy.

19.
With the deepening of neural network research, object detection has developed rapidly in recent years, and video object detection methods have gradually attracted scholars' attention, especially frameworks combining multiple object tracking and detection. Most current works build the paradigm for multiple object tracking and detection through multi-task learning. Unlike others, a multi-level temporal feature fusion structure is proposed in this paper to improve the framework's performance by exploiting the constraint of video temporal consistency. To train the temporal network end-to-end, a feature exchange training strategy is put forward for training the temporal feature fusion structure efficiently. The proposed method is tested on several acknowledged benchmarks, and encouraging results are obtained compared with a well-known joint detection and tracking framework. The ablation experiment identifies a good position for temporal feature fusion.

20.
Although the continuous hidden Markov model (CHMM) technique seems to be the most flexible and complete tool for speech modelling, it is not always used for the implementation of speech recognition systems because of several problems related to training and computational complexity. Thus, other simpler types of HMMs, such as discrete (DHMM) or semicontinuous (SCHMM) models, are commonly utilised with very acceptable results; moreover, the superiority of continuous models over these types of HMMs is not clear. The authors' group has previously introduced the multiple vector quantisation (MVQ) technique, the main feature of which is the use of one separate VQ codebook for each recognition unit. The MVQ technique applied to DHMM models generates a new HMM modelling (basic MVQ models) that allows the recognition dynamics to incorporate the input-sequence information wasted by discrete models in the VQ process. The authors propose a new variant of HMM models that arises from the idea of applying MVQ to SCHMM models. These are SCMVQ-HMM (semicontinuous multiple vector quantisation HMM) models, which use one VQ codebook per recognition unit and several quantisation candidates for each input vector. It is shown that SCMVQ modelling is formally the closest to CHMM, while requiring even less computation than SCHMMs. After studying several implementation issues of the MVQ technique, such as which type of probability density function should be used, the authors show the superiority of SCMVQ models over other types of HMM models such as DHMMs, SCHMMs or the basic MVQs.


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号