19 similar documents found; search took 670 ms
1.
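First, a non-parametric independent component analysis (ICA) method is used to extract a set of time-domain basis functions that characterizes a speaker's audio; a speech signal can be expressed as a linear combination of these basis functions. Each recognizable speaker corresponds to a distinct set of basis functions, and for audio from a given speaker, only that speaker's own basis set makes the components of the coefficient vector maximally independent (that is, minimizes their mutual information). For the audio to be identified, a coefficient vector is computed with each known speaker's time-domain basis set, and the mutual information among the components of each coefficient vector is calculated. The speaker whose basis set yields the minimum mutual information is the recognition result. Experiments show that a high recognition rate can be achieved even with very little test data.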
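The minimum-mutual-information decision rule in this abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the signals are two-dimensional, each "speaker" is a hypothetical 2x2 mixing matrix whose inverse stands in for the learned basis set, and mutual information is estimated with a simple equal-width histogram rather than the paper's non-parametric ICA machinery.

```python
import math
import random


def invert_2x2(m):
    """Invert a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def apply(matrix, samples):
    """Apply a 2x2 matrix to every 2-D sample."""
    return [(matrix[0][0] * x + matrix[0][1] * y,
             matrix[1][0] * x + matrix[1][1] * y) for x, y in samples]


def histogram_mi(pairs, bins=8):
    """Histogram estimate (in nats) of the mutual information between two components."""
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    x_lo, x_hi, y_lo, y_hi = min(xs), max(xs), min(ys), max(ys)

    def bin_index(v, lo, hi):
        if hi == lo:
            return 0
        return min(int((v - lo) / (hi - lo) * bins), bins - 1)

    joint, px, py = {}, [0] * bins, [0] * bins
    for x, y in pairs:
        i, j = bin_index(x, x_lo, x_hi), bin_index(y, y_lo, y_hi)
        joint[(i, j)] = joint.get((i, j), 0) + 1
        px[i] += 1
        py[j] += 1
    n = len(pairs)
    return sum(c / n * math.log(c * n / (px[i] * py[j]))
               for (i, j), c in joint.items())


def identify(samples, unmixing_by_speaker):
    """Return the speaker whose basis gives the least dependent coefficients."""
    scores = {name: histogram_mi(apply(w, samples))
              for name, w in unmixing_by_speaker.items()}
    return min(scores, key=scores.get), scores


# Hypothetical speakers: independent sources mixed by speaker-specific matrices.
rng = random.Random(0)
sources = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(2000)]
mixing = {"speaker_a": [[1.0, 0.5], [0.5, 1.0]],
          "speaker_b": [[1.0, -0.5], [0.3, 1.0]]}
unmixing = {name: invert_2x2(m) for name, m in mixing.items()}

observed = apply(mixing["speaker_a"], sources)
speaker, mi_scores = identify(observed, unmixing)
print(speaker, mi_scores)
```

Only the matching basis recovers independent coefficients, so its mutual-information estimate stays near zero while the mismatched basis leaves the components visibly dependent.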
2.
3.
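Multivariate empirical mode decomposition (MEMD) does not require basis functions to be chosen from prior knowledge and can adaptively decompose multichannel data simultaneously, which makes it well suited to analyzing highly correlated, non-stationary EEG signals. To identify the intrinsic mode functions (IMFs) that carry useful information, a method based on noise-assisted MEMD (NA-MEMD) and mutual information is proposed and applied to EEG feature extraction. First, NA-MEMD decomposes the multichannel signals into multi-scale IMF components. Mutual information is then used to measure, at each scale, the correlation between the signal and its IMF components, between the noise and its IMF components, and between the signal IMFs and the noise IMFs. Next, the IMFs that contain useful information are selected according to a sensitivity factor and summed to obtain the corresponding reconstructed signal. Finally, the common spatial pattern (CSP) method extracts features from the reconstructed signal, and a support vector machine (SVM) performs the classification. Tests on simulated data and on the real dataset BCI Competition IV Data Set 1, together with comparisons against other existing methods, verify the effectiveness of the proposed method.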
4.
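An audio signal feature extraction method based on local discriminant bases  Cited by: 1 (self-citations: 0, citations by others: 1)
Audio feature extraction plays a very important role in audio signal analysis and processing. To account for the non-stationarity of audio signals, the signal is decomposed with wavelet packets. To obtain robust features, an improved local discriminant basis (LDB) technique prunes the wavelet packet tree, and statistical features of the energy in each LDB subspace are extracted to form the feature vector. Features are then selected with the Fisher criterion function, and a support vector machine classifier designed on the resulting feature vectors classifies three classes of audio. Experimental results show that the feature vectors extracted by this method are very effective for audio signal classification.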
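The Fisher-criterion selection step mentioned in this abstract can be sketched as a per-feature ratio of between-class to within-class scatter. The abstract does not give the exact criterion function, so the common multi-class form below is an assumption, and the toy rows and labels are invented for illustration.

```python
def fisher_score(values, labels):
    """Between-class scatter divided by within-class scatter for one feature."""
    overall_mean = sum(values) / len(values)
    between = 0.0
    within = 0.0
    for c in sorted(set(labels)):
        group = [v for v, l in zip(values, labels) if l == c]
        mean_c = sum(group) / len(group)
        between += len(group) * (mean_c - overall_mean) ** 2
        within += sum((v - mean_c) ** 2 for v in group)
    return between / within if within > 0 else float("inf")


def select_features(rows, labels, k):
    """Rank feature columns by Fisher score and keep the indices of the top k."""
    n_features = len(rows[0])
    scores = [fisher_score([r[j] for r in rows], labels)
              for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:k], scores


# Toy data: feature 0 separates the three classes, feature 1 does not.
rows = [
    [1.0, 4.0], [1.2, 2.0], [0.8, 3.0],   # class "a"
    [3.0, 3.0], [3.2, 4.0], [2.8, 2.0],   # class "b"
    [5.0, 2.0], [5.2, 3.0], [4.8, 4.0],   # class "c"
]
labels = ["a"] * 3 + ["b"] * 3 + ["c"] * 3

selected, scores = select_features(rows, labels, k=1)
print(selected, scores)
```

In a full pipeline the columns would be the subspace energy statistics, and the selected columns would then be passed to the classifier.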
5.
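A text classification method is proposed that combines a weighted feature vector space model with a radial basis probabilistic neural network (RBPNN). To address the shortcomings of traditional text feature extraction, the method defines feature weights from the position of each feature term in the text and from the class it belongs to. It then computes the frequency of each feature term in a document according to these weights, applies the TF-IDF function to compute feature values, and obtains the feature vector of the text. Finally, an RBPNN performs the classification: the weights between the second hidden layer and the output layer are solved with a least-squares algorithm, and training yields the text classification model. Text classification experiments show that the method performs well, with good recall and precision.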
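The position-weighted term counting followed by TF-IDF can be sketched as follows. The abstract does not state the actual weights, so the `title_weight` value is a hypothetical choice, only positional weighting (not class-based weighting) is shown, and the RBPNN classifier is left out.

```python
import math
from collections import Counter


def weighted_tfidf(docs, title_weight=2.0):
    """docs: list of (title_tokens, body_tokens); returns one {term: weight} dict per document."""
    weighted_counts = []
    for title, body in docs:
        counts = Counter()
        for term in title:
            counts[term] += title_weight   # assumed: title terms count more
        for term in body:
            counts[term] += 1.0
        weighted_counts.append(counts)

    n_docs = len(docs)
    doc_freq = Counter()
    for counts in weighted_counts:
        doc_freq.update(counts.keys())     # one increment per document containing the term

    vectors = []
    for counts in weighted_counts:
        total = sum(counts.values())
        vectors.append({term: (c / total) * math.log(n_docs / doc_freq[term])
                        for term, c in counts.items()})
    return vectors


# Invented two-document corpus.
docs = [
    (["football"], ["match", "goal", "team"]),
    (["election"], ["vote", "team", "party"]),
]
vectors = weighted_tfidf(docs)
print(vectors[0])
```

A term that occurs in every document ("team" here) receives zero weight, while a title term outweighs a body term that occurs equally often.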
6.
7.
8.
9.
10.
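Research on a text mining algorithm based on inverse document frequency and mutual information  Cited by: 1 (self-citations: 0, citations by others: 1)
Traditional text classification algorithms treat every feature word as having the same influence on the classification result, which lowers classification accuracy and increases time complexity. Building on an analysis of the general model of a text classification system, and on the use of mutual-information-based feature extraction to obtain feature terms, a text classification algorithm based on inverse-document-frequency mutual information entropy is proposed. The algorithm first extracts features from the text sample vectors using the vector space model (VSM). It then extracts a keyword set from the text, filters the keywords, and uses mutual information to represent and compute the relevance between terms and document classes. Finally, it computes the weight of each keyword in the document. Experimental results show that, compared with traditional classification algorithms, the improved algorithm offers higher computation speed and stronger nonlinear mapping ability, and achieves better classification in terms of convergence speed and accuracy.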
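The term-to-class relevance measure used in this abstract can be sketched as the mutual information between "term present in the document" and the class label. This is a generic estimate from document counts, not necessarily the paper's exact formula, and the tiny corpus is invented.

```python
import math
from collections import Counter


def term_class_mi(docs, labels, term):
    """Mutual information (bits) between term presence and the class label."""
    n = len(docs)
    joint = Counter((term in doc, label) for doc, label in zip(docs, labels))
    presence_totals = Counter()
    label_totals = Counter()
    for (present, label), count in joint.items():
        presence_totals[present] += count
        label_totals[label] += count
    return sum(
        count / n * math.log2(count * n / (presence_totals[present] * label_totals[label]))
        for (present, label), count in joint.items()
    )


# Each document is a set of tokens.
docs = [
    {"the", "goal", "match"},
    {"the", "goal", "team"},
    {"the", "vote", "party"},
    {"the", "vote", "law"},
]
labels = ["sports", "sports", "politics", "politics"]

mi_goal = term_class_mi(docs, labels, "goal")
mi_the = term_class_mi(docs, labels, "the")
print(mi_goal, mi_the)
```

A term that perfectly separates two equally frequent classes scores 1 bit, and a term that appears everywhere scores 0, so ranking by this value filters out uninformative keywords.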
11.
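In feature extraction methods for multi-label data, the output kernel function does not accurately capture the correlation between labels. To address this, two new ways of constructing the output kernel are proposed, based on a fuller measurement of label correlation. The first method converts the multi-label data into single-label data, uses label sets to describe the correlation between labels, and then defines a new output kernel from the perspective of the loss function. The second method uses mutual information to measure the pairwise correlation between labels and builds a new output kernel on that basis. Experiments with two classifiers on three multi-label datasets show that, compared with the multi-label feature extraction method using the original kernel, the method using the loss-function-based output kernel performs best: the five evaluation metrics improve by about 10% on average, and on the Yeast dataset in particular the Coverage metric drops by about 30%. The mutual-information-based output kernel comes second, with an average improvement of about 5%. The results show that feature extraction based on the new output kernels extracts features more effectively, further simplifies the learning process of the classifier, and improves its generalization performance.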
12.
Existing classification algorithms use a set of training examples to select classification features, which are then used for all future applications of the classifier. A major problem with this approach is the selection of a training set: a small set will result in reduced performance, and a large set will require extensive training. In addition, class appearance may change over time, requiring an adaptive classification system. In this paper, we propose a solution to these basic problems by developing an on-line feature selection method, which continuously modifies and improves the features used for classification based on the examples provided so far. The method is used for learning a new class, and to continuously improve classification performance as new data becomes available. In ongoing learning, examples are continuously presented to the system, and new features arise from these examples. The method continuously measures the value of the selected features using mutual information, and uses these values to efficiently update the set of selected features when new training information becomes available. The problem is challenging because at each stage the training process uses a small subset of the training data. Surprisingly, with sufficient training data the on-line process reaches the same performance as a scheme that has complete access to the entire training data.
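The bookkeeping behind such an on-line scheme can be sketched with running counts that are updated per example and re-ranked by mutual information. This is only an illustration under simplifying assumptions (binary features, a fixed budget `k`, full re-ranking on request); the class name `OnlineMISelector` is hypothetical and the paper's efficient update rule is not reproduced.

```python
import math
from collections import defaultdict


class OnlineMISelector:
    """Keeps per-feature counts and selects the k features with highest MI."""

    def __init__(self, k):
        self.k = k
        self.n = 0
        self.class_counts = defaultdict(int)
        # feature -> class -> number of examples in which the feature was present
        self.present = defaultdict(lambda: defaultdict(int))

    def update(self, active_features, label):
        """Consume one example; unseen features are added automatically."""
        self.n += 1
        self.class_counts[label] += 1
        for feature in set(active_features):
            self.present[feature][label] += 1

    def mutual_information(self, feature):
        """MI (bits) between feature presence and class, from the counts so far."""
        per_class = self.present.get(feature, {})
        present_total = sum(per_class.values())
        mi = 0.0
        for label, n_label in self.class_counts.items():
            hit = per_class.get(label, 0)
            for count, marginal in ((hit, present_total),
                                    (n_label - hit, self.n - present_total)):
                if count > 0:
                    mi += count / self.n * math.log2(count * self.n / (marginal * n_label))
        return mi

    def selected(self):
        """Current top-k features; ties are broken by feature name."""
        ranked = sorted(self.present,
                        key=lambda f: (-self.mutual_information(f), f))
        return ranked[:self.k]


# Invented stream of (active features, class label) examples.
stream = [
    (["wheel", "window"], "car"),
    (["wing", "window"], "plane"),
    (["wheel", "window"], "car"),
    (["wing", "window"], "plane"),
]
selector = OnlineMISelector(k=2)
for features, label in stream:
    selector.update(features, label)
print(selector.selected())
```

Because only counts are stored, each new example costs time proportional to its number of active features, and the selected set can be refreshed whenever needed.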
13.
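A text classification method based on fuzzy soft set theory  Cited by: 3 (self-citations: 0, citations by others: 3)
To improve text classification accuracy, a text classification method based on fuzzy soft set theory is proposed. The method represents the training set of texts as a fuzzy soft set table and determines the class of a text to be classified by reduction and by constructing a soft set comparison table. To address the loss of classification accuracy caused by similar features during text feature extraction, a feature selection algorithm based on regularized mutual information is also presented, which effectively solves this problem. Compared with the traditional KNN and SVM classification algorithms, the fuzzy soft set method improves both the precision and the accuracy of text classification.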
14.
Hong Guo 《Pattern recognition》2006,39(5):980-987
This paper proposes a novel method for breast cancer diagnosis using the feature generated by genetic programming (GP). We developed a new feature extraction measure (modified Fisher linear discriminant analysis (MFLDA)) to overcome the limitation of the Fisher criterion. GP as an evolutionary mechanism provides a training structure to generate features. A modified Fisher criterion is developed to help GP optimize features that allow pattern vectors belonging to different categories to distribute compactly in disjoint regions. First, the MFLDA is experimentally compared with some classical feature extraction methods (principal component analysis, Fisher linear discriminant analysis, alternative Fisher linear discriminant analysis). Second, the feature generated by GP based on the modified Fisher criterion is compared with the features generated by GP using the Fisher criterion and an alternative Fisher criterion in terms of classification performance. The classification is carried out by a simple classifier (minimum distance classifier). Finally, the same feature generated by GP is compared with an original feature set as the inputs to multi-layer perceptrons and support vector machines. Results demonstrate the capability of this method to transform information from a high-dimensional feature space into a one-dimensional space and to automatically discover the relationship among data, improving classification accuracy.
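The minimum distance classifier mentioned here is simple enough to sketch for a single scalar feature: each class is summarized by the mean of its training values, and a new value takes the label of the nearest mean. The feature values below are invented placeholders, not outputs of the paper's GP-generated feature.

```python
def fit_class_means(features, labels):
    """Return {label: mean feature value} for a one-dimensional feature."""
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict(class_means, x):
    """Assign x to the class whose mean is closest."""
    return min(class_means, key=lambda y: abs(x - class_means[y]))


# Toy one-dimensional feature values (illustrative only).
train_x = [0.1, 0.2, 0.3, 0.9, 1.0, 1.1]
train_y = ["benign"] * 3 + ["malignant"] * 3

class_means = fit_class_means(train_x, train_y)
print(predict(class_means, 0.35), predict(class_means, 0.8))
```

With a well-separated one-dimensional feature, this classifier reduces to a single threshold halfway between the two class means.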
15.
In this paper, a new cluster-based approach is proposed for extracting features from the coefficients of a two-dimensional discrete wavelet transform. The wavelet coefficients from the matrix of each frequency channel are segregated into non-overlapping clusters in an unsupervised mode using a set of application-specific representative images. In practical situations, this set of representative images can be the same as the ones kept aside for training a classifier. The proposed method divides the matrices of computed wavelet coefficients into disjoint clusters that are centered around the position of dominant coefficients. The features that can distinguish images of one class from those of other classes are obtained by computing energies of the clusters. The feature vectors so obtained are then presented as input patterns to an image classifier, such as a neural network. Experimental results based on the applications for texture classification and wood surface defect detection have shown that the proposed cluster-based wavelet feature extraction method is able to effectively extract important intrinsic information content from the test images, and increase the overall classification accuracy as compared with conventional feature extraction methods.
16.
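The classification of metagenomic sequencing reads is a key problem in metagenomics research. The main factor affecting classification performance is how the feature vector is extracted: extracting and generating suitable feature vectors has a major impact on both the accuracy and the running time of metagenomic classification. Therefore, based on the characteristics of metagenomic data and the properties of the third-order Markov model, a feature extraction method based on the transition probability matrix is proposed, and a mutual-information-based feature selection algorithm reduces the dimensionality of the extracted feature vectors. Finally, the new feature vectors are used with the SVM classification algorithm and compared with related algorithms. The results show that the new feature vectors discriminate well between different metagenomic species and are particularly suitable for classifying large-scale metagenomic data.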
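A transition-probability feature vector from a third-order Markov model can be sketched by counting, for every 3-nucleotide context, how often each base follows it. The abstract does not specify normalization or the handling of ambiguous bases, so row-normalizing the counts and skipping windows with non-ACGT characters are assumptions of this sketch.

```python
from itertools import product

ALPHABET = "ACGT"


def transition_features(sequence, order=3):
    """Return {(context, next_base): P(next_base | context)} for all contexts."""
    contexts = ["".join(p) for p in product(ALPHABET, repeat=order)]
    counts = {c: {b: 0 for b in ALPHABET} for c in contexts}
    for i in range(len(sequence) - order):
        context = sequence[i:i + order]
        next_base = sequence[i + order]
        # Windows containing characters outside ACGT (e.g. N) are skipped.
        if context in counts and next_base in ALPHABET:
            counts[context][next_base] += 1

    features = {}
    for c in contexts:
        total = sum(counts[c].values())
        for b in ALPHABET:
            features[(c, b)] = counts[c][b] / total if total else 0.0
    return features


read = "ACGTACGTACGA"
features = transition_features(read)
vector = list(features.values())   # fixed-order 4**3 * 4 = 256-dimensional vector
print(len(vector), features[("ACG", "T")], features[("ACG", "A")])
```

The 256 values have a fixed order across reads, so they can be fed directly to a feature selector and then to a classifier.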
17.
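Motor imagery EEG is a typical nonlinear, non-stationary signal, and traditional classification methods based on a single type of feature struggle to achieve satisfactory performance on it. To address this problem, the fractional Fourier transform (FrFT) is introduced into EEG feature extraction. First, the signal is analyzed with the FrFT, which extends the feature domain while extracting useful information from different dimensions of the signal to form the feature vector. A support vector machine (SVM) classifier then classifies the extracted feature vectors. Finally, experiments are carried out on the Graz data. The results show that the proposed method achieves a classification accuracy of up to 92.57%, clearly higher than traditional methods that rely on a single type of feature.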
18.
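PCA does not make effective use of the class information of the samples, which leads to a low dialect recognition rate. To address this, a combination of PCA and LDA is used for feature extraction. First, PCA reduces the dimensionality of data from four dialects: Mandarin, Shanghainese, Cantonese, and Southern Min. LDA then performs further feature extraction in the reduced space, and the resulting feature vector is fed to a BP neural network for identification. Simulation results show that dialect identification based on PCA and LDA reaches an average recognition rate of 85%.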
19.
We present a new linear discriminant analysis method based on information theory, where the mutual information between linearly transformed input data and the class labels is maximized. First, we introduce a kernel-based estimate of mutual information with a variable kernel size. Furthermore, we devise a learning algorithm that maximizes the mutual information w.r.t. the linear transformation. Two experiments are conducted: the first one uses a toy problem to visualize and compare the transformation vectors in the original input space; the second one evaluates the performance of the method for classification by employing cross-validation tests on four datasets from the UCI repository. Various classifiers are investigated. Our results show that this method can significantly boost class separability over conventional methods, especially for nonlinear classification.