Similar literature
 20 similar documents found; search took 46 ms
1.
Fast recognition of musical genres using RBF networks   Cited by: 2 (self-citations: 0, citations by others: 2)
This paper explores the automatic classification of audio tracks into musical genres. Our goal is to achieve human-level accuracy with fast training and classification. This goal is achieved with radial basis function (RBF) networks by using a combination of unsupervised and supervised initialization methods. These initialization methods yield classifiers that are as accurate as RBF networks trained with gradient descent (which is hundreds of times slower). In addition, feature subset selection further reduces training and classification time while preserving classification accuracy. Combined, our methods succeed in creating an RBF network that matches the musical classification accuracy of humans. The general algorithmic contribution of this paper is to show experimentally that RBF networks initialized with a combination of methods can yield good classification performance without relying on gradient descent. The simplicity and computational efficiency of our initialization methods produce classifiers that are fast to train as well as fast to apply to novel data. We also present an improved method for initializing the k-means clustering algorithm, which is useful for both unsupervised and supervised initialization methods.

2.
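To address the high dimensionality, small sample size, high redundancy, and high noise of microarray gene expression data, this paper proposes FICS-EKELM, a classification algorithm based on FCBF feature selection and ensemble optimization learning. First, the fast correlation-based filter (FCBF) removes some irrelevant features and noise, yielding a feature set that is highly correlated with the class labels. Second, sampling is used to generate multiple sample subsets, and on each training subset an improved crow search algorithm simultaneously selects the optimal feature subset and optimizes the parameters of a kernel extreme learning machine (KELM) classifier. An ensemble classification model is then built from the base classifiers to classify the target data. In addition, multi-threaded parallelism on a multi-core platform further improves computational efficiency. Experiments on six gene data sets show that the proposed algorithm not only achieves good classification performance with fewer feature genes, but also produces classification results significantly better than those of existing and similar methods, making it an effective method for classifying high-dimensional data.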

3.
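As a typical form of big data, data streams are continuous, unbounded, subject to concept drift, and fast-arriving, so traditional classification techniques cannot be applied directly and effectively to data stream mining. Building on the classic accuracy weighted ensemble (AWE) algorithm, this paper proposes the concept very fast decision tree update ensemble (CUE) algorithm. CUE improves the weight assignment of base classifiers, and also markedly reduces sensitivity to data chunk size and increases the diversity among base classifiers. Experiments show that CUE achieves higher classification accuracy than AWE. Finally, the dynamic classifier selection with clustering (DCSC) algorithm is proposed. Based on the idea of dynamic classifier selection, it has no cumbersome weighting mechanism and is therefore more time-efficient. Experimental results confirm that DCSC is both effective and efficient and handles concept drift effectively.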

4.
Some pattern classifiers achieve high classification accuracy, but their storage requirements and processing time are prohibitively expensive. Others need very little storage and processing time, but their classification accuracy is unsatisfactory. In either case, the overall performance of the classifier is poor. In this paper, we present a technique based on the combination of a minimum distance classifier (MDC), class-dependent principal component analysis (PCA), and linear discriminant analysis (LDA), which gives improved performance compared with other standard techniques when evaluated on several machine learning corpora.
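
As a rough illustration of the minimum distance classifier named in this abstract, the sketch below assigns a sample to the class whose mean vector is nearest in Euclidean distance. It covers only the MDC component; the class-dependent PCA and LDA stages are not shown, and the function names and toy data are assumptions made for the example, not taken from the paper.

```python
import math

def fit_class_means(samples, labels):
    """Return {label: mean vector} computed from the training samples."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
            counts[y] = 0
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}

def predict_mdc(means, x):
    """Assign x to the class whose mean is closest in Euclidean distance."""
    return min(means, key=lambda y: math.dist(means[y], x))

# Toy two-class data (hypothetical)
train_x = [[0.0, 0.0], [0.0, 2.0], [4.0, 4.0], [6.0, 4.0]]
train_y = ["a", "a", "b", "b"]
means = fit_class_means(train_x, train_y)
```

Only one mean vector per class is stored, which is why this kind of classifier is cheap in both memory and prediction time.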

5.
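Classification accuracy is the most important performance measure of a classifier, and feature subset selection is an effective way to improve it. Existing feature subset selection methods mainly target static classifiers, and research on feature subset selection for dynamic classifiers is lacking. This paper first presents a dynamic naive Bayesian network classifier with continuous attributes and an evaluation criterion for dynamic classification accuracy. On this basis, it establishes a feature subset selection method for dynamic naive Bayesian network classifiers, and carries out experiments and analysis on real macroeconomic time series data.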

6.

In this article, we address the question of how to make effective use of the feature set extracted from deep learning models pre-trained on ImageNet. Exploring this option offers a very fast and attractive alternative to transfer learning strategies. The traditional task of skin lesion recognition consists of several stages, where the automated system is typically trained on preprocessed images with known diagnoses, which allows classification of new samples into predefined categories. For this task, we propose an improved melanoma detection method based on the combination of linear discriminant analysis (LDA) and features extracted by the deep learning approach. We examine the use of LDA on the activations of the fully-connected layer of the deep network in order to increase classification accuracy and at the same time reduce the dimensionality of the feature space. We tested our method on five different classifiers and evaluated the results using various metrics. The comparison demonstrates the high effectiveness of the suggested feature reduction, which leads not only to a significant reduction in the number of features used but also to improved performance of all tested classifiers in almost all measured characteristics.


7.
A new fast prototype selection method based on clustering   Cited by: 2 (self-citations: 1, citations by others: 1)
In supervised classification, a training set T is given to a classifier for classifying new prototypes. In practice, not all information in T is useful for classifiers, so it is convenient to discard irrelevant prototypes from T. This process is known as prototype selection, and it is an important task for classifiers because it can reduce the time needed for classification or training. In this work, we propose a new fast prototype selection method for large datasets, based on clustering, which selects border prototypes and some interior prototypes. We report experimental results that show the performance of our method and compare its accuracy and runtimes against other prototype selection methods.

8.
Data mining for case-based reasoning in high-dimensional biological domains   Cited by: 1 (self-citations: 0, citations by others: 1)
Case-based reasoning (CBR) is a suitable paradigm for class discovery in molecular biology, where the rules that define the domain knowledge are difficult to obtain and the number and the complexity of the rules affecting the problem are too large for formal knowledge representation. To extend the capabilities of CBR, we propose the mixture of experts for case-based reasoning (MOE4CBR), a method that combines an ensemble of CBR classifiers with spectral clustering and logistic regression. Our approach not only achieves higher prediction accuracy, but also leads to the selection of a subset of features that have meaningful relationships with their class labels. We evaluate MOE4CBR by applying the method to a CBR system called TA3 - a computational framework for CBR systems. For two ovarian mass spectrometry data sets, the prediction accuracy improves from 80 percent to 93 percent and from 90 percent to 98.4 percent, respectively. We also apply the method to leukemia and lung microarray data sets with prediction accuracy improving from 65 percent to 74 percent and from 60 percent to 70 percent, respectively. Finally, we compare our list of discovered biomarkers with the lists of selected biomarkers from other studies for the mass spectrometry data sets.

9.
In case-based reasoning (CBR) classification systems, similarity metrics play a key role and directly affect the system's performance. Building on our previous work on learning pseudo metrics (LPM), we propose a case-based reasoning method for pattern classification in which the widely used Euclidean distance is replaced by the LPM to measure the closeness between the target case and each source case. Source cases of the same type as the target case can be retrieved, and the category of the target case is then determined by the majority reuse principle. Experimental results on several benchmark datasets and on fault diagnosis of the Tennessee-Eastman (TE) process demonstrate that the proposed reasoning techniques effectively improve classification accuracy, and that LPM-based retrieval substantially improves the quality and learning ability of CBR classifiers.
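
The retrieve-then-reuse step described in this abstract can be sketched roughly as follows: rank the source cases by their distance to the target case, then assign the category held by the majority of the closest ones. The abstract does not define the LPM, so the `distance` parameter is a placeholder that defaults to Euclidean distance; the function name, the `k` parameter, and the toy case base are assumptions made for illustration.

```python
import math
from collections import Counter

def classify_by_retrieval(case_base, target, k=3, distance=math.dist):
    """Retrieve the k source cases closest to target and reuse their majority category.

    case_base: list of (feature_vector, category) pairs.
    distance:  any callable(a, b) -> float; Euclidean distance stands in here
               for a learned metric, which is not specified in the abstract.
    """
    ranked = sorted(case_base, key=lambda case: distance(case[0], target))
    votes = Counter(category for _, category in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical case base with two categories
case_base = [
    ([1.0, 1.0], "normal"),
    ([1.2, 0.8], "normal"),
    ([0.9, 1.1], "normal"),
    ([5.0, 5.0], "fault"),
    ([5.2, 4.9], "fault"),
]
label = classify_by_retrieval(case_base, [1.1, 1.0], k=3)
```

Passing a different callable as `distance` changes only the retrieval ranking, which is the point at which a learned metric would replace the Euclidean one.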

10.
Whenever there is a fault in an automotive engine ignition system or a change in engine condition, an automotive mechanic conventionally analyzes the ignition pattern of the engine to examine symptoms, based on specific domain knowledge (domain features of an ignition pattern). In this paper, a case-based reasoning (CBR) approach is presented to help solve the human diagnosis problem using not only the domain features but also features extracted from signals captured with a computer-linked automotive scope meter. A CBR expert system has the advantage that it provides the user with multiple possible diagnoses, instead of the single most probable diagnosis provided by traditional network-based classifiers such as multi-layer perceptrons (MLP) and support vector machines (SVM). In addition, CBR overcomes the problem of incremental and decremental knowledge update as required by both MLP and SVM. Although CBR is effective, its application to high-dimensional domains is inefficient because every instance in a case library must be compared during reasoning. To overcome this inefficiency, a combination of preprocessing methods is proposed: wavelet packet transforms (WPT), kernel principal component analysis (KPCA), and kernel K-means (KKM). Because the ignition signals captured by a scope meter are very similar, WPT is used for feature extraction so that the ignition signals can be compared through the extracted features. However, the extracted features contain many redundant points, which may degrade diagnosis performance. Therefore, KPCA is employed for dimension reduction. In addition, the number of cases in a case library can be controlled through clustering; KKM is adopted for this purpose. Several diagnosis methods, including MLP, SVM, and CBR, are also compared. Experimental results showed that CBR using WPT and KKM achieved the highest accuracy and better fitted the requirements of the expert system.

11.
This paper presents a novel application of advanced machine learning techniques for Mars terrain image classification. Fuzzy-rough feature selection (FRFS) is adapted and then employed in conjunction with Support Vector Machines (SVMs) to construct image classifiers. These techniques are integrated to address problems in space engineering where the images are of many classes, large in scale, and diverse in their representational properties. The use of the adapted FRFS allows the induction of low-dimensionality feature sets from feature patterns of a much higher dimensionality. To evaluate the proposed work, K-Nearest Neighbours (KNNs) and decision trees (DTREEs) based image classifiers as well as information gain rank (IGR) based feature selection are also investigated here, as possible alternatives to the underlying machine learning techniques adopted. The results of systematic comparative studies demonstrate that, in general, feature selection improves the performance of classifiers that are intended for use in high dimensional domains. In particular, the proposed approach helps to increase the classification accuracy, while enhancing classification efficiency by requiring considerably fewer features. This is evident in that the resultant SVM-based classifiers which utilise FRFS-selected features generally outperform KNN and DTREE based classifiers and those which use IGR-returned features. The work is therefore shown to be of great potential for on-board or ground-based image classification in future Mars rover missions.

12.
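Extraction of coastal aquaculture information based on remote sensing case-based reasoning   Cited by: 2 (self-citations: 0, citations by others: 2)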
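Current methods for extracting aquaculture information, based on visual interpretation or spectral classification, are inefficient, struggle to overcome the "salt-and-pepper" noise caused by mixed ground objects, and cannot easily incorporate geoscience knowledge. To address these problems, this paper first analyzes existing aquaculture information extraction methods and the use of case-based reasoning (CBR) in remote sensing image processing, and on that basis proposes a research approach for extracting coastal aquaculture information based on remote sensing case-based reasoning. Second, combining the spatial and attribute features of aquaculture areas, it constructs a case representation model and a CBR similarity reasoning model. Finally, a CBR experiment extracting aquaculture information for Shatian Town in western Guangdong, which lies outside the area used to build the cases, achieves an accuracy of 84.56%. A comparison of the CBR method with traditional supervised classification shows that CBR is an effective means of extracting coastal aquaculture information quickly and accurately.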

13.
Reducing SVM classification time using multiple mirror classifiers.   Cited by: 3 (self-citations: 0, citations by others: 3)
We propose an approach that uses mirror point pairs and a multiple classifier system to reduce the classification time of a support vector machine (SVM). Decisions made with multiple simple classifiers formed from mirror pairs are integrated to approximate the classification rule of a single SVM. A coarse-to-fine approach is developed for selecting a given number of member classifiers. A clustering method, derived from the similarities between classifiers, is used for a coarse selection. A greedy strategy is then used for fine selection of member classifiers. Selected member classifiers are further refined by finding a weighted combination with a perceptron. Experimental results show that our approach can successfully speed up SVM decisions while maintaining comparable classification accuracy.

14.
A neural network ensemble based on rough set reducts is proposed to decrease the computational complexity of conventional ensemble feature selection algorithms. First, a dynamic reduction technique combining a genetic algorithm with a resampling method is adopted to obtain reducts with good generalization ability. Second, multiple BP neural networks based on different reducts are built as base classifiers. Following the idea of selective ensemble, the neural network ensemble with the best generalization ability is found by search strategies. Finally, classification is performed by combining the predictions of the component networks through voting. The method has been verified in classification experiments on a remote sensing image and five UCI datasets. Compared with conventional ensemble feature selection algorithms, it takes less time and has lower computational complexity, while its classification accuracy is satisfactory.

15.
Dynamic selection and circulating combination of classifiers   Cited by: 1 (self-citations: 0, citations by others: 1)
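To address the inefficiency of optimal subset selection and the inflexibility of combination methods in multiple classifier system design, this paper proposes dynamic selection and circulating combination (DSCC) of classifiers. Exploiting the complementarity among different classifier models, the method dynamically selects a combination of classifiers with a high recognition rate for the target, so that the number of classifiers taking part in the combination adapts to the complexity of the target, and it performs circulating combination according to confidence. In handwritten digit recognition experiments, the proposed method is flexible and efficient and achieves a higher recognition rate than other commonly used classifier selection methods.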

16.
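In traffic classification, a single traditional machine learning classifier suffers from accuracy that is hard to improve and from poor adaptability to changes in the network environment. This paper proposes a traffic classification method based on a multi-classifier ensemble. The method combines the strengths of classifiers built with different algorithms and uses majority voting and instance selection as ensemble methods to classify traffic. Comparative experiments show that the method improves both classification accuracy and generalization performance and adapts better to environmental changes. It should be noted, however, that compared with a stand-alone classifier, the algorithm has higher implementation complexity and higher actual run-time cost.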
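
The majority-voting part of such an ensemble can be sketched as follows. This is a minimal illustration only: the three rule-based base classifiers, the flow record fields (`port`, `mean_packet_bytes`, `duration_s`), and the thresholds are hypothetical stand-ins, and the instance selection step mentioned in the abstract is not shown.

```python
from collections import Counter

def majority_vote(classifiers, sample):
    """Return the label predicted by the largest number of base classifiers."""
    predictions = [clf(sample) for clf in classifiers]
    return Counter(predictions).most_common(1)[0][0]

# Three toy base classifiers over a hypothetical flow record
def by_port(flow):
    return "web" if flow["port"] in (80, 443) else "other"

def by_size(flow):
    return "web" if flow["mean_packet_bytes"] > 500 else "other"

def by_duration(flow):
    return "web" if flow["duration_s"] < 10 else "other"

flow = {"port": 443, "mean_packet_bytes": 300, "duration_s": 2.5}
result = majority_vote([by_port, by_size, by_duration], flow)
```

With an odd number of base classifiers and two classes no tie can occur; other configurations would need an explicit tie-breaking rule. The extra run-time cost noted in the abstract is visible here as well, since every base classifier is evaluated for each sample.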

17.
Khuwaja, G. A., An Adaptive Combined Classifier System for Invariant Face Recognition, Digital Signal Processing 12 (2002) 21–46.
In classification tasks it may be wise to combine observations from different sources. In this paper, to obtain classification systems with both good generalization performance and efficiency in space and time, a learning vector quantization learning method based on combinations of weak classifiers is proposed. The weak classifiers are generated using automatic elimination of redundant hidden layer neurons of the network on both the entire face images and the extracted features: forehead, right eye, left eye, nose, mouth, and chin. The neuron elimination is based on the killing of blind neurons, which are redundant. The classifiers are then combined through majority voting on the decisions available from the input classifiers. It is demonstrated that the proposed system is capable of achieving better classification results with both good generalization performance and a fast training time on a variety of test problems using a large and variable database. The selection of stable and representative sets of features that efficiently discriminate between faces in a huge database is discussed.

18.
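Content-based email filtering is essentially a binary text classification problem. Feature selection reduces the feature space before classification to lower the classifier's computation and storage costs, and filters out some noise to improve classification accuracy; it is an important factor in the accuracy and timeliness of email filtering. However, feature selection algorithms perform differently in the same evaluation environment and depend on the classifier and on the distribution of the data set. Taking the characteristics of email filtering into account, this paper evaluates and analyzes the performance of feature selection algorithms in email filtering from three aspects: classifier adaptability, data set dependence, and time complexity. Experimental results show that when used for email filtering, odds ratio and document frequency yield higher spam recognition accuracy and require less computation time.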

19.
李琼, 陈利, 王维虎. 《微机发展》, 2014(2): 205-208
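Handwritten digit recognition is a research topic of high practical value in image processing and pattern recognition. To speed up handwritten digit recognition while maintaining high recognition accuracy, this paper proposes a fast SVM-based handwritten digit recognition method. The method determines the optimal SVM kernel parameters from the separability strength of the classes in feature space, and quickly trains an SVM classifier to classify handwritten digits. Because computing the separability strength is a simple iterative process that takes far less time than training the corresponding SVM classifiers in traditional parameter optimization methods, the time for parameter determination is greatly reduced and training speed improves accordingly, which speeds up handwritten digit recognition while preserving good classification accuracy. Experiments on the MNIST handwritten digit database show that the algorithm is feasible and effective.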

20.
Structured ensemble learning for spam filtering   Cited by: 4 (self-citations: 0, citations by others: 4)
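To resolve the tension between low computational complexity and high classification accuracy in spam filtering algorithms, this paper proposes structured ensemble learning within a multi-field learning framework: the results of multiple base classifiers are combined according to document structure in pursuit of higher classification performance. String features of email documents are used to generate multiple lightweight base classifiers, and a string-frequency index stores the labeled data, so that each update and query takes constant time. Based on the multi-field structure of email documents, two linear combination weights are proposed: one based on the historical effectiveness of each field classifier and one based on the classification ability of the current field document. A composite linear combination weight that takes both into account is also proposed and improves overall classification accuracy. Experiments on the TREC immediate full feedback spam filtering task show that structured ensemble learning with the composite linear combination weight completes the filtering task in a short time (47.24 min), with an overall 1-ROCA performance matching that of the best filter in the TREC 2007 evaluation (0.0055).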
