Similar literature
 Found 20 similar documents (search time: 31 ms)
1.
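Abstract: Multi-dimensional classification assigns each data instance to classes along multiple dimensions based on its feature vector, and has broad application prospects. During model learning, massive training data mean that accurate classification algorithms need very long training times. To improve the efficiency of multi-dimensional classification while maintaining high predictive accuracy, this paper proposes a multi-dimensional classification learning method based on Bayesian networks. First, the multi-dimensional classification problem is formulated as a conditional probability distribution problem. Second, a conditional tree Bayesian network model is built according to the dependencies among the class vectors. Finally, the structure and parameters of the conditional tree Bayesian network model are learned from the training data set, and a multi-dimensional classification prediction algorithm is proposed. Extensive experiments on real data sets show that, compared with MMOC, the current best multi-dimensional classification algorithm, the proposed method reduces model training time by two orders of magnitude while maintaining high accuracy. The proposed method is therefore better suited to multi-dimensional classification applications with massive data.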
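
The formulation described in this abstract can be sketched as follows. This is only one plausible reading, since the abstract gives no formulas: x denotes the feature vector, c = (c_1, ..., c_d) the class vector, and pa(j) is hypothetical notation for the (at most one) class parent of c_j in the tree.

```latex
\hat{\mathbf{c}} = \arg\max_{\mathbf{c}} P(\mathbf{c} \mid \mathbf{x}),
\qquad
P(\mathbf{c} \mid \mathbf{x}) = \prod_{j=1}^{d} P\bigl(c_j \mid c_{\mathrm{pa}(j)}, \mathbf{x}\bigr)
```

Under this assumed factorization, each class variable depends on the features and on at most one other class variable, which is what keeps structure learning and prediction cheap compared with modeling the full joint class distribution.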

2.
A common approach to solving multi-label learning problems is to use problem transformation methods and dichotomizing classifiers, as in the pair-wise decomposition strategy. One problem with this strategy is that making a prediction requires querying a quadratic number of binary classifiers, which can be quite time-consuming, especially in learning problems with a large number of labels. To tackle this problem, we propose a Two Stage Architecture (TSA) for efficient multi-label learning. We analyze three implementations of this architecture: the Two Stage Voting Method (TSVM), the Two Stage Classifier Chain Method (TSCCM) and the Two Stage Pruned Classifier Chain Method (TSPCCM). Eight different real-world datasets are used to evaluate the performance of the proposed methods. The performance of our approaches is compared with that of two algorithm adaptation methods (Multi-Label k-NN and Multi-Label C4.5) and five problem transformation methods (Binary Relevance, Classifier Chain, Calibrated Label Ranking with majority voting, the Quick Weighted method for pair-wise multi-label learning and the Label Powerset method). The results suggest that TSCCM and TSPCCM outperform the competing algorithms in terms of predictive accuracy, while TSVM has comparable predictive performance. In terms of testing speed, all three methods perform better than the pair-wise methods for multi-label learning.
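
The quadratic prediction cost mentioned in this abstract can be illustrated with a minimal sketch. It is not the TSA method itself, only plain pair-wise voting; `prefer` is a hypothetical stand-in for a trained binary classifier that returns the preferred label of a pair.

```python
from itertools import combinations


def pairwise_classifier_count(num_labels: int) -> int:
    """Number of binary classifiers in a pair-wise decomposition: q * (q - 1) / 2."""
    return num_labels * (num_labels - 1) // 2


def pairwise_vote(labels, prefer):
    """Collect one vote per label pair.

    `prefer(a, b)` is a hypothetical callback standing in for the binary
    classifier trained on the pair (a, b); it must return either a or b.
    """
    votes = {label: 0 for label in labels}
    for a, b in combinations(labels, 2):
        votes[prefer(a, b)] += 1  # one classifier query per pair
    return votes
```

With 10 labels this already means 45 classifier queries per prediction, which is the cost that a two-stage design tries to avoid.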

3.
Subspace classification algorithm with multiple representative points   Total citations: 1 (self-citations: 0, citations by others: 1)
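Multi-representative-point nearest-neighbor classification overcomes shortcomings of traditional nearest-neighbor classification algorithms: it builds the classification model from model clusters centered on representative points and determines the number of neighbors automatically. When samples of different classes overlap heavily, however, such algorithms produce more model clusters, which lowers prediction accuracy. A subspace classification algorithm with multiple representative points is proposed. It projects training samples of different classes onto several different subspaces and builds the classification model from subspace model clusters, effectively separating samples of different classes in the full...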

4.
Land-cover mapping is an important research topic with broad applicability in the remote-sensing domain. Machine learning algorithms such as Maximum Likelihood Classifier (MLC), Support Vector Machine (SVM), Artificial Neural Network (ANN), and Random Forest (RF) have been playing an important role in this field for many years, although deep neural networks are experiencing a resurgence of interest. In this article, we demonstrate early efforts to apply deep learning-based classification methods to large-scale land-cover mapping. Based on the Stacked Autoencoder (SAE), one of the deep learning models, we built a classification framework for large-scale remote-sensing image processing. We adjusted and optimized the model parameters based on our test samples. We compared the performance of the SAE-based approach with traditional classification algorithms including RF, SVM, and ANN with multiple performance analytics. Results show that the SAE classifier trained with an entire set of African training samples achieves an overall classification accuracy of 78.99% when assessed by test samples collected independently of training samples, which is higher than the accuracies achieved by the other three classifiers (76.03%, 77.74%, and 77.86% of RF, SVM, and ANN, respectively) based on the same set of test samples. We also demonstrated the advantages of SAE in prediction time and land-cover mapping results in this study.

5.
This paper studies state-of-the-art classification techniques for electroencephalogram (EEG) signals. The Fuzzy Functions Support Vector Classifier, the Improved Fuzzy Functions Support Vector Classifier and a novel technique that combines Particle Swarm Optimization and Radial Basis Function Networks (PSO-RBFN) are studied. The classification performance of these techniques is compared on standard, publicly available EEG datasets used by brain–computer interface (BCI) researchers. In addition to the standard EEG datasets, the proposed classifier is also tested on non-EEG datasets for a thorough comparison. Within the scope of this study, several data clustering algorithms, namely Fuzzy C-means, K-means and PSO clustering, are studied and their clustering performance on the same datasets is compared. The results show that PSO-RBFN might reach the classification performance of state-of-the-art classifiers and might be a better alternative for classifying EEG signals in real-time applications. This has been demonstrated by implementing the proposed classifier in a real-time BCI application for mobile robot control.

6.
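As a probabilistic graphical model, the unrestricted multi-dimensional Bayesian network classifier (GMBNC) is a compact form of the Bayesian network (BN) for multi-dimensional classification that contains only the local structure useful for prediction. The traditional way to obtain a GMBNC is to learn the global BN first. To avoid global search, DOS-GMBNC, a structure learning algorithm that performs only local search, is proposed. It inherits the main framework of the previously proposed IPC-GMBNC algorithm and dynamically adjusts the search order based on further-mined structural topology information, so as to avoid useless computation. Experiments confirm the effectiveness and efficiency of DOS-GMBNC: (1) the quality of the networks it outputs matches that of IPC-GMBNC and is better than that of the classical PC algorithm; (2) on a problem with 100 nodes, it saves nearly 89% and 45% of the computation relative to PC and IPC-GMBNC, respectively.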

7.
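In evolutionary algorithms, the preselection operator screens out promising candidate offspring solutions for the subsequent environmental selection step. Most existing preselection operators are based on fitness evaluation, surrogate models, or classification models. Because preselection is in essence a classification process, classification-based preselection is a natural fit for evolutionary algorithms. Previous work used binary or multi-class classification models for preselection, which requires preparing in advance either two groups of training samples, "good" and "bad", or several distinguishable groups in order to build the classifier. As the evolutionary algorithm runs, however, the boundary between "good" and "bad" solutions becomes increasingly blurred, so preparing two or more distinguishable groups of training samples becomes challenging. To address this problem, this paper proposes a One-class Classification based PreSelection (OCPS) strategy: all solutions in the current population are first treated as "good" samples; a one-class classification model is then built from these "good" samples alone; finally, the model is used to label and select among the multiple candidate solutions generated. The strategy is applied to three representative evolutionary algorithms, and numerical experiments show that it improves the convergence speed of existing evolutionary algorithms.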

8.
Wang Huan (王欢), Zhang Liping (张丽萍), Yan Sheng (闫盛). Journal of Computer Applications (《计算机应用》), 2016, 36(12): 3468-3475
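To address the class imbalance between harmful and harmless data in predicting the harmfulness of cloned code, K-Balance, an algorithm based on Random Under-Sampling (RUS) that automatically adjusts the class imbalance, is proposed. First, static and evolutionary features are extracted from the cloned code to build a sample data set. Next, new data sets with different class-imbalance ratios are selected. Harmfulness prediction is then run on each selected data set. Finally, the most suitable class-imbalance ratio is chosen automatically by observing how the classifier performs on each. The performance of the clone harmfulness prediction model was evaluated on 170 versions of 7 open-source software projects written in C and compared with other methods for handling class imbalance. The results show that the proposed method improves the classification of harmful and harmless clones, measured by the area under the receiver operating characteristic curve (AUC), by 2.62 to 36.70 percentage points, and effectively mitigates the class-imbalance problem in prediction.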
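
The under-sample-then-pick-a-ratio loop described in this abstract can be sketched roughly as below. This is not the K-Balance implementation; the function names, the `ratio` convention (majority size divided by minority size) and the `evaluate` callback (for example, a function returning a validation AUC) are all assumptions made for illustration.

```python
import random


def random_undersample(majority, minority, ratio, seed=0):
    """Keep a random subset of the majority class so that
    len(kept) is about ratio * len(minority); the minority class is untouched."""
    rng = random.Random(seed)
    target = min(len(majority), int(round(ratio * len(minority))))
    kept = rng.sample(majority, target)
    return kept, list(minority)


def select_ratio(majority, minority, candidate_ratios, evaluate):
    """Return the candidate ratio whose resampled data scores highest.

    `evaluate(majority_subset, minority)` is a hypothetical scoring callback,
    e.g. one that trains a classifier and returns its AUC.
    """
    return max(
        candidate_ratios,
        key=lambda r: evaluate(*random_undersample(majority, minority, r)),
    )
```

A fixed seed keeps the sketch reproducible; a real experiment would typically repeat the sampling with several seeds before comparing ratios.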

9.
Hu Yaowei (胡耀炜), Duan Lei (段磊), Li Ling (李岭), Han Chao (韩超). Journal of Computer Applications (《计算机应用》), 2018, 38(2): 427-432
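Existing pattern-based sequence classification algorithms suffer from unsatisfactory accuracy and long model training times on biological sequences. To address this, the concept of density-aware patterns is proposed, and BSC, a biological sequence classification algorithm based on them, is designed. First, "density-aware" frequent sequence patterns are mined from the biological sequences. The mined patterns are then filtered and ranked to form classification rules. Finally, the rules are used to predict the classes of unclassified sequences. Experiments on 4 sets of real biological sequences analyze the influence of BSC's parameters on the results and provide recommended parameter settings. The classification results show that, compared with four other pattern-based classification algorithms, BSC improves accuracy on the experimental data sets by at least 2.03 percentage points. The results indicate that BSC achieves high classification accuracy and execution efficiency on biological sequences.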

10.
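In practice, large numbers of system data samples are easy to obtain, but accurate labels are available for only a small fraction of them. To obtain a better classification model, semi-supervised learning is introduced and the UDEED (Unlabeled Data to Enhance Ensemble Diversity) algorithm is improved, yielding UDEED+, a semi-supervised classification algorithm based on weight diversity. The main idea of UDEED+ is to add a weight-diversity loss on top of the base learners' prediction disagreement on unlabeled data: the cosine similarity between base learners' weights expresses the disagreement among them, the model's diversity is expanded from different angles of the loss function, and unlabeled data are used during training to encourage diversity in the ensemble, with the aim of improving the performance and generalization of the classification model. On 8 public UCI data sets, UDEED+ was compared with UDEED and with the semi-supervised algorithms S4VM (Safe Semi-Supervised Support Vector Machine) and SSWL (Semi-Supervised Weak-Label). Relative to UDEED, UDEED+ improves accuracy and F1 score by 1.4 and 1.1 percentage points, respectively; relative to S4VM, by 1.3 and 3.1 percentage points; relative to SSWL, by 0.7 and 1.5 percentage points. The experimental results show that increasing weight diversity improves the classification performance of UDEED+.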
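
The idea of measuring disagreement between base learners through the cosine similarity of their weights can be sketched as follows. The abstract does not give the exact loss used by UDEED+, so the averaging over learner pairs and the function names here are assumptions; a lower value means more diverse weight vectors.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length weight vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def weight_diversity_penalty(weights):
    """Mean pairwise cosine similarity over all pairs of base-learner weight vectors.

    Assumed form of a weight-diversity term: adding it to a training loss
    would push the base learners' weight vectors apart.
    """
    pairs = list(combinations(weights, 2))
    if not pairs:
        return 0.0
    return sum(cosine_similarity(u, v) for u, v in pairs) / len(pairs)
```

Identical weight vectors give a penalty of 1, while mutually orthogonal ones give 0.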

11.
Liu Dong (刘栋), Song Guojie (宋国杰). Journal of Computer Applications (《计算机应用》), 2011, 31(5): 1374-1377
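To classify multi-dimensional time series and obtain easily understood classification rules, the concept of time-series entropy and a method for constructing it are introduced, and the decision tree model is extended in two respects: attribute selection and attribute value partitioning. Two algorithms for building decision tree models for multi-dimensional time series classification are given. Finally, the process decision tree is tested on real mobile customer churn data, demonstrating the feasibility of the method.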

12.
Zhang Fangjuan (张芳娟), Yang Yan (杨燕), Du Shengdong (杜圣东). Journal of Computer Applications (《计算机应用》), 2018, 38(11): 3150-3155
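To address the low efficiency and heavy workload of university financial aid administration, an ENhanced Discriminative Canonical Correlation Analysis (EN-DCCA) method that strengthens feature discriminability is proposed and combined with a classification ensemble method to predict university student grants. Students' multi-dimensional campus data are divided into two different views. Existing multi-view discriminative canonical correlation analysis algorithms do not jointly consider both the correlation between view categories and the discriminability of the combined view features. The optimization objective of EN-DCCA maximizes within-class correlation while minimizing between-class correlation, and also takes the discriminability of the combined view features into account, which further strengthens the discriminative power of the attributes and benefits classification prediction. Grant prediction proceeds as follows: first, the data are preprocessed into two views according to students' daily-life behavior and academic performance; then EN-DCCA performs feature learning on the two views; finally, a classification ensemble method makes the prediction. In experiments on a real data set, the proposed method reaches a prediction accuracy of 90.01%, 2 percentage points higher than the ensemble method based on Combined-feature-discriminability Enhanced Canonical Correlation Analysis (CECCA). The results show that the proposed method predicts university grants effectively.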

13.
The substitution of missing values, also called imputation, is an important data preparation task for data mining applications. Imputation algorithms have traditionally been compared in terms of the similarity between imputed and original values. However, this traditional approach, sometimes referred to as prediction ability, does not allow inferring the influence of imputed values on the ultimate modeling tasks (e.g., classification). Based on extensive experimental work, we study the influence of five nearest-neighbor based imputation algorithms (KNNImpute, SKNN, IKNNImpute, KMI and EACImpute) and two simple algorithms widely used in practice (Mean Imputation and Majority Method) on classification problems. To assess these algorithms experimentally, missing values were simulated on six datasets by means of two missingness mechanisms: Missing Completely at Random (MCAR) and Missing at Random (MAR). The latter allows the probabilities of missingness to depend on observed data but not on missing data, whereas the former occurs when the distribution of missingness does not depend on the observed data either. The quality of the imputed values is assessed by two measures: prediction ability and classification bias. Experimental results show that IKNNImpute outperforms the other algorithms under the MCAR mechanism. KNNImpute, SKNN and EACImpute, in turn, provided the best results under the MAR mechanism. Finally, our experiments also show that the best prediction results (in terms of mean squared error) do not necessarily yield less classification bias.
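
The evaluation setup in this abstract (mask values under MCAR, impute, then score prediction ability) can be sketched on a single numeric column as follows. This is a minimal illustration, not the authors' protocol: the function names are hypothetical, `None` marks a missing value, and only Mean Imputation is shown.

```python
import random


def simulate_mcar(values, missing_rate, seed=0):
    """Mask each value independently with probability `missing_rate` (MCAR):
    whether a value goes missing depends on neither observed nor missing data."""
    rng = random.Random(seed)
    return [None if rng.random() < missing_rate else v for v in values]


def mean_impute(values):
    """Replace every missing entry (None) with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


def imputation_mse(original, imputed, masked):
    """Mean squared error between original and imputed values,
    computed only over the positions that were masked out."""
    idx = [i for i, v in enumerate(masked) if v is None]
    if not idx:
        return 0.0
    return sum((original[i] - imputed[i]) ** 2 for i in idx) / len(idx)
```

A low `imputation_mse` corresponds to good prediction ability; as the abstract notes, that alone does not guarantee low classification bias, which would have to be measured by training a classifier on the imputed data.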

14.
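Traditional multi-label classification algorithms are built on binary label prediction. Binary labels only indicate whether an instance belongs to a given class, so they carry little semantic information and cannot fully represent label semantics. To fully exploit the semantic information of the label space, MLNS, a multi-label classification algorithm based on non-negative matrix factorization and sparse representation, is proposed. It combines non-negative matrix factorization with sparse representation to convert binary labels into real-valued labels, enriching label semantics and improving classification. First, non-negative matrix factorization is applied to the label space to obtain a latent semantic label space, which is combined with the original feature space to form a new feature space. Next, sparse coding on this feature space yields the global similarity relations among samples. Finally, these similarity relations are used to reconstruct the binary label vectors, thereby converting binary labels into real-valued labels. The proposed algorithm was compared with MLBGM, ML2, LIFT, MLRWKNN and other algorithms on 5 standard multi-label data sets and 5 evaluation metrics. The results show that MLNS outperforms the compared multi-label classification algorithms: it ranks first in 50% of the cases, in the top two in 76% of the cases, and in the top three in all cases.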

15.
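Dynamic packet classification underlies today's emerging network services, but the update performance of existing packet classification algorithms is unsatisfactory. Based on recursive space decomposition and the interpreter approach, TICS, a two-stage multi-dimensional packet classification algorithm that supports fast incremental updates, is designed and implemented. It supports incremental rule-set updates by rebuilding and replacing local data structures, and with appropriate memory management allows lookups and updates to proceed in parallel. Experiments show that its update speed is at least an order of magnitude higher than that of BRPS, currently the fastest-updating algorithm, and that it consumes little memory and scales well in parallel.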

16.
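Intelligent handling of abnormal information during information processing is a frontier and challenging research direction. For the anomaly in which acquired information has missing values because of noise interference and other factors, an intelligent classification algorithm for incomplete (missing) data is proposed. For each incomplete sample, the method first produces one or more versions of the estimated sample from the class information of the neighbors found, which preserves imputation accuracy while effectively characterizing the imprecision caused by the missing values; a classifier then classifies the samples carrying the estimated values. Finally, a new belief-based classification method is proposed within the evidential reasoning framework: samples whose class is hard to decide are assigned to the corresponding compound class, which describes the uncertainty in the sample's class caused by the missing values and lowers the risk of misclassification. The algorithm is validated on real data sets from the UCI repository, and the experimental results show that it handles incomplete-data classification effectively.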

17.
Manual segmentation of Magnetic Resonance Images (MRI) is a time-consuming process, so automatic segmentation of brain MR images has attracted growing attention in recent years. In this paper, we introduce the Dynamic Classifier Selection Markov Random Field (DCSMRF) algorithm for supervised segmentation of brain MR images into three main tissues: White Matter (WM), Gray Matter (GM) and Cerebrospinal Fluid (CSF). DCSMRF combines a novel ensemble method with the Markov Random Field (MRF) algorithm and tries to obtain the advantages of both. For the ensemble part of DCSMRF, we propose an ensemble method called Dynamic Classifier System-Weighted Local Accuracy (DCS-WLA), which is a type of Combination of Multiple Classifier (CMC) algorithm. The MRF algorithm is then used to incorporate spatial, contextual and textural information. For the MRF part, an energy function based on the output of the DCS-WLA algorithm is proposed, and the maximum of the Maximum A Posteriori (MAP) criterion is then searched for to obtain the optimal segmentation. The MRF algorithm is applied much like a post-processing step in which only a subset of pixels is selected for optimization, so a large part of the search space is pruned. Consequently, the computational burden of the proposed algorithm is more tolerable than that of conventional MRF-based methods. Moreover, by employing ensemble algorithms, the accuracy and reliability of the final results are enhanced compared with the individual methods.

18.
Video surveillance on highways is an active topic and a major challenge in Intelligent Transportation Systems. In applications that require object extraction, cast shadows cause shape distortions and object merging that degrade the performance of high-level algorithms. Shadow elimination improves the performance of video object extraction, classification and tracking. In addition, recognizing the type of a detected object is essential for tracking it reliably and estimating traffic parameters correctly. This paper presents two approaches to enhance automatic traffic surveillance systems. The first deals with the elimination of shadows and the second concerns the classification of vehicles, based on robust vision and image processing. For moving shadow elimination, a contrast model is proposed to describe and remove dynamic shadows, based on the idea that a shadow transformation is a change in contrast. For vehicle classification, Hu moments are calculated in a way that reduces perspective effects and are used to describe vehicles in a knowledge base. Experimental results on various challenging video sequences show that the proposed classification approach outperforms the classification methods of related works (with a classification accuracy of 96.96%), and that the shadow elimination approach performs better than the compared works (with a detection rate of 95–99% and a discrimination rate of 85.7–89%).

19.
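Recursive Flow Classification (RFC) is currently one of the faster software-based multi-dimensional packet classification algorithms. As the rule set grows, however, the algorithm occupies a large amount of memory, and its storage overhead becomes huge. To address this problem, Merge_RFC, a memory-optimized RFC algorithm, is proposed. It introduces a bit-string merging method that compresses the cross-product tables of RFC and eliminates redundant space. Simulation results show that Merge_RFC reduces the memory occupied by RFC by more than 80% while maintaining a high classification speed.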

20.
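Emotional sentence classification is one of the core problems in emotion analysis and aims to determine the emotion category of a sentence automatically. Traditional emotional sentence classification methods based on the emotion cognition model (the OCC model) mostly rely on lexicons and rules, and their accuracy is low when textual information is missing. This paper proposes an emotional sentence classification method based on the OCC model and Bayesian networks. By analyzing the emotion generation rules of the OCC model, emotion appraisal variables are extracted and combined with the emoticon features contained in emotional sentences to build a Bayesian network for emotion classification. Probabilistic inference then enables sentence-level emotion classification and reduces the impact of missing information in the sentence. A comparison with the results of the emotional sentence classification subtask of the NLPCC2014 Chinese microblog emotion analysis evaluation shows that the proposed method is effective.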

Copyright©北京勤云科技发展有限公司  京ICP备09084417号