20 similar documents found (search time: 9 ms)
This paper presents a new form of exemplar-based learning, based on a representation scheme called feature partitioning, and a particular implementation of this technique called CFP (for Classification by Feature Partitioning). Learning in CFP is accomplished by storing the objects separately in each feature dimension as disjoint sets of values called segments. A segment is expanded through generalization or specialized by dividing it into sub-segments. Classification is based on a weighted voting among the individual predictions of the features, which are simply the class values of the segments corresponding to the values of a test instance for each feature. An empirical evaluation of CFP and its comparison with two other classification techniques that consider each feature separately are given.
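The voting step described in this abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the CFP implementation: the segment boundaries, the feature weights, and the names `segments`, `weights`, `predict_feature`, and `classify` are all hypothetical, and the learning of segments (generalization and specialization) is left out.

```python
from collections import defaultdict

# Hypothetical learned model: per feature, disjoint segments (low, high, class).
segments = {
    "height": [(0.0, 1.5, "child"), (1.5, 2.5, "adult")],
    "age": [(0, 12, "child"), (12, 120, "adult")],
}
# Hypothetical feature weights used in the vote.
weights = {"height": 0.4, "age": 0.6}


def predict_feature(feature, value):
    """Return the class of the segment containing value, or None if no segment does."""
    for low, high, label in segments[feature]:
        if low <= value < high:
            return label
    return None


def classify(instance):
    """Weighted vote over the per-feature predictions."""
    votes = defaultdict(float)
    for feature, value in instance.items():
        label = predict_feature(feature, value)
        if label is not None:
            votes[label] += weights[feature]
    return max(votes, key=votes.get) if votes else None
```

With these made-up segments, `classify({"height": 1.6, "age": 10})` returns `"child"`: the height feature votes for `"adult"` with weight 0.4, and the age feature votes for `"child"` with weight 0.6.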

We propose a novel linear dimensionality reduction algorithm, namely Locally Regressive Projections (LRP). To capture the local discriminative structure, for each data point, a local patch consisting of this point and its neighbors is constructed. LRP assumes that the low dimensional representations of points in each patch can be well estimated by a locally fitted regression function. Specifically, we train a linear function for each patch via ridge regression, and use its fitting error to measure how well the new representations can respect the local structure. The optimal projections are thus obtained by minimizing the summation of the fitting errors over all the local patches. LRP can be performed under either supervised or unsupervised settings. Our theoretical analysis reveals the connections between LRP and the classical methods such as PCA and LDA. Experiments on face recognition and clustering demonstrate the effectiveness of our proposed method.

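张要, 马盈仓, 朱恒东, 李恒, 陈程 《计算机工程》 (Computer Engineering), 2022, 48(3): 90-99+106
Multi-label feature selection algorithms usually assume that the data and the labels are linked by a particular relationship, and solve the feature selection problem on that basis under the constraint of a regularization term. The relationship may, however, be a combination of two or more relationships. To describe the relationship between data and labels accurately and to remove irrelevant and redundant features, this paper proposes FSML, a multi-label feature selection algorithm based on the logistic regression model and the label manifold structure. The loss function of the logistic regression model is used to learn a regression coefficient matrix, the label manifold structure is used to learn a weight matrix for the data features, and the L2,1-norm flexibly combines the two matrices, constraining their sparsity and thereby performing multi-label feature selection. Experiments on classic multi-label datasets show that, compared with feature selection algorithms such as CMLS and SCLS, FSML performs well on five evaluation metrics (Hamming loss, ranking loss, one-error, coverage, and average precision) and describes the relationship between data and labels more accurately.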

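A Multi-Instance Regression Algorithm Based on Manifold Learning   Total citations: 2, self-citations: 0, citations by others: 2
詹德川, 周志华 《计算机学报》 (Chinese Journal of Computers), 2006, 29(11): 1948-1955
Multi-instance learning is a new machine learning framework. Previous research has focused mainly on multi-instance classification, but multi-instance regression has recently attracted the attention of the international machine learning community. Manifold learning aims to recover the intrinsic structure of nonlinearly distributed data and can be used for nonlinear dimensionality reduction. Based on manifold learning techniques, this paper proposes the Mani MIL algorithm for multi-instance regression. The algorithm first reduces the dimensionality of the instances in the training bags, and then exploits the collapse that appears in the reduced representation to make predictions for multi-instance bags. Experiments show that Mani MIL outperforms existing multi-instance algorithms such as Citation-kNN.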

First Order Regression   Total citations: 2, self-citations: 0, citations by others: 2
Karalič Aram, Bratko Ivan 《Machine Learning》, 1997, 26(2-3): 147-176

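An Orthogonal Discriminant Locality Preserving Projection Method Based on Schur Decomposition   Total citations: 2, self-citations: 0, citations by others: 2
Face recognition is an important research topic in pattern recognition, and many methods have been proposed to address it. Recently, many manifold learning algorithms have been proposed and successfully applied to face recognition. These methods preserve the local structure of face image data and can also discover the nonlinear structure of faces. Among them, locality preserving projections (LPP) is one of the most effective. Building on LPP, this paper proposes a new face recognition method: orthogonal discriminant locality preserving projections based on Schur decomposition (ODLPPS). Compared with LPP, ODLPPS incorporates the difference between the between-class scatter and the within-class scatter into the LPP objective function and obtains orthogonal basis vectors. Experiments on the ORL and Yale face databases show that the method outperforms several existing methods in recognition performance, including eigenface, Fisherface, LPP, and orthogonal LPP (OLPP).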

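As an unsupervised classification method, the extreme learning machine (ELM) offers fast learning, good generalization, and good approximation ability. With the development of unsupervised learning, integrating ELM with autoencoders has become a new approach to extracting features from unlabeled datasets. For example, the extreme learning machine autoencoder (ELM-AE) is an unsupervised neural network that finds the principal components representing the original samples and its learning process without iteration. It reconstructs the input signal to obtain the main features of the original samples and takes the global information of the original data into account to avoid information loss. However, such methods ignore the intrinsic manifold structure of the data, that is, the neighborhood relationships among samples. Drawing on the idea of ELM-AE, this paper proposes a manifold-based extreme learning machine autoencoder algorithm (M-ELM). The algorithm is a nonlinear unsupervised feature extraction method that uses manifold learning to preserve the local information of the data and learns the similarity matrix during feature extraction. In experiments on the IRIS dataset, an EEG dataset, and a gene expression dataset, the accuracy of the algorithm after k-means clustering is compared with that of the unsupervised methods PCA, LPP, NPE, LE, and ELM-AE to demonstrate its effectiveness.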

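Irrelevant and redundant features in a dataset make learning tasks harder. Feature selection addresses this problem effectively, improving learning efficiency and learner performance. Most existing feature selection methods target classification; few target regression, and those that do are sensitive to outliers when the dataset contains them. Some methods improve robustness by weighting the per-sample loss function, but the weights are generally set in advance and stay fixed during feature selection and learner training, so the methods adapt poorly. To address these problems, this paper proposes a regression feature selection method for data with outliers, adaptive weight LASSO (AWLASSO). It first updates the sample errors from the current regression coefficients, then uses an adaptive regularization term to assign a small weight to the loss of samples whose error exceeds the current threshold and a large weight to the loss of samples whose error is below it, and then re-estimates the regression coefficients under the reweighted loss function; this process is iterated. AWLASSO uses the threshold to control whether a sample takes part in estimating the regression coefficients: only samples with small errors participate, so a better coefficient estimate is obtained when the iteration finishes. In addition, the threshold is not fixed but increases steadily (its initial value is small so that the initial coefficient estimate is fairly accurate), which lets samples misjudged as outliers re-enter the training set and ensures that the training set contains enough samples. Samples whose error exceeds the maximum threshold are costly to learn, so the algorithm identifies them as outliers and sets the weight of their loss to 0, which effectively reduces their influence. Experiments on synthetic and standard datasets show that, on datasets containing outliers, the proposed method is more robust and yields sparser models than classical methods.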
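The reweighting loop in this abstract (estimate coefficients, compute per-sample errors, down-weight samples above a growing threshold, re-estimate) can be illustrated with a deliberately simplified sketch. Several things here are assumptions rather than the paper's method: the model is a one-variable line fitted by weighted least squares with no L1 penalty (so it shows only the outlier handling, not the sparsity), the weights are hard 0/1 values, the initial estimate uses all samples, and the threshold schedule and the names `weighted_line_fit` and `adaptive_weight_fit` are hypothetical.

```python
def weighted_line_fit(xs, ys, ws):
    """Weighted least squares for y = a*x + b.

    Assumes at least two samples with nonzero weight and distinct x values.
    """
    total = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / total
    my = sum(w * y for w, y in zip(ws, ys)) / total
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def adaptive_weight_fit(xs, ys, thresholds):
    """Iterate: errors -> 0/1 weights under an increasing threshold -> refit."""
    ws = [1.0] * len(xs)
    a, b = weighted_line_fit(xs, ys, ws)  # assumed initial estimate: all samples
    for t in thresholds:  # thresholds are expected in increasing order
        errors = [abs(y - (a * x + b)) for x, y in zip(xs, ys)]
        new_ws = [1.0 if e <= t else 0.0 for e in errors]
        if sum(new_ws) >= 2:  # keep the previous fit if too few samples qualify
            ws = new_ws
            a, b = weighted_line_fit(xs, ys, ws)
    return a, b, ws


# Synthetic data: y = 2x + 1, with one outlier at x = 7.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [1, 3, 5, 7, 9, 11, 13, 50]
a, b, ws = adaptive_weight_fit(xs, ys, thresholds=[3.0, 6.0, 12.0])
```

On this synthetic data the first, small threshold admits only three samples. The larger thresholds then let the remaining inliers back in, and the sample at `x = 7` stays above the largest threshold and ends with weight 0.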

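Feature extraction is an important research area in face recognition, and whether discriminative features can be extracted effectively is key to the quality of a face recognition algorithm. Typical face recognition algorithms are based on image vectors and must flatten a 2D face image into a 1D vector, which both destroys the original spatial relationships among pixels and produces vectors of very high dimension. To avoid this, the paper proposes a face recognition algorithm that works directly on image matrices: two-dimensional locality preserving projections. The algorithm extends locality preserving projections so that it can process 2D image matrices directly, and it introduces sample class information when constructing the similarity matrix, so it can effectively extract 2D discriminative features from face images. A nearest-neighbor classifier is used to estimate the recognition rate. Experiments on the AT&T face database show that the method achieves a better recognition rate than the Eigenface, Fisherface, and Laplacianface algorithms.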

Incremental Feature Selection   Total citations: 6, self-citations: 3, citations by others: 6
Feature selection is a problem of finding relevant features. When the number of features of a dataset is large and its number of patterns is huge, an effective method of feature selection can help in dimensionality reduction. An incremental probabilistic algorithm is designed and implemented as an alternative to the exhaustive and heuristic approaches. Theoretical analysis is given to support the idea of the probabilistic algorithm in finding an optimal or near-optimal subset of features. Experimental results suggest that (1) the probabilistic algorithm is effective in obtaining optimal/suboptimal feature subsets; (2) its incremental version expedites feature selection further when the number of patterns is large and can scale up without sacrificing the quality of selected features.

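The paper states the basic assumptions for deriving a density function, derives the function, and uses it to partition the data into density regions. It gives a specific procedure for estimating the labels of unlabeled points within the same density range, and finally describes the implementation steps of a semi-supervised regression algorithm based on density distribution. The algorithm labels the unlabeled points, reduces the estimation error of their label values, and improves estimation accuracy.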

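A Network Intrusion Detection Method Based on Feature Selection   Total citations: 1, self-citations: 0, citations by others: 1
This paper studies the loss of detection-model accuracy and the excessive training time caused by redundant or noisy features in existing intrusion detection algorithms. It introduces feature selection into intrusion detection and proposes an intrusion detection method based on feature selection. Different discretization and feature selection algorithms are used to generate several distinct optimal feature subsets, each feature subset is normalized, and a classification algorithm then learns a model from the extracted features. In experiments, the method is compared with intrusion detection methods based on traditional algorithms (decision trees, naive Bayes, and support vector machines). The results show that the method effectively improves attack detection accuracy and reduces model training time.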

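杨国亮, 谢乃俊, 余嘉玮, 梁礼明 《计算机科学》 (Computer Science), 2015, 42(3): 296-300, 306
To keep the low-rank property of the data unchanged during feature extraction, this paper proposes a linear preserving projection algorithm based on low-rank representation for dimensionality reduction. It allows the data in the reduced low-dimensional space to retain the low-rank property they had in the original high-dimensional space, so that the low-dimensional subspace of the data is learned accurately. Two different low-rank representation models are constructed to reveal low-rank weights with two different structural properties, and the low-dimensional space of the high-dimensional data is then solved for with the goal of preserving both low-rank weight relationships. Experiments on the ORL and Yale face databases show that the algorithm is more effective than traditional feature extraction methods.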

Instance-Based Learning Algorithms   Total citations: 45, self-citations: 1, citations by others: 45
Storing and using specific instances improves the performance of several supervised learning algorithms. These include algorithms that learn decision trees, classification rules, and distributed networks. However, no investigation has analyzed algorithms that use only specific instances to solve incremental learning tasks. In this paper, we describe a framework and methodology, called instance-based learning, that generates classification predictions using only specific instances. Instance-based learning algorithms do not maintain a set of abstractions derived from specific instances. This approach extends the nearest neighbor algorithm, which has large storage requirements. We describe how storage requirements can be significantly reduced with, at most, minor sacrifices in learning rate and classification accuracy. While the storage-reducing algorithm performs well on several real-world databases, its performance degrades rapidly with the level of attribute noise in training instances. Therefore, we extended it with a significance test to distinguish noisy instances. This extended algorithm's performance degrades gracefully with increasing noise levels and compares favorably with a noise-tolerant decision tree algorithm.
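One way to picture the storage reduction mentioned in this abstract is a nearest-neighbor learner that keeps an incoming instance only when the instances stored so far would misclassify it. The abstract does not spell out the storage rule, so the rule below is an assumption, as are the names `nearest`, `train_storage_reducing`, and `predict` and the toy data; the significance test for noisy instances is not covered.

```python
def nearest(stored, x):
    """Return the stored (point, label) pair closest to x (squared Euclidean distance)."""
    return min(stored, key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], x)))


def train_storage_reducing(stream):
    """Process instances one at a time, storing only those the current store gets wrong."""
    stored = []
    for x, label in stream:
        if not stored or nearest(stored, x)[1] != label:
            stored.append((x, label))
    return stored


def predict(stored, x):
    """Classify x by the label of its nearest stored instance."""
    return nearest(stored, x)[1]


# Toy stream: two well-separated classes.
data = [
    ((0.0, 0.0), "a"),
    ((0.1, 0.2), "a"),
    ((5.0, 5.0), "b"),
    ((5.1, 4.9), "b"),
    ((0.2, 0.1), "a"),
]
stored = train_storage_reducing(data)
```

On this toy stream only two of the five instances are stored, one per class, and `predict(stored, (4.8, 5.2))` still returns `"b"`. Because every misclassified instance is kept, a noisy instance would also be stored, which matches the noise sensitivity the abstract describes.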

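Subspace learning is an important research direction in machine learning. To reduce its complexity, Cai et al. proposed the spectral regression dimensionality reduction framework, and proposed efficient spectral regression for subspace learning in which the corresponding graph is constructed from labels. In recent years, advances in quantum computing have made it possible to reduce the complexity of subspace learning algorithms further. Meng et al. were the first to propose a quantum spectral regression algorithm (the MYXZ algorithm). MYXZ uses sparse Hamiltonian simulation to handle the matrix generated from the weight matrix, but in many cases this matrix is dense. For this case, the paper points out the limitations of MYXZ and proposes an improved quantum spectral regression algorithm. The improved algorithm uses quantum singular value estimation and achieves a polynomial speedup over MYXZ on dense matrices. The paper also proposes a new quantum algorithm that accelerates classical efficient spectral regression; MYXZ cannot handle this class of problems. The new algorithm uses quantum ridge regression and quantum matrix-vector multiplication and, under the same parameter conditions, achieves a polynomial speedup over the classical algorithm.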

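The support vector machine (SVM) is a learning technique based on the structural risk minimization principle and a new regression method with good generalization performance. Designing fast and effective regression estimation algorithms remains one of the open problems in practical applications of SVMs. This paper improves on the standard SVM regression estimation algorithm and compares the improved algorithm with the standard one in terms of learning speed and regression accuracy. Experimental results show that the proposed algorithm offers more freedom when trading off learning speed against regression accuracy.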

High sensitivity to irrelevant features is arguably the main shortcoming of simple lazy learners. In response to it, many feature selection methods have been proposed, including forward sequential selection (FSS) and backward sequential selection (BSS). Although they often produce substantial improvements in accuracy, these methods select the same set of relevant features everywhere in the instance space, and thus represent only a partial solution to the problem. In general, some features will be relevant only in some parts of the space; deleting them may hurt accuracy in those parts, but selecting them will have the same effect in parts where they are irrelevant. This article introduces RC, a new feature selection algorithm that uses a clustering-like approach to select sets of locally relevant features (i.e., the features it selects may vary from one instance to another). Experiments in a large number of domains from the UCI repository show that RC almost always improves accuracy with respect to FSS and BSS, often with high significance. A study using artificial domains confirms the hypothesis that this difference in performance is due to RC's context sensitivity, and also suggests conditions where this sensitivity will and will not be an advantage. Another feature of RC is that it is faster than FSS and BSS, often by an order of magnitude or more.
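For readers unfamiliar with the FSS baseline named in this abstract, the sketch below shows a common form of forward sequential selection: start with no features and greedily add the one that most improves an accuracy estimate, stopping when nothing improves. The evaluation criterion (leave-one-out accuracy of a 1-nearest-neighbor classifier), the stopping rule, the tie-breaking, and the names `loo_accuracy` and `forward_sequential_selection` are assumptions made for illustration; the sketch does not implement RC.

```python
def loo_accuracy(X, y, features):
    """Leave-one-out accuracy of 1-NN using only the given feature indices.

    Assumes at least two samples.
    """
    correct = 0
    for i in range(len(X)):
        best_j, best_d = None, float("inf")
        for j in range(len(X)):
            if j == i:
                continue
            d = sum((X[i][f] - X[j][f]) ** 2 for f in features)
            if d < best_d:
                best_j, best_d = j, d
        correct += y[best_j] == y[i]
    return correct / len(X)


def forward_sequential_selection(X, y):
    """Greedily add the feature that most improves accuracy; stop when none does."""
    selected, best_score = [], 0.0
    remaining = set(range(len(X[0])))
    while remaining:
        candidates = [(loo_accuracy(X, y, selected + [f]), f) for f in sorted(remaining)]
        score, feature = max(candidates, key=lambda c: c[0])  # ties: lowest index
        if score <= best_score:
            break
        selected.append(feature)
        remaining.remove(feature)
        best_score = score
    return selected


# Toy data: feature 0 separates the classes, feature 1 is misleading.
X = [(0.0, 5.0), (0.1, 1.0), (0.2, 9.0), (1.0, 5.1), (1.1, 1.1), (1.2, 9.1)]
y = [0, 0, 0, 1, 1, 1]
chosen = forward_sequential_selection(X, y)
```

On the toy data the procedure selects only feature 0. The result is a single global subset, which is exactly the limitation the abstract attributes to FSS and BSS: the same features are used everywhere in the instance space.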

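Financial forecasting is an important part of financial management and matters greatly for decisions such as corporate investment and budgeting. Regression analysis for financial forecasting uses a series of historical data to obtain the functional relationship between each balance sheet item and sales, uses that relationship to forecast the assets and liabilities for the planned sales, and then forecasts financing needs. Excel can handle the regression analysis for financial forecasting effectively. This paper uses multiple regression forecasting of sales as an example to illustrate the application of Excel to regression analysis in financial forecasting.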

Tabular knowledge-based systems are known to be extremely versatile for verification and validation of knowledge bases. However, a major disadvantage of these systems is the combinatorial explosion that accompanies addition of new attributes or condition entries in the table. One of the means of alleviating this problem in tabular knowledge-based systems is through modularization, which is the process of breaking a big comprehensive table into smaller tables that are easy to deal with. In this study, we propose and illustrate another means to deal with this problem through use of feature selection methodology. The proposed method can be used synergistically with modularization to alleviate problems associated with combinatorial explosion in tabular knowledge bases.

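A Simplification Algorithm for Support Vector Regression   Total citations: 17, self-citations: 0, citations by others: 17
田盛丰, 黄厚宽 《软件学报》 (Journal of Software), 2002, 13(6): 1169-1172
To address the computational complexity caused by the large number of support vectors when support vector machines are applied to function estimation, this paper proposes a simplification algorithm that greatly reduces the number of support vectors and thus simplifies application. The simplification algorithm can also be combined with the least squares support vector machine algorithm and the sequential minimization algorithm to achieve efficient learning with few support vectors.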
