1.
Image reconstruction based on a supervised-learning deep autoencoder
To address the problem of reconstructing damaged information in digital images, a new method is proposed that applies the classical unsupervised autoencoder (Auto-Encoder, AE) to supervised learning, and the corresponding deep model structure and training strategy are studied. By designing multiple groups of supervised single-layer AE models, group-wise "progressive learning" and "associative encoding" strategies are proposed and a new supervised deep AE model structure is constructed. For the new structure, a many-to-one training method (multiple forms of one input sample mapped to a single output) replaces the classical AE's one-to-one training (one input sample mapped to one output). Applying this structure and training strategy to reconstruction tests on images with partially damaged or occluded data improves the model's ability to encode and reconstruct the features of damaged data. Experimental results show that the proposed method reconstructs damaged and occluded image samples well and adapts to them effectively.
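The many-to-one idea above can be illustrated with a minimal numpy sketch: several corrupted variants of each sample are all trained to reconstruct the same clean target. This is a toy single-layer model on synthetic data, not the paper's multi-group deep architecture; all names and sizes here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 50 clean "images" of 16 dims each (synthetic stand-in).
X_clean = rng.random((50, 16))

# Many-to-one pairs: 5 corrupted variants of each sample all map
# to the same clean target (masking simulates occlusion/damage).
def corrupt(x, rng, frac=0.3):
    mask = rng.random(x.shape) > frac   # zero out ~30% of entries
    return x * mask

X_in = np.vstack([corrupt(X_clean, rng) for _ in range(5)])
X_out = np.vstack([X_clean] * 5)

# Single-layer AE trained by gradient descent on reconstruction loss.
d, h = 16, 8
W1 = rng.normal(0, 0.1, (d, h)); b1 = np.zeros(h)
W2 = rng.normal(0, 0.1, (h, d)); b2 = np.zeros(d)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

lr = 0.5
for _ in range(500):
    H = sigmoid(X_in @ W1 + b1)          # encode corrupted input
    Y = H @ W2 + b2                      # decode toward the clean target
    err = Y - X_out
    gW2 = H.T @ err / len(X_in); gb2 = err.mean(0)
    dH = err @ W2.T * H * (1 - H)
    gW1 = X_in.T @ dH / len(X_in); gb1 = dH.mean(0)
    W1 -= lr * gW1; b1 -= lr * gb1; W2 -= lr * gW2; b2 -= lr * gb2

# Reconstruct from a freshly corrupted copy of the clean data.
recon = sigmoid(corrupt(X_clean, rng) @ W1 + b1) @ W2 + b2
print(np.mean((recon - X_clean) ** 2))   # reconstruction MSE
```

The key difference from a standard AE is only in the data pairing: the target matrix repeats the clean samples rather than echoing each (corrupted) input.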
2.
In recent years, deep learning has shown excellent performance in computer vision, yet researchers have found that deep learning systems lack robustness: adding small perturbations imperceptible to humans to the input can cause a deep learning model to fail. Inputs crafted to induce such failures are called adversarial examples. We propose the iterative autoencoder, a new defense against adversarial examples, whose principle is to push adversarial examples that lie far from the data manifold back toward it. The input is first passed through the iterative autoencoder, and the reconstructed output is then fed to the classifier. On clean samples, classification accuracy after the iterative autoencoder is similar to accuracy on the original samples, so the defense does not noticeably degrade model performance. On adversarial examples, our experiments show that even under state-of-the-art attacks the defense maintains high classification accuracy and a low attack success rate.
3.
Clustering high-dimensional, complex data usually requires dimensionality reduction before clustering, but common reduction methods ignore within-class aggregation and inter-sample correlations, so the reduction may not match the clustering algorithm and clustering information is lost. The extreme learning machine autoencoder (ELM-AE), a nonlinear unsupervised dimensionality reduction method, has been widely applied to dimensionality reduction and denoising in recent years because of its fast learning speed and good generalization. To keep high-dimensional data's original subspace structure after projection into a low-dimensional space, a multilayer extreme learning machine autoencoder based on subspace structure preserving (ML-SELM-AE) is proposed. The method preserves the multi-subspace structure of the clustered samples while capturing deep features of the sample set with a multilayer ELM autoencoder. Experimental results show that the method effectively improves clustering accuracy on UCI data, EEG data, and gene expression profile data while achieving high learning efficiency.
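The speed of the ELM autoencoder comes from its training scheme: the hidden layer is random and only the output weights are solved in closed form. A minimal single-layer numpy sketch (not the paper's multilayer, subspace-preserving variant; sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 30))      # 200 samples, 30 features

n_hidden, C = 10, 1e-2                  # embedding dim, ridge penalty
W = rng.standard_normal((30, n_hidden)) # random, untrained input weights
b = rng.standard_normal(n_hidden)
H = np.tanh(X @ W + b)                  # random-feature hidden layer

# Output weights solved in closed form (ridge regression) so that
# H @ beta approximates X -- the only "training" an ELM-AE needs,
# hence the fast learning speed noted in the abstract.
beta = np.linalg.solve(H.T @ H + C * np.eye(n_hidden), H.T @ X)

X_low = X @ beta.T                      # ELM-AE projection to 10 dims
print(X_low.shape)                      # (200, 10)
```

Stacking several such layers, each trained the same way on the previous layer's output, gives the multilayer variant.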
4.
5.
To address the reduced robustness caused by the random assignment of parameters in the extreme learning machine (ELM) and its marked sensitivity to noise, the denoising autoencoder (DAE) is combined with ELM to form a denoising-autoencoder-based extreme learning machine (DAE-ELM). First, the DAE produces the ELM's input data, input weights, and hidden-layer parameters; then the ELM solves for the hidden-layer output weights, completing classifier training. On one hand, the algorithm inherits the strengths of the DAE: the automatically extracted features are more representative and robust and strongly suppress noise. On the other hand, it removes the randomness of ELM parameter assignment and so improves robustness. Experimental results show that, in the noise-free setting, the classification error rate of DAE-ELM drops by at least 5.6% on MNIST, 3.0% on Fashion-MNIST, 2.0% on Rectangles, and 12.7% on Convex compared with the ELM, PCA-ELM, and SAA-2 algorithms.
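The second stage of the pipeline, solving the ELM's output weights by pseudo-inverse, can be sketched as follows. Note this toy version uses random hidden weights, exactly the step that DAE-ELM replaces with weights taken from a trained denoising autoencoder; the data and sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

# Two Gaussian blobs as a toy binary classification problem.
X = np.vstack([rng.normal(-2, 1, (100, 5)), rng.normal(2, 1, (100, 5))])
y = np.array([0] * 100 + [1] * 100)
T = np.eye(2)[y]                         # one-hot targets

# Plain ELM: a random hidden layer; in DAE-ELM these weights would
# instead come from a trained denoising autoencoder.
W = rng.standard_normal((5, 20)); b = rng.standard_normal(20)
H = np.tanh(X @ W + b)
beta = np.linalg.pinv(H) @ T             # output weights by pseudo-inverse

pred = (np.tanh(X @ W + b) @ beta).argmax(1)
print((pred == y).mean())                # training accuracy
```

Because only `beta` is solved, training is a single linear-algebra step, which is what makes the ELM half of the pipeline fast.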
6.
7.
To address the limited feature-extraction capacity of deep clustering algorithms on multivariate time series (MTS) data, a new deep clustering model (MDTC) is proposed. To extract key MTS features and reduce dimensionality, an autoencoder structure is proposed in which one-dimensional convolutions learn feature representations over the attribute and temporal dimensions, combined with recurrent neural network layers. To strengthen the model's representation of temporal features, an MCBAM temporal attention module is proposed that enhances the representations of different time segments of an MTS sequence. In experiments on nine public UEA multivariate time-series datasets, the model's autoencoder structure outperformed other autoencoders by 2% to 9% on seven datasets, and the MCBAM module outperformed other attention modules by 0.3% to 2% on six datasets. The experiments demonstrate the effectiveness of the MDTC structure and the MCBAM module, and the model also performs strongly against other clustering algorithms.
8.
Because each subject has only one known sample, within-class variation cannot be described, so many face recognition algorithms perform poorly on the single-sample face recognition problem. This paper therefore proposes a single-sample face recognition algorithm based on deep autoencoders. The algorithm first trains a deep autoencoder on all known samples to obtain a generalized deep autoencoder, then fine-tunes it with the single sample of each subject to obtain a class-specific deep autoencoder. At recognition time, the test image is fed into each class-specific deep autoencoder, yielding a reconstructed image of that class containing the same within-class variation as the test image; the reconstructed images are used to train a softmax regression model that classifies the test image. Tests on public benchmarks, compared with other algorithms under identical conditions, show that the proposed algorithm achieves a better recognition rate while requiring less average time to recognize an image.
9.
The traditional deep belief network (DBN) initializes the weights and biases of its restricted Boltzmann machines (RBMs) randomly. Although this partly overcomes the BP algorithm's tendency to fall into local optima and its long training time, random initialization still leaves a large gap between the network's reconstruction and the original input, preventing further gains in either accuracy or learning efficiency. To address this, a deep network model based on the sparse denoising autoencoder (SDAE) is proposed, whose core is the SDAE's feature extraction. First the sparse denoising autoencoder is trained; then its learned weights and biases are used to initialize the deep belief network; finally the DBN is trained. Model performance was tested on the Poker Hand card-game dataset and the MNIST and USPS handwriting datasets. On Poker Hand, the method reduces the error rate by 46.4% relative to the traditional DBN and raises accuracy and recall by 15.56% and 14.12%, respectively. The results show that the proposed method effectively improves model performance.
10.
11.
An overview of cognition-based manifold learning methods
Manifold learning is a recently emerged machine learning approach that has drawn growing attention from computer scientists and cognitive scientists. To deepen understanding of manifold learning, this paper starts from the basic concepts of manifolds and manifold learning and traces their development. It analyzes the respective strengths and weaknesses of the main current manifold algorithms and presents an application example of LLE, showing that, compared with traditional linear dimensionality reduction methods such as PCA, manifold learning can effectively discover the intrinsic dimensionality of nonlinear high-dimensional data and perform effective dimensionality reduction and data analysis. Finally, future research directions for manifold learning are discussed with a view to broadening its fields of application.
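The LLE algorithm cited above has a compact closed-form structure: reconstruct each point from its neighbours, then find the embedding whose points admit the same reconstruction weights. A minimal numpy sketch on a toy curve (parameter choices are illustrative, not from the paper):

```python
import numpy as np

def lle(X, n_neighbors=10, n_components=2, reg=1e-3):
    """Minimal Locally Linear Embedding sketch."""
    n = len(X)
    # 1. k nearest neighbours by Euclidean distance.
    d2 = ((X[:, None] - X[None]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)
    nbrs = np.argsort(d2, axis=1)[:, :n_neighbors]

    # 2. Reconstruction weights: each point as an affine combination of
    #    its neighbours (local Gram system, regularised for stability).
    W = np.zeros((n, n))
    for i in range(n):
        Z = X[nbrs[i]] - X[i]
        G = Z @ Z.T
        G += reg * np.trace(G) * np.eye(n_neighbors)
        w = np.linalg.solve(G, np.ones(n_neighbors))
        W[i, nbrs[i]] = w / w.sum()

    # 3. Embedding: bottom eigenvectors of (I - W)^T (I - W),
    #    skipping the constant eigenvector.
    M = (np.eye(n) - W).T @ (np.eye(n) - W)
    vals, vecs = np.linalg.eigh(M)
    return vecs[:, 1:n_components + 1]

# A noisy circle in 3-D: low intrinsic dimension inside a higher
# ambient space, the situation manifold learning targets.
rng = np.random.default_rng(3)
t = np.linspace(0, 2 * np.pi, 120, endpoint=False)
X = np.c_[np.cos(t), np.sin(t), 0.05 * rng.standard_normal(120)]
Y = lle(X)
print(Y.shape)                           # (120, 2)
```

A linear method such as PCA would simply rotate this data, while LLE recovers coordinates that follow the curve itself.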
12.
Feng Pan, Jiandong Wang, Xiaohui Lin. Neurocomputing, 2011, 74(5): 812-819
Most manifold learning algorithms adopt the k-nearest-neighbors function to construct the adjacency graph. However, severe bias may be introduced if the samples are not uniformly distributed in the ambient space. In this paper a semi-supervised dimensionality reduction method is proposed to alleviate this problem. Based on the notion of local margin, we simultaneously maximize the separability between different classes and estimate the intrinsic geometric structure of the data using both the labeled and unlabeled samples. For high-dimensional data, a discriminant subspace is derived by maximizing the cumulative local margins. Experimental results on high-dimensional classification tasks demonstrate the efficacy of our algorithm.
13.
Data-driven non-parametric models, such as manifold learning algorithms, are promising data analysis tools. However, to fit an off-training-set data point into a learned model, one must first "locate" the point in the training set. This query has a time cost proportional to the problem size, which limits the model's scalability. In this paper, we address the problem of selecting a subset of data points as landmarks that help locate novel points on the data manifold. We propose a new category of landmarks defined by the following property: the way the landmarks represent the data in the ambient Euclidean space should resemble the way they represent the data on the manifold. Given the data points and a subset of landmarks, we provide procedures to test whether the proposed property holds for the choice of landmarks. If the data points are organized in a neighbourhood graph, as is often done in practice, we interpret the proposed property in terms of the graph topology. We also discuss the extent to which the topology is preserved for landmark sets passing our test procedure. Another contribution of this work is an optimization-based scheme to adjust an existing landmark set, which can improve the reliability of representing the manifold data. Experiments on synthetic and natural data support the proposed properties and algorithms.
14.
Non-negative matrix factorization (NMF) has become a popular technique for finding low-dimensional representations of data. While standard NMF can only be performed in the original feature space, one variant of NMF, named concept factorization, can be naturally kernelized and inherits all the strengths of NMF. To make use of label information, we propose a semi-supervised concept factorization technique called discriminative concept factorization (DCF) for data representation. DCF adopts a unified objective that combines the task of data reconstruction with the task of classification. The two tasks influence each other, yielding a concept factorization adapted to the classification task and a classifier built on the low-dimensional representations. Furthermore, we develop an iterative algorithm to solve the optimization problem through alternating convex programming. Experimental results on three real-world classification tasks demonstrate the effectiveness of DCF.
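For readers unfamiliar with the base technique, standard NMF with the classic Lee-Seung multiplicative updates can be sketched in a few lines of numpy. This is plain unsupervised NMF on synthetic data, not the paper's semi-supervised DCF objective:

```python
import numpy as np

rng = np.random.default_rng(4)

# Non-negative data matrix: 100 samples x 40 features.
V = rng.random((100, 40))
k = 5                                    # number of latent concepts

W = rng.random((100, k)) + 0.1           # low-dimensional representation
H = rng.random((k, 40)) + 0.1            # basis / concept matrix
eps = 1e-9

# Multiplicative updates for the Frobenius-norm objective; both
# factors remain non-negative by construction.
for _ in range(200):
    H *= (W.T @ V) / (W.T @ W @ H + eps)
    W *= (V @ H.T) / (W @ H @ H.T + eps)

err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(err)                               # relative reconstruction error
```

DCF adds a classification loss to the reconstruction objective, so its updates also depend on the labels; the reconstruction core, however, has this shape.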
15.
Semi-supervised dimensionality reduction for analyzing high-dimensional data with constraints
In this paper, we present a novel semi-supervised dimensionality reduction technique to address the problems of inefficient learning and costly computation when coping with high-dimensional data. Our method, named dual subspace projections (DSP), embeds high-dimensional data in an optimal low-dimensional space, which is learned from a few user-supplied constraints and the structure of the input data. The method projects data into two different subspaces: the kernel space and the original input space. Each projection is designed to enforce one type of constraint, and the projections in the two subspaces interact to satisfy the constraints maximally while preserving the intrinsic data structure. Compared to existing techniques, our method has the following advantages: (1) it benefits from constraints even when only a few are available; (2) it is robust and free from overfitting; and (3) it handles nonlinearly separable data, yet learns a linear data transformation. Consequently, our method generalizes easily to new data points and is efficient on large datasets. An empirical study using real data validates our claims, showing that significant improvements in learning accuracy can be obtained after DSP-based dimensionality reduction is applied to high-dimensional data.
16.
17.
In the setting of multi-instance learning, each object is represented by a bag composed of multiple instances instead of by a single instance as in a traditional learning setting. Previous works in this area only concern multi-instance prediction problems where each bag is associated with a binary (classification) or real-valued (regression) label. However, unsupervised multi-instance learning, where bags are without labels, has not been studied. In this paper, the problem of unsupervised multi-instance learning is addressed and a multi-instance clustering algorithm named Bamic is proposed. Briefly, by regarding bags as atomic data items and using some form of distance metric to measure distances between bags, Bamic adapts the popular k-Medoids algorithm to partition the unlabeled training bags into k disjoint groups of bags. Furthermore, based on the clustering results, a novel multi-instance prediction algorithm named Bartmip is developed. Firstly, each bag is re-represented by a k-dimensional feature vector, where the value of the i-th feature is set to be the distance between the bag and the medoid of the i-th group. After that, bags are transformed into feature vectors so that common supervised learners can learn from the transformed feature vectors, each associated with the original bag's label. Extensive experiments show that Bamic effectively discovers the underlying structure of the data set and that Bartmip works quite well on various kinds of multi-instance prediction problems.
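The two steps described, k-Medoids over a bag-level distance and re-representation by distances to medoids, can be sketched in numpy. The bag distance below is an average-Hausdorff-style metric chosen for illustration; the paper may use a different bag metric, and all sizes here are toy assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)

# Toy bags: each bag is an (n_i x 2) array of instances drawn around
# one of two centres, so two clusters of bags exist by construction.
def make_bag(center):
    n = int(rng.integers(3, 8))
    return center + rng.normal(0, 0.3, (n, 2))

bags = [make_bag(np.zeros(2)) for _ in range(15)] + \
       [make_bag(np.full(2, 4.0)) for _ in range(15)]

def bag_dist(A, B):
    """Average Hausdorff-style distance between two bags."""
    d = np.linalg.norm(A[:, None] - B[None], axis=-1)
    return (d.min(1).mean() + d.min(0).mean()) / 2

n = len(bags)
D = np.array([[bag_dist(a, b) for b in bags] for a in bags])

# Bamic-style step: k-Medoids alternation on the bag distances.
k = 2
medoids = list(rng.choice(n, k, replace=False))
for _ in range(20):
    labels = D[:, medoids].argmin(1)
    for c in range(k):
        idx = np.where(labels == c)[0]
        if len(idx):                     # keep old medoid if cluster empties
            medoids[c] = int(idx[D[np.ix_(idx, idx)].sum(0).argmin()])

# Bartmip-style step: each bag becomes its vector of distances to the
# k medoids, usable by any common supervised learner.
features = D[:, medoids]
print(features.shape)                    # (30, 2)
```

Because the clustering only needs the precomputed distance matrix `D`, any bag-level metric can be swapped in without changing the algorithm.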
18.
Francesco Gullo, Andrea Tagarelli, Sergio Greco. Pattern Recognition, 2009, 42(11): 2998-3014
Similarity search and detection is a central problem in time series data processing and management. Most approaches to this problem have been developed around the notion of dynamic time warping, while several dimensionality reduction techniques have been proposed to improve the efficiency of similarity searches. Given the continuous increase in sources of time series data and the critical nature of the real-world applications that use such data, we believe there is a pressing demand for similarity detection in time series that is both accurate and fast. Our proposal is a concise yet feature-rich representation of time series, on which dynamic time warping can be applied for effective and efficient similarity detection. We present the Derivative time series Segment Approximation (DSA) representation model, which combines derivative estimation, segmentation, and segment approximation to provide both high sensitivity in capturing the main trends of time series and data compression. We extensively compare DSA with state-of-the-art similarity methods and dimensionality reduction techniques in clustering and classification frameworks. Experimental evidence from effectiveness and efficiency tests on various datasets shows that DSA is well suited to support both accurate and fast similarity detection.
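The dynamic time warping distance at the core of this line of work is a short dynamic program. A minimal sketch for 1-D series (DSA itself would first replace the raw series with derivative-based segment approximations; the example below applies DTW directly to raw values):

```python
import numpy as np

def dtw(a, b):
    """Dynamic time warping distance between two 1-D series."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],      # stretch a
                                 cost[i, j - 1],      # stretch b
                                 cost[i - 1, j - 1])  # match
    return cost[n, m]

# Two series tracing the same curve at different sampling rates:
# DTW aligns them despite the length mismatch.
a = np.sin(np.linspace(0, 2 * np.pi, 50))
b = np.sin(np.linspace(0, 2 * np.pi, 70))
print(dtw(a, b))
```

The quadratic cost of this recurrence is exactly why compact representations such as DSA matter: applying DTW to a shorter segment-level series cuts the alignment cost dramatically.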
19.
20.
Feature extraction based on deep learning is a current research focus for dimensionality reduction, but the stacked autoencoder, a commonly used model, cannot represent features well for data that is noisy or sparse. For microblog sentiment analysis, a sparsity factor is added to each hidden layer of a stacked denoising autoencoder to counter the effects of the noise and sparsity of the sample data on feature extraction. Sentiment analysis experiments on the COAE evaluation dataset show that the proposed model improves both classification precision and recall.