1.

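Yi Miao (易淼), Liu Xiaolan (刘小兰) 《计算机应用》 (Journal of Computer Applications) 2011,31(10):2793-2795

To strengthen the ability of the graph-based local and global consistency (LGC) semi-supervised algorithm to handle sparse and noisy data, an LGC algorithm based on relative transformation is proposed. The algorithm uses relative transformation to map the original data space into a relative space, in which noise and outliers lie farther from normal points and sparse data become relatively dense, which can improve the algorithm's performance. Simulation results show that the LGC algorithm based on relative transformation handles sparse and noisy data better.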
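
As a rough illustration of the LGC propagation that the abstract above builds on, the sketch below iterates F ← αSF + (1 − α)Y on a Gaussian affinity graph, following the commonly cited formulation of local and global consistency. It leaves out the relative transformation step entirely. The function name `lgc`, the parameters `sigma`, `alpha`, and `n_iter`, and the toy data in the usage note are assumptions made for illustration, not details taken from the paper.

```python
import math


def lgc(points, labels, n_classes, sigma=1.0, alpha=0.9, n_iter=200):
    """Propagate labels with the LGC iteration F <- alpha*S*F + (1-alpha)*Y.

    labels[i] is a class index for labeled points and None for unlabeled ones.
    Assumes every point has a nonzero total affinity to the others.
    """
    n = len(points)
    # Gaussian affinity matrix with a zero diagonal.
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                w[i][j] = math.exp(-d2 / (2.0 * sigma ** 2))
    # Symmetric normalisation S = D^{-1/2} W D^{-1/2}.
    deg = [sum(row) for row in w]
    s = [[w[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
         for i in range(n)]
    # Y holds one-hot rows for labeled points and zero rows otherwise.
    y = [[1.0 if labels[i] == c else 0.0 for c in range(n_classes)]
         for i in range(n)]
    f = [row[:] for row in y]
    for _ in range(n_iter):
        f = [[alpha * sum(s[i][k] * f[k][c] for k in range(n))
              + (1.0 - alpha) * y[i][c]
              for c in range(n_classes)]
             for i in range(n)]
    # Each point takes the class with the largest propagated score.
    return [max(range(n_classes), key=lambda c: f[i][c]) for i in range(n)]
```

For example, with two well-separated groups and one labeled point in each, `lgc([(0.0, 0.0), (0.1, 0.0), (0.2, 0.0), (5.0, 5.0), (5.1, 5.0), (5.2, 5.0)], [0, None, None, 1, None, None], 2)` assigns every point the label of its own group.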

2.

Semi-supervised classification based on random subspace dimensionality reduction

Guoxian Yu Guoji Zhang Carlotta Domeniconi Zhiwen Yu Jane You 《Pattern recognition》2012,45(3):1119-1135

Graph structure is vital to graph based semi-supervised learning. However, the problem of constructing a graph that reflects the underlying data distribution has been seldom investigated in semi-supervised learning, especially for high dimensional data. In this paper, we focus on graph construction for semi-supervised learning and propose a novel method called Semi-Supervised Classification based on Random Subspace Dimensionality Reduction, SSC-RSDR in short. Different from traditional methods that perform graph-based dimensionality reduction and classification in the original space, SSC-RSDR performs these tasks in subspaces. More specifically, SSC-RSDR generates several random subspaces of the original space and applies graph-based semi-supervised dimensionality reduction in these random subspaces. It then constructs graphs in these processed random subspaces and trains semi-supervised classifiers on the graphs. Finally, it combines the resulting base classifiers into an ensemble classifier. Experimental results on face recognition tasks demonstrate that SSC-RSDR not only has superior recognition performance with respect to competitive methods, but also is robust against a wide range of values of input parameters.

3.

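Graph Laplacian and self-training for semi-supervised band selection of hyperspectral data

Huang Rui (黄睿), Lü Zhiqiang (吕智强) 《数据采集与处理》 (Journal of Data Acquisition and Processing) 2014,29(6):981-985

Band selection is an effective means of dimensionality reduction, but a limited number of labeled samples degrades the performance of supervised band selection. A method is proposed that performs semi-supervised band selection using the graph Laplacian and a self-training strategy. The method first defines a graph-based semi-supervised feature scoring criterion to produce an initial band subset. It then performs classification on that subset, uses self-training to move some of the unlabeled samples with higher confidence into the labeled set, and updates the band subset with the feature scoring criterion. This process is repeated to obtain the final band subset. Hyperspectral band selection and classification experiments compared several unsupervised, supervised, and semi-supervised methods, and the results show that the proposed algorithm selects better band subsets.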

4.

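MicroRNA prediction with an ensemble of pairwise-constraint dimensionality reduction

Wei Shuang (魏爽), Yang Ming (杨明), Xiao Yuan (肖袁) 《计算机科学与探索》 (Journal of Frontiers of Computer Science and Technology) 2011,5(10):921-931

MicroRNAs (miRNAs) are a class of small non-coding RNAs that play important regulatory roles in organisms, and predicting miRNAs helps in studying and understanding their biological functions. The previously proposed pairwise-constraint-based dimensionality reduction algorithm, local semi-supervised linear discriminant analysis (LSLDA), preserves the local structure and discriminative power of the data while reducing the dimensionality of miRNA data, and can effectively improve miRNA prediction performance. Building on LSLDA, a new ensemble algorithm, ensemble of local semi-supervised linear discriminant analysis (En-LSLDA), is proposed. The algorithm combines the classification results obtained under different numbers of constraints and takes the combined result as the final classification, further improving miRNA prediction performance. Experimental results on miRNA data sets show that En-LSLDA is effective and feasible. Experimental results on UCI data sets also confirm that the proposed ensemble method applies to other data sets.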

5.

Random Multi-Graphs: A semi-supervised learning framework for classification of high dimensional data

《Image and vision computing》2017

Currently, high dimensional data processing confronts two main difficulties: inefficient similarity measure and high computational complexity in both time and memory space. Common methods to deal with these two difficulties are based on dimensionality reduction and feature selection. In this paper, we present a different way to solve high dimensional data problems by combining the ideas of Random Forests and Anchor Graph semi-supervised learning. We randomly select a subset of features and use the Anchor Graph method to construct a graph. This process is repeated many times to obtain multiple graphs, a process which can be implemented in parallel to ensure runtime efficiency. Then the multiple graphs vote to determine the labels for the unlabeled data. We argue that the randomness can be viewed as a kind of regularization. We evaluate the proposed method on eight real-world data sets by comparing it with two traditional graph-based methods and one state-of-the-art semi-supervised learning method based on Anchor Graph to show its effectiveness. We also apply the proposed method to the subject of face recognition.
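
The random-subset-and-vote loop described in the abstract above can be pictured with the following minimal sketch. It is not the authors' method: the Anchor Graph step is replaced by a much simpler stand-in learner that copies the label of the nearest labeled point, and the names `nearest_labeled` and `random_subspace_vote`, the parameters `n_rounds`, `subset_size`, and `seed`, and the toy data are all hypothetical.

```python
import random
from collections import Counter


def nearest_labeled(points, labels, features):
    """Stand-in base learner: copy the label of the nearest labeled point,
    measuring squared distance only on the selected feature indices."""
    labeled = [i for i, lab in enumerate(labels) if lab is not None]
    out = []
    for p in points:
        best = min(
            labeled,
            key=lambda i: sum((p[f] - points[i][f]) ** 2 for f in features),
        )
        out.append(labels[best])
    return out


def random_subspace_vote(points, labels, n_rounds=11, subset_size=2, seed=0):
    """Repeat: draw a random feature subset, run the base learner on it,
    and record one vote per point. Return the majority label per point."""
    rng = random.Random(seed)
    dim = len(points[0])
    votes = [Counter() for _ in points]
    for _ in range(n_rounds):
        features = rng.sample(range(dim), subset_size)
        for i, lab in enumerate(nearest_labeled(points, labels, features)):
            votes[i][lab] += 1
    return [v.most_common(1)[0][0] for v in votes]
```

Because each round depends only on its own feature subset, the rounds could run in parallel, which matches the runtime argument made in the abstract.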

6.

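Optimization of the LGC semi-supervised learning method with improved density peak clustering

Xue Zihan (薛子晗), Pan Di (潘迪), He Li (何丽) 《计算机工程》 (Computer Engineering) 2021,47(2):77-83,89

The graph-based local and global consistency (LGC) semi-supervised learning method has high labeling accuracy, but its high time complexity makes it hard to apply in practical scenarios with large data sets. Starting from reducing the size of the graph, a global consistency optimization method is proposed. An improved density peak clustering algorithm iteratively selects multiple center points from the data set; local clustering is performed with each center point as a cluster center, and a graph is built with the center points as vertices to carry out LGC-based semi-supervised learning. Experimental results show that the optimized LGC method is robust on data sets such as D31 and Aggregation and has clear advantages in labeling accuracy and execution time.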

7.

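Sparse discriminant analysis

Chen Xiaodong (陈小冬), Lin Huanxiang (林焕祥) 《计算机应用》 (Journal of Computer Applications) 2012,32(4):1017-1021

To address two problems in manifold-embedding dimensionality reduction methods, namely that building a neighborhood graph in the high-dimensional space does not benefit subsequent processing and that suitable values for the neighborhood size and heat kernel parameter are hard to choose, a sparse discriminant analysis algorithm (SEDA) is proposed. First, sparse representation is used to build a sparse graph that preserves the global information and geometric structure of the data, overcoming the shortcomings of manifold-embedding methods. Second, sparsity preservation is used as a regularization term in the Fisher discriminant criterion, which yields the optimal projection. Experimental results on a group of high-dimensional data sets show that SEDA is a very effective semi-supervised dimensionality reduction method.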

8.

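Locally linear embedding based on relative manifold (total citations: 1, self-citations: 0, citations by others: 1)

Wen Guihua (文贵华), Lu Tinghui (陆庭辉), Jiang Lijun (江丽君), Wen Jun (文军) 《软件学报》 (Journal of Software) 2009,20(9):2376-2386

The locally linear embedding algorithm depends heavily on whether the neighborhoods truly reflect the intrinsic structure of the manifold. The neighborhood structures built by existing methods are topologically unstable and sensitive to noise and sparse data. Based on the relativity principle of cognition, relative transformation is proposed and used to construct the relative space and the relative manifold. Relative transformation improves the distinguishability between data points and suppresses the effects of noise and data sparsity. Determining the neighborhoods of data points in the constructed relative space and relative manifold reflects the intrinsic structure of the manifold more faithfully. On this basis, enhanced locally linear embedding algorithms are proposed that clearly improve performance; in particular, the manifold-based method also improves speed. Experimental results on standard data sets validate the effectiveness of the method.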
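
The abstract above does not spell out how the relative space is constructed. One simple reading, assumed here purely for illustration, represents each point by the vector of its Euclidean distances to every point in the data set, so that all points act as the new coordinate axes. The function name `relative_transform` is hypothetical, and the sketch should not be taken as the paper's exact definition.

```python
import math


def relative_transform(points):
    """Map each point to the vector of its Euclidean distances to all points.

    Row i of the result holds the coordinates of point i in the assumed
    relative space; entry j is the distance from point i to point j.
    """
    return [[math.dist(p, q) for q in points] for p in points]
```

Neighborhoods would then be searched among the rows of the returned matrix instead of among the original coordinates, for instance `relative_transform([(0, 0), (3, 4)])` yields `[[0.0, 5.0], [5.0, 0.0]]`.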

9.

Enhanced graph-based dimensionality reduction with repulsion Laplaceans

E. Kokiopoulou Y. Saad 《Pattern recognition》2009,42(11):2392-2402

Graph-based methods for linear dimensionality reduction have recently attracted much attention and research efforts. The main goal of these methods is to preserve the properties of a graph representing the affinity between data points in local neighborhoods of the high-dimensional space. It has been observed that, in general, supervised graph-methods outperform their unsupervised peers in various classification tasks. Supervised graphs are typically constructed by allowing two nodes to be adjacent only if they are of the same class. However, such graphs are oblivious to the proximity of data from different classes. In this paper, we propose a novel methodology which builds on ‘repulsion graphs’, i.e., graphs that model undesirable proximity between points. The main idea is to repel points from different classes that are close by in the input high-dimensional space. The proposed methodology is generic and can be applied to any graph-based method for linear dimensionality reduction. We provide ample experimental evidence in the context of face recognition, which shows that the proposed methodology (i) offers significant performance improvement to various graph-based methods and (ii) outperforms existing solutions relying on repulsion forces.

10.

Semi-supervised ensemble classification in subspaces

Guoxian Yu Guoji Zhang Zhiwen Yu Carlotta Domeniconi Jane You Guoqiang Han 《Applied Soft Computing》2012,12(5):1511-1522

Graph-based semi-supervised classification depends on a well-structured graph. However, it is difficult to construct a graph that faithfully reflects the underlying structure of data distribution, especially for data with a high dimensional representation. In this paper, we focus on graph construction and propose a novel method called semi-supervised ensemble classification in subspaces, SSEC in short. Unlike traditional methods that execute graph-based semi-supervised classification in the original space, SSEC performs semi-supervised linear classification in subspaces. More specifically, SSEC first divides the original feature space into several disjoint feature subspaces. Then, it constructs a neighborhood graph in each subspace, and trains a semi-supervised linear classifier on this graph, which will serve as the base classifier in an ensemble. Finally, SSEC combines the obtained base classifiers into an ensemble classifier using the majority-voting rule. Experimental results on facial image classification show that SSEC not only has higher classification accuracy than the competitive methods, but also can be effective in a wide range of values of input parameters.

11.

Semi-supervised graph clustering: a kernel approach (total citations: 6, self-citations: 0, citations by others: 6)

Brian Kulis Sugato Basu Inderjit Dhillon Raymond Mooney 《Machine Learning》2009,74(1):1-22

Semi-supervised clustering algorithms aim to improve clustering results using limited supervision. The supervision is generally given as pairwise constraints; such constraints are natural for graphs, yet most semi-supervised clustering algorithms are designed for data represented as vectors. In this paper, we unify vector-based and graph-based approaches. We first show that a recently-proposed objective function for semi-supervised clustering based on Hidden Markov Random Fields, with squared Euclidean distance and a certain class of constraint penalty functions, can be expressed as a special case of the weighted kernel k-means objective (Dhillon et al., in Proceedings of the 10th International Conference on Knowledge Discovery and Data Mining, 2004a). A recent theoretical connection between weighted kernel k-means and several graph clustering objectives enables us to perform semi-supervised clustering of data given either as vectors or as a graph. For graph data, this result leads to algorithms for optimizing several new semi-supervised graph clustering objectives. For vector data, the kernel approach also enables us to find clusters with non-linear boundaries in the input data space. Furthermore, we show that recent work on spectral learning (Kamvar et al., in Proceedings of the 17th International Joint Conference on Artificial Intelligence, 2003) may be viewed as a special case of our formulation. We empirically show that our algorithm is able to outperform current state-of-the-art semi-supervised algorithms on both vector-based and graph-based data sets.

12.

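Laplacian kernel method based on local scaling transformation

Zhang Liang (张亮), Du Ziping (杜子平), Li Yang (李杨), Zhang Jun (张俊) 《计算机工程》 (Computer Engineering) 2011,37(8):202-203

Using the structural information of data points can improve the performance of semi-supervised learning. To this end, a graph-based semi-supervised learning method is proposed. Local scaling transformation is used to set different scale parameters for edge weights in regions of different density, and on this basis a graph Laplacian kernel classifier is constructed for classification. Experiments on several data sets show that the method outperforms other kernel-based semi-supervised classification methods.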
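
The abstract above does not give the weight formula. A common way to give edges in dense and sparse regions different scale parameters is the self-tuning form w_ij = exp(−d_ij² / (σ_i σ_j)), where σ_i is the distance from point i to its k-th nearest neighbor. The sketch below assumes that form; whether the paper uses exactly this rule is not confirmed by the abstract, and the name `local_scaling_weights` and the parameter `k` are hypothetical.

```python
import math


def local_scaling_weights(points, k=2):
    """Build an affinity matrix with a per-point scale sigma[i].

    sigma[i] is the distance from point i to its k-th nearest neighbour,
    so points in dense regions get small scales and points in sparse
    regions get large ones. Assumes no duplicate points (sigma > 0).
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # sorted(row)[0] is the zero self-distance, so index k is the k-th neighbour.
    sigma = [sorted(row)[k] for row in dist]
    return [
        [
            0.0 if i == j
            else math.exp(-dist[i][j] ** 2 / (sigma[i] * sigma[j]))
            for j in range(n)
        ]
        for i in range(n)
    ]
```

The resulting matrix is symmetric with a zero diagonal, so it can be passed directly to a graph Laplacian construction.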

13.

Instance selection method for improving graph-based semi-supervised learning

Hai WANG Shao-Bo WANG Yu-Feng LI 《Frontiers of Computer Science》2018,12(4):725-735

Graph-based semi-supervised learning is an important semi-supervised learning paradigm. Although graph-based semi-supervised learning methods have been shown to be helpful in various situations, they may adversely affect performance when using unlabeled data. In this paper, we propose a new graph-based semi-supervised learning method based on instance selection in order to reduce the chances of performance degeneration. Our basic idea is that given a set of unlabeled instances, it is not the best approach to exploit all the unlabeled instances; instead, we should exploit the unlabeled instances that are highly likely to help improve the performance, while not taking into account the ones with high risk. We develop both transductive and inductive variants of our method. Experiments on a broad range of data sets show that the chances of performance degeneration of our proposed method are much smaller than those of many state-of-the-art graph-based semi-supervised learning methods.

14.

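Semi-supervised support vector machine classification algorithm based on two-stage learning

Tao Xinmin (陶新民), Cao Pandong (曹盼东), Song Shaoyu (宋少宇), Fu Dandan (付丹丹) 《信息与控制》 (Information and Control) 2012,41(1):7-13

A semi-supervised support vector machine (semi-supervised SVM) classification algorithm based on two-stage learning is proposed. First, a graph-based label propagation algorithm assigns initial pseudo-labels to the unlabeled samples, and a k-nearest-neighbor graph is used to identify and remove likely noisy samples. The denoised sample set is then treated as a labeled set and fed to a support vector machine (SVM), so that the SVM takes the information of the whole sample set into account during training, which improves its classification accuracy. Experimental results show that, compared with other semi-supervised learning algorithms, the proposed algorithm improves classification performance and is more reliable when few labeled training samples are available.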

15.

Distance metric learning guided adaptive subspace semi-supervised clustering

Xuesong YIN Enliang HU 《Frontiers of Computer Science》2011,5(1):100-108

Most existing semi-supervised clustering algorithms are not designed for handling high-dimensional data. On the other hand, semi-supervised dimensionality reduction methods may not necessarily improve the clustering performance, due to the fact that the inherent relationship between subspace selection and clustering is ignored. In order to mitigate the above problems, we present a semi-supervised clustering algorithm using adaptive distance metric learning (SCADM) which performs semi-supervised clustering and distance metric learning simultaneously. SCADM applies the clustering results to learn a distance metric and then projects the data onto a low-dimensional space where the separability of the data is maximized. Experimental results on real-world data sets show that the proposed method can effectively deal with high-dimensional data and provides an appealing clustering performance.

16.

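ISOMAP algorithm based on density scaling factor

Li Xiangyuan (李香元), Cai Cheng (蔡骋), He Jinrong (何进荣) 《计算机科学》 (Computer Science) 2018,45(7):207-213

Isometric mapping (ISOMAP) is a widely used nonlinear unsupervised dimensionality reduction algorithm. It performs an isometric embedding by preserving the geodesic distances between observed samples, thereby converting coordinates from a high-dimensional space to a low-dimensional one. In practice, however, observed data inevitably contain noise. Because the computation of geodesic distance is sensitive to noise and does not take the density distribution of the data set into account, the low-dimensional coordinates produced by ISOMAP are geometrically distorted. To address this shortcoming, an ISOMAP algorithm based on a density scaling factor (Density Scaling Factor Based ISOMAP, D-ISOMAP) is proposed, following the idea of local density. Within the conventional ISOMAP framework, a local density scaling factor is first computed for each observed sample. Then, when geodesic distances are computed, the geodesic distance between two directly adjacent samples is divided by the product of their density scaling factors. Finally, a shortest-path algorithm yields the improved distance matrix, on which dimensionality reduction is performed. The improved geodesic distance is shrunk in denser regions and enlarged in sparser regions, which reduces the effect of noise on the dimensionality reduction and improves visualization and clustering. Experimental results on synthetic and UCI data sets show that D-ISOMAP has some advantage over classical unsupervised dimensionality reduction algorithms in visualization and clustering of the data sets.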
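
The distance-modification steps in the abstract above can be sketched as follows. The abstract does not say how the density scaling factor is computed, so the sketch takes the factors as a caller-supplied list `scale` instead of inventing a formula. It builds a k-nearest-neighbor graph, divides each neighbor edge by the product of the two endpoints' factors, and runs Floyd-Warshall as the shortest-path step. The final embedding step is omitted, and the name `scaled_geodesic` and the parameter `k` are hypothetical.

```python
import math


def scaled_geodesic(points, scale, k=2):
    """Return all-pairs shortest-path distances on a k-NN graph whose edge
    lengths are d(i, j) / (scale[i] * scale[j]).

    scale[i] is an externally supplied density scaling factor for point i.
    Assumes no duplicate points, so each point is its own nearest neighbour.
    """
    n = len(points)
    g = [[0.0 if i == j else math.inf for j in range(n)] for i in range(n)]
    for i in range(n):
        d = [math.dist(points[i], q) for q in points]
        # Skip position 0 (the point itself) and keep the k nearest others.
        neighbours = sorted(range(n), key=lambda j: d[j])[1:k + 1]
        for j in neighbours:
            edge = d[j] / (scale[i] * scale[j])
            g[i][j] = g[j][i] = min(g[i][j], edge)
    # Floyd-Warshall all-pairs shortest paths.
    for m in range(n):
        for i in range(n):
            for j in range(n):
                if g[i][m] + g[m][j] < g[i][j]:
                    g[i][j] = g[i][m] + g[m][j]
    return g
```

With all factors equal to 1 the function reduces to the usual ISOMAP geodesic estimate; factors above 1 shrink distances and factors below 1 enlarge them, which is the behavior the abstract describes for dense and sparse regions.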

17.

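Semi-supervised dimensionality reduction algorithm based on local reconstruction and global preservation

Wei Jia (韦佳), Wen Guihua (文贵华), Wang Wenfeng (王文丰), Wang Jiabing (王家兵) 《计算机科学》 (Computer Science) 2011,38(8):201-204

To address the sensitivity of the semi-supervised dimensionality reduction algorithm based on local and global preservation (LGSSDR) to the choice of neighborhood parameter and its inaccurate setting of edge weights in the neighborhood graph, a semi-supervised dimensionality reduction algorithm based on local reconstruction and global preservation (LRGPSSDR) is proposed. The algorithm determines the edge weights of the neighborhood graph by minimizing the local reconstruction error, and preserves the global structure of the data set while preserving its local structure. Experimental results on the Extended YaleB and CMU PIE standard face databases show that LRGPSSDR achieves better classification performance than other semi-supervised dimensionality reduction algorithms.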

18.

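Semi-supervised learning algorithm based on local clustering and graph methods

Li Ming (李明), Yang Yanping (杨艳屏), Zhan Huirong (占惠融) 《自动化学报》 (Acta Automatica Sinica) 2010,36(12):1655-1660

Graph-based algorithms have become a popular approach to semi-supervised learning. They define data points as graph nodes and represent relationships between data points with graph edges, and achieve high classification accuracy under a variety of data distributions. However, graph methods have high computational complexity: when the graph is large, the required time and storage are very large, which limits their use to some extent. How to control the size of the graph is therefore an important issue in graph-based semi-supervised learning. This paper proposes a fast clustering method based on density estimation that clusters data points within local regions and uses the resulting subsets as graph nodes, greatly reducing the complexity of the graph. The new clustering method has a low computational cost, and the derived distance function preserves the original data distribution well. Experimental results show that the small graph built after local clustering gives classification results comparable to those on the original graph, while greatly improving computation speed.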

19.

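Semi-supervised dimensionality reduction method for data with complex structure

Chen Binhui (陈斌辉), Bai Qingyuan (白清源) 《计算机工程与应用》 (Computer Engineering and Applications) 2011,35(35):135-138

Some typical existing semi-supervised dimensionality reduction algorithms exploit label information but ignore the manifold characteristics of the sample data, or use those characteristics improperly, which leads to poor performance and a narrow range of application. To address these problems, a semi-supervised dimensionality reduction method for data with complex structure is proposed that preserves both the global and the local manifold characteristics of the sample data. With a suitably chosen objective function, the algorithm's results apply to a wider range of settings. Experiments demonstrate the effectiveness of the algorithm.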

20.

Semi-supervised classification using multiple clusterings

G. X. Yu L. Feng G. J. Yao J. Wang 《Pattern Recognition and Image Analysis》2016,26(4):681-687

The graph determines the performance of graph-based semi-supervised classification. In this paper, we investigate how to construct a graph from multiple clusterings and propose a method called Semi-Supervised Classification using Multiple Clusterings (SSCMC in short). SSCMC first projects the original samples into different random subspaces and performs clustering on the projected samples. Then, for each clustering, it constructs a graph by setting an edge between two samples if they fall in the same cluster. Next, it combines these graphs into a composite graph and incorporates the resulting composite graph with a graph-based semi-supervised classifier based on local and global consistency. Our experimental results on two publicly available facial images show that SSCMC not only achieves higher accuracy than other related methods, but also is robust to input parameters.
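
The graph-construction step in the abstract above can be illustrated with a small sketch. It takes cluster assignments that were already computed (one list of cluster ids per clustering), links two samples in a clustering's graph when they share a cluster, and combines the per-clustering graphs by averaging. The averaging rule is an assumption, since the abstract does not state how the graphs are combined, and the name `composite_graph` is hypothetical. The random projection and clustering steps are left out.

```python
def composite_graph(clusterings):
    """Combine several clusterings of the same samples into one weighted graph.

    clusterings is a list of assignments; each assignment lists one cluster id
    per sample. Entry w[i][j] is the fraction of clusterings in which samples
    i and j share a cluster (the diagonal stays zero).
    """
    n = len(clusterings[0])
    share = 1.0 / len(clusterings)
    w = [[0.0] * n for _ in range(n)]
    for assignment in clusterings:
        for i in range(n):
            for j in range(n):
                if i != j and assignment[i] == assignment[j]:
                    w[i][j] += share
    return w
```

For instance, `composite_graph([[0, 0, 1], [0, 1, 1]])` gives weight 0.5 to the pairs (0, 1) and (1, 2) and weight 0.0 to the pair (0, 2), because those two samples never share a cluster.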