Found 19 similar documents (search time: 375 ms)
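To make full use of the spatial-spectral features contained in hyperspectral imagery, a feature extraction algorithm based on semi-supervised spatial-spectral local discriminant analysis (S4LFDA) is proposed. Because hyperspectral data sets exhibit spatial consistency, pixels are first reconstructed spatially to preserve the neighborhood relationships of the data. Second, spectral information divergence is introduced to redefine the similarity between pixels. To exploit the large number of unlabeled samples, fuzzy C-means clustering is applied to the samples to obtain pseudo-labels. A regularization term is then added to the within-class and between-class scatter matrices of local Fisher discriminant analysis (FDA) to keep the cluster structure of the unlabeled samples consistent. Finally, local FDA maximizes the between-class scatter and minimizes the within-class scatter of the labeled samples, and the optimal projection vectors are solved for. S4LFDA preserves both the separability of the data set in the spectral domain and the neighborhood relationships of pixels within spatial regions, and it makes reasonable use of both labeled and unlabeled samples, which improves classification performance. In experiments on the Pavia University and Indian Pines data sets, the overall classification accuracy reaches 95.60% and 94.38%, respectively. Compared with other dimensionality reduction algorithms, the proposed algorithm effectively improves land-cover classification performance.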
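
The abstract above uses spectral information divergence to measure pixel similarity. As a rough illustration only, the sketch below uses the common textbook definition (each spectrum is normalized to a probability distribution and the two Kullback-Leibler divergences are summed). The function name, the `eps` floor, and the sample pixels are assumptions made for this example and are not taken from the paper.

```python
import math

def spectral_information_divergence(x, y, eps=1e-12):
    """Symmetric KL divergence between two spectra treated as distributions.

    Assumes both spectra are non-negative with a positive band sum.
    """
    sum_x, sum_y = sum(x), sum(y)
    # Normalize each spectrum so its bands sum to 1; eps avoids log(0).
    p = [max(v / sum_x, eps) for v in x]
    q = [max(v / sum_y, eps) for v in y]
    kl_pq = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    kl_qp = sum(qi * math.log(qi / pi) for pi, qi in zip(p, q))
    return kl_pq + kl_qp

# Hypothetical 4-band pixels, for illustration only.
pixel_a = [0.10, 0.40, 0.30, 0.20]
pixel_b = [0.20, 0.80, 0.60, 0.40]  # same spectral shape as pixel_a, twice as bright
pixel_c = [0.40, 0.10, 0.20, 0.30]  # different spectral shape

print(spectral_information_divergence(pixel_a, pixel_b))
print(spectral_information_divergence(pixel_a, pixel_c))
```

Because of the normalization, the divergence depends on spectral shape rather than overall brightness, so the first pair scores close to zero while the second does not.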

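To further speed up the improved progressive transductive support vector machine learning algorithm (IPTSVML), a variant combined with the K-nearest neighbor (KNN) method is proposed. KNN is used to prune the unlabeled sample set, removing unlabeled samples that contribute little to learning; IPTSVML is then applied to the labeled sample set and the remaining unlabeled samples for learning and classification. Experimental results on measured radar data show that the algorithm is effective.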

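Traditional fixed-asset classification and query techniques are prone to communication redundancy in practice, which leads to high computational overhead. A classification and query method for the fixed assets of electric power enterprises, based on an intelligent inventory platform, is therefore designed. The storage format of RFID tag data is optimized and the tag data are standardized; a wireless receiver and data communication network structure is established; and the query communication scheme is optimized to reduce redundancy. SVM classification is combined with the K-nearest neighbor algorithm, with the algorithm chosen according to the asset classification scenario, and the Euclidean distance used by the K-nearest neighbor algorithm is computed to carry out the classification query. Experimental results show that when the classification dimension is large, the computational overhead of the designed technique on both the intelligent inventory platform and the client side is lower than that of the traditional method.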

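To address the difficulty of classifying hyperspectral images caused by the large number of spectral bands and the high correlation between neighboring bands, a hyperspectral image classification algorithm based on self-adaptive differential evolution feature selection is proposed. First, the population vector set is initialized, and self-adaptive differential evolution is used to search the feature space and generate feature subsets. Next, ReliefF is used to remove redundant features according to the feature ranking, which yields a feature list covering all features. Finally, a fuzzy k-nearest neighbor classifier computes the classification accuracy of each vector, and the feature subsets are evaluated with a wrapper model. Experiments on the Indiana data set and the KSC data set verify the effectiveness and reliability of the algorithm: compared with several other feature selection algorithms, it achieves higher overall classification accuracy and a better Kappa coefficient.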

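As a nonparametric classification algorithm, K-nearest neighbor (KNN) is simple, effective, and easy to implement. However, traditional KNN treats all neighbor samples as contributing equally, which makes it susceptible to noise, and its computational cost is very high for large data sets. To address these problems, this paper proposes a new distance-weighted, template-reduction K-nearest neighbor algorithm (TWKNN). Template reduction removes training samples that lie far from the classification boundary, and each of the K neighbors is assigned a different weight according to its distance from the test sample, which improves robustness. Experimental results show that the method effectively reduces the number of training samples while maintaining the classification accuracy of traditional KNN.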
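
To make the distance-weighting idea in the abstract above concrete, here is a minimal sketch of a distance-weighted KNN vote. The abstract does not state the weighting formula, so inverse-distance weights are an assumption here, and the template-reduction step is not shown. The function name, `eps`, and the toy data are hypothetical.

```python
import math
from collections import defaultdict

def weighted_knn_predict(train, query, k=3, eps=1e-9):
    """Classify `query` by an inverse-distance-weighted vote of its k nearest neighbors.

    train: list of (feature_vector, label) pairs.
    Inverse-distance weighting is an assumed scheme, not the paper's formula.
    """
    # Keep the k training samples closest to the query (Euclidean distance).
    neighbors = sorted((math.dist(x, query), label) for x, label in train)[:k]
    votes = defaultdict(float)
    for distance, label in neighbors:
        votes[label] += 1.0 / (distance + eps)  # closer neighbors count more
    return max(votes, key=votes.get)

# Toy data: one very close "a" sample and two farther "b" samples.
train = [
    ((0.1, 0.0), "a"),
    ((1.0, 0.0), "b"),
    ((0.0, 1.1), "b"),
    ((5.0, 5.0), "b"),
]
print(weighted_knn_predict(train, (0.0, 0.0), k=3))  # prints "a"
```

With k = 3, an unweighted majority vote over the same neighbors would return "b" (two votes to one); the weighted vote returns "a" because the single "a" neighbor is much closer to the query.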

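Zhao Feng, Lin Xiaojuan, Liu Hanqiang. Journal of Signal Processing (《信号处理》), 2020, 36(9): 1544-1556
When existing intuitionistic fuzzy clustering algorithms are applied to image segmentation, they often consider only pixel information and ignore the geometric features and region information of the image, so the segmentation results are unsatisfactory. To improve segmentation performance, a semi-supervised intuitionistic fuzzy clustering algorithm with hybrid label propagation that incorporates symmetry is proposed. The algorithm first detects the symmetry axis of the image to obtain its symmetry properties. It then uses these properties to propagate labels between symmetric pixels and to improve the intuitionistic fuzzy distance measure between pixels and cluster centers. Next, a hybrid label-propagation semi-supervised strategy estimates the membership degrees of all pixels and introduces them as supervised memberships. A semi-supervised intuitionistic fuzzy clustering objective function that incorporates symmetry and hybrid label propagation is then constructed, and clustering yields the final segmentation. Experimental results on two color image databases show that the algorithm segments targets completely from complex backgrounds and outperforms the comparison algorithms.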

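Graph data classification is becoming increasingly important; it has been studied extensively in recent decades and is widely applied. Traditional research on graph classification focuses mainly on the single-label setting, yet in many applications each graph carries multiple labels simultaneously. This paper studies multi-label feature extraction for graph data and proposes a multi-label graph feature extraction algorithm based on a fuzzy measure function to search for the optimal set of subgraphs. The algorithm uses the fuzzy measure function as the criterion for evaluating the importance of subgraph features, and then prunes the subgraph search space with a branch-and-bound algorithm to search efficiently for the optimal subgraph features. Experiments show that the method achieves high accuracy in real-world applications.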

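Neighborhood preserving embedding (NPE) improves on locally linear embedding (LLE) by overcoming the out-of-sample problem, but it performs poorly on classification tasks. This paper proposes a semi-supervised sparsity and neighborhood preserving discriminant embedding algorithm. The method first preprocesses the data with a wavelet transform, then runs isometric mapping (Isomap) to select a suitable low-dimensional embedding dimension, and finally combines ideas from sparse representation theory, NPE, and linear discriminant analysis (LDA) to reconstruct the neighborhood graph. The objective function is built so that labeled samples of the same class are drawn close together, labeled samples of different classes are pushed apart, and the neighborhood information of unlabeled samples is preserved. In this way a mapping function for the high-dimensional data is obtained and classification accuracy is improved. Experiments on face databases, with comparisons against other semi-supervised algorithms, show that the proposed algorithm achieves a good recognition rate.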

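The support vector machine (SVM) is a machine learning method developed on the basis of statistical learning theory. It has become a hot research topic and is widely applied in pattern recognition. This paper first analyzes the principle of the SVM and then introduces an improved radial basis kernel function, on which an SVM pattern classification method with an improved kernel is built. Computer simulation experiments are carried out on the IRIS data set, and the results are compared with those of pattern classification based on the fuzzy k-nearest neighbor (FKNN) algorithm. The comparison shows that the improved SVM method classifies better than FKNN, runs in less time, and is easier to implement in real time.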

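An improved fast k-nearest neighbor classification algorithm (cited 14 times: 0 self-citations, 14 citations by others)
Qiao Yulong, Pan Zhengxiang, Sun Shenghe. Acta Electronica Sinica (《电子学报》), 2005, 33(6): 1146-1149
This paper proposes a new and efficient fast algorithm for k-nearest neighbor classification. The algorithm derives two important inequalities from the variance of a vector and its approximation coefficients in the wavelet domain. During the k-nearest neighbor search, each training vector is first checked against these two inequalities, which eliminates a large number of vectors that cannot be among the k nearest neighbors. The k nearest neighbors of an unknown sample can thus be found quickly, and classification efficiency improves greatly while the classification performance of the k-nearest neighbor method is unchanged. Finally, texture classification is used as an example to verify the effectiveness of the algorithm.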

Because of its computational complexity, the deep neural network (DNN) used in embedded devices is usually trained on high-performance computers or graphics processing units (GPUs), and only the inference phase is implemented on the embedded devices. Data processed by embedded devices, such as smartphones and wearables, are usually personalized, so a DNN model trained on public data sets may have poor accuracy when inferring on personalized data. As a result, retraining the DNN with personalized data collected locally on embedded devices is necessary. However, retraining needs labeled data sets, while the data collected locally are unlabeled, so how to retrain a DNN with unlabeled data is a problem to be solved. This paper demonstrates the necessity of retraining a DNN model with personalized data collected on embedded devices after it has been trained on public data sets. It also proposes a label generation method in which a fake label is generated for each unlabeled training case according to users’ feedback, so that retraining can be performed with unlabeled data collected on embedded devices. The experimental results show that the fake label generation method has both good training effects and wide applicability. Advanced neural networks can be trained with unlabeled data from embedded devices, and the individualized accuracy of the DNN model can be gradually improved with personal use.

Classifier design often suffers from a lack of labeled data, because class labels must be assigned by experienced analysts and collecting labeled data is therefore costly. To mitigate this problem, several learning methods have been proposed that make effective use of unlabeled data, which can be collected inexpensively. These methods, however, consider only static data and cannot handle unlabeled time series data. Focusing on hidden Markov models (HMMs), this paper first presents an extension of HMMs, named Extended Tied-Mixture HMMs (ETM-HMMs), in which labeled and unlabeled time series data can be used simultaneously. A learning algorithm for ETM-HMMs is also formally derived within the maximum likelihood framework. Experimental results on synthetic and real time series data show that adding unlabeled time series data to the labeled training data yields better classification accuracy than using labeled data alone.

The authors study the use of unlabeled samples to mitigate the small training sample size problem, which can severely degrade the recognition rate of classifiers when the dimensionality of the multispectral data is high. They show that additional unlabeled samples, which are available at no extra cost, can improve performance and thereby mitigate the Hughes phenomenon. Furthermore, they show experimentally that additional unlabeled samples yield more representative estimates. They also propose a semiparametric method for incorporating the training (i.e., labeled) and unlabeled samples simultaneously into the parameter estimation process.

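For the classification of large-scale hyperspectral data, a semi-supervised classification algorithm is proposed that uses the information contained in unlabeled samples to improve classifier performance. The algorithm modifies the kernel function according to the cluster assumption, namely that samples belonging to the same land-cover class are likely to be assigned to the same cluster. A K-means algorithm based on the spectral angle measure clusters the sample set, and a bag kernel is constructed from the results of multiple clustering runs. The bag kernel and the RBF kernel are then combined into a new kernel through addition and multiplication, which incorporates the information of the unlabeled samples into the classifier. In addition, a least-squares support vector machine is adopted, which converts the quadratic programming problem of the standard support vector machine into the solution of a system of linear equations. Experiments on measured hyperspectral data demonstrate the superiority of the proposed method.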
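
The kernel construction described in the abstract above can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's implementation: the bag kernel is taken to be the fraction of clustering runs in which two samples share a cluster (one common construction), the clustering runs are assumed to have been computed already, and the mixing weight `mu`, the `gamma` value, and all function names are hypothetical.

```python
import math

def bag_kernel(cluster_runs):
    """K[i][j] = fraction of clustering runs that put samples i and j in the same cluster.

    cluster_runs: list of label lists, one list per clustering run (assumed given).
    """
    n = len(cluster_runs[0])
    runs = len(cluster_runs)
    return [[sum(r[i] == r[j] for r in cluster_runs) / runs for j in range(n)]
            for i in range(n)]

def rbf_kernel(samples, gamma=1.0):
    """Standard RBF kernel matrix exp(-gamma * ||a - b||^2)."""
    return [[math.exp(-gamma * math.dist(a, b) ** 2) for b in samples]
            for a in samples]

def sum_kernel(k1, k2, mu=0.5):
    """Weighted sum of two kernel matrices; mu is a hypothetical mixing weight."""
    return [[mu * a + (1 - mu) * b for a, b in zip(r1, r2)] for r1, r2 in zip(k1, k2)]

def product_kernel(k1, k2):
    """Element-wise product of two kernel matrices."""
    return [[a * b for a, b in zip(r1, r2)] for r1, r2 in zip(k1, k2)]

# Four toy samples and three hypothetical clustering runs over them.
samples = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 1.0)]
cluster_runs = [[0, 0, 1, 1], [0, 0, 0, 1], [1, 1, 0, 0]]

k_bag = bag_kernel(cluster_runs)
k_rbf = rbf_kernel(samples, gamma=0.5)
k_sum = sum_kernel(k_bag, k_rbf)
k_prod = product_kernel(k_bag, k_rbf)
print(k_bag[0][1], k_bag[1][2], k_bag[0][3])
```

Sums and element-wise products of positive semidefinite kernel matrices are themselves positive semidefinite, which is why both combinations remain usable as kernels for a support vector machine.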

This paper presents a big data framework based on regularized deep neural networks for automated diagnostics of manufacturing machinery from emitted sound, vibration, and magnetic field data. More precisely, we present SpotCheck, a prototype system that uses well-regularized deep neural networks to analyze the sound, vibrational, and magnetic emissions of industrial machinery to provide noninvasive machine diagnostics, both for fault detection and to meter the day-to-day mode of operation of the machinery. It is completely automatic, requiring no manual extraction of handcrafted features. It can operate with relatively small amounts of training data, but can take advantage of large volumes of unlabeled data when available, and it scales to very large volumes of labeled or unlabeled data to improve performance as more data become available after deployment.

For large-scale radio frequency identification (RFID) indoor positioning systems, the positioning area is relatively large, labeled data are scarce while unlabeled data are plentiful, and positioning is easily affected by multipath and white noise. An RFID positioning algorithm based on semi-supervised actor-critic co-training (SACC) is proposed to solve this problem. In this research, positioning is regarded as a Markov decision process. First, the actor-critic is combined with random actions and selects the best unlabeled received signal strength indicator (RSSI) data through semi-supervised co-training. Second, the actor and the critic are updated using the Kronecker-factored approximate curvature (K-FAC) natural gradient. Finally, the target position is obtained by co-locating with the labeled RSSI data and the selected unlabeled RSSI data. The proposed method significantly reduces the cost of indoor positioning by decreasing the amount of labeled data required. Meanwhile, as the number of positioning targets increases, the actor can quickly select unlabeled RSSI data and update the location model. Experiments show that, compared with other RFID indoor positioning algorithms, namely twin delayed deep deterministic policy gradient (TD3), deep deterministic policy gradient (DDPG), and actor-critic using Kronecker-factored trust region (ACKTR), the proposed method decreases the average positioning error by 50.226%, 41.916%, and 25.004%, respectively, and improves positioning stability by 23.430%, 28.518%, and 38.631%.

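Classification is a supervised learning method that learns a model on a training data set to determine the class label of unknown samples. Unlike the traditional view, this paper interprets classification from the perspective of influence functions: the class label of an unknown sample is determined by the influence that the training set exerts on it. The paper first introduces the idea of influence-function-based classification, then defines the influence function and designs three influence functions, and finally proposes an influence-function-based k-nearest neighbor (kNN) classification method built on these three functions. The method is also applied to the classification of imbalanced data sets. Experimental results on 18 UCI data sets show that the influence-function-based kNN method outperforms traditional kNN and is effective for imbalanced data sets.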

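Yu Li, Li Zhe, Gao Fei, Yuan Xiangyang, Yang Yong. Telecommunications Science (《电信科学》), 2021, 37(10): 136-142
Identifying poor-quality users is an important step in reducing complaint rates and improving user satisfaction. In current telecom network systems, the large volume of structured and unstructured data related to service perception is difficult to annotate effectively, labels for poor-quality users are incomplete, and the training samples of existing supervised models are imbalanced, all of which lead to a low identification rate. To address this, an improved self-training semi-supervised learning model is adopted: a small number of users with low satisfaction scores or complaints serve as poor-quality user labels for annotating network data, and label transfer is used to train on a large amount of unlabeled data and identify poor-quality users. Experiments show that, compared with fully supervised learning, which is accurate but costly to train, and unsupervised learning, which has low accuracy, semi-supervised learning can make full use of unlabeled samples for effective training and significantly improves identification accuracy while keeping training cost low.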
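
The abstract above does not say how its improved self-training model works, so the sketch below shows only the generic self-training loop: fit on the labeled data, pseudo-label the unlabeled samples the model is confident about, add them to the training set, and repeat. The nearest-centroid base classifier, the margin-based confidence score, the `threshold` value, and the toy feature vectors are all assumptions chosen to keep the example self-contained. It assumes at least two classes.

```python
import math

def centroids(samples, labels):
    """Mean feature vector of each class."""
    groups = {}
    for x, y in zip(samples, labels):
        groups.setdefault(y, []).append(x)
    return {y: [sum(col) / len(xs) for col in zip(*xs)] for y, xs in groups.items()}

def predict_with_confidence(cents, x):
    """Return (label, confidence), where confidence is the relative margin
    between the nearest and second-nearest centroid, in [0, 1]."""
    dists = sorted((math.dist(c, x), y) for y, c in cents.items())
    (d1, label), (d2, _) = dists[0], dists[1]
    return label, (d2 - d1) / (d2 + d1 + 1e-12)

def self_train(labeled_x, labeled_y, unlabeled_x, threshold=0.6, max_rounds=10):
    """Generic self-training: repeatedly add confidently pseudo-labeled samples."""
    x, y = list(labeled_x), list(labeled_y)
    pool = list(unlabeled_x)
    for _ in range(max_rounds):
        cents = centroids(x, y)
        confident, remaining = [], []
        for u in pool:
            label, conf = predict_with_confidence(cents, u)
            if conf >= threshold:
                confident.append((u, label))
            else:
                remaining.append(u)
        if not confident:
            break  # nothing new to learn from
        for u, label in confident:
            x.append(u)
            y.append(label)
        pool = remaining
    return centroids(x, y), pool

# Hypothetical 2-D feature vectors: two labeled users, four unlabeled users.
labeled_x = [(0.0, 0.0), (4.0, 4.0)]
labeled_y = ["normal", "poor"]
unlabeled_x = [(0.5, 0.2), (3.6, 4.1), (1.0, 0.8), (2.0, 2.0)]

final_centroids, still_unlabeled = self_train(labeled_x, labeled_y, unlabeled_x)
print(final_centroids)
print(still_unlabeled)
```

In this toy run the sample at (1.0, 0.8) is not confident enough in the first round but is absorbed in the second, after the "normal" centroid has shifted toward it; the ambiguous sample at (2.0, 2.0) is never pseudo-labeled.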

Canonical correlation analysis (CCA) is an efficient method for dimensionality reduction on two-view data. However, as an unsupervised learning method, CCA cannot utilize partially given label information in multi-view semi-supervised scenarios. In this paper, we propose a novel two-view semi-supervised learning method, called semi-supervised canonical correlation analysis based on label propagation (LPbSCCA). LPbSCCA incorporates a new sparse-representation-based label propagation algorithm to infer label information for unlabeled data. Specifically, it first constructs dictionaries consisting of all labeled samples, then obtains reconstruction coefficients of the unlabeled samples using a sparse representation technique, and finally estimates label information for the unlabeled samples by combining the given labels of the labeled samples. After that, it constructs soft label matrices of all samples and probabilistic within-class scatter matrices in each view. Finally, to enhance the discriminative power of the features, it is formulated to maximize the correlations between samples of the same class across views while simultaneously minimizing within-class variations in the low-dimensional feature space of each view. Furthermore, we extend the method to a general model, called LPbSMCCA, that handles data from multiple (more than two) views. Extensive experimental results on several well-known data sets demonstrate that the proposed methods achieve better recognition performance and robustness than existing related methods.
