Similar literature
 Found 20 similar articles; search took 20 ms
1.
Many researchers have explored the use of neural-network representations for the adaptive processing of data structures. One of the most popular learning formulations of data structure processing is backpropagation through structure (BPTS). The BPTS algorithm has been successfully applied to a number of learning tasks that involve structural patterns, such as logo and natural scene classification. The main limitations of the BPTS algorithm are its slow convergence and the long-term dependency problem in the adaptive processing of data structures. In this paper, an improved algorithm is proposed to solve these problems. The idea of this algorithm is to optimize the free learning parameters of the neural network in the node representation by using least-squares-based optimization methods in a layer-by-layer fashion. Not only can fast convergence be achieved, but the long-term dependency problem can also be overcome, since the vanishing of gradient information is avoided when our approach is applied to very deep tree structures.

2.
This paper describes a method of structural pattern recognition based on genetic evolution processing of data structures with a neural network representation. Conventionally, one of the most popular learning formulations of data structure processing is backpropagation through structure (BPTS) [C. Goller et al., (1996)]. The BPTS algorithm has been successfully applied to a number of learning tasks that involve structural patterns, such as image, shape, and texture classification. However, this BPTS-type algorithm suffers from the long-term dependency problem when learning very deep tree structures. In this paper, we propose genetic evolution for this data structure processing. The idea of this algorithm is to tune the learning parameters by genetic evolution with specified chromosome structures. The fitness evaluation, as well as the adaptive crossover and mutation for this structural genetic processing, are also investigated. An application to flower image classification with a structural representation is provided to validate our method. The obtained results support the capability of the proposed approach to classify and recognize flowers in terms of generalization and noise robustness.

3.
Probabilistic sequential independent components analysis   Cited by: 1 (self-citations: 0, citations by others: 1)
Under-complete models, which derive lower dimensional representations of input data, are valuable in domains in which the number of input dimensions is very large, such as data consisting of a temporal sequence of images. This paper presents the under-complete product of experts (UPoE), where each expert models a one-dimensional projection of the data. Maximum-likelihood learning rules for this model constitute a tractable and exact algorithm for learning under-complete independent components. The learning rules for this model coincide with approximate learning rules proposed earlier for under-complete independent component analysis (UICA) models. This paper also derives an efficient sequential learning algorithm from this model and discusses its relationship to sequential independent component analysis (ICA), projection pursuit density estimation, and feature induction algorithms for additive random field models. This paper demonstrates the efficacy of these novel algorithms on high-dimensional continuous datasets.

4.
Incremental learning of neural networks has attracted much interest in recent years due to its wide applicability to large-scale data sets and to distributed learning scenarios. Moreover, nonstationary learning paradigms have emerged as a subarea of study in the machine learning literature because classical methods struggle with data set shifts. In this paper we present an algorithm to train single-layer neural networks with nonlinear output functions that takes into account incremental, nonstationary, and distributed learning scenarios. It is demonstrated that introducing a regularization term into the proposed model is equivalent to choosing a particular initialization for the devised training algorithm, which may be suitable for real-time systems that have to work under noisy conditions. In addition, the algorithm includes some previous models as special cases and can be used as a building block for more complex models such as multilayer perceptrons, extending the capacity of these models to incremental, nonstationary, and distributed learning paradigms. The proposed algorithm is tested on standard data sets and compared with previous approaches, demonstrating its higher accuracy.

5.
Semi-supervised learning has attracted a significant amount of attention in pattern recognition and machine learning. Most previous studies have focused on designing special algorithms to effectively exploit the unlabeled data in conjunction with labeled data. Our goal is to improve the classification accuracy of any given supervised learning algorithm by using the available unlabeled examples. We call this the semi-supervised improvement problem, to distinguish the proposed approach from existing approaches. We design a meta-semi-supervised learning algorithm that wraps around the underlying supervised algorithm and improves its performance using unlabeled data. This problem is particularly important when we need to train a supervised learning algorithm with a limited number of labeled examples and a multitude of unlabeled examples. We present a boosting framework for semi-supervised learning, termed SemiBoost. The key advantages of the proposed semi-supervised learning approach are: 1) performance improvement of any supervised learning algorithm with a multitude of unlabeled data, 2) efficient computation by the iterative boosting algorithm, and 3) exploitation of both the manifold and cluster assumptions in training classification models. An empirical study on 16 different data sets and text categorization demonstrates that the proposed framework improves the performance of several commonly used supervised learning algorithms, given a large number of unlabeled examples. We also show that the performance of the proposed algorithm, SemiBoost, is comparable to the state-of-the-art semi-supervised learning algorithms.

6.
Fuzzy cognitive maps have been widely used as abstract models for complex networks. Traditional ways to construct fuzzy cognitive maps rely on domain knowledge. In this paper, we propose to use fuzzy cognitive map learning algorithms to discover domain knowledge in the form of causal networks from data. More specifically, we propose to infer gene regulatory networks from gene expression data. Furthermore, a new efficient fuzzy cognitive map learning algorithm based on a decomposed genetic algorithm is developed to learn large scale networks. In the proposed algorithm, the simulation error is used as the objective function, while the model error is expected to be minimized. Experiments are performed to explore the feasibility of this approach. The high accuracy of the generated models and the approximate correlation between simulation errors and model errors suggest that it is possible to discover causal networks using fuzzy cognitive map learning. We also compared the proposed algorithm with ant colony optimization, differential evolution, and particle swarm optimization in a decomposed framework. Comparison results reveal the advantage of the decomposed genetic algorithm on datasets with small data volumes, large network scales, or the presence of noise.

7.
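In recent years, machine learning technology has developed rapidly and has been widely applied in natural language processing, image recognition, search and recommendation, and other fields. However, the many machine learning models that are now openly deployed face severe challenges in model security and data privacy. This paper focuses on membership inference attacks against black-box machine learning models: given a data record and a black-box prediction interface to a machine learning model, determine whether the record belongs to the training data set of that model....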

8.
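Cross-camera pedestrian tracking is an important topic in computer vision and in building video-surveillance public-safety systems. With the development of large-scale data sets and extensive research on deep learning networks, deep learning has achieved good results on cross-camera pedestrian tracking. In practice, however, besides the variations in visual appearance caused by different cameras and viewpoints in the surveillance video itself, the data sets available for cross-camera pedestrian tracking are small overall, and the number of labeled training samples is smaller still, which limits the effectiveness of deep-learning-based cross-camera pedestrian tracking. An improved deep transfer learning algorithm for cross-camera pedestrian tracking is proposed: a mature model trained on a large data set is fine-tuned and transferred to the target data set, then optimized with the target data so that it extracts features better suited to the new data set. During training, an improved triplet loss function pulls samples of the same identity closer together and pushes different samples farther apart, while a maximum distance threshold between positive samples keeps the clusters formed in feature space from growing too large, which helps optimize the model. The algorithm reduces the training time of the deep learning model, avoids drawbacks such as insufficient data on small data sets, and improves the accuracy of cross-camera pedestrian tracking. Comparative experiments on five benchmark data sets show that the improved algorithm achieves good results.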
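The abstract does not give the form of the improved triplet loss. As a minimal sketch, assuming the maximum positive-pair distance is enforced by an extra hinge term added to the standard triplet ranking term, the idea might look like the following. The function names, the `margin` and `max_pos_dist` parameters, and their default values are hypothetical and not taken from the paper.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def capped_triplet_loss(anchor, positive, negative, margin=0.3, max_pos_dist=1.0):
    """Triplet loss plus a penalty on overly distant positive pairs.

    Assumed form (not confirmed by the source):
      ranking term: max(0, d(a, p) - d(a, n) + margin)
      cap term:     max(0, d(a, p) - max_pos_dist)
    """
    d_ap = euclidean(anchor, positive)
    d_an = euclidean(anchor, negative)
    # Standard triplet term: the positive should be closer than the negative by `margin`.
    ranking_term = max(0.0, d_ap - d_an + margin)
    # Extra term: penalize positive pairs that lie farther apart than the threshold.
    cap_term = max(0.0, d_ap - max_pos_dist)
    return ranking_term + cap_term
```

With this form, a triplet whose negative is already far away still incurs a loss if the positive pair is more than `max_pos_dist` apart, which is one way to keep same-identity clusters compact.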

9.
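To address the difficulty that federated learning services for the Internet of Vehicles have in meeting users' needs to train personalized models, an innovative customized service framework for Internet of Vehicles federated learning models is proposed. The framework adopts a federated learning aggregation algorithm that fuses device contribution and data set similarity, realizing personalized federated learning. Through different weight allocation schemes and similarity computation, the algorithm lets different users choose a suitable model training scheme according to their own needs and data characteristics. The framework also proposes a double sampling verification method to address model performance and trustworthiness; in addition, smart contracts are used to support data collaboration, safeguarding data security. Experimental results show that the proposed algorithm achieves high accuracy in most experimental scenarios, and that the framework can significantly raise the level of personalization of Internet of Vehicles services while ensuring model accuracy and reliability.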
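The abstract does not state how contribution and similarity are fused. The sketch below assumes one simple possibility, a convex combination of the two scores that is normalized into aggregation weights, and is not the paper's algorithm. The function name `aggregate`, the mixing coefficient `alpha`, and the flat-list parameter format are all hypothetical.

```python
def aggregate(client_params, contributions, similarities, alpha=0.5):
    """Weighted average of client parameter vectors.

    Each client's score is alpha * contribution + (1 - alpha) * similarity
    (an assumed fusion rule); scores are normalized to sum to 1 and used as
    averaging weights, in the style of federated averaging.
    """
    scores = [alpha * c + (1.0 - alpha) * s
              for c, s in zip(contributions, similarities)]
    total = sum(scores)
    if total <= 0:
        raise ValueError("at least one client must have a positive score")
    weights = [score / total for score in scores]
    dim = len(client_params[0])
    # Coordinate-wise weighted sum over all clients.
    return [sum(w * params[k] for w, params in zip(weights, client_params))
            for k in range(dim)]
```

A user who cares more about data similarity than raw contribution could lower `alpha`, which is one way to read the "different weight allocation schemes" mentioned in the abstract.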

10.
Liang Shunpan, Pan Weiwei, You Dianlong, Liu Ze, Yin Ling. Applied Intelligence, 2022, 52(12): 13398-13414

Multi-label learning has attracted much attention. However, data generated continuously in fields such as sensors and network access, that is, data streams, bring challenges such as real-time processing, limited memory, and single-pass access. Several learning algorithms have been proposed for offline multi-label classification, but little research has developed dynamic multi-label incremental learning models based on cascading schemes. Deep forest can perform representation learning layer by layer and does not rely on backpropagation. Using this cascading scheme, this paper proposes a multi-label data stream deep forest (VDSDF) learning algorithm based on a cascaded Very Fast Decision Tree (VFDT) forest, which can receive examples successively, perform incremental learning, and adapt to concept drift. Experimental results show that the proposed VDSDF algorithm, as an incremental classification algorithm, is more competitive than batch classification algorithms on multiple indicators. Moreover, in dynamic flow scenarios, VDSDF adapts to concept drift better than the comparison algorithm.


11.
Fang Wei (方伟). Application Research of Computers (《计算机应用研究》), 2021, 38(9): 2640-2645
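Air quality prediction models built from traditional machine learning methods are widely used, but such models still fall short in data validity, particularly in selecting spatio-temporally correlated data. To study the validity of input data for deep learning, a prediction model based on spatial-temporal similarity LSTM (STS-LSTM) is proposed so that more effective data can be selected at both the temporal and spatial levels. STS-LSTM consists of three modules: front, middle, and back. The front module is the spatial-temporal similarity input selection module, for which the Granger causal index weighted dynamic time warping (GCWDTW) algorithm is proposed to select data with higher spatial-temporal similarity; the middle module trains an LSTM as the deep learning network; the back module selects different output combinations for ensembling according to the characteristics of the target station. Overall, STS-LSTM improves on existing algorithms by about 8% in air quality prediction error, and data selected for validity improved model accuracy by up to 21%. The experimental results show that the algorithm is markedly effective at selecting valid data, and that treating the data input and output methods as part of an applied deep learning network can effectively improve the network's final performance.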
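The abstract names GCWDTW but does not define its Granger causal weighting, so the sketch below shows only the standard, unweighted dynamic time warping distance that such a method builds on. The function name and the absolute-difference local cost are choices made for illustration.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric sequences.

    cost[i][j] holds the minimal accumulated cost of aligning a[:i] with b[:j],
    using |a[i-1] - b[j-1]| as the local cost and the usual three-way recurrence.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # advance in a only
                                    cost[i][j - 1],      # advance in b only
                                    cost[i - 1][j - 1])  # advance in both
    return cost[n][m]
```

Under this sketch, a candidate station whose series has a smaller `dtw_distance` to the target station's series would be treated as more similar; how the paper combines this with the Granger causal index is not specified in the abstract.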

12.
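This paper proposes a prediction model based on K-means clustering and machine learning regression algorithms to solve the problem of forecasting sales of multiple products in the retail industry. First, cluster analysis identifies products with similar sales patterns and thereby partitions the data set; then support vector regression, random forest, and XGBoost models are trained on each subset. Building data pools in this way increases both the amount of data available for training and the range of candidate predictor variables. The proposed model was validated on a real sales data set from a retail company. The experimental results show that the prediction model based on K-means and support vector regression performs best, and that the proposed model clearly outperforms both the baseline models and machine learning models that do not use clustering.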

13.
14.
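To address several shortcomings of current immune network models when applied to data clustering, an immune model with a controllable objective is proposed, building on the aiNet immune algorithm and incorporating ideas from function optimization. The algorithm introduces an objective control function and the concept of a cell memory bank. It improves the quality of immune learning and optimizes the immune network as a whole.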

15.
Improving energy efficiency in buildings is one of the main challenges faced by engineers. In fields like lighting control systems, low-quality sensors compromise the control strategy, and the emergence of new technologies further degrades data quality by introducing linguistic values. This research analyzes this problem and shows that, in the field of lighting control systems, the uncertainty in the measurements gathered from sensors should be considered in the design of control loops. To cope with this kind of problem, hybrid intelligent methods are used. Moreover, a method for learning equation-based white-box models from this low-quality data is proposed. The equation-based models include a representation of the uncertainty inherent in the data. Two different evolutionary algorithms are used for learning the models: the well-known NSGA-II genetic algorithm and a multi-objective simulated annealing algorithm hybridized with genetic operators. Both algorithms are found to be valid for evolving this learning process. This novel approach is evaluated with synthetic problems.

16.
Self-organizing maps (SOM) have been applied to numerous data clustering and visualization tasks and have received much attention for their success. One major shortcoming of the classical SOM learning algorithm is the need for a predefined map topology. Furthermore, hierarchical relationships among data are difficult to find. Several approaches have been devised to overcome these deficiencies. In this work, we propose a novel SOM learning algorithm that incorporates several text mining techniques to expand the map both laterally and hierarchically. When trained on a set of text documents, the proposed algorithm first clusters them using the classical SOM algorithm. We then identify the topics of each cluster. These topics are then used to evaluate the criteria for expanding the map. The major characteristic of the proposed approach is that it combines the learning process with the text mining process, which makes it suitable for automatic organization of text documents. We applied the algorithm to the Reuters-21578 dataset in text clustering and categorization tasks. Our method outperforms two comparison models in hierarchy quality according to users' evaluation. It also achieves better F1-scores than two other models in the text categorization task.

17.
In this paper, we study the problem of learning from multiple model data for the purpose of document classification. In this problem, each document is composed of two different models of data, i.e., an image and a text. We propose to represent the data of the two models by projecting them to a shared data space using a cross-model factor analysis formula, and to classify them in the shared space using a linear class label predictor, named the cross-model classifier. The parameters of both the cross-model classifier and the cross-model factor analysis are learned jointly, so that they can regularize the learning of each other. We construct a unified objective function for this learning problem. With this objective function, we minimize the distance between the projections of the image and text of the same document, and the classification error of the projections measured by the hinge loss function. The objective function is optimized by an alternating optimization strategy in an iterative algorithm. Experiments on two different multiple model document data sets show the advantage of the proposed algorithm over state-of-the-art multimedia data classification methods.

18.
Wen Yalan (温亚兰), Chen Meijuan (陈美娟). Computer Engineering (《计算机工程》), 2022, 48(5): 145-153+161
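With the development of medical big data, issues such as medical data security and personal privacy protection have become increasingly prominent. To use the medical data of individual medical institutions efficiently while protecting patient privacy, a medical data sharing and privacy protection scheme combining federated learning and blockchain is proposed. Federated learning is used to model multi-source medical data; the trained model parameters and the reputation values of medical institutions are stored on the blockchain, and the blockchain is used to reward hospitals that contribute high-quality data. By analyzing how data source quality affects the performance of the federated learning algorithm, a reputation value calculation algorithm based on a dual subjective logic model is proposed to improve the accuracy of federated learning; the improved reputation mechanism ensures efficient screening of data sources during data sharing, and blockchain and federated learning together improve sharing efficiency and provide privacy protection. In addition, a distributed platform was built with Tensorflow to compare algorithm performance. The experimental results show that the proposed scheme can screen out high-quality data sources and reduce the interaction time between edge nodes and malicious nodes, reaching a learning accuracy of 0.857 even when the reputation value is above 0.5.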

19.
A structure-based neural network (NN) with the backpropagation through structure (BPTS) algorithm is applied to image classification for organizing a large image database, which is a challenging problem under investigation. Many factors can affect the results of image classification. One of the most important is the architecture of the NN, which consists of an input layer, a hidden layer, and an output layer. In this study, only the number of nodes in the hidden layer (hidden nodes) is varied; other factors are kept unchanged. Two groups of experiments, each including 2,940 images, are used for the analysis. The first group uses features described by image intensities, and the second group uses features described by wavelet coefficients. Experimental results demonstrate that the effect of the number of hidden nodes on the reliability of classification is significant and non-linear. When the number of hidden nodes is 17, the classification rate is up to 95% on the training set and reaches 90% on the testing set. The results indicate that 17 is an appropriate choice for the number of hidden nodes for image classification when a structure-based NN with the BPTS algorithm is applied.

20.
In this paper, a new approach for centralised and distributed learning from spatial heterogeneous databases is proposed. The centralised algorithm consists of a spatial clustering followed by local regression aimed at learning relationships between driving attributes and the target variable inside each region identified through clustering. For distributed learning, similar regions in multiple databases are first discovered by applying a spatial clustering algorithm independently on all sites, and then identifying corresponding clusters on participating sites. Local regression models are built on identified clusters and transferred among the sites for combining the models responsible for identified regions. Extensive experiments on spatial data sets with missing and irrelevant attributes, and with different levels of noise, resulted in a higher prediction accuracy of both centralised and distributed methods, as compared to using global models. In addition, experiments performed indicate that both methods are computationally more efficient than the global approach, due to the smaller data sets used for learning. Furthermore, the accuracy of the distributed method was comparable to the centralised approach, thus providing a viable alternative to moving all data to a central location.
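The "cluster, then fit a local regression per region" step of the centralised algorithm can be illustrated with a small sketch. It assumes region labels have already been produced by some spatial clustering step, and it uses one-variable ordinary least squares as the local model; both are simplifications for illustration, and all function names are hypothetical.

```python
from collections import defaultdict


def fit_line(xs, ys):
    """Closed-form ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        # All x values identical: fall back to a constant model.
        return 0.0, mean_y
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_local_models(points):
    """Fit one regression per region from (region_label, x, y) triples."""
    grouped = defaultdict(lambda: ([], []))
    for region, x, y in points:
        grouped[region][0].append(x)
        grouped[region][1].append(y)
    return {region: fit_line(xs, ys) for region, (xs, ys) in grouped.items()}


def predict(models, region, x):
    """Predict with the local model of the region the query point falls in."""
    slope, intercept = models[region]
    return slope * x + intercept
```

In a distributed setting along the lines the abstract describes, each site could fit such per-region models locally and exchange only the fitted coefficients rather than the raw data.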
