1.

An accelerated learning algorithm for multilayer perceptrons: optimization layer by layer. Total citations: 14; self-citations: 0; citations by others: 14

S Ergezinger E Thomsen 《Neural Networks, IEEE Transactions on》1995,6(1):31-42

Multilayer perceptrons are successfully used in an increasing number of nonlinear signal processing applications. The backpropagation learning algorithm, or variations thereof, is the standard method applied to the nonlinear optimization problem of adjusting the weights in the network in order to minimize a given cost function. However, backpropagation as a steepest descent approach is too slow for many applications. In this paper a new learning procedure is presented which is based on a linearization of the nonlinear processing elements and the optimization of the multilayer perceptron layer by layer. In order to limit the introduced linearization error, a penalty term is added to the cost function. The new learning algorithm is applied to the problem of nonlinear prediction of chaotic time series. The proposed algorithm yields results in both accuracy and convergence rates which are orders of magnitude superior compared to conventional backpropagation learning.

2.

Sparse coding and dictionary learning with class-specific group sparsity

Sun Yuping Quan Yuhui Fu Jia 《Neural computing & applications》2018,30(4):1265-1275

In recent years, sparse coding via dictionary learning has been widely used in many applications for exploiting sparsity patterns of data. For classification, useful sparsity patterns should be discriminative, which standard sparse coding techniques cannot readily achieve. In this paper, we investigate structured sparse coding for obtaining discriminative class-specific group sparsity patterns in the context of classification. A structured dictionary learning approach for sparse coding is proposed by considering the $\ell_{2,0}$ norm on each class of data. An efficient numerical algorithm with global convergence is developed for solving the related challenging $\ell_{2,0}$ minimization problem. The learned dictionary is decomposed into class-specific dictionaries, and classification is done according to the minimum reconstruction error among all the classes. For evaluation, the proposed method was applied to classify both synthetic and real-world data. The experiments show the competitive performance of the proposed method in comparison with several existing discriminative sparse coding methods.

3.

Combination of hyperbolic functions for multimodal biometrics data fusion.

Kar-Ann Toh Wei-Yun Yau 《IEEE transactions on systems, man, and cybernetics. Part B, Cybernetics》2004,34(2):1196-1209

In this paper, we treat the problem of combining fingerprint and speech biometric decisions as a classifier fusion problem. By exploiting the specialist capabilities of each classifier, a combined classifier may yield results which would not be possible in a single classifier. The Feedforward Neural Network provides a natural choice for such data fusion as it has been shown to be a universal approximator. However, the training process remains largely a trial-and-error effort since no learning algorithm can guarantee convergence to an optimal solution within finite iterations. In this work, we propose a network model to generate different combinations of the hyperbolic functions to achieve some approximation and classification properties. This is to circumvent the iterative training problem as seen in neural network learning. In many decision data fusion applications, since individual classifiers or estimators to be combined would have attained a certain level of classification or approximation accuracy, this hyperbolic functions network can be used to combine these classifiers, taking their decision outputs as the inputs to the network. The proposed hyperbolic functions network model is first applied to a function approximation problem to illustrate its approximation capability. This is followed by some case studies on pattern classification problems. The model is finally applied to combine the fingerprint and speaker verification decisions, which show either better or comparable results with respect to several commonly used methods.

4.

Lazy Learning in Radial Basis Neural Networks: A Way of Achieving More Accurate Models

José M. Valls Inés M. Galván Pedro Isasi 《Neural Processing Letters》2004,20(2):105-124

Radial Basis Neural Networks have been successfully used in a large number of applications, and their rapid convergence is one of their most important advantages. However, their generalization is usually poor and depends heavily on the quality of the training data, because some of the training patterns can be redundant or irrelevant. In this paper, we present a learning method that automatically selects the training patterns most appropriate to the new sample to be approximated. This training method follows a lazy learning strategy, in the sense that it builds approximations centered around the novel sample. The proposed method has been applied to three different domains: an artificial regression problem and two time series prediction problems. Results have been compared with the standard training method using the complete training data set, and the new method shows better generalization abilities.
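The selection step described in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the names `select_nearest` and `lazy_predict`, the parameter `k`, the Euclidean distance, and the inverse-distance weighted average (used here in place of training a radial basis network on the selected patterns) are all assumptions made for illustration.

```python
import math

def select_nearest(train_x, train_y, query, k):
    """Return the k (input, target) pairs closest to the query (Euclidean distance)."""
    ranked = sorted(zip(train_x, train_y),
                    key=lambda pair: math.dist(pair[0], query))
    return ranked[:k]

def lazy_predict(train_x, train_y, query, k=3, eps=1e-12):
    """Build a local approximation around the query at prediction time.

    Assumption: an inverse-distance weighted average stands in for the
    radial basis network that would be trained on the selected patterns.
    """
    neighbours = select_nearest(train_x, train_y, query, k)
    weights = [1.0 / (math.dist(x, query) + eps) for x, _ in neighbours]
    total = sum(weights)
    return sum(w * y for w, (_, y) in zip(weights, neighbours)) / total

# Toy 1-D data: the distant pattern at x = 10 is never selected for this query.
train_x = [(0.0,), (1.0,), (2.0,), (10.0,)]
train_y = [0.0, 1.0, 2.0, 10.0]
prediction = lazy_predict(train_x, train_y, (1.1,), k=2)
```

Nothing is fitted until a query arrives, which is the sense in which the strategy is lazy.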

5.

Efficient greedy feature selection for unsupervised learning

Ahmed K. Farahat Ali Ghodsi Mohamed S. Kamel 《Knowledge and Information Systems》2013,35(2):285-310

Reducing the dimensionality of the data has been a challenging task in data mining and machine learning applications. In these applications, the existence of irrelevant and redundant features negatively affects the efficiency and effectiveness of different learning algorithms. Feature selection is one of the dimension reduction techniques, which has been used to allow a better understanding of data and improve the performance of other learning tasks. Although the selection of relevant features has been extensively studied in supervised learning, feature selection in the absence of class labels is still a challenging task. This paper proposes a novel method for unsupervised feature selection, which efficiently selects features in a greedy manner. The paper first defines an effective criterion for unsupervised feature selection that measures the reconstruction error of the data matrix based on the selected subset of features. The paper then presents a novel algorithm for greedily minimizing the reconstruction error based on the features selected so far. The greedy algorithm is based on an efficient recursive formula for calculating the reconstruction error.
Experiments on real data sets demonstrate the effectiveness of the proposed algorithm in comparison with the state-of-the-art methods for unsupervised feature selection.
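The greedy criterion in this abstract can be sketched in a few lines. This is a naive illustration rather than the paper's efficient recursive formula: it stores the full residual of every column and deflates it after each pick. The function names and the representation of the data matrix as a list of column vectors are assumptions.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def greedy_select(columns, k, eps=1e-12):
    """Greedily pick up to k column indices that best reconstruct all columns.

    `columns` holds the feature vectors (columns of the data matrix).
    """
    residual = [list(c) for c in columns]  # part of each column not yet explained
    selected = []
    for _ in range(k):
        best, best_score = None, 0.0
        for idx, cand in enumerate(residual):
            norm2 = dot(cand, cand)
            if idx in selected or norm2 < eps:
                continue
            # Reduction in squared reconstruction error if this column is added.
            score = sum(dot(cand, other) ** 2 for other in residual) / norm2
            if score > best_score:
                best, best_score = idx, score
        if best is None:  # every remaining column is already fully explained
            break
        selected.append(best)
        pivot = residual[best]
        pivot_norm2 = dot(pivot, pivot)
        # Deflate: remove from every column its component along the pivot.
        deflated = []
        for col in residual:
            coef = dot(pivot, col) / pivot_norm2
            deflated.append([x - coef * p for x, p in zip(col, pivot)])
        residual = deflated
    return selected

# Columns 0 and 1 are collinear, so only one of them is worth selecting.
features = [[1.0, 0.0, 0.0], [2.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.1]]
chosen = greedy_select(features, 2)
```

Because each residual is orthogonal to the features already chosen, a redundant feature has a near-zero residual and is skipped automatically.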

6.

Trace Ratio Problem Revisited

Yangqing Jia Feiping Nie Changshui Zhang 《Neural Networks, IEEE Transactions on》2009,20(4):729-735

Dimensionality reduction is an important issue in many machine learning and pattern recognition applications, and the trace ratio (TR) problem is an optimization problem involved in many dimensionality reduction algorithms. Conventionally, the solution is approximated via generalized eigenvalue decomposition due to the difficulty of the original problem. However, prior works have indicated that it is more reasonable to solve it directly than via the conventional way. In this brief, we present a theoretical overview of the global optimum solution to the TR problem via the equivalent trace difference problem. Eigenvalue perturbation theory is introduced to derive an efficient algorithm based on the Newton-Raphson method. Theoretical issues on the convergence and efficiency of our algorithm compared with prior literature are discussed, and are further supported by extensive empirical results.
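The link between the trace ratio problem, the trace difference problem, and a Newton-Raphson iteration can be written out as follows. The notation is assumed for illustration and is not taken from the paper: $A$ and $B$ are symmetric matrices with $B$ positive definite, $W$ has $d$ orthonormal columns, and $W_t$ denotes a maximizer of the trace difference at $\lambda_t$ (for example, the top-$d$ eigenvectors of $A - \lambda_t B$).

$$
\lambda^{*} = \max_{W^{\top} W = I_d} \frac{\operatorname{Tr}(W^{\top} A W)}{\operatorname{Tr}(W^{\top} B W)},
\qquad
f(\lambda) = \max_{W^{\top} W = I_d} \operatorname{Tr}\big(W^{\top} (A - \lambda B) W\big).
$$

The function $f$ is decreasing in $\lambda$, and $f(\lambda^{*}) = 0$, so solving the trace ratio problem amounts to finding the root of $f$. Where $f$ is differentiable, $f'(\lambda_t) = -\operatorname{Tr}(W_t^{\top} B W_t)$, and the Newton-Raphson step simplifies to a ratio:

$$
\lambda_{t+1} = \lambda_t - \frac{f(\lambda_t)}{f'(\lambda_t)}
= \lambda_t + \frac{\operatorname{Tr}(W_t^{\top} A W_t) - \lambda_t \operatorname{Tr}(W_t^{\top} B W_t)}{\operatorname{Tr}(W_t^{\top} B W_t)}
= \frac{\operatorname{Tr}(W_t^{\top} A W_t)}{\operatorname{Tr}(W_t^{\top} B W_t)}.
$$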

7.

An information theoretic sparse kernel algorithm for online learning

《Expert systems with applications》2014,41(9):4349-4359

Kernel-based algorithms have been proven successful in many nonlinear modeling applications. However, the computational complexity of classical kernel-based methods grows superlinearly with the increasing number of training data, which is too expensive for online applications. In order to solve this problem, the paper presents an information theoretic method to train a sparse version of a kernel learning algorithm. A concept named instantaneous mutual information is investigated to measure the system reliability of the estimated output. This measure is used as a criterion to determine the novelty of the training sample, and informative ones are selected to form a compact dictionary to represent the whole data. Furthermore, we propose a robust learning scheme for the training of the kernel learning algorithm with an adaptive learning rate. This ensures the convergence of the learning algorithm and makes it converge to the steady state faster. We illustrate the performance of our proposed algorithm and compare it with some recent kernel algorithms in several experiments.

8.

Fuzzy Clustering Using A Compensated Fuzzy Hopfield Network. Total citations: 1; self-citations: 0; citations by others: 1

Lin Jzau-Sheng 《Neural Processing Letters》1999,10(1):35-48

Hopfield neural networks are well known for cluster analysis with an unsupervised learning scheme. This class of networks is a set of heuristic procedures that suffers from several problems, such as a lack of guaranteed convergence and output that depends on the order of the input data. In this paper, a Compensated Fuzzy Hopfield Neural Network (CFHNN) is proposed which integrates a Compensated Fuzzy C-Means (CFCM) model into the learning scheme and updating strategies of the Hopfield neural network. The CFCM, modified from the Penalized Fuzzy C-Means (PFCM) algorithm, is embedded into a Hopfield net to avoid the NP-hard problem and to speed up the convergence rate for the clustering procedure. The proposed network also avoids determining values for the weighting factors in the energy function. In addition, its training scheme enables the network to learn more rapidly and more effectively than FCM and PFCM. In experimental results, the CFHNN method shows promising results in comparison with FCM and PFCM methods.

9.

A survey of reinforcement learning research. Total citations: 10; self-citations: 2; citations by others: 8

陈学松 杨宜民 《计算机应用研究》2010,27(8):2834-2838

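In an unknown environment, how an agent learns to behave is a challenging and interesting problem. Reinforcement learning improves its policy through trial-and-error interaction with the environment, and its ability to learn, and to learn online, makes it an important branch of machine learning research. This survey covers recent results in reinforcement learning theory, algorithms, and applications. It first introduces the environment model and the basic elements of reinforcement learning. It then discusses theoretical questions concerning the convergence and generalization of reinforcement learning algorithms. Next, drawing on research from recent years, it reviews reinforcement learning algorithms for the discounted-reward and average-reward criteria. Finally, it lists successful applications in nonlinear control, robot control, artificial intelligence problem solving, multi-agent systems, and other areas, together with directions for future development.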

10.

Convergence analysis of the OJAn MCA learning algorithm by the deterministic discrete time method

Dezhong Peng Zhang Yi 《Theoretical computer science》2007

Minor component analysis (MCA) is a statistical method of extracting the eigenvector associated with the smallest eigenvalue of the covariance matrix of input signals. Convergence is essential if MCA algorithms are to be used in practical applications. Traditionally, the convergence of MCA algorithms is indirectly analyzed via their corresponding deterministic continuous time (DCT) systems. However, the DCT method requires the learning rate to approach zero, which is not reasonable in many applications due to the round-off limitation and tracking requirements. This paper studies the convergence of the deterministic discrete time (DDT) system associated with the OJAn MCA learning algorithm. Unlike the DCT method, the DDT method does not require the learning rate to approach zero. In this paper, some important convergence results are obtained for the OJAn MCA learning algorithm via the DDT method. Simulations are carried out to illustrate the theoretical results achieved.

11.

Global convergence of Oja's PCA learning algorithm with a non-zero-approaching adaptive learning rate

Jian Cheng Lv Zhang Yi K.K. Tan 《Theoretical computer science》2006

A non-zero-approaching adaptive learning rate is proposed to guarantee the global convergence of Oja's principal component analysis (PCA) learning algorithm. Most of the existing adaptive learning rates for Oja's PCA learning algorithm are required to approach zero as the learning step increases. However, this is not practical in many applications due to the computational round-off limitations and tracking requirements. The proposed adaptive learning rate overcomes this shortcoming. The learning rate converges to a positive constant, thus it increases the evolution rate as the learning step increases. This differs from learning rates that approach zero, which slow convergence considerably and increasingly over time. Rigorous mathematical proofs for global convergence of Oja's algorithm with the proposed learning rate are given in detail via studying the convergence of an equivalent deterministic discrete time (DDT) system. Extensive simulations are carried out to illustrate and verify the theory derived. Simulation results show that this adaptive learning rate is more suitable for Oja's PCA algorithm to be used in an online learning situation.
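For readers unfamiliar with Oja's rule, the following minimal sketch shows the single-neuron update on toy data. It does not implement the adaptive learning rate proposed in the paper: the rate `eta` is simply held at a small positive constant, and the function names, starting vector, and synthetic data are assumptions.

```python
import math
import random

def oja_pca(samples, eta=0.01, w0=(0.6, 0.8)):
    """Estimate the first principal direction with Oja's rule.

    Assumption: eta is constant, a simplification of the adaptive
    non-zero-approaching rate described in the abstract.
    """
    w = list(w0)
    for x in samples:
        y = sum(wi * xi for wi, xi in zip(w, x))  # neuron output y = w . x
        # Oja update: Hebbian term y*x with the decay term y^2 * w.
        w = [wi + eta * y * (xi - y * wi) for wi, xi in zip(w, x)]
    return w

def make_samples(n=5000, seed=0):
    """Zero-mean 2-D data whose variance is largest along the first axis."""
    rng = random.Random(seed)
    return [(rng.gauss(0.0, 1.5), rng.gauss(0.0, 0.3)) for _ in range(n)]

w = oja_pca(make_samples())
norm = math.hypot(w[0], w[1])  # Oja's rule keeps the weight norm close to 1
```

On this data the weight vector should align with the first axis, the direction of largest variance, up to sign.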

12.

Effective Gaussian mixture learning for video background subtraction. Total citations: 15; self-citations: 0; citations by others: 15

Lee DS 《IEEE transactions on pattern analysis and machine intelligence》2005,27(5):827-832

Adaptive Gaussian mixtures have been used for modeling nonstationary temporal distributions of pixels in video surveillance applications. However, a common problem for this approach is balancing between model convergence speed and stability. This paper proposes an effective scheme to improve the convergence rate without compromising model stability. This is achieved by replacing the global, static retention factor with an adaptive learning rate calculated for each Gaussian at every frame. Significant improvements are shown on both synthetic and real video data. Incorporating this algorithm into a statistical framework for background subtraction leads to an improved segmentation performance compared to a standard method.

13.

Learning from positive and unlabeled examples

《Theoretical computer science》2005,348(1):70-83

In many machine learning settings, labeled examples are difficult to collect while unlabeled data are abundant. Also, for some binary classification problems, positive examples which are elements of the target concept are available. Can these additional data be used to improve accuracy of supervised learning algorithms? We investigate in this paper the design of learning algorithms from positive and unlabeled data only. Many machine learning and data mining algorithms, such as decision tree induction algorithms and naive Bayes algorithms, use examples only to evaluate statistical queries (SQ-like algorithms). Kearns designed the statistical query learning model in order to describe these algorithms. Here, we design an algorithm scheme which transforms any SQ-like algorithm into an algorithm based on positive statistical queries (estimate for probabilities over the set of positive instances) and instance statistical queries (estimate for probabilities over the instance space). We prove that any class learnable in the statistical query learning model is learnable from positive statistical queries and instance statistical queries only if a lower bound on the weight of any target concept f can be estimated in polynomial time. Then, we design a decision tree induction algorithm POSC4.5, based on C4.5, that uses only positive and unlabeled examples and we give experimental results for this algorithm. In the case of imbalanced classes in the sense that one of the two classes (say the positive class) is heavily underrepresented compared to the other class, the learning problem remains open. This problem is challenging because it is encountered in many real-world applications.

14.

A fast algorithm for feedforward neural networks. Total citations: 2; self-citations: 0; citations by others: 2

张学东 姜宏洲 张宏勋 《信息与控制》2000,29(1):34-39

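Feedforward neural networks have been widely used in nonlinear signal processing. The classical backpropagation algorithm is the standard learning algorithm for feedforward networks, but for many applications it converges very slowly. This paper proposes a new algorithm based on linearizing the nonlinear units of the network. In nonlinear signal processing, the algorithm outperforms conventional backpropagation in both accuracy and convergence speed.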

15.

Neural clouds for monitoring of complex systems

B. Lang T. Poppe A. Minin I. Mokhov Y. Kuperin A. Mekler I. Liapakina 《Optical Memory & Neural Networks》2008,17(3):183-192

Condition monitoring is an important and challenging task relevant to many areas of industry, medicine and economics. It is now necessary to provide on-line monitoring of the status of complex systems, e.g. steel production, in order to avoid faults, breakdowns or wrong diagnostics. In the present paper a novel machine learning method for automated condition monitoring is presented. Neural Clouds (NC) is a novel data encapsulation method, which provides a confidence measure regarding classification of the complex system conditions. The presented adaptive algorithm requires only the data which corresponds to the normal system conditions, which is typically available. By contrast, acquiring fault-related data is expensive and fault modeling is not always possible, especially when dealing with steel production, power station operation, human health conditions or critical phenomena in financial markets. These real-world applications are also presented in the paper.

16.

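A converter fault diagnosis method based on an improved wavelet neural network algorithm. Total citations: 1; self-citations: 0; citations by others: 1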

段其昌 张亮 袁景明 《计算机应用》2011,31(8):2143-2145

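The converter is the pivotal component of a doubly fed wind power generation system, and its operating reliability directly affects the safety and stability of the generation system. Discrete wavelet neural networks (DWNN) based on the recursive least squares (RLS) algorithm suffer from slow convergence, limited convergence accuracy, and a tendency to settle in local minima. To address these shortcomings, and taking the converter current as the object of analysis, this paper proposes a converter fault diagnosis method that uses a wavelet neural network with an improved algorithm featuring variable weighting and a variable learning rate. Converter currents are selected as the training and fault identification samples for the discrete wavelet neural network, and the training process and simulation results are compared and analyzed. Experimental results show that, compared with the RLS algorithm, the improved wavelet neural network fault diagnosis method performs better in both fault identification accuracy and convergence time.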

17.

Application of the particle swarm optimization algorithm to neural network control

徐天 《工业控制计算机》2010,23(8):68-71

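This paper considers the application of the particle swarm optimization algorithm to the adaptive control of uncertain systems. Neural networks play an important role in the adaptive control of uncertain systems. However, training a neural network with conventional gradient descent converges slowly, is prone to getting trapped in local minima, and is highly sensitive to parameters such as the initial network weights. To overcome these drawbacks, a control strategy is proposed in which an RBF neural network optimized by particle swarm optimization tunes a PID controller. First, based on the basic principles of particle swarm optimization, an algorithm is given for obtaining initial values of the RBF network's output weights, node centers, and node widths. Second, gradient descent is used to further adjust the controller parameters. Conventional neural network control is compared with neural network control based on particle swarm optimization, and the results show that the latter achieves better approximation accuracy. Taking PID controller parameter tuning as an example, a class of nonlinear control systems is simulated. The simulation results show that neural network control based on particle swarm optimization has strong robustness and adaptive capability.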

18.

Multivariate time series prediction of lane changing behavior using deep neural network

Jun Gao Yi Lu Murphey Honghui Zhu 《Applied Intelligence》2018,48(10):3523-3537

Many real-world pattern classification problems involve the processing and analysis of multiple variables in the temporal domain. This type of problem is referred to as a Multivariate Time Series (MTS) problem. It remains a challenging problem due to the nature of time series data: high dimensionality, large data size, and continuous updating. In this paper, we use three types of physiological signals from the driver to predict lane changes before the event actually occurs. These are the electrocardiogram (ECG), galvanic skin response (GSR), and respiration rate (RR), which were determined, in prior studies, to best reflect a driver’s response to the driving environment. A novel Group-wise Convolutional Neural Network model, MTS-GCNN, is proposed for MTS pattern classification. In our MTS-GCNN model, we present a new structure learning algorithm in the training stage. The algorithm exploits the covariance structure over multiple time series to partition the input volume into groups, then learns the MTS-GCNN structure explicitly by clustering input sequences with spectral clustering. Unlike other feature-based classification approaches, our MTS-GCNN can select and extract the suitable internal structure to generate temporal and spatial features automatically by using convolution and down-sample operations. The experimental results showed that, in comparison to other state-of-the-art models, our MTS-GCNN performs significantly better in terms of prediction accuracy.

19.

Robust Adaptive Gradient-Descent Training Algorithm for Recurrent Neural Networks in Discrete Time Domain

《Neural Networks, IEEE Transactions on》2008,19(11):1841-1853

For a recurrent neural network (RNN), its transient response is a critical issue, especially for real-time signal processing applications. The conventional RNN training algorithms, such as backpropagation through time (BPTT) and real-time recurrent learning (RTRL), have not adequately addressed this problem because they suffer from low convergence speed. While increasing the learning rate may help to improve the performance of the RNN, it can result in unstable training in terms of weight divergence. Therefore, an optimal tradeoff between RNN training speed and weight convergence is desired. In this paper, a robust adaptive gradient-descent (RAGD) training algorithm of RNN is developed based on a novel RNN hybrid training concept. It switches the training patterns between standard real-time online backpropagation (BP) and RTRL according to the derived convergence and stability conditions. The weight convergence and $L_2$-stability of the algorithm are derived via the conic sector theorem. The optimized adaptive learning maximizes the training speed of the RNN for each weight update without violating the stability and convergence criteria. Computer simulations are carried out to demonstrate the applicability of the theoretical results.

20.

Echo state networks and their application to image edge detection. Total citations: 2; self-citations: 0; citations by others: 2

裴承丹 《计算机工程与应用》2008,44(19):172-174

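Recurrent neural networks (RNN, also called feedback neural networks) are an important class of artificial neural networks. Compared with feedforward neural networks they have better learning ability and faster convergence, but designing their hidden-layer structure has long been a difficult problem. The echo state network (ESN) effectively solves this problem. Compared with earlier recurrent neural networks, it has a distinctive structure, good stability, and a simple, fast learning process. This paper introduces the echo state network and its learning method and applies it to image edge detection, with good results.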