期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Dynamic tunneling technique for efficient training of multilayerperceptrons

RoyChowdhury P. Singh Y.P. Chansarkar R.A. 《Neural Networks, IEEE Transactions on》1999,10(1):48-55

A new efficient computational technique for training of multilayer feedforward neural networks is proposed. The proposed algorithm consists of two learning phases. The first phase is a local search which implements gradient descent, and the second phase is a direct search scheme which implements dynamic tunneling in weight space avoiding the local trap and thereby generates the point of next descent. The repeated application of these two phases alternately forms a new training procedure which results in a global minimum point from any arbitrary initial choice in the weight space. The simulation results are provided for five test examples to demonstrate the efficiency of the proposed method which overcomes the problem of initialization and local minimum point in multilayer perceptrons. 相似文献

2.

Inversion of neural networks by gradient descent 总被引：1，自引：0，他引：1

J Kindermann A Linden 《Parallel Computing》1990,14(3):277-286

Inversion answers the question of which input patterns to a trained multilayer neural network approximate a given output target. This method is a tool for visualization of the information processing capability of a network stored in its weights. This knowledge about the network enables one to make informed decisions on the improvement of the training task and the choice of training sets.

An inversion algorithm for multilayer perceptrons is derived from the backpropagation scheme. We apply inversion to networks for digit recognition. We observe that the multilayer perceptrons perform well with respect to generalization, i.e. correct classification of untrained digits. They are however bad on rejection of counterexamples, i.e. random patterns. Inversion gives an explanation for this drawback. We suggest an improved training scheme, and we show that a tradeoff exists between generalization and rejection of counterexamples. 相似文献

3.

Nonlinear single layer neural network training algorithm for incremental,nonstationary and distributed learning scenarios

David Martínez-Rego Oscar Fontenla-Romero Amparo Alonso-Betanzos 《Pattern recognition》2012,45(12):4536-4546

Incremental learning of neural networks has attracted much interest in recent years due to its wide applicability to large scale data sets and to distributed learning scenarios. Moreover, nonstationary learning paradigms have also emerged as a subarea of study in Machine Learning literature due to the problems of classical methods when dealing with data set shifts. In this paper we present an algorithm to train single layer neural networks with nonlinear output functions that take into account incremental, nonstationary and distributed learning scenarios. Moreover, it is demonstrated that introducing a regularization term into the proposed model is equivalent to choosing a particular initialization for the devised training algorithm, which may be suitable for real time systems that have to work under noisy conditions. In addition, the algorithm includes some previous models as special cases and can be used as a block component to build more complex models such as multilayer perceptrons, extending the capacity of these models to incremental, nonstationary and distributed learning paradigms. In this paper, the proposed algorithm is tested with standard data sets and compared with previous approaches, demonstrating its higher accuracy. 相似文献

4.

A review on weight initialization strategies for neural networks

Narkhede Meenal V. Bartakke Prashant P. Sutaone Mukul S. 《Artificial Intelligence Review》2022,55(1):291-322

Over the past few years, neural networks have exhibited remarkable results for various applications in machine learning and computer vision. Weight initialization is a significant step employed before training any neural network. The weights of a network are initialized and then adjusted repeatedly while training the network. This is done till the loss converges to a minimum value and an ideal weight matrix is obtained. Thus weight initialization directly drives the convergence of a network. Therefore, the selection of an appropriate weight initialization scheme becomes necessary for end-to-end training. An appropriate technique initializes the weights such that the training of the network is accelerated and the performance is improved. This paper discusses various advances in weight initialization for neural networks. The weight initialization techniques in the literature adopted for feed-forward neural network, convolutional neural network, recurrent neural network and long short term memory network have been discussed in this paper. These techniques are classified as (1) initialization techniques without pre-training, which are further classified into random initialization and data-driven initialization, (2) initialization techniques with pre-training. The different weight initialization and weight optimization techniques which select optimal weights for non-iterative training mechanism have also been discussed. We provide a close overview of different initialization schemes in these categories. This paper concludes with discussions on existing schemes and the future scope for research.

相似文献

5.

Linear-least-squares initialization of multilayer perceptrons through backpropagation of the desired response 总被引：1，自引：0，他引：1

Erdogmus D. Fontenla-Romero O. Principe J.C. Alonso-Betanzos A. Castillo E. 《Neural Networks, IEEE Transactions on》2005,16(2):325-337

Training multilayer neural networks is typically carried out using descent techniques such as the gradient-based backpropagation (BP) of error or the quasi-Newton approaches including the Levenberg-Marquardt algorithm. This is basically due to the fact that there are no analytical methods to find the optimal weights, so iterative local or global optimization techniques are necessary. The success of iterative optimization procedures is strictly dependent on the initial conditions, therefore, in this paper, we devise a principled novel method of backpropagating the desired response through the layers of a multilayer perceptron (MLP), which enables us to accurately initialize these neural networks in the minimum mean-square-error sense, using the analytic linear least squares solution. The generated solution can be used as an initial condition to standard iterative optimization algorithms. However, simulations demonstrate that in most cases, the performance achieved through the proposed initialization scheme leaves little room for further improvement in the mean-square-error (MSE) over the training set. In addition, the performance of the network optimized with the proposed approach also generalizes well to testing data. A rigorous derivation of the initialization algorithm is presented and its high performance is verified with a number of benchmark training problems including chaotic time-series prediction, classification, and nonlinear system identification with MLPs. 相似文献

6.

A novel high-order associative memory system via discrete Taylor series

Ning-Shou Xu Yun-Fei Bai Li Zhang 《Neural Networks, IEEE Transactions on》2003,14(4):734-747

This paper proposes a novel high-order associative memory system (AMS) based on the discrete Taylor series (DTS). The mathematical foundation for the new AMS scheme is derived, three training algorithms are proposed, and the convergence of learning is proved. The DTS-AMS thus developed is capable of implementing error-free approximation to multivariable polynomial functions of arbitrary order. Compared with cerebellar model articulation controllers and radial basis function neural networks, it provides higher learning precision and less memory request. Furthermore, it offers less training computation and faster convergence rate than that attainable by multilayer perceptron. Numerical simulations show that the proposed DTS-AMS is effective in higher order function approximation and has potential in practical applications. 相似文献

7.

Weight initialization with reference patterns

Mikko Lehtokangas Jukka Saarinen 《Neurocomputing》1998,20(1-3):265-278

The problem of weight initialization in multilayer perceptron networks is considered. A new computationally simple weight initialization method based on the usage of reference patterns is presented. A reference pattern is a vector which is used to represent data points that fall in its vicinity in the data space. On one hand, the proposed method aims to set the initial weight values to be such that inputs to network nodes are within the active region (in other words, nodes are not saturated). On the other hand, the goal is to distribute the discriminant functions formed by the hidden units evenly into the input space area where training data is located. The proposed method is tested with the widely used two-spirals classification benchmark problem and channel equalization problem where several alternatives for obtaining suitable reference patterns are investigated. Also, the effect of the initialization is studied when two commonly used cost functions are used in the training phase. These are the mean square error and relative entropy cost functions. A comparison with the conventional random initialization shows that significant improvement in convergence can be achieved with the proposed method. In addition, the computational cost of the initialization was found to be negligible compared with the cost of training. 相似文献

8.

Comparative analysis of backpropagation and the extended Kalmanfilter for training multilayer perceptrons

Ruck D.W. Rogers S.K. Kabrisky M. Maybeck P.S. Oxley M.E. 《IEEE transactions on pattern analysis and machine intelligence》1992,14(6):686-691

The relationship between backpropagation and extended Kalman filtering for training multilayer perceptrons is examined. These two techniques are compared theoretically and empirically using sensor imagery. Backpropagation is a technique from neural networks for assigning weights in a multilayer perceptron. An extended Kalman filter can also be used for this purpose. A brief review of the multilayer perceptron and these two training methods is provided. Then, it is shown that backpropagation is a degenerate form of the extended Kalman filter. The training rules are compared in two examples: an image classification problem using laser radar Doppler imagery and a target detection problem using absolute range images. In both examples, the backpropagation training algorithm is shown to be three orders of magnitude less costly than the extended Kalman filter algorithm in terms of a number of floating-point operations 相似文献

9.

Maintaining filter structure: A Gabor-based convolutional neural network for image analysis

《Applied Soft Computing》2020

In image segmentation and classification tasks, utilizing filters based on the target object improves performance and requires less training data. We use the Gabor filter as initialization to gain more discriminative power. Considering the mechanism of the error backpropagation procedure to learn the data, after a few updates, filters will lose their initial structure. In this paper, we modify the updating rule in Gradient Descent to maintain the properties of Gabor filters. We use the Left Ventricle (LV) segmentation task and handwritten digit classification task to evaluate our proposed method. We compare Gabor initialization with random initialization and transfer learning initialization using convolutional autoencoders and convolutional networks. We experimented with noisy data and we reduced the amount of training data to compare how different methods of initialization can deal with these matters. The results show that the pixel predictions for the segmentation task are highly correlated with the ground truth. In the classification task, in addition to Gabor and random initialization, we initialized the network using pre-trained weights obtained from a convolutional Autoencoder using two different data sets and pre-trained weights obtained from a convolutional neural network. The experiments confirm the out-performance of Gabor filters comparing to the other initialization method even when using noisy inputs and a lesser amount of training data. 相似文献

10.

Using multithreshold quadratic sigmoidal neurons to improveclassification capability of multilayer perceptrons

Cheng-Chin Chiang Hsin-Chia Fu 《Neural Networks, IEEE Transactions on》1994,5(3):516-519

This letter proposes a new type of neurons called multithreshold quadratic sigmoidal neurons to improve the classification capability of multilayer neural networks. In cooperation with single-threshold quadratic sigmoidal neurons, the multithreshold quadratic sigmoidal neurons can be used to improve the classification capability of multilayer neural networks by a factor of four compared to committee machines and by a factor of two compared to the conventional sigmoidal multilayer perceptrons. 相似文献

11.

Analog implementation of ANN with inherent quadratic nonlinearity of the synapses

Milev M. Hristov M. 《Neural Networks, IEEE Transactions on》2003,14(5):1187-1200

In real-life applications of multilayer neural networks, the scale of integration, processing speed, and manufacturability are of key importance. A simple analog-signal synapse model is implemented on a standard 0.35 /spl mu/m CMOS process requiring no floating-gate capability. A neural-matrix of 2176 analog current-mode synapses arranged in eight layers of 16 neurons with 16 inputs each is constructed for the purpose of a fingerprint feature extraction application. Synapse weights are stored on the analog storage capacitors, and synapse nonlinearity with respect to weight is investigated. The capability of the synapse to operate in feedforward and learning modes is studied and demonstrated. The effect of the synapse's inherent quadratic nonlinearity on learning convergence and on the optimization of vector direction is analyzed. Transistor-level analog simulations verify the hardware circuit. System-level MatLab simulations verify the synapse mathematical model. The conclusion reached is that the proposed implementation is very suitable for large-scale artificial neural networks - especially if on-chip integration with other products on a standard CMOS process is required. 相似文献

12.

The application of ridge polynomial neural network to multi-step ahead financial time series prediction

R. Ghazali A. J. Hussain P. Liatsis H. Tawfik 《Neural computing & applications》2008,17(3):311-323

Motivated by the slow learning properties of multilayer perceptrons (MLPs) which utilize computationally intensive training algorithms, such as the backpropagation learning algorithm, and can get trapped in local minima, this work deals with ridge polynomial neural networks (RPNN), which maintain fast learning properties and powerful mapping capabilities of single layer high order neural networks. The RPNN is constructed from a number of increasing orders of Pi–Sigma units, which are used to capture the underlying patterns in financial time series signals and to predict future trends in the financial market. In particular, this paper systematically investigates a method of pre-processing the financial signals in order to reduce the influence of their trends. The performance of the networks is benchmarked against the performance of MLPs, functional link neural networks (FLNN), and Pi–Sigma neural networks (PSNN). Simulation results clearly demonstrate that RPNNs generate higher profit returns with fast convergence on various noisy financial signals. 相似文献

13.

Multilayer Potts Perceptrons With Levenberg–Marquardt Learning

《Neural Networks, IEEE Transactions on》2008,19(12):2032-2043

This paper presents learning multilayer Potts perceptrons (MLPotts) for data driven function approximation. A Potts perceptron is composed of a receptive field and a $K$ -state transfer function that is generalized from sigmoid-like transfer functions of traditional perceptrons. An MLPotts network is organized to perform translation from a high-dimensional input to the sum of multiple postnonlinear projections, each with its own postnonlinearity realized by a weighted $K$-state transfer function. MLPotts networks span a function space that theoretically covers network functions of multilayer perceptrons. Compared with traditional perceptrons, weighted Potts perceptrons realize more flexible postnonlinear functions for nonlinear mappings. Numerical simulations show MLPotts learning by the Levenberg–Marquardt (LM) method significantly improves traditional supervised learning of multilayer perceptrons for data driven function approximation. 相似文献

14.

A novel population initialization method for accelerating evolutionary algorithms

《Computers & Mathematics with Applications》2007,53(10):1605-1614

Population initialization is a crucial task in evolutionary algorithms because it can affect the convergence speed and also the quality of the final solution. If no information about the solution is available, then random initialization is the most commonly used method to generate candidate solutions (initial population). This paper proposes a novel initialization approach which employs opposition-based learning to generate initial population. The conducted experiments over a comprehensive set of benchmark functions demonstrate that replacing the random initialization with the opposition-based population initialization can accelerate convergence speed. 相似文献

15.

多层前向小世界神经网络及其函数逼近 总被引：1，自引：0，他引：1

李小虎杜海峰张进华王孙安《控制理论与应用》2010,27(7):836-842

借鉴复杂网络的研究成果, 探讨一种在结构上处于规则和随机连接型神经网络之间的网络模型—-多层前向小世界神经网络. 首先对多层前向规则神经网络中的连接依重连概率p进行重连, 构建新的网络模型, 对其特征参数的分析表明, 当0 < p < 1时, 该网络在聚类系数上不同于Watts-Strogatz 模型; 其次用六元组模型对网络进行描述; 最后, 将不同p值下的小世界神经网络用于函数逼近, 仿真结果表明, 当p = 0:1时, 网络具有最优的逼近性能, 收敛性能对比试验也表明, 此时网络在收敛性能、逼近速度等指标上要优于同规模的规则网络和随机网络. 相似文献

16.

Growing subspace pattern recognition methods and theirneural-network models 总被引：3，自引：0，他引：3

Prakash M. Murty M.N. 《Neural Networks, IEEE Transactions on》1997,8(1):161-168

In statistical pattern recognition, the decision of which features to use is usually left to human judgment. If possible, automatic methods are desirable. Like multilayer perceptrons, learning subspace methods (LSMs) have the potential to integrate feature extraction and classification. In this paper, we propose two new algorithms, along with their neural-network implementations, to overcome certain limitations of the earlier LSMs. By introducing one cluster at a time and adapting it if necessary, we eliminate one limitation of deciding how many clusters to have in each class by trial-and-error. By using the principal component analysis neural networks along with this strategy, we propose neural-network models which are better in overcoming another limitation, scalability. Our results indicate that the proposed classifiers are comparable to classifiers like the multilayer perceptrons and the nearest-neighbor classifier in terms of classification accuracy. In terms of classification speed and scalability in design, they appear to be better for large-dimensional problems. 相似文献

17.

Deterministic convergence of conjugate gradient method for feedforward neural networks

Jian Wang Wei Wu Jacek M. Zurada 《Neurocomputing》2011,74(14-15):2368-2376

Conjugate gradient methods have many advantages in real numerical experiments, such as fast convergence and low memory requirements. This paper considers a class of conjugate gradient learning methods for backpropagation neural networks with three layers. We propose a new learning algorithm for almost cyclic learning of neural networks based on PRP conjugate gradient method. We then establish the deterministic convergence properties for three different learning modes, i.e., batch mode, cyclic and almost cyclic learning. The two deterministic convergence properties are weak and strong convergence that indicate that the gradient of the error function goes to zero and the weight sequence goes to a fixed point, respectively. It is shown that the deterministic convergence results are based on different learning modes and dependent on different selection strategies of learning rate. Illustrative numerical examples are given to support the theoretical analysis. 相似文献

18.

一种回声状态网络的权值初始化方法

王磊乔俊飞李晓理《控制与决策》2018,33(2):356-360

为了避免奇异解,提高网络性能,给出一种回声状态网络的权值初始化方法(WIESN).利用柯西不等式和线性代数确定优化的初始权值的范围与输入维数、储备池维数、输入变量和储备池状态相关,从而确保神经元的输出位于sigmoid函数的激活区域.实验结果表明,权值初始化方法的精度和训练时间要优于随机初始化方法,且相比于训练时间,权值初始化的时间是可以忽略不计的. 相似文献

19.

Influences of variable scales and activation functions on the performances of multilayer feedforward neural networks

Gao DaqiAuthor Vitae Yang GenxingAuthor Vitae 《Pattern recognition》2003,36(4):869-878

This paper gives insight into the methods about how to improve the learning capabilities of multilayer feedforward neural networks with linear basis functions in the case of limited number of patterns according to the basic principles of support vector machine (SVM), namely, about how to get the optimal separating hyperplanes. And furthermore, this paper analyses the characteristics of sigmoid-type activation functions, and investigates the influences of absolute sizes of variables on the convergence rate, classification ability and non-linear fitting accuracy of multilayer feedforward networks, and presents the way of how to select suitable activation functions. As a result, this proposed method effectively enhances the learning abilities of multilayer feedforward neural networks by introducing the sum-of-squares weight term into the networks’ error functions and appropriately enlarging the variable components with the help of the SVM theory. Finally, the effectiveness of the proposed method is verified through three classification examples as well as a non-linear mapping one. 相似文献

20.

深度学习中的权重初始化方法研究

邢彤彤孙仁诚邵峰晶隋毅《计算机工程》2022,48(7):104-113

深度神经网络训练的实质是初始化权重不断调整的过程,整个训练过程存在耗费时间长、需要数据量大等问题。大量预训练网络由经过训练的权重数据组成,若能发现预训练网络权重分布规律,利用这些规律来初始化未训练网络,势必会减少网络训练时间。通过对AlexNet、ResNet18网络在ImageNet数据集上的预训练模型权重进行概率分布分析,发现该权重分布具备单侧幂律分布的特征,进而使用双对数拟合的方式进一步验证权重的单侧分布服从截断幂律分布的性质。基于该分布规律,结合防止过拟合的正则化思想提出一种标准化对称幂律分布（NSPL）的初始化方法,并基于AlexNet和ResNet32网络,与He初始化的正态分布、均匀分布两种方法在CIFAR10数据集上进行实验对比,结果表明,NSPL方法收敛速度优于正态分布、均匀分布两种初始化方法,且在ResNet32上取得了更高的精确度。相似文献