Similar literature
20 similar documents found (search time: 31 ms)
1.
Previous developments in conditional density estimation have used neural nets to estimate statistics of the distribution or the marginal or joint distributions of the input-output variables. We modify the joint-distribution-estimating sigmoidal neural network to estimate the conditional distribution. Thus, the probability density of the output conditioned on the inputs is estimated using a neural network. We derive and implement the learning laws to train the network. We show that this network has computational advantages over a brute-force ratio of joint and marginal distributions. We also compare its performance to a kernel conditional density estimator in a larger-scale (higher-dimensional) problem simulating more realistic conditions.
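As an illustration of the brute-force baseline this abstract argues against, the conditional density can be formed as a ratio of kernel estimates of the joint and marginal densities. The following is a minimal numpy sketch; the bandwidth, synthetic data, and function names are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def gaussian_kde(points, data, h):
    """Product-Gaussian kernel density estimate at `points` (n, d)."""
    diffs = points[:, None, :] - data[None, :, :]          # (n, m, d)
    k = np.exp(-0.5 * np.sum((diffs / h) ** 2, axis=-1))   # (n, m)
    norm = (np.sqrt(2 * np.pi) * h) ** data.shape[1]
    return k.sum(axis=1) / (len(data) * norm)

def conditional_density(y_grid, x, xy_data, h=0.3):
    """p(y | x) ~= p_hat(x, y) / p_hat(x), evaluated on a grid of y values."""
    x_data = xy_data[:, :1]
    query = np.column_stack([np.full_like(y_grid, x), y_grid])
    joint = gaussian_kde(query, xy_data, h)
    marginal = gaussian_kde(np.array([[x]]), x_data, h)
    return joint / marginal

rng = np.random.default_rng(0)
x_samples = rng.uniform(-2, 2, 500)
y_samples = np.sin(x_samples) + 0.1 * rng.standard_normal(500)
data = np.column_stack([x_samples, y_samples])

# Conditional density of y given x = 1.0; it should peak near sin(1.0)
y_grid = np.linspace(-2, 2, 201)
p = conditional_density(y_grid, x=1.0, xy_data=data)
```

The network approach in the abstract estimates p(y|x) directly, avoiding the cost of building this ratio at every query point, which is where its claimed computational advantage lies.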

2.
Previous work on statistical language modeling has shown that it is possible to train a feedforward neural network to approximate probabilities over sequences of words, resulting in significant error reduction when compared to standard baseline models based on n-grams. However, training the neural network model with the maximum-likelihood criterion requires computations proportional to the number of words in the vocabulary. In this paper, we introduce adaptive importance sampling as a way to accelerate training of the model. The idea is to use an adaptive n-gram model to track the conditional distributions produced by the neural network. We show that a very significant speedup can be obtained on standard problems.
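The speedup idea can be illustrated with self-normalized importance sampling of the softmax expectation, which is the vocabulary-sized term in the maximum-likelihood gradient. This numpy sketch uses a fixed uniform proposal in place of the paper's adaptive n-gram, and all sizes are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
V, d = 5000, 32                        # vocabulary size, embedding dim (illustrative)
W = 0.1 * rng.standard_normal((V, d))  # output word embeddings
h = rng.standard_normal(d)             # hidden state for one context

# Exact gradient of log Z w.r.t. h is the softmax expectation E_p[w_y];
# computing it touches all V words.
scores = W @ h
p = np.exp(scores - scores.max())
p /= p.sum()
exact = p @ W                          # (d,)

# Importance-sampled estimate touching only 500 sampled words.
# A uniform proposal q stands in for the paper's adaptive n-gram tracker.
q = np.full(V, 1.0 / V)
idx = rng.choice(V, size=500, p=q)
w = np.exp(scores[idx] - scores.max()) / q[idx]    # unnormalized importance weights
est = (w[:, None] * W[idx]).sum(axis=0) / w.sum()  # self-normalized estimate
```

The closer the proposal tracks the network's conditional distribution, the lower the variance of `est`, which is exactly what the adaptive n-gram is for.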

3.
Context-specific independence representations, such as tree-structured conditional probability distributions, capture local independence relationships among the random variables in a Bayesian network (BN). Local independence relationships among the random variables can also be captured by using attribute-value hierarchies to find an appropriate abstraction level for the values used to describe the conditional probability distributions. Capturing this local structure is important because it reduces the number of parameters required to represent the distribution. This can lead to more robust parameter estimation and structure selection, more efficient inference algorithms, and more interpretable models. In this paper, we introduce Tree-Abstraction-Based Search (TABS), an approach for learning a data distribution by inducing the graph structure and parameters of a BN from training data. TABS combines tree structure and attribute-value hierarchies to compactly represent conditional probability tables. To construct the attribute-value hierarchies, we investigate two data-driven techniques: a global clustering method, which uses all of the training data to build the attribute-value hierarchies, and can be performed as a preprocessing step; and a local clustering method, which uses only the local network structure to learn attribute-value hierarchies. We present empirical results for three real-world domains, finding that (1) combining tree structure and attribute-value hierarchies improves the accuracy of generalization, while providing a significant reduction in the number of parameters in the learned networks, and (2) data-derived hierarchies perform as well or better than expert-provided hierarchies.

4.
In this study, a deep multi-layered group method of data handling (GMDH)-type neural network algorithm using a revised heuristic self-organization method is proposed and applied to medical image diagnosis of liver cancer. The deep GMDH-type neural network can automatically organize a deep architecture with many hidden layers. The structural parameters, such as the number of hidden layers, the number of neurons in the hidden layers, and the useful input variables, are selected automatically to minimize a prediction error criterion defined as Akaike's information criterion (AIC) or the prediction sum of squares (PSS). The architecture of the deep neural network is organized automatically using the revised heuristic self-organization method, which is a type of evolutionary computation. This new neural network algorithm is applied to medical image diagnosis of liver cancer, and the recognition results are compared with those of a conventional 3-layered sigmoid function neural network.

5.
Discovering the low-dimensional manifold structure of high-dimensional observation data is the main goal of manifold learning. Building on earlier work using neural networks for nonlinear dimensionality reduction, this paper proposes a new continuous autoencoder (C-Autoencoder) network. The method adopts the CRBM (Continuous Restricted Boltzmann Machine) network structure: by training a bidirectional deep neural network with multiple intermediate layers, high-dimensional continuous data can be converted into a low-dimensional embedding and then reconstructed. In particular, this continuous autoencoder provides a bidirectional mapping between the high-dimensional continuous data space and the low-dimensional embedding, effectively solving the inverse-mapping problem that most nonlinear dimensionality-reduction methods lack, and it is especially suitable for the reduction and reconstruction of high-dimensional continuous data. Experiments applying the C-Autoencoder to synthetic continuous data show that it can not only discover the nonlinear manifold structure embedded in high-dimensional continuous data, but also effectively recover the original high-dimensional continuous data from the low-dimensional embedding.

6.
DNN-based feature extraction for low-resource speech recognition
Qin Chuxiong, Zhang Lianhai 《自动化学报》, 2017, 43(7): 1208-1219
To address the sharp degradation of deep neural network (DNN) feature-based acoustic modeling under low-resource training data, two DNN feature-extraction methods suited to low-resource speech recognition are proposed. First, based on a network structure with shared hidden-layer training, a corpus with richer resources is used to assist the training of a deep bottleneck neural network; since the bottleneck (BN) layer lies within the shared layers, Dropout, Maxout, and rectified linear units are introduced to mitigate the overfitting caused by the irregular distribution of multi-stream training samples, while also shrinking the network's parameter count and reducing training time. Second, to improve DNN feature extraction, a low-dimensional high-level feature-extraction technique based on convex non-negative matrix factorization (CNMF) is proposed: the network's weight matrix is factorized, the resulting basis matrix serves as the weight matrix of a feature layer, and a new low-dimensional feature is extracted from that layer. Experiments on the 1-hour low-resource Czech training corpus of Vystadial 2013 show that, with auxiliary training on 26.7 hours of English data, the recognition rate improves by 7.0% relative to the baseline system when Dropout and rectified linear units are used, and by 12.6% when Dropout and Maxout are used, with 62.7% fewer network parameters than the other systems and 25% less training time. The matrix-factorization-based low-dimensional features achieve better recognition rates than bottleneck features (BNF) under both monolingual and auxiliary training, and under auxiliary training they outperform the DNN-HMM recognition system by margins ranging from 0.8% to 3.4%.

7.
In this study, a revised group method of data handling (GMDH)-type neural network algorithm which self-selects the optimum neural network architecture is applied to 3-dimensional medical image analysis of the heart. The GMDH-type neural network can automatically organize the neural network architecture by using the heuristic self-organization method, which is the basic theory of the GMDH algorithm. The heuristic self-organization method is a kind of evolutionary computation method. In this revised GMDH-type neural network algorithm, the optimum neural network architecture is automatically organized using polynomial and sigmoid function neurons. Furthermore, the structural parameters, such as the number of layers, the number of neurons in the hidden layers, and the useful input variables, are selected automatically in order to minimize the prediction error criterion, defined as the prediction sum of squares (PSS).

8.
Deep learning approximates complex functions by learning deep nonlinear network structures and can learn the essential features of a dataset from large collections of unlabeled samples. A deep belief network (DBN) is a Bayesian probabilistic generative model composed of multiple layers of stochastic latent variables; it can serve as the pre-training stage of a deep neural network, providing the network's initial weights. An efficient learning algorithm for this model not only addresses slow training but also yields very good initial parameter values, greatly improving the model's capability. Financial markets are multivariate nonlinear systems; using a DBN for analysis and prediction neatly resolves the difficulty other forecasting methods face in determining initial weights. Taking crude-oil futures price forecasting as an example, this paper demonstrates the feasibility and effectiveness of DBN-based prediction and decision making.

9.
Echo state networks (ESNs) constitute a novel approach to recurrent neural network (RNN) training, with an RNN (the reservoir) being generated randomly, and only a readout being trained using a simple, computationally efficient algorithm. ESNs have greatly facilitated the practical application of RNNs, outperforming classical approaches on a number of benchmark tasks. This paper studies the formulation of a class of copula-based semiparametric models for sequential data modeling, characterized by nonparametric marginal distributions modeled by postulating suitable echo state networks, and parametric copula functions that help capture all the scale-free temporal dependence of the modeled processes. We provide a simple algorithm for the data-driven estimation of the marginal distribution and the copula parameters of our model under the maximum-likelihood framework. We exhibit the merits of our approach by considering a number of applications; as we show, our method offers a significant enhancement in the dynamical data modeling capabilities of ESNs, without significant compromises in the algorithm's computational efficiency.
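The ESN training scheme described here (a random fixed reservoir, with only a linear readout trained) can be sketched in a few lines of numpy. This is a generic ESN fitted by ridge regression, not the paper's copula extension; the reservoir size, spectral radius, and prediction task are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
n_res, steps = 100, 1000

# Random fixed reservoir, rescaled to spectral radius 0.9 for the echo-state property
W = rng.standard_normal((n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))
W_in = rng.uniform(-0.5, 0.5, n_res)

# Toy task: one-step-ahead prediction of a noisy sine wave
u = np.sin(0.1 * np.arange(steps)) + 0.01 * rng.standard_normal(steps)

x = np.zeros(n_res)
states = np.empty((steps, n_res))
for t in range(steps):
    x = np.tanh(W @ x + W_in * u[t])
    states[t] = x

# Only the readout is trained (ridge regression); the reservoir never changes
X, y = states[100:-1], u[101:]          # drop a washout period, predict the next value
ridge = 1e-6
w_out = np.linalg.solve(X.T @ X + ridge * np.eye(n_res), X.T @ y)

pred = X @ w_out
mse = np.mean((pred - y) ** 2)
```

The computational efficiency the abstract refers to comes from this structure: training reduces to a single linear least-squares solve instead of backpropagation through time.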

10.
This study examines the capability of neural networks for linear time-series forecasting. Using both simulated and real data, the effects of neural network factors such as the number of input nodes and the number of hidden nodes, as well as the training sample size, are investigated. Results show that neural networks are quite competent in modeling and forecasting linear time series in a variety of situations, and that simple neural network structures are often effective.

Scope and purpose: Neural network capability for nonlinear modeling and forecasting has been established in the literature both theoretically and empirically. The purpose of this paper is to investigate the effectiveness of neural networks for linear time-series analysis and forecasting. Several research studies on neural network capability for linear problems in regression and classification have yielded mixed findings. This study aims to provide further evidence on the effectiveness of neural networks with regard to linear time-series forecasting. The significance of the study is that it is often difficult in practice to determine whether the underlying data-generating process is linear or nonlinear. If neural networks can compete with traditional forecasting models on linear data with noise, they can be used in even broader situations by forecasting researchers and practitioners.

11.
Probability distributions have long been used to model random phenomena in various areas of life, and their generalization has been an area of interest for several authors in recent years. Many situations arise where joint modeling of two random phenomena is required; in such cases bivariate distributions are needed. Development of bivariate distributions requires certain conditions, and little work has been done in this area. This paper deals with a bivariate beta-inverse Weibull distribution. The marginal and conditional distributions of the proposed distribution have been obtained, along with expansions for the joint and conditional density functions. The properties of the proposed bivariate distribution, including product, marginal, and conditional moments, the joint moment generating function, and the joint hazard rate function, have been studied. A numerical study of the dependence function has been carried out to see the effect of various parameters on the dependence between the variables. The parameters of the proposed bivariate distribution are estimated by the maximum likelihood method. A simulation study and a real data application of the distribution are presented.

12.
Many neural network models have been shown to be highly vulnerable to adversarial examples: inputs maliciously constructed by an attacker, in which slight perturbations added to the original samples cause the machine learning model to misclassify them. Such adversarial examples pose a serious threat to demanding, safety-critical everyday applications such as autonomous driving, surveillance systems, and biometric verification. Research has shown that detecting adversarial examples during training is more effective than defending against attacks by hardening the model, and that the intermediate hidden layers of a neural network capture and abstract sample information, making adversarial examples easier to distinguish from clean samples. This paper therefore studies the differences in statistical features between the hidden-layer representations of adversarial inputs and of original natural inputs across the model's hidden layers, and shows that these statistical differences can be discriminated between layers. By identifying the layer that most effectively separates adversarial examples from the statistics of the original natural training set, and applying outlier detection, we design an adversarial example detection framework based on feature distributions. The framework comprises a generalized detection method, which extracts the learned training-data representations at each hidden layer, derives statistical features, and computes outlier scores for the test set, and a conditional detection method, which compares the deep model's predictions on the test data to obtain the statistical features of the corresponding training data. The statistics computed include the L2 norm distance to the origin and the correlation with the top singular vector of the sample covariance matrix. Experimental results show that both detection methods can detect adversarial examples using hidden-layer information, and perform well on adversarial examples generated by different attacks, demonstrating the effectiveness of the proposed framework.

13.
Password-guessing algorithms are an effective way to evaluate the strength and security of user passwords. This paper proposes PassCVAE, a password-guessing algorithm based on a conditional variational autoencoder that uses personal user information as conditional features to train the password-attack model. On the encoder side, a bidirectional recurrent neural network (GRU) and a text convolutional neural network (TextCNN) encode the password sequences and the user's personal information, respectively, and extract abstract features; on the decoder side, a two-layer GRU network decodes the latent encoding of the personal information and password data to generate password sequences. The algorithm effectively fits the distribution and character-composition patterns of password data and generates high-quality password guesses. Multiple experiments show that the proposed PassCVAE algorithm outperforms existing mainstream password-guessing algorithms.

14.
Experiments on the application of IOHMMs to model financial returns series
Input-output hidden Markov models (IOHMM) are conditional hidden Markov models in which the emission (and possibly the transition) probabilities can be conditioned on an input sequence. For example, these conditional distributions can be linear, logistic, or nonlinear (using for example multilayer neural networks). We compare the generalization performance of several models which are special cases of input-output hidden Markov models on financial time-series prediction tasks: an unconditional Gaussian, a conditional linear Gaussian, a mixture of Gaussians, a mixture of conditional linear Gaussians, a hidden Markov model, and various IOHMMs. The experiments compare these models on predicting the conditional density of returns of market and sector indices. Note that the unconditional Gaussian estimates the first moment with the historical average. The results show that, although for the first moment the historical average gives the best results, for the higher moments, the IOHMMs yielded significantly better performance, as estimated by the out-of-sample likelihood.
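The two simplest models in this comparison can be set side by side directly: an unconditional Gaussian using the historical mean and variance, and a conditional linear Gaussian whose mean is linear in an input. A minimal numpy sketch on synthetic "returns" (the data-generating process and variable names are invented for illustration, not the paper's data):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 1000
x = rng.standard_normal(n)                  # input, e.g. a lagged return (illustrative)
r = 0.3 * x + 0.1 * rng.standard_normal(n)  # next-period "return"

def gauss_loglik(r, mu, sigma):
    """Gaussian log-likelihood; mu may be a scalar or per-observation array."""
    return np.sum(-0.5 * np.log(2 * np.pi * sigma**2)
                  - (r - mu) ** 2 / (2 * sigma**2))

# Unconditional Gaussian: first moment is the historical average
ll_uncond = gauss_loglik(r, r.mean(), r.std())

# Conditional linear Gaussian: mean linear in x, fitted by least squares
A = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(A, r, rcond=None)
resid = r - A @ beta
ll_cond = gauss_loglik(r, A @ beta, resid.std())
```

When the input genuinely carries information about the return, as here by construction, the conditional model attains the higher likelihood; the IOHMMs in the abstract extend this idea with hidden regimes.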

15.
Existing cascade nonlinear additive noise models can infer causal direction in the presence of hidden intermediate variables, but for learning causal networks that contain latent variables and cascaded causal relations they suffer from problems such as global structure search and unidentifiable equivalence classes. This paper designs a two-stage causal structure learning algorithm for non-temporal observational data: the first stage builds a basic causal network skeleton from the conditional independence relations among the observed variables; the second stage, based on the cascade nonlinear additive noise model, infers causal direction by comparing the marginal likelihood of each adjacent causal pair in the skeleton under different causal-direction hypotheses. Experimental results show that the algorithm performs well on synthetic causal-structure datasets across different numbers of latent variables, average in-degrees, structure dimensions, and sample sizes, and that on real causal-structure datasets it improves the F1 score by 51% on average over mainstream causal structure learning algorithms, showing higher accuracy and stronger robustness.

16.
Di Xiao-Jun, John A. 《Neurocomputing》, 2007, 70(16-18): 3019
Real-world systems usually involve both continuous and discrete input variables. However, in existing learning algorithms for both neural networks and fuzzy systems, these mixed variables are usually treated as continuous, without taking into account the special features of discrete variables. It is inefficient to represent each discrete input variable, which takes only a few fixed values, by one input neuron with full connection to the hidden layer. This paper proposes a novel hierarchical hybrid fuzzy neural network to represent systems with mixed input variables. The proposed model consists of two levels: the lower level consists of fuzzy sub-systems, each of which aggregates several discrete input variables into an intermediate variable as its output; the higher level is a neural network whose input variables consist of the continuous input variables and the intermediate variables. For systems or function approximations with mixed variables, it is shown that the proposed hierarchical hybrid fuzzy neural networks outperform standard neural networks in accuracy with fewer parameters, provide greater transparency, and preserve the universal approximation property (i.e., they can approximate any function with mixed input variables to any degree of accuracy).

17.
Han Honggui, Lin Zhenglai, Qiao Junfei 《控制与决策》, 2017, 32(12): 2169-2175
To adjust the structure and parameters of a fuzzy neural network simultaneously, a growing fuzzy neural network based on the unscented Kalman filter (UKF-GFNN) is proposed. First, the UKF is used to adjust the parameters of the fuzzy neural network; second, a fuzzy-rule growth mechanism based on the output strength of the hidden-layer neurons is designed to grow the network structure; finally, the proposed growing fuzzy neural network is applied to nonlinear system modeling. Experimental results show that the UKF-based growing fuzzy neural network can self-tune both its structure and its parameters, and achieves high modeling accuracy.

18.
19.
Application of neural networks in forecasting engine systems reliability
This paper presents a comparative study of the predictive performance of neural network time-series models for forecasting failures and reliability in engine systems. Traditionally, failure data analysis requires specification of parametric failure distributions and justification of certain assumptions, which are at times difficult to validate. The time-series modeling technique using neural networks, on the other hand, provides a promising alternative. Neural network modeling via the feed-forward multilayer perceptron (MLP) suffers from local minima and long computation times. The radial basis function (RBF) neural network architecture is found to be a viable alternative due to its shorter training time. Illustrative examples using reliability testing and field data show that the proposed model achieves comparable or better predictive performance than the traditional MLP model and a linear benchmark based on Box–Jenkins autoregressive integrated moving average (ARIMA) models. The effects of input window size and hidden layer nodes are further investigated. Appropriate design topologies can be determined via sensitivity analysis.
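An RBF network of the kind described, with fixed Gaussian units and a linear output layer trained in closed form (which is why its training time is short), can be sketched as follows. The toy series, window size, kernel width, and number of centers are illustrative assumptions, not the paper's setup:

```python
import numpy as np

def fit_rbf(X, y, centers, width, ridge=1e-8):
    """Train an RBF network: fixed Gaussian units, linear output weights."""
    phi = np.exp(-np.sum((X[:, None, :] - centers[None]) ** 2, -1) / (2 * width**2))
    return np.linalg.solve(phi.T @ phi + ridge * np.eye(len(centers)), phi.T @ y)

def rbf_predict(X, centers, width, w):
    phi = np.exp(-np.sum((X[:, None, :] - centers[None]) ** 2, -1) / (2 * width**2))
    return phi @ w

rng = np.random.default_rng(3)
# Toy stand-in for a reliability series: seasonal signal plus noise
t = np.arange(200)
series = np.sin(0.3 * t) + 0.05 * rng.standard_normal(200)

window = 5                                                # input window size
X = np.lib.stride_tricks.sliding_window_view(series[:-1], window)
y = series[window:]                                       # one-step-ahead targets

# Centers drawn from the training windows; first 150 windows train, rest test
centers = X[rng.choice(150, 20, replace=False)]
w = fit_rbf(X[:150], y[:150], centers, width=1.0)
pred = rbf_predict(X[150:], centers, width=1.0, w=w)
mse = np.mean((pred - y[150:]) ** 2)
```

Varying `window` and the number of centers is the numpy analogue of the input-window-size and hidden-node sensitivity analysis the abstract describes.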

20.
In this study, the revised group method of data handling (GMDH)-type neural network (NN) algorithm self-selecting the optimum neural network architecture is applied to the identification of a nonlinear system. In this algorithm, the optimum neural network architecture is automatically organized using two kinds of neuron architecture, such as the polynomial- and sigmoid function-type neurons. Many combinations of the input variables, in which the high order effects of the input variables are contained, are generated using the polynomial-type neurons, and useful combinations are selected using the prediction sum of squares (PSS) criterion. These calculations are iterated, and the multilayered architecture is organized. Furthermore, the structural parameters, such as the number of layers, the number of neurons in the hidden layers, and the useful input variables, are automatically selected in order to minimize the prediction error criterion defined as PSS.
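The PSS-based selection step common to these GMDH-type algorithms can be illustrated directly: PSS is the leave-one-out sum of squared prediction errors, computable in closed form from the hat matrix, and one GMDH layer scores every pair of inputs through a polynomial neuron and keeps the best combination. A minimal numpy sketch (the problem sizes and the quadratic neuron form are illustrative assumptions):

```python
import numpy as np
from itertools import combinations

def pss(X, y, ridge=1e-10):
    """Prediction sum of squares: leave-one-out SSE via the hat matrix H."""
    H = X @ np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T)
    resid = y - H @ y
    return np.sum((resid / (1.0 - np.diag(H))) ** 2)

rng = np.random.default_rng(4)
n = 200
Z = rng.standard_normal((n, 4))                    # candidate inputs x1..x4
y = 1.5 * Z[:, 0] * Z[:, 1] + 0.05 * rng.standard_normal(n)

# One GMDH layer: fit a quadratic polynomial neuron on every input pair,
# score each candidate combination by PSS, keep the best
scores = {}
for i, j in combinations(range(4), 2):
    a, b = Z[:, i], Z[:, j]
    X = np.column_stack([np.ones(n), a, b, a * b, a**2, b**2])
    scores[(i, j)] = pss(X, y)

best = min(scores, key=scores.get)                 # the selected input pair
```

Iterating this selection, with the surviving neurons' outputs feeding the next layer, is what organizes the multilayered architecture the abstract describes.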
