Similar Documents (20 results)
1.
The authors investigate the effects of state disturbances, output noise, and errors in initial conditions on a class of learning control algorithms. They present a simple learning algorithm and exhibit, via a concise proof, bounds on the asymptotic trajectory errors for the learned input and the corresponding state and output trajectories. Furthermore, these bounds are continuous functions of the bounds on the initial condition errors, state disturbances, and output noise, and they vanish in the absence of these disturbances.
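For concreteness, here is a minimal sketch of a P-type iterative learning update on a toy first-order plant with state disturbance and output noise; the plant parameters, learning gain, and noise levels are illustrative assumptions, not the algorithm analyzed above. Consistent with the bounds described, the trial error does not vanish but settles at a floor set by the disturbance magnitudes.

```python
import numpy as np

# Toy plant: x[t+1] = a*x[t] + b*u[t] + w[t],  y[t] = x[t] + v[t].
a, b, gamma = 0.8, 1.0, 0.5            # plant parameters and learning gain
T, trials = 50, 30
y_d = np.sin(np.linspace(0, 2 * np.pi, T))   # desired output trajectory
u = np.zeros(T)

for k in range(trials):
    x, y = 0.0, np.zeros(T)
    for t in range(T):
        y[t] = x + 0.01 * np.random.randn()              # output noise v[t]
        x = a * x + b * u[t] + 0.01 * np.random.randn()  # disturbance w[t]
    e = y_d - y
    u[:-1] += gamma * e[1:]            # P-type update from the trial error
    print(k, np.abs(e).max())          # error shrinks to a noise floor
```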

2.
Spatial pyramids have been successfully applied to incorporating spatial information into bag-of-words based image representation. However, a major drawback is that they lead to high-dimensional image representations. In this paper, we present a novel framework for obtaining compact pyramid representation. First, we investigate the usage of the divisive information theoretic feature clustering (DITC) algorithm in creating a compact pyramid representation. In many cases this method allows us to reduce the size of a high-dimensional pyramid representation by up to an order of magnitude with little or no loss in accuracy. Furthermore, comparison to clustering based on the agglomerative information bottleneck (AIB) shows that our method obtains superior results at significantly lower computational cost. Moreover, we investigate the optimal combination of multiple features in the context of our compact pyramid representation. Finally, experiments show that the method can obtain state-of-the-art results on several challenging data sets.
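As a rough illustration of the compression step, the sketch below merges pyramid histogram bins whose class-conditional distributions are similar. Plain k-means on p(class | bin) stands in for the DITC objective, which clusters with a KL-divergence criterion, and all sizes and data are made up.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
n_images, n_bins, n_classes, n_compact = 200, 4200, 10, 400
X = rng.poisson(1.0, (n_images, n_bins)).astype(float)   # pyramid histograms
y = rng.integers(0, n_classes, n_images)

# p(class | bin): how each bin's mass distributes over the classes
counts = np.stack([X[y == c].sum(0) for c in range(n_classes)], axis=1)
p_class_given_bin = counts / counts.sum(1, keepdims=True).clip(min=1e-12)

labels = KMeans(n_clusters=n_compact, n_init=3).fit_predict(p_class_given_bin)
X_compact = np.zeros((n_images, n_compact))
for j, g in enumerate(labels):
    X_compact[:, g] += X[:, j]        # merge bins that behave alike
print(X.shape, "->", X_compact.shape) # order-of-magnitude smaller representation
```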

3.
We consider the minimization over probability measures of the expected value of a random variable, regularized by relative entropy with respect to a given probability distribution. In the general setting we provide a complete characterization of the situations in which a finite optimal value exists and the situations in which a minimizing probability distribution exists. Specializing to the case where the underlying probability distribution is Wiener measure, we characterize finite relative entropy changes of measure in terms of square integrability of the corresponding change of drift. For the optimal change of measure for the relative entropy weighted optimization, an expression involving the Malliavin derivative of the cost random variable is derived. The theory is illustrated by its application to several examples, including the case where the cost variable is the maximum of a standard Brownian motion over a finite time horizon. For this example we obtain an exact optimal drift, as well as an approximation of the optimal drift through a Monte-Carlo algorithm.
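The scalar identity behind this setup is the Gibbs variational formula, inf_Q { E_Q[f] + eps·KL(Q‖P) } = −eps·log E_P[exp(−f/eps)], attained at dQ*/dP ∝ exp(−f/eps). Below is a Monte Carlo sketch for the Brownian-maximum example; the discretization and the value of eps are illustrative choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n_paths, n_steps, eps = 100_000, 500, 0.5
dW = rng.standard_normal((n_paths, n_steps)) * np.sqrt(1.0 / n_steps)
W = np.cumsum(dW, axis=1)                    # random-walk approximation of BM
f = np.maximum(W.max(axis=1), 0.0)           # max of the path over [0, 1]

# Gibbs variational formula: the optimal regularized value in closed form
value = -eps * np.log(np.mean(np.exp(-f / eps)))
print("optimal regularized value ~", value)

# Under the optimal measure Q*, dQ*/dP is proportional to exp(-f/eps);
# importance weights let us inspect Q*-statistics without re-simulation.
w = np.exp(-f / eps); w /= w.sum()
print("E_Q*[f] ~", float(np.sum(w * f)))
```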

4.
Impulse response identification almost always leads to an ill-posed mathematical problem. This fact is the basis for the well-known numerical difficulties of identification by means of the impulse response. The theory of regularizable ill-posed problems furnishes a unifying point of view for several specific methods of impulse response identification. In this paper we introduce a class of input/output representations, which we call λ-representations, for linear, time-invariant systems. For many cases of practical interest the identification of one of these representations is mathematically well-posed. Its determination is thus relatively insensitive to certain experimental uncertainties, and rational error-in-identification bounds may be found, so that λ-identification is often an attractive alternative to impulse response identification in the nonparametric modeling of physical systems which must be identified from input/output records. We investigate the effects of input and output uncertainties (noise) on λ-identification, and discuss the problem of finding minimal realizations from these representations. We illustrate the work with an example of electromagnetic pulse (EMP) threat prediction using experimental data. Hard error bounds are provided on the predicted threat. For this problem, the appropriate λ-representation turns out to be the ramp response.
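The toy computation below illustrates why a smoother representation such as the ramp response (the impulse response integrated twice) is better conditioned against measurement noise; the first-order plant and the noise level are illustrative assumptions, not the paper's identification procedure.

```python
import numpy as np

dt, T = 0.01, 5.0
t = np.arange(0.0, T, dt)
h = np.exp(-t)                              # impulse response of 1/(s+1)
noise = 0.05 * np.random.randn(t.size)

ramp = lambda g: np.cumsum(np.cumsum(g) * dt) * dt   # integrate twice
r_clean, r_noisy = ramp(h), ramp(h + noise)
print(np.abs(noise).max() / np.abs(h).max())         # relative noise, raw
print(np.abs(r_noisy - r_clean).max() / np.abs(r_clean).max())  # after smoothing
```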

5.
In this paper, we propose a new information-theoretic method to simplify the computation of information and to unify several methods in one framework. The new method, called "supposed maximum information," is used to produce humanly comprehensible representations in competitive learning by taking into account the importance of input units. In the new learning method, by supposing the maximum information of the input units, the actual information of the input units is estimated. Then, the competitive network is trained with the estimated information in the input units. The method is applied not to pure competitive learning but to self-organizing maps, because it is easy to demonstrate visually how well the new method can produce more interpretable representations. We applied the method to three well-known data sets, namely, the Kohonen animal data, the SPECT heart data, and the voting data from the machine learning database. With these data, we succeeded in producing more explicit class boundaries on the U-matrices than the conventional SOM. In addition, for all the data, the quantization and topographic errors produced by our method were lower than those of the conventional SOM.
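A minimal SOM sketch in which input units are re-weighted by an importance estimate before training. Per-unit variance stands in for the paper's information estimate, which is computed differently, and the grid size and learning schedules are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.random((300, 5))                       # toy data, 5 input units
importance = X.var(axis=0); importance /= importance.max()
Xw = X * importance                            # emphasize informative units

grid, dim, iters = 10, X.shape[1], 2000
W = rng.random((grid, grid, dim))
gy, gx = np.mgrid[0:grid, 0:grid]
for it in range(iters):
    lr = 0.5 * (1 - it / iters)                # decaying learning rate
    sigma = max(grid / 2 * (1 - it / iters), 0.5)
    x = Xw[rng.integers(len(Xw))]
    d = ((W - x) ** 2).sum(-1)
    by, bx = np.unravel_index(d.argmin(), d.shape)   # best matching unit
    nb = np.exp(-((gy - by) ** 2 + (gx - bx) ** 2) / (2 * sigma ** 2))
    W += lr * nb[..., None] * (x - W)                # neighborhood update
```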

6.
7.
This paper deals with a MIMO feedback control system that has two channels with additive noises and studies the effects of these noises on the input and output signals of the plant. We derive achievable bounds of integral type for sensitivity-like properties of the system based on an information-theoretic approach. These bounds are generalizations of Bode's integral formula to the case where the feedback system includes nonlinear elements.
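For reference, the classical formula being generalized is Bode's sensitivity integral: for a loop whose open-loop transfer function is rational with relative degree at least two and unstable poles p_k, the sensitivity function S satisfies

```latex
\int_{0}^{\infty} \ln\lvert S(j\omega)\rvert \, d\omega \;=\; \pi \sum_{k} \operatorname{Re}\, p_k ,
```

so that, for a stable open loop, sensitivity reduction over one frequency band must be paid for by amplification elsewhere (the "waterbed" effect).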

8.
Kulkarni, S.R., Mitter, S.K., Tsitsiklis, J.N. Machine Learning, 1993, 11(1): 23–35
The original and most widely studied PAC model for learning assumes a passive learner in the sense that the learner plays no role in obtaining information about the unknown concept. That is, the samples are simply drawn independently from some probability distribution. Some work has been done on studying more powerful oracles and how they affect learnability. To find bounds on the improvement in sample complexity that can be expected from using oracles, we consider active learning in the sense that the learner has complete control over the information received. Specifically, we allow the learner to ask arbitrary yes/no questions. We consider both active learning under a fixed distribution and distribution-free active learning. In the case of active learning, the underlying probability distribution is used only to measure distance between concepts. For learnability with respect to a fixed distribution, active learning does not enlarge the set of learnable concept classes, but can improve the sample complexity. For distribution-free learning, it is shown that a concept class is actively learnable iff it is finite, so that active learning is in fact less powerful than the usual passive learning model. We also consider a form of distribution-free learning in which the learner knows the distribution being used, so that distribution-free refers only to the requirement that a bound on the number of queries can be obtained uniformly over all distributions. Even with the side information of the distribution being used, a concept class is actively learnable iff it has finite VC dimension, so that active learning with the side information still does not enlarge the set of learnable concept classes.
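A toy sketch of the upper-bound mechanism: with arbitrary yes/no questions, each answer can halve the version space, so a finite class of N concepts is identified with at most ceil(log2 N) queries. The threshold concepts below are an illustrative example, not taken from the paper.

```python
import math

# Concepts are thresholds on {0,...,99}: concept c maps x -> (x >= c).
concepts = list(range(100))
target = 37                                 # unknown concept (oracle side)

version_space = concepts[:]
queries = 0
while len(version_space) > 1:
    # Ask: "is the target in the first half of the remaining candidates?"
    half = version_space[: len(version_space) // 2]
    answer = target in half                 # oracle's yes/no reply
    version_space = half if answer else version_space[len(half):]
    queries += 1
print(queries, "queries, vs bound", math.ceil(math.log2(len(concepts))))
print("learned concept:", version_space[0])
```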

9.
10.
11.
We design a new model for managing and analyzing big data, the Random Sample Partition (RSP) model, which represents a big data file as a collection of RSP data-block files distributed across the nodes of a cluster. The RSP generation operation keeps the distribution of each RSP data block statistically consistent with the distribution of the whole data set, so each RSP data block is a random sample of the big data and can be used to estimate its statistical properties or to build classification and regression models on it. Under the RSP model, big data analysis tasks can be completed by analyzing RSP data blocks rather than computing over the entire data set, which greatly reduces the computational load, lowers the demand on computing resources, and improves the computing power and scalability of the cluster. This paper first gives the definition, theoretical foundations, and generation methods of the RSP model; it then introduces the Alpha computing framework for asymptotic ensemble learning on RSP data blocks; it further discusses computing techniques for big data analysis based on the RSP model and the Alpha framework, including data exploration and cleaning, probability density function estimation, supervised subspace learning, semi-supervised ensemble learning, clustering ensembles, and outlier detection; finally, it discusses the innovations of the RSP model in divide-and-conquer big data analysis and in sampling methods, and the advantages of the RSP model and the Alpha computing framework for large-scale data analysis.
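A minimal sketch of the RSP generation operation and its use: shuffle the records once, cut them into equal-size blocks, and estimate global statistics from a single block or from a small ensemble of blocks. The data, block count, and statistic are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
big = rng.exponential(2.0, size=1_000_000)    # stand-in for a big data file
perm = rng.permutation(big.size)
blocks = np.array_split(big[perm], 100)       # 100 RSP data blocks

# Any single block estimates global statistics...
print(big.mean(), blocks[0].mean())
# ...and a few blocks give an ensemble (Alpha-style asymptotic) estimate.
print(np.mean([b.mean() for b in blocks[:5]]))
```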

12.
Confidence-based active learning
This paper proposes a new active learning approach, confidence-based active learning, for training a wide range of classifiers. This approach is based on identifying and annotating uncertain samples. The uncertainty value of each sample is measured by its conditional error. The approach takes advantage of current classifiers' probability preserving and ordering properties. It calibrates the output scores of classifiers to conditional error. Thus, it can estimate the uncertainty value for each input sample according to its output score from a classifier and select only samples with uncertainty value above a user-defined threshold. Even though we cannot guarantee the optimality of the proposed approach, we find it to provide good performance. Compared with existing methods, this approach is robust without additional computational effort. A new active learning method for support vector machines (SVMs) is implemented following this approach. A dynamic bin width allocation method is proposed to accurately estimate sample conditional error and this method adapts to the underlying probabilities. The effectiveness of the proposed approach is demonstrated using synthetic and real data sets and its performance is compared with the widely used least certain active learning method.
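A bare-bones sketch of the selection rule: map each classifier score to an estimated conditional error and annotate only samples above a user-defined threshold. Here the model's own predicted probability stands in for the paper's dynamic-bin calibration, and the data are synthetic.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(4)
X = rng.standard_normal((1000, 2))
y = (X[:, 0] + 0.5 * rng.standard_normal(1000) > 0).astype(int)

labeled = rng.choice(1000, 30, replace=False)     # small initial label set
clf = LogisticRegression().fit(X[labeled], y[labeled])

proba = clf.predict_proba(X).max(axis=1)
cond_error = 1.0 - proba                          # estimated conditional error
threshold = 0.35                                  # user-defined threshold
to_annotate = np.where(cond_error > threshold)[0]
print(len(to_annotate), "uncertain samples selected for annotation")
```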

13.
In this paper, a novel parameter learning scheme based on multi-signal processing is developed to estimate the parameters of a Hammerstein nonlinear model with output disturbance. The Hammerstein nonlinear model consists of a static nonlinear block and a dynamic linear block, and multiple input signals are devised so that the nonlinear block parameters and the linear block parameters can be estimated separately, which greatly simplifies the estimation procedure. First, from the input–output data of separable signals, the linear block parameters are computed by correlation analysis, which effectively suppresses the influence of output noise. In addition, a model-error probability-density-function technique is employed to estimate the nonlinear block parameters from measurable input–output data of random signals, which not only controls the distribution of the model error over the state space but also drives the error distribution toward a normal distribution. The simulation results demonstrate that the developed approach achieves high learning accuracy and small modeling error, verifying its effectiveness.
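A sketch of the separation idea in the linear-block stage: with a separable (here Gaussian i.i.d.) input, the input–output cross-correlation is proportional to the linear block's impulse response, so the dynamics can be estimated independently of the static nonlinearity. The system below is an illustrative assumption, and the nonlinear-block (error-PDF) stage is omitted.

```python
import numpy as np

rng = np.random.default_rng(5)
N = 50_000
u = rng.standard_normal(N)                  # separable (Gaussian) input
nl = lambda x: x + 0.5 * x ** 2             # unknown static nonlinearity
g = np.array([1.0, 0.6, 0.3])               # unknown linear FIR block
y = np.convolve(nl(u), g)[:N] + 0.05 * rng.standard_normal(N)

# Cross-correlation E[y[t] u[t-k]] recovers g up to a constant scale,
# since E[nl(u) u] is a scalar and the output noise averages out.
R = np.array([np.mean(y[k:] * u[: N - k]) for k in range(5)])
print(R / R[0])                             # ~ [1.0, 0.6, 0.3, 0, 0]
```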

14.
We consider the problem of generating balanced training samples from an unlabeled data set with an unknown class distribution. While random sampling works well when the data are balanced, it is very ineffective for unbalanced data. Other approaches, such as active learning and cost-sensitive learning, are also suboptimal, as they are classifier-dependent and require misclassification costs and labeled samples, respectively. We propose a new strategy for generating training samples that is independent of both the underlying class distribution of the data and the classifier that will be trained on the labeled data. Our methods are iterative and can be seen as variants of active learning in which we use semi-supervised clustering at each iteration to perform biased sampling from the clusters. We provide several strategies to estimate the underlying class distributions in the clusters and to increase the balance of the training samples. Experiments with both highly skewed and balanced data from the UCI repository and a private data set show that our algorithm produces much more balanced samples than random sampling or uncertainty sampling. Further, our sampling strategy is substantially more efficient than active learning methods. The experiments also validate that, with more balanced training data, classifiers trained on our samples outperform classifiers trained with random sampling or active learning.
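A single-round sketch of cluster-biased sampling on synthetic two-class data with a rare class: clustering first and spending the labeling budget evenly across clusters over-samples the rare class relative to plain random sampling. The iterative and semi-supervised refinements of the method are omitted, and all sizes are made up.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(6)
X = np.vstack([rng.normal(0, 1, (950, 2)),            # majority class
               rng.normal(5, 0.5, (50, 2))])          # rare class
y = np.array([0] * 950 + [1] * 50)                    # hidden labels

labels = KMeans(n_clusters=10, n_init=5).fit_predict(X)
budget_per_cluster = 5
picked = np.concatenate([
    rng.choice(np.where(labels == c)[0],
               min(budget_per_cluster, (labels == c).sum()), replace=False)
    for c in range(10)
])
print("minority fraction, random:", y[rng.choice(len(y), 50)].mean())
print("minority fraction, cluster-biased:", y[picked].mean())
```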

15.
We address the problem of estimating a function f: [0,1]^d → [-L, L] by using feedforward sigmoidal networks with a single hidden layer and bounded weights. The only information about the function is provided by an independent and identically distributed sample generated according to an unknown distribution. The quality of the estimate is quantified by the expected cost functional and depends on the sample size. We use Lipschitz properties of the cost functional and of the neural networks to derive the relationship between performance bounds and sample sizes within the framework of Valiant's probably approximately correct (PAC) learning.

16.
Dependency networks approximate a joint probability distribution over multiple random variables as a product of conditional distributions. Relational Dependency Networks (RDNs) are graphical models that extend dependency networks to relational domains. This higher expressivity, however, comes at the expense of a more complex model-selection problem: an unbounded number of relational abstraction levels might need to be explored. Whereas current learning approaches for RDNs learn a single probability tree per random variable, we propose to turn the problem into a series of relational function-approximation problems using gradient-based boosting. In doing so, one can easily induce highly complex features over several iterations and in turn quickly estimate a very expressive model. Our experimental results on several data sets show that this boosting method results in efficient learning of RDNs when compared to state-of-the-art statistical relational learning approaches.
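A propositional sketch of the functional-gradient-boosting core: the conditional P(y=1|x) = sigmoid(F(x)) is represented as a sum of small regression trees, each fit to the pointwise gradients y − p, rather than as a single probability tree. Relational feature construction, the heart of the RDN setting, is omitted, and the data are synthetic.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(7)
X = rng.standard_normal((500, 4))
y = (X[:, 0] * X[:, 1] > 0).astype(float)    # non-linear target

F = np.zeros(len(y))                          # additive model, starts at 0
trees, lr = [], 0.5
for _ in range(20):
    p = 1.0 / (1.0 + np.exp(-F))
    grad = y - p                              # functional gradient of log loss
    t = DecisionTreeRegressor(max_depth=3).fit(X, grad)
    trees.append(t)
    F += lr * t.predict(X)                    # boosting step
print("train accuracy:", ((F > 0) == (y == 1)).mean())
```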

17.
Xiao, Yueyue, Huang, Wei, Oh, Sung-Kwun, Zhu, Liehuang. Applied Intelligence, 2022, 52(6): 6398–6412

In this paper, we propose a polynomial kernel neural network classifier (PKNNC) based on random sampling and information gain. Random sampling is used to generate datasets for the construction of the polynomial neurons located in the neural network, while information gain is used to evaluate the importance of the input variables (viz. dataset features) of each neuron. Both random sampling and information gain stem from the concepts of well-known random forest models. Traditional neural networks have certain limitations, such as slow convergence, a tendency to fall into local optima, and difficulty describing the polynomial relation between input and output. In this regard, a general PKNNC is proposed, consisting of three parts: premise, conclusion, and aggregation. The design of the PKNNC is summarized as follows. In the premise part, random sampling and information gain are used to obtain multiple subdatasets that are passed to the aggregation part, and the conclusion part uses three types of polynomials. In the aggregation part, the least squares method (LSM) is used to estimate the parameters of the polynomials. Furthermore, the particle swarm optimization (PSO) algorithm is exploited to optimize the PKNNC; the overall optimization combines structure optimization and parameter optimization. The PKNNC takes advantage of three types of polynomial kernel functions, random sampling techniques, and information gain algorithms, which give it a good ability to describe higher-order nonlinear relationships between input and output variables together with high generalization and fast convergence. To evaluate the effectiveness of the PKNNC, numerical experiments are carried out on two types of data: machine learning data and face data. A comparative study illustrates that the proposed PKNNC leads to better performance than several conventional models.
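A simplified sketch of one premise-conclusion pass: draw a random subsample, rank the inputs by an information measure (scikit-learn's mutual information stands in for information gain), expand the selected inputs with a polynomial map, and fit the polynomial weights by least squares. The PSO-based structure/parameter optimization and the aggregation over many neurons are omitted, and the data are synthetic.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(8)
X = rng.standard_normal((600, 8))
y = (X[:, 0] ** 2 + X[:, 1] * X[:, 2] > 0.5).astype(float)

idx = rng.choice(len(X), 300, replace=False)        # random sampling (premise)
mi = mutual_info_classif(X[idx], y[idx])
keep = np.argsort(mi)[-3:]                          # most informative inputs

Phi = PolynomialFeatures(degree=2).fit_transform(X[idx][:, keep])
w, *_ = np.linalg.lstsq(Phi, y[idx], rcond=None)    # LSM for the conclusion
Phi_all = PolynomialFeatures(degree=2).fit_transform(X[:, keep])
print("accuracy:", ((Phi_all @ w > 0.5) == (y == 1)).mean())
```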


18.
Kernel-based algorithms have proven successful in many nonlinear modeling applications. However, the computational complexity of classical kernel-based methods grows superlinearly with the number of training data, which is too expensive for online applications. To solve this problem, the paper presents an information-theoretic method to train a sparse version of a kernel learning algorithm. A concept named instantaneous mutual information is investigated to measure the reliability of the estimated output. This measure is used as a criterion to determine the novelty of each training sample, and informative ones are selected to form a compact dictionary that represents the whole data. Furthermore, we propose a robust learning scheme for training the kernel learning algorithm with an adaptive learning rate. This ensures convergence of the learning algorithm and makes it reach its steady state faster. We illustrate the performance of the proposed algorithm and compare it with some recent kernel algorithms in several experiments.
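A sketch of online dictionary sparsification with a novelty criterion: a sample joins the dictionary only if it is far from the existing centers and its prediction error is large, and the coefficients adapt with a KLMS-style update. The distance and error thresholds below stand in for the paper's instantaneous-mutual-information criterion, which is computed differently.

```python
import numpy as np

rng = np.random.default_rng(9)
gamma, delta_d, delta_e, eta = 2.0, 0.3, 0.1, 0.2
kernel = lambda a, b: np.exp(-gamma * (a - b) ** 2)   # Gaussian kernel

dictionary, alpha = [], np.array([])
for _ in range(2000):
    x = rng.uniform(-3, 3)
    y = np.sin(x) + 0.05 * rng.standard_normal()      # target to track
    k = np.array([kernel(x, c) for c in dictionary])
    y_hat = float(k @ alpha) if dictionary else 0.0
    err = y - y_hat
    dist = np.min(np.abs(x - np.array(dictionary))) if dictionary else np.inf
    if dist > delta_d and abs(err) > delta_e:         # novelty criterion
        dictionary.append(x)
        alpha = np.append(alpha, 0.0)
        k = np.append(k, kernel(x, x))
    if dictionary:
        alpha = alpha + eta * err * k                 # adaptive update
print("dictionary size:", len(dictionary))            # compact vs. 2000 samples
```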

19.
Knowledge about the distribution of a statistical estimator is important for various purposes, such as the construction of confidence intervals for model parameters or the determination of critical values of tests. A widely used method to estimate this distribution is the so-called bootstrap, which is based on imitating the probabilistic structure of the data-generating process using the information provided by a given set of random observations. In this article we investigate this classical method in the context of artificial neural networks used for estimating a mapping from input to output space. We establish consistency results for bootstrap estimates of the distribution of parameter estimates.
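A minimal pairs-bootstrap sketch: resample the observed input-output pairs with replacement, refit, and read the estimator's distribution off the refits. A least-squares slope stands in here for the network parameter estimate studied in the article.

```python
import numpy as np

rng = np.random.default_rng(10)
n = 200
x = rng.uniform(-1, 1, n)
y = 2.0 * x + 0.3 * rng.standard_normal(n)

theta_hat = np.polyfit(x, y, 1)[0]            # point estimate (slope)
boot = []
for _ in range(1000):
    i = rng.integers(0, n, n)                 # resample with replacement
    boot.append(np.polyfit(x[i], y[i], 1)[0])
lo, hi = np.percentile(boot, [2.5, 97.5])     # bootstrap confidence interval
print(theta_hat, (lo, hi))
```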

20.
Noise can improve how memoryless neurons process signals and maximize their throughput information. Such favorable use of noise is the so-called "stochastic resonance" or SR effect at the level of threshold neurons and continuous neurons. This work presents theoretical and simulation evidence that 1) lone noisy threshold and continuous neurons exhibit the SR effect in terms of the mutual information between random input and output sequences, 2) a new statistically robust learning law can find this entropy-optimal noise level, and 3) the adaptive SR effect is robust against highly impulsive noise with infinite variance. Histograms estimate the relevant probability density functions at each learning iteration. A theorem shows that almost all noise probability density functions produce some SR effect in threshold neurons even if the noise is impulsive and has infinite variance. The optimal noise level in threshold neurons also behaves nonlinearly as the input signal amplitude increases. Simulations further show that the SR effect persists for several sigmoidal neurons and for Gaussian radial-basis-function neurons.
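A quick simulation of the threshold-neuron SR effect: the mutual information between a subthreshold binary input and the thresholded output rises and then falls as the noise level grows, peaking at a nonzero noise intensity. The signal amplitude, threshold, and Gaussian noise are illustrative choices (the work's point is that even impulsive, infinite-variance noise produces the effect).

```python
import numpy as np

rng = np.random.default_rng(11)
theta, amp, n = 1.0, 0.4, 200_000            # threshold above signal amplitude
x = rng.choice([-amp, amp], n)               # random binary input sequence

def mutual_info(noise_std):
    """Plug-in estimate of I(X;Y) for the noisy threshold neuron."""
    y = (x + noise_std * rng.standard_normal(n) > theta).astype(int)
    I = 0.0
    for xv in (-amp, amp):
        for yv in (0, 1):
            pxy = np.mean((x == xv) & (y == yv))
            if pxy > 0:
                I += pxy * np.log2(pxy / (np.mean(x == xv) * np.mean(y == yv)))
    return I

for s in (0.1, 0.3, 0.5, 0.8, 1.5, 3.0):
    print(s, round(mutual_info(s), 4))       # rises then falls: the SR effect
```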
