Related Articles
20 related articles found.
1.
Subspace information criterion for model selection
The problem of model selection is of considerable importance for acquiring higher levels of generalization capability in supervised learning. In this article, we propose a new criterion for model selection, the subspace information criterion (SIC), which is a generalization of Mallows's $C_L$. It is assumed that the learning target function belongs to a specified functional Hilbert space and the generalization error is defined as the Hilbert space squared norm of the difference between the learning result function and target function. SIC gives an unbiased estimate of the generalization error so defined. SIC assumes the availability of an unbiased estimate of the target function and the noise covariance matrix, which are generally unknown. A practical calculation method of SIC for least-mean-squares learning is provided under the assumption that the dimension of the Hilbert space is less than the number of training examples. Finally, computer simulations in two examples show that SIC works well even when the number of training examples is small.
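For linear (least-mean-squares) learning the criterion admits a compact matrix form. Below is a minimal sketch of that computation as we read the abstract, assuming the learning result and the unbiased reference are both linear in the sample values, $\hat f = Ay$ and $\hat f_u = A_u y$, with known noise covariance $\Sigma$; the variable names and the toy ridge learners are ours, not the authors'.

```python
import numpy as np

def sic(A, A_u, y, Sigma):
    """Sketch of the subspace information criterion for a linear learner
    f_hat = A @ y, given an unbiased reference learner f_hat_u = A_u @ y
    and noise covariance Sigma (our reading of the abstract, not the
    authors' code)."""
    D = A - A_u
    resid = D @ y                         # f_hat - f_hat_u
    # ||f_hat - f_hat_u||^2 overestimates the squared bias by tr(D Sigma D^T),
    # and the learner's own variance contributes tr(A Sigma A^T)
    return resid @ resid - np.trace(D @ Sigma @ D.T) + np.trace(A @ Sigma @ A.T)

# toy use: compare ridge learners of two strengths on a small design matrix
rng = np.random.default_rng(0)
X = rng.standard_normal((20, 5))
y = X @ rng.standard_normal(5) + 0.1 * rng.standard_normal(20)
Sigma = 0.01 * np.eye(20)
A_u = X @ np.linalg.pinv(X)               # OLS projector: unbiased reference
for lam in (0.1, 1.0):
    A = X @ np.linalg.solve(X.T @ X + lam * np.eye(5), X.T)
    print(lam, sic(A, A_u, y, Sigma))
```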

2.
A Regularized Kernel Form of the Minimum Squared Error Algorithm
The minimum squared error (MSE) algorithm is one of the most widely used classical methods for pattern recognition and regression analysis; its objective is to minimize the sum of squared errors between the outputs of a linear function and the desired outputs. This paper recasts the classical MSE algorithm using kernel functions satisfying Mercer's condition together with regularization techniques, and proposes a nonlinear MSE algorithm based on kernels and regularization, i.e., a regularized kernel form of the MSE algorithm. Its objective function comprises the sum of squared errors between the outputs of a kernel-based nonlinear function and the desired outputs, plus an appropriate regularization term. Regularization copes with ill-posedness while shrinking the solution space and controlling the generalization of the solution. Three quadratic regularization terms are adopted, and their differences are compared in detail according to the probabilistic interpretation of each term. Finally, the performance of the algorithm is analyzed further on simulated and real data.
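As a concrete illustration of one quadratic regularizer of this kind, the sketch below minimizes $\|K\alpha - y\|^2 + \lambda\,\alpha^T K \alpha$, whose closed-form solution is $\alpha = (K + \lambda I)^{-1} y$; the Gaussian kernel and this particular regularizer are illustrative assumptions, not necessarily the paper's exact formulation.

```python
import numpy as np

def rbf_kernel(X1, X2, gamma=1.0):
    # Gaussian (Mercer) kernel matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2)
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def fit_regularized_kernel_mse(X, y, lam=1e-2, gamma=1.0):
    """One quadratic-regularizer variant: min ||K a - y||^2 + lam * a^T K a,
    with closed form a = (K + lam I)^{-1} y (an illustrative choice)."""
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def predict(X_train, X_new, a, gamma=1.0):
    return rbf_kernel(X_new, X_train, gamma) @ a

# toy regression: noisy sine
rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, (40, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(40)
a = fit_regularized_kernel_mse(X, y, lam=0.1, gamma=0.5)
print(predict(X, np.array([[0.0]]), a, gamma=0.5))
```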

3.
4.
Regularized least-squares classification (RLSC) is a regularization network based on a quadratic loss function whose generalization ability depends on the model parameters; the traditional model selection method is a time-consuming grid search over those parameters. To address this, a novel model selection method, AlignLoo, is proposed. Its key idea is to optimize the kernel parameter and the hyperparameter separately: the kernel-target alignment is maximized to select the optimal kernel parameter, and a bound on the leave-one-out error of RLSC is minimized to select the optimal hyperparameter. The method is efficient and requires no validation samples. It was tested on the IDA benchmark datasets, and the results show that it is effective.
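The first stage is easy to state: kernel-target alignment is $A(K, yy^T) = \langle K, yy^T\rangle_F / (\|K\|_F \|yy^T\|_F)$. The sketch below grid-evaluates it over candidate Gaussian kernel widths; the grid and kernel family are illustrative assumptions, and the leave-one-out bound of the second stage is not reproduced here.

```python
import numpy as np

def alignment(K, y):
    """Kernel-target alignment <K, y y^T>_F / (||K||_F ||y y^T||_F)."""
    Yt = np.outer(y, y)
    return (K * Yt).sum() / (np.linalg.norm(K) * np.linalg.norm(Yt))

def best_width(X, y, widths):
    # pick the Gaussian kernel width maximizing alignment (stage one of the
    # two-stage scheme; stage two would tune the regularization parameter
    # via a leave-one-out error bound)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    scores = {w: alignment(np.exp(-d2 / (2 * w ** 2)), y) for w in widths}
    return max(scores, key=scores.get), scores

rng = np.random.default_rng(2)
X = rng.standard_normal((50, 2))
y = np.sign(X[:, 0] * X[:, 1])        # labels in {-1, +1}
w, scores = best_width(X, y, widths=[0.1, 0.5, 1.0, 2.0])
print(w, scores)
```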

5.
Hagiwara K. Neural Computation, 2002, 14(8): 1979-2002.
In considering statistical model selection of neural networks and radial basis functions under an overrealizable case, the problem of unidentifiability emerges. Because the model selection criterion is an unbiased estimator of the generalization error based on the training error, this article analyzes the expected training error and the expected generalization error of neural networks and radial basis functions in overrealizable cases and clarifies the difference from regular models, for which identifiability holds. As a special case of an overrealizable scenario, we assumed a Gaussian noise sequence as training data. In the least-squares estimation under this assumption, we first formulated the problem, in which the calculation of the expected errors of unidentifiable networks is reduced to the calculation of the expectation of the supremum of the $\chi^2$ process. Under this formulation, we gave an upper bound of the expected training error and a lower bound of the expected generalization error, where the generalization is measured at a set of training inputs. Furthermore, we gave stochastic bounds on the training error and the generalization error. The obtained upper bound of the expected training error is smaller than in regular models, and the lower bound of the expected generalization error is larger than in regular models. The result tells us that the degree of overfitting in neural networks and radial basis functions is higher than in regular models, and correspondingly that the generalization capability is worse than in the case of regular models. These results suffice to show a difference between neural networks and regular models in the context of least-squares estimation in a simple situation. This is a first step toward constructing a model selection criterion for the overrealizable case; further important problems in this direction are also included in this article.

6.

This paper develops an unbiased iterative parameter identification algorithm for time-delay nonlinear rational systems. To reduce redundant parameters, the time delay is first estimated by an iterative estimator based on the L1 regularization technique and a cross-validation strategy; the rational system is then decomposed into two subsystems by the hierarchical principle. The unbiased parameter estimates are obtained by the least squares iterative technique. A simulation example shows the effectiveness of the proposed algorithm.
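A minimal rendering of the first step, choosing the delay with an L1 penalty, can be sketched as a small ISTA Lasso over lagged input regressors: the surviving coefficient of largest magnitude marks the delay. The lag grid, penalty level, solver, and toy system are our assumptions, and the cross-validation over the penalty is omitted.

```python
import numpy as np

def lasso_ista(Phi, y, lam=0.05, iters=500):
    """Tiny ISTA solver for min ||y - Phi w||^2 / (2n) + lam * ||w||_1."""
    n = len(y)
    w = np.zeros(Phi.shape[1])
    step = 1.0 / (np.linalg.norm(Phi, 2) ** 2 / n)   # 1 / Lipschitz constant
    for _ in range(iters):
        z = w - step * (Phi.T @ (Phi @ w - y) / n)   # gradient step
        w = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0)  # soft threshold
    return w

def estimate_delay(u, y, max_lag=10, lam=0.05):
    # regress y(t) on u(t-1..t-max_lag); the lag whose coefficient survives
    # the L1 penalty with the largest magnitude is taken as the time delay
    Phi = np.column_stack([u[max_lag - k: len(u) - k]
                           for k in range(1, max_lag + 1)])
    w = lasso_ista(Phi, y[max_lag:], lam)
    return int(np.argmax(np.abs(w))) + 1, w

rng = np.random.default_rng(3)
u = rng.standard_normal(400)
true_delay = 4
y = 0.8 * np.roll(u, true_delay) + 0.05 * rng.standard_normal(400)
y[:true_delay] = 0.0
print(estimate_delay(u, y))   # should recover a delay of 4
```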


7.
Nonparametric regression is widely used as a method of characterizing a non-linear relationship between a variable of interest and a set of covariates. Practical application of nonparametric regression methods in the field of small area estimation is fairly recent, and has so far focussed on the use of empirical best linear unbiased prediction under a model that combines a penalized spline (p-spline) fit and random area effects. The concept of model-based direct estimation is used to develop an alternative nonparametric approach to estimation of a small area mean. The suggested estimator is a weighted average of the sample values from the area, with weights derived from a linear regression model with random area effects extended to incorporate a smooth, nonparametrically specified trend. Estimation of the mean squared error of the proposed small area estimator is also discussed. Monte Carlo simulations based on both simulated and real datasets show that the proposed model-based direct estimator and its associated mean squared error estimator perform well. They are worth considering in small area estimation applications where the underlying population regression relationships are non-linear or have a complicated functional form.

8.
Model Selection for Small Sample Regression
Model selection is an important ingredient of many machine learning algorithms, in particular when the sample size is small, in order to strike the right trade-off between overfitting and underfitting. Previous classical results for linear regression are based on an asymptotic analysis. We present a new penalization method for performing model selection for regression that is appropriate even for small samples. Our penalization is based on an accurate estimator of the ratio of the expected training error and the expected generalization error, in terms of the expected eigenvalues of the input covariance matrix.

9.
Approximate Model Selection for Support Vector Machines Based on the Regularization Path
Model selection is a fundamental problem for support vector machines. Based on approximate computation of the kernel matrix and on the regularization path, a new model selection method for support vector machines is proposed. First, a preliminary theory of approximate model selection is developed, including the kernel matrix approximation algorithm KMA-α, a theorem bounding the approximation error of KMA-α, and, from it, a model approximation error bound for support vector machines. Then, the approximate model selection algorithm AMSRP is proposed. The algorithm uses the low-rank approximation of the kernel matrix computed by KMA-α to speed up the solution of the support vector machine, and uses the regularization path algorithm to speed up the tuning of the penalty parameter C. Finally, comparative experiments on standard datasets verify the feasibility and computational efficiency of AMSRP. The experimental results show that AMSRP can significantly improve the efficiency of model selection for support vector machines while preserving test-set accuracy. Theoretical analysis and experimental results indicate that AMSRP is a sound and efficient model selection algorithm.
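The abstract does not spell out KMA-α, but the general mechanism of trading a full kernel matrix for a low-rank surrogate can be illustrated with a generic Nyström approximation $K \approx C W^{+} C^T$ built from a few landmark columns; this is a stand-in for KMA-α, not the paper's algorithm.

```python
import numpy as np

def nystrom(X, m, gamma=0.5, rng=None):
    """Generic Nystrom low-rank kernel approximation K ~= C W^+ C^T from m
    landmark columns (a stand-in for KMA-alpha, whose details the abstract
    does not give)."""
    rng = rng or np.random.default_rng(0)
    idx = rng.choice(len(X), size=m, replace=False)
    d2 = ((X[:, None, :] - X[idx][None, :, :]) ** 2).sum(-1)
    C = np.exp(-gamma * d2)               # n x m sampled kernel columns
    W = C[idx]                            # m x m intersection block
    return C, np.linalg.pinv(W)

rng = np.random.default_rng(10)
X = rng.standard_normal((200, 3))
C, Wp = nystrom(X, m=20)
K_approx = C @ Wp @ C.T
K_exact = np.exp(-0.5 * ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
print(np.linalg.norm(K_exact - K_approx) / np.linalg.norm(K_exact))
```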

10.
This paper presents an efficient construction algorithm for obtaining sparse kernel density estimates based on a regression approach that directly optimizes model generalization capability. Computational efficiency of the density construction is ensured using an orthogonal forward regression, and the algorithm incrementally minimizes the leave-one-out test score. A local regularization method is incorporated naturally into the density construction process to further enforce sparsity. An additional advantage of the proposed algorithm is that it is fully automatic and the user is not required to specify any criterion to terminate the density construction procedure. This is in contrast to an existing state-of-the-art kernel density estimation method using the support vector machine (SVM), where the user is required to specify some critical algorithm parameter. Several examples are included to demonstrate the ability of the proposed algorithm to effectively construct a very sparse kernel density estimate with comparable accuracy to that of the full sample optimized Parzen window density estimate. Our experimental results also demonstrate that the proposed algorithm compares favorably with the SVM method, in terms of both test accuracy and sparsity, for constructing kernel density estimates.
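The automatic-termination idea can be shown with a stripped-down version: plain (non-orthogonalized) forward selection over candidate kernel regressors, scored by the closed-form leave-one-out error of least squares and stopped as soon as that score worsens. The orthogonal decomposition and local regularization of the paper are omitted, and the kernel candidates are an assumption.

```python
import numpy as np

def loo_mse(Phi, y):
    """PRESS/leave-one-out MSE of least squares on columns Phi:
    e_loo_i = e_i / (1 - h_ii), hat matrix H = Phi (Phi^T Phi)^-1 Phi^T."""
    H = Phi @ np.linalg.solve(Phi.T @ Phi, Phi.T)
    e = y - H @ y
    return np.mean((e / (1.0 - np.diag(H))) ** 2)

def forward_select(Phi_all, y):
    # greedy forward selection of regressors; terminates automatically when
    # the leave-one-out score stops improving (no user-set stopping rule)
    chosen, best = [], np.inf
    while True:
        scores = [(loo_mse(Phi_all[:, chosen + [j]], y), j)
                  for j in range(Phi_all.shape[1]) if j not in chosen]
        s, j = min(scores)
        if s >= best:
            return chosen, best
        chosen.append(j)
        best = s

rng = np.random.default_rng(9)
X = rng.uniform(-3, 3, (60, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(60)
d2 = (X - X.T) ** 2                       # Gaussian kernels centred on the data
Phi_all = np.exp(-d2 / (2 * 0.5 ** 2))
idx, score = forward_select(Phi_all, y)
print(len(idx), score)                    # a handful of centres usually suffice
```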

11.
Two obvious limitations exist for baseline kernel minimum squared error (KMSE): lack of sparseness of the solution and the ill-posed problem. Previous sparse methods for KMSE have overcome the second limitation using a regularization strategy, which introduces an increase in the computational cost to determine the regularization parameter. Hence, in this paper, a constructive sparse algorithm for KMSE (CS-KMSE) and its improved version (ICS-KMSE) are proposed which will simultaneously address the two limitations described above. CS-KMSE chooses the training samples that incur the largest reductions on the objective function as the significant nodes on the basis of the Householder transformation. In contrast with CS-KMSE, there is an additional replacement mechanism using Givens rotation in ICS-KMSE, which results in ICS-KMSE giving better performance than CS-KMSE in terms of sparseness. CS-KMSE and ICS-KMSE do not require the regularization parameter at all before they begin to choose significant nodes, which is beneficial since it saves on the model selection time. More importantly, CS-KMSE and ICS-KMSE terminate their procedures with an early stopping strategy that acts as an implicit regularization term, which avoids overfitting and curbs the sparse level on the solution of the baseline KMSE. Finally, in comparison with other algorithms, both ICS-KMSE and CS-KMSE have superior sparseness, and extensive comparisons confirm their effectiveness and feasibility.

12.
Consider a wireless sensor network with a fusion center deployed to estimate a common non-random parameter vector. Each sensor obtains a noisy observation vector of the non-random parameter vector according to a linear regression model. The observation noise is correlated across the sensors. Due to power, bandwidth and complexity limitations, each sensor linearly compresses its data. The compressed data from the sensors are transmitted to the fusion center, which linearly estimates the non-random parameter vector. The goal is to design the compression matrices at the sensors and the linear unbiased estimator at the fusion center such that the total variance of the estimation error is minimized. In this paper, we provide necessary and sufficient conditions for achieving the performance of the centralized best linear unbiased estimator. We also provide the optimal compression matrices and the optimal linear unbiased estimator when these conditions are satisfied. When these conditions are not satisfied, we propose a sub-optimal algorithm to determine the compression matrices and the linear unbiased estimator. Simulation results are provided to illustrate the effectiveness of the proposed algorithm.
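The centralized benchmark the paper targets is the best linear unbiased estimator $\hat\theta = (H^T C^{-1} H)^{-1} H^T C^{-1} y$. The sketch below computes it and a compressed counterpart for comparison; the compression matrix here is an arbitrary random one, not one of the paper's optimal designs.

```python
import numpy as np

def blue(H, C, y):
    """Best linear unbiased estimator (H^T C^-1 H)^-1 H^T C^-1 y."""
    Ci = np.linalg.inv(C)
    return np.linalg.solve(H.T @ Ci @ H, H.T @ Ci @ y)

rng = np.random.default_rng(4)
p, m = 3, 12                        # parameter dim, stacked observation dim
H = rng.standard_normal((m, p))     # stacked regression matrices of all sensors
C = 0.1 * np.eye(m) + 0.02          # equicorrelated noise across sensors
theta = np.array([1.0, -2.0, 0.5])
y = H @ theta + rng.multivariate_normal(np.zeros(m), C)

# compress the stacked data with a (here: random, i.e. non-optimized) matrix
# and apply the BLUE to the compressed model y_c = (A H) theta + A n
A = rng.standard_normal((6, m))
print("centralized BLUE:", blue(H, C, y))
print("compressed BLUE :", blue(A @ H, A @ C @ A.T, A @ y))
```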

13.
Based on quantale-enriched category, we consider algebras with compatible quantale-enriched structures, which can be viewed as fuzzification of ordered algebraic structures. We mainly study groupoids and semigroups with compatible quantale-enriched structures from this viewpoint. Some basic concepts such as ideals, homomorphisms, residuated quantale-enriched groupoids are developed and some examples of them are given. Our approach gives a complement to the approach initiated by Rosenfeld to study fuzzy abstract algebra, and these two approaches are combined in the present paper to study fuzzy aspects of abstract algebra structures.

14.
The Gaussian kernel density estimator is known to have substantial problems for bounded random variables with high density at the boundaries. For independent and identically distributed data, several solutions have been put forward to solve this boundary problem. In this paper, we propose the gamma kernel estimator as a density estimator for positive time series data from a stationary α-mixing process. We derive the mean (integrated) squared error and asymptotic normality. In a Monte Carlo simulation, we generate data from an autoregressive conditional duration model and a stochastic volatility model. We study the local and global behavior of the estimator and we find that the gamma kernel estimator outperforms the local linear density estimator and the Gaussian kernel estimator based on log-transformed data. We also illustrate the good performance of the h-block cross-validation method as a bandwidth selection procedure. An application to data from financial transaction durations and realized volatility is provided.
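The gamma kernel estimator replaces the symmetric kernel by a gamma density whose shape depends on the evaluation point, in its simple first-order form $\hat f(x) = n^{-1}\sum_i g(X_i;\, x/b + 1,\, b)$. The sketch below implements that variant; the bandwidth is fixed by hand here rather than by the h-block cross-validation used in the paper.

```python
import numpy as np
from scipy.stats import gamma

def gamma_kde(x_grid, data, b):
    """First-order gamma kernel density estimate for positive data:
    f_hat(x) = mean_i Gamma(shape=x/b + 1, scale=b).pdf(X_i); the bandwidth b
    is hand-picked here, whereas the paper selects it by h-block CV."""
    return np.array([gamma.pdf(data, a=x / b + 1, scale=b).mean()
                     for x in x_grid])

# toy positive data with mass near the boundary, where a Gaussian KDE leaks
rng = np.random.default_rng(5)
data = rng.exponential(scale=1.0, size=500)
x_grid = np.linspace(0.01, 5, 100)
f_hat = gamma_kde(x_grid, data, b=0.1)
print(f_hat[:5])                      # the estimate stays sensible near x = 0
```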

15.
A conditional density function, which describes the relationship between response and explanatory variables, plays an important role in many analysis problems. In this paper, we propose a new kernel-based parametric method to estimate conditional density. An exponential function is employed to approximate the unknown density, and its parameters are computed from the given explanatory variable via a nonlinear mapping using kernel principal component analysis (KPCA). We develop a new kernel function, which is a variant of polynomial kernels, to be used in KPCA. The proposed method is compared with the Nadaraya-Watson estimator through numerical simulation and practical data. Experimental results show that the proposed method outperforms the Nadaraya-Watson estimator in terms of revised mean integrated squared error (RMISE). Therefore, the proposed method is an effective method for estimating the conditional densities.

16.
To improve the generalization ability of stochastic configuration networks (SCN), a smoothed $L_1$ regularization method suited to SCN is proposed. To remedy the local non-differentiability of the $L_1$ regularizer, the curve is smoothed within a neighborhood of its non-smooth point; on this basis a smoothed error function for SCN is constructed and an algorithm for incrementally computing the weights is proposed, followed by a global weight optimization algorithm built on the alternating direction method of multipliers, whose convergence is analyzed theoretically. Compared with the sparsity of $L_1$ regularization and the uniform parameter shrinkage of $L_2$ regularization, the proposed method retains all features of the data according to their importance, keeping the parameters within a small range while clearly stratified in magnitude, which gives the network better generalization ability. Finally, numerical simulations verify the feasibility and effectiveness of the proposed method.
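The local smoothing step can be illustrated with a standard quadratic patch around the kink of $|w|$: inside a neighborhood of radius $\varepsilon$ the absolute value is replaced by $w^2/(2\varepsilon) + \varepsilon/2$, which matches $|w|$ in value and slope at $\pm\varepsilon$. This particular patch is our illustrative choice; the paper's construction may differ in detail.

```python
import numpy as np

def smoothed_l1(w, eps=0.5):
    """$L_1$ penalty |w| with its kink smoothed on (-eps, eps): quadratic
    w^2/(2 eps) + eps/2 inside, |w| outside; value and first derivative
    agree at w = +-eps, so the penalty is C^1 everywhere."""
    w = np.asarray(w, dtype=float)
    return np.where(np.abs(w) > eps, np.abs(w), w ** 2 / (2 * eps) + eps / 2)

def smoothed_l1_grad(w, eps=0.5):
    # gradient: sign(w) outside the smoothing neighborhood, w/eps inside,
    # continuous at the junction points
    w = np.asarray(w, dtype=float)
    return np.where(np.abs(w) > eps, np.sign(w), w / eps)

w = np.linspace(-2, 2, 9)
print(smoothed_l1(w))
print(smoothed_l1_grad(w))
```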

17.
In the past decade, support vector machines (SVMs) have gained the attention of many researchers. SVMs are non-parametric supervised learning schemes that rely on statistical learning theory, which enables learning machines to generalize well to unseen data. SVMs are kernel-based methods that were introduced as a robust approach to classification and regression problems and have lately been applied to nonlinear identification problems through so-called support vector regression. In SVM designs for nonlinear identification, a nonlinear model is represented by an expansion in terms of nonlinear mappings of the model input. The nonlinear mappings define a feature space, which may have infinite dimension. In this context, a relevant identification approach is least squares support vector machines (LS-SVMs). Compared to other identification methods, LS-SVMs possess prominent advantages: their generalization performance (i.e. error rates on test sets) either matches or is significantly better than that of competing methods, and, more importantly, the performance does not depend on the dimensionality of the input data. Formulated as a constrained quadratic programming problem with a regularized cost function, the training of an LS-SVM involves the selection of kernel parameters and the regularization parameter of the objective function. A good choice of these parameters is crucial for the performance of the estimator. In this paper, the proposed LS-SVM design combines the LS-SVM with a new chaotic differential evolution optimization approach based on the Ikeda map (CDEK). The CDEK is adopted to tune the regularization parameter and the radial basis function bandwidth. Simulations using LS-SVMs on NARX (Nonlinear AutoRegressive with eXogenous inputs) models for the identification of a thermal process show the effectiveness and practicality of the proposed CDEK algorithm compared with the classical DE approach.
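The chaotic ingredient is the Ikeda map, which CDEK uses in place of uniform pseudo-random draws. A sketch of the map itself is below, with the common parameter choice u = 0.9 and one coordinate rescaled to (0, 1); exactly how the sequence is injected into DE (mutation factor, crossover rate, or both) is an implementation choice we do not take from the paper.

```python
import numpy as np

def ikeda_sequence(n, u=0.9, x0=0.1, y0=0.1):
    """Ikeda map: t = 0.4 - 6/(1 + x^2 + y^2),
       x' = 1 + u (x cos t - y sin t),  y' = u (x sin t + y cos t).
    Returns the x-trajectory rescaled to (0, 1), usable as a chaotic
    substitute for uniform draws when tuning DE control parameters."""
    x, y, xs = x0, y0, np.empty(n)
    for i in range(n):
        t = 0.4 - 6.0 / (1.0 + x * x + y * y)
        x, y = (1 + u * (x * np.cos(t) - y * np.sin(t)),
                u * (x * np.sin(t) + y * np.cos(t)))
        xs[i] = x
    lo, hi = xs.min(), xs.max()
    return (xs - lo) / (hi - lo + 1e-12)

# e.g. chaotic mutation factors for a differential-evolution run
print(ikeda_sequence(10))
```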

18.
Three aspects of the application of the jackknife technique to ridge regression are considered, viz. as a bias estimator, as a variance estimator, and as an indicator of observations influence on parameter estimates. The ridge parameter is considered non-stochastic. The jackknifed ridge estimator is found to be a ridge estimator with a smaller value on the ridge parameter. Hence it has a smaller bias but a larger variance than the ridge estimator. The variance estimator is expected to be robust against heteroscedastic error variance as well as against outliers. A measure of observations influence on the estimates of regression parameters is proposed.
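All three uses can be probed with the generic delete-one jackknife applied to the ridge estimator for a fixed, non-stochastic ridge parameter k, as in the paper's setting; the sketch below uses the textbook jackknife formulas, not the paper's closed-form expressions.

```python
import numpy as np

def ridge(X, y, k):
    return np.linalg.solve(X.T @ X + k * np.eye(X.shape[1]), X.T @ y)

def jackknife_ridge(X, y, k):
    """Delete-one jackknife for the ridge estimator (k non-stochastic):
    bias_hat = (n-1) (mean of leave-one-out estimates - full estimate),
    var_hat  = (n-1)/n * sum of squared deviations of the LOO estimates."""
    n = len(y)
    full = ridge(X, y, k)
    loo = np.array([ridge(np.delete(X, i, 0), np.delete(y, i), k)
                    for i in range(n)])
    bias = (n - 1) * (loo.mean(0) - full)
    var = (n - 1) / n * ((loo - loo.mean(0)) ** 2).sum(0)
    # a large shift ||loo[i] - full|| flags observation i as influential
    influence = np.linalg.norm(loo - full, axis=1)
    return full - bias, bias, var, influence   # bias-corrected estimate first

rng = np.random.default_rng(6)
X = rng.standard_normal((30, 4))
y = X @ np.array([2.0, 0.0, -1.0, 0.5]) + 0.3 * rng.standard_normal(30)
est, bias, var, infl = jackknife_ridge(X, y, k=1.0)
print(est, bias, var.round(4), infl.argmax())
```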

19.
Kernel selection is one of the key issues both in recent research and application of kernel methods. This is usually done by minimizing either an estimate of generalization error or some other related performance measure. Use of notions of stability to estimate the generalization error has attracted much attention in recent years. Unfortunately, the existing notions of stability, proposed to derive the theoretical generalization error bounds, are difficult to use for kernel selection in practice. It is well known that the kernel matrix contains most of the information needed by kernel methods, and the eigenvalues play an important role in the kernel matrix. Therefore, we aim at introducing a new notion of stability, called the spectral perturbation stability, to study the kernel selection problem. This proposed stability quantifies the spectral perturbation of the kernel matrix with respect to the changes in the training set. We establish the connection between the spectral perturbation stability and the generalization error. By minimizing the derived generalization error bound, we propose a new kernel selection criterion that can guarantee good generalization properties. In our criterion, the perturbation of the eigenvalues of the kernel matrix is efficiently computed by solving the derivative of a newly defined generalized kernel matrix. Both theoretical analysis and experimental results demonstrate that our criterion is sound and effective.
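The quantity itself is easy to probe empirically: remove one training point and measure how the kernel matrix spectrum moves. The sketch below does exactly that for each point; this is a direct empirical reading of the notion, not the paper's derivative-based computation on its generalized kernel matrix.

```python
import numpy as np

def spectrum(K):
    return np.sort(np.linalg.eigvalsh(K))[::-1]

def spectral_perturbation(X, gamma=0.5):
    """For each training point, the l2 distance between the (zero-padded)
    eigenvalue spectra of the full Gaussian kernel matrix and the matrix with
    that point removed: an empirical proxy for spectral perturbation
    stability."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-gamma * d2)
    ev = spectrum(K)
    out = []
    for i in range(len(X)):
        Ki = np.delete(np.delete(K, i, 0), i, 1)
        evi = np.append(spectrum(Ki), 0.0)   # pad the missing eigenvalue
        out.append(np.linalg.norm(ev - evi))
    return np.array(out)

rng = np.random.default_rng(7)
X = rng.standard_normal((25, 2))
pert = spectral_perturbation(X)
print(pert.mean(), pert.argmax())  # smaller mean perturbation = more stable
```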

20.
A Weighted Support Vector Machine Algorithm with Automatic Parameter Selection
When the numbers of samples in different classes are imbalanced, the training errors of the C-SVM classification algorithm are biased toward the class with fewer samples. When duplicate samples appear in the training set, each copy is treated as a new sample and recomputed, which lengthens training. This paper analyzes the causes of both problems and proposes a weighted support vector machine algorithm that compensates for the adverse effect of class imbalance and speeds up decisions on duplicate samples. To improve generalization performance, a genetic algorithm is introduced into model training to automatically select two parameters, the penalty factor and the kernel width. Experimental results show that the algorithm effectively handles the class imbalance and duplicate sample problems, and that the trained model has good generalization performance.
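Both fixes can be reproduced with standard tooling: fold duplicate rows into multiplicity weights and let class weights compensate for imbalance. The sketch below does this with scikit-learn's SVC; the paper's genetic-algorithm tuning of the penalty factor and kernel width is replaced here by fixed values.

```python
import numpy as np
from sklearn.svm import SVC

def fit_weighted_svm(X, y, C=1.0, gamma=0.5):
    """Collapse duplicate samples into multiplicity weights and balance the
    classes; C and gamma are fixed here, whereas the paper tunes both with a
    genetic algorithm."""
    Xy = np.column_stack([X, y])
    uniq, counts = np.unique(Xy, axis=0, return_counts=True)
    Xu, yu = uniq[:, :-1], uniq[:, -1].astype(int)
    # class_weight='balanced' scales the penalty inversely to class size;
    # sample_weight lets one stored copy stand for all of its duplicates
    clf = SVC(C=C, gamma=gamma, class_weight="balanced")
    clf.fit(Xu, yu, sample_weight=counts.astype(float))
    return clf

rng = np.random.default_rng(8)
X = np.round(rng.standard_normal((300, 2)), 1)   # rounding creates duplicates
y = (X[:, 0] + X[:, 1] > 0.8).astype(int)        # imbalanced labels
clf = fit_weighted_svm(X, y)
X_test = np.round(rng.standard_normal((100, 2)), 1)
y_test = (X_test[:, 0] + X_test[:, 1] > 0.8).astype(int)
print(clf.score(X_test, y_test))
```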
