Similar literature
1.
A new criterion based on a Jackknife or a Bootstrap statistic is proposed for identifying non-parsimonious dynamic models (FIR, ARX). It is applicable for selecting the number of components in latent variable regression methods or the constraining parameter in regularized least squares regression methods. These meta-parameters are used to overcome the ill-conditioning caused by model over-parameterization when models are fitted using prediction error or least squares methods. In all cases studied, using PLS for parameter estimation, the proposed criterion led to the selection of better models, in the mean square error sense, than selection via cross-validation. The methodology also provides approximate confidence intervals for the model parameters and for the step and impulse responses of the system.

2.
The paper introduces an efficient construction algorithm for obtaining sparse linear-in-the-weights regression models based on an approach of directly optimizing model generalization capability. This is achieved by utilizing the delete-1 cross validation concept and the associated leave-one-out test error, also known as the predicted residual sums of squares (PRESS) statistic, without resorting to any other validation data set for model evaluation in the model construction process. Computational efficiency is ensured using an orthogonal forward regression, but the algorithm incrementally minimizes the PRESS statistic instead of the usual sum of the squared training errors. A local regularization method can naturally be incorporated into the model selection procedure to further enforce model sparsity. The proposed algorithm is fully automatic, and the user is not required to specify any criterion to terminate the model construction procedure. Comparisons with some of the existing state-of-the-art modeling methods are given, and several examples are included to demonstrate the ability of the proposed algorithm to effectively construct sparse models that generalize well.
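
The reason PRESS needs no separate validation set can be illustrated with a minimal sketch. It assumes plain ordinary least squares on a straight line rather than the orthogonal forward regression described in the abstract, and all function names are hypothetical. For OLS, the leave-one-out residual equals the ordinary residual divided by one minus the leverage, so a single fit suffices.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b = sxy / sxx
    return y_mean - b * x_mean, b


def press_shortcut(xs, ys):
    """PRESS from one fit: sum of (e_i / (1 - h_i))^2."""
    n = len(xs)
    x_mean = sum(xs) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    a, b = fit_line(xs, ys)
    total = 0.0
    for x, y in zip(xs, ys):
        residual = y - (a + b * x)
        # Leverage of point i for a line with intercept.
        leverage = 1.0 / n + (x - x_mean) ** 2 / sxx
        total += (residual / (1.0 - leverage)) ** 2
    return total


def press_refit(xs, ys):
    """PRESS by brute force: refit with each point left out in turn."""
    total = 0.0
    for i in range(len(xs)):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        total += (ys[i] - (a + b * xs[i])) ** 2
    return total
```

Both functions should agree up to rounding; the shortcut avoids refitting the model once per sample.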

3.
Testing the presence of serial correlation in the error terms in fixed effects regression models is important for many reasons. This paper proposes portmanteau tests based on the sum of the squares of autocorrelation estimators. This approach is a direct extension of the Box–Pierce or Ljung–Box test from single time series to panel data settings. In fixed effects regression analysis, we may estimate the autocorrelations using the within-group autocorrelations of the residuals. However, the within-group autocorrelations may be severely biased when the length of the time series is not very large compared with the cross-sectional sample size, as a result of the incidental parameters problem. We overcome this problem by using asymptotically unbiased autocorrelation estimators for long panel data recently proposed by the author. Monte Carlo simulations reveal that the proposed tests have good size properties and are powerful against a wide range of alternatives.
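
For orientation, the single-series statistics that the abstract extends can be sketched as follows. This is only the textbook Box–Pierce and Ljung–Box form for one time series, not the panel-data test or the bias-corrected estimators of the paper, and the function names are illustrative.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a series at the given positive lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((v - mean) ** 2 for v in series)
    numer = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    )
    return numer / denom


def box_pierce(series, max_lag):
    """Box-Pierce statistic: n * sum of squared autocorrelations."""
    n = len(series)
    return n * sum(
        autocorrelation(series, k) ** 2 for k in range(1, max_lag + 1)
    )


def ljung_box(series, max_lag):
    """Ljung-Box statistic: n(n+2) * sum of r_k^2 / (n - k)."""
    n = len(series)
    return n * (n + 2) * sum(
        autocorrelation(series, k) ** 2 / (n - k)
        for k in range(1, max_lag + 1)
    )
```

Under the null hypothesis of no serial correlation, both statistics are conventionally compared with a chi-squared distribution whose degrees of freedom equal `max_lag`.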

4.
In this paper, a new class of simplified low-cost analog artificial neural networks with on-chip adaptive learning algorithms is proposed for solving linear systems of algebraic equations in real time. The proposed learning algorithms for linear least squares (LS), total least squares (TLS) and data least squares (DLS) problems can be considered as modifications and extensions of well-known algorithms: the row-action projection (Kaczmarz) algorithm and/or the LMS (Adaline) Widrow-Hoff algorithms. The algorithms can be applied to any problem which can be formulated as a linear regression problem. The correctness and high performance of the proposed neural networks are illustrated by extensive computer simulation results.
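
The row-action projection idea mentioned above can be sketched in a few lines. This is the classical cyclic Kaczmarz iteration for a consistent linear system, written in software rather than as the analog network of the paper; the function name and the fixed sweep count are assumptions made for illustration.

```python
def kaczmarz(rows, rhs, sweeps=200):
    """Cyclic Kaczmarz iteration for the linear system rows @ x = rhs.

    Each step projects the current estimate onto the hyperplane
    defined by one equation a . x = b.
    """
    x = [0.0] * len(rows[0])
    for _ in range(sweeps):
        for a, b in zip(rows, rhs):
            norm_sq = sum(v * v for v in a)
            if norm_sq == 0.0:
                continue  # skip empty rows, they carry no constraint
            step = (b - sum(ai * xi for ai, xi in zip(a, x))) / norm_sq
            x = [xi + step * ai for xi, ai in zip(x, a)]
    return x
```

For example, `kaczmarz([[2.0, 1.0], [1.0, 3.0]], [5.0, 10.0])` approaches the solution `[1.0, 3.0]`. The LMS (Widrow-Hoff) rule has the same update shape but replaces the exact `1 / norm_sq` scaling with a small fixed learning rate.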

5.
Multiple outliers are frequently encountered in regression models used in business, economics, engineering and applied studies. The ordinary least squares (OLS) estimator fails even in the presence of a single outlying observation. To overcome this problem, a class of high breakdown robust estimators (insensitive to outliers up to 50% of the data sample) has been introduced as an alternative to least squares regression. Among them, the Penalized Trimmed Squares (PTS) is a reasonable high breakdown estimator. This estimator is defined by the minimization of an objective function in which a penalty cost for deleting an outlier is added, which serves as an upper bound on the residual error for any feasible regression line. Since the PTS does not require presetting the number of outliers to delete from the data set, it has better efficiency than other estimators. However, small outliers remain influential, causing bias in the regression line. In this work we present a new class of regression estimates called generalized PTS (GPTS). The new GPTS estimator is defined as the PTS but with penalties suitable for bounding the influence function of all observations. We show with some numerical examples and a Monte Carlo simulation study that the generalized PTS estimate performs very well in terms of both robustness and efficiency.

6.
The requirement for low data rate voice transmission has resulted in a large number of algorithms being proposed for speech digitization at data rates of 2.4–4 kilobits/sec. Many of the proposed algorithms are quite complicated and have their origin in disciplines generally considered to be outside of the realm of the speech researcher or communication system designer. Additionally, the algorithms have been developed and presented in highly varying notation using various theoretical approaches. The result is a confusing array of equations, algorithms, and numerical analysis procedures. It is the goal of this paper to alleviate this problem by providing a unified tutorial development of the various algorithms used and proposed for speech data compression. Classical least squares estimation theory is used as the focal point of the discussion since it forms the basis for several of the more familiar speech digitization algorithms. The remainder of the algorithms, whether they have their basis in stochastic estimation theory or statistical regression theory, are related back to the more familiar least squares approach. The speech digitization techniques discussed are the covariance method, the autocorrelation method, the PARCOR method, a priori analysis, the sequential least squares method, the Kalman filter approach, the stochastic approximation method, and the general linear regression model. An effort has been made to provide sufficient theoretical background to establish the algorithm relationships without stressing mathematical rigor.

7.
Most researchers are familiar with ordinary multiple regression models, most commonly fitted using the method of least squares. The method of Buckley and James (J. Buckley, I. James, Linear regression with censored data, Biometrika 66 (1979) 429-436) is an extension of least squares for fitting multiple regression models when the response variable is right-censored, as in the analysis of survival time data. The Buckley-James method has been shown to have good statistical properties under the usual regularity conditions (T.L. Lai, Z. Ying, Large sample theory of a modified Buckley-James estimator for regression analysis with censored data, Ann. Stat. 19 (1991) 1370-1402). Nevertheless, even after 20 years of its existence, it is almost never used in practice. We believe that this is mainly due to a lack of software, and we describe here an S-Plus program that, through its inclusion in a public domain function library, fully exploits the power of the S-Plus programming environment. This environment provides multiple facilities for model specification, diagnostics, statistical inference, and graphical depiction of the model fit.

8.
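Zuo Xiangdong (左向东), Wang Kun (王坤), Qiu Hui (邱辉). Computer Science (《计算机科学》), 2016, 43(2): 140-143
Sensors are mainly used to monitor the external environment, but when a sensor fails the monitoring results become erroneous. To improve the fault tolerance of the system when sensors fail, a fault-tolerant regression model for sensed data is proposed. First, two linear regression models, least squares and ridge regression, are analyzed together with their relevant statistics. Next, the relevant statistics of the system when some sensors fail are analyzed, and on this basis upper and lower bounds on the covariate matrix are derived. Finally, a fault index is defined from the covariate matrix, and the optimization model is transformed into the problem of simultaneously minimizing the fault index and the mean squared error. Experiments show that the proposed fault-tolerant regression model has a smaller prediction error than conventional least squares and ridge regression, and is therefore more robust when sensors fail.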

9.
Neural nets' usefulness for forecasting is limited by problems of overfitting and the lack of rigorous procedures for model identification, selection and adequacy testing. This paper describes a methodology for neural model misspecification testing. We introduce a generalization of the Durbin-Watson statistic for neural regression and discuss the general issues of misspecification testing using residual analysis. We derive a generalized influence matrix for neural estimators which enables us to evaluate the distribution of the statistic. We deploy Monte Carlo simulation to compare the power of the test for neural and linear regressors. While residual testing is not a sufficient condition for model adequacy, it is nevertheless a necessary condition to demonstrate that the model is a good approximation to the data generating process, particularly as neural-network estimation procedures are susceptible to partial convergence. The work is also an important step toward developing rigorous procedures for neural model identification, selection and adequacy testing, which have started to appear in the literature. We demonstrate its applicability in the nontrivial problem of forecasting implied volatility innovations using high-frequency stock index options. Each step of the model building process is validated using statistical tests to verify variable significance and model adequacy, with the results confirming the presence of nonlinear relationships in implied volatility innovations.
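
As a point of reference, the classical Durbin-Watson statistic that the abstract generalizes can be computed directly from a residual sequence. The sketch below covers only this standard form, not the neural-regression generalization or its distribution, and the function name is illustrative.

```python
def durbin_watson(residuals):
    """Classical Durbin-Watson statistic of a residual sequence.

    Values near 2 suggest no first-order autocorrelation, values near 0
    suggest positive autocorrelation, and values near 4 suggest
    negative autocorrelation.
    """
    numer = sum(
        (residuals[t] - residuals[t - 1]) ** 2
        for t in range(1, len(residuals))
    )
    denom = sum(e * e for e in residuals)
    return numer / denom
```

A perfectly alternating residual sequence pushes the statistic toward 4, while a constant one gives 0.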

10.
This paper introduces a new robust nonlinear identification algorithm using the predicted residual sums of squares (PRESS) statistic and forward regression. The major contribution is to compute the PRESS statistic within the framework of a forward orthogonalization process and hence construct a model with a good generalization property. Based on the properties of the PRESS statistic, the proposed algorithm can achieve a fully automated procedure without resorting to any other validation data set for iterative model evaluation.

11.
In this correspondence, new robust nonlinear model construction algorithms for a large class of linear-in-the-parameters models are introduced to enhance model robustness via combined parameter regularization and new robust structural selection criteria. In parallel with parameter regularization, we use two classes of robust model selection criteria based on either experimental design criteria that optimize model adequacy, or the predicted residual sums of squares (PRESS) statistic that optimizes model generalization capability. Three robust identification algorithms are introduced: combined A-optimality and combined D-optimality, each with the regularized orthogonal least squares algorithm, and the combined PRESS statistic with the regularized orthogonal least squares algorithm. A common characteristic of these algorithms is that the inherent computational efficiency associated with the orthogonalization scheme in orthogonal least squares or regularized orthogonal least squares has been extended such that the new algorithms are computationally efficient. Numerical examples are included to demonstrate the effectiveness of the algorithms.

12.
The author proposes a simple model selection procedure based on IV estimators, which can be viewed as an extension of P. Stoica's method (1981). The model selection procedure is based on a simple statistic with a known limiting distribution. It closely parallels the overfitting strategy applied to models estimated by nonlinear least squares.

13.
Automatica, 1987, 23(2): 203-208
Current engineering practice for adaptive control schemes is to base the design on globally convergent schemes for simple plant models. An important class of such schemes uses least squares estimation of assumed simple input-output models and constructs the controller using the parameter estimates. This paper studies the robustness of such schemes to the presence of unmodelled plant coloured noise. Such noise is sometimes an adequate model for unmodelled plant dynamics. The theory of the paper makes a connection between the least squares parameter error equations and those associated with extended least squares using a posteriori noise estimates, for which there are known global convergence results. For the case of adaptive minimum variance control of minimum phase plants, this connection permits stronger convergence results than those hitherto derived from the theory of extended least squares based on a priori noise estimates.

14.
When sampling minimal subsets for robust parameter estimation, it is commonly known that obtaining an all-inlier minimal subset is not sufficient; the points therein should also have a large spatial extent. This paper investigates a theoretical basis behind this principle, based on a little-known result which expresses the least squares regression as a weighted linear combination of all possible minimal subset estimates. It turns out that the weight of a minimal subset estimate is directly related to the span of the associated points. We then derive an analogous result for total least squares which, unlike ordinary least squares, corrects for errors in both dependent and independent variables. We establish the relevance of our result to computer vision by relating total least squares to geometric estimation techniques. As practical contributions, we elaborate why naive distance-based sampling fails as a strategy to maximise the span of the all-inlier minimal subsets produced. In addition, we propose a novel method which, unlike previous methods, can consciously target all-inlier minimal subsets with large spans.
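
The weighted-combination result can be checked numerically in the simplest setting, a straight-line fit, where every minimal subset is a pair of points. The sketch below assumes ordinary least squares only (not the total least squares extension) and uses hypothetical function names. Each pair's weight is the squared horizontal distance between its two points, so widely spread pairs dominate the fit.

```python
from itertools import combinations


def ols_line(points):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(points)
    x_mean = sum(x for x, _ in points) / n
    y_mean = sum(y for _, y in points) / n
    sxx = sum((x - x_mean) ** 2 for x, _ in points)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in points)
    b = sxy / sxx
    return y_mean - b * x_mean, b


def pairwise_combination(points):
    """Weighted average of the lines through every pair of points.

    The weight of a pair is (x2 - x1)^2, i.e. its squared span.
    """
    total_weight = 0.0
    a_sum = 0.0
    b_sum = 0.0
    for (x1, y1), (x2, y2) in combinations(points, 2):
        weight = (x2 - x1) ** 2
        if weight == 0.0:
            continue  # vertical pair: zero weight, no line estimate
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        a_sum += weight * intercept
        b_sum += weight * slope
        total_weight += weight
    return a_sum / total_weight, b_sum / total_weight
```

Both functions should return the same intercept and slope up to rounding, which makes concrete why a minimal subset with a small span contributes little to the least squares estimate.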

15.
The continuum regression technique provides an appealing regression framework connecting ordinary least squares, partial least squares and principal component regression in one family. It offers some insight into the underlying regression model for a given application. Moreover, it helps to provide a deep understanding of various regression techniques. Despite this useful framework, however, the current development of continuum regression covers only linear regression. In many applications, nonlinear regression is necessary. The extension of continuum regression from linear models to nonlinear models using kernel learning is considered. The proposed kernel continuum regression technique is quite general and can handle very flexible regression model estimation. An efficient algorithm is developed for fast implementation. Numerical examples demonstrate the usefulness of the proposed technique.

16.
Partial least squares and principal components regression are commonly used regularized regression methods which use derived components instead of the original predictors. The components are derived from the estimated variance-covariance matrix, and the regression is run using least squares. Therefore, they are not robust, and a few outliers may have drastic effects on the obtained results. These regression methods are robustified by using the BACON algorithm, which provides robust measures for both dispersion and regression. The proposed methods are illustrated by examples, and their properties are investigated using both real data and simulation experiments.

17.
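Ordinal regression is a special machine learning paradigm whose goal is to classify patterns using the intrinsic ordering of the class labels. Although many ordinal learning methods have been proposed, their performance is often limited by the scarcity of training samples. Borrowing the recently proposed idea of marginalized corrupted features, this work compensates for the limited samples by applying random perturbations with noise of known distribution to the inputs of the training samples and controllable perturbations with deterministic bias to their outputs. On the basis of least squares ordinal regression, it then develops least squares ordinal regression using doubly corrupted features (LSOR-DCF). Experimental results show that LSOR-DCF outperforms both the unperturbed method and methods that perturb only the inputs or only the outputs, and that the advantage is especially pronounced on small data sets.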

18.
This paper deals with the asymptotic properties of the least squares estimators for fuzzy linear regression models with fuzzy triangular input-output and random error terms. The asymptotic normality and strong consistency of the fuzzy least squares estimator (FLSE) are investigated; a confidence region based on a class of FLSEs is proposed; the asymptotic relative efficiency of FLSEs with respect to the crisp least squares estimators is also provided and a numerical example is given. Some simulation results are also presented to illustrate the behavior of FLSEs.

19.
This article describes research related to sampling techniques for establishing linear relations between land surface parameters and remotely-sensed data. Predictive relations are estimated between percentage tree cover in a savanna environment and a normalized difference vegetation index (NDVI) derived from the Thematic Mapper sensor. Spatial autocorrelation in original measurements and regression residuals is examined using semi-variogram analysis at several spatial resolutions. Sampling schemes are then tested to examine the effects of autocorrelation on predictive linear models in cases of small sample sizes. Regression models between image and ground data are affected by the spatial resolution of analysis. Reducing the influence of spatial autocorrelation by enforcing minimum distances between samples may also improve empirical models which relate ground parameters to satellite data.
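
The vegetation index used as the predictor above has a simple closed form. The sketch below shows the standard NDVI definition from near-infrared and red reflectance; the function name is illustrative, and the choice to raise an error for a zero denominator is an assumption, not something taken from the article.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index: (NIR - Red) / (NIR + Red).

    Inputs are reflectance values for the near-infrared and red bands.
    The result lies in [-1, 1] for non-negative inputs.
    """
    total = nir + red
    if total == 0.0:
        raise ValueError("NDVI is undefined when NIR + Red is zero")
    return (nir - red) / total
```

Dense green vegetation reflects strongly in the near-infrared and weakly in the red, so it yields values closer to 1, which is why NDVI is a natural regressor for percentage tree cover.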

20.
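Building on improvements to conventional data mining techniques, several mathematical methods, including uniform experimental design, grey relational analysis, stepwise-regression variable screening, threshold-free stepwise regression, and nonlinear partial least squares, are applied in an experimental study of flame-retardant formulation systems for flexible polyvinyl chloride. The results show that grey relational ranking and grey dominance analysis are effective for formulation design and analysis, and that the improved modeling method can, to some extent, avoid the fitting error caused by small samples. Through analysis of mathematical models for five indices of the formulation system (oxygen index, smoke density, tensile strength, elongation, and heat release rate), the interactions among the additives and their effects on the various properties of the system are explored in depth.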
