Similar literature
 Found 20 similar documents; search took 15 ms
1.
Multivariate time series may contain outliers of different types. In the presence of such outliers, applying standard multivariate time series techniques becomes unreliable. A robust version of multivariate exponential smoothing is proposed. The method is affine equivariant, and involves the selection of a smoothing parameter matrix by minimizing a robust loss function. It is shown that the robust method results in much better forecasts than the classic approach in the presence of outliers, and performs similarly when the data contain no outliers. Moreover, the robust procedure yields an estimator of the smoothing parameter less subject to downward bias. As a byproduct, a cleaned version of the time series is obtained, as is illustrated by means of a real data example.

2.
Machine learning methods provide a powerful approach for analyzing longitudinal data in which repeated measurements are observed for a subject over time. We boost multivariate trees to fit a novel flexible semi-nonparametric marginal model for longitudinal data. In this model, features are assumed to be nonparametric, while feature-time interactions are modeled semi-nonparametrically utilizing P-splines with an estimated smoothing parameter. In order to avoid overfitting, we describe a relatively simple in-sample cross-validation method which can be used to estimate the optimal boosting iteration and which has the surprising added benefit of stabilizing certain parameter estimates. Our new multivariate tree boosting method is shown to be highly flexible, robust to covariance misspecification and unbalanced designs, and resistant to overfitting in high dimensions. Feature selection can be used to identify important features and feature-time interactions. An application to longitudinal data of forced 1-second lung expiratory volume (FEV1) for lung transplant patients identifies an important feature-time interaction and illustrates the ease with which our method can find complex relationships in longitudinal data.

3.
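Wang Li (王丽), Wang Wenjian (王文剑), Jiang Gaoxia (姜高霞). 《计算机科学》 (Computer Science), 2015, 42(9): 226-229, 234
Functionalization of data is the foundation of functional data analysis (FDA) and the key step that distinguishes it from other analysis methods. Data fitting, the main approach to functionalization, can usually be cast as an optimization problem over a loss function plus a regularization term, in which the smoothing parameter trades off the loss against the risk of overfitting. Among the methods for choosing the smoothing parameter, generalized cross-validation (GCV) is a general and effective one. However, GCV is evaluated at discrete values, so obtaining an accurate smoothing parameter still requires a large amount of computation. To address this problem, two solution strategies, fitting-based optimization and differencing, are proposed to improve the efficiency of finding the optimal smoothing parameter, and they are compared in terms of accuracy and efficiency. Experiments on simulated and real data show that, compared with the commonly used grid method, both strategies are considerably more efficient with almost identical accuracy. In addition, the differencing strategy is slightly more accurate than the fitting-based optimization strategy, while the fitting-based optimization strategy is more efficient.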
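The grid method that this abstract uses as its baseline can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the smoother is assumed to be a second-difference penalized least-squares fit (minimize ||y - f||^2 + lam * ||D f||^2), and the names `second_diff_penalty`, `invert`, `gcv_score` and `select_lambda` are hypothetical. The score computed is the standard GCV criterion n * RSS / (n - tr(H))^2, where H = (I + lam * D^T D)^(-1) is the hat matrix.

```python
def second_diff_penalty(n):
    """Return the dense n x n matrix D^T D, where D takes second differences."""
    P = [[0.0] * n for _ in range(n)]
    for i in range(n - 2):
        row = {i: 1.0, i + 1: -2.0, i + 2: 1.0}  # one row of D
        for a, va in row.items():
            for b, vb in row.items():
                P[a][b] += va * vb
    return P


def invert(A):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + [1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [v / piv for v in M[c]]
        for r in range(n):
            if r != c and M[r][c] != 0.0:
                f = M[r][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [row[n:] for row in M]


def gcv_score(y, lam, P):
    """Return (GCV score, fitted values) for smoothing parameter lam > 0."""
    n = len(y)
    A = [[(1.0 if i == j else 0.0) + lam * P[i][j] for j in range(n)] for i in range(n)]
    H = invert(A)  # hat matrix of the penalized fit
    fit = [sum(H[i][j] * y[j] for j in range(n)) for i in range(n)]
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fit))
    trace = sum(H[i][i] for i in range(n))
    return n * rss / (n - trace) ** 2, fit


def select_lambda(y, grid):
    """Grid method: evaluate GCV at each candidate and keep the minimizer."""
    P = second_diff_penalty(len(y))
    return min(grid, key=lambda lam: gcv_score(y, lam, P)[0])
```

Every candidate in the grid costs one matrix inversion here, which is the computational burden the abstract refers to; a finer grid gives a more accurate smoothing parameter at proportionally higher cost. The grid must contain only positive values, since lam = 0 makes the denominator zero.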

4.
Current gene intensity-dependent normalization methods, based on regression smoothing techniques, usually approach the two problems of reducing location bias and data rescaling without taking into account the censoring that is characteristic of certain gene expressions, produced by experimental measurement constraints or by previous normalization steps. Moreover, control of normalization procedures for balancing bias versus variance is often left to the user’s experience. An approximate maximum likelihood procedure is presented for fitting a model that smooths the dependence of log-fold gene expression differences on average gene intensities. Central tendency and scaling factor are modeled by means of the B-spline smoothing technique. As an alternative to outlier theory and robust methods, the approach presented looks for suitable distributional models, possibly generalizing the classical Gaussian and Laplacian assumptions, controlling for different types of censoring. The Bayesian information criterion is adopted for model selection. Distributional assumptions are tested using goodness-of-fit statistics and Monte Carlo evaluation. Randomization quantiles are proposed to produce normally distributed adjusted data. Three publicly available data sets are analyzed for demonstration purposes. Student’s t error models give the best performance in all of the data sets considered. More validating evidence is needed to evaluate the Asymmetric Laplace distribution, which showed interesting results in one data set.

5.
Common simplifications of the bandwidth matrix cannot be applied to existing kernels for density estimation with compositional data. In this paper, kernel density estimation methods are modified on the basis of recent developments in compositional data analysis and bandwidth matrix selection theory. The isometric log-ratio normal kernel is used to define a new estimator in which the smoothing parameter is chosen from the most general class of bandwidth matrices on the basis of a recently proposed plug-in algorithm. Simulated and real examples illustrate the behaviour of our approach and show the advantage of the new estimator over existing methods.

6.
The functional coefficient regression models assume that the regression coefficients vary with some “threshold” variable, providing appreciable flexibility in capturing the underlying dynamics in data and avoiding the so-called “curse of dimensionality” in multivariate nonparametric estimation. We first investigate the estimation, inference, and forecasting for the functional coefficient regression models with dependent observations via penalized splines. The P-spline approach, as a direct ridge regression shrinkage type global smoothing method, is computationally efficient and stable. With established fixed-knot asymptotics, inference is readily available. Exact inference can be obtained for a fixed smoothing parameter λ, which is most appealing for finite samples. Our penalized spline approach gives an explicit model expression, which also enables multi-step-ahead forecasting via simulations. Furthermore, we examine different methods of choosing the important smoothing parameter λ: modified multi-fold cross-validation (MCV), generalized cross-validation (GCV), and an extension of empirical bias bandwidth selection (EBBS) to P-splines. In addition, we implement smoothing parameter selection using a mixed model framework through restricted maximum likelihood (REML) for P-spline functional coefficient regression models with independent observations. The P-spline approach also easily allows different smoothness for different functional coefficients, which is enabled by assigning a different penalty λ to each coefficient. We demonstrate the proposed approach with both simulation examples and a real data application.

7.
This paper considers the problem of estimating curve and surface functions when the structures of an unknown function vary spatially. Classical approaches such as using smoothing splines, which are controlled by a single smoothing parameter, are inefficient in estimating the underlying function that consists of different spatial structures. In this paper, we propose a blockwise method of fitting smoothing splines wherein the smoothing parameter λ varies spatially, in order to accommodate possible spatial nonhomogeneity of the regression function. A key feature of the proposed blockwise method is the parameterization of a smoothing parameter function λ(x) that produces a continuous spatially adaptive fit over the entire range of design points. The proposed parameterization requires two important ingredients: (1) a blocking scheme that divides the data into several blocks according to the degree of spatial variation of the data; and (2) a method for choosing smoothing parameters of blocks. We propose a block selection approach that is based on the adaptive thinning algorithm and a choice of smoothing parameters that minimize a newly defined blockwise risk. The results obtained from numerical experiments validate the effectiveness of the proposed method.

8.
This work presents a new approach for the analysis of convex minimization-based edge-preserving image smoothing and the parameter selection therein. The global solution, that is, the response of a convex smoothing model to the ideal step edge, is derived in closed form. By analyzing the closed-form solution, insights are drawn into how the optimal solution responds to edges in the data and how the parameter values affect the resultant edges in the solution. Based on this, a scheme is proposed for selecting parameters to achieve desirable responses at edges. The theoretical results are substantiated by experiments.

9.
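An improved support vector regression model for fair curve fitting   Total citations: 2 (self-citations: 1, citations by others: 1)
The reconstruction of fair (smooth) curves in geometric reverse engineering is essentially a regression problem. Support vector regression is a new and very effective method for solving regression problems. This paper studies the use of support vector regression for reconstructing fair curves. Because this task places special requirements on fairness, existing support vector machines are not suitable. The support vector machine is therefore modified by adjusting the penalty factor: according to the distribution of the measured data points, the penalty factor for each point is determined from the circle rate (圆率) property of that point, which achieves a fair reconstruction of free-form curves. Numerical experiments show that the new method removes the influence of unfair points in the input data and, for a given accuracy, approximates the curve effectively, giving a good fit.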

10.
Most dimension reduction methods based on nonparametric smoothing are highly sensitive to outliers and to data coming from heavy-tailed distributions. Two recently proposed methods, minimum average variance estimation and outer product of gradients, can be, and are, made robust in a way that preserves all advantages of the original approach. Their extension based on local one-step M-estimators is sufficiently robust to outliers and to data from heavy-tailed distributions, is relatively easy to implement, and, surprisingly, performs as well as the original methods when applied to normally distributed data.

11.
Multi-layer perceptron artificial neural networks are used extensively in hydrological and water resources modelling. However, a significant limitation with their application is that it is difficult to determine the optimal model structure. General regression neural networks (GRNNs) overcome this limitation, as their model structure is fixed. However, there has been limited investigation into the best way to estimate the parameters of GRNNs within water resources applications. In order to address this shortcoming, the performance of nine different estimation methods for the GRNN smoothing parameter is assessed in terms of accuracy and computational efficiency for a number of synthetic and measured data sets with distinct properties. Of these methods, five are based on bandwidth estimators used in kernel density estimation, and four are based on single and multivariable calibration strategies. In total, 5674 GRNN models are developed and preliminary guidelines for the selection of GRNN parameter estimation methods are provided and tested.
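To make the role of the GRNN smoothing parameter concrete, the following minimal sketch treats a one-input GRNN as a Gaussian kernel-weighted average of the training targets and calibrates a single smoothing parameter by leave-one-out error. It is an illustration only: the abstract does not say which nine estimation methods were compared, so leave-one-out calibration is an assumption, and the function names `grnn_predict`, `loo_error` and `select_sigma` are hypothetical.

```python
import math


def grnn_predict(x_train, y_train, x, sigma):
    """Predict at x as a Gaussian kernel-weighted mean of the training targets."""
    weights = [math.exp(-((x - xi) ** 2) / (2.0 * sigma ** 2)) for xi in x_train]
    total = sum(weights)
    if total == 0.0:
        # Every kernel underflowed to zero: fall back to the nearest training point.
        nearest = min(range(len(x_train)), key=lambda i: abs(x - x_train[i]))
        return y_train[nearest]
    return sum(w * yi for w, yi in zip(weights, y_train)) / total


def loo_error(x_train, y_train, sigma):
    """Mean squared leave-one-out prediction error for a given sigma."""
    n = len(x_train)
    err = 0.0
    for i in range(n):
        xs = x_train[:i] + x_train[i + 1:]
        ys = y_train[:i] + y_train[i + 1:]
        err += (y_train[i] - grnn_predict(xs, ys, x_train[i], sigma)) ** 2
    return err / n


def select_sigma(x_train, y_train, candidates):
    """Single-parameter calibration: keep the candidate with the lowest error."""
    return min(candidates, key=lambda s: loo_error(x_train, y_train, s))
```

Because the network structure is fixed, sigma is the only quantity to tune in this sketch: a very small sigma reproduces the nearest training target, while a very large sigma approaches the overall mean of the targets.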

12.
Sub-pixel mapping is a process to provide the spatial distributions of land cover classes with finer spatial resolution than the size of a remotely sensed image pixel. Traditional Markov random field-based sub-pixel mapping (MRF_SPM) adopts a fixed smoothing parameter estimated based on the entire image to balance the spatial and spectral energies. However, the spectra of the remotely sensed pixels are always spatially variable. Adopting a fixed smoothing parameter disregards the local properties provided by each pixel spectrum, and may lead to insufficient smoothing in homogeneous regions and, at the same time, over-smoothing at class boundaries. This article proposes a spatially adaptive parameter selection method for the MRF_SPM model to overcome the limitation of the fixed parameter. As pixel class proportions are indicators of the type and proportion of land cover classes within each coarse pixel, in the proposed method, fraction images providing pixel class proportions as local properties of each pixel spectrum are employed to constrain the smoothing parameter. Consequently, the smoothing parameter is spatially adaptive to each pixel spectrum of the remotely sensed image. Synthetic images and IKONOS multi-spectral images were employed. Results showed that, compared with the hard classification method and the non-spatially adaptive MRF_SPM adopting a fixed smoothing parameter, the spatially adaptive MRF_SPM with the smoothing parameter constrained to each pixel spectrum yielded sub-pixel maps not only with higher accuracy but also with shapes and boundaries visually closer to the reference map.

13.
For a smoothing spline or general penalized spline model, the smoothing parameter can be estimated using residual maximum likelihood (REML) methods by expressing the spline in the form of a mixed model. The possibility of bimodality in the profile log-likelihood function for the smoothing parameter of these penalized spline mixed models is demonstrated. A canonical transformation into independent observations is used to provide efficient evaluation of the log-likelihood function and gives insight into the incompatibilities between the model and data that cause bimodality. This transformation can also be used to assess the influence of different frequency components in the data on the estimated smoothing parameter. It is demonstrated that, where bimodality occurs in the log-likelihood, Bayesian penalized spline models may show poor mixing in MCMC chains and be sensitive to the choice of prior distributions for variance components.

14.
In kernel discriminant analysis, it is common practice to select the smoothing parameter (bandwidth) based on the training data and use it for classifying all unlabeled observations. But this method of selecting a single scale of smoothing ignores the major issue of model uncertainty. Moreover, in addition to depending on the training sample, a good choice of bandwidth may also depend on the observation to be classified, and a fixed level of smoothing may not work well in all parts of the measurement space. So, instead of using a single smoothing parameter, it may be more useful in practice to study classification results for multiple scales of smoothing and judiciously aggregate them to arrive at the final decision. This paper adopts a Bayesian approach to carry out one such multiscale analysis using a probabilistic framework. This framework also helps us to extend our multiscale method to semi-supervised classification, where, in addition to the training sample, one uses unlabeled test set observations to form the decision rule. Some well-known benchmark data sets are analyzed to show the utility of the proposed methods.

15.
The significance of detection and classification of power quality (PQ) events that disturb the voltage and/or current waveforms in electrical power distribution networks is well known. However, despite the large number of research reports in this area, the selection of useful features from the existing feature set and the parameter selection for specific classifiers have thus far not been explored. The choice of a smoothing parameter for a probabilistic neural network (PNN) classifier in the training process, together with feature selection, will significantly impact the classification accuracy. In this work, a thorough analysis is carried out using two wrapper-based optimization techniques (the genetic algorithm and simulated annealing) to identify the ensemble of well-known features obtained using the discrete wavelet transform, together with the smoothing parameter selection of the PNN classifier. As a result of these analyses, a proper smoothing parameter and a more useful feature set, chosen from among a wider set of features, are obtained for the PNN classifier, with improved classification accuracy. Furthermore, the results show that simulated annealing performs better than the genetic algorithm for feature selection and parameter optimization in Power Quality Data Mining.

16.
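The focus is on fusing rich character features as disambiguation knowledge to improve classification performance, and on introducing an inequality smoothing technique to overcome the data sparsity problem. The inequality smoothing technique also embeds feature selection in the parameter estimation process, which significantly reduces the model size.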

17.
Quality-of-Service (QoS) is an important concept for service selection and user satisfaction in cloud computing. So far, service recommendation in the cloud has been done by means of QoS, ranking and rating techniques. The ranking methods perform much better than the rating methods. Ranking methods aim to predict QoS rankings directly and as accurately as possible, but most of them employ an individual QoS value alone to predict the cloud rank. In this paper, we propose a correlated QoS ranking algorithm, combined with a data smoothing technique and with QoS, to predict a personalized ranking for service selection by an active user. Experiments are conducted on a WSDream-QoS dataset, including 300 distributed users and 500 real-world web services from all over the world. Six different correlated QoS ranking schemes are proposed and evaluated. The experimental results show that this approach improves the accuracy of ranking prediction compared with a ranking prediction framework using a single QoS parameter.

18.
Exponential procedures are widely used as forecasting techniques for inventory control and business planning. A number of modifications to the generalized exponential smoothing (Holt-Winters) approach to forecasting univariate time series are presented, which have been adapted into a tool for decision support systems. This methodology unifies the phases of estimation and model selection into a single optimization framework, which permits the identification of robust solutions. The procedure can provide forecasts from different versions of exponential smoothing by fitting the updated formulas of Holt-Winters, and it selects the best method using a fuzzy multicriteria approach. The elements of the set of local minima of the non-linear programming problems allow us to build the membership functions of the conflicting objectives. The procedure is compared with other forecasting methods on the 111 series from the M-competition.

19.
A number of methods have been proposed to estimate the period of a variable star; e.g., a recent approach uses smoothing spline regression to fit tentative periodic functions (light curves) and selects the period minimizing a robust goodness-of-fit criterion. These methods assume that measurement errors vary independently over time. Empirical evidence, however, indicates substantial temporal dependence, possibly related to changes in observing conditions. Dependence complicates the period analysis in several respects: selection of a “best” period among several local optima, estimation of the light curve, and evaluation of uncertainty about period and light curve estimates. This article presents methods designed to accommodate dependent errors. An analysis of several data sets shows that the proposed approach can produce substantially different and arguably better results compared with other methods.

20.
Particle filters for state and parameter estimation in batch processes   Total citations: 2 (self-citations: 0, citations by others: 2)
In process engineering, on-line state and parameter estimation is a key component in the modelling of batch processes. However, when state and/or measurement functions are highly non-linear and the posterior probability of the state is non-Gaussian, conventional filters, such as the extended Kalman filter, do not provide satisfactory results. This paper proposes an alternative approach whereby particle filters based on the sequential Monte Carlo method are used for the estimation task. Particle filters are initially described prior to discussing some implementation issues, including degeneracy, the selection of the importance density and the number of particles. A kernel smoothing approach is introduced for the robust estimation of unknown and time-varying model parameters. The effectiveness of particle filters is demonstrated through application to a benchmark batch polymerization process and the results are compared with the extended Kalman filter.
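The sequential Monte Carlo idea described in this abstract can be sketched with a bootstrap particle filter, in which the importance density is simply the state transition model and resampling at every step counters weight degeneracy. This is a hedged illustration, not the paper's method: the scalar model x_t = a * x_(t-1) + process noise, y_t = x_t + measurement noise, the default parameter values and the function name `bootstrap_particle_filter` are all assumptions made for the example, and the kernel smoothing step for unknown parameters is not included.

```python
import math
import random


def bootstrap_particle_filter(observations, n_particles=500, a=0.9,
                              q_std=0.5, r_std=0.5, seed=0):
    """Return the filtered state mean after each observation.

    Assumed model (illustrative only):
        x_t = a * x_{t-1} + N(0, q_std^2)
        y_t = x_t + N(0, r_std^2)
    """
    rng = random.Random(seed)
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    estimates = []
    for y in observations:
        # Propagate each particle through the state transition (importance density).
        particles = [a * p + rng.gauss(0.0, q_std) for p in particles]
        # Weight each particle by the Gaussian measurement likelihood.
        weights = [math.exp(-0.5 * ((y - p) / r_std) ** 2) for p in particles]
        total = sum(weights)
        if total == 0.0:
            # All likelihoods underflowed: fall back to uniform weights.
            weights = [1.0 / n_particles] * n_particles
        else:
            weights = [w / total for w in weights]
        # Weighted mean is the state estimate for this time step.
        estimates.append(sum(w * p for w, p in zip(weights, particles)))
        # Multinomial resampling at every step to counter degeneracy.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates
```

Resampling at every step is the simplest choice; a common refinement is to resample only when the effective sample size drops below a threshold, which requires carrying the weights forward between steps. The number of particles trades accuracy against computation, which is one of the implementation issues the abstract lists.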


Copyright©北京勤云科技发展有限公司  京ICP备09084417号