Similar literature
 Found 20 similar documents (search time: 46 ms)
1.
In this study, a model identification instrument to determine the variance component structure for generalized linear mixed models (GLMMs) is developed based on the conditional Akaike information (CAI). In particular, an asymptotically unbiased estimator of the CAI (denoted as CAICC) is derived as the model selection criterion which takes the estimation uncertainty in the variance component parameters into consideration. The relationship between bias correction and generalized degree of freedom for GLMMs is also explored. Simulation results show that the estimator performs well. The proposed criterion demonstrates a high proportion of correct model identification for GLMMs. Two sets of real data (epilepsy seizure count data and polio incidence data) are used to illustrate the proposed model identification method.

2.
An approximate small sample variance estimator for fixed effects from the multivariate normal linear model, together with appropriate inference tools based on a scaled F pivot, is now well established in practice and there is a growing literature on its properties in a variety of settings. Although effective under linear covariance structures, there are examples of nonlinear structures for which it does not perform as well. The cause of this problem is shown to be a missing term in the underlying Taylor series expansion which accommodates the bias in the estimators of the parameters of the covariance structure. The form of this missing term is derived, and then used to adjust the small sample variance estimator. The behaviour of the resulting estimator is explored in terms of invariance under transformation of the covariance parameters and also using a simulation study. It is seen to perform successfully in the way predicted from its derivation.

3.
The Hurst parameter is the simplest numerical characteristic of self-similar long-range dependent stochastic processes. Such processes have been identified in many natural and man-made systems. In particular, since they were discovered in the Internet and other multimedia telecommunication networks a decade ago, they have been the subject of numerous investigations. Typical quantitative assessment of self-similarity and long-range dependency begins with the estimation of the Hurst parameter H. A number of techniques have been proposed for this. This paper reports results of a comparative analysis of the six most frequently used estimators of H. To set up a credible framework for this, the minimal acceptable sample size is first determined. The Hurst parameter estimators are then compared for bias and variance. Our experimental results have confirmed that the Abry–Veitch Daubechies Wavelet-Based (DWB) and the Whittle ML (Maximum Likelihood) estimators of H are the least biased. However, the latter has significantly smaller variance and can be applied to shorter data samples than the Abry–Veitch DWB estimator. On the other hand, the Abry–Veitch DWB estimator is computationally simpler and faster than the Whittle ML estimator.

4.
The statistical properties of the k-NN estimators are investigated in a design-based framework, avoiding any assumption about the population under study. The issue of coupling remotely sensed digital imagery with data arising from forest inventories conducted using probabilistic sampling schemes is considered. General results are obtained for the k-NN estimator at the pixel level. When averages (or totals) of forest attributes for the whole study area or sub-areas are of interest, the use of the empirical difference estimator is proposed. The estimator is shown to be approximately unbiased with a variance admitting unbiased or conservative estimators. The performance of the empirical difference estimator is evaluated by an extensive simulation study performed on several populations whose dimensions and covariate values are taken from a real case study. Samples are selected from the populations by means of simple random sampling without replacement. Comparisons with the generalized regression estimator and Horvitz-Thompson estimators are also performed. An application to a local forest inventory on a test area of central Italy is considered.
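
As a rough illustration of the idea in the abstract above, the following sketch computes a textbook difference estimator of a population mean under simple random sampling without replacement. It is not the paper's k-NN-based empirical difference estimator: the function name `difference_estimator`, its arguments, and the assumption that a map prediction exists for every population unit are all hypothetical choices made for this example.

```python
from statistics import mean, variance


def difference_estimator(population_predictions, sample_indices, sample_observations):
    """Textbook difference estimator of a population mean under SRSWOR.

    population_predictions: model/map prediction for every unit (assumed available).
    sample_indices: indices of the sampled units (at least two).
    sample_observations: field observations for the sampled units, same order.
    """
    N = len(population_predictions)
    n = len(sample_indices)
    # Residuals between observations and predictions at the sampled units.
    residuals = [y - population_predictions[i]
                 for i, y in zip(sample_indices, sample_observations)]
    # Mean of the predictions, corrected by the mean sampled residual.
    estimate = mean(population_predictions) + mean(residuals)
    # Usual SRSWOR variance estimator with finite population correction.
    var_hat = (1 - n / N) * variance(residuals) / n
    return estimate, var_hat
```

The correction term is what removes the systematic error of the predictions: if the map overpredicts everywhere by the same amount, the mean residual cancels that offset exactly.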

5.
Composite sampling may be used in industrial or environmental settings for the purpose of quality monitoring and regulation, particularly if the cost of testing samples is high relative to the cost of collecting samples. In such settings, it is often of interest to estimate the proportion of individual sampling units in the population that are above or below a given threshold value, C. We consider estimation of a proportion of the form p=P(X>C) from composite sample data, assuming that X follows a three-parameter gamma distribution. The gamma distribution is useful for modeling skewed data, which arise in many applications, and adding a shift parameter to the usual two-parameter gamma distribution also allows the analyst to model a minimum or baseline level of the response. We propose an estimator of p that is based on maximum likelihood estimates of the parameters α, β, and γ, and an associated variance estimator based on the observed information matrix. Theoretical properties of the estimator are briefly discussed, and simulation results are given to assess the performance of the estimator. We illustrate the proposed estimator using an example of composite sample data from the meat products industry.

6.
Several methods for estimating a sample-based discriminant's probability of correct classification are compared with respect to bias, variance, robustness, and computation cost. “Smooth” modification of the counting estimator, or sample success proportion, is recommended to reduce bias and variance while retaining robustness. Also, the “bootstrap” method of Efron (8) can approximately correct an additive estimator's bias using an ancillary computer simulation. In contrast, bias reduction achieved by the popular “leave-one-out” modification of the counting method is vitiated by a corresponding increase in variance.

7.
In multiple testing, a challenging issue is to provide an accurate estimation of the proportion π0 of true null hypotheses among the whole set of tests. Besides a biological interpretation, this parameter is involved in the control of error rates such as the False Discovery Rate. Improving its estimation can result in more powerful/less conservative methods of differential analysis. Various methods for π0 estimation have been previously developed. Most of them rely on the assumption of independent p-values distributed according to a two-component mixture model, with a uniform distribution for null p-values. In a general factor analytic framework, the impact of dependence on the properties of the estimation procedures is first investigated and exact expressions of bias and variance are provided in case of dependent data. A more accurate factor-adjusted estimator of π0 is finally presented, which shows large improvements with respect to the standard procedures.
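
To make the mixture-model assumption concrete, here is a minimal sketch of one widely used estimator of this kind, a Storey-type estimator with a fixed threshold. Under a uniform null, roughly a fraction (1 − λ) of null p-values exceeds λ, so the count above λ can be rescaled into an estimate of π0. This is not the factor-adjusted estimator of the abstract; the function name `storey_pi0` and the default `lam=0.5` are assumptions made for illustration.

```python
def storey_pi0(p_values, lam=0.5):
    """Storey-type estimate of the proportion of true null hypotheses.

    Assumes independent p-values with uniformly distributed nulls, which is
    the assumption the abstract says most standard procedures rely on.
    """
    if not 0.0 <= lam < 1.0:
        raise ValueError("lam must lie in [0, 1)")
    m = len(p_values)
    # P-values above lam are assumed to come almost entirely from true nulls.
    above = sum(1 for p in p_values if p > lam)
    # Rescale by the expected null fraction above lam, and cap at 1.
    return min(1.0, above / (m * (1.0 - lam)))
```

With ten p-values of which four exceed 0.5, the estimate is 4 / (10 × 0.5) = 0.8.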

8.
This paper develops a recursive, convergent estimator for some parameters of Gaussian mixtures. The M class conditional (component) densities of the mixture random variable are Gaussian with known and distinct means and unknown and possibly different variances. A joint estimator of M prior (mixing) probabilities and M class conditional variances is derived. Sufficient conditions on the data and control parameters are derived for the estimator to converge. Convergence of the estimator follows from the use of a stochastic approximation theorem. Techniques to extend the estimators for the case of successive class labels forming a Markov chain are mentioned. The estimator has applications in blind parameter estimation in digital communication with symbol dependent noise variance and in image compression.

9.
The scientific method has been characterized as having two distinct components, Discovery and Justification. Discovery emphasizes ideas and creativity, focuses on conceiving hypotheses and constructing models, and is generally regarded as lacking a formal logic. Justification begins with the hypotheses and models and ends with a valid scientific inference. Unlike Discovery, Justification has a formal logic whose rules must be rigorously followed to produce valid scientific inferences. In particular, when inferences are based on sample data, the rules of the logic of Justification require assessments of bias and precision. Thus, satellite image-based maps that lack such assessments for parameters of populations depicted by the maps may be of little utility for scientific inference; essentially, they may be just pretty pictures. Probability- and model-based approaches are explained, illustrated, and compared for producing inferences for population parameters using a map depicting three land cover classes: non-forest, coniferous forest, and deciduous forest. The maps were constructed using forest inventory data and Landsat imagery. Although a multinomial logistic regression model was used to classify the imagery, the methods for assessing bias and precision can be used with any classification method. For probability-based approaches, the difference estimator was used, and for model-based inference, a bootstrap approach was used.

10.
The Field Estimator for Arbitrary Spaces (FiEstAS) computes the continuous probability density field underlying a given discrete data sample in multiple, non-commensurate dimensions. The algorithm works by constructing a metric-independent tessellation of the data space based on a recursive binary splitting. Individual, data-driven bandwidths are assigned to each point, scaled so that a constant “mass” M0 is enclosed. Kernel density estimation may then be performed for different kernel shapes, and a combination of balloon and sample point estimators is proposed as a compromise between resolution and variance. A bias correction is evaluated for the particular (yet common) case where the density is computed exactly at the locations of the data points rather than at an uncorrelated set of locations. By default, the algorithm combines a top-hat kernel with M0=2.0 with the balloon estimator and applies the corresponding bias correction. These settings are shown to yield reasonable results for a simple test case, a two-dimensional ring, that illustrates the performance for oblique distributions, as well as for a six-dimensional Hernquist sphere, a fairly realistic model of the dynamical structure of stellar bulges in galaxies and dark matter haloes in cosmological N-body simulations. Results for different parameter settings are discussed in order to provide a guideline to select an optimal configuration in other cases. Source code is available upon request.

11.
Estimation of Hurst exponent revisited (total citations: 1; self-citations: 0; cited by others: 1)
In order to estimate the Hurst exponent of long-range dependent time series, numerous estimators, e.g. those based on the rescaled range statistic (R/S) or detrended fluctuation analysis (DFA), are traditionally employed. Motivated by the empirical behaviour of the bias of the R/S estimator, its bias-corrected version is proposed. It has smaller mean squared error than DFA and behaves comparably to the wavelet estimator for traces of size as large as 2^15 drawn from some commonly considered long-range dependent processes. It is also shown that several variants of R/S and DFA estimators are possible depending on the way they are defined and that they differ greatly in their performance.
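
For readers unfamiliar with the rescaled range statistic, the sketch below shows one plain variant of the R/S estimator: compute R/S on non-overlapping blocks of several sizes and take the slope of log(R/S) against log(block size). It is the uncorrected estimator, not the bias-corrected version proposed in the abstract, and as the abstract notes several variants exist; the function names, the block-size choice, and the use of the population standard deviation are assumptions of this example.

```python
import math
import random


def rescaled_range(x):
    """R/S statistic of one block: range of cumulative deviations over the std."""
    n = len(x)
    m = sum(x) / n
    cum = lo = hi = 0.0  # the partial sum starts at 0, which is included in the range
    for v in x:
        cum += v - m
        lo = min(lo, cum)
        hi = max(hi, cum)
    s = math.sqrt(sum((v - m) ** 2 for v in x) / n)  # population std; must be nonzero
    return (hi - lo) / s


def hurst_rs(x, block_sizes):
    """Slope of log(mean R/S) against log(block size) by ordinary least squares."""
    pts = []
    for size in block_sizes:
        blocks = [x[i:i + size] for i in range(0, len(x) - size + 1, size)]
        avg = sum(rescaled_range(b) for b in blocks) / len(blocks)
        pts.append((math.log(size), math.log(avg)))
    mx = sum(p[0] for p in pts) / len(pts)
    my = sum(p[1] for p in pts) / len(pts)
    return (sum((a - mx) * (b - my) for a, b in pts)
            / sum((a - mx) ** 2 for a, _ in pts))


# Usage on simulated white noise, where the true Hurst exponent is 0.5.
random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(1024)]
h_estimate = hurst_rs(noise, [8, 16, 32, 64, 128])
```

On short blocks this uncorrected estimate tends to sit above 0.5 for white noise, which is the kind of bias that motivates a correction.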

12.
In this paper, the k-NN approach is used for the purpose of estimating the multiclass, 1-NN Bayes error bounds. We derive an estimator which is asymptotically unbiased, and whose variance can be controlled by the choice of k. The estimator appears to be very economical in its use of samples, and quite stable even in very small sample cases.

13.
Suppose the random vector (X,Y) satisfies the regression model Y=m(X)+σ(X)ε, where m(⋅) is the conditional mean, σ²(⋅) is the conditional variance, and ε is independent of X. The covariate X is d-dimensional (d≥1), the response Y is one-dimensional, and m and σ are unknown but smooth functions. Goodness-of-fit tests for the parametric form of the error distribution are studied under this model, without assuming any parametric form for m or σ. The proposed tests are based on the difference between a nonparametric estimator of the error distribution and an estimator obtained under the null hypothesis of a parametric model. The large sample properties of the proposed test statistics are obtained, as well as those of the estimator of the parameter vector under the null hypothesis. Finally, the finite sample behavior of the proposed statistics, and the selection of the bandwidths for estimating m and σ are extensively studied via simulations.

14.
Three aspects of the application of the jackknife technique to ridge regression are considered, viz. as a bias estimator, as a variance estimator, and as an indicator of observations' influence on parameter estimates. The ridge parameter is considered non-stochastic. The jackknifed ridge estimator is found to be a ridge estimator with a smaller value of the ridge parameter. Hence it has a smaller bias but a larger variance than the ridge estimator. The variance estimator is expected to be robust against heteroscedastic error variance as well as against outliers. A measure of observations' influence on the estimates of regression parameters is proposed.
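
The mechanics of jackknifing a ridge estimate can be sketched for the simplest case: one predictor, no intercept, and a fixed (non-stochastic) ridge parameter k. The code uses the ordinary delete-one jackknife, which may differ from the exact variant analysed in the paper; the function names `ridge_slope` and `jackknife_ridge` are hypothetical.

```python
def ridge_slope(x, y, k):
    """Ridge estimate of the slope in y = beta * x + error, with fixed k >= 0."""
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    return sxy / (sxx + k)


def jackknife_ridge(x, y, k):
    """Delete-one jackknife for the ridge slope.

    x and y are lists. Returns (bias-corrected estimate, jackknife bias
    estimate, jackknife variance estimate).
    """
    n = len(x)
    full = ridge_slope(x, y, k)
    # Re-estimate the slope with each observation left out in turn.
    loo = [ridge_slope(x[:i] + x[i + 1:], y[:i] + y[i + 1:], k) for i in range(n)]
    loo_mean = sum(loo) / n
    bias = (n - 1) * (loo_mean - full)
    corrected = full - bias
    var = (n - 1) / n * sum((b - loo_mean) ** 2 for b in loo)
    return corrected, bias, var
```

The individual leave-one-out estimates in `loo` also show how much each observation moves the fit, which is the kind of influence information the abstract refers to.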

15.
Land-cover maps are often used to compute land-cover composition (i.e., the proportion or percent of area covered by each class), for each unit in a spatial partition of the region mapped. We derive design-based estimators of mean deviation (MD), mean absolute deviation (MAD), root mean square error (RMSE), and correlation (CORR) to quantify accuracy of land-cover composition for a general two-stage cluster sampling design, and for the special case of simple random sampling without replacement (SRSWOR) at each stage. The bias of the estimators for the two-stage SRSWOR design is evaluated via a simulation study. The estimators of RMSE and CORR have small bias except when sample size is small and the land-cover class is rare. The estimator of MAD is biased for both rare and common land-cover classes except when sample size is large. A general recommendation is that rare land-cover classes require large sample sizes to ensure that the accuracy estimators have small bias.
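
The four accuracy measures named above can be illustrated with their plain, equally weighted sample versions. This sketch assumes that both the map-derived and the reference proportion of one class are known for each sampled unit, and that deviation means map minus reference. It does not implement the design-based estimators for two-stage cluster sampling derived in the paper, and the function name `composition_accuracy` is hypothetical.

```python
import math


def composition_accuracy(map_props, ref_props):
    """Plain sample versions of MD, MAD, RMSE and CORR for one land-cover class."""
    n = len(map_props)
    dev = [m - r for m, r in zip(map_props, ref_props)]  # map minus reference
    md = sum(dev) / n
    mad = sum(abs(d) for d in dev) / n
    rmse = math.sqrt(sum(d * d for d in dev) / n)
    mean_map = sum(map_props) / n
    mean_ref = sum(ref_props) / n
    cov = sum((m - mean_map) * (r - mean_ref) for m, r in zip(map_props, ref_props))
    ss_map = sum((m - mean_map) ** 2 for m in map_props)
    ss_ref = sum((r - mean_ref) ** 2 for r in ref_props)
    corr = cov / math.sqrt(ss_map * ss_ref)  # undefined if either series is constant
    return {"MD": md, "MAD": mad, "RMSE": rmse, "CORR": corr}
```

MD can be close to zero while MAD and RMSE are not, because positive and negative deviations cancel in MD; that is why the measures are reported together.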

16.
Matrix models are often used to model the dynamics of age-structured or size-structured populations. The Usher model is an important particular case that relies on the following hypothesis: between time steps t and t+1, individuals either remain in the same class, move up to the following class, or die. There are then two ways of handling data that do not meet this condition: either remove them prior to data analysis or rectify them. These two ways correspond to two estimators of transition parameters. The former, which corresponds to the classical estimator, is obtained from the latter by a data trimming. The two estimators of transition parameters are compared on the basis of their robustness in order to obtain a criterion of choice between the two estimators. The influence curve of both estimators is first computed, then their gross sensitivity and their asymptotic variance. The untrimmed estimator is more robust than the classical one. Its asymptotic variance can be lower or greater than that of the classical estimator depending on the boundaries used for data trimming. The results are applied to a tropical rain forest in French Guiana, with a discussion on the role of the class width.

17.
The problem of estimating the error probability of a given classification system is considered. Statistical properties of the empirical error count (C) and the average conditional error (R) estimators are studied. It is shown that in the large sample case the R estimator is unbiased and its variance is less than that of the C estimator. In contrast to conventional methods of Bayes error estimation the unbiasedness of the R estimator for a given classifier can be obtained only at the price of an additional set of classified samples. On small test sets the R estimator may be subject to a pessimistic bias caused by the averaging phenomenon characterizing the functioning of conditional error estimators.
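
The difference between the two estimators can be seen in a small sketch. It assumes that posterior class probabilities are available for each test sample and that the classifier picks the class with the largest posterior; both assumptions, and the function names, are illustrative and not taken from the paper.

```python
def error_count_estimate(posteriors, labels):
    """C estimator: fraction of test samples whose predicted class is wrong."""
    errors = 0
    for p, y in zip(posteriors, labels):
        predicted = max(range(len(p)), key=p.__getitem__)  # arg max of the posteriors
        if predicted != y:
            errors += 1
    return errors / len(labels)


def average_conditional_error(posteriors):
    """R estimator: mean conditional error 1 - max_k P(k | x) over the test set."""
    return sum(1.0 - max(p) for p in posteriors) / len(posteriors)
```

C needs the true labels of the test samples, whereas R uses only the posteriors at the test points, so the quality of R depends on how those posteriors were obtained.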

18.
19.
Log periodogram regression is widely applied in empirical applications to estimate the memory parameter, d, of long memory time series. This estimator is consistent for d<1 and pivotal asymptotically normal for d<3/4. However, the asymptotic distribution is a poor approximation of the (unknown) finite sample distribution if the sample size is small. Finite sample improvements in the construction of confidence intervals can be achieved by different nonparametric bootstrap procedures based on the residuals of log periodogram regression. In addition to the basic residual bootstrap, the local and block bootstraps seem adequate for replicating the structure that may arise in the errors of the regression when the series shows weak dependence in addition to long memory. The performances of different bias correcting bootstrap techniques and a bias reduced log periodogram regression are also analyzed with a view to adjusting the bias caused by that structure. Finally, an application to the Nelson and Plosser US macroeconomic data is included.

20.
This paper studies linear minimum variance estimation for discrete-time systems with instantaneous and l-time delayed measurements by using re-organized innovation analysis. A simple approach to the problem is presented. It is shown that the derived estimator involves solving l+1 different standard Kalman filtering problems with the same dimension as the original system.
