
1.

Estimate-based goodness-of-fit test for large sparse multinomial distributions

Sung-Ho Kim Hyemi Choi 《Computational statistics & data analysis》2009,53(4):1122-1131

Pearson’s chi-squared statistic (X²) does not in general follow a chi-square distribution when it is used for goodness-of-fit testing of a multinomial distribution based on sparse contingency table data. We explore properties of the D² statistic of [Zelterman, D., 1987. Goodness-of-fit tests for large sparse multinomial distributions. J. Amer. Statist. Assoc. 82 (398), 624-629] and compare them with those of X². We also compare the power of the goodness-of-fit tests based on D², X², and the statistic L_r proposed by [Maydeu-Olivares, A., Joe, H., 2005. Limited- and full-information estimation and goodness-of-fit testing in 2ⁿ contingency tables: A unified framework. J. Amer. Statist. Assoc. 100 (471), 1009-1020] when the given contingency table is very sparse. We show that the variance of D² is not larger than the variance of X² under null hypotheses where all the cell probabilities are positive, that the distribution of D² becomes more skewed as the multinomial distribution becomes more asymmetric and sparse, and that, as for the L_r statistic, the power of the goodness-of-fit test depends on the models selected for the testing. A simulation experiment strongly supports using both D² and L_r for goodness-of-fit testing with large sparse contingency table data.

2.

A 16-competitive algorithm for hierarchical median problem

WenQiang Dai 《Science China Information Sciences》2014,57(3):1-7

The hierarchical median problem consists of finding a hierarchical assignment function sequence of solutions to the well-known k-median problems with growing cardinality. This sequence is said to be r-competitive if the cost of each solution is at most r times the optimum cost of the median problem with the same cardinality, where r is called the competitive ratio. Previously, the best algorithm available for this problem had competitive ratio 20.71. In this paper an improved algorithm is proposed with competitive ratio 16.

3.

An analytical technique for uni-, bi-, and trimodal paleocurrent data

P.S. Plummer P.I. Leppard 《Computers & Geosciences》1979,5(2):157-172

Accurate paleogeographic reconstructions of sedimentary rock associations require a method of paleocurrent analysis in which each individual current system present can be analyzed separately. Such a method has been developed whereby mixtures of up to three populations in any one paleocurrent distribution can be analyzed individually. For each population the mean direction (u_i), concentration about the mean (k_i), and proportion of that population in the entire distribution (p_i) are determined from the data, along with a chi-square goodness-of-fit test of the model selected.

4.

Investigation and application of Laplace spectra of orgraphs with the ring structure

R. P. Agaev 《Automation and Remote Control》2008,69(2):177-188

The Laplace matrix is a square matrix L = (ℓ_ij) ∈ ℝ^(n×n) in which all nondiagonal elements are nonpositive and all row sums are equal to zero. Each Laplace matrix corresponds to a weighted orgraph with positive arc weights. The paper studies whether the spectrum of the Laplace matrix is real for orgraphs of a special type consisting of two “counter” Hamiltonian cycles, in one of which one or two arcs are removed. Characteristic polynomials of the Laplace matrices of these orgraphs are expressed through polynomials Z_n(x) that can be obtained from Chebyshev second-kind polynomials P_2n(y) by the substitution y² = x. The obtained results relate to properties of the product of Chebyshev second-kind polynomials. A direct method for computing the spectrum of the Laplace circuit matrix is given. The obtained results can be used for computing the number of spanning trees in orgraphs of the studied type. One possible practical application of these results is the investigation of topology and the development of new Internet protocols.
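The definition in this abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: the helper names `laplace_matrix` and `is_laplace_matrix`, the unit arc weights, and the convention that an arc i → j contributes to row i are all assumptions made for illustration. The example builds one orgraph of the kind described (two counter Hamiltonian cycles on four vertices with one arc removed) and checks the two defining properties.

```python
def laplace_matrix(n, arcs):
    """Build a Laplace matrix for a weighted orgraph on vertices 0..n-1.

    arcs maps (i, j) -> positive weight of the arc i -> j.
    Assumed convention (hypothetical): arc i -> j is recorded in row i.
    """
    L = [[0.0] * n for _ in range(n)]
    for (i, j), w in arcs.items():
        if i == j or w <= 0:
            raise ValueError("arcs must join distinct vertices with positive weight")
        L[i][j] -= w   # nondiagonal entries are nonpositive
        L[i][i] += w   # diagonal entry makes the row sum zero
    return L


def is_laplace_matrix(L, tol=1e-12):
    """Check the two properties stated in the abstract."""
    n = len(L)
    off_diagonal_ok = all(
        L[i][j] <= tol for i in range(n) for j in range(n) if i != j
    )
    row_sums_ok = all(abs(sum(row)) <= tol for row in L)
    return off_diagonal_ok and row_sums_ok


# Two "counter" Hamiltonian cycles on 4 vertices, unit weights (assumed).
n = 4
arcs = {(i, (i + 1) % n): 1.0 for i in range(n)}        # forward cycle
arcs.update({((i + 1) % n, i): 1.0 for i in range(n)})  # reverse cycle
del arcs[(0, n - 1)]                                    # remove one arc of the reverse cycle

L = laplace_matrix(n, arcs)
for row in L:
    print(row)
print(is_laplace_matrix(L))
```

The sketch only constructs and validates the matrix; computing its spectrum or the Chebyshev-polynomial expressions discussed in the paper is outside its scope.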

5.

Investigation of some probabilistic characteristics of one class of semi-Markov wandering with delaying screens

T. I. Nasirova B. G. Shamilova 《Automatic Control and Computer Sciences》2014,48(2):109-119

The process of semi-Markov wandering with delaying screens “b” and “a” (a > b > 0) is constructed by the given sequence of independent and identically distributed random vectors (ξ_i, η_i), i ≥ 1. The integral equation for the Laplace transform by time and the Laplace-Stieltjes transform by the phase of its conditional distribution is derived. If the wandering occurs by a complicated Laplace distribution, the ergodic distribution of the process and its moments are found. Then, the integral equation for the generating function of the conditional distribution of the number of process steps at which it first reaches the level a is derived. When the wandering occurs by the simple Laplace distribution, its generating functions and moments are found.

6.

Estimating the parameters of a generalized lambda distribution

B. Fournier N. Rupin M. Bigerelle A. Iost 《Computational statistics & data analysis》2007,51(6):2813-2835

The method of moments is a popular technique for estimating the parameters of a generalized lambda distribution (GLD), but published results suggest that the percentile method gives superior results. However, the percentile method cannot be implemented in an automatic fashion, and automatic methods, like the starship method, can lead to prohibitive execution time with large sample sizes. A new estimation method is proposed that is automatic (it does not require the use of special tables or graphs), and it reduces the computational time. Based partly on the usual percentile method, this new method also requires choosing which quantile u to use when fitting a GLD to data. The choice for u is studied and it is found that the best choice depends on the final goal of the modeling process. The sampling distribution of the new estimator is studied and compared to the sampling distribution of estimators that have been proposed. Naturally, all estimators are biased and here it is found that the bias becomes negligible with sample sizes n ≥ 2×10³. The .025 and .975 quantiles of the sampling distribution are investigated, and the difference between these quantiles is found to decrease proportionally to . The same results hold for the moment and percentile estimates. Finally, the influence of the sample size is studied when a normal distribution is modeled by a GLD. Both bounded and unbounded GLDs are used and the bounded GLD turns out to be the most accurate. Indeed it is shown that, up to n=10⁶, bounded GLD modeling cannot be rejected by usual goodness-of-fit tests.

7.

On the choice of the smoothing parameter for the BHEP goodness-of-fit test

Carlos Tenreiro 《Computational statistics & data analysis》2009,53(4):1038-1053


8.

FITEST: A computer program for “exact chi-square” goodness-of-fit significance tests

H. Charles Romesburg Kim Marshall Timothy P. Mauk 《Computers & Geosciences》1981,7(1):47-58

FITEST, a FORTRAN IV computer program, performs what is termed an exact chi-square test (ECST) to assess the goodness-of-fit between an observed and a theoretical distribution. This test is an alternative to the chi-square and Kolmogorov-Smirnov goodness-of-fit tests. Because it is based on less restrictive assumptions, the ECST may be more appropriate. However, the test imposes a computational burden which, if not handled by an efficiently designed computer algorithm, makes it prohibitively expensive on all but trivial problems. FITEST, through an efficiently designed algorithm, makes an ECST possible for any problem at a reasonable cost.

9.

On testing a subset of regression parameters under heteroskedasticity

Miin-Jye Wen Hubert J. Chen 《Computational statistics & data analysis》2007,51(12):5958-5976

Assuming a general linear model with unknown and possibly unequal normal error variances, the aim is to develop a one-sample procedure for hypothesis tests on all, partial, or a subset of linear functions of the regression parameters. The sampling procedure splits each single sample of size n_i at a controllable regressor's data point into two portions: the first consists of n_i − 1 observations for initial estimation, and the second consists of the remaining observation, which is used in the final estimation to define a weighted sample mean based on all sample observations at each data point. The weighted sample mean then serves as the basis for parameter estimates and test statistics for a general linear regression model. It is found that the distributions of the test statistics based on the weighted sample means are completely independent of the unknown variances. This method can be applied to analysis of variance under various designs of experiments with unequal variances.

10.

MRPP tests in L₁-norm

《Computational statistics & data analysis》1987,5(4):373-380

Multiresponse permutation procedure tests, developed by Mielke, Berry and Johnson, are surveyed. One such rank test in L₁-norm is compared with another for underlying exponential and Laplace populations.

11.

Locally optimal tests for exponential distributions with type-I censoring

Tachen Liang Kun-Cheng Yang 《Computational statistics & data analysis》2008,52(7):3603-3615

This article studies a locally optimal test φ^∗ for testing H₀:θ≥θ₀ versus H₁:θ<θ₀ for the lifetime parameter θ in an exponential distribution based on type-I censored data. Certain properties associated with φ^∗ are addressed. We compare the performance of φ^∗ with that of Spurrier and Wei’s [Spurrier, J.D., Wei, L.J., 1980. A test of the parameter of the exponential distribution in the type-I censoring case. J. Amer. Statist. Assoc. 75, 405-409] test φ_SW, which is based on the MLE of θ. The exact powers and asymptotic powers of φ^∗ and φ_SW are computed. The numerical results indicate that the power of φ^∗ is better than that of φ_SW when θ (0<θ<θ₀) is close to θ₀.

12.

CHITEST: A Monte-Carlo computer program for contingency table tests

H. Charles Romesburg Kim Marshall 《Computers & Geosciences》1985,11(1):69-78

CHITEST is an interactive FORTRAN 77 program that uses Monte-Carlo methods to test the null hypothesis that the row and column factors of an r-by-k contingency table are independent of each other. The program optionally performs the test for two populations: (1) the population of tables that have the same row and column marginal frequencies as the observed table; and (2) the population of tables that have the same total frequency as the observed table. The test does not require the expected cell frequencies to be large values, a requirement necessary for the standard chi-square test of independence to be valid. The program also will test the goodness-of-fit of an empirical distribution to a discrete theoretical distribution, and this test does not require large expected values for the theoretical distribution.
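The idea of a Monte-Carlo independence test over population (1), tables sharing the observed row and column margins, can be sketched in Python as follows. This is not CHITEST's FORTRAN algorithm, which the abstract does not describe; the function names, the use of label shuffling to sample tables with fixed margins, and the default of 2000 simulations are assumptions made for illustration.

```python
import random


def chi_square_stat(table):
    """Pearson chi-square statistic for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    return stat


def monte_carlo_independence(table, n_sim=2000, seed=0):
    """Estimate a p-value by shuffling column labels (margins stay fixed)."""
    rng = random.Random(seed)
    n_rows, n_cols = len(table), len(table[0])
    # Expand the table into one (row label, column label) pair per observation.
    row_labels, col_labels = [], []
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            row_labels += [i] * count
            col_labels += [j] * count
    observed = chi_square_stat(table)
    hits = 0
    for _ in range(n_sim):
        rng.shuffle(col_labels)
        simulated = [[0] * n_cols for _ in range(n_rows)]
        for i, j in zip(row_labels, col_labels):
            simulated[i][j] += 1
        if chi_square_stat(simulated) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (hits + 1) / (n_sim + 1)


stat, p_value = monte_carlo_independence([[3, 0], [0, 3]])
print(stat, p_value)
```

Because the reference distribution is simulated rather than taken from the chi-square approximation, small expected cell frequencies such as those in this 2-by-2 example do not invalidate the test.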

13.

Comparison of some tests of fit for the Laplace distribution

D.J. Best J.C.W. Rayner 《Computational statistics & data analysis》2008,52(12):5338-5343

Tests for the Laplace distribution based on the sample skewness and kurtosis coefficients are shown to be related to components of smooth tests of goodness of fit and are compared with a number of tests including the Anderson-Darling test, a new data-driven smooth test, a new empirical characteristic function based test and a new maximum entropy test. This last would be our slight preference as the test of choice for testing for the Laplace distribution.

14.

Prediction intervals for regression models

David J. Olive 《Computational statistics & data analysis》2007,51(6):3115-3122

This paper presents simple large sample prediction intervals for a future response Y_f given a vector x_f of predictors when the regression model has the form Y_i=m(x_i)+e_i where m is a function of x_i and the errors e_i are iid. Intervals with correct asymptotic coverage and shortest asymptotic length can be made by applying the shorth estimator to the residuals. Since residuals underestimate the errors, finite sample correction factors are needed. As an application, three prediction intervals are given for the least squares multiple linear regression model. The asymptotic coverage and length of these intervals and the classical estimator are derived. The new intervals are useful since the distribution of the errors does not need to be known, and simulations suggest that the large sample theory often provides good approximations for moderate sample sizes.
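The core step described in this abstract, taking the shortest interval that covers a given fraction of the residuals and shifting it by the fitted value, can be sketched in Python. This is a simplified illustration only: the function names are hypothetical, and the finite sample correction factors that the abstract says are needed are deliberately omitted, so the coverage of this sketch will tend to be too low for small samples.

```python
import math


def shortest_interval(values, coverage):
    """Return the shortest interval containing ceil(n * coverage) of the values."""
    if not values or not 0 < coverage <= 1:
        raise ValueError("need at least one value and 0 < coverage <= 1")
    xs = sorted(values)
    n = len(xs)
    k = min(n, math.ceil(n * coverage))  # number of points the interval must cover
    # Slide a window of k consecutive order statistics and keep the narrowest one.
    start = min(range(n - k + 1), key=lambda s: xs[s + k - 1] - xs[s])
    return xs[start], xs[start + k - 1]


def prediction_interval(fitted_value, residuals, coverage=0.95):
    """Shift the shortest residual interval by the fitted value m(x_f).

    No finite sample correction is applied (an omission relative to the paper).
    """
    low, high = shortest_interval(residuals, coverage)
    return fitted_value + low, fitted_value + high


residuals = [-5, -1, 0, 1, 2, 10]
print(shortest_interval(residuals, 0.5))
print(prediction_interval(10, residuals, 0.5))
```

The interval is built from the empirical residuals alone, which reflects the abstract's point that the error distribution does not need to be known.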

15.

Triangular fuzzification of random variables and power of distribution tests: Empirical discussion

Ana Colubi Gil González-Rodríguez 《Computational statistics & data analysis》2007,51(9):4742-4750

A fuzzifying process of finitely valued random variables by means of triangular fuzzy sets is analyzed. Empirical studies show that if the random variable takes on a small number of different values, the one-sample test about the (fuzzy) mean of the fuzzified random variable is frequently more powerful than the classical test about the mean of the original random variable. This empirical conclusion is theoretically supported as follows: whenever the number of different values of a random variable X is up to 4, the mean of the fuzzified random variable captures the whole information on its distribution. As a consequence, the statistical test about the mean of the fuzzified random variable can be considered in fact as a goodness-of-fit test for the original random variable and, analogously, the J-sample test becomes a test for the equality of J distributions. Comparative simulation studies of these procedures with respect to other well-known methods are carried out. A real-life example illustrates the introduced methodology.

16.

The likelihood ratio test for hidden Markov models in two-sample problems

Jörn Dannemann 《Computational statistics & data analysis》2008,52(4):1850-1859

The asymptotic distribution of the likelihood ratio test statistic in two-sample testing problems for hidden Markov models is derived when allowing for unequal sample sizes as well as for different families of state-dependent distributions. In both cases under regularity conditions the limit distribution is a standard χ²-distribution, and in particular does not depend on the ratio of the distinct sample sizes. In a simulation study, the finite sample properties are investigated, and the methodology is illustrated in an application to modeling the movement of Drosophila larvae.

17.

Distribution of busy period for the bulk-service queueing system E_k/M^a,b/1

N.S. Kambo M.L. Chaudhry 《Computers & Operations Research》1984,11(3):267-274

Using the Erlangian technique, the busy-period equations for the single-server bulk-service system E_k/M^a,b/1 are solved to obtain the Laplace transform of the probability density function (pdf) of the busy period. The Laplace transform is expressed in terms of an easily computable real root of the characteristic equation. Expressions for the mean and variance of the busy-period distribution are given. Explicit results for the pdf of the busy period are obtained for the special systems M/M^a,b/1 and E_k/M^a,a/1.

18.

Weighted tardiness for the single machine scheduling problem: An examination of precedence theorem productivity

J.J. Kanet C. Birkemeier 《Computers & Operations Research》2013

Earlier research by Kanet [11] has provided a number of new theorems for deciding precedence between pairs of jobs for 1||Σw_jT_j. The theorems supplant those of Rinnooy Kan, Lageweg, and Lenstra [16]. Presented here are the results of an analysis of the marginal benefit these new theorems provide over the earlier versions of Rinnooy Kan et al. Results show that the new theorems can provide noteworthy improvements in the ability to discover precedence relations between job pairs. For a large set of problem instances the new theorems uncovered up to 8% more precedence relations than the original theorems of Rinnooy Kan et al. The improvement in productivity in discovering precedence relations is shown to depend on the coefficient of variation of the distribution of job weights. A logical application of the theorems is to include them in search procedures and/or heuristic approaches to 1||Σw_jT_j. One such heuristic based on the theorems is provided here, for which the solutions to a large set of sample problems are within 8–12% of the optimum.

19.

A multivariate synthetic double sampling T² control chart

Michael B.C. Khoo Zhang Wu Philippe Castagliola H.C. Lee 《Computers & Industrial Engineering》2013,64(1):179-189

In this article, we propose a multivariate synthetic double sampling T² chart to monitor the mean vector of a multivariate process. The proposed chart combines the double sampling (DS) T² chart and the conforming run length (CRL) chart. On the whole, the proposed chart performs better than its standard counterparts, namely, the Hotelling’s T², DS T², and synthetic T² charts, in terms of the average run length (ARL) and average number of observations to sample (ANOS). The proposed chart also outperforms the multivariate exponentially weighted moving average (MEWMA) chart for moderate and large shifts, but the MEWMA chart is more sensitive to small shifts. For a variable sample size chart, like the synthetic DS T² chart, ANOS is a more meaningful performance measure than ARL. ANOS relates to the actual number of observations sampled, whereas ARL merely counts the number of sampling stages taken. Interpretation based on ARL is more complicated, as either n₁ or n₁ + n₂ observations are taken in each sampling stage.

20.

Generalized Cramér–von Mises goodness-of-fit tests for multivariate distributions

Sung Nok Chiu Kwong Ip Liu 《Computational statistics & data analysis》2009,53(11):3817-3834

A class of statistics for testing the goodness-of-fit for any multivariate continuous distribution is proposed. These statistics consider not only the goodness-of-fit of the joint distribution but also the goodness-of-fit of all marginal distributions, and can be regarded as generalizations of the multivariate Cramér–von Mises statistic. Simulation shows that these generalizations, using the Monte Carlo test procedure to approximate their finite-sample p-values, are more powerful than the multivariate Kolmogorov–Smirnov statistic.