Similar Articles

20 similar articles found.
1.
Triangular contingency tables are a special class of incomplete contingency tables. Association and independence models are used to analyze such tables. This paper presents and compares several methods, including the uniform association model and the quasi-independence model, which can be described in terms of association parameters for the analysis of triangular contingency tables with ordered categories. A computer program is developed for fitting the quasi-independence model under positive (negative) likelihood dependence. The sign test, a nonparametric test of independence against likelihood ratio dependence, is also examined. These methods are applied to data on the disability ratings of stroke patients. The effects of the structural zeros on the results are also discussed.
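As a rough illustration of the quasi-independence model mentioned above (this is not the paper's program), the model m_ij = a_i · b_j can be fitted on the admissible cells of a triangular table by iterative proportional fitting, with the structural zeros held fixed at zero. The table, mask, and function name below are illustrative.

```python
import numpy as np

# Hedged sketch: fit the quasi-independence model on the non-structural-zero
# cells of a triangular table by iterative proportional fitting (IPF).
# Cells where mask == 0 are structural zeros and stay at zero throughout.
def fit_quasi_independence(table, mask, iters=2000):
    m = mask.astype(float)                 # start from 1 on admissible cells
    for _ in range(iters):
        m *= table.sum(1, keepdims=True) / m.sum(1, keepdims=True)  # row fit
        m *= table.sum(0, keepdims=True) / m.sum(0, keepdims=True)  # column fit
    return m

# Upper-triangular 3x3 table; the lower triangle holds structural zeros.
table = np.array([[5.0, 3.0, 2.0],
                  [0.0, 6.0, 4.0],
                  [0.0, 0.0, 7.0]])
mask = np.triu(np.ones((3, 3)))
m = fit_quasi_independence(table, mask)
```

At convergence the fitted values reproduce the observed row and column margins while keeping the structural zeros intact, which is exactly the quasi-independence constraint.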

2.
Nonsymmetric correspondence analysis (NSCA) is designed to analyze two-way contingency tables in which rows and columns play asymmetric roles, e.g., columns depend on rows but not vice versa. A ridge type of regularization is incorporated into several variants of NSCA: ordinary NSCA, and partial and/or constrained NSCA. The regularization has proven useful in obtaining parameter estimates that are, on average, closer to the true population values. An optimal value of the regularization parameter is found by G-fold cross-validation, and the best dimensionality of the solution space is determined by permutation tests. A bootstrap method is used to evaluate the stability of the solution. A small Monte Carlo study and an illustrative example demonstrate the usefulness of the proposed procedures.
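For orientation, ordinary (unregularized) NSCA can be sketched as a singular value decomposition of the weighted, centred row-profile matrix, with columns treated as depending on rows. This is a minimal sketch, not the regularized method of the paper; the function and variable names are illustrative.

```python
import numpy as np

# Hedged sketch of ordinary NSCA: SVD of the centred conditional column
# profiles, weighted by the row masses. The sum of squared singular values
# equals the numerator of the Goodman-Kruskal tau index.
def nsca(N, ndim=2):
    P = N / N.sum()
    r, c = P.sum(1), P.sum(0)             # row and column masses
    Pi = P / r[:, None] - c               # centred conditional column profiles
    U, s, Vt = np.linalg.svd(np.sqrt(r)[:, None] * Pi, full_matrices=False)
    tau_num = float(np.sum(s ** 2))       # numerator of Goodman-Kruskal tau
    rows = (U[:, :ndim] / np.sqrt(r)[:, None]) * s[:ndim]
    cols = Vt[:ndim].T * s[:ndim]
    return rows, cols, tau_num

_, _, tau_indep = nsca(np.array([[10.0, 20.0], [10.0, 20.0]]))  # independence
_, _, tau_dep = nsca(np.array([[30.0, 0.0], [0.0, 30.0]]))      # perfect dependence
```

Under independence the centred profiles vanish and the tau numerator is zero; under perfect dependence it is maximal, so the index measures how much the rows predict the columns.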

3.
Wei Lili, Han Chongzhao. Computer Simulation (《计算机仿真》), 2007, 24(5): 72-74, 106
In recent years, rough set data analysis has become a common approach to qualitative data analysis, and attribute reduction of information systems is a fundamental problem in rough set theory. Although the rough set method requires no information beyond the data themselves, its results consequently lack statistical support. This paper studies attribute reduction of information systems using ideas from nonparametric statistics. The original information system is first organized into contingency tables between any two attribute subsets; a new measure of attribute relevance based on the chi-square statistic is then defined, and a new attribute reduction method for information systems is built on this measure. A numerical example illustrates the feasibility and effectiveness of the method.
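A chi-square-based relevance measure of the kind described above can be sketched as follows (this is not the authors' program; the normalization to Cramér's V and all names are illustrative assumptions):

```python
import numpy as np

# Hedged sketch: chi-square-based relevance between two qualitative
# attributes, computed from their contingency table and normalised to
# [0, 1] via Cramér's V so that attribute pairs are comparable.
def chi2_relevance(x, y):
    xs, xi = np.unique(x, return_inverse=True)
    ys, yi = np.unique(y, return_inverse=True)
    table = np.zeros((len(xs), len(ys)))
    np.add.at(table, (xi, yi), 1.0)                     # observed counts
    n = table.sum()
    expected = np.outer(table.sum(1), table.sum(0)) / n
    chi2 = float(((table - expected) ** 2 / expected).sum())
    v = float(np.sqrt(chi2 / (n * (min(table.shape) - 1))))
    return chi2, v

# Two perfectly dependent attributes from a toy information system.
chi2, v = chi2_relevance(['a', 'a', 'b', 'b', 'a', 'b'],
                         ['p', 'p', 'q', 'q', 'p', 'q'])
```

For perfectly dependent attributes Cramér's V equals 1; attributes with V near 0 would be candidates for removal in a reduction step.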

4.
In this paper, we consider the problem of testing uniform association in cross-classifications with ordered categories, taking as test statistic one from the family proposed by Conde and Salicrú [J. Conde, M. Salicrú, Uniform association in contingency tables associated to Csiszár divergence, Statistics and Probability Letters 37 (1998) 149-154]. We consider two approximations to the null distribution of the test statistics in this family: an estimate of the asymptotic null distribution and a bootstrap estimator. We prove that the two approximations are asymptotically equivalent. To study their finite-sample performance, we carried out two simulation experiments, whose results are presented. The simulations show that the bootstrap estimator behaves much better than the estimated asymptotic null distribution.

5.
Control charts have been widely used for monitoring the functional relationship between a response variable and one or more explanatory variables (called a profile) in various industrial applications. In this article, we propose an easy-to-implement framework for monitoring nonparametric profiles in both Phase I and Phase II of a control chart scheme. The proposed framework includes the following steps: (i) data cleaning; (ii) fitting B-spline models; (iii) resampling for dependent data using a block bootstrap method; (iv) constructing a confidence band based on bootstrap curve depths; and (v) monitoring profiles online based on curve matching. It should be noted that the proposed method does not require any structural assumptions on the data and can appropriately accommodate the dependence structure of the within-profile observations. We illustrate and evaluate the proposed framework using a real data set.

6.
A new method to construct nonparametric prediction intervals for nonlinear time series data is proposed. Within the framework of the recently developed sieve bootstrap, the new approach employs neural network models to approximate the original nonlinear process. The method is flexible and easy to implement as a standard residual bootstrap scheme, while retaining the advantage of being a nonparametric technique. It is model-free within a general class of nonlinear processes and avoids specifying a finite-dimensional model for the data-generating process. The results of a Monte Carlo study are reported to investigate the finite-sample performance of the proposed procedure.

7.
A comprehensive metrics validation methodology is proposed that has six validity criteria, which support the quality functions of assessment, control, and prediction, where quality functions are activities conducted by software organizations to achieve project quality goals. The six criteria are defined and illustrated: association, consistency, discriminative power, tracking, predictability, and repeatability. The author shows that nonparametric statistical methods such as contingency tables play an important role in evaluating metrics against the validity criteria. Examples emphasizing the discriminative-power validity criterion are presented. A metrics validation process is defined that integrates quality factors, metrics, and quality functions.

8.
Nonparametric discriminant analysis
A nonparametric method of discriminant analysis is proposed. It is based on nonparametric extensions of commonly used scatter matrices. Two advantages result from the use of the proposed nonparametric scatter matrices. First, they are generally of full rank, which provides the ability to specify the number of extracted features desired. This is in contrast to parametric discriminant analysis, which for an L-class problem can typically determine at most L − 1 features. Second, the nonparametric nature of the scatter matrices allows the procedure to work well even for non-Gaussian data sets. Using the same basic framework, a procedure is proposed to test the structural similarity of two distributions. The procedure works in high-dimensional space. It specifies a linear decomposition of the original data space in which a relative indication of dissimilarity along each new basis vector is provided. The nonparametric scatter matrices are also used to derive a clustering procedure, which is recognized as a k-nearest-neighbor version of the nonparametric valley-seeking algorithm. The resulting form provides a unified view of the parametric nearest-mean reclassification algorithm and the nonparametric valley-seeking algorithm.

9.
Non-symmetrical correspondence analysis (NSCA) is a useful tool for graphically detecting the asymmetric relationship between two categorical variables. Most of the theory associated with NSCA does not distinguish between a two-way contingency table of ordinal variables and one of nominal variables. Typically, singular value decomposition (SVD) is used in classical NSCA for dimension reduction. A bivariate moment decomposition (BMD) for ordinal variables in contingency tables, using orthogonal polynomials and generalized correlations, is proposed. This method not only takes into account the ordinal nature of the two categorical variables, but also permits the detection of significant association in terms of location, dispersion, and higher-order components.

10.
This paper deals with the multidimensional representation of the dependence between the row and column variables of contingency tables using correspondence analysis. It includes: (1) estimation of the optimal weights that maximize the canonical correlation between two categorical variables by an iterative optimization method; (2) testing the discriminability of the estimated scoring scheme; (3) evaluating the relative contribution of the categories (rows or columns) to each dimension; (4) simultaneous symmetric graphical representation of row and column points. The method is applicable to two-way contingency tables, or to multi-way tables concatenated into two-way tables by merging variables to form interactive ones.

11.
In this paper, a simulation study is presented to analyze the behavior of the family of test statistics proposed by Conde and Salicrú [J. Conde, M. Salicrú, Uniform association in contingency tables associated to Csiszár divergence, Statistics and Probability Letters 37 (1998) 149-154] using the φ-divergence measures, which include as a special case the power-divergence statistics [N. Cressie, T.R.C. Read, Multinomial goodness-of-fit tests, Journal of the Royal Statistical Society, Series B 46 (1984) 440-464], for the analysis of uniform association between two classification processes, based on the local odds ratios. For these test statistics, the significance level and power are evaluated for different sample sizes in a 3 × 2 contingency table.
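The Cressie-Read power-divergence family referenced above can be sketched in a few lines (a minimal sketch, not the paper's code; it assumes all observed counts are positive when taking the λ → 0 limit):

```python
import numpy as np

# Hedged sketch of the Cressie-Read power-divergence statistic:
# lambda = 1 recovers Pearson's chi-square, and the lambda -> 0 limit
# gives the likelihood-ratio statistic G^2.
def power_divergence(observed, expected, lam):
    o = np.asarray(observed, float).ravel()
    e = np.asarray(expected, float).ravel()
    if lam == 0:
        return float(2 * np.sum(o * np.log(o / e)))
    return float(2 / (lam * (lam + 1)) * np.sum(o * ((o / e) ** lam - 1)))

o = [30, 10, 10, 30]
e = [20, 20, 20, 20]                       # expected counts under the null
pearson = power_divergence(o, e, 1)        # equals sum((o - e)^2 / e)
g2 = power_divergence(o, e, 0)             # likelihood-ratio statistic
```

Varying λ sweeps through the members of the family whose size and power the simulation study compares.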

12.
Experimental analysis of the performance of a proposed method is a crucial and necessary task in an investigation. In this paper, we focus on the use of nonparametric statistical inference for analyzing the results obtained in an experimental design in the field of computational intelligence. We present a case study involving a set of techniques for classification tasks, and we study a set of nonparametric procedures useful for analyzing the behavior of a method with respect to a set of algorithms, such as the framework in which a new proposal is developed. In particular, we discuss some basic and advanced nonparametric approaches that improve on the results offered by the Friedman test in some circumstances. A set of post hoc procedures for multiple comparisons is presented, together with the computation of adjusted p-values. We also perform an experimental analysis comparing their power, with the objective of detecting the advantages and disadvantages of the statistical tests described. We found that aspects such as the number of algorithms, the number of data sets, and the differences in performance offered by the control method are very influential in the statistical tests studied. Our final goal is to offer a complete guideline for the use of nonparametric statistical procedures for performing multiple comparisons in experimental studies.
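The Friedman-test-plus-post-hoc workflow surveyed above can be sketched roughly as follows (a hedged illustration, not the authors' code: ties are ignored, the normal approximation is used for pairwise comparisons against a control, and Holm is picked as one example of an adjustment procedure):

```python
import numpy as np
from math import erfc, sqrt

# Hedged sketch: Friedman statistic plus Holm-adjusted p-values for the
# k-1 comparisons against a control method. Rows = data sets,
# columns = algorithms; lower scores rank better; no ties assumed.
def friedman_holm(scores, control=0):
    n, k = scores.shape
    ranks = np.argsort(np.argsort(scores, axis=1), axis=1) + 1.0
    R = ranks.mean(0)                                   # average ranks
    chi2 = 12 * n / (k * (k + 1)) * (np.sum(R ** 2) - k * (k + 1) ** 2 / 4)
    se = sqrt(k * (k + 1) / (6 * n))
    p = np.array([erfc(abs(Rj - R[control]) / se / sqrt(2)) for Rj in R])
    others = sorted((j for j in range(k) if j != control), key=lambda j: p[j])
    adj, running = {}, 0.0                              # Holm step-down
    for i, j in enumerate(others):
        running = max(running, (len(others) - i) * p[j])
        adj[j] = min(1.0, running)
    return R, chi2, adj

scores = np.array([[0.1, 0.2, 0.3]] * 4)   # algorithm 0 always best
R, chi2, adj = friedman_holm(scores)
```

The adjusted p-values in `adj` are what one would report alongside the omnibus Friedman statistic.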

13.
The LISp-Miner system for data mining and knowledge discovery uses the GUHA method to comb through a large database and find 2 × 2 contingency tables that satisfy a certain condition given by generalised quantifiers, thereby suggesting the existence of possible relations between attributes. In this paper, we show how a more detailed interpretation of the data in the tables found by GUHA can be obtained using Bayesian statistical methods. Using a multinomial sampling model and a Dirichlet prior, we derive posterior distributions for parameters that correspond to GUHA generalised quantifiers. Examples are presented illustrating the new Bayesian post-processing tools implemented in LISp-Miner. A statistical model for the analysis of contingency tables for data from two subpopulations is also presented.
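To give the flavour of this multinomial-Dirichlet setup (this is not LISp-Miner's implementation; the uniform prior, the cell counts, and the choice of rule confidence as the quantifier are illustrative assumptions), the posterior of a quantifier such as the confidence a/(a+b) of a rule can be explored by sampling from the Dirichlet posterior over the four cells:

```python
import numpy as np

# Hedged sketch: multinomial sampling model for the four cells (a, b, c, d)
# of a 2x2 table with a Dirichlet(1, 1, 1, 1) prior. The posterior of the
# rule confidence a / (a + b) is summarised by Monte Carlo sampling.
rng = np.random.default_rng(0)
a, b, c, d = 40, 10, 5, 45                  # observed cell counts
post = rng.dirichlet([a + 1, b + 1, c + 1, d + 1], size=20000)
confidence = post[:, 0] / (post[:, 0] + post[:, 1])
mean_conf = confidence.mean()
lo, hi = np.quantile(confidence, [0.025, 0.975])  # 95% credible interval
```

The credible interval gives the kind of "more detailed interpretation" of a found table that the abstract describes, instead of a single point value of the quantifier.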

14.
A new bootstrap test is introduced for assessing the significance of the differences between stochastic algorithms in a cross-validation experimental setup with repeated folds. Intervals are used to model the variability of the data that can be attributed to the repetition of learning and testing stages over the same folds in cross-validation. Numerical experiments are provided that support the following three claims: (1) bootstrap tests can be more powerful than ANOVA or the Friedman test for comparing multiple classifiers; (2) in the presence of outliers, interval-valued bootstrap tests achieve better discrimination between stochastic algorithms than nonparametric tests; (3) choosing between ANOVA, Friedman, and bootstrap tests can produce different conclusions in experiments involving actual data from machine learning tasks.

15.
This note presents a nonparametric sieve bootstrap method for estimating the variance of impulse-response coefficients and of the process steady-state gain determined via correlation analysis. The bootstrap estimates are demonstrated to be better for small samples than the analytical finite-sample variance expression for the simplified form (assuming a white-noise input) of the Wiener-Hopf equations. Monte Carlo simulations demonstrate that solving the linear equations resulting from the Wiener-Hopf equations can result in a variance reduction.
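The correlation-analysis step underlying this note can be sketched as follows (a hedged illustration under a noise-free FIR system with a white-noise input; the system, sample size, and names are illustrative): the impulse-response coefficients solve the Wiener-Hopf normal equations built from sample auto- and cross-covariances.

```python
import numpy as np

# Hedged sketch of correlation analysis: estimate impulse-response
# coefficients by solving the Wiener-Hopf normal equations R_uu h = r_uy.
rng = np.random.default_rng(1)
h_true = np.array([0.0, 0.5, 0.25])
u = rng.standard_normal(5000)                     # white-noise input
y = np.convolve(u, h_true)[: len(u)]              # noise-free FIR output
m, n = len(h_true), len(u)
# sample auto- and cross-covariances at lags 0 .. m-1
r_uu = np.array([u[: n - k] @ u[k:] / n for k in range(m)])
r_uy = np.array([u[: n - k] @ y[k:] / n for k in range(m)])
R = r_uu[np.abs(np.arange(m)[:, None] - np.arange(m)[None, :])]  # Toeplitz
h_hat = np.linalg.solve(R, r_uy)
gain_hat = h_hat.sum()                            # steady-state gain estimate
```

With a truly white input, R is close to the identity and r_uy alone already estimates h; the note's sieve bootstrap would then resample to attach variances to `h_hat` and `gain_hat`.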

16.
This paper covers important developments in the use of computers for quantitative research in cultural anthropology, particularly in areas which (unlike statistics) are uniquely anthropological. These fall into statistical topics and topics in scaling and measurement. By far the largest single use of computers by cultural anthropologists is for statistical summaries of field data and for simple statistical tests, such as the chi-squared test, for the analysis of field data or for cross-cultural studies. As the discipline develops, this situation will remain the same. In fact, the proportion of people who use the computer primarily for contingency tables, frequency counts, and correlation analysis may very well increase, since there are many potential users who would fall into this category and only a few who would perform other operations such as multidimensional scaling or simulation. The few other computer techniques that would be relevant to anthropology, and for which the technology already exists, include linear regression and linear programming, both as practiced by economists, and both of which could be extremely useful in the study of peasant economy. Careful research with such models could dispel some of the controversy which has been hindering the development of economic anthropology for the last fifteen years. The training of anthropologists who can understand the relevance of such models to their work may be far in the future, since the majority of them are still skeptical of most formal methods and of the computers which make them work.

17.
An algorithm for correspondence analysis is described and implemented in SAS/IML (SAS Institute, 1985a). Through the analysis of several biological examples, the technique is shown to supplement the log-linear models approach to the analysis of contingency tables, in both the model identification and model interpretation stages of analysis. A simple two-way contingency table of tumor data is analyzed using correspondence analysis; this example emphasises the relationships between the parameters of the log-linear model for the table and the graphical correspondence analysis results. The technique is also applied to a three-way table of survey data concerning ulcer patients to demonstrate applications of simple correspondence analysis to higher-dimensional tables with fixed margins. Finally, the diets and foraging behaviors of birds of the Hubbard Brook Forest are each analyzed, and a simultaneous display of the two separate but related tables is constructed to highlight relationships between them.

18.
The statistical models and methods for lifetime data mainly deal with continuous nonnegative lifetime distributions. However, discrete lifetimes arise in various common situations where either clock time is not the best scale for measuring lifetime or the lifetime is measured discretely. In most settings involving lifetime data, the population under study is not homogeneous. Mixture models, in particular mixtures of discrete distributions, provide a natural answer to this problem. Nonparametric mixtures of power series distributions are considered, for instance nonparametric mixtures of Poisson laws or of geometric laws. The mixing distribution is estimated by nonparametric maximum likelihood (NPML). The NPML estimator is then used to build estimates and confidence intervals for the hazard rate function of the discrete lifetime distribution. To improve the performance of the confidence intervals, a bootstrap procedure is considered in which the estimated mixture is used for resampling. Various bootstrap confidence intervals are investigated and compared to the confidence intervals obtained directly from the NPML estimates.
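A common computational stand-in for the NPML estimator of a Poisson mixing distribution is EM over a fixed grid of candidate support points; the sketch below illustrates that idea only (it is not the paper's estimator, and the grid, data, and iteration count are illustrative assumptions).

```python
import numpy as np
from math import lgamma

# Hedged sketch: approximate the NPML estimate of a Poisson mixing
# distribution by EM over a fixed grid of support points.
def npml_poisson_weights(x, grid, iters=2000):
    x = np.asarray(x, float)
    logfact = np.array([lgamma(xi + 1) for xi in x])
    # log-likelihood of each observation under each candidate Poisson rate
    ll = x[:, None] * np.log(grid)[None, :] - grid[None, :] - logfact[:, None]
    lik = np.exp(ll)
    w = np.full(len(grid), 1 / len(grid))     # uniform starting weights
    for _ in range(iters):
        post = w * lik
        post /= post.sum(1, keepdims=True)    # E-step: membership probabilities
        w = post.mean(0)                      # M-step: update mixing weights
    return w

x = [1, 2, 2, 3, 1, 2, 0, 4, 2, 3]            # discrete lifetimes, mean 2.0
grid = np.linspace(0.5, 6.0, 12)
w = npml_poisson_weights(x, grid)
```

The estimated weights `w` define the fitted mixture; in the paper's setting, resampling from this fitted mixture would drive the bootstrap confidence intervals for the hazard rate.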

19.
Improving the reliability of bootstrap tests with the fast double bootstrap
Two procedures are proposed for estimating the rejection probabilities (RPs) of bootstrap tests in Monte Carlo experiments without actually computing a bootstrap test for each replication. These procedures are only about twice as expensive (per replication) as estimating RPs for asymptotic tests. Then a new procedure is proposed for computing bootstrap P values that will often be more accurate than ordinary ones. This "fast double bootstrap" (FDB) is closely related to the double bootstrap, but it is far less computationally demanding. Simulation results for three different cases suggest that the FDB can be very useful in practice.

20.
This paper considers the problem of nonparametric comparison of counting processes with panel count data, which arise naturally when recurrent events are considered. For this problem, we construct a new nonparametric test statistic based on the nonparametric maximum likelihood estimator of the mean function of the counting processes over the observation times. The asymptotic distribution of the proposed statistic is derived, and its finite-sample properties are examined through Monte Carlo simulations. The simulation results show that the proposed method works well in practice and is more powerful than existing nonparametric tests based on the nonparametric maximum pseudo-likelihood estimator. A set of panel count data from a floating gallstone study is analyzed and presented as an illustrative example.


Copyright © Beijing Qinyun Technology Development Co., Ltd. (京ICP备09084417号)