Similar Documents
20 similar documents found
1.
Reports an error in "Improved Bonferroni-type multiple testing procedures" by Burt S. Holland and Margaret D. Copenhaver (Psychological Bulletin, 1988[Jul], Vol 104[1], 145-149). An error was made in the author note on page 145. Correspondence should be addressed to Burt S. Holland, Department of Statistics, Temple University, Speakman Hall (006-00), Philadelphia, Pennsylvania 19122. Margaret DiPonzio Copenhaver is now at Merck Sharp & Dohme Research Laboratories, West Point, Pennsylvania. (The following abstract of the original article appeared in record 1988-34705-001.) The Bonferroni multiple comparisons procedure is customarily used when doing several simultaneous tests of significance in relatively nonstandard situations in which other methods do not apply. We review some new and improved competitors to the Bonferroni procedure that, although constraining the generalized Type I error probability to be at most α, afford increased power in exchange for increased complexity in implementation. An improvement to the weighted form of the Bonferroni procedure is also presented. Several data sets are reanalyzed with the new methods. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
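The improvements reviewed here apply the Šidák inequality in a step-down fashion. A minimal sketch of that idea in Python (not necessarily the authors' exact procedure; the p-values are illustrative):

```python
import numpy as np

def sidak_stepdown(pvals):
    """Step-down Sidak adjustment: with p-values sorted ascending, the
    i-th smallest (0-based) is adjusted to 1 - (1 - p)**(m - i).
    A running max keeps the adjusted values monotone."""
    p = np.asarray(pvals, dtype=float)
    m = len(p)
    order = np.argsort(p)
    adj = np.empty(m)
    running_max = 0.0
    for i, idx in enumerate(order):
        # Sidak bound for the m - i hypotheses still in play
        running_max = max(running_max, 1.0 - (1.0 - p[idx]) ** (m - i))
        adj[idx] = min(running_max, 1.0)
    return adj

print(sidak_stepdown([0.004, 0.020, 0.035, 0.60]))  # illustrative p-values
```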

2.
1. A common statistical flaw in articles submitted to or published in biomedical research journals is to test multiple null hypotheses that originate from the results of a single experiment without correcting for the inflated risk of type 1 error (false positive statistical inference) that results from this. Multiple comparison procedures (MCP) are designed to minimize this risk. The present review focuses on pairwise contrasts, the most common sort of multiple comparisons made by biomedical investigators. 2. In an earlier review a variety of MCP were described and evaluated. It was concluded that an effective MCP should control the risk of family-wise type 1 error, so as to limit the risk of falsely rejecting even one hypothesis within a single family. One-step procedures based on the Bonferroni or Sidák inequalities do this. For continuous data and under normal distribution theory, so does the Tukey-Kramer procedure for all possible pairwise contrasts of means and the Dunnett procedure for all possible pairwise contrasts of means with a control mean. 3. There is now a new class of MCP, based on the Bonferroni or Sidák inequalities but performed in a step-wise fashion. The members of this class have certain desirable properties. They: (i) control the family-wise type 1 error rate as effectively as the one-step procedures; (ii) are more powerful than the one-step Bonferroni or Sidák procedures, especially when hypotheses are logically related; and (iii) can be applied not only to continuous data but also to ordinal or categorical data. 4. Of the new step-wise MCP, Holm's step-down procedures are commended for their combination of accuracy, power and versatility. They also have the virtue of simplicity. Given the raw P values that result from conventional tests of significance, the adjustments for multiple comparisons can be made by hand or with a hand-held calculator. 5. Despite the corrective abilities of the new step-wise MCP, investigators should try to design their experiments and analyses to test a single, global hypothesis rather than multiple ones.
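To illustrate the hand-calculation claim, a minimal sketch of Holm's step-down adjustment; the p-values are illustrative:

```python
def holm_adjust(pvals):
    """Holm step-down: sort p ascending, multiply the i-th smallest
    (1-based) by (m - i + 1), and keep the result monotone with a
    running max. Compare adjusted values directly with alpha."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adj = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):          # rank 0 .. m-1
        running_max = max(running_max, (m - rank) * pvals[idx])
        adj[idx] = min(running_max, 1.0)
    return adj

# Reject H_i at family-wise level 0.05 when adj[i] <= 0.05
print(holm_adjust([0.010, 0.015, 0.030, 0.040]))
```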

3.
We compare two approaches to the identification of individual significant outcomes when a comparison of two groups involves multiple outcome variables. The approaches are all designed to control the familywise error rate (FWE) with any subset of the null hypotheses true (in the strong sense). The first approach is initially to use a global test of the overall hypothesis that the groups are equivalent for all variables, followed by an application of the closed testing algorithm of Marcus, Peritz and Gabriel. The global tests considered here are ordinary least squares (OLS), generalized least squares (GLS), an approximation to a likelihood ratio test (ALR), and a new test based on an approximation to the most powerful similar test for simple alternatives. The second approach is that of stepwise testing, which tests the univariate hypotheses in a specific order with appropriate adjustment to the univariate p-values for multiplicity. The stepwise tests considered include both step-down and step-up tests of a general type, as well as permutation tests that incorporate the dependence structure of the data. We illustrate the tests with two examples of birth outcomes: a comparison of cocaine-exposed new-borns to control new-borns on neurobehavioural and physical growth variables, and, in a separate study, a comparison of babies born to diabetic mothers and babies born to non-diabetic mothers on minor malformations. After describing the methods and analysing the birth outcome data, we use simulations on Gaussian data to provide guidelines for the use of these procedures in terms of power and computation.
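The closed testing algorithm of Marcus, Peritz and Gabriel rejects an individual hypothesis only if every intersection hypothesis containing it is rejected by a local α-level test. A minimal sketch for a small family, using a Bonferroni local test as a stand-in for the global tests compared in the paper (with this particular local test, closed testing reduces to Holm's procedure):

```python
from itertools import combinations

def closed_test(pvals, alpha=0.05):
    """Closed testing: reject H_i iff every intersection hypothesis
    containing i is rejected locally. Here the local test for a subset
    S is Bonferroni: reject if min(p_j, j in S) <= alpha / |S|.
    Exponential in the number of hypotheses, so suitable for small m."""
    m = len(pvals)

    def local_reject(subset):
        return min(pvals[j] for j in subset) <= alpha / len(subset)

    rejected = []
    for i in range(m):
        supersets = (s for k in range(1, m + 1)
                     for s in combinations(range(m), k) if i in s)
        if all(local_reject(s) for s in supersets):
            rejected.append(i)
    return rejected

print(closed_test([0.005, 0.011, 0.02, 0.30]))  # illustrative p-values
```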

4.
Derivation of the minimum sample size is an important consideration in an applied research effort. When the outcome is measured at a single time point, sample size procedures are well known and widely applied. The corresponding situation for longitudinal designs, however, is less well developed. In this paper, we adapt the generalized estimating equation (GEE) approach of Liang and Zeger to sample size calculations for discrete and continuous outcome variables. The non-central version of the Wald χ2 test is considered. We use the damped exponential family of correlation structures described in Muñoz et al. for the 'working' correlation matrix among the repeated measures. We present a table of minimum sample sizes for binary outcomes, and discuss extensions that account for unequal allocation, staggered entry and loss to follow-up.
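For a continuous outcome compared between two groups, the GEE calculation has a closed form under the exchangeable member (θ = 0) of the damped exponential family. A minimal sketch under that assumption; the function name and defaults are illustrative, and the paper's tables cover the more general cases:

```python
from math import ceil
from scipy.stats import norm

def n_per_group(delta, sigma, t, rho, alpha=0.05, power=0.80):
    """Approximate n per group for comparing two group means with t
    repeated measures and exchangeable working correlation rho:
        n = 2 * (z_{1-a/2} + z_{1-b})^2 * sigma^2 * (1 + (t-1)*rho)
            / (t * delta^2)
    delta = true mean difference, sigma = residual SD."""
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    return ceil(2 * z**2 * sigma**2 * (1 + (t - 1) * rho) / (t * delta**2))

print(n_per_group(delta=0.5, sigma=1.0, t=4, rho=0.4))
```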

5.
One approach to the analysis of repeated measures data allows researchers to model the covariance structure of the data rather than presume a certain structure, as is the case with conventional univariate and multivariate test statistics. This mixed-model approach was evaluated for testing all possible pairwise differences among repeated measures marginal means in a Between-Subjects × Within-Subjects design. Specifically, the authors investigated Type I error and power rates for a number of simultaneous and stepwise multiple comparison procedures using SAS (1999) PROC MIXED in unbalanced designs when normality and covariance homogeneity assumptions did not hold. J. P. Shaffer's (1986) sequentially rejective step-down and Y. Hochberg's (1988) sequentially acceptive step-up Bonferroni procedures, based on an unstructured covariance structure, had superior Type I error control and power to detect true pairwise differences across the investigated conditions. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
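Hochberg's sequentially acceptive step-up rule is the mirror image of Holm's step-down rule. A minimal sketch with illustrative p-values (all four are rejected at α = .05 here, whereas Holm's step-down rule would reject only the first):

```python
def hochberg_adjust(pvals):
    """Hochberg step-up: traverse p-values from largest to smallest;
    the k-th largest (0-based) gets multiplier k + 1, and a running
    min keeps the adjusted values monotone."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i], reverse=True)
    adj = [0.0] * m
    running_min = 1.0
    for k, idx in enumerate(order):        # k = 0 for the largest p
        running_min = min(running_min, (k + 1) * pvals[idx])
        adj[idx] = running_min
    return adj

print(hochberg_adjust([0.010, 0.020, 0.030, 0.045]))  # illustrative
```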

6.
Hypothesis testing with multiple outcomes requires adjustments to control Type I error inflation, which reduces power to detect significant differences. Maintaining the prechosen Type I error level is challenging when outcomes are correlated. This problem concerns many research areas, including neuropsychological research in which multiple, interrelated assessment measures are common. Standard p value adjustment methods include Bonferroni-, Sidák-, and resampling-class methods. In this report, the authors aimed to develop a multiple hypothesis testing strategy to maximize power while controlling Type I error. The authors conducted a sensitivity analysis, using a neuropsychological dataset, to offer a relative comparison of the methods and a simulation study to compare the robustness of the methods with respect to varying patterns and magnitudes of correlation between outcomes. The results lead them to recommend the Hochberg and Hommel methods (step-up modifications of the Bonferroni method) for mildly correlated outcomes and the step-down minP method (a resampling-based method) for highly correlated outcomes. The authors note caveats regarding the implementation of these methods using available software. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
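The step-down minP method recommended for highly correlated outcomes resamples the joint null distribution of the smallest p-value, so the adjustment automatically reflects the correlation among outcomes. A minimal permutation sketch in the style of Westfall and Young for a two-group comparison; the data and permutation count are illustrative:

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)

def minp_stepdown(x, y, n_perm=2000):
    """Step-down minP adjusted p-values for two-group comparisons on
    k correlated outcomes. x: (n1, k), y: (n2, k). Group labels are
    permuted jointly across outcomes, preserving their correlation."""
    k = x.shape[1]
    p_obs = np.array([ttest_ind(x[:, j], y[:, j]).pvalue for j in range(k)])
    order = np.argsort(p_obs)                 # most significant first
    pooled = np.vstack([x, y])
    n1 = x.shape[0]
    counts = np.zeros(k)
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        pp = np.array([ttest_ind(perm[:n1, j], perm[n1:, j]).pvalue
                       for j in range(k)])
        # step-down: for rank i, compare against the min permutation
        # p-value over the hypotheses at rank i or less significant
        for i, idx in enumerate(order):
            counts[idx] += pp[order[i:]].min() <= p_obs[idx]
    adj = counts / n_perm
    # enforce monotonicity in significance order, then map back
    return np.maximum.accumulate(adj[order])[np.argsort(order)]

x = rng.normal(0.8, 1, size=(20, 3))
y = rng.normal(0.0, 1, size=(20, 3))
print(minp_stepdown(x, y))
```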

7.
The authors investigate the relative power of unweighted and weighted versions of simultaneous and sequential Bonferroni procedures. The unweighted simultaneous procedure is represented by the standard Bonferroni; the unweighted sequential methods are from S. Holm (1979) and G. Hommel (1988); the weighted simultaneous procedure comes from R. Rosenthal and D. B. Rubin (see record 1984-05701-001); and the weighted sequential method is also from S. Holm. These are applied to a complete set of C = 3 orthogonal contrasts defined by 4 treatment groups. The authors present power estimates for small, medium, and large effects in various effect and weighting configurations. They discuss factors affecting the relative power of these methods and present statistical evidence that strengthens a case against weighted procedures built on methodological subjectivity issues. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
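For reference, Holm's weighted sequential method orders hypotheses by p_i/w_i and scales the threshold by the weights still in play at each step. A minimal sketch with illustrative weights and p-values:

```python
def weighted_holm(pvals, weights, alpha=0.05):
    """Holm's (1979) weighted step-down procedure. At each step, pick
    the remaining hypothesis with the smallest p_i / w_i and reject it
    if p_i <= w_i * alpha / sum(w_j over remaining j); otherwise stop."""
    remaining = set(range(len(pvals)))
    rejected = []
    while remaining:
        i = min(remaining, key=lambda j: pvals[j] / weights[j])
        w_sum = sum(weights[j] for j in remaining)
        if pvals[i] <= weights[i] * alpha / w_sum:
            rejected.append(i)
            remaining.remove(i)
        else:
            break
    return rejected

# Illustrative: heavier a priori weight on the first contrast
print(weighted_holm([0.030, 0.012, 0.20], weights=[2.0, 1.0, 1.0]))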

8.
Many reaction time (RT) researchers truncate their data sets, excluding as spurious all RTs falling outside a prespecified range. Such truncation can introduce bias because extreme but valid RTs may be excluded. This article examines biasing effects of truncation under various assumptions about the underlying distributions of valid and spurious RTs. For the mean, median, standard deviation and skewness of RT, truncation bias is larger than some often-studied experimental effects. Truncation can also seriously distort linear relations between RT and an independent variable, additive RT patterns in factorial designs, and hazard functions, but it has little effect on statistical power. A promising maximum likelihood procedure for estimating properties of an untruncated distribution from a truncated sample is reported, and a set of procedures to control for truncation biases when testing hypotheses is appended. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
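The magnitude of the bias is easy to reproduce by simulation. A minimal sketch assuming ex-Gaussian valid RTs (a common RT model; the parameters and cutoffs here are illustrative, not the article's):

```python
import numpy as np

rng = np.random.default_rng(1)

# Ex-Gaussian valid RTs: normal(400, 50) plus exponential(mean 100), in ms
rt = rng.normal(400, 50, 100_000) + rng.exponential(100, 100_000)

# Truncate at a typical prespecified range; the long right tail is valid
# data, so cutting it shifts the mean noticeably
truncated = rt[(rt >= 200) & (rt <= 700)]
print(f"untruncated mean {rt.mean():6.1f}, truncated mean "
      f"{truncated.mean():6.1f}, bias {truncated.mean() - rt.mean():+.1f} ms")
```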

9.
An important, frequent, and unresolved problem in treatment research is deciding how to analyze outcome data when some of the data are missing. After a brief review of alternative procedures and the underlying models on which they are based, an approach is presented for dealing with the most common situation—comparing the outcome results in a 2-group, randomized design in the presence of missing data. The proposed analysis is based on the concept of "modeling our ignorance" by examining all possible outcomes, given a known number of missing results with a binary outcome, and then describing the distribution of those results. This method allows the researcher to define the range of all possible results that could have resulted had the missing data been observed. Extensions to more complex designs are discussed. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
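With a binary outcome, "modeling our ignorance" amounts to enumerating every way the missing results could have come out and reporting the range of conclusions. A minimal sketch for a 2-group design; the counts are illustrative:

```python
from itertools import product
from scipy.stats import fisher_exact

def all_possible_results(s1, f1, m1, s2, f2, m2):
    """Enumerate every completion of the missing binary outcomes and
    return the range of two-sided Fisher exact p-values. Only the
    number of successes among the missing matters, so enumeration is
    over counts. s = observed successes, f = failures, m = missing."""
    ps = []
    for k1, k2 in product(range(m1 + 1), range(m2 + 1)):
        table = [[s1 + k1, f1 + m1 - k1],
                 [s2 + k2, f2 + m2 - k2]]
        ps.append(fisher_exact(table)[1])
    return min(ps), max(ps)

# 18/30 vs 9/30 observed successes, 5 missing outcomes per arm
print(all_possible_results(18, 12, 5, 9, 21, 5))
```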

10.
A procedure offered by Morris, Sherman, and Mansfield (1986) for testing hypotheses about interactions ("moderator variables") is unacceptable. A related proposal by Bobko (1986) is also problematic. One can appropriately test such hypotheses by stepwise regression, comparing an equation in the simple predictors with an equation that also includes the product of their deviation scores. There is, however, reason to believe that methods with greater statistical power can be found. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
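A minimal sketch of the recommended hierarchical test: fit the simple predictors, add the product of their deviation (mean-centered) scores, and test the increment in R² with an F statistic. The simulated data are illustrative:

```python
import numpy as np
from scipy.stats import f as f_dist

rng = np.random.default_rng(2)

def r2(X, y):
    """R-squared from an OLS fit with intercept."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    tss = (y - y.mean()) @ (y - y.mean())
    return 1 - (resid @ resid) / tss

n = 200
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = x1 + 0.5 * x2 + 0.4 * x1 * x2 + rng.normal(size=n)

prod = (x1 - x1.mean()) * (x2 - x2.mean())   # product of deviation scores
r2_base = r2(np.column_stack([x1, x2]), y)
r2_full = r2(np.column_stack([x1, x2, prod]), y)

# F test for the R-squared increment: 1 and n - 4 degrees of freedom
F = (r2_full - r2_base) / ((1 - r2_full) / (n - 4))
print(f"F(1,{n - 4}) = {F:.1f}, p = {f_dist.sf(F, 1, n - 4):.2g}")
```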

11.
W. B. Stiles and D. A. Shapiro (see record 1995-10433-001) argue that trivial correlations between process variables and treatment outcome point to inherent methodological limitations of correlational designs in process-outcome research. In coming to such a far-reaching (erroneous) conclusion, Stiles and Shapiro are throwing out the baby with the bath water. Correlational designs are perfectly appropriate for testing process-outcome correlations if process measures are adequately conceptualized. Examples of case-specific measures of therapist responsiveness are reviewed to illustrate the power of correlational designs. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

12.
This article illustrates the use of structural equation modeling (SEM) procedures with latent variables to analyze data from experimental studies. These procedures allow the researcher to remove the biasing effects of random and correlated measurement error on the outcomes of the experiment and to examine processes that may account for changes in the outcome variables that are observed. Analyses of data from a Project Family study, an experimental intervention project with rural families that strives to improve parenting skills, are presented to illustrate the use of these modeling procedures. Issues that arise in applying SEM procedures, such as sample size and distributional characteristics of the measures, are discussed. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

13.
A general rationale and specific procedures for examining the statistical power characteristics of psychology-of-aging (POA) empirical studies are provided. First, 4 basic ingredients of statistical hypothesis testing are reviewed. Then, 2 measures of effect size are introduced (standardized mean differences and the proportion of variation accounted for by the effect of interest), and methods are given for estimating these measures from already-completed studies. Power and sample size formulas, examples, and discussion are provided for common comparison-of-means designs, including independent samples 1-factor and factorial ANOVA designs, analysis of covariance (ANCOVA) designs, repeated measures (correlated samples) ANOVA designs, and split-plot (combined between- and within-Ss) ANOVA designs. Because of past conceptual differences, special attention is given to the power associated with statistical interactions, and cautions about applying the various procedures are indicated. Illustrative power estimations also are applied to a published study from the literature. POA researchers will be better informed consumers of what they read and more "empowered" with respect to what they research by understanding the important roles played by power and sample size in statistical hypothesis testing. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
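As one concrete instance of these calculations, the power of an independent-samples comparison follows from the standardized mean difference d through the noncentral t distribution. A minimal sketch (two-sided test; the sample sizes are illustrative):

```python
from scipy.stats import nct, t

def power_two_sample_t(d, n_per_group, alpha=0.05):
    """Two-sided power of an independent-samples t test for a
    standardized mean difference d with n per group."""
    df = 2 * n_per_group - 2
    ncp = d * (n_per_group / 2) ** 0.5          # noncentrality parameter
    t_crit = t.ppf(1 - alpha / 2, df)
    return nct.sf(t_crit, df, ncp) + nct.cdf(-t_crit, df, ncp)

for n in (20, 64, 100):                         # d = 0.5, a medium effect
    print(n, round(power_two_sample_t(0.5, n), 3))
```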

14.
There have been strong critiques of the notion that environmental influences can have an important effect on psychological functioning. The substance of these criticisms is considered in order to infer the methodological challenges that have to be met. Concepts of cause and of the testing of causal effects are discussed with a particular focus on the need to consider sample selection and the value (and limitations) of longitudinal data. The designs that may be used to test hypotheses on specific environmental risk mechanisms for psychopathology are discussed in relation to a range of adoption strategies, twin designs, various types of "natural experiments," migration designs, the study of secular change, and intervention designs. In each case, consideration is given to the need for samples that "pull apart" variables that ordinarily go together, specific hypotheses on possible causal processes, and the specification and testing of key assumptions. It is concluded that environmental risk hypotheses can be (and have been) put to the test but that it is usually necessary to use a combination of research strategies. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

15.
We investigated the relative and combined effects of personal and situational variables on job outcomes of new professionals. The personal variables were cognitive ability, socioeconomic status, and career goals; the situational variables were job feedback, autonomy, and job context. Data were collected at two times from 280 newly hired, entry-level accountants at "Big Eight" firms. Both personal and situational variables predict job outcomes, but their relative influence depends on the outcome measure. Situational variables account for the most variance in job performance, job satisfaction, and organizational commitment; personal variables account for the most variance in promotability, internal work motivation, and turnover. The findings indicate that job performance does not take care of itself by selecting bright people, but requires constant vigilance and effective systems. The results also suggest that a given result can be achieved through a variety of behavioral science interventions. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

16.
Since the publication of Loftus and Masson’s (1994) method for computing confidence intervals (CIs) in repeated-measures (RM) designs, there has been uncertainty about how to apply it to particular effects in complex factorial designs. Masson and Loftus (2003) proposed that RM CIs for factorial designs be based on number of observations rather than number of participants. However, determining the correct number of observations for a particular effect can be complicated, given the variety of effects occurring in factorial designs. In this paper the authors define a general “number of observations” principle, explain why it obtains, and provide step-by-step instructions for constructing CIs for various effect types. The authors illustrate these procedures with numerical examples. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
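For the one-factor repeated-measures case, the general principle reduces to a CI half-width of t × sqrt(MS_(S×C)/n), where n is the number of observations entering each condition mean. A minimal sketch of that simplest instance; the simulated data are illustrative:

```python
import numpy as np
from scipy.stats import t

def loftus_masson_ci(data, conf=0.95):
    """Within-subject CI half-width for condition means in a one-factor
    repeated-measures design. data: (n_subjects, n_conditions)."""
    n, c = data.shape
    # Subject x condition interaction MS: residual after removing
    # subject and condition main effects from each cell
    resid = (data - data.mean(axis=1, keepdims=True)
                  - data.mean(axis=0, keepdims=True) + data.mean())
    ms_sxc = (resid ** 2).sum() / ((n - 1) * (c - 1))
    t_crit = t.ppf((1 + conf) / 2, (n - 1) * (c - 1))
    return t_crit * np.sqrt(ms_sxc / n)

rng = np.random.default_rng(3)
d = rng.normal(0, 1, (12, 4)) + np.array([0.0, 0.3, 0.6, 0.9])
print(round(loftus_masson_ci(d), 3))
```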

17.
Repeated measures designs involving nonorthogonal variables are being used with increasing frequency in cognitive psychology. Researchers usually analyze the data from such designs inappropriately, probably because the designs are not discussed in standard textbooks on regression. Two commonly used approaches to analyzing repeated measures designs are considered in this article. It is argued that both approaches use inappropriate error terms for testing the effects of independent variables. A more appropriate analysis is presented, and two alternative computational procedures for the analysis are illustrated. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

18.
D. B. Rubin (1998). Statistics in Medicine, 17(3), 371-385; discussion 387-389.
Standard randomization-based tests of sharp null hypotheses in randomized clinical trials, that is, intent-to-treat analyses, are valid without extraneous assumptions, but generally can be appropriately powerful only with alternative hypotheses that involve treatment assignment having an effect on outcome. In the context of clinical trials with non-compliance, other alternative hypotheses can be more natural. In particular, when a trial is double-blind, it is often reasonable for the alternative hypothesis to exclude any effect of treatment assignment on outcome for a unit unless the assignment affected which treatment that unit actually received. Bayesian analysis under this alternative 'exclusion' hypothesis leads to new estimates of the effect of receipt of treatment, and to a new randomization-based procedure that has frequentist validity yet can be substantially more powerful than the standard intent-to-treat procedure. The key idea is to obtain a p-value using a posterior predictive check distribution, which includes a model for non-compliance behaviour, although only under the standard sharp null hypothesis of no effect of assignment (or receipt) of treatment on outcome. It is important to note that these new procedures are distinctly different from 'as treated' and 'per protocol' analyses, which are not only badly biased in general, but generally have very low power.

19.
In the course of clinical (or preclinical) trial studies, it is a common practice to conduct a relatively large number of tests to extract the maximum level of information from the study. It has been known that as the number of tests (or endpoints) increases, the probability of falsely rejecting at least one hypothesis also increases. Single-step methods such as the Bonferroni, Sidák, or James approximation procedure have been used to adjust the p-values for each hypothesis. To reduce the conservatism (i.e., underestimating type I error) possessed by the aforementioned methods, Holm proposed a so-called "free-step-down" procedure. This adjustment can be made even less conservative by incorporating the dependence structure of endpoints at each adjustment step of the procedure. That is done by sequentially applying James's approximation procedure for correlated endpoints at each step, referred to as the Free-James method. This article primarily compares the power of the Free-James method to the power of the Bonferroni and James single-step-down and the Holm free-step-down methods. Two definitions of power are considered: (a) the probability of correctly rejecting at least one hypothesis when it is true, and (b) the probability of correctly rejecting all hypotheses that are true. Monte Carlo simulations show that the Free-James method is as good as other methods under definition (a) and the most powerful under definition (b) for various sample sizes, numbers of endpoints, and correlations.

20.
Psychologists do not analyze the conceptual relations between their independent and dependent variables. Hence, they fail to recognize that the plausibility of their hypotheses stems from the conceptual relatedness of the variables. The outcome is research that appears to test hypotheses but really tests only procedures, because the hypotheses involve conceptually related variables and are necessarily true. Domains in which this has been demonstrated are discussed. Psychologic is an axiomatic system intended to formulate the psychologically relevant conceptual relationships embedded in language and is an instrument for describing, explaining, predicting, and controlling intrapersonal and interpersonal processes. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
