Similar Literature
20 similar documents retrieved.
1.
Standard least squares analysis of variance methods suffer from poor power under arbitrarily small departures from normality and fail to control the probability of a Type I error when standard assumptions are violated. This article describes a framework for robust estimation and testing that uses trimmed means with an approximate degrees of freedom heteroscedastic statistic for independent and correlated groups designs in order to achieve robustness to the biasing effects of nonnormality and variance heterogeneity. The authors describe a nonparametric bootstrap methodology that can provide improved Type I error control. In addition, the authors indicate how researchers can set robust confidence intervals around a robust effect size parameter estimate. In an online supplement, the authors use several examples to illustrate the application of an SAS program to implement these statistical methods.
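The Yuen-Welch procedure at the heart of this framework is compact enough to sketch. The following Python function is a minimal illustration assuming 20% symmetric trimming; the function name is ours, and the article's SAS program, bootstrap refinement, and robust effect size machinery are not reproduced.

```python
import numpy as np
from scipy import stats

def yuen_welch(x, y, trim=0.2):
    """Yuen-Welch test comparing trimmed means of two independent groups."""
    def pieces(a):
        a = np.sort(np.asarray(a, float))
        n = len(a)
        h = n - 2 * int(np.floor(trim * n))          # effective n after trimming
        tmean = stats.trim_mean(a, trim)             # trimmed mean
        wins = np.asarray(stats.mstats.winsorize(a, limits=(trim, trim)))
        d = np.var(wins, ddof=1) * (n - 1) / (h * (h - 1))   # squared SE piece
        return tmean, d, h
    m1, d1, h1 = pieces(x)
    m2, d2, h2 = pieces(y)
    t = (m1 - m2) / np.sqrt(d1 + d2)
    df = (d1 + d2) ** 2 / (d1 ** 2 / (h1 - 1) + d2 ** 2 / (h2 - 1))  # Welch df
    return t, df, 2 * stats.t.sf(abs(t), df)
```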

2.
Adverse impact evaluations often call for evidence that the disparity between groups in selection rates is statistically significant, and practitioners must choose which test statistic to apply in this situation. To identify the most effective testing procedure, the authors compared several alternate test statistics in terms of Type I error rates and power, focusing on situations with small samples. Significance testing was found to be of limited value because of low power for all tests. Among the alternate test statistics, the widely used Z-test on the difference between two proportions performed reasonably well, except when sample size was extremely small. A test suggested by G. J. G. Upton (1982) provided slightly better control of Type I error under some conditions but generally produced results similar to the Z-test. Use of the Fisher Exact Test and Yates's continuity-corrected chi-square test is not recommended because of overly conservative Type I error rates and substantially lower power than the Z-test.
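The Z-test on the difference between two proportions evaluated here is simple to compute with a pooled standard error. A minimal sketch; the applicant counts in the example are hypothetical.

```python
import numpy as np
from scipy.stats import norm

def two_proportion_z(x1, n1, x2, n2):
    """Pooled Z-test on the difference between two selection rates."""
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)                   # pooled rate under H0
    se = np.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return z, 2 * norm.sf(abs(z))               # two-sided p-value

# e.g., 12 of 60 protected-group vs. 30 of 90 comparison-group applicants selected
z, p = two_proportion_z(12, 60, 30, 90)
```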

3.
Advances in testing the statistical significance of mediation effects.
P. A. Frazier, A. P. Tix, and K. E. Barron (2004) highlighted a normal theory method popularized by R. M. Baron and D. A. Kenny (1986) for testing the statistical significance of indirect effects (i.e., mediator variables) in multiple regression contexts. However, simulation studies suggest that this method lacks statistical power relative to some other approaches. The authors describe an alternative developed by P. E. Shrout and N. Bolger (2002) based on bootstrap resampling methods. An example and step-by-step guide for performing bootstrap mediation analyses are provided. The test of joint significance is also briefly described as an alternative to both the normal theory and bootstrap methods. The relative advantages and disadvantages of each approach in terms of precision in estimating confidence intervals of indirect effects, Type I error, and Type II error are discussed.
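A bare-bones version of the bootstrap approach resamples cases with replacement, re-estimates the a (X to M) and b (M to Y, controlling X) paths by ordinary least squares, and takes percentiles of the resampled a*b products as the confidence interval. The sketch below is illustrative; Shrout and Bolger's bias-corrected interval is not shown.

```python
import numpy as np

def boot_indirect(x, m, y, n_boot=5000, seed=0):
    """Percentile-bootstrap 95% CI for the indirect effect a*b."""
    rng = np.random.default_rng(seed)
    x, m, y = map(np.asarray, (x, m, y))
    n = len(x)

    def ab(idx):
        xi, mi, yi = x[idx], m[idx], y[idx]
        a = np.polyfit(xi, mi, 1)[0]                 # a path: slope of M on X
        X = np.column_stack([np.ones(n), xi, mi])
        b = np.linalg.lstsq(X, yi, rcond=None)[0][2] # b path: slope of Y on M | X
        return a * b

    est = ab(np.arange(n))
    boots = np.array([ab(rng.integers(0, n, n)) for _ in range(n_boot)])
    return est, tuple(np.percentile(boots, [2.5, 97.5]))
```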

4.
A Monte Carlo study compared 14 methods to test the statistical significance of the intervening variable effect. An intervening variable (mediator) transmits the effect of an independent variable to a dependent variable. The commonly used R. M. Baron and D. A. Kenny (1986) approach has low statistical power. Two methods based on the distribution of the product and 2 difference-in-coefficients methods have the most accurate Type I error rates and greatest statistical power except in 1 important case in which Type I error rates are too high. The best balance of Type I error and statistical power across all cases is the test of the joint significance of the two effects comprising the intervening variable effect.
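The joint significance test favored here is the simplest of the 14 to implement: declare an intervening variable effect only when the X-to-M path and the M-to-Y path (controlling X) are each individually significant. A minimal sketch assuming statsmodels:

```python
import numpy as np
import statsmodels.api as sm

def joint_significance(x, m, y, alpha=0.05):
    """Mediation is inferred only if both constituent paths are significant."""
    p_a = sm.OLS(m, sm.add_constant(x)).fit().pvalues[1]     # a: X -> M
    exog = sm.add_constant(np.column_stack([x, m]))
    p_b = sm.OLS(y, exog).fit().pvalues[2]                   # b: M -> Y, given X
    return (p_a < alpha) and (p_b < alpha), p_a, p_b
```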

5.
In neuropsychological single-case studies, a patient is compared with a small control sample. Methods of testing for a deficit on Task X, or a significant difference between Tasks X and Y, either treat the control sample statistics as parameters (using z and zD) or use modified t tests. Monte Carlo simulations demonstrated that if z is used to test for a deficit, the Type I error rate is high for small control samples, whereas control of the error rate is essentially perfect for a modified t test. Simulations on tests for differences revealed that error rates were very high for zD. A new method of testing for a difference (the revised standardized difference test) achieved good control of the error rate, even with very small sample sizes. A computer program that implements this new test (and applies criteria to test for classical and strong dissociations) is made available.
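The modified t test referred to is usually computed as below (the Crawford-Howell form), which treats the control statistics as sample estimates rather than population parameters; the revised standardized difference test for X versus Y discrepancies is more involved and is not sketched here.

```python
import numpy as np
from scipy import stats

def modified_t(patient_score, controls):
    """Compare a single patient with a small control sample (df = n - 1)."""
    c = np.asarray(controls, float)
    n = len(c)
    t = (patient_score - c.mean()) / (c.std(ddof=1) * np.sqrt((n + 1) / n))
    p = stats.t.sf(abs(t), df=n - 1)   # one-sided p for a deficit; double for two-sided
    return t, p
```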

6.
One approach to the analysis of repeated measures data allows researchers to model the covariance structure of the data rather than presume a certain structure, as is the case with conventional univariate and multivariate test statistics. This mixed-model approach was evaluated for testing all possible pairwise differences among repeated measures marginal means in a Between-Subjects × Within-Subjects design. Specifically, the authors investigated Type I error and power rates for a number of simultaneous and stepwise multiple comparison procedures using SAS (1999) PROC MIXED in unbalanced designs when normality and covariance homogeneity assumptions did not hold. J. P. Shaffer's (1986) sequentially rejective step-down and Y. Hochberg's (1988) sequentially acceptive step-up Bonferroni procedures, based on an unstructured covariance structure, had superior Type I error control and power to detect true pairwise differences across the investigated conditions.
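Hochberg's step-up procedure named here operates on the set of pairwise p-values regardless of the model that produced them. A minimal sketch (the mixed-model fitting itself, done in the article with SAS PROC MIXED, is not reproduced):

```python
import numpy as np

def hochberg(pvals, alpha=0.05):
    """Step-up Bonferroni: reject the k smallest p-values for the largest k
    with p_(k) <= alpha / (m - k + 1)."""
    p = np.asarray(pvals, float)
    m = len(p)
    order = np.argsort(p)
    reject = np.zeros(m, dtype=bool)
    for k in range(m, 0, -1):                    # start from the largest p-value
        if p[order[k - 1]] <= alpha / (m - k + 1):
            reject[order[:k]] = True
            break
    return reject
```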

7.
This study investigates procedures for controlling the familywise error rate (FWR) when testing hypotheses about multiple, correlated outcome variables in repeated measures (RM) designs. A content analysis of RM research articles published in 4 psychology journals revealed that 3 quarters of studies tested hypotheses about 2 or more outcome variables. Several procedures originally proposed for testing multiple outcomes in 2-group designs are extended to 2-group RM designs. The investigated procedures include 2 modified Bonferroni procedures that adjust the level of significance, α, for the effective number of outcomes and a permutation step-down (PSD) procedure. The FWR, any-variable power, and all-variable power are investigated in a Monte Carlo study. One modified Bonferroni procedure frequently resulted in inflated FWRs, whereas the PSD procedure controlled the FWR. The PSD procedure could be substantially more powerful than the conventional Bonferroni procedure, which does not account for dependencies among the outcome variables. However, the difference in power between the PSD procedure, which does account for these dependencies, and Hochberg's step-up procedure, which does not, was negligible. A numeric example illustrates implementation of these multiple-testing procedures.
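The idea of adjusting α for the effective number of correlated outcomes can be conveyed with one common eigenvalue-based recipe; the two modified Bonferroni variants the article actually compares may differ in detail, so treat this as illustrative only.

```python
import numpy as np

def effective_alpha(R, alpha=0.05):
    """Bonferroni level adjusted for the effective number of outcomes,
    estimated from the eigenvalues of their correlation matrix R
    (a Cheverud-style estimate; one recipe among several)."""
    lam = np.linalg.eigvalsh(R)
    m = len(lam)
    m_eff = 1 + (m - 1) * (1 - np.var(lam, ddof=1) / m)
    return alpha / m_eff
```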

8.
Analysis of covariance (ANCOVA) is used widely in psychological research implementing nonexperimental designs. However, when covariates are fallible (i.e., measured with error), which is the norm, researchers must choose from among 3 inadequate courses of action: (a) know that the assumption that covariates are perfectly reliable is violated but use ANCOVA anyway (and, most likely, report misleading results); (b) attempt to employ 1 of several measurement error models with the understanding that no research has examined their relative performance and with the added practical difficulty that several of these models are not available in commonly used statistical software; or (c) not use ANCOVA at all. First, we discuss analytic evidence to explain why using ANCOVA with fallible covariates produces bias and a systematic inflation of Type I error rates that may lead to the incorrect conclusion that treatment effects exist. Second, to provide a solution for this problem, we conduct 2 Monte Carlo studies to compare 4 existing approaches for adjusting treatment effects in the presence of covariate measurement error: errors-in-variables (EIV; Warren, White, & Fuller, 1974), Lord's (1960) method, Raaijmakers and Pieters's (1987) method (R&P), and structural equation modeling methods proposed by Sörbom (1978) and Hayduk (1996). Results show that EIV models are superior in terms of parameter accuracy, statistical power, and keeping Type I error close to the nominal value. Finally, we offer a program written in R that performs all needed computations for implementing EIV models so that ANCOVA can be used to obtain accurate results even when covariates are measured with error.
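To convey the flavor of such an adjustment, the sketch below fits an ANCOVA after replacing the observed covariate with estimated true scores, shrinking each score toward its group mean by an assumed known reliability. This is a simplified stand-in for illustration, not the authors' R program; the EIV, Lord, R&P, and SEM approaches they compare are each more elaborate.

```python
import numpy as np
import statsmodels.api as sm

def true_score_ancova(y, x, group, reliability):
    """ANCOVA with an estimated-true-score covariate (group coded 0/1)."""
    y, x, g = map(lambda a: np.asarray(a, float), (y, x, group))
    x_true = np.empty_like(x)
    for level in np.unique(g):
        mask = g == level
        x_true[mask] = x[mask].mean() + reliability * (x[mask] - x[mask].mean())
    X = sm.add_constant(np.column_stack([g, x_true]))
    return sm.OLS(y, X).fit()        # the group coefficient is the adjusted effect
```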

9.
Underpowered studies persist in the psychological literature. This article examines reasons for their persistence and the effects on efforts to create a cumulative science. The "curse of multiplicities" plays a central role in the presentation. Most psychologists realize that testing multiple hypotheses in a single study affects the Type I error rate, but corresponding implications for power have largely been ignored. The presence of multiple hypothesis tests leads to 3 different conceptualizations of power. Implications of these 3 conceptualizations are discussed from the perspective of the individual researcher and from the perspective of developing a coherent literature. Supplementing significance tests with effect size measures and confidence intervals is shown to address some but not necessarily all problems associated with multiple testing.
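The gap between these conceptualizations is easy to quantify under an admittedly unrealistic independence assumption; the arithmetic below is illustrative and not taken from the article.

```python
# Five independent hypothesis tests, each with per-test power 0.80:
k, power = 5, 0.80
any_power = 1 - (1 - power) ** k   # P(detect at least one true effect) ~ 0.9997
all_power = power ** k             # P(detect all true effects)        ~ 0.33
```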

10.
Numerous studies purport to show that cardiopulmonary bypass (CPB) surgery is associated with persistent postoperative cognitive decline. In J. R. Keith et al. (2002), the authors argued that reports of post-CPB cognitive declines have often been quantified using data analysis methods that were based on tenuous assumptions and overlooked problems associated with familywise Type I errors. Four peers who are recognized for their expertise in neuropsychological outcomes research evaluated the arguments developed in the J. R. Keith et al. article, critiqued the study presented in that article, and offered suggestions for how to investigate whether cognitive decline occurs reliably after CPB. In this reply article, the authors respond to the open-peer commentaries made regarding the J. R. Keith et al. study.

11.
L. V. Jones and J. W. Tukey (2000) pointed out that the usual 2-sided, equal-tails null hypothesis test at level α can be reinterpreted as simultaneous tests of 2 directional inequality hypotheses, each at level α/2, and that the maximum probability of a Type I error is α/2 if the truth of the null hypothesis is considered impossible. This article points out that in multiple testing with familywise error rate controlled at α, the directional error rate (assuming all null hypotheses are false) is greater than α/2 and can be arbitrarily close to α. Single-step, step-down, and step-up procedures are analyzed, and other error rates, including the false discovery rate, are discussed. Implications for confidence interval estimation and hypothesis testing practices are considered.

12.
Compares 2 procedures for protecting the number of false rejections for a set of all possible pairwise comparisons. The 2-stage strategy of computing pairwise comparisons, conditional on a significant omnibus test, is compared with the multiple comparison strategy that sets a "familywise" critical value directly. The ANOVA test, the Brown and Forsythe test, and the Welch omnibus test, as well as 3 procedures for assessing the significance of pairwise comparisons, are combined into 9 2-stage testing strategies. The data from this study establish that the common strategy of following a significant ANOVA F with Student's t tests on pairs of means results in a substantially inflated rate of Type I error when variances are heterogeneous. Type I error control, however, can be obtained with other 2-stage procedures, and the authors tentatively consider the combination of the Welch F″ and Welch t″ tests desirable. In addition, the 2 techniques for controlling Type I error do not differ as much as might be expected; some 2-stage procedures are comparable to simultaneous techniques.
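The two-stage logic is easy to sketch: a heteroscedastic Welch omnibus test, then Welch t tests on pairs only if the omnibus test rejects. The code below is a minimal illustration with simulated heteroscedastic data, not a reconstruction of the authors' nine strategies.

```python
import numpy as np
from scipy import stats

def welch_anova(groups):
    """Welch's heteroscedastic omnibus F test for k independent groups."""
    k = len(groups)
    n = np.array([len(g) for g in groups], float)
    m = np.array([np.mean(g) for g in groups])
    w = n / np.array([np.var(g, ddof=1) for g in groups])
    grand = np.sum(w * m) / np.sum(w)
    tmp = np.sum((1 - w / np.sum(w)) ** 2 / (n - 1))
    F = (np.sum(w * (m - grand) ** 2) / (k - 1)) / (1 + 2 * (k - 2) * tmp / (k**2 - 1))
    return F, stats.f.sf(F, k - 1, (k**2 - 1) / (3 * tmp))

rng = np.random.default_rng(0)
groups = [rng.normal(0, s, 20) for s in (1.0, 2.0, 3.0)]  # unequal variances
F, p = welch_anova(groups)
if p < .05:                                   # stage 1: omnibus test
    for i in range(len(groups)):
        for j in range(i + 1, len(groups)):   # stage 2: pairwise Welch t tests
            t, pij = stats.ttest_ind(groups[i], groups[j], equal_var=False)
```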

13.
Three inferential morphometric methods, Euclidean distance matrix analysis (EDMA), Bookstein's edge-matching method (EMM), and the Procrustes method, were applied to facial landmark data. A Monte Carlo simulation was conducted with three sample sizes, ranging from n = 10 to 50, to assess Type I error rates and the power of the tests to detect group differences for two- and three-dimensional representations of forms. Type I error rates for EMM were at or below nominal levels in both two and three dimensions. Procrustes in 3D and EDMA in 2D and 3D produced inflated Type I error rates in all conditions but approached acceptable levels with moderate cell sizes. Procrustes maintained error rates below the nominal levels in 2D. The power of EMM was high compared with the other methods in both 2D and 3D, but conflicting EMM decisions were reached depending on which pair (2D) or triad (3D) of landmarks was selected as reference points. EDMA and Procrustes were more powerful for 2D data than for 3D data. Interpretation of these results must take into account that the data used in this simulation were selected because they represent real data that might have been collected during a study or experiment. These data had characteristics that violated assumptions central to the methods examined: unequal variances about landmarks, correlated errors, and correlated landmark locations. These results therefore may not generalize to all conditions, such as cases with no violations of assumptions. This simulation demonstrates, however, limitations of each procedure that should be considered when making inferences about shape comparisons.
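Of the three methods, the Procrustes superimposition step is the easiest to illustrate in standard software; the sketch below aligns two hypothetical 2D landmark configurations with scipy and reports the residual disparity. EDMA and EMM require specialized implementations not shown here.

```python
import numpy as np
from scipy.spatial import procrustes

rng = np.random.default_rng(1)
shape_a = rng.normal(size=(10, 2))                        # 10 landmarks in 2D
shape_b = shape_a + rng.normal(scale=0.05, size=(10, 2))  # perturbed copy

mtx1, mtx2, disparity = procrustes(shape_a, shape_b)
# 'disparity' is the sum of squared pointwise differences after optimal
# translation, rotation, and scaling; group tests then compare disparities
# (e.g., by permuting configuration labels).
```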

14.
Most studies that have investigated the use of coarsely grained scales have indicated that the accuracy of statistics calculated on such scales is not compromised as long as the scales have about 5 or more points. Gregoire and Driver (1987), however, found serious perturbations of the Type I and Type II error rates using a 5-point scale. They carried out three computer simulation experiments in which continuous data were transformed to Likert-scale values. Two of the three experiments are shown to be flawed because the authors incorrectly specified the population mean in their simulation. This article corrects the flaw and demonstrates that the Type I and Type II error rates are not seriously compromised by the use of ordinal-scale data. Furthermore, Gregoire and Driver's results are reinterpreted to show that in most cases, the parametric test of location equality shows a power superiority to the nonparametric tests. Only in their most nonnormal simulation does a nonparametric test show a power superiority.
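The corrected conclusion is easy to check with a small simulation: coarsen normal data onto a 5-point scale and confirm that the t test's Type I error rate stays near the nominal level. The design below is our illustration, not Gregoire and Driver's original setup.

```python
import numpy as np
from scipy import stats

def likertize(x, points=5):
    """Coarsen continuous scores to a 1..points scale at equal-width cuts."""
    edges = np.linspace(x.min(), x.max(), points + 1)[1:-1]
    return np.digitize(x, edges) + 1

rng = np.random.default_rng(0)
n_sims, rejections = 10_000, 0
for _ in range(n_sims):
    a, b = rng.normal(0, 1, 30), rng.normal(0, 1, 30)   # H0 true: equal means
    _, p = stats.ttest_ind(likertize(a), likertize(b))
    rejections += p < .05
print(rejections / n_sims)   # close to .05 despite the 5-point coarsening
```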

15.
"Six types of analysis of repeated measurements designs are indicated. The effects of order, interactions containing order, and correlated observations on the components of variance and analysis of variance tests of significance are considered. The first two act, in general, to inflate the error estimates and thus to increase the probability of a Type II error. The correlated observations (if unequal) have the opposite effect, i.e., increase the probability of a Type I error." From Psyc Abstracts 36:01:1AF46G. (PsycINFO Database Record (c) 2010 APA, all rights reserved)  相似文献   

16.
The method of moderated multiple regression is increasingly being applied in the search for moderator variables in industrial and organizational psychology. Because of frequent failures of the method in revealing moderator effects in empirical studies—in which such effects are strongly expected—it has been suggested that the procedure may lack statistical power with respect to hypothesis tests about moderating effects and, therefore, is inappropriate for the purposes of conventional moderator analyses. We evaluated this conclusion with computer simulation data. Our study indicated that the method is not overly conservative and that the Type I error rate of moderated multiple regression is approximately .05 at α = .05. Moreover, a proposed alternative multivariate procedure, principal component regression, is shown to have a Type I error rate that approaches unity under ordinary conditions when applied to the evaluation of moderator effects.
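Moderated multiple regression is a single product-term model, sketched below with simulated data; the coefficient values are arbitrary choices for illustration, not values from the study.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)                       # predictor
z = rng.normal(size=n)                       # hypothesized moderator
y = 0.4 * x + 0.3 * z + 0.25 * x * z + rng.normal(size=n)

X = sm.add_constant(np.column_stack([x, z, x * z]))
fit = sm.OLS(y, X).fit()
print(fit.pvalues[3])                        # significance of the product term
```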

17.
Interactions between (multiple indicator) latent variables are rarely used because of implementation complexity and competing strategies. Based on 4 simulation studies, the traditional constrained approach performed more poorly than did 3 new approaches: unconstrained, generalized appended product indicator, and quasi-maximum-likelihood (QML). The authors' new unconstrained approach was easiest to apply. All 4 approaches were relatively unbiased for normally distributed indicators, but the constrained and QML approaches were more biased for nonnormal data; the size and direction of the bias varied with the distribution but not with the sample size. QML had more power, but this advantage was qualified by consistently higher Type I error rates. The authors also compared general strategies for defining product indicators to represent the latent interaction factor.
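All four approaches start from product indicators in some form; their construction can be sketched simply by mean-centering the indicators of each latent variable and forming matched products. The pairing below is one arbitrary choice among the strategies the authors compare.

```python
import numpy as np

rng = np.random.default_rng(0)
x_ind = rng.normal(size=(300, 3))   # hypothetical indicators of latent X
z_ind = rng.normal(size=(300, 3))   # hypothetical indicators of latent Z

x_c = x_ind - x_ind.mean(axis=0)    # mean-center before forming products
z_c = z_ind - z_ind.mean(axis=0)
product_indicators = x_c * z_c      # matched pairs: x1*z1, x2*z2, x3*z3
# These columns serve as indicators of the latent X-by-Z interaction factor;
# the unconstrained approach imposes no nonlinear constraints on their loadings.
```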

18.
When more than one correlation coefficient is tested for significance in a study, the probability of making at least one Type I error rises rapidly as the number of tests increases, and the probability of making a Type I error after a Type I error on a previous test is usually greater than the nominal significance level used in each test. To avoid excessive Type I errors with multiple tests of correlations, it is noted that researchers should use procedures that answer research questions with a single statistical test and/or should use special multiple-test procedures. A review of simultaneous-test and multiple-test procedures for correlations (e.g., Bartlett and Rajalakshman's test, multistage Bonferroni procedure, union-intersection tests, and the rank adjusted method) is presented, and several new procedures are described.
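One well-known multistage (step-down) Bonferroni procedure in this family is Holm's; the sketch below applies it to all pairwise correlations of a data matrix. The specific procedures reviewed in the article (e.g., the rank adjusted method) differ in detail.

```python
import numpy as np
from itertools import combinations
from scipy import stats

def holm_correlations(data, alpha=0.05):
    """Step-down Bonferroni test of every pairwise correlation in 'data'."""
    pairs, pvals = [], []
    for i, j in combinations(range(data.shape[1]), 2):
        _, p = stats.pearsonr(data[:, i], data[:, j])
        pairs.append((i, j)); pvals.append(p)
    pvals = np.asarray(pvals)
    m = len(pvals)
    reject = np.zeros(m, dtype=bool)
    for rank, idx in enumerate(np.argsort(pvals)):  # smallest p first
        if pvals[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break                                   # stop at first non-rejection
    return list(zip(pairs, pvals, reject))
```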

19.
Objective: The general goal of this study was to advance our understanding of Type 2 diabetes (T2D)-cognition relationships in older adults by linking and testing comprehensive sets of potential moderators, potential mediators, and multiple cognitive outcomes. Method: We identified in the literature 13 health-related (but T2D-distal) potential covariates, representing four informal domains (i.e., biological vitality, personal affect, subjective health, lifestyle activities). Cross-sectional data from the Victoria Longitudinal Study (age range = 53–90 years; n = 41 T2D and n = 458 control participants) were used. We first examined whether any of the 13 potential covariates influenced T2D-cognition associations, as measured by a comprehensive neuropsychological battery (15 measures). Next, using standard regression-based moderator and mediator analyses, we systematically tested whether the identified covariates would significantly alter observed T2D-cognition relationships. Results: Six potential covariates were found to be sensitive to T2D associations with performance on seven cognitive measures. Three factors (systolic blood pressure, gait-balance composite, subjective health) were significant mediators. Each mediated multiple cognitive outcomes, especially measures of neurocognitive speed, executive functioning, and episodic memory. Conclusions: Our findings offer a relatively comprehensive perspective of T2D-related cognitive deficits, comorbidities, and modulating influences. The implications for future research reach across several fields of study and application. These include (1) neuropsychological research on neural and biological bases of T2D-related cognitive decline, (2) clinical research on intervention and treatment strategies, and (3) larger-scale longitudinal studies examining the potential multilateral and dynamic relationships among T2D status, related comorbidities, and cognitive outcomes.

20.
A synthesis of 319 meta-analyses of psychological, behavioral, and educational treatment research was conducted to assess the influence of study method on observed effect sizes relative to that of substantive features of the interventions. An index was used to estimate the proportion of effect size variance associated with various study features. Study methods accounted for nearly as much variability in study outcomes as characteristics of the interventions. Type of research design and operationalization of the dependent variable were the method features associated with the largest proportion of variance. The variance as a result of sampling error was about as large as that associated with the features of the interventions studied. These results underscore the difficulty of detecting treatment outcomes, the importance of cautiously interpreting findings from a single study, and the importance of meta-analysis in summarizing results across studies.
