期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Evaluating statistical difference, equivalence, and indeterminancy using inferential confidence intervals: An integrated alternative method of conducting null hypothesis statistical tests.

Tryon Warren W. 《Canadian Metallurgical Quarterly》2001,6(4):371

Null hypothesis statistical testing (NHST) has been debated extensively but always successfully defended. The technical merits of NHST are not disputed in this article. The widespread misuse of NHST has created a human factors problem that this article intends to ameliorate. This article describes an integrated, alternative inferential confidence interval approach to testing for statistical difference, equivalence, and indeterminacy that is algebraically equivalent to standard NHST procedures and therefore exacts the same evidential standard. The combined numeric and graphic tests of statistical difference, equivalence, and indeterminacy are designed to avoid common interpretive problems associated with NHST procedures. Multiple comparisons, power, sample size, test reliability, effect size, and cause-effect ratio are discussed. A section on the proper interpretation of confidence intervals is followed by a decision rule summary and caveats. (PsycINFO Database Record (c) 2010 APA, all rights reserved) 相似文献

2.

The epistemology of mathematical and statistical modeling: A quiet methodological revolution.

Rodgers Joseph Lee 《Canadian Metallurgical Quarterly》2010,65(1):1

A quiet methodological revolution, a modeling revolution, has occurred over the past several decades, almost without discussion. In contrast, the 20th century ended with contentious argument over the utility of null hypothesis significance testing (NHST). The NHST controversy may have been at least partially irrelevant, because in certain ways the modeling revolution obviated the NHST argument. I begin with a history of NHST and modeling and their relation to one another. Next, I define and illustrate principles involved in developing and evaluating mathematical models. Following, I discuss the difference between using statistical procedures within a rule-based framework and building mathematical models from a scientific epistemology. Only the former is treated carefully in most psychology graduate training. The pedagogical implications of this imbalance and the revised pedagogy required to account for the modeling revolution are described. To conclude, I discuss how attention to modeling implies shifting statistical practice in certain progressive ways. The epistemological basis of statistics has moved away from being a set of procedures, applied mechanistically, and moved toward building and evaluating statistical and scientific models. (PsycINFO Database Record (c) 2010 APA, all rights reserved) 相似文献

3.

The inscrutable null hypothesis.

Tyron Warren W. 《Canadian Metallurgical Quarterly》1998,53(7):796a

Comments on the article by R. L. Hagen (see record 1997-02239-002) praising the null hypothesis statistical test (NHST). Hagen's praise of the NHST may be supported on purely technical grounds but it is unfortunate if it prolongs primary reliance on NHST to evaluate quantitative difference and equivalence given the prominent human factors problem of widespread and intractable interpretation errors. Alternative methods are available for these purposes that are far less subject to misinterpretation. The science of psychology can openly benefit by supplementing, if not replacing, NHST practices with these methods. (PsycINFO Database Record (c) 2010 APA, all rights reserved) 相似文献

4.

On the logic and purpose of significance testing.

Cortina Jose M.; Dunlap William P. 《Canadian Metallurgical Quarterly》1997,2(2):161

There has been much recent attention given to the problems involved with the traditional approach to null hypothesis significance testing (NHST). Many have suggested that, perhaps, NHST should be abandoned altogether in favor of other bases for conclusions such as confidence intervals and effect size estimates (e.g., F. L. Schmidt; see record 83-24994) . The purposes of this article are to (a) review the function that data analysis is supposed to serve in the social sciences, (b) examine the ways in which these functions are performed by NHST, (c) examine the case against NHST, and (d) evaluate interval-based estimation as an alternative to NHST. (PsycINFO Database Record (c) 2010 APA, all rights reserved) 相似文献

5.

In praise of the null hypothesis statistical test.

Hagen Richard L. 《Canadian Metallurgical Quarterly》1997,52(1):15

Jacob Cohen (see record 1995-12080-001) raised a number of questions about the logic and information value of the null hypothesis statistical test (NHST). Specifically, he suggested that: (1) The NHST does not tell us what we want to know; (2) the null hypothesis is always false; and (3) the NHST lacks logical integrity. It is the author's view that although there may be good reasons to give up the NHST, these particular points made by Cohen are not among those reasons. When addressing these points, the author also attempts to demonstrate the elegance and usefulness of the NHST. (PsycINFO Database Record (c) 2010 APA, all rights reserved) 相似文献

6.

A further look at wrong reasons to abandon statistical testing.

Hagen Richard L. 《Canadian Metallurgical Quarterly》1998,53(7):801

Responds to comments by W. W. Tryon, R. E. McGrath, R. G. Malgady, R. Falk, B. Thompson, and M. M. Granaas (see records 1998-04417-011, 1998-04417-012, 1998-04417-013, 1998-04417-014, 1998-04417-015, and 1998-04417-016, respectively) on the author's article (see record 1997-02239-002) defending use of the null hypothesis statistical test (NHST). The logic of NHST has been challenged by 3 claims: (1) the null hypothesis is always false; therefore, a test of the null hypothesis is only a search for what is already known to be true; (2) the form of logic on which NHST rests is flawed; and (3) NHST does not tell one what one wants to know. In attempting to rebut these claims, while there may be good reasons to give up NHST, these particular points are not the reason why. Key points of each commentary are addressed. (PsycINFO Database Record (c) 2010 APA, all rights reserved) 相似文献

7.

Effect sizes for growth-modeling analysis for controlled clinical trials in the same metric as for classical analysis.

Feingold Alan 《Canadian Metallurgical Quarterly》2009,14(1):43

The use of growth-modeling analysis (GMA)—including hierarchical linear models, latent growth models, and general estimating equations—to evaluate interventions in psychology, psychiatry, and prevention science has grown rapidly over the last decade. However, an effect size associated with the difference between the trajectories of the intervention and control groups that captures the treatment effect is rarely reported. This article first reviews 2 classes of formulas for effect sizes associated with classical repeated-measures designs that use the standard deviation of either change scores or raw scores for the denominator. It then broadens the scope to subsume GMA and demonstrates that the independent groups, within-subjects, pretest–posttest control-group, and GMA designs all estimate the same effect size when the standard deviation of raw scores is uniformly used. Finally, the article shows that the correct effect size for treatment efficacy in GMA—the difference between the estimated means of the 2 groups at end of study (determined from the coefficient for the slope difference and length of study) divided by the baseline standard deviation—is not reported in clinical trials. (PsycINFO Database Record (c) 2010 APA, all rights reserved) 相似文献

8.

Confidence intervals for standardized linear contrasts of means.

Bonett Douglas G. 《Canadian Metallurgical Quarterly》2008,13(2):99

Most psychology journals now require authors to report a sample value of effect size along with hypothesis testing results. The sample effect size value can be misleading because it contains sampling error. Authors often incorrectly interpret the sample effect size as if it were the population effect size. A simple solution to this problem is to report a confidence interval for the population value of the effect size. Standardized linear contrasts of means are useful measures of effect size in a wide variety of research applications. New confidence intervals for standardized linear contrasts of means are developed and may be applied to between-subjects designs, within-subjects designs, or mixed designs. The proposed confidence interval methods are easy to compute, do not require equal population variances, and perform better than the currently available methods when the population variances are not equal. (PsycINFO Database Record (c) 2010 APA, all rights reserved) 相似文献

9.

Does intrinsic motivation theory explain the adverse effects of rewards on immediate task performance?

McCullers John C.; Fabes Richard A.; Moran James D. 《Canadian Metallurgical Quarterly》1987,52(5):1027

In an effort to answer the question posed in the title, we assessed the effects of rewards on the immediate task performance of preschool children in two studies. Both studies had within-subjects, repeated measures designs, and both yielded highly consistent results showing a detrimental effect of reward on the Peabody Picture Vocabulary Test and on the Goodenough-Harris Draw-a-Man test. Performance decrements were confined to sessions in which subjects were rewarded; when rewarded subjects were shifted to nonreward, their performance improved dramatically. Although these studies were not concerned with the effects of reward on intrinsic motivation, the findings appear to present theoretical difficulties for current cognitive-motivational explanations of the adverse effects of material rewards on immediate task performance. An alternative viewpoint that material rewards can produce a temporary regression in psychological functioning is suggested. (PsycINFO Database Record (c) 2010 APA, all rights reserved) 相似文献

10.

Model fitting: A better approach.

Granaas Michael M. 《Canadian Metallurgical Quarterly》1998,53(7):800

Comments on the article by R. L. Hagen (see record 1997-02239-002) defending the logic and practice of null hypothesis statistical testing (NHST). It is argued that model fitting provides an approach to data analysis that is more appropriate to the cognitive needs of the researcher than is NHST. Model fitting combines the NHST ability to falsify hypotheses with the parameter-estimation characteristic of confidence intervals in an approach that is simpler to learn, understand, and use. Effect size estimation is central to the approach, and power calculations are vastly simplified relative to NHST. (PsycINFO Database Record (c) 2010 APA, all rights reserved) 相似文献

11.

Using the analysis of covariance to increase the power of priming experiments.

Mulligan Neil W.; Wiesen Chris 《Canadian Metallurgical Quarterly》2003,57(3):152

Although priming paradigms are widely used in cognitive psychology, the statistical analyses typically applied to priming data may not be optimal. Conceiving of priming paradigms as change-from-baseline designs suggests that the analysis of covariance line (ANCOVA), using baseline performance as the covariate, is a more efficient (i.e., powerful) analysis. Specifically, ANCOVA provides more powerful tests of 1) the presence of priming and 2) between-group differences in priming. In addition, for within-subject designs with multiple baseline conditions, ANCOVA may increase the power of within-subjects effects. Efficiency gains are demonstrated with a re-analysis of priming datasets from implicit memory research. It is suggested that similar gains may be realized in other areas of priming research. Important assumptions of this procedure, which must be evaluated for the appropriate application of ANCOVA, are discussed. (PsycINFO Database Record (c) 2010 APA, all rights reserved) 相似文献

12.

Estimating standardized linear contrasts of means with desired precision.

Bonett Douglas G. 《Canadian Metallurgical Quarterly》2009,14(1):1

L. Wilkinson and the Task Force on Statistical Inference (1999) recommended reporting confidence intervals for measures of effect sizes. If the sample size is too small, the confidence interval may be too wide to provide meaningful information. Recently, K. Kelley and J. R. Rausch (2006) used an iterative approach to computer-generate tables of sample size requirements for a standardized difference between 2 means in between-subjects designs. Sample size formulas are derived here for general standardized linear contrasts of k ≥ 2 means for both between-subjects designs and within-subjects designs. Special sample size formulas also are derived for the standardizer proposed by G. V. Glass (1976). (PsycINFO Database Record (c) 2010 APA, all rights reserved) 相似文献

13.

Significance testing: Is there something Better?

McGrath Robert E. 《Canadian Metallurgical Quarterly》1998,53(7):796b

Comments on the article by R. L. Hagen (see record 1997-02239-002) supporting use of the null hypothesis statistical test (NHST). Hagen did an admirable job of reminding readers that the NHST represents a brilliant and useful innovation, but does not offer a strong case for its continued use as the primary inferential strategy in psychology. The question is not "Is it useless?" but "Is there something better?" Popular opinion holds that interval estimation represents a superior strategy to NHST in many ways. (PsycINFO Database Record (c) 2010 APA, all rights reserved) 相似文献

14.

How to show that 9?>?221: Collect judgments in a between-subjects design.

Birnbaum Michael H. 《Canadian Metallurgical Quarterly》1999,4(3):243

In between-subjects (BS) designs, different groups may be asked to make judgments on numerical rating scales. According to judgment theory, judgments obtained BS are not an ordinal scale of subjective value. This article illustrates how BS designs can lead to strange conclusions: When different groups judge the subjective size of numbers, 9 is judged significantly larger than 221. The theory is that 9 brings to mind a context of small numbers, among which 9 seems "average" or even "large"; however, 221 invokes a context of 3-digit numbers, among which 221 seems relatively "small." Within-subjects, however, judges would not have said 9?>?221. Implications of this problem and suggestions for dealing with it are discussed. (PsycINFO Database Record (c) 2010 APA, all rights reserved) 相似文献

15.

Intraclass correlation for two-by-two tables under three sampling designs

CA Bodian 《Canadian Metallurgical Quarterly》1994,50(1):183-193

Several sampling designs for assessing agreement between two binary classifications on each of n subjects lead to data arrayed in a four-fold table. Following Kraemer's (1979, Psychometrika 44, 461-472) approach, population models are described for binary data analogous to quantitative data models for a one-way random design, a two-way mixed design, and a two-way random design. For each of these models, parameters representing intraclass correlation are defined, and two estimators are proposed, one from constructing ANOVA-type tables for binary data, and one by the method of maximum likelihood. The maximum likelihood estimator of intraclass correlation for the two-way mixed design is the same as the phi coefficient (Chedzoy, 1985, in Encyclopedia of Statistical Sciences, Vol. 6, New York: Wiley). For moderately large samples, the ANOVA estimator for the two-way random design approximates Cohen's (1960, Psychological Measurement 20, 37-46) kappa statistic. Comparisons among the estimators indicate very little difference in values for tables with marginal symmetry. Differences among the estimators increase with increasing marginal asymmetry, and with average prevalence approaching .50. 相似文献

16.

Confidence intervals in repeated-measures designs: The number of observations principle.

Jarmasz Jerzy; Hollands Justin G. 《Canadian Metallurgical Quarterly》2009,63(2):124

Since the publication of Loftus and Masson’s (1994) method for computing confidence intervals (CIs) in repeated-measures (RM) designs, there has been uncertainty about how to apply it to particular effects in complex factorial designs. Masson and Loftus (2003) proposed that RM CIs for factorial designs be based on number of observations rather than number of participants. However, determining the correct number of observations for a particular effect can be complicated, given the variety of effects occurring in factorial designs. In this paper the authors define a general “number of observations” principle, explain why it obtains, and provide step-by-step instructions for constructing CIs for various effect types. The authors illustrate these procedures with numerical examples. (PsycINFO Database Record (c) 2010 APA, all rights reserved) 相似文献

17.

In criticism of the null hypothesis statistical test.

Falk Ruma 《Canadian Metallurgical Quarterly》1998,53(7):798

Comments on the article by R. L. Hagen (see record 1997-02239-002) in praise of the null hypothesis statistical test (NHST). NHST, is, in fact, a probabilistic imitation of modus tollens (or of the mathematical procedure of proof by contradiction). However, once the reasoning is made probabilistic, the inference is no longer valid. (PsycINFO Database Record (c) 2010 APA, all rights reserved) 相似文献

18.

Associative learning in the marsupials Didelphis albiventris and Lutreolina crassicaudata..

Papini Mauricio R. 《Canadian Metallurgical Quarterly》1988,102(1):21

A procedure to study associative learning in didelphid marsupials (Didelphis albiventris and Lutreolina crassicaudata) was developed, based on the use of an appetitive unconditioned stimulus, discrete conditioned stimuli, and multiple-behavior recordings of freely moving animals. In Experiments 1 and 2, three basic conditioning phenomena were reported: differential conditioning, stimulus reversal, and summation. A specific behavior developed during the excitatory signal, independently of the particular stimulus involved, consisting of rhythmic, goal-centered, sagittal head movements, highly similar across subjects and species. Unlike previous experiments on Pavlovian conditioning in marsupials, the use of differential conditioning in within-subjects designs, with appropriate counterbalance of stimuli, precludes interpretation of these results in terms of pseudoconditioning, sensitization, or sensory-perceptual effects. These results open the possiblity for systematic research on the comparative, developmental, and neuropsychological aspects of learning to which marsupials can contribute as models. (PsycINFO Database Record (c) 2010 APA, all rights reserved) 相似文献

19.

Should Providers of Treatment Be Regarded as a Random Factor? If It Ain't Broke, Don't "Fix" It: A Comment on Siemer and Joormann (2003).

Serlin Ronald C.; Wampold Bruce E.; Levin Joel R. 《Canadian Metallurgical Quarterly》2003,8(4):524

In their criticism of B. E. Wampold and R. C. Serlin's (see record 2000-16737-003) analysis of treatment effects in nested designs, M. Siemer and J. Joormann (see record 2003-10163-009) argued that providers of services should be considered a fixed factor because typically providers are neither randomly selected from a population of providers nor randomly assigned to treatments, and statistical power to detect treatment effects is greater in the fixed than in the mixed model. The authors of the present article argue that if providers are considered fixed, conclusions about the treatment must be conditioned on the specific providers in the study, and they show that in this case generalizing beyond these providers incurs inflated Type I error rates. (PsycINFO Database Record (c) 2010 APA, all rights reserved) 相似文献

20.

Null hypothesis significance testing: On the survival of a flawed method.

Krueger Joachim 《Canadian Metallurgical Quarterly》2001,56(1):16

Null hypothesis significance testing (NHST) is the researcher's workhorse for making inductive inferences. This method has often been challenged, has occasionally been defended, and has persistently been used through most of the history of scientific psychology. This article reviews both the criticisms of NHST and the arguments brought to its defense. The review shows that the criticisms address the logical validity of inferences arising from NHST, whereas the defenses stress the pragmatic value of these inferences. The author suggests that both critics and apologists implicitly rely on Bayesian assumptions. When these assumptions are made explicit, the primary challenge for NHST—and any system of induction—can be confronted. The challenge is to find a solution to the question of replicability. (PsycINFO Database Record (c) 2010 APA, all rights reserved) 相似文献