Similar literature: 20 similar records found.
1.
Because the probability of obtaining an experimental finding given that the null hypothesis is true [p(F|H0)] is not the same as the probability that the null hypothesis is true given a finding [p(H0|F)], calculating the former probability does not justify conclusions about the latter one. As the standard null-hypothesis significance-testing procedure does just that, it is logically invalid (J. Cohen, 1994). Theoretically, Bayes's theorem yields p(H0|F), but in practice, researchers rarely know the correct values for 2 of the variables in the theorem. Nevertheless, by considering a wide range of possible values for the unknown variables, it is possible to calculate a range of theoretical values for p(H0|F) and to draw conclusions about both hypothesis testing and theory evaluation.
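A minimal sketch of the kind of range calculation the abstract describes, assuming hypothetical values for the two quantities researchers rarely know (the prior p(H0) and the likelihood of the finding under the alternative, p(F|H1)); the function and all numbers are illustrative, not the author's procedure:

```python
# Illustrative sketch: p(H0|F) via Bayes' theorem over a grid of assumed
# values for the two unknowns. Treating the p value as p(F|H0) is itself
# a common simplification, adopted here only for illustration.

def posterior_null(p_F_H0, prior_H0, p_F_H1):
    """p(H0|F) = p(F|H0)p(H0) / [p(F|H0)p(H0) + p(F|H1)(1 - p(H0))]."""
    num = p_F_H0 * prior_H0
    return num / (num + p_F_H1 * (1.0 - prior_H0))

p_F_H0 = 0.05  # a just-significant result at the conventional level
for prior_H0 in (0.2, 0.5, 0.8):        # assumed prior probability of the null
    for p_F_H1 in (0.2, 0.5, 0.8):      # assumed likelihood under the alternative
        print(f"p(H0)={prior_H0}, p(F|H1)={p_F_H1}: "
              f"p(H0|F)={posterior_null(p_F_H0, prior_H0, p_F_H1):.3f}")
```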

2.
Null hypothesis significance testing has dominated quantitative research in education and psychology. However, the statistical significance of a test as indicated by a p-value does not speak to the practical significance of the study. Thus, reporting effect size to supplement the p-value is highly recommended by scholars, journal editors, and academic associations. As a measure of practical significance, effect size quantifies the size of mean differences or the strength of associations and directly answers the research questions. Furthermore, a comparison of effect sizes across studies facilitates meta-analytic assessment of the effect size and the accumulation of knowledge. In the current comprehensive review, we investigated the most recent effect size reporting and interpreting practices in 1,243 articles published in 14 academic journals from 2005 to 2007. Overall, 49% of the articles reported effect size, and 57% of those also interpreted it. To model good research methodology in education and psychology, we also provide an illustrative example of reporting and interpreting effect size in a published study. Finally, a 7-step guideline for quantitative researchers is summarized, along with some recommended resources on how to understand and interpret effect size.

3.
The test of significance does not provide the information concerning psychological phenomena characteristically attributed to it; and a great deal of mischief has been associated with its use. The basic logic associated with the test of significance is reviewed. The null hypothesis is characteristically false under any circumstances. Publication practices foster the reporting of small effects in populations. Psychologists have "adjusted" by misinterpretation, taking the p value as a "measure," assuming that the test of significance provides automaticity of inference, and confusing the aggregate with the general. The difficulties are illuminated by bringing to bear the contributions from the decision-theory school on the Fisher approach. The Bayesian approach is suggested.

4.
Increasing emphasis has been placed on the use of effect size reporting in the analysis of social science data. Nonetheless, the use of effect size reporting remains inconsistent, and interpretation of effect size estimates continues to be confused. Researchers are presented with numerous effect size estimates, not all of which are appropriate for every research question. Clinicians also may have little guidance in the interpretation of effect sizes relevant for clinical practice. The current article provides a primer of effect size estimates for the social sciences. Common effect size estimates, their use, and their interpretations are presented as a guide for researchers.
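As one hedged illustration of the estimates such a primer covers, the sketch below computes Cohen's d from summary statistics and converts it to an r-type effect size using the standard pooled-SD and d-to-r formulas; all input values are invented:

```python
import math

def cohens_d(mean1, mean2, sd1, sd2, n1, n2):
    """Standardized mean difference using the pooled standard deviation."""
    pooled = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
    return (mean1 - mean2) / pooled

def d_to_r(d, n1, n2):
    """Convert d to a point-biserial r; a = 4 when group sizes are equal."""
    a = (n1 + n2) ** 2 / (n1 * n2)  # correction for unequal group sizes
    return d / math.sqrt(d**2 + a)

d = cohens_d(105, 100, 15, 15, 50, 50)  # hypothetical two-group summary data
print(f"d = {d:.2f}, r = {d_to_r(d, 50, 50):.2f}")
```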

5.
In his article, “An alternative to null-hypothesis significance tests,” Killeen (2005) urged the discipline to abandon the practice of p_obs-based null hypothesis testing and to quantify the signal-to-noise characteristics of experimental outcomes with replication probabilities. He described the coefficient that he invented, p_rep, as the probability of obtaining “an effect of the same sign as that found in an original experiment” (Killeen, 2005, p. 346). The journal Psychological Science quickly came to encourage researchers to employ p_rep, rather than p_obs, in the reporting of their experimental findings. In the current article, we (a) establish that Killeen's derivation of p_rep contains an error, the result of which is that p_rep is not, in fact, the probability that Killeen set out to derive; (b) establish that p_rep is not a replication probability of any kind but, rather, is a quasi-power coefficient; and (c) suggest that Killeen has mischaracterized both the relationship between replication probabilities and statistical inference, and the kinds of claims that are licensed by knowledge of the value assumed by the replication probability that he attempted to derive.

6.
Some methodologists have recently suggested that scientific psychology's overreliance on null hypothesis significance testing (NHST) impedes the progress of the discipline. In response, a number of defenders have maintained that NHST continues to play a vital role in psychological research. Both sides of the argument to date have been presented abstractly. The authors take a different approach to this issue by illustrating the use of NHST along with 2 possible alternatives (meta-analysis as a primary data analysis strategy and Bayesian approaches) in a series of 3 studies. Comparing and contrasting the approaches on actual data brings out the strengths and weaknesses of each approach. The exercise demonstrates that the approaches are not mutually exclusive but instead can be used to complement one another.

7.
Null hypothesis statistical testing (NHST) has been debated extensively but always successfully defended. The technical merits of NHST are not disputed in this article. The widespread misuse of NHST has created a human factors problem that this article intends to ameliorate. This article describes an integrated, alternative inferential confidence interval approach to testing for statistical difference, equivalence, and indeterminacy that is algebraically equivalent to standard NHST procedures and therefore exacts the same evidential standard. The combined numeric and graphic tests of statistical difference, equivalence, and indeterminacy are designed to avoid common interpretive problems associated with NHST procedures. Multiple comparisons, power, sample size, test reliability, effect size, and cause-effect ratio are discussed. A section on the proper interpretation of confidence intervals is followed by a decision rule summary and caveats.
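The sketch below shows only the generic three-way decision logic (difference, equivalence, indeterminacy) that such CI-based testing involves; it is not Tryon's specific inferential confidence interval construction, and the equivalence bound delta is a user-supplied assumption:

```python
def ci_decision(ci_low, ci_high, delta):
    """Classify a CI for a mean difference against equivalence bounds [-delta, delta].

    A generic three-way rule, not the article's exact inferential-CI procedure:
      'difference'    if the CI excludes zero,
      'equivalence'   if the CI lies wholly inside the bounds,
      'indeterminate' otherwise.
    Note: a reliably nonzero but trivially small effect can satisfy both of
    the first two conditions; this sketch reports 'difference' first.
    """
    if ci_low > 0 or ci_high < 0:
        return "difference"
    if -delta < ci_low and ci_high < delta:
        return "equivalence"
    return "indeterminate"

print(ci_decision(0.4, 2.1, delta=1.0))   # difference
print(ci_decision(-0.3, 0.5, delta=1.0))  # equivalence
print(ci_decision(-0.2, 1.4, delta=1.0))  # indeterminate
```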

8.
Reports an error in "Effect sizes for experimenting psychologists" by Ralph L. Rosnow and Robert Rosenthal (Canadian Journal of Experimental Psychology/Revue canadienne de psychologie expérimentale, 2003[Sep], Vol 57[3], 221-237; an erratum was reported in Vol 63[1], record 2009-03064-004). A portion of the note to Table 1 (page 222) was incorrect. The second sentence of the note should read as follows: Fisher's z_r is the log transformation of r, that is, z_r = (1/2) log_e[(1 + r)/(1 − r)]. (The following abstract of the original article appeared in record 2003-08374-009.) This article describes three families of effect size estimators and their use in situations of general and specific interest to experimenting psychologists. The situations discussed include both between- and within-group (repeated measures) designs. Also described is the counternull statistic, which is useful in preventing common errors of interpretation in null hypothesis significance testing. The emphasis is on correlation (r-type) effect size indicators, but a wide variety of difference-type and ratio-type effect size estimators are also described.
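The corrected transformation is easy to verify numerically; the sketch below implements it and checks it against the mathematically equivalent inverse hyperbolic tangent:

```python
import math

def fisher_z(r):
    """Fisher's z_r = (1/2) * ln((1 + r) / (1 - r)), i.e., atanh(r)."""
    return 0.5 * math.log((1 + r) / (1 - r))

r = 0.30
print(fisher_z(r))      # 0.3095...
print(math.atanh(r))    # same value: the two forms are identical
```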

9.
Researchers have looked at comparisons between medical epidemiological research and psychological research using effect size r in an effort to compare relative effects. Often the outcomes of such efforts have demonstrated comparatively low effects for medical epidemiology research in comparison with effect sizes seen in psychology. The conclusion has often been that relatively small effects seen in psychology research are as strong as those found in important epidemiological medical research. The author suggests that many of the calculated effect sizes from medical epidemiological research on which this conclusion has been based are flawed. Specifically, rather than calculating effect sizes for treatment, many results have been for a Treatment Effect × Disease Effect interaction that was irrelevant to the main study hypothesis. A technique for developing a “hypothesis-relevant” effect size r is proposed.

10.
The authors demonstrated that the most common statistical significance test used with r_WG-type interrater agreement indexes in applied psychology, based on the chi-square distribution, is flawed and inaccurate. The chi-square test is shown to be extremely conservative even for modest, standard significance levels (e.g., .05). The authors present an alternative statistical significance test, based on Monte Carlo procedures, that produces the equivalent of an approximate randomization test for the null hypothesis that the actual distribution of responding is rectangular and demonstrate its superiority to the chi-square test. Finally, the authors provide tables of critical values and offer downloadable software to implement the approximate randomization test for r_WG-type and average deviation (AD)-type interrater agreement indexes. The implications of these results for studying a broad range of interrater agreement problems in applied psychology are discussed.
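A minimal sketch of the general Monte Carlo idea (not the authors' published software or exact procedure): simulate groups of raters responding uniformly, build the null distribution of single-item r_WG, and read off a critical value. Group size, number of response options, and the observed ratings are all hypothetical:

```python
import numpy as np

rng = np.random.default_rng(seed=1)

def r_wg(ratings, n_options):
    """Single-item r_WG: 1 - observed variance / uniform-null variance."""
    var_uniform = (n_options**2 - 1) / 12.0  # variance of a discrete uniform
    return 1.0 - ratings.var(ddof=1) / var_uniform

# Null distribution of r_WG when k raters respond uniformly (rectangularly)
k, n_options, n_sims = 10, 5, 20_000
null = np.array([r_wg(rng.integers(1, n_options + 1, size=k), n_options)
                 for _ in range(n_sims)])
critical = np.quantile(null, 0.95)  # one-tailed 5% critical value

observed = r_wg(np.array([4, 4, 5, 4, 3, 4, 5, 4, 4, 5]), n_options)
print(f"observed r_WG = {observed:.3f}, critical value = {critical:.3f}")
print("Monte Carlo p =", (null >= observed).mean())
```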

11.
Reviews Kline's book (see record 2004-13019-000), which surveys the controversy regarding significance testing, offers methods for effect size and confidence interval estimation, and suggests some alternative methodologies. Whether or not one accepts Kline's view of the future of statistical significance testing, there is much of value in this book. As a textbook, it could serve as a reference for an upper level undergraduate course, but it would be more appropriate for a graduate course. The book is a thought-provoking examination of the uneasy alliance between null hypothesis significance testing, and effect size and confidence interval estimation. There is much in this book for those on both sides of the null hypothesis testing debate and for those unsure where they stand. Whatever the future holds, Kline has done well in illustrating recent advances to statistical decision-making.

12.
Selected literature related to statistical testing is reviewed to compare the theoretical models underlying parametric and nonparametric inference. Specifically, we show that these models evaluate different hypotheses, are based on different concepts of probability and resultant null distributions, and support different substantive conclusions. We suggest that cognitive scientists should be aware of both models, thus providing them with a better appreciation of the implications and consequences of their choices among potential methods of analysis. This is especially true when it is recognized that most cognitive science research employs design features that do not justify parametric procedures, but that do support nonparametric methods of analysis, particularly those based on the method of permutation/randomization.
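As a hedged illustration of the permutation/randomization approach the review favors, the sketch below runs a two-sample Monte Carlo permutation test on the difference in means; the data are invented:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def permutation_test(x, y, n_perm=10_000):
    """Two-sample Monte Carlo permutation test on the difference in means."""
    observed = x.mean() - y.mean()
    pooled = np.concatenate([x, y])
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random reassignment to the two groups
        diff = pooled[: len(x)].mean() - pooled[len(x):].mean()
        if abs(diff) >= abs(observed):
            count += 1
    return observed, count / n_perm  # two-tailed Monte Carlo p value

x = np.array([12.1, 9.8, 11.4, 10.9, 13.0])
y = np.array([9.2, 10.1, 8.7, 9.9, 10.4])
obs, p = permutation_test(x, y)
print(f"mean difference = {obs:.2f}, p = {p:.4f}")
```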

13.
When psychologists test a commonsense (CS) hypothesis and obtain no support, they tend to erroneously conclude that the CS belief is wrong. In many such cases it appears, after many years, that the CS hypothesis was valid after all. It is argued that this error of accepting the "theoretical" null hypothesis reflects confusion between the operationalized hypothesis and the theory or generalization that it is designed to test. That is, on the basis of reliable null data one can accept the operationalized null hypothesis (e.g., "A measure of attitude x is not correlated with a measure of behavior y"). In contrast, one cannot generalize from the findings and accept the abstract or theoretical null (e.g., "We know that attitudes do not predict behavior"). The practice of accepting the theoretical null hypothesis hampers research and reduces the trust of the public in psychological research.

14.
The purpose of the recently proposed p_rep statistic is to estimate the probability of concurrence, that is, the probability that a replicate experiment yields an effect of the same sign (Killeen, 2005a). The influential journal Psychological Science endorses p_rep and recommends its use over that of traditional methods. Here we show that p_rep overestimates the probability of concurrence. This is because p_rep was derived under the assumption that all effect sizes in the population are equally likely a priori. In many situations, however, it is advisable also to entertain a null hypothesis of no or approximately no effect. We show how the posterior probability of the null hypothesis is sensitive to a priori considerations and to the evidence provided by the data; and the higher the posterior probability of the null hypothesis, the smaller the probability of concurrence. When the null hypothesis and the alternative hypothesis are equally likely a priori, p_rep may overestimate the probability of concurrence by 30% or more. We conclude that p_rep provides an upper bound on the probability of concurrence, a bound that brings with it the danger of having researchers believe that their experimental effects are much more reliable than they actually are.
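A small simulation can reproduce the qualitative claim, under stated assumptions: a spike-and-slab prior with a 50% point null, normal sampling, and the standard conversion p_rep = Φ(|d|/(SE·√2)). Every numeric choice here is illustrative, not the authors' analysis:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(seed=2)

# Hypothetical setup: half of all studied effects are truly null
# (spike-and-slab prior); the rest are drawn from a unit normal.
n_sims, se = 100_000, 0.5  # se: standard error of the observed effect
true_null = rng.random(n_sims) < 0.5
delta = np.where(true_null, 0.0, rng.normal(0.0, 1.0, n_sims))

d_original = rng.normal(delta, se)    # effect observed in the original study
d_replicate = rng.normal(delta, se)   # effect observed in an exact replicate

# Killeen's coefficient: predictive P(replicate has the same sign),
# derived under a flat prior on the effect size.
p_rep = norm.cdf(np.abs(d_original) / (se * np.sqrt(2)))

concurrence = np.sign(d_replicate) == np.sign(d_original)
print(f"mean p_rep              = {p_rep.mean():.3f}")
print(f"actual concurrence rate = {concurrence.mean():.3f}")
# With point-null mass in the prior, mean p_rep exceeds the actual rate.
```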

15.
In a sample of 12,030 subjects, ranging in age from 8 to 99 years, significant decreases in both mixed and consistent left-handedness were found as age increased. There were also significant sex differences, with males more likely to be left- or mixed-handed. These age and sex differences were reported as non-significant in Porac's (1993) smaller sample of 654. Methodological issues associated with asserting the null hypothesis in handedness studies when statistical power is low are also discussed.

16.
Wider use in psychology of confidence intervals (CIs), especially as error bars in figures, is a desirable development. However, psychologists seldom use CIs and may not understand them well. The authors discuss the interpretation of figures with error bars and analyze the relationship between CIs and statistical significance testing. They propose 7 rules of eye to guide the inferential use of figures with error bars. These include general principles: Seek bars that relate directly to effects of interest, be sensitive to experimental design, and interpret the intervals. They also include guidelines for inferential interpretation of the overlap of CIs on independent group means. Wider use of interval estimation in psychology has the potential to improve research communication substantially.
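The two most commonly cited overlap rules of eye are easy to state mechanically; the sketch below applies them to 95% CIs on two independent means. The thresholds are the usual approximations, and their caveats (similar margins of error, adequate sample sizes) are noted as assumptions in the comments:

```python
def overlap_rule_of_eye(m1, moe1, m2, moe2):
    """Rules of eye for 95% CIs on two independent group means.

    Approximations that assume similar margins of error and roughly
    n >= 10 per group:
      overlap of at most ~half the average margin of error -> p roughly < .05
      no overlap at all (a gap between intervals)           -> p roughly < .01
    """
    overlap = min(m1 + moe1, m2 + moe2) - max(m1 - moe1, m2 - moe2)
    avg_moe = (moe1 + moe2) / 2.0
    if overlap < 0:
        return "p roughly < .01 (intervals do not overlap)"
    if overlap <= 0.5 * avg_moe:
        return "p roughly < .05 (overlap at most half the average margin of error)"
    return "no inference from overlap alone"

print(overlap_rule_of_eye(10.0, 2.0, 13.5, 2.0))  # moderate overlap
print(overlap_rule_of_eye(10.0, 2.0, 15.0, 2.0))  # gap between intervals
print(overlap_rule_of_eye(10.0, 2.0, 12.0, 2.0))  # substantial overlap
```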

17.
A new approach for using path analysis to appraise the verisimilitude of theories is described. Rather than trying to test a model's truth (correctness), this method corroborates a class of path diagrams by determining how well they predict intradata relations in comparison with other diagrams. The observed correlation matrix is partitioned into disjoint sets. One set is used to estimate the model parameters, and a nonoverlapping set is used to assess the model's verisimilitude. Computer code was written to generate competing models and to test the conjectured model's superiority (relative to the generated set) using diagram combinatorics and is available on the Web.

18.
Classic parametric statistical significance tests, such as analysis of variance and least squares regression, are widely used by researchers in many disciplines, including psychology. For classic parametric tests to produce accurate results, the assumptions underlying them (e.g., normality and homoscedasticity) must be satisfied. These assumptions are rarely met when analyzing real data. The use of classic parametric methods with violated assumptions can result in the inaccurate computation of p values, effect sizes, and confidence intervals. This may lead to substantive errors in the interpretation of data. Many modern robust statistical methods alleviate the problems inherent in using parametric methods with violated assumptions, yet modern methods are rarely used by researchers. The authors examine why this is the case, arguing that most researchers are unaware of the serious limitations of classic methods and are unfamiliar with modern alternatives. A range of modern robust and rank-based significance tests suitable for analyzing a wide range of designs is introduced. Practical advice on conducting modern analyses using software such as SPSS, SAS, and R is provided. The authors conclude by discussing robust effect size indices.
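As one example of a modern robust alternative, SciPy (1.7 and later) exposes Yuen's trimmed t test through the trim argument of ttest_ind; the comparison below on skewed, invented data is a sketch, not the authors' recommended workflow:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=3)

# Skewed, heavy-tailed samples of the kind that violate the normality
# assumption of the classic t test.
a = rng.lognormal(mean=0.0, sigma=1.0, size=40)
b = rng.lognormal(mean=0.3, sigma=1.0, size=40)

classic = stats.ttest_ind(a, b)                          # Student's t
welch = stats.ttest_ind(a, b, equal_var=False)           # Welch's t
yuen = stats.ttest_ind(a, b, equal_var=False, trim=0.2)  # 20% trimmed (Yuen)

for name, res in [("classic", classic), ("Welch", welch), ("Yuen 20% trim", yuen)]:
    print(f"{name}: t = {res.statistic:.2f}, p = {res.pvalue:.4f}")
```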

19.
Underpowered studies persist in the psychological literature. This article examines reasons for their persistence and the effects on efforts to create a cumulative science. The "curse of multiplicities" plays a central role in the presentation. Most psychologists realize that testing multiple hypotheses in a single study affects the Type I error rate, but corresponding implications for power have largely been ignored. The presence of multiple hypothesis tests leads to 3 different conceptualizations of power. Implications of these 3 conceptualizations are discussed from the perspective of the individual researcher and from the perspective of developing a coherent literature. Supplementing significance tests with effect size measures and confidence intervals is shown to address some but not necessarily all problems associated with multiple testing.
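Assuming independent tests, the divergence among conceptualizations of power under multiplicity can be made concrete in a few lines; this is one common way to formalize the distinction, and the per-test power values are hypothetical:

```python
from math import prod

# Three ways "power" can be read when one study tests several hypotheses,
# assuming (for illustration only) independent tests.
powers = [0.80, 0.70, 0.60]  # hypothetical per-test power values

all_significant = prod(powers)                     # P(every test rejects)
any_significant = 1 - prod(1 - p for p in powers)  # P(at least one rejects)

print(f"per-test power:     {powers}")
print(f"P(all significant): {all_significant:.3f}")  # 0.336
print(f"P(at least one):    {any_significant:.3f}")  # 0.976
```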

20.
The relationship between a classic 1953 study by R. L. Solomon and L. C. Wynne on traumatic avoidance learning and the pioneering efforts by Robert Bush and Frederick Mosteller and others to develop mathematical models of learning is analyzed. The main purpose is to explore how Bush and Mosteller disembedded a carefully selected set of Solomon and Wynne's data from its original context, which allowed something as seemingly humble as a set of numbers to become a widely available and valuable resource for the newly emerging field of mathematical learning theory (MLT). The creative use that the MLT community made of these data once Bush and Mosteller had systematically reduced the empirical and conceptual uncertainties within Solomon and Wynne's study is also discussed.
