Similar Documents
 Found 20 similar documents (search took 15 ms)
1.
When psychologists test a commonsense (CS) hypothesis and obtain no support, they tend to erroneously conclude that the CS belief is wrong. In many such cases it appears, after many years, that the CS hypothesis was valid after all. It is argued that this error of accepting the "theoretical" null hypothesis reflects confusion between the operationalized hypothesis and the theory or generalization that it is designed to test. That is, on the basis of reliable null data one can accept the operationalized null hypothesis (e.g., "A measure of attitude x is not correlated with a measure of behavior y"). In contrast, one cannot generalize from the findings and accept the abstract or theoretical null (e.g., "We know that attitudes do not predict behavior"). The practice of accepting the theoretical null hypothesis hampers research and reduces the trust of the public in psychological research. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

2.
Null hypothesis statistical testing (NHST) has been debated extensively but always successfully defended. The technical merits of NHST are not disputed in this article. The widespread misuse of NHST has created a human factors problem that this article intends to ameliorate. This article describes an integrated, alternative inferential confidence interval approach to testing for statistical difference, equivalence, and indeterminacy that is algebraically equivalent to standard NHST procedures and therefore exacts the same evidential standard. The combined numeric and graphic tests of statistical difference, equivalence, and indeterminacy are designed to avoid common interpretive problems associated with NHST procedures. Multiple comparisons, power, sample size, test reliability, effect size, and cause-effect ratio are discussed. A section on the proper interpretation of confidence intervals is followed by a decision rule summary and caveats. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
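For readers unfamiliar with inferential confidence intervals, the following Python sketch illustrates the underlying algebra under one common formulation: each group's ordinary CI half-width is shrunk by a reduction factor E = sqrt(SE1^2 + SE2^2)/(SE1 + SE2), so that non-overlap of the two shrunken intervals coincides exactly with a significant two-sample t test. Function names and numbers are illustrative; this is a sketch of the general idea, not a line-by-line rendering of the article's procedure.

import math
from scipy import stats

def inferential_ci(mean, se, se_other, df, alpha=0.05):
    # Reduction factor E shrinks the ordinary CI so that non-overlap of the
    # two inferential CIs matches |M1 - M2| > t_crit * sqrt(SE1^2 + SE2^2),
    # i.e., a significant two-sample t test at alpha.
    e = math.sqrt(se**2 + se_other**2) / (se + se_other)
    half = stats.t.ppf(1 - alpha / 2, df) * se * e
    return mean - half, mean + half

# Illustrative group summaries (df pooled across both groups).
print(inferential_ci(10.0, 1.2, 1.5, df=58))
print(inferential_ci(14.0, 1.5, 1.2, df=58))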

3.
Because the probability of obtaining an experimental finding given that the null hypothesis is true, p(F|H₀), is not the same as the probability that the null hypothesis is true given a finding, p(H₀|F), calculating the former probability does not justify conclusions about the latter one. As the standard null-hypothesis significance-testing procedure does just that, it is logically invalid (J. Cohen, 1994). Theoretically, Bayes's theorem yields p(H₀|F), but in practice, researchers rarely know the correct values for 2 of the variables in the theorem. Nevertheless, by considering a wide range of possible values for the unknown variables, it is possible to calculate a range of theoretical values for p(H₀|F) and to draw conclusions about both hypothesis testing and theory evaluation. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
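A minimal Python sketch of the range calculation described above, assuming the finding F is a result significant at α = .05 and scanning plausible values of the two unknowns, the prior p(H₀) and the power p(F|H₁); all numbers are illustrative:

def p_null_given_finding(prior_h0, alpha=0.05, power=0.50):
    # Bayes' theorem: p(H0 | F) for a finding F significant at alpha,
    # where power = p(F | H1) and prior_h0 = p(H0).
    p_f = alpha * prior_h0 + power * (1 - prior_h0)
    return alpha * prior_h0 / p_f

for prior in (0.2, 0.5, 0.8):
    for power in (0.2, 0.5, 0.8):
        p = p_null_given_finding(prior, power=power)
        print(f"p(H0)={prior:.1f}, power={power:.1f} -> p(H0|F)={p:.3f}")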

4.
The test of significance does not provide the information concerning psychological phenomena characteristically attributed to it; and a great deal of mischief has been associated with its use. The basic logic associated with the test of significance is reviewed. The null hypothesis is characteristically false under any circumstances. Publication practices foster the reporting of small effects in populations. Psychologists have "adjusted" by misinterpretation, taking the p value as a "measure," assuming that the test of significance provides automaticity of inference, and confusing the aggregate with the general. The difficulties are illuminated by bringing to bear the contributions from the decision-theory school on the Fisher approach. The Bayesian approach is suggested. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

5.
Increasing emphasis has been placed on the use of effect size reporting in the analysis of social science data. Nonetheless, the use of effect size reporting remains inconsistent, and interpretation of effect size estimates continues to be confused. Researchers are presented with numerous effect size estimate options, not all of which are appropriate for every research question. Clinicians also may have little guidance in the interpretation of effect sizes relevant for clinical practice. The current article provides a primer of effect size estimates for the social sciences. Common effect size estimates, their use, and their interpretation are presented as a guide for researchers. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
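As a small illustration of two of the most common estimates such a primer covers, the following Python sketch computes Cohen's d from group summaries and converts it to an r-type effect size; the d-to-r conversion shown assumes roughly equal group sizes, and all numbers are illustrative:

import math

def cohens_d(m1, m2, s1, s2, n1, n2):
    # Standardized mean difference using the pooled standard deviation.
    pooled_sd = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return (m1 - m2) / pooled_sd

def d_to_r(d):
    # Approximate d-to-r conversion; reasonable when group sizes are equal.
    return d / math.sqrt(d**2 + 4)

d = cohens_d(105.0, 100.0, 15.0, 15.0, 40, 40)
print(f"d = {d:.2f}, r = {d_to_r(d):.2f}")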

6.
The H. Bösch, F. Steinkamp, and E. Boller (see record 2006-08436-001) meta-analysis reaches mixed and cautious conclusions about the possibility of psychokinesis. The authors argue that, for both methodological and philosophical reasons, it is nearly impossible to draw any conclusions from this body of research. The authors do not agree that any significant effect at all, no matter how small, is fundamentally important (Bösch et al., 2006, p. 517), and they suggest that psychokinesis researchers focus either on producing larger effects or on specifying the conditions under which they would be willing to accept the null hypothesis. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

7.
The goal of any empirical science is to pursue the construction of a cumulative base of knowledge upon which the future of the science may be built. However, there is mixed evidence that the science of psychology can accurately be characterized by such a cumulative progression. Indeed, some argue that the development of a truly cumulative psychological science is not possible with the current paradigms of hypothesis testing in single-study designs. The author explores this controversy as a framework to introduce the 6 articles that make up this special issue on the integration of data and empirical findings across multiple studies. The author proposes that the methods and techniques described in this set of articles can significantly propel researchers forward in their ongoing quest to build a cumulative psychological science. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

8.
Reviews Kline's book (see record 2004-13019-000), which surveys the controversy regarding significance testing, offers methods for effect size and confidence interval estimation, and suggests some alternative methodologies. Whether or not one accepts Kline's view of the future of statistical significance testing, there is much of value in this book. As a textbook, it could serve as a reference for an upper-level undergraduate course, but it would be more appropriate for a graduate course. The book is a thought-provoking examination of the uneasy alliance between null hypothesis significance testing and effect size and confidence interval estimation. There is much in this book for those on both sides of the null hypothesis testing debate and for those unsure where they stand. Whatever the future holds, Kline has done well in illustrating recent advances in statistical decision-making. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

9.
The fallacy of the null-hypothesis significance test.
Though several serious objections to the null-hypothesis significance test method are raised, "its most basic error lies in mistaking the aim of a scientific investigation to be a decision, rather than a cognitive evaluation… . It is further argued that the proper application of statistics to scientific inference is irrevocably committed to extensive consideration of inverse probabilities, and to further this end, certain suggestions are offered." (PsycINFO Database Record (c) 2010 APA, all rights reserved)

10.
Reports an error in "Effect sizes for experimenting psychologists" by Ralph L. Rosnow and Robert Rosenthal (Canadian Journal of Experimental Psychology/Revue canadienne de psychologie expérimentale, 2003[Sep], Vol 57[3], 221-237). A portion of the note to Table 1 was incorrect. The second sentence of the note should read as follows: Fisher's z_r is the log transformation of r, that is, ½ log_e[(1 + r)/(1 − r)]. (The following abstract of the original article appeared in record 2003-08374-009.) [Correction Notice: An erratum for this article was reported in Vol 63(1) of Canadian Journal of Experimental Psychology/Revue canadienne de psychologie expérimentale (see record 2009-03064-004). Correction for Note in Table 1 (on page 222): The second sentence should read as follows: Fisher's z_r is the log transformation of r, that is, ½ log_e[(1 + r)/(1 − r)].] This article describes three families of effect size estimators and their use in situations of general and specific interest to experimenting psychologists. The situations discussed include both between- and within-group (repeated measures) designs. Also described is the counternull statistic, which is useful in preventing common errors of interpretation in null hypothesis significance testing. The emphasis is on correlation (r-type) effect size indicators, but a wide variety of difference-type and ratio-type effect size estimators are also described. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
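The corrected transformation is straightforward to verify in code; a minimal Python check of the formula as stated in the erratum:

import math

def fisher_z(r):
    # Fisher's z_r = (1/2) * ln((1 + r) / (1 - r)), i.e., artanh(r).
    return 0.5 * math.log((1 + r) / (1 - r))

for r in (0.1, 0.3, 0.5):
    assert abs(fisher_z(r) - math.atanh(r)) < 1e-12
    print(f"r = {r:.1f} -> z_r = {fisher_z(r):.4f}")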

11.
A quiet methodological revolution, a modeling revolution, has occurred over the past several decades, almost without discussion. In contrast, the 20th century ended with contentious argument over the utility of null hypothesis significance testing (NHST). The NHST controversy may have been at least partially irrelevant, because in certain ways the modeling revolution obviated the NHST argument. I begin with a history of NHST and modeling and their relation to one another. Next, I define and illustrate principles involved in developing and evaluating mathematical models. Following, I discuss the difference between using statistical procedures within a rule-based framework and building mathematical models from a scientific epistemology. Only the former is treated carefully in most psychology graduate training. The pedagogical implications of this imbalance and the revised pedagogy required to account for the modeling revolution are described. To conclude, I discuss how attention to modeling implies shifting statistical practice in certain progressive ways. The epistemological basis of statistics has moved away from being a set of procedures, applied mechanistically, and moved toward building and evaluating statistical and scientific models. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

12.
The authors describe the relative benefits of conducting meta-analyses with (a) individual participant data (IPD) gathered from the constituent studies and (b) aggregated data (AD), or the group-level statistics (in particular, effect sizes) that appear in reports of a study's results. Given that both IPD and AD are equally available, meta-analysis of IPD is superior to meta-analysis of AD. IPD meta-analysis permits synthesists to perform subgroup analyses not conducted by the original collectors of the data, to check the data and analyses in the original studies, to add new information to the data sets, and to use different statistical methods. However, the cost of IPD meta-analysis and the lack of available IPD data sets suggest that the best strategy currently available is to use both approaches in a complementary fashion such that the first step in conducting an IPD meta-analysis would be to conduct an AD meta-analysis. Regardless of whether a meta-analysis is conducted with IPD or AD, synthesists must remain vigilant in how they interpret their results. They must avoid ecological fallacies, Simpson's paradox, and interpretation of synthesis-generated evidence as supporting causal inferences. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
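As a sketch of the AD side of that complementary strategy, the following Python snippet pools aggregated effect sizes with standard fixed-effect inverse-variance weighting; the study values are illustrative:

import math

def fixed_effect_meta(effects, variances):
    # Inverse-variance weighted mean of aggregated (study-level) effect sizes.
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return pooled, se

pooled, se = fixed_effect_meta([0.30, 0.45, 0.20], [0.010, 0.020, 0.015])
print(f"pooled = {pooled:.3f}, 95% CI = [{pooled - 1.96 * se:.3f}, {pooled + 1.96 * se:.3f}]")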

13.
Null hypothesis significance testing has dominated quantitative research in education and psychology. However, the statistical significance of a test as indicated by a p-value does not speak to the practical significance of the study. Thus, reporting effect sizes to supplement p-values is highly recommended by scholars, journal editors, and academic associations. As a measure of practical significance, effect size quantifies the size of mean differences or the strength of associations and directly answers the research questions. Furthermore, a comparison of effect sizes across studies facilitates meta-analytic assessment and the accumulation of knowledge. In the current comprehensive review, we investigated the most recent effect size reporting and interpreting practices in 1,243 articles published in 14 academic journals from 2005 to 2007. Overall, 49% of the articles reported effect size, and 57% of those also interpreted it. To model good methodological practice in education and psychology, we also provide an illustrative example of reporting and interpreting effect size from a published study. Finally, a 7-step guideline for quantitative researchers is summarized, along with recommended resources on understanding and interpreting effect size. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
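As one concrete instance of the reporting practice recommended above, a hedged Python sketch computing eta-squared and the less biased omega-squared from one-way ANOVA summary quantities (the sums of squares shown are illustrative):

def eta_squared(ss_between, ss_total):
    # Proportion of total variance attributable to the factor.
    return ss_between / ss_total

def omega_squared(ss_between, ss_total, df_between, ms_within):
    # Less biased population estimate than eta-squared.
    return (ss_between - df_between * ms_within) / (ss_total + ms_within)

ss_b, ss_t, df_b, ms_w = 120.0, 600.0, 2, 4.0
print(f"eta^2 = {eta_squared(ss_b, ss_t):.3f}, omega^2 = {omega_squared(ss_b, ss_t, df_b, ms_w):.3f}")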

14.
The purpose of the recently proposed p_rep statistic is to estimate the probability of concurrence, that is, the probability that a replicate experiment yields an effect of the same sign (Killeen, 2005a). The influential journal Psychological Science endorses p_rep and recommends its use over that of traditional methods. Here we show that p_rep overestimates the probability of concurrence. This is because p_rep was derived under the assumption that all effect sizes in the population are equally likely a priori. In many situations, however, it is advisable also to entertain a null hypothesis of no or approximately no effect. We show how the posterior probability of the null hypothesis is sensitive to a priori considerations and to the evidence provided by the data; the higher the posterior probability of the null hypothesis, the smaller the probability of concurrence. When the null hypothesis and the alternative hypothesis are equally likely a priori, p_rep may overestimate the probability of concurrence by 30% or more. We conclude that p_rep provides an upper bound on the probability of concurrence, a bound that brings with it the danger of having researchers believe that their experimental effects are much more reliable than they actually are. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
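To make the argument concrete: under the flat-prior derivation, p_rep for a one-tailed p value is Φ(z/√2), and if a point null with posterior probability P(H₀ | data) is also entertained, the sign of a replicate is 50/50 under H₀, so the model-averaged probability of concurrence drops. The Python sketch below illustrates this logic; the averaging step is a simplified illustration of the abstract's argument, not the authors' exact computation:

from scipy.stats import norm

def p_rep(p_one_tailed):
    # Killeen's (2005) probability that a replicate yields the same sign,
    # derived under a flat prior on effect size.
    return norm.cdf(norm.ppf(1 - p_one_tailed) / 2**0.5)

def concurrence_with_point_null(p_one_tailed, post_h0):
    # Model-averaged concurrence: under H0 the replicate's sign is 50/50.
    # This averaging step is an illustration, not the authors' computation.
    return post_h0 * 0.5 + (1 - post_h0) * p_rep(p_one_tailed)

p = 0.025  # one-tailed p, i.e., two-tailed p = .05
print(f"p_rep = {p_rep(p):.3f}")  # about .917
print(f"with P(H0|data) = .5: {concurrence_with_point_null(p, 0.5):.3f}")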

15.
Selected literature related to statistical testing is reviewed to compare the theoretical models underlying parametric and nonparametric inference. Specifically, we show that these models evaluate different hypotheses, are based on different concepts of probability and resultant null distributions, and support different substantive conclusions. We suggest that cognitive scientists should be aware of both models, thus providing them with a better appreciation of the implications and consequences of their choices among potential methods of analysis. This is especially true when it is recognized that most cognitive science research employs design features that do not justify parametric procedures, but that do support nonparametric methods of analysis, particularly those based on the method of permutation/randomization. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
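A minimal Python sketch of the permutation approach the authors favor, testing a mean difference by re-randomizing group labels (assuming exchangeability under the null; the data are illustrative):

import random

def permutation_p(x, y, n_perm=10000, seed=1):
    # Two-sided permutation test for a difference in group means:
    # re-randomize group labels and count differences at least as extreme.
    rng = random.Random(seed)
    observed = abs(sum(x) / len(x) - sum(y) / len(y))
    pooled = list(x) + list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        gx, gy = pooled[:len(x)], pooled[len(x):]
        if abs(sum(gx) / len(gx) - sum(gy) / len(gy)) >= observed:
            hits += 1
    return hits / n_perm

print(permutation_p([5.1, 6.2, 5.8, 7.0], [4.0, 4.8, 5.2, 4.4]))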

16.
Researchers have looked at comparisons between medical epidemiological research and psychological research using effect size r in an effort to compare relative effects. Often the outcomes of such efforts have demonstrated comparatively low effects for medical epidemiology research in comparison with effect sizes seen in psychology. The conclusion has often been that relatively small effects seen in psychology research are as strong as those found in important epidemiological medical research. The author suggests that many of the calculated effect sizes from medical epidemiological research on which this conclusion has been based are flawed. Specifically, rather than calculating effect sizes for treatment, many results have been for a Treatment Effect × Disease Effect interaction that was irrelevant to the main study hypothesis. A technique for developing a "hypothesis-relevant" effect size r is proposed. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
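One standard way to express a 2 × 2 treatment-outcome association as an effect size r is the phi coefficient, sketched below in Python; whether this matches the author's proposed "hypothesis-relevant" computation in every detail is not specified in the abstract, and the counts are invented for illustration:

import math

def phi_coefficient(a, b, c, d):
    # Effect size r for a 2x2 table:
    #            outcome+  outcome-
    # treatment     a          b
    # control       c          d
    return (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))

# Invented counts, for illustration only.
print(f"r(phi) = {phi_coefficient(30, 70, 15, 85):.3f}")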

17.
Does psi exist? D. J. Bem (2011) conducted 9 studies with over 1,000 participants in an attempt to demonstrate that future events retroactively affect people's responses. Here we discuss several limitations of Bem's experiments on psi; in particular, we show that the data analysis was partly exploratory and that one-sided p values may overstate the statistical evidence against the null hypothesis. We reanalyze Bem's data with a default Bayesian t test and show that the evidence for psi is weak to nonexistent. We argue that in order to convince a skeptical audience of a controversial claim, one needs to conduct strictly confirmatory studies and analyze the results with statistical tests that are conservative rather than liberal. We conclude that Bem's p values do not indicate evidence in favor of precognition; instead, they indicate that experimental psychologists need to change the way they conduct their experiments and analyze their data. (PsycINFO Database Record (c) 2011 APA, all rights reserved)
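The default Bayesian t test referred to here is the JZS test of Rouder, Speckman, Sun, Morey, and Iverson (2009). Below is a Python sketch of its one-sample Bayes factor, assuming their published integral form with a Cauchy prior on effect size; treat it as an illustration rather than a validated implementation:

import math
from scipy import integrate

def jzs_bf01(t, n):
    # One-sample JZS Bayes factor BF01 (null over alternative) for a
    # t statistic, following the integral form in Rouder et al. (2009).
    v = n - 1  # degrees of freedom
    null_like = (1 + t**2 / v) ** (-(v + 1) / 2)
    def integrand(g):
        return ((1 + n * g) ** -0.5
                * (1 + t**2 / ((1 + n * g) * v)) ** (-(v + 1) / 2)
                * (2 * math.pi) ** -0.5 * g ** -1.5 * math.exp(-1 / (2 * g)))
    alt_like, _ = integrate.quad(integrand, 0, math.inf)
    return null_like / alt_like

# t = 2.0 with n = 50 is significant at p < .05, yet BF01 is near 1,
# i.e., roughly equivocal evidence.
print(f"BF01 = {jzs_bf01(2.0, 50):.2f}")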

18.
L. V. Jones and J. W. Tukey (2000) pointed out that the usual 2-sided, equal-tails null hypothesis test at level α can be reinterpreted as simultaneous tests of 2 directional inequality hypotheses, each at level α/2, and that the maximum probability of a Type I error is α/2 if the truth of the null hypothesis is considered impossible. This article points out that in multiple testing with familywise error rate controlled at α, the directional error rate (assuming all null hypotheses are false) is greater than α/2 and can be arbitrarily close to α. Single-step, step-down, and step-up procedures are analyzed, and other error rates, including the false discovery rate, are discussed. Implications for confidence interval estimation and hypothesis testing practices are considered. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

19.
This study assessed the relative accuracy of 3 techniques--local validity studies, meta-analysis, and Bayesian analysis--for estimating test validity, incremental validity, and adverse impact in the local selection context. Bayes-analysis involves combining a local study with nonlocal (meta-analytic) validity data. Using tests of cognitive ability and personality (conscientiousness) as predictors, an empirically driven selection scenario illustrates conditions in which each of the 3 estimation techniques performs best. General recommendations are offered for how to estimate local parameters, based on true population variability and the number of studies in the meta-analytic prior. Benefits of empirical Bayesian analysis for personnel selection are demonstrated, and equations are derived to help guide the choice of a local validity technique (i.e., meta-analysis vs. local study vs. Bayes-analysis). (PsycINFO Database Record (c) 2010 APA, all rights reserved)
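A rough Python sketch of the core updating step behind such a Bayes-analysis, combining a meta-analytic prior with a local validity estimate via the standard normal-normal precision-weighted formula; this is the generic conjugate update, not necessarily the authors' exact procedure, and all numbers are illustrative:

def bayes_update(prior_mean, prior_var, local_est, local_var):
    # Precision-weighted combination of a meta-analytic prior and a local study.
    w_prior, w_local = 1.0 / prior_var, 1.0 / local_var
    post_mean = (w_prior * prior_mean + w_local * local_est) / (w_prior + w_local)
    post_var = 1.0 / (w_prior + w_local)
    return post_mean, post_var

# Illustrative values: meta-analytic validity .30 (between-study variance .01),
# local study r = .15 with sampling variance .04.
mean, var = bayes_update(0.30, 0.01, 0.15, 0.04)
print(f"posterior validity = {mean:.3f} (variance = {var:.4f})")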

20.
It has long been established that there is a linear and positive relationship between relative deprivation and prejudice. However, a recent experiment suggests that the converse of relative deprivation, relative gratification, may also be associated with prejudice (S. Guimond & M. Dambrun, 2002). Specifically, the evidence suggests that the usual test for a linear relationship between relative deprivation-gratification and prejudice might conceal the existence of a bilinear relationship. This function, labeled the V-curve hypothesis, predicts that both relative deprivation and relative gratification are associated with higher levels of prejudice. This hypothesis was tested with a representative sample of South Africans (N=1,600). Results provide strong support for the V-curve hypothesis. Furthermore, strength of ethnic identification emerged as a partial mediator for the effect of relative gratification on prejudice. (PsycINFO Database Record (c) 2011 APA, all rights reserved)
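One simple way to test a V-curve of this kind is to fit separate slopes for the deprivation (x < 0) and gratification (x > 0) ranges of a centered deprivation-gratification score. The Python sketch below does this on simulated data; it is a generic illustration, not the authors' exact model:

import numpy as np

def v_curve_fit(x, y):
    # OLS with separate slopes below and above zero; a V-curve implies a
    # negative slope on the deprivation side and a positive slope on the
    # gratification side (both ends raising prejudice).
    neg = np.minimum(x, 0)  # relative deprivation segment
    pos = np.maximum(x, 0)  # relative gratification segment
    X = np.column_stack([np.ones_like(x), neg, pos])
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coefs  # intercept, slope_deprivation, slope_gratification

rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, 500)
y = 1.0 + np.abs(x) + rng.normal(0, 0.5, 500)  # simulated V-shaped outcome
print(v_curve_fit(x, y))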
