20 similar records found.
1.
Researchers have compared medical epidemiological research with psychological research using the effect size r, in an effort to compare relative effects. Such comparisons have often shown lower effect sizes in medical epidemiology than in psychology, and the conclusion has often been that the relatively small effects seen in psychology research are as strong as those found in important epidemiological medical research. The author suggests that many of the calculated effect sizes from medical epidemiological research on which this conclusion has been based are flawed. Specifically, rather than calculating effect sizes for treatment, many results have been for a Treatment Effect × Disease Effect interaction that was irrelevant to the main study hypothesis. A technique for developing a “hypothesis-relevant” effect size r is proposed.
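For readers unfamiliar with r-type effect sizes for binary outcomes, here is a minimal Python sketch (not from the article; the counts are invented) computing r as the phi coefficient of a 2 × 2 treatment-by-outcome table:

    import math

    # Invented 2 x 2 counts (rows: treatment/control; columns: event/no event).
    a, b = 15, 85   # treatment group: 15 events, 85 non-events
    c, d = 30, 70   # control group:   30 events, 70 non-events

    # r for a 2 x 2 table is the phi coefficient; the sign simply reflects
    # which row has the lower event rate under this coding.
    phi = (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    print(f"effect size r (phi) = {phi:.3f}")   # about -0.180 here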
2.
A comprehensive review of effect size reporting and interpreting practices in academic journals in education and psychology.
Null hypothesis significance testing has dominated quantitative research in education and psychology. However, the statistical significance of a test as indicated by a p-value does not speak to the practical significance of the study. Thus, reporting effect size to supplement the p-value is highly recommended by scholars, journal editors, and academic associations. As a measure of practical significance, effect size quantifies the size of mean differences or the strength of associations and directly answers the research questions. Furthermore, comparison of effect sizes across studies facilitates meta-analytic assessment of the effect and accumulation of knowledge. In the current comprehensive review, we investigated the most recent effect size reporting and interpreting practices in 1,243 articles published in 14 academic journals from 2005 to 2007. Overall, 49% of the articles reported effect size, and 57% of those interpreted it. To promote good research methodology in education and psychology, we also provide an illustrative example of reporting and interpreting effect size in a published study, along with a 7-step guideline for quantitative researchers and recommended resources on how to understand and interpret effect size.
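As a worked illustration of the two quantities this abstract mentions (size of mean differences and strength of association), a small Python sketch using invented summary statistics; the d-to-r conversion shown assumes roughly equal group sizes:

    import math

    def cohens_d(m1, s1, n1, m2, s2, n2):
        # Standardized mean difference using the pooled standard deviation.
        s_pooled = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
        return (m1 - m2) / s_pooled

    # Invented summary statistics for two groups of 40.
    d = cohens_d(105.0, 15.0, 40, 100.0, 15.0, 40)
    r = d / math.sqrt(d**2 + 4)   # d-to-r conversion, valid for equal group sizes
    print(f"d = {d:.3f}, r = {r:.3f}")   # d = 0.333, r = 0.164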
3.
Reviews Kline's book (see record 2004-13019-000) which reviews the controversy regarding significance testing, offers methods for effect size and confidence interval estimation, and suggests some alternative methodologies. Whether or not one accepts Kline's view of the future of statistical significance testing, there is much of value in this book. As a textbook, it could serve as a reference for an upper level undergraduate course but it would be more appropriate for a graduate course. The book is a thought-provoking examination of the uneasy alliance between null hypothesis significance testing, and effect size and confidence interval estimation. There is much in this book for those on both sides of the null hypothesis testing debate and for those unsure where they stand. Whatever the future holds, Kline has done well in illustrating recent advances to statistical decision-making.
4.
[Correction Notice: An erratum for this article was reported in Vol 63(2) of Canadian Journal of Experimental Psychology/Revue canadienne de psychologie expérimentale (see record 2009-08130-003). A portion of the note to Table 1 was incorrect. The second sentence of the note should read as follows: Fisher's z_r is the log transformation of r, that is, ½ log_e[(1 + r)/(1 − r)].] This article describes three families of effect size estimators and their use in situations of general and specific interest to experimenting psychologists. The situations discussed include both between- and within-group (repeated measures) designs. Also described is the counternull statistic, which is useful in preventing common errors of interpretation in null hypothesis significance testing. The emphasis is on correlation (r-type) effect size indicators, but a wide variety of difference-type and ratio-type effect size estimators are also described.
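A minimal sketch of the corrected transformation and of the counternull statistic mentioned in the abstract, assuming a null value of r = 0 and applying the symmetric-distribution rule (counternull = 2 × obtained − null) on the z_r scale:

    import math

    def fisher_z(r):
        # Fisher's z_r = 1/2 * log_e[(1 + r) / (1 - r)]  (equivalently atanh(r))
        return 0.5 * math.log((1 + r) / (1 - r))

    # Counternull on the z_r scale, assuming a null value of r = 0:
    # z_counternull = 2 * z_obtained - z_null.
    r_obtained = 0.30                       # illustrative value
    z_cn = 2 * fisher_z(r_obtained)
    r_counternull = math.tanh(z_cn)         # back-transform (tanh inverts z_r)
    print(f"counternull r = {r_counternull:.3f}")   # about 0.551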
5.
The test of significance does not provide the information concerning psychological phenomena characteristically attributed to it; and a great deal of mischief has been associated with its use. The basic logic associated with the test of significance is reviewed. The null hypothesis is characteristically false under any circumstances. Publication practices foster the reporting of small effects in populations. Psychologists have "adjusted" by misinterpretation, taking the p value as a "measure," assuming that the test of significance provides automaticity of inference, and confusing the aggregate with the general. The difficulties are illuminated by bringing to bear the contributions from the decision-theory school on the Fisher approach. The Bayesian approach is suggested.
6.
Reports an error in "Effect sizes for experimenting psychologists" by Ralph L. Rosnow and Robert Rosenthal (Canadian Journal of Experimental Psychology/Revue canadienne de psychologie expérimentale, 2003[Sep], Vol 57[3], 221-237). A portion of the note to Table 1 was incorrect. The second sentence of the note should read as follows: Fisher's z_r is the log transformation of r, that is, ½ log_e[(1 + r)/(1 − r)]. (The following abstract of the original article appeared in record 2003-08374-009.) [Correction Notice: An erratum for this article was reported in Vol 63(1) of Canadian Journal of Experimental Psychology/Revue canadienne de psychologie expérimentale (see record 2009-03064-004). Correction for Note in TABLE 1 (on page 222): The second sentence should read as follows: Fisher's z_r is the log transformation of r, that is, ½ log_e[(1 + r)/(1 − r)].] This article describes three families of effect size estimators and their use in situations of general and specific interest to experimenting psychologists. The situations discussed include both between- and within-group (repeated measures) designs. Also described is the counternull statistic, which is useful in preventing common errors of interpretation in null hypothesis significance testing. The emphasis is on correlation (r-type) effect size indicators, but a wide variety of difference-type and ratio-type effect size estimators are also described.
7.
Howard, George S.; Maxwell, Scott E.; Fleming, Kevin J. Psychological Methods, 2000, 5(3), 315.
Some methodologists have recently suggested that scientific psychology's overreliance on null hypothesis significance testing (NHST) impedes the progress of the discipline. In response, a number of defenders have maintained that NHST continues to play a vital role in psychological research. Both sides of the argument to date have been presented abstractly. The authors take a different approach to this issue by illustrating the use of NHST along with 2 possible alternatives (meta-analysis as a primary data analysis strategy and Bayesian approaches) in a series of 3 studies. Comparing and contrasting the approaches on actual data brings out the strengths and weaknesses of each approach. The exercise demonstrates that the approaches are not mutually exclusive but instead can be used to complement one another.
8.
As a potential alternative to standard null hypothesis significance testing, we describe methods for graphical presentation of data--particularly condition means and their corresponding confidence intervals--for a wide range of factorial designs used in experimental psychology. We describe and illustrate confidence intervals specifically appropriate for between-subject versus within-subject factors. For designs involving more than two levels of a factor, we describe the use of contrasts for graphical illustration of theoretically meaningful components of main effects and interactions. These graphical techniques lend themselves to a natural and straightforward assessment of statistical power.
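The sketch below illustrates the general idea for a within-subject factor using subject-mean centering (Cousineau-style normalization with Morey's correction); this is a related approximation rather than necessarily the exact ANOVA-based intervals the authors describe, and the data are invented:

    import numpy as np
    from scipy import stats

    # Invented data: rows = subjects, columns = within-subject conditions.
    data = np.array([[512., 545., 590.],
                     [480., 502., 551.],
                     [530., 561., 598.],
                     [495., 520., 566.]])
    n, k = data.shape

    # Remove between-subject variability by centering each subject on its own
    # mean (Cousineau-style normalization), then apply Morey's k/(k-1) correction.
    centered = data - data.mean(axis=1, keepdims=True) + data.mean()
    se = centered.std(axis=0, ddof=1) / np.sqrt(n) * np.sqrt(k / (k - 1))
    half = stats.t.ppf(0.975, n - 1) * se
    for j, (m, h) in enumerate(zip(data.mean(axis=0), half)):
        print(f"condition {j}: mean = {m:.1f}, 95% CI half-width = {h:.1f}")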
9.
A quiet methodological revolution, a modeling revolution, has occurred over the past several decades, almost without discussion. In contrast, the 20th century ended with contentious argument over the utility of null hypothesis significance testing (NHST). The NHST controversy may have been at least partially irrelevant, because in certain ways the modeling revolution obviated the NHST argument. I begin with a history of NHST and modeling and their relation to one another. Next, I define and illustrate principles involved in developing and evaluating mathematical models. Following, I discuss the difference between using statistical procedures within a rule-based framework and building mathematical models from a scientific epistemology. Only the former is treated carefully in most psychology graduate training. The pedagogical implications of this imbalance and the revised pedagogy required to account for the modeling revolution are described. To conclude, I discuss how attention to modeling implies shifting statistical practice in certain progressive ways. The epistemological basis of statistics has moved away from being a set of procedures, applied mechanistically, and moved toward building and evaluating statistical and scientific models.
10.
The reporting and interpretation of effect sizes in addition to statistical significance tests is becoming increasingly recognized as good research practice, as evidenced by the editorial policies of at least 23 journals that now require effect sizes. Statistical significance tests are limited in the information they provide readers about results, and effect sizes can be useful when evaluating result importance. The current article (a) summarizes statistical versus practical significance, (b) briefly discusses various effect size options, (c) presents a review of research articles published in the International Journal of Play Therapy (1993-2003) regarding use of effect sizes and statistical significance tests, and (d) provides recommendations for improved research practice in the journal and elsewhere.
11.
The statistical analysis of mediation effects has become an indispensable tool for helping scientists investigate processes thought to be causal. Yet, in spite of many recent advances in the estimation and testing of mediation effects, little attention has been given to methods for communicating effect size and the practical importance of those effect sizes. Our goals in this article are to (a) outline some general desiderata for effect size measures, (b) describe current methods of expressing effect size and practical importance for mediation, (c) use the desiderata to evaluate these methods, and (d) develop new methods to communicate effect size in the context of mediation analysis. The first new effect size index we describe is a residual-based index that quantifies the amount of variance explained in both the mediator and the outcome. The second new effect size index quantifies the indirect effect as the proportion of the maximum possible indirect effect that could have been obtained, given the scales of the variables involved. We supplement our discussion by offering easy-to-use R tools for the numerical and visual communication of effect size for mediation effects.
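The authors' two new indices require more machinery than fits here; the following Python sketch on simulated data shows only the standard ingredients they build on, namely the product-of-coefficients indirect effect and the familiar proportion-mediated ratio:

    import numpy as np

    rng = np.random.default_rng(0)
    n = 200
    x = rng.normal(size=n)                      # simulated predictor
    m = 0.5 * x + rng.normal(size=n)            # simulated mediator
    y = 0.4 * m + 0.2 * x + rng.normal(size=n)  # simulated outcome

    def coefs(X, y):
        # OLS coefficients for y ~ X, where X already includes an intercept column.
        return np.linalg.lstsq(X, y, rcond=None)[0]

    ones = np.ones(n)
    a = coefs(np.column_stack([ones, x]), m)[1]      # a path: x -> m
    b = coefs(np.column_stack([ones, x, m]), y)[2]   # b path: m -> y, controlling x
    c = coefs(np.column_stack([ones, x]), y)[1]      # total effect of x on y
    print(f"indirect = {a * b:.3f}, proportion mediated = {a * b / c:.3f}")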
12.
Because the probability of obtaining an experimental finding given that the null hypothesis is true [p(F|H0)] is not the same as the probability that the null hypothesis is true given a finding [p(H0|F)], calculating the former probability does not justify conclusions about the latter one. As the standard null-hypothesis significance-testing procedure does just that, it is logically invalid (J. Cohen, 1994). Theoretically, Bayes's theorem yields p(H0|F), but in practice, researchers rarely know the correct values for 2 of the variables in the theorem. Nevertheless, by considering a wide range of possible values for the unknown variables, it is possible to calculate a range of theoretical values for p(H0|F) and to draw conclusions about both hypothesis testing and theory evaluation.
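A small Python sketch of the range-scanning exercise the abstract describes, with illustrative values assumed for the prior p(H0) and for p(F|H1) (the power):

    def p_h0_given_f(alpha=0.05, power=0.50, prior_h0=0.50):
        # Bayes' theorem: p(H0|F) = p(F|H0) p(H0) / [p(F|H0) p(H0) + p(F|H1) p(H1)],
        # with p(F|H0) = alpha and p(F|H1) = power (assumed, not known in practice).
        p_f = alpha * prior_h0 + power * (1 - prior_h0)
        return alpha * prior_h0 / p_f

    for prior in (0.2, 0.5, 0.8):
        for power in (0.3, 0.5, 0.8):
            print(f"p(H0)={prior:.1f}, power={power:.1f} -> "
                  f"p(H0|F)={p_h0_given_f(0.05, power, prior):.3f}")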
13.
One version of r_equivalent, calculated from Fisher's exact test p values and recommended for small samples, is considered "a more realistic . . . [and] a more accurate estimate of the population correlation than . . . the sample correlation, r_sample" (R. Rosenthal & D. B. Rubin, 2003, p. 494). Small-sample properties of r_sample and of two effect size estimators (r_equivalent* and r_hybrid) that use r_equivalent were examined: r_sample is preferable to r_equivalent* (defined as r_equivalent used without restrictions) in terms of bias and mean squared error (MSE); r_hybrid (defined as r_equivalent used only when r_sample = 1.0) is generally preferable to r_equivalent*, and preferable to r_sample in terms of MSEs, except when population correlations are very large. Conditions favoring r_sample over r_equivalent* and r_hybrid in meta-analyses are noted.
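A minimal sketch of the basic r_equivalent computation (a one-tailed p converted to the corresponding t on N − 2 df, then t converted to r); the restrictions that define r_equivalent* and r_hybrid in the article are not reproduced here:

    from math import sqrt
    from scipy import stats

    def r_equivalent(p_one_tailed, n):
        # Convert the one-tailed p to the corresponding t on n - 2 df,
        # then convert t to a point-biserial-style r.
        df = n - 2
        t = stats.t.ppf(1 - p_one_tailed, df)
        return t / sqrt(t**2 + df)

    print(f"r_equivalent = {r_equivalent(0.05, 10):.3f}")   # about 0.55 for p=.05, N=10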
14.
The H. Bösch, F. Steinkamp, and E. Boller (see record 2006-08436-001) meta-analysis reaches mixed and cautious conclusions about the possibility of psychokinesis. The authors argue that, for both methodological and philosophical reasons, it is nearly impossible to draw any conclusions from this body of research. The authors do not agree that "any significant effect at all, no matter how small, is fundamentally important" (Bösch et al., 2006, p. 517), and they suggest that psychokinesis researchers focus either on producing larger effects or on specifying the conditions under which they would be willing to accept the null hypothesis.
15.
Objective: In 2005, the Journal of Consulting and Clinical Psychology (JCCP) became the first American Psychological Association (APA) journal to require statistical measures of clinical significance, plus effect sizes (ESs) and associated confidence intervals (CIs), for primary outcomes (La Greca, 2005). As this represents the single largest editorial effort to improve statistical reporting practices in any APA journal in at least a decade, in this article we investigate the efficacy of that change. Method: All intervention studies published in JCCP in 2003, 2004, 2007, and 2008 were reviewed. Each article was coded for method of clinical significance, type of ES, and type of associated CI, broken down by statistical test (F, t, chi-square, r/R², and multivariate modeling). Results: By 2008, clinical significance compliance was 75% (up from 31%), with 94% of studies reporting some measure of ES (reporting improved for individual statistical tests, ranging from η² = .05 to .17, with reasonable CIs). Reporting of CIs for ESs also improved, although only to 40%. Also, the vast majority of reported CIs used approximations, which become progressively less accurate for smaller sample sizes and larger ESs (cf. Algina & Keselman, 2003). Conclusions: Changes are near asymptote for ESs and clinical significance, but CIs lag behind. As CIs for ESs are required for primary outcomes, we show how to compute CIs for the vast majority of ESs reported in JCCP, with an example of how to use CIs for ESs as a method to assess clinical significance.
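To make the approximation issue concrete, a sketch of the common large-sample CI for Cohen's d with invented inputs; as the article notes, exact intervals require the noncentral t distribution, which this sketch deliberately omits:

    import math
    from scipy import stats

    def d_ci_approx(d, n1, n2, level=0.95):
        # Large-sample (normal) approximation; accuracy degrades for small
        # samples and large d, which is exactly the article's concern.
        se = math.sqrt((n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2)))
        z = stats.norm.ppf(0.5 + level / 2)
        return d - z * se, d + z * se

    print(d_ci_approx(0.5, 30, 30))   # invented d and group sizes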
16.
In his article, "An alternative to null-hypothesis significance tests," Killeen (2005) urged the discipline to abandon the practice of p_obs-based null hypothesis testing and to quantify the signal-to-noise characteristics of experimental outcomes with replication probabilities. He described the coefficient that he invented, p_rep, as the probability of obtaining "an effect of the same sign as that found in an original experiment" (Killeen, 2005, p. 346). The journal Psychological Science quickly came to encourage researchers to employ p_rep, rather than p_obs, in the reporting of their experimental findings. In the current article, we (a) establish that Killeen's derivation of p_rep contains an error, the result of which is that p_rep is not, in fact, the probability that Killeen set out to derive; (b) establish that p_rep is not a replication probability of any kind but, rather, is a quasi-power coefficient; and (c) suggest that Killeen has mischaracterized both the relationship between replication probabilities and statistical inference, and the kinds of claims that are licensed by knowledge of the value assumed by the replication probability that he attempted to derive.
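For reference, a sketch of the coefficient as Killeen defined it, treating p as one-tailed and rescaling the observed z by √2 because an original-minus-replication difference has doubled sampling variance (the article's point, of course, is that this quantity is not a true replication probability):

    import math
    from scipy import stats

    def p_rep(p_one_tailed):
        # z of the observed effect, rescaled by sqrt(2) because the variance of
        # an original-minus-replication difference is doubled.
        z = stats.norm.ppf(1 - p_one_tailed)
        return stats.norm.cdf(z / math.sqrt(2))

    print(f"p_rep at p = .05: {p_rep(0.05):.3f}")   # about .878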
17.
Null hypothesis statistical testing (NHST) has been debated extensively but always successfully defended. The technical merits of NHST are not disputed in this article. The widespread misuse of NHST has created a human factors problem that this article intends to ameliorate. This article describes an integrated, alternative inferential confidence interval approach to testing for statistical difference, equivalence, and indeterminacy that is algebraically equivalent to standard NHST procedures and therefore exacts the same evidential standard. The combined numeric and graphic tests of statistical difference, equivalence, and indeterminacy are designed to avoid common interpretive problems associated with NHST procedures. Multiple comparisons, power, sample size, test reliability, effect size, and cause-effect ratio are discussed. A section on the proper interpretation of confidence intervals is followed by a decision rule summary and caveats.
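A sketch of the inferential-CI idea under simplifying assumptions (pooled df; summary statistics invented): each standard interval is shrunk by E = √(SE1² + SE2²)/(SE1 + SE2), so that non-overlap of the two intervals is algebraically equivalent to a significant two-sample t test:

    import math
    from scipy import stats

    def inferential_cis(m1, se1, m2, se2, n1, n2, alpha=0.05):
        # Shrink each standard CI by E so that non-overlap of the two intervals
        # is equivalent to |m1 - m2| > t * sqrt(se1^2 + se2^2).
        e = math.sqrt(se1**2 + se2**2) / (se1 + se2)
        t = stats.t.ppf(1 - alpha / 2, n1 + n2 - 2)   # pooled df, a simplification
        return ((m1 - e * t * se1, m1 + e * t * se1),
                (m2 - e * t * se2, m2 + e * t * se2))

    print(inferential_cis(10.0, 0.8, 12.5, 0.9, 30, 30))   # invented summaries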
18.
The practice of coaching by individuals who consider themselves professional coaches has proliferated, yet coaching is not recognized as a profession. Through a metareview of scholarly works and a qualitative content analysis, an agenda for coaching-related research is proposed and applied to the criteria for a profession as a means of illustrating how coaching-related research can be utilized to support the professionalization of coaching. Recommendations for further study and their linkage to the criteria for professionalization are suggested.
19.
Underpowered studies persist in the psychological literature. This article examines reasons for their persistence and the effects on efforts to create a cumulative science. The "curse of multiplicities" plays a central role in the presentation. Most psychologists realize that testing multiple hypotheses in a single study affects the Type I error rate, but corresponding implications for power have largely been ignored. The presence of multiple hypothesis tests leads to 3 different conceptualizations of power. Implications of these 3 conceptualizations are discussed from the perspective of the individual researcher and from the perspective of developing a coherent literature. Supplementing significance tests with effect size measures and confidence intervals is shown to address some but not necessarily all problems associated with multiple testing.
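The multiplicity point admits a one-line arithmetic illustration, assuming k independent tests each with the same individual power:

    # k independent hypothesis tests, each true effect detected with power 0.5:
    k, p_individual = 5, 0.50
    p_all = p_individual ** k            # power to detect ALL effects: 0.031
    p_any = 1 - (1 - p_individual) ** k  # power to detect at least one: 0.969
    print(f"all: {p_all:.3f}, any: {p_any:.3f}")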
20.
Classic parametric statistical significance tests, such as analysis of variance and least squares regression, are widely used by researchers in many disciplines, including psychology. For classic parametric tests to produce accurate results, the assumptions underlying them (e.g., normality and homoscedasticity) must be satisfied. These assumptions are rarely met when analyzing real data. The use of classic parametric methods with violated assumptions can result in the inaccurate computation of p values, effect sizes, and confidence intervals. This may lead to substantive errors in the interpretation of data. Many modern robust statistical methods alleviate the problems inherent in using parametric methods with violated assumptions, yet modern methods are rarely used by researchers. The authors examine why this is the case, arguing that most researchers are unaware of the serious limitations of classic methods and are unfamiliar with modern alternatives. A range of modern robust and rank-based significance tests suitable for analyzing a wide range of designs is introduced. Practical advice on conducting modern analyses using software such as SPSS, SAS, and R is provided. The authors conclude by discussing robust effect size indices.
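As one concrete example of such a method, a compact Python sketch of Yuen's test on 20% trimmed means; in practice the authors point to established implementations in SPSS, SAS, and R rather than hand-rolled code:

    import numpy as np
    from scipy import stats

    def yuen_t(x, y, trim=0.2):
        # Yuen's (1974) t test comparing trimmed means with winsorized variances.
        def d_term(a):
            n = len(a)
            k = int(np.floor(trim * n))
            h = n - 2 * k                      # effective sample size after trimming
            s = np.sort(a).astype(float)
            s[:k] = s[k]                       # winsorize the lower tail
            s[n - k:] = s[n - k - 1]           # winsorize the upper tail
            return (n - 1) * np.var(s, ddof=1) / (h * (h - 1)), h
        dx, hx = d_term(x)
        dy, hy = d_term(y)
        t = (stats.trim_mean(x, trim) - stats.trim_mean(y, trim)) / np.sqrt(dx + dy)
        df = (dx + dy) ** 2 / (dx ** 2 / (hx - 1) + dy ** 2 / (hy - 1))
        return t, 2 * stats.t.sf(abs(t), df)   # two-tailed p value

    rng = np.random.default_rng(1)
    x = rng.normal(0.0, 1.0, size=30)
    y = rng.normal(0.8, 3.0, size=30)          # unequal spread: classic t struggles here
    print(yuen_t(x, y))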