20 similar documents found (search time: 10 ms)
1.
Critics have put forth several arguments against the use of tests of statistical significance (TOSSes). Among these, the converse inequality argument stands out but remains sketchy, as does criticism of it. The argument states that we want P(H|D) (where H and D represent hypothesis and data, respectively), we get P(D|H), and the 2 do not equal one another. Each of the terms in 'P(D|H) ≠ P(H|D)' requires clarification. Furthermore, the argument as a whole allows for multiple interpretations. If the argument questions the logic of TOSSes, then defenses of TOSSes fall into 2 distinct types. Clarification and analysis of the argument suggest more moderate conclusions than previously offered by friends and critics of TOSSes. Furthermore, the general method of clarification through formalization may offer a way out of the current impasse.
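The converse inequality is easy to demonstrate numerically with Bayes' rule. The following minimal Python sketch uses purely hypothetical probabilities, chosen only to show how far P(D|H) and P(H|D) can diverge:

    # Toy demonstration that P(D|H) need not equal P(H|D).
    # All numbers are hypothetical.
    p_h = 0.01            # prior P(H)
    p_d_given_h = 0.95    # P(D|H)
    p_d_given_not_h = 0.10

    # Bayes' rule: P(H|D) = P(D|H) * P(H) / P(D)
    p_d = p_d_given_h * p_h + p_d_given_not_h * (1 - p_h)
    p_h_given_d = p_d_given_h * p_h / p_d

    print(f"P(D|H) = {p_d_given_h:.2f}")  # 0.95
    print(f"P(H|D) = {p_h_given_d:.2f}")  # ~0.09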
2.
"Inasmuch as explicit terminology is needed to convey the probabilities of committing statistical errors in the respective areas of interval estimation and testing hypotheses, the concept of confidence should never be associated with the statistical test of an H regardless of the nature of the test being employed." (PsycINFO Database Record (c) 2010 APA, all rights reserved) 相似文献
3.
The author has defended the use of parametric tests (Boneau, 1960) and has been challenged on more than one occasion to justify the use of the t test in many typical psychological situations where there are measurement considerations. Intelligence is often given as an instance, the point being that intelligence is actually measured on an ordinal scale; that is, equal differences between scores represent different magnitudes at different places on the underlying continuum. This is seen as somehow invalidating the use of the t test with such scores. Burke (1953) has presented an argument which should have ended further discussion, but, in view of the present concern, a restatement of the argument and the addition of a few comments would seem indicated. The present concern seems to have been stimulated by the publication by psychologists of two texts in the field of statistics (Senders, 1958; Siegel, 1956), both of which are organized around Stevens' (1951) system of classifying measurement scales. Siegel and Senders belabor the point that parametric statistics, specifically the t and F tests, should be avoided when the measurement scales are no stronger than ordinal, a state of affairs purportedly typical in psychology.
4.
Calculation of the statistical power of statistical tests is important in planning and interpreting the results of research studies, including meta-analyses. It is particularly important in moderator analyses in meta-analysis, which are often used as sensitivity analyses to rule out moderator effects but also may have low statistical power. This article describes how to compute statistical power of both fixed- and mixed-effects moderator tests in meta-analysis that are analogous to the analysis of variance and multiple regression analysis for effect sizes. It also shows how to compute power of tests for goodness of fit associated with these models. Examples from a published meta-analysis demonstrate that power of moderator tests and goodness-of-fit tests is not always high.
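The core of such a computation is a noncentral chi-square lookup. The following minimal Python sketch (not the article's own code; the group-mean effect sizes and weights are hypothetical) computes the power of a fixed-effects moderator test with two groups of studies, assuming scipy is available:

    # Power of a fixed-effects moderator (Q-between) test in meta-analysis:
    # under the alternative, Q-between follows a noncentral chi-square.
    from scipy.stats import chi2, ncx2

    group_means = [0.2, 0.5]        # hypothesized group-mean effect sizes
    group_weights = [40.0, 35.0]    # per-group sums of inverse sampling variances

    grand = sum(w * m for w, m in zip(group_weights, group_means)) / sum(group_weights)
    # Noncentrality: weighted squared deviations of group means from the grand mean.
    lam = sum(w * (m - grand) ** 2 for w, m in zip(group_weights, group_means))

    df = len(group_means) - 1
    crit = chi2.ppf(0.95, df)             # alpha = .05 critical value
    power = 1 - ncx2.cdf(crit, df, lam)   # power under the alternative
    print(f"noncentrality = {lam:.2f}, power = {power:.2f}")

With these hypothetical inputs the power comes out near .25, well below the conventional .80, which illustrates the article's point that moderator tests are not always adequately powered.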
5.
Reviews the misuse of statistical tests in psychotherapy research studies published in the Journal of Consulting and Clinical Psychology in the years 1967–1968, 1977–1978, and 1987–1988. It focuses on 3 major problems in statistical practice: inappropriate uses of null hypothesis tests and p values, neglect of effect size, and inflation of Type I error rate. The impressive frequency of these problems is documented, and changes in statistical practices over the past 3 decades are interpreted in light of trends in psychotherapy research. The article concludes with practical suggestions for rational application of statistical tests.
6.
Mallinckrodt Brent; Abraham W. Todd; Wei Meifen; Russell Daniel W. Journal of Counseling Psychology, 2006, 53(3): 372
P. A. Frazier, A. P. Tix, and K. E. Barron (2004) highlighted a normal theory method popularized by R. M. Baron and D. A. Kenny (1986) for testing the statistical significance of indirect effects (i.e., mediator variables) in multiple regression contexts. However, simulation studies suggest that this method lacks statistical power relative to some other approaches. The authors describe an alternative developed by P. E. Shrout and N. Bolger (2002) based on bootstrap resampling methods. An example and step-by-step guide for performing bootstrap mediation analyses are provided. The test of joint significance is also briefly described as an alternative to both the normal theory and bootstrap methods. The relative advantages and disadvantages of each approach in terms of precision in estimating confidence intervals of indirect effects, Type I error, and Type II error are discussed.
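The following is a minimal Python sketch of the bootstrap approach for an indirect effect a*b (not the authors' step-by-step guide; the data are simulated and all numbers are hypothetical):

    import numpy as np

    rng = np.random.default_rng(0)
    n = 200
    x = rng.normal(size=n)               # predictor
    m = 0.4 * x + rng.normal(size=n)     # mediator
    y = 0.5 * m + rng.normal(size=n)     # outcome

    def indirect_effect(x, m, y):
        a = np.polyfit(x, m, 1)[0]       # slope of M regressed on X
        # slope of Y on M, controlling for X
        design = np.column_stack([m, x, np.ones_like(x)])
        b = np.linalg.lstsq(design, y, rcond=None)[0][0]
        return a * b

    boot = []
    for _ in range(2000):
        idx = rng.integers(0, n, size=n)          # resample cases with replacement
        boot.append(indirect_effect(x[idx], m[idx], y[idx]))

    lo, hi = np.percentile(boot, [2.5, 97.5])     # percentile 95% CI
    print(f"95% bootstrap CI for a*b: [{lo:.3f}, {hi:.3f}]")

If the interval excludes zero, the indirect effect is declared statistically significant, with no normality assumption on the sampling distribution of a*b.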
7.
Hausknecht John P.; Halpert Jane A.; Di Paolo Nicole T.; Moriarty Gerrard Meghan O. Journal of Applied Psychology, 2007, 92(2): 373
Previous studies have indicated that as many as 25% to 50% of applicants in organizational and educational settings are retested with measures of cognitive ability. Researchers have shown that practice effects are found across measurement occasions such that scores improve when these applicants retest. In this study, the authors used meta-analysis to summarize the results of 50 studies of practice effects for tests of cognitive ability. Results from 107 samples and 134,436 participants revealed an adjusted overall effect size of .26. Moderator analyses indicated that effects were larger when practice was accompanied by test coaching and when identical forms were used. Additional research is needed to understand the impact of retesting on the validity inferences drawn from test scores.
8.
Dunlap William P.; Burke Michael J.; Smith-Crowe Kristin Journal of Applied Psychology, 2003, 88(2): 356
The authors demonstrated that the most common statistical significance test used with rWG-type interrater agreement indexes in applied psychology, based on the chi-square distribution, is flawed and inaccurate. The chi-square test is shown to be extremely conservative even for modest, standard significance levels (e.g., .05). The authors present an alternative statistical significance test, based on Monte Carlo procedures, that produces the equivalent of an approximate randomization test for the null hypothesis that the actual distribution of responding is rectangular and demonstrate its superiority to the chi-square test. Finally, the authors provide tables of critical values and offer downloadable software to implement the approximate randomization test for rWG-type and average deviation (AD)-type interrater agreement indexes. The implications of these results for studying a broad range of interrater agreement problems in applied psychology are discussed.
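A minimal Python sketch of such a Monte Carlo test for a single-item rWG index under the rectangular null (not the authors' downloadable software; the number of raters, the number of response options, and the observed value are hypothetical):

    import numpy as np

    rng = np.random.default_rng(42)
    k, A = 10, 5                     # raters and response options (hypothetical)
    var_eu = (A**2 - 1) / 12         # variance of a rectangular (uniform) null

    null_rwg = []
    for _ in range(10_000):
        ratings = rng.integers(1, A + 1, size=k)   # uniform responding
        null_rwg.append(1 - ratings.var(ddof=1) / var_eu)

    crit = np.percentile(null_rwg, 95)             # .05 critical value
    observed = 0.70                                # hypothetical observed rWG
    p = np.mean(np.array(null_rwg) >= observed)
    print(f"critical rWG = {crit:.2f}, p = {p:.4f}")

The observed rWG is significant when it exceeds the simulated critical value, i.e., when agreement is greater than would plausibly arise from purely random responding.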
9.
"Construct validation was introduced in order to specify types of research required in developing tests for which the conventional views on validation are inappropriate. Personality tests, and some tests of ability, are interpreted in terms of attributes for which there is no adequate criterion. This paper indicates what sorts of evidence can substantiate such an interpretation, and how such evidence is to be interpreted." 60 references. (PsycINFO Database Record (c) 2010 APA, all rights reserved) 相似文献
10.
A decade ago, a meta-analysis showed that identification of a suspect from a sequential lineup versus a simultaneous lineup was more diagnostic of guilt (Steblay, Dysart, Fulero, & Lindsay, 2001). Since then, controversy and debate regarding sequential superiority has emerged. We report the results of a new meta-analysis involving 72 tests of simultaneous and sequential lineups from 23 different labs involving 13,143 participant-witnesses. The results are very similar to the 2001 results in showing that the sequential lineup is less likely to result in an identification of the suspect, but also more diagnostic of guilt than is the simultaneous lineup. An examination of the full diagnostic design dataset (27 tests that used the full simultaneous/sequential × culprit-present/culprit-absent design) showed that the average gap in correct identifications favoring the simultaneous lineup over the sequential lineup (8%) is smaller than the 15% figure obtained from the 2001 meta-analysis (and from the current full 72-test dataset). The lower error rate incurred for culprit-absent lineups with use of a sequential format remains consistent across the years, with 22% fewer errors than simultaneous lineups. A Bayesian analysis shows that the posterior probability of guilt following an identification of the suspect is higher for the sequential lineup across the entire base rate for culprit presence/absence. New ways to think about policy issues are discussed.
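The Bayesian step is a direct application of Bayes' rule across base rates, as in the following toy Python sketch (the hit and false-identification rates are hypothetical placeholders, not the meta-analytic estimates):

    hit_rate = 0.50        # P(suspect ID | culprit present), hypothetical
    false_id_rate = 0.10   # P(suspect ID | culprit absent), hypothetical

    for base_rate in (0.2, 0.5, 0.8):   # prior P(culprit present)
        p_id = hit_rate * base_rate + false_id_rate * (1 - base_rate)
        posterior = hit_rate * base_rate / p_id
        print(f"base rate {base_rate:.1f}: P(guilt | ID) = {posterior:.2f}")

A lineup format with a lower false-identification rate yields a higher posterior probability of guilt at every base rate, which is the sense in which the sequential procedure is "more diagnostic."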
11.
Null hypothesis statistical testing (NHST) has been debated extensively but always successfully defended. The technical merits of NHST are not disputed in this article. The widespread misuse of NHST has created a human factors problem that this article intends to ameliorate. This article describes an integrated, alternative inferential confidence interval approach to testing for statistical difference, equivalence, and indeterminacy that is algebraically equivalent to standard NHST procedures and therefore exacts the same evidential standard. The combined numeric and graphic tests of statistical difference, equivalence, and indeterminacy are designed to avoid common interpretive problems associated with NHST procedures. Multiple comparisons, power, sample size, test reliability, effect size, and cause-effect ratio are discussed. A section on the proper interpretation of confidence intervals is followed by a decision rule summary and caveats.
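One way to picture the approach, reconstructed here under standard normal-theory assumptions for two independent means (not necessarily the article's exact procedure; all statistics are hypothetical), is to rescale each group's confidence interval so that non-overlap is algebraically equivalent to a significant two-group test:

    import math

    m1, se1 = 10.0, 1.2    # group 1 mean and standard error (hypothetical)
    m2, se2 = 13.5, 1.4    # group 2 mean and standard error (hypothetical)
    z = 1.96               # two-sided .05 criterion

    # Reduction factor: non-overlap of the rescaled ("inferential") intervals
    # is then equivalent to |m2 - m1| > z * sqrt(se1**2 + se2**2).
    e = math.sqrt(se1**2 + se2**2) / (se1 + se2)
    ci1 = (m1 - z * e * se1, m1 + z * e * se1)
    ci2 = (m2 - z * e * se2, m2 + z * e * se2)
    print(ci1, ci2)   # here the intervals overlap slightly: no difference at .05

Non-overlap of the inferential intervals corresponds to a statistical difference; the article extends the same graphic logic to verdicts of equivalence and indeterminacy.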
12.
13.
Hertzog Christopher; Lindenberger Ulman; Ghisletta Paolo; von Oertzen Timo Psychological Methods, 2006, 11(3): 244
We evaluated the statistical power of single-indicator latent growth curve models (LGCMs) to detect correlated change between two variables (covariance of slopes) as a function of sample size, number of longitudinal measurement occasions, and reliability (measurement error variance). Power approximations following the method of Satorra and Saris (1985) were used to evaluate the power to detect slope covariances. Even with large samples (N = 500) and several longitudinal occasions (4 or 5), statistical power to detect covariance of slopes was moderate to low unless growth curve reliability at study onset was above .90. Studies using LGCMs may fail to detect slope correlations because of low power rather than a lack of relationship of change between variables. The present findings allow researchers to make more informed design decisions when planning a longitudinal study and aid in interpreting LGCM results regarding correlated interindividual differences in rates of development.
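The Satorra and Saris (1985) method reduces the power question to a noncentral chi-square lookup: fit the model with the slope covariance constrained to zero, use (N - 1) times the resulting minimized misfit as the noncentrality parameter, and evaluate the 1-df likelihood-ratio test. A minimal Python sketch, with a hypothetical misfit value and assuming scipy:

    from scipy.stats import chi2, ncx2

    n = 500
    f_min = 0.005               # hypothetical misfit under the zero-covariance constraint
    lam = (n - 1) * f_min       # noncentrality parameter
    crit = chi2.ppf(0.95, df=1)
    power = 1 - ncx2.cdf(crit, 1, lam)
    print(f"power = {power:.2f}")   # roughly .35 here: modest even at N = 500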
14.
From some 150 factors in objective personality tests, 18 potentially invariant patterns have been found by cross matching in all possible ways. These are divided into 12 of a satisfactory degree of invariance and universality, and 6 of lesser statistical significance. The former are discussed in this paper. 63 references.
15.
The authors (see record 1979-00153-001) argued that the reliability coefficient for the dependent variable in a controlled experiment has no direct relevance for hypothesis testing. Specifically, they demonstrated that increasing the reliability coefficient for the dependent variable did not necessarily increase the power of standard statistical tests. The authors present further evidence that large reliability coefficients are not always desirable in true experiments, and reply to J. P. Sutcliffe's (see record 1980-29332-001) basic criticisms of Nicewander and Price's contentions. (6 ref)
16.
Reviews the book, Randomization Tests by Eugene S. Edgington (1980). Edgington begins his preface by suggesting that his book has two goals: "a practical guide for experimenters" and "a textbook for courses in applied statistics." As indicated above, the book is not the detailed and authoritative volume which experimenters need as a guide to randomization tests. However, Edgington's cogent criticisms of "the long-standing fiction of random sampling in experimental research" (p. iii) will lead experimenters to consider the merits of randomization tests. Similarly, the book is not thorough enough to be a successful textbook, but it should alert all teachers of statistics and experimental design to the importance of randomization and to the weakness of the random-sampling assumption in most statistical tests.
17.
We conducted a reliability-generalization meta-analysis of 7 of the most frequently used measures of relationship satisfaction: the Locke–Wallace Marital Adjustment Test (LWMAT), the Kansas Marital Satisfaction Scale (KMS), the Quality of Marriage Index, the Relationship Assessment Scale, the Marital Opinion Questionnaire, Karney and Bradbury's (1997) semantic differential scale, and the Couples Satisfaction Index. Six hundred thirty-nine reliability coefficients from 398 articles and 636,806 individuals provided internal consistency reliability estimates for this meta-analysis. We present the average score reliabilities for each measure, characterize the variance in score reliabilities across studies, and consider sample and study characteristics that are predictive of score reliability. Overall, the KMS and the LWMAT appear to be the strongest and weakest measures, respectively, from a reliability perspective. We discuss the importance of considering reliability invariance when making cross-group comparisons and provide recommendations for researchers when selecting a measure of relationship satisfaction.
18.
The test of significance does not provide the information concerning psychological phenomena characteristically attributed to it; and a great deal of mischief has been associated with its use. The basic logic associated with the test of significance is reviewed. The null hypothesis is characteristically false under any circumstances. Publication practices foster the reporting of small effects in populations. Psychologists have "adjusted" by misinterpretation, taking the p value as a "measure," assuming that the test of significance provides automaticity of inference, and confusing the aggregate with the general. The difficulties are illuminated by bringing to bear the contributions from the decision-theory school on the Fisher approach. The Bayesian approach is suggested.
19.
A report on one university's program and problems in teaching courses in the area of tests and measurements.
20.
Points out the values of high-speed computers in psychological statistical work and proposes that the APA acquire such equipment.