Found 20 similar documents (search time: 12 ms)
1.
Assuming that the linear models for classical test theory and ANOVA hold simultaneously for some dependent variable, it is shown that 2 contradictory statements concerning the relationship between reliability and statistical power are both correct. J. E. Overall and J. A. Woodward (see PA, Vol 53:8623, 57:7284) showed that when the reliability of a difference or change score is zero, the power of a statistical test of a hypothesis of no change can be at a maximum. J. L. Fleiss (see record 1977-07259-001) found the opposite result (i.e., that the power of a statistical test of no pre–post change is at a maximum when the reliability of the difference or gain scores is equal to one). The role of the reliability of the dependent variable in statistical evaluations of controlled experiments is examined. It is argued that the conditions that yield high reliability coefficients are not necessarily optimal for significance testing. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
2.
The statistical theory of the power of significance tests, combined with the classical theory of the reliability of measurement, reveals that the power of a statistical test sometimes increases and sometimes decreases as the reliability coefficient of a dependent variable increases. This seeming paradox arises because the relation between statistical power and the reliability coefficient is not a functional relation unless another variable—either true variance or error variance—remains constant. The authors show that the paradox disappears if widely accepted, elementary results in statistical theory and measurement theory are considered together. This approach explains why some authors have reached different conclusions about how reliability influences significance tests. (12 ref)
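The paradox can be made concrete with a small numeric sketch (the variances, effect size, and sample size below are illustrative assumptions, not values from the article). Reliability is true variance over total variance; power here uses a normal approximation for a two-group test:

```python
import math

def reliability(true_var, error_var):
    # Classical test theory: reliability = true variance / total variance
    return true_var / (true_var + error_var)

def power_two_group(mean_diff, true_var, error_var, n, z_alpha=1.96):
    # Normal-approximation power for a two-sample test of a mean difference,
    # where each observed score has variance = true_var + error_var.
    total_sd = math.sqrt(true_var + error_var)
    ncp = mean_diff / (total_sd * math.sqrt(2.0 / n))
    return 0.5 * (1 + math.erf((ncp - z_alpha) / math.sqrt(2)))

# Case 1: raise reliability by shrinking error variance -> power goes up.
low  = (reliability(1.0, 1.0),  power_two_group(0.5, 1.0, 1.0, 50))
high = (reliability(1.0, 0.25), power_two_group(0.5, 1.0, 0.25, 50))

# Case 2: raise reliability by inflating true variance -> power goes down,
# because the same mean difference is divided by a larger total SD.
inflated = (reliability(4.0, 1.0), power_two_group(0.5, 4.0, 1.0, 50))

print(low, high, inflated)
```

Both routes raise the reliability coefficient from .5 to .8, but only the first raises power, which is exactly why the relation is not functional unless one variance component is held fixed.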
3.
Shortcut approximate equations are described that provide estimates of the sample size required for 50% power (α = 0.05, two-tailed) for 1 degree of freedom tests of significance for simple correlations, differences between 2 independent group means, and Pearson's chi-square test for 2 × 2 contingency tables. These sample sizes should be thought of as minima, because power equal to 50% means that the chance of a significant finding is that of flipping a fair coin. A more desirable sample size can be computed by simply doubling the 50% sample sizes, which is shown to result in power between 80% and 90%. With these simple tools, power can be estimated rapidly, which, it is hoped, will lead to greater use and understanding of power in the teaching of statistics and in research.
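The doubling heuristic for the two-group case can be checked under a simple normal approximation (a sketch with an assumed effect size of d = 0.5; the article's own shortcut equations are not reproduced here):

```python
import math

def phi(z):
    # Standard normal CDF
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def power_two_groups(d, n_per_group, z_alpha=1.96):
    # Normal-approximation power for a two-tailed two-sample z test
    ncp = d * math.sqrt(n_per_group / 2.0)
    return phi(ncp - z_alpha) + phi(-ncp - z_alpha)

d = 0.5                          # assumed medium effect size
n50 = 2 * (1.96 / d) ** 2        # per-group n giving roughly 50% power
print(power_two_groups(d, n50))       # about 0.50
print(power_two_groups(d, 2 * n50))   # just under 0.80 under this approximation
```

Doubling the 50%-power sample size lands near the bottom of the 80–90% range under the crude z approximation; the exact tests covered in the article account for the rest of the range.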
4.
To evaluate the effect of variation in illumination on Ortho-Rater scores, acuity measurements were collected from 55 college students. Illumination was varied from 10 to 125% of standard. Near and far acuity scores, for both eyes and for the right eye, were tabulated. Decreases or increases in illumination of as much as one-fourth of standard did not affect visual acuity scores. Near acuity scores suffer more than far acuity scores when illumination is decreased by more than 25%.
5.
Piedmont Ralph L.; Sokolove Robert L.; Fleming Michael Z. 《Canadian Metallurgical Quarterly》1989,1(2):155
This report examines the psychometric integrity of the Wechsler Adult Intelligence Scale—Revised (WAIS—R) subscales, and the differences between them, in a sample of 229 psychiatric patients (ages 16 to 85) from 2 community mental health centers. The results verify the overall alpha and split-half reliabilities of the instrument and indicate that greater caution needs to be exercised in clinically evaluating difference scores. Cutoff values presented in the manual appear too low to be of any statistical or diagnostic merit. Distributions for each of the 55 possible difference scores found in this sample are presented and provide a better guide for making nosological determinations.
6.
Argues that the use of difference scores to measure change in experimental research has often been faulted on the grounds that errors of measurement are additive. It is suggested that in research concerned with differences between experimental treatment groups, the loss in reliability due to calculation of difference scores is not a valid concern because the power of tests of significance is maximum when the reliability of the difference scores is zero.
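The seemingly odd claim that power peaks when difference-score reliability is zero can be illustrated numerically (illustrative values, not from the article). If every subject's true gain is identical, true-change variance is zero, so the difference score's reliability is zero; yet the observed differences then have the smallest possible variance, which maximizes the noncentrality of the test of mean change:

```python
import math

def diff_reliability(true_diff_var, error_var):
    # Reliability of a difference score: true-change variance over total
    # difference-score variance (error enters twice, once per occasion).
    return true_diff_var / (true_diff_var + 2 * error_var)

def paired_test_ncp(mean_change, true_diff_var, error_var, n):
    # Noncentrality of the test of mean change: larger means more power.
    sd_diff = math.sqrt(true_diff_var + 2 * error_var)
    return mean_change * math.sqrt(n) / sd_diff

# Everyone improves by the same amount: true-change variance is zero.
uniform = (diff_reliability(0.0, 1.0), paired_test_ncp(1.0, 0.0, 1.0, 25))
# Individual differences in change: reliability rises, power falls.
varied  = (diff_reliability(2.0, 1.0), paired_test_ncp(1.0, 2.0, 1.0, 25))
print(uniform, varied)
```

The uniform-gain case has reliability 0 but the larger noncentrality, matching the argument that low difference-score reliability is not, by itself, a threat to treatment-group comparisons.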
7.
In this article, I examine the inferences that can be based on the meta-analysis summaries known as "tests of combined significance." First, the effect size, significance value, and one example of a test of combined significance are introduced. Next, the statistical null and alternative hypotheses for combined significance tests are compared with those for analyses based on measures of effect magnitude. The hypotheses tested in effect-size analyses are more specific than the hypothesis tested in combined significance tests. Three previously analyzed sets of effect sizes are transformed into significance values and reanalyzed by using one of the most highly recommended tests of combined significance. Effect-size analyses appear more informative because the combined significance test gives identical results for three very different patterns of study outcomes. An assessment of the usefulness of combined significance methods concludes the article.
8.
Calculations of the power of statistical tests are important in planning research studies (including meta-analyses) and in interpreting situations in which a result has not proven to be statistically significant. The authors describe procedures to compute statistical power of fixed- and random-effects tests of the mean effect size, tests for heterogeneity (or variation) of effect size parameters across studies, and tests for contrasts among effect sizes of different studies. Examples are given using 2 published meta-analyses. The examples illustrate that statistical power is not always high in meta-analysis.
9.
The proportion of studies that use one-tailed statistical significance tests (π) in a population of studies targeted by a meta-analysis can affect the bias of the sample effect sizes (sample ESs, or ds) that are accessible to the meta-analyst. H. C. Kraemer, C. Gardner, J. O. Brooks, and J. A. Yesavage (1998) found that, assuming π = 1.0, for small studies (small Ns) the overestimation bias was large for small population ESs (δ = 0.2) and reached a maximum for the smallest population ES (viz., δ = 0). The present article shows (with a minor modification of H. C. Kraemer et al.'s model) that when π = 0, the small-N bias of accessible sample ESs is relatively small for δ ≤ 0.2, and a minimum (in fact, nonexistent) for δ = 0. Implications are discussed for interpretations of meta-analyses of (a) therapy efficacy and therapy effectiveness studies, (b) comparative outcome studies, and (c) studies targeting small but important population ESs.
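The π = 1.0 case from Kraemer et al. can be sketched with a crude Monte Carlo (the sample size, thresholds, and normal model for d are simplifying assumptions, not the article's exact model): only studies reaching one-tailed significance become accessible, so the accessible mean overestimates δ, most severely at δ = 0.

```python
import random
import statistics

def mean_accessible_es(delta, n_per_group, reps=20000, seed=7):
    # Monte Carlo sketch: sample effect sizes d are roughly
    # Normal(delta, sqrt(2/n)); with pi = 1 (all studies one-tailed),
    # only studies with d/se > 1.645 become accessible.
    random.seed(seed)
    se = (2.0 / n_per_group) ** 0.5
    kept = [d for d in (random.gauss(delta, se) for _ in range(reps))
            if d / se > 1.645]
    return statistics.mean(kept)

# Small studies, tiny true effects: accessible ESs grossly overestimate delta.
print(mean_accessible_es(0.0, 10))   # true effect is 0, accessible mean is far above it
print(mean_accessible_es(0.2, 10))
```

Under π = 0 (no one-tailed selection in this direction), the selection filter disappears and with it the bias, which is the article's contrast.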
10.
Reports an error in the original article by L. Atkinson (Psychological Assessment, 1991[Jun], Vol 3[2], 292–294). In Table 1, SEest(d) of Wechsler Adult Intelligence Scale-Revised (WAIS-R) difference scores for the standardization sample is incorrect; the corrected table is presented. (The following abstract of this article originally appeared in record 1991-26153-001.) Prior tables (Atkinson, 1991) describing WAIS and its revision, WAIS-R, subtest scores did not account for the fact that the best estimate of the "true" difference between 2 scores obtained by an individual is not the one actually obtained, but one based on the obtained difference and regressed toward the mean difference. Furthermore, the previously published WAIS-R table is based on a psychiatric sample. This article examined the WAIS-R standardization sample (N = 1,880) using regressed difference scores to derive statistics describing subtest discrepancies. Results indicated improved difference score reliability in the WAIS-R, as compared with the WAIS, although reliability remained poor.
11.
Research on informant discrepancies has increasingly utilized difference scores. This article demonstrates the statistical equivalence of regression models using difference scores (raw or standardized) and regression models using separate scores for each informant to show that interpretations should be consistent with both models. First, regression equations were used to demonstrate that difference score models are equivalent to models using separate scores for each informant. Second, a hypothesis-driven empirical example (218 mother–child dyads, mean age = 11.5 years, 49% female participants, 49% White, 47% African American) was used to provide an illustration of the equivalence of the 2 models. Implications of the equivalence of models using difference scores and models using separate scores for each informant are discussed in terms of the growing prevalence of an interpretation in the literature of difference score analyses that is inconsistent with results from equivalent separate informant analyses. Differences in the separate predictive ability of informants should be acknowledged as an alternative interpretation of the difference score regression coefficient.
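The equivalence is a linear reparameterization, which a short simulation makes explicit (simulated data with hypothetical informant scores A and B; not the article's dyad data): a model with the difference score plus one informant's score spans the same column space as the model with both separate scores.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
A = rng.normal(size=n)             # hypothetical informant 1 report
B = rng.normal(size=n)             # hypothetical informant 2 report
Y = 0.8 * A - 0.3 * B + rng.normal(size=n)

def fit(predictors, y):
    # OLS with an intercept via least squares
    X = np.column_stack([np.ones(len(y))] + list(predictors))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

sep  = fit([A, B], Y)        # Y = b0 + c1*A + c2*B
diff = fit([A - B, B], Y)    # Y = b0 + b1*(A - B) + b2*B

# Reparameterization: c1 = b1 and c2 = b2 - b1, so fits are identical.
print(sep, diff)
```

Because b1·(A − B) + b2·B = b1·A + (b2 − b1)·B, the two models produce the same fitted values, so the difference-score coefficient can always be re-read as a statement about the separate informants' coefficients.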
12.
Used procedures established by R. Flesch (1948) to measure the reading ease and human interest of 36 introductory educational psychology texts (31 textbooks and 5 books of readings) all published since 1970. Random samples of 20 pages were selected from each text, and from these, five 100-word samples were chosen so that the page span of the total text was included. A reading ease score and a human interest score were obtained from each 100-word sample. Compared to a similar study of introductory general psychology texts by B. Gillen (1973), results indicate that educational psychology texts have a greater probability of being classified as dull and very difficult than do the introductory general psychology texts. (40 ref)
13.
Presents a general approach based on ANCOVA structures that can be used by cognitive psychologists to calculate correlations involving a component of information processing time that cannot be directly measured. The usual procedure has been to express this component as a difference between 2 times that can be directly measured. But correlations involving difference scores are notably attenuated by the presence of measurement error, and the substantive assumptions implicit in the calculation of difference scores may not be plausible. The recommended approach begins with reasonable statements of how components of processes are structured in confirmatory factor analysis models, which can be estimated by LISREL or COFAMM. In the process of fitting such models, the proper disattenuated correlation is estimated as part of a set of parameters implied by substantive assumptions. The validity of these assumptions can be tested by comparing the fit of the model to observed data. Such a comparison may suggest how assumptions should be modified to increase the plausibility of the model. This model-fitting approach is illustrated with data relating information-processing tasks to ability measures. (20 ref)
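The attenuation problem the article addresses can be shown with the classical formulas (a simpler sketch than the CFA/LISREL approach the article actually recommends; the reliabilities and correlations below are assumed values): two individually reliable but highly correlated times yield a much less reliable difference, which drags down any correlation computed from it.

```python
import math

def difference_reliability(r11, r22, r12):
    # Classical reliability of a difference score between two equally
    # variable measures with reliabilities r11, r22 and intercorrelation r12.
    return (r11 + r22 - 2 * r12) / (2 - 2 * r12)

def disattenuate(r_obs, rel_x, rel_y):
    # Correction for attenuation: r_true = r_obs / sqrt(rel_x * rel_y)
    return r_obs / math.sqrt(rel_x * rel_y)

rel_d = difference_reliability(0.90, 0.90, 0.80)  # two reliable, correlated times
print(rel_d)                                       # the difference is far less reliable
print(disattenuate(0.30, rel_d, 0.85))             # observed r of .30 is badly attenuated
```

The model-fitting approach in the article estimates the disattenuated correlation within a CFA rather than applying this formula after the fact, but the formula shows why the raw difference-score correlation understates the relationship.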
14.
In moderated regression analysis with both a continuous predictor and nominal-level (group membership) variables, there are conditions in which the hypothesis of equal slopes of the regression of Y onto X across groups is equivalent to the hypothesis of equality of X–Y correlations across groups. This research uses those conditions to investigate the impact of heterogeneity of error variance on the accuracy of the F test for equality of regression slopes. The results show that even when sample sizes are equal, the test is not robust and, under unequal sample sizes, the pattern of excessively high or excessively low rejection rates can be severe.
15.
Critics have put forth several arguments against the use of tests of statistical significance (TOSSes). Among these, the converse inequality argument stands out but remains sketchy, as does criticism of it. The argument states that we want P(H|D) (where H and D represent hypothesis and data, respectively), we get P(D|H), and the 2 do not equal one another. Each of the terms in "P(D|H) ≠ P(H|D)" requires clarification. Furthermore, the argument as a whole allows for multiple interpretations. If the argument questions the logic of TOSSes, then defenses of TOSSes fall into 2 distinct types. Clarification and analysis of the argument suggest more moderate conclusions than previously offered by friends and critics of TOSSes. Furthermore, the general method of clarification through formalization may offer a way out of the current impasse.
16.
A persistent problem in the measurement of lateral advantage (greater ability to perform on one side—that is, one visual hemifield or ear—than on the other) has been the artifactual curvilinear relationship of the right-minus-left (R − L) difference score to (R + L) overall accuracy. This relationship is not primarily attributable to the often-cited restriction imposed by (R + L) overall accuracy on the possible size of the (R − L) difference score. Rather, the relationship is a consequence of the mere existence of a difference between mean scores on two measures of accuracy. The generality of this psychometric principle is demonstrated using two vocabulary tests. Alternative designs are described that make it possible to measure lateral advantage free from effects of the artifact. One design solution is demonstrated in two studies of how the manipulation of exposure time affects lateral advantage.
17.
Conducted a meta-analysis of 15 outcome studies on paradoxical interventions (PIs), including 86 effect sizes for PIs and 39 for nonparadoxical treatments. Results indicate a mean effect size of .99 for PIs compared to no-treatment controls, and a mean of .56 when compared to placebo-control groups. An analysis of those studies containing both PIs and nonparadoxical treatments revealed that PIs were consistently and significantly more effective.
18.
Describes procedures necessary for adjusting for nonindependence in meta-analysis. The assumption of independence among significance levels that are analyzed is often violated in practice. Because the failure to adjust for nonindependence is likely to produce bias on the meta-analytic level, ignoring the assumption of independence is not of minor consequence and requires careful attention. (12 ref)
19.
20.
Are auto accidents related to driver personality? Using a paper-and-pencil personality inventory (MMPI), the driver behavior and MMPI scores of 993 college students were compared. Some slight relationship was found. "Knowledge of the kind of personality organization and motivation of a driver may be useful for purposes of both licensing and training drivers."