Similar Literature
20 similar records found (search time: 31 ms)
1.
Questions the legitimacy of raising the probability of a Type I error (alpha level) to achieve greater power for important or interesting comparisons. The analysis of this problem suggests that there is no basis for assuming any fixed relation between the importance of a comparison and the relative costs of Type I and Type II errors. In general, however, it is suggested that experimenters tend to overestimate the costs of a Type II error and ignore important costs of a Type I error. It is concluded that the introduction of unnecessary subjective elements into significance tests should be resisted. (14 ref) (PsycINFO Database Record (c) 2010 APA, all rights reserved)

2.
Discusses 2 approaches in the use of multiple comparison procedures. One is that of the practical-research-oriented investigator, who emphasizes the importance of the Type II error and associated power. The other is exemplified by mathematical statisticians who emphasize "pure" mathematical aspects and concentrate on the importance of controlling Type I errors. Two important issues are discussed: emphasis on Type I errors within single experiments vs emphasis on Type II errors within multiple experiments. (French abstract) (36 ref) (PsycINFO Database Record (c) 2010 APA, all rights reserved)

3.
It is well known that for normally distributed errors parametric tests are statistically optimal, but perhaps less well known is that when normality does not hold, nonparametric tests frequently possess greater statistical power than parametric tests while controlling the Type I error rate. However, the use of nonparametric procedures has been limited by the absence of easily performed tests for complex experimental designs and analyses and by limited information about their statistical behavior under realistic conditions. A Monte Carlo study of tests of predictor subsets in multiple regression analysis indicates that various nonparametric tests show greater power than the F test for skewed and heavy-tailed data. These nonparametric tests can be computed with available software. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
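The study above examined tests of predictor subsets in multiple regression; the sketch below makes the same point in the simpler two-sample case, comparing a large-sample mean test against a rank-sum test on skewed (lognormal) data. All settings (sample size, shift, seed) are illustrative assumptions, not the study's design:

```python
import math
import random

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def mean_test_p(x, y):
    """Two-sided large-sample test on the mean difference
    (z approximation to Welch's t; the parametric competitor)."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    vx = sum((v - mx) ** 2 for v in x) / (nx - 1)
    vy = sum((v - my) ** 2 for v in y) / (ny - 1)
    z = (mx - my) / math.sqrt(vx / nx + vy / ny)
    return 2.0 * (1.0 - phi(abs(z)))

def rank_sum_p(x, y):
    """Wilcoxon rank-sum test, normal approximation (continuous data, no ties)."""
    nx, ny = len(x), len(y)
    ranked = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    r_x = sum(i + 1 for i, (_, grp) in enumerate(ranked) if grp == 0)
    u = r_x - nx * (nx + 1) / 2.0
    z = (u - nx * ny / 2.0) / math.sqrt(nx * ny * (nx + ny + 1) / 12.0)
    return 2.0 * (1.0 - phi(abs(z)))

def rejection_rates(shift, reps=2000, n=30, alpha=0.05, seed=1):
    """Monte Carlo rejection rates of both tests under lognormal (skewed) data."""
    rng = random.Random(seed)
    hits_mean = hits_rank = 0
    for _ in range(reps):
        x = [rng.lognormvariate(0.0, 1.0) for _ in range(n)]
        y = [rng.lognormvariate(0.0, 1.0) + shift for _ in range(n)]
        hits_mean += mean_test_p(x, y) < alpha
        hits_rank += rank_sum_p(x, y) < alpha
    return hits_mean / reps, hits_rank / reps

p_z, p_r = rejection_rates(shift=0.5)
print(f"mean test power: {p_z:.3f}  rank-sum power: {p_r:.3f}")
```

With this heavy-tailed error distribution, the rank-sum test rejects the false null far more often than the mean test, illustrating the power advantage the abstract describes.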

4.
When more than one correlation coefficient is tested for significance in a study, the probability of making at least one Type I error rises rapidly as the number of tests increases, and the probability of making a Type I error after a Type I error on a previous test is usually greater than the nominal significance level used in each test. To avoid excessive Type I errors with multiple tests of correlations, it is noted that researchers should use procedures that answer research questions with a single statistical test and/or should use special multiple-test procedures. A review of simultaneous-test and multiple-test procedures for correlations (e.g. Bartlett and Rajalakshman's test, multistage Bonferroni procedure, union-intersection tests, and the rank adjusted method) is presented, and several new procedures are described. (40 ref) (PsycINFO Database Record (c) 2010 APA, all rights reserved)
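The inflation described above can be seen with one line of arithmetic: under k independent tests at level α (an idealization, since multiple correlation tests are typically dependent), the probability of at least one Type I error is 1 - (1 - α)^k. A minimal sketch:

```python
def familywise_error_rate(alpha: float, k: int) -> float:
    """P(at least one false rejection) for k independent tests at level alpha:
    1 - (1 - alpha)^k."""
    return 1.0 - (1.0 - alpha) ** k

for k in (1, 5, 10, 20):
    print(k, round(familywise_error_rate(0.05, k), 3))
```

Even 10 tests at the .05 level push the familywise rate past .40, which is why the abstract recommends single-test or special multiple-test procedures.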

5.
Describes procedures, based on the Bonferroni inequality, for avoiding increases in Type I errors that typically occur when an increasing number of contrasts is to be computed. The 3 types of Bonferroni tests differ in the degree to which the planned contrasts are specified beforehand and the relative importance attached to each one. This system of procedures is recommended for its flexibility, simplicity, and generality. When the power of the basic Bonferroni method is focused (by ordering the contrasts or other tests of significance by their importance), the disadvantage of conservatism can be overcome. Calculations are appended. (28 ref) (PsycINFO Database Record (c) 2010 APA, all rights reserved)
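The basic correction and one way of "focusing" it can be sketched as follows. The ordered variant here is a Holm-style step-down procedure over contrasts ranked by importance; it is an illustrative assumption, not necessarily identical to the article's three variants:

```python
def bonferroni_reject(pvals, alpha=0.05):
    """Basic Bonferroni: reject H0_i when p_i <= alpha / m,
    m = number of planned contrasts."""
    m = len(pvals)
    return [p <= alpha / m for p in pvals]

def ordered_bonferroni_reject(pvals, alpha=0.05):
    """Step-down sketch: contrasts are supplied in order of importance and
    contrast i (0-based) is tested at alpha / (m - i), stopping at the first
    non-rejection. This focuses power on the contrasts that matter most."""
    m = len(pvals)
    decisions = []
    for i, p in enumerate(pvals):
        if p <= alpha / (m - i):
            decisions.append(True)
        else:
            decisions.extend([False] * (m - i))
            break
    return decisions

print(bonferroni_reject([0.001, 0.02, 0.04]))
print(ordered_bonferroni_reject([0.001, 0.02, 0.04]))
```

On these p values the basic correction rejects only the first hypothesis, while the ordered procedure rejects all three, illustrating how ordering recovers power.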

6.
The article, "Single-Sample Tests for Many Correlations," by Robert E. Larzelere and Stanley A. Mulaik (Psychological Bulletin, 1977, Vol. 84, No. 3, pp. 557-569) contains the following errors: On page 558, the sentence that includes Equation 1 should read: "Assuming that the X and Y variables have a joint multivariate normal distribution, one then regards the multiple correlation as significantly different from zero if: (1) F = R²(N-p-1)/[(1-R²)p] is greater than Fα, the corresponding critical value of the F distribution with p and N-p-1 degrees of freedom at the 100(1-α) percentile level, with N the sample size." On page 559, the sentence that includes Equation 2 should read: "The population correlation ρ(W, Y) between W and Y is then regarded as significantly different from zero if: (2) F = r_wy²(N-p-1)/[(1-r_wy²)p] is greater than the critical value Fα of the F distribution with p and N-p-1 degrees of freedom at the 100(1-α) percentile level." On page 559, the sentence that reads, "If ρ(W, Y) is regarded as equal to zero, then any variable X_i with a nonzero weight w_i in the linear combination W is also considered to have a zero correlation with Y," should be deleted. Thanks are due to Paul A. Games. (The following abstract originally appeared in record 1978-00149-001) When more than one correlation coefficient is tested for significance in a study, the probability of making at least one Type I error rises rapidly as the number of tests increases, and the probability of making a Type I error after a Type I error on a previous test is usually greater than the nominal significance level used in each test. To avoid excessive Type I errors with multiple tests of correlations, it is noted that researchers should use procedures that answer research questions with a single statistical test and/or should use special multiple-test procedures. A review of simultaneous-test and multiple-test procedures for correlations (e.g. Bartlett and Rajalakshman's test, multistage Bonferroni procedure, union-intersection tests, and the rank adjusted method) is presented, and several new procedures are described. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

7.
Monte Carlo studies provide the information needed to help researchers select appropriate analytical procedures under design conditions in which the underlying assumptions of the procedures are not met. In Monte Carlo studies, the 2 errors that one could commit involve (a) concluding that a statistical procedure is robust when it is not or (b) concluding that it is not robust when it is. In previous attempts to apply standard statistical design principles to Monte Carlo studies, the less severe of these errors has been wrongly designated the Type I error. In this article, a method is presented for controlling the appropriate Type I error rate; the determination of the number of iterations required in a Monte Carlo study to achieve desired power is described; and a confidence interval for a test's true Type I error rate is derived. A robustness criterion is also proposed that is a compromise between W. G. Cochran's (1952) and J. V. Bradley's (1978) criteria. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
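Two of the quantities mentioned, a confidence interval for a test's true Type I error rate and the number of iterations needed, can be sketched with the generic binomial normal approximation. This is a plausible reading, not necessarily the article's exact derivation:

```python
import math

def alpha_ci(rejections, iterations, z=1.96):
    """Normal-approximation CI for a test's true Type I error rate, given the
    number of rejections observed across Monte Carlo iterations."""
    p = rejections / iterations
    half = z * math.sqrt(p * (1.0 - p) / iterations)
    return p - half, p + half

def iterations_needed(alpha=0.05, half_width=0.005, z=1.96):
    """Iterations required so that the CI around a nominal alpha has the
    desired half-width: R = z^2 * alpha * (1 - alpha) / half_width^2."""
    return math.ceil(z ** 2 * alpha * (1.0 - alpha) / half_width ** 2)

print(iterations_needed())        # iterations for a +/- .005 interval at alpha = .05
print(alpha_ci(500, 10000))       # CI when 500 of 10,000 replications reject
```

Estimating a nominal .05 rate to within ±.005 already requires several thousand iterations, which is why iteration counts matter in robustness studies.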

8.
Researchers who impose directional decisions on nondirectional tests will overestimate power, underestimate sample size, and ignore the risk of Type III error (getting the direction wrong) if traditional calculations, those applying to nondirectional decisions, are used. Usually trivial with the z test, the errors can be important when α is large and effect size is small, or with tests using other distributions. One can avoid the errors by using calculations that apply to directional decisions or by using a directional two-tailed test at the outset, a conceptually simpler solution. With a revised concept of power, this article shows calculations for the test; explains how to find its power, Type III error risk, and sample size in statistical tables for traditional tests; compares it to conventional one- and two-tailed tests and to one- and two-sided confidence intervals; and concludes that when a significance test is planned it is the best choice for most purposes. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
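For a directional two-tailed z test, directional power and Type III risk follow directly from the normal CDF: with true standardized effect δ > 0, a correct-direction rejection occurs when Z exceeds the upper critical value, and a wrong-direction rejection when Z falls below the lower one. A sketch of that arithmetic (the example values, α = .20 and δ = 0.5, are chosen to match the "large α, small effect" condition the abstract flags):

```python
from statistics import NormalDist

def directional_two_tailed(delta, alpha=0.05):
    """For a z test with true standardized effect delta >= 0, return
    (probability of rejecting in the correct direction,
     probability of rejecting in the wrong direction, i.e. Type III risk)."""
    nd = NormalDist()
    z_crit = nd.inv_cdf(1.0 - alpha / 2.0)
    power_correct = 1.0 - nd.cdf(z_crit - delta)   # upper-tail rejection
    type3_risk = nd.cdf(-z_crit - delta)           # lower-tail rejection
    return power_correct, type3_risk

p, t3 = directional_two_tailed(delta=0.5, alpha=0.20)
print(f"directional power: {p:.3f}, Type III risk: {t3:.4f}")
```

Here the Type III risk is a nontrivial fraction of the total rejection probability, whereas traditional nondirectional calculations would count every rejection as a success.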

9.
Challenges C. J. Brainerd's (see record 1973-20177-001) conclusion that Ss' judgments on Piagetian tasks are more appropriate than explanations as bases for inferences about cognitive structures. A more reasonable conclusion is that explanations may yield more Type II errors, but judgments yield more Type I errors. It is recommended that both be used. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

10.
Comments on L. R. O'Leary's (see record 1973-25947-001) article on the use of job sample tests as valid predictors of job performance. Problems with O'Leary's presentation involve (1) individual exceptions to probabilistic predictions, (2) his switching of criteria in an example, and (3) his statement that the use of job simulation tests reduces both Type I and Type II errors. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

11.
Regards J. R. Levin and L. A. Marascuilo's (see record 1973-05774-001) conception of Type IV errors in an analysis of variance as a dubious contribution since most of the examples they cite as errors are reasonable procedures. Their use of interaction estimates rather than simple effect tests on cell means is opposed since such estimates involve subtraction of meaningless main effects and, therefore, become meaningless themselves. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

12.
Simultaneous test procedures have been found to be very conservative with respect to Type I errors. The present article emphasizes that simultaneous test procedures are defined for all hypotheses implied by the overall hypothesis and demonstrates that the conservativeness of simultaneous test procedures is most often due to their application to single-variable hypotheses. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

13.
Combined 3 factors, parameter used, technique used, and method of control of Type I errors, into a model that includes 100 different statistical tests, of which 64 are defensible. Tests on complex hypotheses about correlations, ρ, proportions, P, and variances, σ², comparable to tests on means, μ, are available. For the equal n case, the statistics needed can all be formulated either as t statistics or as omnibus F statistics. The technique factor with 5 levels includes 3 variations whereby a t is contrasted with 1 of 3 critical values appropriate for a given set of contrasts. The F statistic may be used on 1-way or multifactor designs on any of the above parameters. The experiment's design and experimental hypotheses dictate which cells of the crossing of these 2 factors are appropriate. The experimenter's major choice is the method of control of Type I errors. A simultaneous and 4 stepwise methods are discussed as general methods that could be used with most statistics. Setting alpha as the familywise rate of Type I errors and the use of simultaneous methods are recommended. (38 ref) (PsycINFO Database Record (c) 2010 APA, all rights reserved)

14.
Comments on the original article "The impact of chief executive officer personality on top management team dynamics: One mechanism by which leadership affects organizational performance", by R. S. Peterson et al. (see record 2003-08045-002). This comment illustrates how small sample sizes, when combined with many statistical tests, can generate unstable parameter estimates and invalid inferences. Although statistical power for 1 test in a small-sample context is too low, the experimentwise power is often high when many tests are conducted, thus leading to Type I errors that will not replicate when retested. This comment's results show how radically the specific conclusions and inferences in R. S. Peterson, D. B. Smith, P. V. Martorana, and P. D. Owens's (2003) study changed with the inclusion or exclusion of 1 data point. When a more appropriate experimentwise statistical test was applied, the instability in the inferences was eliminated, but all the inferences became nonsignificant, thus changing the positive conclusions. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

15.
Adverse impact is often assessed by evaluating whether the success rates for 2 groups on a selection procedure are significantly different. Although various statistical methods have been used to analyze adverse impact data, Fisher's exact test (FET) has been widely adopted, especially when sample sizes are small. In recent years, however, the statistical field has expressed concern regarding the default use of the FET and has proposed several alternative tests. This article reviews Lancaster's mid-P (LMP) test (Lancaster, 1961), an adjustment to the FET that tends to have increased power while maintaining a Type I error rate close to the nominal level. On the basis of Monte Carlo simulation results, the LMP test was found to outperform the FET across a wide range of conditions typical of adverse impact analyses. The LMP test was also found to provide better control over Type I errors than the large-sample Z-test when sample size was very small, but it tended to have slightly lower power than the Z-test under some conditions. (PsycINFO Database Record (c) 2011 APA, all rights reserved)
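The mid-P adjustment is simple to state: take the one-sided Fisher exact p value and subtract half the probability of the observed table. A sketch for a 2x2 table, using the hypergeometric distribution directly (the table orientation and the one-sided lower-tail convention are illustrative assumptions):

```python
from math import comb

def hypergeom_pmf(a, row1, col1, n):
    """P(A = a) under the hypergeometric null for a 2x2 table with margins
    row1 = a+b (group 1 size), col1 = a+c (total passes), n = grand total."""
    return comb(col1, a) * comb(n - col1, row1 - a) / comb(n, row1)

def fisher_and_midp(a, b, c, d):
    """One-sided (lower-tail) Fisher exact p and Lancaster's mid-P for the
    table [[a, b], [c, d]], where a small `a` means a low success rate in
    group 1. Mid-P = Fisher p minus half the observed table's probability."""
    row1, col1, n = a + b, a + c, a + b + c + d
    lo = max(0, row1 + col1 - n)
    p_fisher = sum(hypergeom_pmf(k, row1, col1, n) for k in range(lo, a + 1))
    p_mid = p_fisher - 0.5 * hypergeom_pmf(a, row1, col1, n)
    return p_fisher, p_mid

# Hypothetical adverse impact table: 1/10 pass in group 1, 5/10 in group 2.
p_f, p_m = fisher_and_midp(1, 9, 5, 5)
print(f"Fisher exact p = {p_f:.4f}, mid-P = {p_m:.4f}")
```

On this table the FET p value sits just above .05 while the mid-P falls below it, illustrating how the adjustment recovers power in small samples.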

16.
When undertaking many tests of significance, researchers are faced with the problem of how best to control the probability of committing a Type I error. The familywise approach deals directly with multiplicity problems by setting a level of significance for an entire set of related hypotheses; the comparison approach ignores the issue by setting the rate of error on each individual hypothesis. A new formulation of control, the false discovery rate, does not provide control as stringent as that of the familywise rate, but concomitant with this relaxation in stringency is an increase in sensitivity to detect effects relative to the sensitivity of familywise control. Type I error and power rates for 4 relatively powerful and easily computed pairwise multiple comparison procedures were compared with the false discovery rate procedure for various 1-way layouts by use of test statistics that do not assume variance homogeneity. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
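The false discovery rate is usually controlled with the Benjamini-Hochberg step-up procedure: sort the m p values, find the largest rank i with p_(i) <= (i/m)q, and reject the i smallest hypotheses. A generic sketch (the abstract's own comparison uses pairwise procedures and heteroscedastic test statistics, which this does not reproduce):

```python
def benjamini_hochberg(pvals, q=0.05):
    """Benjamini-Hochberg step-up FDR procedure. Returns a list of
    reject/retain decisions in the original order of pvals."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices, smallest p first
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank / m * q:
            k = rank                     # largest rank meeting its threshold
    reject = [False] * m
    for i in order[:k]:
        reject[i] = True
    return reject

print(benjamini_hochberg([0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]))
```

Because the threshold grows with rank, BH rejects at least as much as Bonferroni at the same level, which is the sensitivity gain the abstract describes.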

17.
Compares 2 procedures for protecting the number of false rejections for a set of all possible pairwise comparisons. The 2-stage strategy of computing pairwise comparisons, conditional on a significant omnibus test, is compared with the multiple comparison strategy that sets a "familywise" critical value directly. The ANOVA test, the Brown and Forsythe test, and the Welch omnibus test, as well as 3 procedures for assessing the significance of pairwise comparisons, are combined into 9 2-stage testing strategies. The data from this study establish that the common strategy of following a significant ANOVA F with Student's t tests on pairs of means results in a substantially inflated rate of Type I error when variances are heterogeneous. Type I error control, however, can be obtained with other 2-stage procedures, and the authors tentatively consider the Welch F″–Welch t″ combination desirable. In addition, the 2 techniques for controlling Type I error do not differ as much as might be expected; some 2-stage procedures are comparable to simultaneous techniques. (18 ref) (PsycINFO Database Record (c) 2010 APA, all rights reserved)

18.
Many psychological phenomena involve 2 individuals whose actions, thoughts, or feelings are interdependent. Psychologists' usual research methods assume independence of observations, hence they are ill-suited to illuminate these dyadic phenomena. In this article, the authors review some dyadic research methods: a round robin research design and a social relations statistical model. They also explain how to use these methods and describe a few substantive applications. The article focuses on significance tests for round robin data. The authors report computer simulations that compare the round robin significance tests currently in use to a within-group t test they are proposing. Results indicate that all of these significance tests adequately control Type I errors, but that the authors' new test has power advantages over existing tests. (PsycINFO Database Record (c) 2011 APA, all rights reserved)

19.
Evaluated the statistical power of the Callender-Osburn method for testing the situational specificity hypothesis in validity generalization studies. The Schmidt-Hunter 75% rule for testing the situational specificity hypothesis was also studied with regard to its sensitivity for detecting both Type I and Type II errors. Results show that both the Callender-Osburn procedure and Schmidt-Hunter 75% rule lacked sufficient statistical power to detect low-to-moderate true validity variance when sample size was below 100. (13 ref) (PsycINFO Database Record (c) 2010 APA, all rights reserved)

20.
In speeded response tasks with redundant signals, parallel processing of the signals is tested by the race model inequality. This inequality states that given a race of two signals, the cumulative distribution of response times for redundant stimuli never exceeds the sum of the cumulative distributions of response times for the single-modality stimuli. It has been derived for synchronous stimuli and for stimuli with stimulus onset asynchrony (SOA). In most experiments with asynchronous stimuli, discrete SOA values are chosen and the race model inequality is separately tested for each SOA. Due to the high number of statistical tests, Type I and II errors are increased. Here a straightforward method is demonstrated to collapse these multiple tests into one test by summing the inequalities for the different SOAs. The power of the procedure is substantially increased by assigning specific weights to SOAs at which the violation of the race model prediction is expected to be large. In addition, the method enables data analysis for experiments in which stimuli are presented with SOA from a continuous distribution rather than in discrete steps. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
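The quantity being pooled can be sketched with empirical CDFs: for each SOA, compute F_red(t) - [F_A(t) + F_B(t)] (positive values violate the race model), then combine across SOAs with weights. This computes only the descriptive statistic; the article derives a proper significance test for the pooled quantity, and the weights here are hypothetical:

```python
def ecdf(sample, t):
    """Empirical CDF of a response-time sample at time t."""
    return sum(rt <= t for rt in sample) / len(sample)

def inequality_diff(rt_red, rt_a, rt_b, t):
    """F_red(t) - [F_A(t) + F_B(t)]; positive values violate the race model."""
    return ecdf(rt_red, t) - (ecdf(rt_a, t) + ecdf(rt_b, t))

def pooled_diff(per_soa, weights, t):
    """Weighted sum of the per-SOA inequality differences, collapsing many
    per-SOA tests into a single quantity. Larger weights go to SOAs where a
    violation of the race model prediction is expected to be large."""
    return sum(w * inequality_diff(red, a, b, t)
               for w, (red, a, b) in zip(weights, per_soa))
```

A single test on the pooled quantity replaces one test per SOA, which is how the method avoids the Type I and II error inflation of multiple separate tests.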
