Similar Documents
20 similar records retrieved
1.
Bootstrapping is introduced as a method for approximating the standard errors of validity generalization (VG) estimates. A Monte Carlo study was conducted to evaluate the accuracy of bootstrap validity-distribution parameter estimates, bootstrap standard error estimates, and nonparametric bootstrap confidence intervals. In the simulation study the authors manipulated the sample sizes per correlation coefficient, the number of coefficients per VG analysis, and the variance of the distribution of true correlation coefficients. The results indicate that the standard error estimates produced by the bootstrapping procedure were very accurate. It is recommended that the bootstrap standard-error estimates and confidence intervals be used in the interpretation of the results of VG analyses. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
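Illustrative sketch (not the authors' code): a bare-bones validity generalization computation, consisting of the sample-size-weighted mean correlation and the residual true-score standard deviation, with bootstrap standard errors and percentile intervals obtained by resampling studies. The function names (`vg_estimates`, `bootstrap_vg`) and the Hunter-Schmidt-style sampling-error correction are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def vg_estimates(r, n):
    """Bare-bones VG: weighted mean r and residual SD after removing sampling error."""
    r, n = np.asarray(r, float), np.asarray(n, float)
    mean_r = np.average(r, weights=n)
    obs_var = np.average((r - mean_r) ** 2, weights=n)
    # Expected sampling-error variance of r (Hunter-Schmidt-style approximation).
    err_var = np.average((1 - mean_r ** 2) ** 2 / (n - 1), weights=n)
    sd_rho = np.sqrt(max(obs_var - err_var, 0.0))
    return mean_r, sd_rho

def bootstrap_vg(r, n, B=2000, alpha=0.05):
    """Resample studies with replacement; return SEs and percentile CIs."""
    r, n = np.asarray(r), np.asarray(n)
    k = len(r)
    boot = np.array([vg_estimates(r[idx], n[idx])
                     for idx in rng.integers(0, k, size=(B, k))])
    se = boot.std(axis=0, ddof=1)
    lo, hi = np.percentile(boot, [100 * alpha / 2, 100 * (1 - alpha / 2)], axis=0)
    return se, np.column_stack([lo, hi])

# Hypothetical meta-analytic data: 15 studies with correlations and sample sizes.
r = rng.normal(0.3, 0.1, 15).round(2)
n = rng.integers(50, 300, 15)
print(vg_estimates(r, n))
print(bootstrap_vg(r, n))
```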

2.
Examined clinical judgment in estimating premorbid intellectual function (IQ). In 2 experiments, clinical neuropsychologists were asked to (1) specify their beliefs about the interrelationships between IQ and demographic predictors and (2) estimate IQ scores for hypothetical individuals. The clinicians believed that the relationships between the variables were stronger than previous research has established. On the judgment tasks, the clinicians provided narrower confidence intervals than those derived from their beliefs about the correlations, although this effect was primarily limited to estimates of Performance IQ. There were also discrepancies between clinicians' beliefs about the IQ–predictor correlations and the correlations between the clinicians' IQ estimates and the same predictors, suggesting an inability to appropriately regress estimates. Although the clinicians' IQ estimates were close to those of an actuarial formula using the same information, their confidence was considerably higher. Constraints on human reasoning and memory prevent clinical reasoners from making estimates of premorbid IQ that reflect the predictive power of demographic variables. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

3.
OBJECTIVE: As physiology-based assessments of mortality risk become more accurate, their potential utility in clinical decision support and resource rationing decisions increases. Before these prediction models can be used, however, their performance must be statistically evaluated and interpreted in a clinical context. We examine the issues of confidence intervals (as estimates of survival ranges) and confidence levels (as estimates of clinical certainty) by applying the Pediatric Risk of Mortality III (PRISM III) score in two scenarios: (1) survival prediction for individual patients and (2) resource rationing. DESIGN: A non-concurrent cohort study. SETTING: 32 pediatric intensive care units (PICUs). PATIENTS: 10,608 consecutive patients (571 deaths). INTERVENTIONS: None. MEASUREMENTS AND RESULTS: For the individual patient application, we investigated the observed survival rates for patients with low survival predictions and the confidence intervals associated with these predictions. For the resource rationing application, we investigated the maximum error rate of a policy that would limit therapy for patients with scores exceeding a very high threshold. For both applications, we also investigated how the confidence intervals change as the confidence levels change. The observed survival rates in the PRISM III score groups >28, >35, and >42 were 6.3%, 5.3%, and 0%, with 95% upper confidence interval bounds of 10.5%, 13.0%, and 13.3%, respectively. Changing the confidence level altered the survival range by more than 300% in the highest risk group, indicating the importance of clinical certainty provisions in prognostic estimates. The maximum error rates for resource allocation decisions were low (e.g., 29 per 100,000 at a 95% certainty level), equivalent to many of the risks of daily living. Changes in confidence level had relatively little effect on this result. CONCLUSIONS: Predictions of an individual patient's risk of death at a high PRISM score are not statistically precise, because the small number of patients in these groups produces wide confidence intervals. Clinical certainty (confidence level) issues substantially influence outcome ranges for individual patients, directly affecting the utility of scores for individual patient use. However, sample sizes are sufficient for rationing decisions for many groups with higher certainty levels. Before there can be widespread acceptance of this type of decision support, physicians and families must confront what they believe is adequate certainty.
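The kind of interval arithmetic at issue can be sketched with a generic exact (Clopper-Pearson) one-sided upper bound for a binomial survival proportion. This is an assumption-laden illustration, not necessarily the method used in the study, and the counts are invented; it simply shows how the upper bound for a small high-risk group moves as the confidence level changes.

```python
from scipy.stats import beta

def upper_bound(survivors, n, conf=0.95):
    """Exact (Clopper-Pearson) one-sided upper bound for a binomial proportion."""
    if survivors == n:
        return 1.0
    return beta.ppf(conf, survivors + 1, n - survivors)

# Illustrative numbers only (not the PRISM III data): 3 survivors among 48 high-risk patients.
for conf in (0.90, 0.95, 0.99):
    print(conf, round(upper_bound(3, 48, conf), 3))
```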

4.
Cost-effectiveness ratios usually appear as point estimates without confidence intervals, since the numerator and denominator are both stochastic and one cannot estimate the variance of the estimator exactly. The recent literature, however, stresses the importance of presenting confidence intervals for cost-effectiveness ratios in the analysis of health care programmes. This paper compares the use of several methods to obtain confidence intervals for the cost-effectiveness of a randomized intervention to increase the use of Medicaid's Early and Periodic Screening, Diagnosis and Treatment (EPSDT) programme. Comparisons of the intervals show that methods that account for skewness in the distribution of the ratio estimator may be substantially preferable in practice to methods that assume the cost-effectiveness ratio estimator is normally distributed. We show that non-parametric bootstrap methods that are mathematically less complex but computationally more rigorous result in confidence intervals that are similar to the intervals from a parametric method that adjusts for skewness in the distribution of the ratio. The analyses also show that the modest sample sizes needed to detect statistically significant effects in a randomized trial may result in confidence intervals for estimates of cost-effectiveness that are much wider than the boundaries obtained from deterministic sensitivity analyses.
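A hedged sketch of the nonparametric bootstrap approach mentioned above: resample patients within each arm, recompute the incremental cost-effectiveness ratio, and take percentile limits. The data, distributions, and names (`icer`, `bootstrap_icer_ci`) are invented for illustration; the study's parametric skewness-adjusted method is not reproduced.

```python
import numpy as np

rng = np.random.default_rng(1)

def icer(cost_t, eff_t, cost_c, eff_c):
    """Incremental cost-effectiveness ratio: extra cost per extra unit of effect."""
    return (cost_t.mean() - cost_c.mean()) / (eff_t.mean() - eff_c.mean())

def bootstrap_icer_ci(cost_t, eff_t, cost_c, eff_c, B=5000, alpha=0.05):
    """Resample patients within each arm; percentile interval for the ratio."""
    stats = []
    for _ in range(B):
        it = rng.integers(0, len(cost_t), len(cost_t))
        ic = rng.integers(0, len(cost_c), len(cost_c))
        stats.append(icer(cost_t[it], eff_t[it], cost_c[ic], eff_c[ic]))
    return np.percentile(stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])

# Simulated (hypothetical) data: right-skewed costs, binary screening outcome.
cost_t = rng.lognormal(5.0, 0.8, 300); eff_t = rng.binomial(1, 0.55, 300).astype(float)
cost_c = rng.lognormal(4.8, 0.8, 300); eff_c = rng.binomial(1, 0.40, 300).astype(float)
print(icer(cost_t, eff_t, cost_c, eff_c), bootstrap_icer_ci(cost_t, eff_t, cost_c, eff_c))
```

Note that percentile intervals for ratios can misbehave when the effect difference in a resample is near zero; with the effect sizes assumed here that is unlikely.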

5.
Indices of positive and negative agreement for observer reliability studies, in which neither observer can be regarded as the standard, have been proposed. In this article, it is demonstrated by means of an example and a small simulation study that a recently published method for constructing confidence intervals for these indices leads to intervals that are too wide. Appropriate asymptotic (i.e., large sample) variance estimates and confidence intervals for the positive and negative agreement indices are presented and compared with bootstrap confidence intervals. We also discuss an alternative method of interval estimation motivated from a Bayesian viewpoint. The asymptotic intervals performed adequately for sample sizes of 200 or more. For smaller samples, alternative confidence intervals such as bootstrap intervals or Bayesian intervals should be considered.
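The positive and negative agreement indices themselves have simple closed forms from a 2x2 table. The sketch below computes them from two raters' binary ratings and adds a percentile bootstrap interval (one of the interval types the article compares); the ratings are simulated and the function names are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(2)

def agreement_indices(x, y):
    """Positive and negative agreement for two raters' binary ratings (1/0)."""
    x, y = np.asarray(x), np.asarray(y)
    a = np.sum((x == 1) & (y == 1))   # both positive
    d = np.sum((x == 0) & (y == 0))   # both negative
    b = np.sum((x == 1) & (y == 0))   # disagreements
    c = np.sum((x == 0) & (y == 1))
    pa = 2 * a / (2 * a + b + c)
    na = 2 * d / (2 * d + b + c)
    return pa, na

def bootstrap_ci(x, y, B=2000, alpha=0.05):
    """Percentile bootstrap over subjects for both indices."""
    x, y, n = np.asarray(x), np.asarray(y), len(x)
    boot = np.array([agreement_indices(x[i], y[i])
                     for i in rng.integers(0, n, size=(B, n))])
    return np.percentile(boot, [100 * alpha / 2, 100 * (1 - alpha / 2)], axis=0)

# Hypothetical ratings from two observers on 60 subjects.
x = rng.binomial(1, 0.3, 60)
y = np.where(rng.random(60) < 0.85, x, 1 - x)   # observers mostly agree
print(agreement_indices(x, y))
print(bootstrap_ci(x, y))
```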

6.
An approach to sample size planning for multiple regression is presented that emphasizes accuracy in parameter estimation (AIPE). The AIPE approach yields precise estimates of population parameters by providing necessary sample sizes in order for the likely widths of confidence intervals to be sufficiently narrow. One AIPE method yields a sample size such that the expected width of the confidence interval around the standardized population regression coefficient is equal to the width specified. An enhanced formulation ensures, with some stipulated probability, that the width of the confidence interval will be no larger than the width specified. Issues involving standardized regression coefficients and random predictors are discussed, as are the philosophical differences between AIPE and the power analytic approaches to sample size planning. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
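A rough sketch of the basic AIPE logic under stated assumptions: increase N until the expected confidence-interval width for a standardized slope falls below a target width, using the common large-sample standard error approximation. The function name and planning values are invented, and the paper's exact formulation (including the random-predictor and assurance refinements) differs.

```python
from math import sqrt
from scipy.stats import t

def aipe_n(r2_full, r2_xj, p, target_width, alpha=0.05):
    """Smallest N so the expected CI width for a standardized slope is <= target_width.

    r2_full: squared multiple correlation of Y with all p predictors.
    r2_xj:   squared multiple correlation of predictor j with the other predictors.
    Uses the usual large-sample SE approximation; a rough planning value only.
    """
    n = p + 2
    while True:
        df = n - p - 1
        se = sqrt((1 - r2_full) / ((1 - r2_xj) * df))
        width = 2 * t.ppf(1 - alpha / 2, df) * se
        if width <= target_width:
            return n
        n += 1

# Example: 5 predictors, R^2 = .40, modest collinearity, CI no wider than .20.
print(aipe_n(r2_full=0.40, r2_xj=0.20, p=5, target_width=0.20))
```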

7.
The coding of time in growth curve models has important implications for the interpretation of the resulting model that are sometimes not transparent. The authors develop a general framework that includes predictors of growth curve components to illustrate how parameter estimates and their standard errors are exactly determined as a function of recoding time in growth curve models. Linear and quadratic growth model examples are provided, and the interpretation of estimates given a particular coding of time is illustrated. How and why the precision and statistical power of predictors of lower order growth curve components changes over time is illustrated and discussed. Recommendations include coding time to produce readily interpretable estimates and graphing lower order effects across time with appropriate confidence intervals to help illustrate and understand the growth process. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
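A minimal numpy illustration of the point about coding time, using ordinary least squares on pooled data rather than a full multilevel growth model: recentering time moves the intercept to the expected outcome at the new zero point and changes its standard error, while the slope is unaffected in the linear case. All simulated values and the helper name `fit_line` are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(3)

def fit_line(time, y):
    """OLS intercept and slope with standard errors."""
    X = np.column_stack([np.ones_like(time), time])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - 2)
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta, se

# Simulated linear growth observed at 5 occasions (waves 0..4), 40 cases pooled.
time = np.tile(np.arange(5.0), 40)
y = 10 + 2 * time + rng.normal(0, 3, time.size)

for center in (0, 2, 4):   # code time so 0 = first, middle, or last wave
    beta, se = fit_line(time - center, y)
    print(f"time centered at wave {center}: intercept={beta[0]:.2f} (SE {se[0]:.2f}), "
          f"slope={beta[1]:.2f} (SE {se[1]:.2f})")
```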

8.
Large-sample confidence intervals (CI) for reliability, validity, and unattenuated validity are presented. The CI for unattenuated validity is based on the Bonferroni inequality, which relies on one CI for test–retest reliability and one for validity. Covered are four reliability–validity situations: (a) both estimates were from random samples; (b) reliability was from a random sample but validity was from a selected sample; (c) validity was from a random sample but reliability was from a selected sample; and (d) both estimates were from selected samples. All CIs were evaluated by using a simulation. CIs on reliability, validity, or unattenuated validity are accurate as long as selection ratio is at least 20% and selected sample size is 100 or larger. When selection ratio is less than 20%, estimators tend to underestimate their parameters. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
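A sketch of the Bonferroni construction for case (a), both estimates from random samples: take a 1 - alpha/2 Fisher-z interval for the validity and another for the reliability, then combine them so that joint coverage is at least 1 - alpha. Treat this as an assumption-laden sketch of the general idea; the selected-sample cases (b) through (d) require different variance expressions not shown here, and the sample values are invented.

```python
from math import atanh, tanh, sqrt
from scipy.stats import norm

def fisher_ci(r, n, conf):
    """Fisher-z confidence interval for a correlation."""
    z = atanh(r)
    half = norm.ppf(1 - (1 - conf) / 2) / sqrt(n - 3)
    return tanh(z - half), tanh(z + half)

def unattenuated_validity_ci(r_xy, n_xy, r_yy, n_yy, alpha=0.05):
    """Bonferroni interval for r_xy / sqrt(r_yy): each component gets a 1 - alpha/2 CI,
    so the combined interval has coverage of at least 1 - alpha (random-samples case)."""
    lo_v, hi_v = fisher_ci(r_xy, n_xy, 1 - alpha / 2)
    lo_r, hi_r = fisher_ci(r_yy, n_yy, 1 - alpha / 2)
    return lo_v / sqrt(hi_r), hi_v / sqrt(lo_r)

# Hypothetical values: validity .35 (n = 200), criterion reliability .70 (n = 150).
print(round(0.35 / sqrt(0.70), 3), unattenuated_validity_ci(0.35, 200, 0.70, 150))
```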

9.
This research presents the inferential statistics for Cronbach's coefficient alpha on the basis of the standard statistical assumption of multivariate normality. The estimation of alpha's standard error (ASE) and confidence intervals are described, and the authors analytically and empirically investigate the effects of the components of these equations. The authors then demonstrate the superiority of this estimate compared with previous derivations of ASE in a separate Monte Carlo simulation. The authors also present a sampling error and test statistic for a test of independent sample alphas. They conclude with a recommendation that all alpha coefficients be reported in conjunction with standard error or confidence interval estimates and offer SAS and SPSS programming codes for easy implementation. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
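The article's analytic ASE formula is not reproduced here. As a purely nonparametric stand-in (a swapped-in technique, not the authors' method), the sketch below computes coefficient alpha from the item covariance matrix and obtains a standard error and percentile interval by bootstrapping respondents; the simulated scale and function names are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(4)

def cronbach_alpha(items):
    """Coefficient alpha from an (n_persons, k_items) score matrix."""
    cov = np.cov(items, rowvar=False)
    k = cov.shape[0]
    return k / (k - 1) * (1 - np.trace(cov) / cov.sum())

def alpha_bootstrap_ci(items, B=2000, alpha_level=0.05):
    """Bootstrap SE and percentile CI by resampling respondents."""
    n = items.shape[0]
    boot = np.array([cronbach_alpha(items[i]) for i in rng.integers(0, n, size=(B, n))])
    se = boot.std(ddof=1)
    lo, hi = np.percentile(boot, [100 * alpha_level / 2, 100 * (1 - alpha_level / 2)])
    return se, (lo, hi)

# Hypothetical 6-item scale for 250 respondents with a single common factor.
factor = rng.normal(size=(250, 1))
items = 0.7 * factor + rng.normal(scale=0.7, size=(250, 6))
print(round(cronbach_alpha(items), 3), alpha_bootstrap_ci(items))
```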

10.
There has been much recent attention given to the problems involved with the traditional approach to null hypothesis significance testing (NHST). Many have suggested that, perhaps, NHST should be abandoned altogether in favor of other bases for conclusions such as confidence intervals and effect size estimates (e.g., F. L. Schmidt; see record 83-24994). The purposes of this article are to (a) review the function that data analysis is supposed to serve in the social sciences, (b) examine the ways in which these functions are performed by NHST, (c) examine the case against NHST, and (d) evaluate interval-based estimation as an alternative to NHST. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

11.
The bootstrap is a nonparametric technique for estimating standard errors and approximate confidence intervals. Rasmussen has used a simulation experiment to suggest that bootstrap confidence intervals perform very poorly in the estimation of a correlation coefficient. Part of Rasmussen's simulation is repeated. A careful look at the results shows the bootstrap intervals performing quite well. Some remarks are made concerning the virtues and defects of bootstrap intervals in general. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
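A hedged re-creation of the kind of comparison involved (not the original simulation code): draw one bivariate-normal sample, compute a percentile bootstrap interval for r, and show the standard Fisher-z interval for reference. The true correlation, sample size, and function names are assumptions for the example.

```python
import numpy as np
from math import atanh, tanh, sqrt
from scipy.stats import norm

rng = np.random.default_rng(5)

def boot_percentile_ci(x, y, B=2000, alpha=0.05):
    """Percentile bootstrap interval for the Pearson correlation."""
    n = len(x)
    rs = [np.corrcoef(x[i], y[i])[0, 1] for i in rng.integers(0, n, size=(B, n))]
    return tuple(np.percentile(rs, [100 * alpha / 2, 100 * (1 - alpha / 2)]))

def fisher_ci(r, n, alpha=0.05):
    """Standard Fisher-z interval for comparison."""
    z, half = atanh(r), norm.ppf(1 - alpha / 2) / sqrt(n - 3)
    return tanh(z - half), tanh(z + half)

# One bivariate-normal sample with true rho = .5, n = 30.
rho, n = 0.5, 30
x, y = rng.multivariate_normal([0, 0], [[1, rho], [rho, 1]], size=n).T
r = np.corrcoef(x, y)[0, 1]
print(r, boot_percentile_ci(x, y), fisher_ci(r, n))
```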

12.
Confidence regions (CR) for heritability (h^2) and fraction of variance accounted for by permanent environmental effects (c^2) from Method R estimates were obtained from simulated data using a univariate, repeated measures, full animal model, with 50% subsampling. Bootstrapping techniques were explored to assess the optimum number of subsamples needed to compute Method R estimates of h^2 and c^2 with properties similar to those of exact estimators. One thousand estimates of each parameter set were used to obtain 90, 95, and 99% CR in four data sets including 2,500 animals with four measurements each. Two approaches were explored to assess CR accuracy: a parametric approach assuming bivariate normality of h^2 and c^2 and a nonparametric approach based on the sum of squared rank deviations. Accuracy of CR was assessed by the average loss of confidence (LOSS) by number of estimates sampled (NUMEST). For NUMEST = 5, bootstrap estimates of h^2 and c^2 were within 10^-3 of the asymptotic ones. The same degree of convergence in the estimates of SE was achieved with NUMEST = 20. Correlation between estimates of h^2 and c^2 ranged from -.83 to -.98. At NUMEST < 10, the nonparametric CR were more accurate than parametric CR. However, with the parametric CR, LOSS approached zero at rate NUMEST^-1. This rate was an order of magnitude larger for the nonparametric CR. These results suggested that when the computational burden of estimating genetic parameters limits the number of Method R estimates that can be obtained to, say, 10 or 20, reliable CR can still be obtained by processing Method R estimates through bootstrapping techniques.

13.
Judges were asked to make numerical estimates (e.g., "In what year was the first flight of a hot air balloon?"). Judges provided high and low estimates such that they were X% sure that the correct answer lay between them. They exhibited substantial overconfidence: The correct answer fell inside their intervals much less than X% of the time. This contrasts with choices between 2 possible answers to a question, which showed much less overconfidence. The authors show that overconfidence in interval estimates can result from variability in setting interval widths. However, the main cause is that subjective intervals are systematically too narrow given the accuracy of one's information; sometimes they are only 40% as large as necessary to be well calibrated. The degree of overconfidence varies greatly depending on how intervals are elicited. There are also substantial differences among domains and between male and female judges. The authors discuss the possible psychological mechanisms underlying this pattern of findings. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

14.
Confidence intervals are widely accepted as a preferred way to present study results. They encompass significance tests and provide an estimate of the magnitude of the effect. However, comparisons of correlations still rely heavily on significance testing. The persistence of this practice is caused primarily by the lack of simple yet accurate procedures that can maintain coverage at the nominal level in a nonlopsided manner. The purpose of this article is to present a general approach to constructing approximate confidence intervals for differences between (a) 2 independent correlations, (b) 2 overlapping correlations, (c) 2 nonoverlapping correlations, and (d) 2 independent R^2s. The distinctive feature of this approach is its acknowledgment of the asymmetry of sampling distributions for single correlations. This approach requires only the availability of confidence limits for the separate correlations and, for correlated correlations, a method for taking into account the dependency between correlations. These closed-form procedures are shown by simulation studies to provide very satisfactory results in small to moderate sample sizes. The proposed approach is illustrated with worked examples. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
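For the simplest case, two independent correlations, the construction can be sketched as follows: each correlation gets its own asymmetric Fisher-z limits, which are then combined into an interval for the difference. This reflects my reading of the general approach and should be treated as a sketch; the correlated-correlation and R^2 cases add dependency terms not shown, and the example values are invented.

```python
from math import atanh, tanh, sqrt
from scipy.stats import norm

def fisher_limits(r, n, alpha=0.05):
    """Asymmetric Fisher-z limits for a single correlation."""
    z, half = atanh(r), norm.ppf(1 - alpha / 2) / sqrt(n - 3)
    return tanh(z - half), tanh(z + half)

def diff_ci_independent(r1, n1, r2, n2, alpha=0.05):
    """CI for r1 - r2 (independent samples), built from each correlation's own
    asymmetric limits rather than a single symmetric standard error."""
    l1, u1 = fisher_limits(r1, n1, alpha)
    l2, u2 = fisher_limits(r2, n2, alpha)
    lower = r1 - r2 - sqrt((r1 - l1) ** 2 + (u2 - r2) ** 2)
    upper = r1 - r2 + sqrt((u1 - r1) ** 2 + (r2 - l2) ** 2)
    return lower, upper

# Example: r = .50 (n = 100) vs r = .30 (n = 120).
print(diff_ci_independent(0.50, 100, 0.30, 120))
```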

15.
Estimates of brief time intervals, ranging from 2 to 120 sec., were obtained from 24 young offenders and 48 controls by the methods of production and verbal estimation. The verbal estimations were obtained of "empty" intervals as well as of intervals "filled" with a buzzer tone. Intelligence estimates were obtained on all Ss. The results indicate that brief time intervals appear longer to delinquents than to nondelinquents. The controls, but not the delinquents, gave shorter verbal estimates of the "filled" than of the "empty" intervals. Intelligence was not a significant source of variance in either the delinquents' or the controls' verbal estimation scores, but correlated significantly with the delinquents' production scores of the relatively longer intervals (15-120 sec.). (PsycINFO Database Record (c) 2010 APA, all rights reserved)

16.
Research with general knowledge items demonstrates extreme overconfidence when people estimate confidence intervals for unknown quantities, but close to zero overconfidence when the same intervals are assessed by probability judgment. In 3 experiments, the authors investigated if the overconfidence specific to confidence intervals derives from limited task experience or from short-term memory limitations. As predicted by the naive sampling model (P. Juslin, A. Winman, & P. Hansson, 2007), overconfidence with probability judgment is rapidly reduced by additional task experience, whereas overconfidence with intuitive confidence intervals is minimally affected even by extensive task experience. In contrast to the minor bias with probability judgment, the extreme overconfidence bias with intuitive confidence intervals is correlated with short-term memory capacity. The proposed interpretation is that increased task experience is not sufficient to cure the overconfidence with confidence intervals because it stems from short-term memory limitations. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

17.
One conceptualization of meta-analysis is that studies within the meta-analysis are sampled from populations with mean effect sizes that vary (random-effects models). The consequences of not applying such models and the comparison of different methods have been hotly debated. A Monte Carlo study compared the efficacy of Hedges and Vevea's random-effects methods of meta-analysis with Hunter and Schmidt's, over a wide range of conditions, as the variability in population correlations increases. (a) The Hunter-Schmidt method produced estimates of the average correlation with the least error, although estimates from both methods were very accurate; (b) confidence intervals from Hunter and Schmidt's method were always slightly too narrow but became more accurate than those from Hedges and Vevea's method as the number of studies included in the meta-analysis, the size of the true correlation, and the variability of correlations increased; and (c) the study weights did not explain the differences between the methods. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
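A generic random-effects sketch in the Hedges-Vevea spirit, using Fisher-z transformed correlations with a DerSimonian-Laird-type between-study variance as a stand-in (the Hedges-Vevea estimator differs in detail, so treat this as an approximation); the Hunter-Schmidt estimate, by contrast, is the sample-size-weighted mean of the raw correlations, as in the sketch after item 1. The study data below are invented.

```python
import numpy as np
from math import tanh
from scipy.stats import norm

def re_mean_correlation(r, n, alpha=0.05):
    """Random-effects mean correlation via Fisher z with a DL-type tau^2."""
    r, n = np.asarray(r, float), np.asarray(n, float)
    z, v = np.arctanh(r), 1.0 / (n - 3)
    w = 1.0 / v
    z_fixed = np.sum(w * z) / np.sum(w)
    Q = np.sum(w * (z - z_fixed) ** 2)
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (Q - (len(z) - 1)) / c)      # between-study variance estimate
    w_star = 1.0 / (v + tau2)                    # random-effects weights
    z_re = np.sum(w_star * z) / np.sum(w_star)
    half = norm.ppf(1 - alpha / 2) / np.sqrt(np.sum(w_star))
    return tanh(z_re), (tanh(z_re - half), tanh(z_re + half)), tau2

r = [0.22, 0.35, 0.28, 0.45, 0.10, 0.31]
n = [80, 150, 60, 200, 90, 120]
print(re_mean_correlation(r, n))
```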

18.
The fixed-effects (FE) meta-analytic confidence intervals for unstandardized and standardized mean differences are based on an unrealistic assumption of effect-size homogeneity and perform poorly when this assumption is violated. The random-effects (RE) meta-analytic confidence intervals are based on an unrealistic assumption that the selected studies represent a random sample from a large superpopulation of studies. The RE approach cannot be justified in typical meta-analysis applications in which studies are nonrandomly selected. New FE meta-analytic confidence intervals for unstandardized and standardized mean differences are proposed that are easy to compute and perform properly under effect-size heterogeneity and nonrandomly selected studies. The proposed meta-analytic confidence intervals may be used to combine unstandardized or standardized mean differences from studies having either independent samples or dependent samples and may also be used to integrate results from previous studies into a new study. An alternative approach to assessing effect-size heterogeneity is presented. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
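One simple interval of this flavor can be sketched for unstandardized mean differences: an unweighted average of the per-study differences whose variance is the sum of the per-study variances divided by m^2. This is a generic illustration under stated assumptions, not the paper's exact estimator, and the standardized-difference and dependent-samples cases need additional variance terms; the study summaries are invented.

```python
import numpy as np
from scipy.stats import norm

def mean_difference_ci(m1, s1, n1, m2, s2, n2, alpha=0.05):
    """CI for the simple average of unstandardized mean differences across m studies.

    Each study contributes d_i = m1_i - m2_i with variance s1_i^2/n1_i + s2_i^2/n2_i;
    the average of the d_i then has variance sum(var_i) / m^2.
    """
    m1, s1, n1 = map(np.asarray, (m1, s1, n1))
    m2, s2, n2 = map(np.asarray, (m2, s2, n2))
    d = m1 - m2
    var = s1 ** 2 / n1 + s2 ** 2 / n2
    m = len(d)
    avg, se = d.mean(), np.sqrt(var.sum()) / m
    half = norm.ppf(1 - alpha / 2) * se
    return avg, (avg - half, avg + half)

# Three hypothetical studies: treatment vs control means, SDs, and sample sizes.
print(mean_difference_ci([5.1, 4.2, 6.0], [2.0, 2.5, 1.8], [40, 55, 30],
                         [4.0, 3.9, 4.8], [2.1, 2.4, 2.0], [42, 50, 33]))
```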

19.
In survival analysis, estimates of median survival times in homogeneous samples are often based on the Kaplan-Meier estimator of the survivor function. Confidence intervals for quantiles, such as median survival, are typically constructed via large sample theory or the bootstrap. The former has suspect accuracy for small sample sizes under moderate censoring and the latter is computationally intensive. In this paper, improvements on so-called test-based intervals and reflected intervals (cf., Slud, Byar, and Green, 1984, Biometrics 40, 587-600) are sought. Using the Edgeworth expansion for the distribution of the studentized Nelson-Aalen estimator derived in Strawderman and Wells (1997, Journal of the American Statistical Association 92), we propose a method for producing more accurate confidence intervals for quantiles with randomly censored data. The intervals are very simple to compute, and numerical results using simulated data show that our new test-based interval outperforms commonly used methods for computing confidence intervals for small sample sizes and/or heavy censoring, especially with regard to maintaining specified coverage.
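The conventional baseline the paper improves on is easy to sketch: a hand-rolled Kaplan-Meier estimator, the median read off the curve, and a bootstrap percentile interval from resampling (time, event) pairs. The proposed test-based intervals with Edgeworth corrections are not reproduced here; the data are simulated and the function names are assumptions. The hand-rolled estimator assumes untied event times.

```python
import numpy as np

rng = np.random.default_rng(6)

def km_median(time, event):
    """Median survival from a hand-rolled Kaplan-Meier curve (event=1, censored=0)."""
    order = np.argsort(time)
    time, event = np.asarray(time)[order], np.asarray(event)[order]
    n_at_risk, surv = len(time), 1.0
    for t, e in zip(time, event):
        if e:
            surv *= 1 - 1 / n_at_risk
        n_at_risk -= 1
        if surv <= 0.5:
            return t           # first time the survivor function drops to 0.5 or below
    return np.inf              # median not reached

def bootstrap_median_ci(time, event, B=2000, alpha=0.05):
    """Percentile bootstrap interval for the median by resampling (time, event) pairs."""
    time, event, n = np.asarray(time), np.asarray(event), len(time)
    meds = [km_median(time[i], event[i]) for i in rng.integers(0, n, size=(B, n))]
    return tuple(np.percentile(meds, [100 * alpha / 2, 100 * (1 - alpha / 2)]))

# Simulated exponential survival times with random right censoring.
t_true = rng.exponential(10, 80)
c = rng.exponential(25, 80)
time, event = np.minimum(t_true, c), (t_true <= c).astype(int)
print(km_median(time, event), bootstrap_median_ci(time, event))
```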

20.
Correlational analysis is a cornerstone method of statistical analysis, yet most presentations of correlational techniques deal primarily with tests of significance. The focus of this article is obtaining explicit expressions for confidence intervals for functions of simple, partial, and multiple correlations. Not only do these permit tests of hypotheses about differences but they also allow a clear statement about the degree to which correlations differ. Several important differences of correlations for which tests and confidence intervals are not widely known are included among the procedures discussed. Among these is the comparison of 2 multiple correlations based on independent samples. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
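The common building block for these intervals is the Fisher z transformation. The sketch below gives intervals for a simple correlation and a partial correlation (each partialled-out variable costs one degree of freedom); the article's more involved comparisons, such as two independent multiple correlations, need additional machinery not shown, and the example values are invented.

```python
from math import atanh, tanh, sqrt
from scipy.stats import norm

def r_ci(r, n, alpha=0.05):
    """Fisher-z CI for a simple correlation."""
    z, half = atanh(r), norm.ppf(1 - alpha / 2) / sqrt(n - 3)
    return tanh(z - half), tanh(z + half)

def partial_r_ci(r_partial, n, n_partialled, alpha=0.05):
    """Fisher-z CI for a partial correlation; each partialled variable costs one df."""
    z = atanh(r_partial)
    half = norm.ppf(1 - alpha / 2) / sqrt(n - 3 - n_partialled)
    return tanh(z - half), tanh(z + half)

print(r_ci(0.45, 80))               # simple correlation, n = 80
print(partial_r_ci(0.45, 80, 2))    # same value, partialling out 2 covariates
```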
