A technique alternative to the conventional ratings of engineers by their supervisors was studied. A 20-triad forced-choice rating scale was constructed. 33 engineers were rated by their supervisors using this device. The reliability of these ratings was .90. An item analysis showed 19 of the 20 triads to have strong discriminating power between high and low scorers. The same Ss were also rated in 8 different areas on a 4-point scale. The reliability of the 2nd rating scale was .87. The 2 scales correlated .73 with each other. These findings support previous research concerned with the more general applicability of the forced-choice technique for the determination of criterion scores. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

Standard least squares analysis of variance methods suffer from poor power under arbitrarily small departures from normality and fail to control the probability of a Type I error when standard assumptions are violated. This article describes a framework for robust estimation and testing that uses trimmed means with an approximate degrees of freedom heteroscedastic statistic for independent and correlated groups designs in order to achieve robustness to the biasing effects of nonnormality and variance heterogeneity. The authors describe a nonparametric bootstrap methodology that can provide improved Type I error control. In addition, the authors indicate how researchers can set robust confidence intervals around a robust effect size parameter estimate. In an online supplement, the authors use several examples to illustrate the application of an SAS program to implement these statistical methods. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
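
As a rough illustration of the kind of statistic the abstract above refers to, the following sketch computes a two-group trimmed-mean comparison with a Welch-type approximate degrees of freedom (commonly known as Yuen's statistic). It is not the authors' SAS program. The function names, the 20% trimming proportion, and the restriction to two independent groups are assumptions made for this example, and the bootstrap step mentioned in the abstract is not included.

```python
import math
from statistics import mean, variance


def trimmed_mean(x, gamma=0.2):
    """Mean after dropping floor(gamma * n) values from each tail."""
    xs = sorted(x)
    g = math.floor(gamma * len(xs))
    return mean(xs[g:len(xs) - g])


def winsorized_variance(x, gamma=0.2):
    """Sample variance after pulling each tail in to the nearest kept value."""
    xs = sorted(x)
    n = len(xs)
    g = math.floor(gamma * n)
    low, high = xs[g], xs[n - g - 1]
    winsorized = [min(max(v, low), high) for v in xs]
    return variance(winsorized)


def yuen_statistic(x, y, gamma=0.2):
    """Return (t, df) for two independent groups without assuming equal variances."""
    def squared_se_and_h(sample):
        n = len(sample)
        h = n - 2 * math.floor(gamma * n)  # observations left after trimming
        d = (n - 1) * winsorized_variance(sample, gamma) / (h * (h - 1))
        return d, h

    d1, h1 = squared_se_and_h(x)
    d2, h2 = squared_se_and_h(y)
    t = (trimmed_mean(x, gamma) - trimmed_mean(y, gamma)) / math.sqrt(d1 + d2)
    df = (d1 + d2) ** 2 / (d1 ** 2 / (h1 - 1) + d2 ** 2 / (h2 - 1))
    return t, df
```

The statistic is referred to a t distribution with `df` degrees of freedom. Obtaining a p value from it requires a t-distribution routine, which is left out here to keep the sketch dependency-free.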

7 SVIB scales were developed and cross validated on 461 managers from 13 varied Minnesota companies. Questions studied were (a) Which item weighting method results in the highest scale validity? (b) Are shorter scales as valid as longer scales? (c) How much may scales be shortened? (d) Why may they be shortened? Controls for scale length, content, validity, and for item weighting method were introduced. Results indicated (a) there was no practical difference in validities between simple unit versus variably weighted scales, (b) shorter scales were as valid as longer scales, (c) Clark's "40 to 60 item optimum scale length" hypothesis was supported, (d) although not conclusive, shorter scales appeared superior partly because their average item validities were greater and thus they perhaps should not be used where developmental item pools are rich in valid items. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

127 dream reports of 24 Ss were assessed on 20 psychological characteristics. Nearly all characteristics were assessed by 2 raters, and some by S as well, creating a total of 43 variables. These were subjected to principal component analysis and analytic orthogonal rotation. About 63% of the total variance is accounted for by 8 dimensions: vivid fantasy, active control, pleasantness, verbal aggression, physical aggression, heterosexuality, perception (vs. conception), and reference to past experience. In a resulting condensed scale, each dimension is indexed by a single characteristic. These 8 characteristics are essentially uncorrelated. The last 2 are assessed by S alone; rater agreement in assessing the 1st 6 is .63, .71, .62, .74, .44, and .66. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

One of the main objectives in meta-analysis is to estimate the overall effect size by calculating a confidence interval (CI). The usual procedure consists of assuming a standard normal distribution and a sampling variance defined as the inverse of the sum of the estimated weights of the effect sizes. But this procedure does not take into account the uncertainty due to the fact that the heterogeneity variance (τ²) and the within-study variances have to be estimated, leading to CIs that are too narrow with the consequence that the actual coverage probability is smaller than the nominal confidence level. In this article, the performances of 3 alternatives to the standard CI procedure are examined under a random-effects model and 8 different τ² estimators to estimate the weights: the t distribution CI, the weighted variance CI (with an improved variance), and the quantile approximation method (recently proposed). The results of a Monte Carlo simulation showed that the weighted variance CI outperformed the other methods regardless of the τ² estimator, the value of τ², the number of studies, and the sample size. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
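
To make the comparison in the abstract above concrete, here is a minimal sketch of three of the intervals it names: the standard normal CI, the t distribution CI, and a weighted variance CI. Several things are assumptions of this example rather than details taken from the article: τ² is estimated with the DerSimonian-Laird moment estimator (only one of many possible estimators), the weighted variance uses the usual form Σw(y − μ̂)² / ((k − 1)Σw), the function names are hypothetical, and the t critical value is supplied by the caller so the code needs only the standard library. The quantile approximation method is not shown.

```python
import math
from statistics import NormalDist


def dl_tau2(effects, variances):
    """DerSimonian-Laird moment estimate of the between-study variance."""
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed_mean = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed_mean) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    return max(0.0, (q - (len(effects) - 1)) / c)


def random_effects_cis(effects, variances, t_crit, level=0.95):
    """Return the pooled estimate, tau2, and three confidence intervals.

    t_crit is the two-sided critical value of t with k - 1 degrees of freedom
    at the chosen level (e.g. 2.776 for k = 5 studies and a 95% level).
    """
    k = len(effects)
    tau2 = dl_tau2(effects, variances)
    w = [1.0 / (v + tau2) for v in variances]
    sw = sum(w)
    mu = sum(wi * yi for wi, yi in zip(w, effects)) / sw

    z_crit = NormalDist().inv_cdf(0.5 + level / 2.0)
    se_standard = math.sqrt(1.0 / sw)  # inverse of the summed weights
    se_weighted = math.sqrt(
        sum(wi * (yi - mu) ** 2 for wi, yi in zip(w, effects)) / ((k - 1) * sw)
    )
    return {
        "mu": mu,
        "tau2": tau2,
        "normal": (mu - z_crit * se_standard, mu + z_crit * se_standard),
        "t": (mu - t_crit * se_standard, mu + t_crit * se_standard),
        "weighted": (mu - t_crit * se_weighted, mu + t_crit * se_weighted),
    }
```

With few studies the t-based intervals are noticeably wider than the normal one, which is the direction of correction the abstract describes.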

"The RBD III, a forced-choice rating form to provide scores indicative of a person's productive research behavior in physical science research settings, was administered in a setting other than the one in which it was developed." 50 Ss were selected at random from 168 research engineers. Supervisory judgments of a person's creative activity indicated its validity. The "RBD III can be used to provide criterion scores for research productivity in other physical science research settings." (PsycINFO Database Record (c) 2010 APA, all rights reserved)

A frequently used experimental design in psychological research randomly divides a set of available cases, a local population, between 2 treatments and then applies an independent-samples t test to either test a hypothesis about or estimate a confidence interval (CI) for the population mean difference in treatment response. C. S. Reichardt and H. F. Gollob (1999) established that the t test can be conservative for this design--yielding hypothesis test P values that are too large or CIs that are too wide for the relevant local population. This article develops a less conservative approach to local population inference, one based on the logic of B. Efron's (1979) nonparametric bootstrap. The resulting randomization bootstrap is then compared with an established approach to local population inference, that based on randomization or permutation tests. Finally, the importance of local population inference is established by reference to the distinction between statistical and scientific inference. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
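
The established comparison method mentioned in the abstract above, the randomization (permutation) test, can be sketched briefly. This is not the article's randomization bootstrap, whose details are not given here. The function name, the use of the mean difference as the test statistic, and the Monte Carlo approximation with a fixed seed are all assumptions of this example.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_resamples=5000, seed=0):
    """Two-sided Monte Carlo permutation p value for a difference in means.

    Cases are repeatedly re-divided at random between the two treatments,
    mirroring the random assignment used in the design.
    """
    rng = random.Random(seed)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))

    extreme = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
    # Adding 1 to numerator and denominator counts the observed split itself
    # and keeps the estimated p value strictly above zero.
    return (extreme + 1) / (n_resamples + 1)
```

For example, `permutation_test([1, 2, 3, 4, 5], [11, 12, 13, 14, 15])` yields a small p value, because very few random re-divisions of the ten cases produce a mean difference as large as the observed one.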

An attempt was made to evaluate the predictive validity for improvement of the Rorschach Prognostic Rating Scale (RPRS) and the MMPI. A group of untreated psychiatric outpatients (N = 40) and a group of outpatients treated with psychotherapy (N = 21) were given the Rorschach and MMPI before their assignment to the treatment or waiting list groups. After a waiting or treatment period of approximately 6 mo., each patient was clinically rated as improved or unimproved. The RPRS was significantly (p …

Attention deficits are nearly ubiquitous after traumatic brain injury (TBI). In the subacute phase of moderate to severe TBI, these deficits may be difficult to measure with the precision needed to predict outcomes, assess degree of recovery, and monitor treatment response. This article reports the findings of four studies, three observational and one a randomized, controlled treatment trial of methylphenidate (MP), designed to provide construct validation of the Moss Attention Rating Scale (MARS), an observational measure of attention dysfunction following TBI. One hundred seven participants with moderate to severe TBI were enrolled during treatment on an inpatient rehabilitation unit. MARS scores were provided independently by four rehabilitation disciplines (Physical, Occupational and Speech Therapies and Nursing). Results indicated that the MARS: (1) is more strongly related to concurrent measures of cognitive versus physical disability, supporting its validity as a measure of cognition; (2) is more strongly related to concurrent psychometric measures of attention versus measures thought to rely less on attention, supporting its validity as a measure of attention; and (3) predicts 1-year outcomes of TBI better than psychometric measures of attention. However, the MARS (4) was not differentially affected by MP versus placebo treatment. Results support the construct validity and utility of the MARS, with further research needed to clarify its role in treatment outcome assessment. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

The widespread employment of the Beck Depression Inventory-1A (BDI-1A) has spawned a number of practices: (1) The employment of an unweighted total score as a measure of depression; (2) Its use in populations other than that in which it was normed; and (3) The employment of BDI-1A total scores in hypothesis tests about population differences in mean depression. A sequential procedure based on item response theory was employed to assess the validity of these practices for the case of 4 populations: clinical depressives (n = 210), mixed nondepressed psychiatric patients (n = 98), and students from 2 different universities (n = 624). The findings suggest that the 1st practice was not justified for any of these populations, that the BDI-1A was employable only with clinical depressives and with 1 of the university populations, and that mean comparisons were not allowable. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

Methods for planning sample size (SS) for the standardized mean difference so that a narrow confidence interval (CI) can be obtained via the accuracy in parameter estimation (AIPE) approach are developed. One method plans SS so that the expected width of the CI is sufficiently narrow. A modification adjusts the SS so that the obtained CI is no wider than desired with some specified degree of certainty (e.g., 99% certain the 95% CI will be no wider than ω). The rationale of the AIPE approach to SS planning is given, as is a discussion of the analytic approach to CI formation for the population standardized mean difference. Tables with values of necessary SS are provided. The freely available Methods for the Behavioral, Educational, and Social Sciences (K. Kelley, 2006a) R (R Development Core Team, 2006) software package easily implements the methods discussed. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

Null hypothesis statistical testing (NHST) has been debated extensively but always successfully defended. The technical merits of NHST are not disputed in this article. The widespread misuse of NHST has created a human factors problem that this article intends to ameliorate. This article describes an integrated, alternative inferential confidence interval approach to testing for statistical difference, equivalence, and indeterminacy that is algebraically equivalent to standard NHST procedures and therefore exacts the same evidential standard. The combined numeric and graphic tests of statistical difference, equivalence, and indeterminacy are designed to avoid common interpretive problems associated with NHST procedures. Multiple comparisons, power, sample size, test reliability, effect size, and cause-effect ratio are discussed. A section on the proper interpretation of confidence intervals is followed by a decision rule summary and caveats. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

When the distribution of the response variable is skewed, the population median may be a more meaningful measure of centrality than the population mean, and when the population distribution of the response variable has heavy tails, the sample median may be a more efficient estimator of centrality than the sample mean. The authors propose a confidence interval for a general linear function of population medians. Linear functions have many important special cases including pairwise comparisons, main effects, interaction effects, simple main effects, curvature, and slope. The confidence interval can be used to test 2-sided directional hypotheses and finite interval hypotheses. Sample size formulas are given for both interval estimation and hypothesis testing problems. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

A brief, face-valid personality test was devised to be used as a limited range screening mode to locate general problem areas and the degree to which an individual is operating under "stress" in his daily life. The test uses a novel double-question method in which the 2nd question elaborates the answer to the 1st and is contingent upon it. This format is intended to encourage candidness on the part of the respondent and to enhance the power of the individual item. Criterion analyses and reliability indicate that the test and/or item form may prove useful in personality assessment. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

Correlations between Dempsey's revised depression scale for the MMPI, D30, and the SD scale were obtained for 3 different samples. The correlations were −.80, −.82, and −.83. For the same 3 samples, the correlations between the original D scale and the SD scale were −.57, −.63, and −.63. The increase in the D30 correlations with the SD scale is interpreted as resulting from the fact that all of the items in the D30 scale are keyed for socially undesirable responses. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

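The sources of uncertainty in determining nitrogen in steel by the inert gas fusion thermal conductivity method were analyzed in detail, and the main uncertainty components of the measurement process (sample weighing, reference material, measurement repeatability, instrument resolution, instrument calibration, etc.) were evaluated. The expanded uncertainty of the measurement result was then obtained by multiplying the combined standard uncertainty by a coverage factor of 2, corresponding to a 95% confidence level. The evaluation showed that the reference material and measurement repeatability contribute most to the combined standard uncertainty, so particular care should be taken to select suitable reference samples for calibrating the curve and to compute the result from repeated determinations.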
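
The arithmetic behind this kind of uncertainty budget can be sketched in a few lines. The sketch assumes the components are uncorrelated and already expressed as relative standard uncertainties, so that they combine as a root sum of squares. The function names and every numeric value are hypothetical placeholders, not figures from the article.

```python
import math


def combined_standard_uncertainty(components):
    """Root sum of squares of uncorrelated standard uncertainty components."""
    return math.sqrt(sum(u ** 2 for u in components.values()))


def expanded_uncertainty(components, k=2.0):
    """Combined standard uncertainty times the coverage factor k (k = 2 for ~95%)."""
    return k * combined_standard_uncertainty(components)


# Hypothetical relative standard uncertainties, for illustration only.
budget = {
    "sample_weighing": 0.001,
    "reference_material": 0.012,
    "repeatability": 0.010,
    "instrument_resolution": 0.002,
    "instrument_calibration": 0.004,
}
u_c = combined_standard_uncertainty(budget)
U = expanded_uncertainty(budget, k=2.0)
```

Because the components add in quadrature, the two largest entries dominate the total, which is consistent with the abstract's finding that the reference material and repeatability matter most.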

Objective: In 2005, the Journal of Consulting and Clinical Psychology (JCCP) became the first American Psychological Association (APA) journal to require statistical measures of clinical significance, plus effect sizes (ESs) and associated confidence intervals (CIs), for primary outcomes (La Greca, 2005). As this represents the single largest editorial effort to improve statistical reporting practices in any APA journal in at least a decade, in this article we investigate the efficacy of that change. Method: All intervention studies published in JCCP in 2003, 2004, 2007, and 2008 were reviewed. Each article was coded for method of clinical significance, type of ES, and type of associated CI, broken down by statistical test (F, t, chi-square, r/R², and multivariate modeling). Results: By 2008, clinical significance compliance was 75% (up from 31%), with 94% of studies reporting some measure of ES (reporting improved for individual statistical tests ranging from η² = .05 to .17, with reasonable CIs). Reporting of CIs for ESs also improved, although only to 40%. Also, the vast majority of reported CIs used approximations, which become progressively less accurate for smaller sample sizes and larger ESs (cf. Algina & Keselman, 2003). Conclusions: Changes are near asymptote for ESs and clinical significance, but CIs lag behind. As CIs for ESs are required for primary outcomes, we show how to compute CIs for the vast majority of ESs reported in JCCP, with an example of how to use CIs for ESs as a method to assess clinical significance. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

The problem of valid measurement of psychological constructs remains an impediment to scientific progress, and the measurement of executive functions is not an exception. This study examined the statistical and theoretical derivation of a behavioral screener for the estimation of executive functions in children from the well-established Behavior Assessment System for Children (BASC). The original national standardization sample of the BASC–Teacher Rating Scales for children ages 6 through 11 was used (N = 2,165). Moderate-to-high internal consistency was obtained within each factor (.80–.89). A panel of experts was used for content validity examination. A confirmatory factor analysis model with 25 items loading on 4 latent factors (behavioral control, emotional control, attentional control, and problem solving) was developed, and its statistical properties were examined. The multidimensional model demonstrated adequate fit, and it was deemed invariant after configural, metric, and scalar measurement invariance tests across sex and age. Given its strong psychometric properties, with further tests of item validity, this instrument promises future clinical and research utility for the screening of executive functions in school-age children. (PsycINFO Database Record (c) 2011 APA, all rights reserved)

This study describes a double-press method for experimentally controlling item length and reading speed when measuring response latency to computer-administered personality items. Previous research has required several statistical transformations to control for item length and reading speed. Five approaches validated the new, double-press method. First, valid profiles showing reasonable read time and psychological response time resulted in few outliers. Second, read and psychological response times were internally consistent. Third, valid separation of read time from total response time was demonstrated by a positive relationship between read time and item length. Fourth, negatively stated items took longer to understand than positively stated items. Fifth, in accordance with schema research, items that were highly similar or dissimilar to the self-schema were answered more quickly than other items, resulting in an inverted-U effect. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

This study assessed the relative accuracy of 3 techniques--local validity studies, meta-analysis, and Bayesian analysis--for estimating test validity, incremental validity, and adverse impact in the local selection context. Bayes-analysis involves combining a local study with nonlocal (meta-analytic) validity data. Using tests of cognitive ability and personality (conscientiousness) as predictors, an empirically driven selection scenario illustrates conditions in which each of the 3 estimation techniques performs best. General recommendations are offered for how to estimate local parameters, based on true population variability and the number of studies in the meta-analytic prior. Benefits of empirical Bayesian analysis for personnel selection are demonstrated, and equations are derived to help guide the choice of a local validity technique (i.e., meta-analysis vs. local study vs. Bayes-analysis). (PsycINFO Database Record (c) 2010 APA, all rights reserved)
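
One common way to combine a local study with a meta-analytic prior is a normal-normal update, in which the local estimate and the prior mean are averaged with weights proportional to their precisions. The sketch below assumes that simple model. It is not a reproduction of the equations derived in the article, and the function name, parameter names, and example numbers are hypothetical.

```python
def bayes_validity(local_r, local_var, prior_mean, prior_var):
    """Precision-weighted combination of a local validity estimate and a prior.

    local_var is the sampling variance of the local estimate. prior_var is the
    variance of the meta-analytic prior, i.e. how much true validities are
    assumed to vary across settings.
    """
    w_local = 1.0 / local_var
    w_prior = 1.0 / prior_var
    post_var = 1.0 / (w_local + w_prior)
    post_mean = post_var * (w_local * local_r + w_prior * prior_mean)
    return post_mean, post_var


# Hypothetical inputs: a noisy local study and a tighter meta-analytic prior.
post_mean, post_var = bayes_validity(
    local_r=0.10, local_var=0.02, prior_mean=0.30, prior_var=0.005
)
```

Under this model the posterior leans toward whichever source is more precise: a small local sample is pulled strongly toward the meta-analytic mean, while a large local study, or a prior with high true variability, leaves the local estimate nearly unchanged.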
