Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
A latent-class model of rater agreement is presented for which one of the model parameters can be interpreted as the proportion of systematic agreement. The latent classes of the model emerge from the factorial combination of the "true" category in which a target belongs and the ease with which raters are able to classify targets into the true category. Several constrained cases of the model are described, and the relations to other well-known agreement models and kappa-type summary coefficients are explained. The differential quality of the rating categories can be assessed on the basis of the model fit. The model is illustrated using data from diagnoses of psychiatric disorders and classifications of individuals in a persuasive communication study. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

2.
[Correction Notice: An erratum for this article was reported in Vol 13(2) of Psychological Methods (see record 2008-06808-006). The DOI for the supplemental materials was printed incorrectly. The correct DOI is as follows: http://dx.doi.org/10.1037/1082-989X.12.4.451.supp] Genetically informative data can be used to address fundamental questions concerning the measurement of behavior in children. The authors illustrate this with longitudinal multiple-rater data on internalizing problems in twins. Valid information on the behavior of a child is obtained for behavior that multiple raters agree upon and for rater-specific perception of the child's behavior. Rater-disagreement variance σ²(rd) accounted for 35% of the individual differences in internalizing behavior. Up to 17% of this σ²(rd) was accounted for by rater-specific additive genetic variance σ²(Au). Thus, the disagreement should not be considered only to be bias/error but also as representing the unique feature of the relationships between that parent and the child. The longitudinal extension of this model helps to make a distinction between measurement error and the raters' unique perception of the child's behavior. For internalizing behavior, the results show large stability across time, which is accounted for by common additive genetic and common shared environmental factors. Rater-specific shared environmental factors show substantial influence on stability. This could mean that rater bias may be persistent and affect longitudinal studies. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

3.
We use levels-of-processing theory and social facilitation theory to explain the effect of training format and group size on distance and correlation accuracy, leniency-severity, halo, retention of training and pretraining information, and subject arousal. The training factor included frame-of-reference (FOR) training, information only (INFO) training, and no training (NOT). Group size was n = 1, n = 6, and n = 12, respectively. A total of 108 subjects, randomly assigned to one of nine Training × Group Size conditions, viewed and rated videotaped lectures. Results indicated that FOR training effected improved retention of training information, improved distance accuracy, and less halo over INFO training or NOT (p

4.
AIMS: To investigate, in healthy volunteers, the relationships between the plasma concentrations (C, ng ml⁻¹) of zabiciprilat, the active metabolite of the angiotensin I-converting enzyme inhibitor (ACEI) zabicipril, and the effects (E) induced on plasma converting enzyme activity (PCEA, nmol ml⁻¹ min⁻¹), brachial and femoral artery flows (BAF, FAF, ml min⁻¹), and brachial and femoral vascular resistances (BVR, FVR, mmHg × s ml⁻¹) after a single oral administration of two doses (0.5 and 2.5 mg) of zabicipril. METHODS: The study was placebo-controlled, randomized, double-blind and crossover. E was related to C by the Hill model, E = Emax × C^γ / (CE50^γ + C^γ), fitted to the data of both doses simultaneously. RESULTS: We obtained (mean ± s.d.) Emax = -99 ± 1%, CE50 = 2.2 ± 1.0 ng ml⁻¹ and γ = 1.0 ± 0.4 for PCEA, Emax = 55 ± 26 ml min⁻¹, CE50 = 5.1 ± 4.0 ng ml⁻¹ and γ = 2.4 ± 1.6 for BAF, and Emax = -45 ± 10%, CE50 = 2.0 ± 1.3 ng ml⁻¹ and γ = 2.3 ± 1.4 for BVR. The parameters obtained for FAF and FVR were similar to those obtained for BAF and BVR, respectively. The CE95 (C required to induce 95% of Emax) varies from 7 to 17 ng ml⁻¹ for haemodynamic effects. CONCLUSIONS: As zabiciprilat peak plasma concentrations average 20 ng ml⁻¹ after the 2.5 mg dose of zabicipril, this dose of the drug should be sufficient to induce optimal haemodynamic effects.
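
The Hill model in the abstract above can be explored numerically. The following is a minimal sketch, not the authors' fitting procedure: the function names are hypothetical, and the parameters are the reported group means. The CE95 expression is obtained by solving the Hill equation for the concentration at which E equals 95% of Emax.

```python
def hill_effect(c, emax, ce50, gamma):
    """Effect predicted by the Hill model at plasma concentration c."""
    return emax * c**gamma / (ce50**gamma + c**gamma)


def concentration_for_fraction(fraction, ce50, gamma):
    """Concentration at which the effect reaches `fraction` of Emax.

    Solving fraction = C^gamma / (CE50^gamma + C^gamma) for C gives
    C = CE50 * (fraction / (1 - fraction))^(1 / gamma).
    """
    return ce50 * (fraction / (1.0 - fraction)) ** (1.0 / gamma)


# Mean parameters as reported in the abstract (assumed here to be usable
# directly; the authors may have computed CE95 per subject instead).
params = {
    "PCEA": {"emax": -99.0, "ce50": 2.2, "gamma": 1.0},
    "BAF": {"emax": 55.0, "ce50": 5.1, "gamma": 2.4},
    "BVR": {"emax": -45.0, "ce50": 2.0, "gamma": 2.3},
}

ce95 = {
    name: concentration_for_fraction(0.95, p["ce50"], p["gamma"])
    for name, p in params.items()
}

for name, value in ce95.items():
    print(f"{name}: CE95 = {value:.1f} ng/ml")
```

With the mean parameters, this gives a CE95 of roughly 17 ng ml⁻¹ for BAF and roughly 7 ng ml⁻¹ for BVR, in line with the 7 to 17 ng ml⁻¹ range quoted for haemodynamic effects.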

5.
We give a simple technique that allows dynamic-programming-type algorithms for the Maximum Agreement Subtree problem (MAST) for rooted trees to be transformed into algorithms for the Maximum Agreement Subtree problem for unrooted trees (UMAST). Using this technique, we obtain an O(n log n)-time algorithm for the UMAST problem for binary trees. This matches the complexity of the best known algorithm for the rooted case.

6.
Notes that ratings for performance appraisal are frequently made by supervisors. In the present study, judgments of effectiveness for 153 hospital nurses were obtained from the nurse herself and her peers in addition to her supervisor, using the same rating form. Factor analysis indicated that each rating source could be clearly identified and characterized. The data reaffirm the notion that interrater disagreement may reflect systematic rater bias as well as meaningful differences in the ways in which judgments are made. Implications for comprehensive appraisals are suggested. (29 ref) (PsycINFO Database Record (c) 2010 APA, all rights reserved)

7.
Research in developmental and educational psychology has come to rely less on conventional psychometric tests and more on records of behavior made by human observers in natural and quasi-natural settings. Three coefficients that purport to reflect the quality of data collected in these observational studies are discussed: the interobserver agreement percentage, the reliability coefficient, and the generalizability coefficient. Three-facet generalizability studies that parallel intraobserver–interobserver, split-half, and test–retest reliability studies are described as examples. It is concluded that although high interobserver agreement is desirable in observational studies, high agreement alone is not sufficient to ensure the quality of the data that are collected. Evidence of the reliability or generalizability of the data should also be reported. Other uses for generalizability theory (e.g., attribution of variance, single-S studies) are suggested, and further advantages of generalizability designs are discussed. (27 ref) (PsycINFO Database Record (c) 2010 APA, all rights reserved)

8.
Reliability coefficients often take the form of intraclass correlation coefficients. In this article, guidelines are given for choosing among 6 different forms of the intraclass correlation for reliability studies in which n targets are rated by k judges. Relevant to the choice of the coefficient are the appropriate statistical model for the reliability study and the applications to be made of the reliability results. Confidence intervals for each of the forms are reviewed. (23 ref) (PsycINFO Database Record (c) 2010 APA, all rights reserved)
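
To make the choice among intraclass correlation forms concrete, the sketch below computes three single-rating forms from the ANOVA mean squares of an n targets × k judges table. The abstract gives no formulas, so the function name and the labels ICC(1,1), ICC(2,1), and ICC(3,1) are assumptions that follow the commonly used notation; the three average-rating forms are omitted, and the ratings are invented for illustration.

```python
def icc_single(ratings):
    """Single-rating intraclass correlations for n targets rated by k judges.

    `ratings` is a list of n rows (targets), each holding k scores (judges).
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(map(sum, ratings)) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for the two-way layout
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_targets = k * sum((m - grand) ** 2 for m in row_means)
    ss_judges = n * sum((m - grand) ** 2 for m in col_means)

    bms = ss_targets / (n - 1)                      # between-targets mean square
    wms = (ss_total - ss_targets) / (n * (k - 1))   # within-targets mean square
    jms = ss_judges / (k - 1)                       # between-judges mean square
    ems = (ss_total - ss_targets - ss_judges) / ((n - 1) * (k - 1))  # residual

    return {
        # One-way random effects: judges differ from target to target
        "ICC(1,1)": (bms - wms) / (bms + (k - 1) * wms),
        # Two-way random effects: judge differences count as disagreement
        "ICC(2,1)": (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n),
        # Two-way mixed effects: judge differences are removed
        "ICC(3,1)": (bms - ems) / (bms + (k - 1) * ems),
    }


# Invented example: the second judge scores every target exactly 2 points higher.
ratings = [[1, 3], [2, 4], [3, 5], [4, 6]]
result = icc_single(ratings)
print(result)
```

In this invented example the judges order and space the targets identically, so ICC(3,1) equals 1, while ICC(2,1) is about 0.45 and ICC(1,1) is 0.25 because those forms treat the constant offset between judges as disagreement. This illustrates why the statistical model matters when choosing a coefficient.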

9.
Few studies in counseling and psychotherapy have investigated rater bias. The purpose of this study was to outline a method for studying rater bias. We studied three potential sources of rater bias: (a) characteristics of the rater, client, and therapist; (b) the similarity of characteristics between rater and therapist or client; and (c) perceived similarity between rater and therapist or client. We used a new rater-bias measure. The data for the study were ratings on the Collaborative Study Psychotherapy Rating Scale for 826 sessions of psychotherapy in the Treatment of Depression Collaborative Research Program. High interrater reliability was found for all scales of the measure. We found evidence of rater bias only on the facilitative conditions scale. Rater bias was not found for the other scales, perhaps because of the extensive development of the measure, careful selection of the raters, lengthy rater training, and continued contact with raters throughout the rating period. The rater-bias measure may be useful to other researchers as a means of testing the reactivity of their measures to rater bias. Finally, the method for investigating rater bias can be used by other researchers to evaluate rater bias. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

10.
Frame-of-reference training has been shown to be an effective intervention for improving the accuracy of performance ratings (e.g., Woehr & Huffcutt, 1994). Despite evidence in support of the effectiveness of frame-of-reference training, few studies have empirically addressed the ultimate goal of such training, which is to teach raters to share a common conceptualization of performance (Athey & McIntyre, 1987; Woehr, 1994). The present study tested the hypothesis that, following training, frame-of-reference–trained raters would possess schemas of performance that are more similar to a referent schema, as compared with control-trained raters. Schema accuracy was also hypothesized to be positively related to rating accuracy. Results supported these hypotheses. Implications for frame-of-reference training research and practice are discussed. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

11.
Examined the influence of counselor statements on rater judgements of client self-exploration. Audiotaped segments of counseling interviews that included both counselor and client statements and identical audiotaped segments, but with the counselors' statements deleted, were rated on client self-exploration by separate groups of raters (totaling 20 counseling graduate students). A significantly positive correlation was found between the 2 sets of ratings. With 1 exception, no significant differences were found for each segment. Finally, no differences were found between ratings for segments, unedited and edited, in which counselors were functioning at high levels of accurate empathy and ratings in which counselors were functioning at low levels of accurate empathy. (34 ref) (PsycINFO Database Record (c) 2010 APA, all rights reserved)

12.
Diagnostic agreement tests the reliability and concordance of diagnostic systems. The introduction of measures of agreement with reputations for baserate independence (e.g., Yule's Y and Q), and new studies occasioned by the publication of the Diagnostic and Statistical Manual of Mental Disorders (4th ed.; DSM-IV; American Psychiatric Association, 1994) and the International Classification of Diseases-10 (ICD-10, World Health Organization, 1992) make it necessary to study the relationship of illness baserates to measures of agreement. Testing diagnostic concordance for diagnoses of drug dependence from the third edition of the DSM (American Psychiatric Association, 1980) versus DSM-IV diagnoses of drug dependence under 3 baserate conditions, it was found that Yule's Y and Q proved as vulnerable to differences in baserates as kappa or percent agreement and that specificity covaried with baserate rather than being fixed, as most theoretical discussions assume. The uncritical use of Y and Q, therefore, is likely to lead to optimistic interpretations of agreement. Kappa should be preferred for most purposes, although an adjustment to the computational formulas for Y and Q is presented that can diminish their positive bias. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
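
The agreement measures compared in the abstract above can all be computed from a 2×2 cross-classification of two diagnostic systems. The sketch below uses the standard textbook definitions of percent agreement, Cohen's kappa, Yule's Q, and Yule's Y; the function name and cell counts are hypothetical, and the adjusted formulas for Y and Q mentioned in the abstract are not reproduced because the abstract does not state them.

```python
from math import sqrt


def agreement_measures(a, b, c, d):
    """Agreement statistics for a 2x2 table of two diagnostic systems.

    a = both positive, b = only system 1 positive,
    c = only system 2 positive, d = both negative.
    """
    n = a + b + c + d
    observed = (a + d) / n
    # Agreement expected by chance from the marginal totals
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2
    return {
        "percent_agreement": observed,
        "kappa": (observed - expected) / (1 - expected),
        "yule_q": (a * d - b * c) / (a * d + b * c),
        "yule_y": (sqrt(a * d) - sqrt(b * c)) / (sqrt(a * d) + sqrt(b * c)),
    }


# Invented counts with a 50% baserate
stats = agreement_measures(a=40, b=10, c=10, d=40)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

Re-running the function on tables with the same error rates but different baserates is one simple way to see how each coefficient responds to baserate, which is the question the study addresses.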

13.
Assessing interobserver agreement calls for measuring the degree to which numbers generated by one observer match those generated by another observer. However, for all scales of measurement save one, the absolute scale, using interobserver agreement as a measure of interobserver consistency is too strict because the observers might disagree only on empirically meaningless relationships. Two observers who are rating behaviors on an ordinal scale need only to generate orders that agree, not ratings that agree. This concept is formalized into a notion of relational agreement. Observers need to agree only on empirically meaningful (in a measurement theoretical sense) relationships. Those relationships that are empirically meaningful change as a function of the scale of measurement in use. A class of measures for measuring relational agreement (based on F. E. Zegers and J. M. F. ten Berge [see PA, Vol 72:24356]) is presented. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

14.
15.
Compared the psychometric properties of ratings on behavioral expectation scales (BES) across 4 groups totalling 156 undergraduate raters. Groups differed with respect to amount of prior training (1 hr or more), the nature of psychometric errors, and the extent of exposure to scales (read scales and recorded observed critical incidents, discussed general scale dimensions, or no exposure to scales). Three Ss from each group rated 1 of 13 instructors during the last week of a 10-wk term. Significantly less leniency error and halo effect, plus higher interrater reliability, were found for the group that had received the hour of training and full exposure to the BES. Ss who had received only training had significantly less halo error than those who had received no training. The need for rater training prior to observation and the use of BES as a context for observation are discussed. (20 ref) (PsycINFO Database Record (c) 2010 APA, all rights reserved)

16.
An interesting feature of the kindling phenomenon relates to the finding that kindling established in one region of the brain may reduce the number of stimulations required to establish the phenomenon in a second region. It has been proposed that this 'transfer' phenomenon reflects the extent to which seizures arising in two distinct regions share common underlying mechanisms. The perirhinal cortex (PRC) is currently receiving considerable attention with regard to its possible role in epileptogenesis. Although the role of this region in limbic seizures is unclear, the existence of reciprocal connections between the PRC and amygdala provides a possible neural substrate through which these two regions may influence one another. On the basis of this connectivity, one might expect a transfer between PRC kindling and amygdaloid kindling. Using kindling transfer, the present study was formulated to determine the nature of the relationship between electrical kindling of the PRC and amygdala. Animals previously kindled from the PRC to a cortico-generalised level displayed significantly more advanced behavioural seizures during the early stages of amygdaloid kindling than either controls or those partially kindled. This suggests that primary PRC kindling may facilitate amygdaloid access to systems responsible for the generation of motor seizures. Thus, in terms of kindling, the PRC and amygdala appear to be functionally related, with generalised seizures elicited from the PRC and amygdala sharing, at some level, common underlying mechanisms. Finally, the finding that seizures kindled from the dorsal component of the PRC tended to exhibit characteristics which were quite distinct from those elicited by ventral PRC kindling suggests that these two subregions may have different kindling characteristics and/or different patterns of connectivity with the amygdaloid complex.

17.
18.
19.
Investigated the effects of frame-of-reference (FOR) training on various indexes of distance and correlational accuracy under alternative time delays. 150 Ss were assigned randomly to either FOR- or control- (i.e., minimal) training conditions, with 1 of 3 time delays: (1) no delay between training, observation, and rating; (2) ratings performed 2 days following training and ratee observations; or (3) ratee observations and ratings completed 2 days following training. Hypotheses were proposed predicting specific relationships between accuracy, recall memory, and learning, depending on the delay period. Overall, results support the categorization perspective on FOR-training effectiveness; however, different results were obtained depending on the type of accuracy index and time delay. The implications of these findings are discussed in terms of how they relate to the conceptual distinction between distance and correlational accuracy and to the role of on-line, memory-based, and inference-memory-based processing in the ratings of FOR trained raters. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

20.
108 undergraduates were randomly assigned to 1 of 4 experimental groups to rate videotaped performances of several managers talking with a problem subordinate. The research employed a single-factor experimental design in which rater error training (RET), rater accuracy training (RAT), rating error and accuracy training (RET/RAT), and no training were compared for 2 rating errors (halo and leniency) and accuracy of performance evaluations. Differences in program effectiveness for various performance dimensions were also assessed. Results show that RAT yielded the most accurate ratings and no-training the least accurate ratings. The presence of error training (RET or RET/RAT) was associated with reduced halo, but the presence of accuracy training (RAT or RET/RAT) was associated with less leniency. Dimensions × Training interactions revealed that training was not uniformly effective across the rating dimensions. (23 ref) (PsycINFO Database Record (c) 2010 APA, all rights reserved)
