Similar Articles (20 results)
1.
Rater bias is a substantial source of error in psychological research. Bias distorts observed effect sizes beyond the expected level of attenuation due to intrarater error, and the impact of bias is not accurately estimated using conventional methods of correction for attenuation. Using a model based on multivariate generalizability theory, this article illustrates how bias affects research results. The model identifies 4 types of bias that may affect findings in research using observer ratings, including the biases traditionally termed leniency and halo errors. The impact of bias depends on which of 4 classes of rating design is used, and formulas are derived for correcting observed effect sizes for attenuation (due to bias variance) and inflation (due to bias covariance) in each of these classes. The rater bias model suggests procedures for researchers seeking to minimize adverse impact of bias on study findings. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
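For context, the conventional adjustment the abstract says is inadequate under rater bias is Spearman's correction for attenuation, which rescales an observed correlation by the reliabilities of the two measures:

\hat{\rho}_{XY} = \frac{r_{xy}}{\sqrt{r_{xx}\, r_{yy}}}

where r_xy is the observed correlation and r_xx and r_yy are the reliabilities of the two rating measures. The article's extended formulas add terms for bias variance (attenuation) and bias covariance (inflation) that depend on the rating design; they are not reproduced here.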

2.
Under trait theory, ratings may be modeled as a function of the temperament of the child and the bias of the rater. Two linear structural equation models are described, one for mutual self and partner ratings, and one for multiple ratings of related individuals. Application of the first model to EASI temperament data collected from spouses rating each other shows moderate agreement between raters and little rating bias. Spouse pairs agree moderately when rating their twin children, but there is significant rater bias, with greater bias for monozygotic than for dizygotic twins. MLEs of heritability are approximately .5 for all temperament scales with no common environmental variance. Results are discussed with reference to trait validity, the person–situation debate, halo effects, and stereotyping. Questionnaire development using ratings on family members permits increased rater agreement and reduced rater bias. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
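A minimal statement of the rating model the abstract describes (the notation here is assumed, not the authors'): rater i's rating of child j decomposes as

R_{ij} = \lambda T_{j} + B_{i} + E_{ij}

where T_j is the child's latent temperament, B_i is the rater's bias, and E_ij is residual error. In the twin design, Var(T) is further decomposed into genetic and environmental components, yielding the reported heritabilities of about .5 with no common environmental variance.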

3.
The psychology literature at large considers rater bias to be a substantial source of error in observer ratings. Yet, it is typically ignored by psychotherapy researchers using participant (psychotherapist/client) ratings. In particular, interrater variability, or differences between raters' overall tendency to rate others favorably or unfavorably, has been a largely ignored source of error in studies that use psychotherapists and/or clients as raters. Ignoring rater bias can have serious consequences for statistical power and for interpretation of research findings. Rater bias may be a particular problem in psychotherapy research, as psychotherapists are often asked to rate subjective variables that require much rater inference. Consequently, we examined the extent to which rater bias is a factor in psychotherapist ratings of client transference and insight, by comparing psychotherapist variance from these ratings to psychotherapist variance in ratings of client-perceived emotional intelligence, using Hierarchical Linear Modeling. Results suggest that bias may be a substantial source of error in psychotherapist process and relationship ratings, accounting for, on average, 38% of the total variance in scores, and 30% after accounting for perceived emotional intelligence. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
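A minimal sketch of this kind of variance partitioning in Python, using a random-intercept model in statsmodels (the variable names and simulated data are hypothetical; the abstract's 38% figure is used only to parameterize the simulation):

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n_therapists, n_clients = 30, 10                   # hypothetical design
ther = np.repeat(np.arange(n_therapists), n_clients)
# simulate therapist (rater) effects at ~38% of total variance, per the abstract
bias = rng.normal(0, np.sqrt(0.38), n_therapists)
score = bias[ther] + rng.normal(0, np.sqrt(0.62), n_therapists * n_clients)
df = pd.DataFrame({"therapist": ther, "transference": score})

m = smf.mixedlm("transference ~ 1", df, groups=df["therapist"]).fit()
var_rater = float(m.cov_re.iloc[0, 0])             # between-therapist variance
var_resid = m.scale                                # within-therapist (residual) variance
icc = var_rater / (var_rater + var_resid)          # rater share of total variance
print(f"estimated rater share of variance: {icc:.2f}")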

4.
Few studies in counseling and psychotherapy have investigated rater bias. The purpose of this study was to outline a method for studying rater bias. We studied three potential sources of rater bias: (a) characteristics of the rater, client, and therapist; (b) the similarity of characteristics between rater and therapist or client; and (c) perceived similarity between rater and therapist or client. We used a new rater-bias measure. The data for the study were ratings on the Collaborative Study Psychotherapy Rating Scale for 826 sessions of psychotherapy in the Treatment of Depression Collaborative Research Program. High interrater reliability was found for all scales of the measure. We found evidence of rater bias only on the facilitative conditions scale. Rater bias was not found for the other scales, perhaps because of the extensive development of the measure, careful selection of the raters, lengthy rater training, and continued contact with raters throughout the rating period. The rater-bias measure may be useful to other researchers as a means of testing the reactivity of their measures to rater bias. Finally, the method for investigating rater bias can be used by other researchers to evaluate rater bias. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

5.
This study extends multisource feedback research by assessing the effects of rater source and raters' cultural value orientations on rating bias (leniency and halo). Using a motivational perspective of performance appraisal, the authors posit that subordinate raters, followed by peers, will exhibit more rating bias than superiors. More important, given that multisource feedback systems were premised on low power distance and individualistic cultural assumptions, the authors expect raters' power distance and individualism-collectivism orientations to moderate the effects of rater source on rating bias. Hierarchical linear modeling on data collected from 1,447 superiors, peers, and subordinates who provided developmental feedback to 172 military officers shows that (a) subordinates exhibit the most rating leniency, followed by peers and superiors; (b) subordinates demonstrate more halo than superiors and peers, whereas superiors and peers do not differ; (c) the effects of power distance on leniency and halo are stronger for subordinates than for peers and superiors; and (d) the effects of collectivism on leniency were stronger for subordinates and peers than for superiors, whereas effects on halo were stronger for subordinates than for superiors but did not differ between subordinates and peers. The present findings highlight the role of raters' cultural values in multisource feedback ratings. (PsycINFO Database Record (c) 2011 APA, all rights reserved)
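One common way to operationalize the two biases, sketched in Python under assumed column names (this is not the authors' code; halo here is indexed as low within-ratee spread across dimensions):

import numpy as np
import pandas as pd

# hypothetical long-format ratings: one row per rater x ratee x dimension
rng = np.random.default_rng(0)
n_raters, n_ratees, n_dims = 20, 5, 6
df = pd.DataFrame({
    "rater": np.repeat(np.arange(n_raters), n_ratees * n_dims),
    "ratee": np.tile(np.repeat(np.arange(n_ratees), n_dims), n_raters),
    "score": rng.normal(4, 1, n_raters * n_ratees * n_dims).clip(1, 7),
})

# leniency: overall elevation of a rater's scores across ratees and dimensions
leniency = df.groupby("rater")["score"].mean()

# halo: average each rater's per-ratee SD across dimensions and flip the sign,
# so that undifferentiated (haloed) raters get higher values
halo = -(df.groupby(["rater", "ratee"])["score"].std()
           .groupby("rater").mean())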

6.
Rater biases are of interest to behavior genetic researchers, who often use ratings data as a basis for studying heritability. Inclusion of multiple raters for each sibling pair (M. Bartels, D. I. Boomsma, J. J. Hudziak, T. C. E. M. van Beijsterveldt, & E. J. C. G. van den Oord, see record 2007-18729-006) is a promising strategy for controlling bias variance and may yield information about sources of bias in heritability studies. D. A. Kenny's (2004) PERSON model is presented as a framework for understanding determinants of rating reliability and validity. Empirical findings on rater bias in other contexts provide a starting point for addressing the impact of rater-unique perceptions in heritability studies. However, heritability studies use distinctive rating designs that may accentuate some sources of bias, such as rater communication and contrast effects, which warrant further study. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

7.
Although the letter of recommendation is the most commonly requested information relating to the personal qualities of applicants to dietetics internship programs, little research has focused on its value in selection decisions. The purpose of this study was to review how 318 letters of recommendation submitted on a standardized form related to the source of the reference and to the admission status of the applicants. The form contained 40 attributes that raters assigned to one of six categories. Nine of the 40 attributes were not rated by more than 75% of the raters, and 3 of the attributes were rated as outstanding by more than 60% of the raters. We concluded that these attributes did little to distinguish among applicants. The attribute maturity correlated 0.70 with 5 attributes and 0.99 with 2 attributes, so duplication of information existed. Raters were categorized as follows: adviser, major professor, other professor, employer, and other. The highest mean ratings were given by advisers; major professors rated students lowest. Analysis of variance supported a significant difference in rating by type of rater. Our findings suggest that fewer items should be used on a standardized form and that the type of rater should be specified if references are to distinguish among applicants.

8.
A note on the statistical correction of halo error.
Attempts to eliminate halo error from rating scales by statistical correction have assumed halo to be a systematic error associated with a ratee–rater pair that adds performance-irrelevant variance to ratings. Furthermore, overall performance ratings have been assumed to reflect this bias. Consideration of the source of halo error, however, raises the possibility that the cognitive processes resulting in halo also mediate expectations of and interactions with employees, indirectly influencing true performance and ability via instruction, feedback, and reinforcement. If so, it would not be possible to correct for halo error using overall performance ratings. (26 ref) (PsycINFO Database Record (c) 2010 APA, all rights reserved)

9.
D. A. Kenny (1994) estimated the components of personality rating variance to be 15, 20, and 20% for target, rater, and relationship, respectively. To enhance trait variance and minimize rater variance, we designed a series of studies of personality perception in discussion groups (N = 79, 58, and 59). After completing a Big Five questionnaire, participants met 7 times in small groups. After Meetings 1 and 7, group members rated each other. By applying the Social Relations Model (D. A. Kenny and L. La Voie, 1984) to each Big Five dimension at each point in time, we were able to evaluate 6 rating effects as well as rating validity. Among the findings were that (a) target variance was the largest component (almost 30%), whereas rater variance was small (less than 11%); (b) rating validity improved significantly with acquaintance, although target variance did not; and (c) no reciprocity was found, but projection was significant for Agreeableness. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
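A simplified round-robin decomposition in the spirit of the Social Relations Model, in Python (a sketch only; the full Kenny and La Voie estimators include corrections for the missing diagonal and for effect-estimate sampling error that are omitted here, and relationship variance remains confounded with error absent replications):

import numpy as np

rng = np.random.default_rng(2)
n = 12                                             # group size (hypothetical)
target = rng.normal(0, np.sqrt(0.30), n)           # ~30% target variance, as reported
rater = rng.normal(0, np.sqrt(0.10), n)            # ~10% rater variance
X = rater[:, None] + target[None, :] + rng.normal(0, np.sqrt(0.60), (n, n))
np.fill_diagonal(X, np.nan)                        # no self-ratings in a round robin

grand = np.nanmean(X)
rater_eff = np.nanmean(X, axis=1) - grand          # row effects: rater tendencies
target_eff = np.nanmean(X, axis=0) - grand         # column effects: target consensus
rel = (X - np.nanmean(X, axis=1, keepdims=True)
         - np.nanmean(X, axis=0, keepdims=True) + grand)

comps = np.array([np.var(target_eff), np.var(rater_eff), np.nanvar(rel)])
print("target / rater / relationship+error share:", np.round(comps / comps.sum(), 2))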

10.
This study investigated within-source interrater reliability of supervisor, peer, and subordinate feedback ratings made for managerial development. Raters provided 360-degree feedback ratings on a sample of 153 managers. Using generalizability theory, results indicated that little within-source agreement exists; a large portion of the error variance is attributable to the combined rater main effect and Rater X Ratee effect; more raters are needed than currently used to reach acceptable levels of reliability; supervisors are the most reliable with trivial differences between peers and subordinates when the numbers of raters and items are held constant; and peers are the most reliable, followed by subordinates, followed by supervisors, under conditions commonly encountered in practice. Implications for the validity, design, and maintenance of 360-degree feedback systems are discussed along with directions for future research in this area. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
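The generalizability-theory logic behind needing more raters can be stated compactly. For a single-facet person x rater design, the reliability (generalizability coefficient) of a score averaged over n_r raters is

E\hat{\rho}^{2} = \frac{\sigma^{2}_{p}}{\sigma^{2}_{p} + (\sigma^{2}_{pr} + \sigma^{2}_{e}) / n_{r}}

where σ²_p is ratee (person) variance and σ²_pr + σ²_e is the confounded Rater X Ratee and error variance the study found dominant; adding raters shrinks the error term. This is the standard formula, not a result reported by the study.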

11.
This study quantified the effects of 5 factors postulated to influence performance ratings: the ratee's general level of performance, the ratee's performance on a specific dimension, the rater's idiosyncratic rating tendencies, the rater's organizational perspective, and random measurement error. Two large data sets, consisting of managers (n = 2,350 and n = 2,142) who received developmental ratings on 3 performance dimensions from 7 raters (2 bosses, 2 peers, 2 subordinates, and self), were used. Results indicated that idiosyncratic rater effects (62% and 53%) accounted for over half of the rating variance in both data sets. The combined effects of general and dimensional ratee performance (21% and 25%) were less than half the size of the idiosyncratic rater effects. Small perspective-related effects were found in boss and subordinate ratings but not in peer ratings. Average random error effects in the 2 data sets were 11% and 18%. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

12.
Using a round-robin design in which every subject served both as judge and target, subjects made liking judgments, trait ratings, and physical attractiveness ratings of each other on each of 4 days. Although there was some agreement in the liking judgments, most of the variance was due to idiosyncratic preferences for different targets. Differences in evaluations were due to at least 2 factors: disagreements in how targets were perceived (is this person honest?) and disagreements in how to weight the trait attributes that predicted liking (is honesty more important than friendliness?). When evaluating the targets in specific roles (as a study partner), judgments showed much greater agreement, as did the weights of the trait attributes. A 2nd study confirmed the differential weighting of trait attributes when rating liking in general and the increased agreement when rating specific roles. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

13.
In light of consistently observed correlations among Big Five ratings, the authors developed and tested a model that combined E. L. Thorndike’s (1920) general evaluative bias (halo) model and J. M. Digman’s (1997) higher order personality factors (alpha and beta) model. With 4 multitrait–multimethod analyses, Study 1 revealed moderate convergent validity for alpha and beta across raters, whereas halo was mainly a unique factor for each rater. In Study 2, the authors showed that the halo factor was highly correlated with a validated measure of evaluative biases in self-ratings. Study 3 showed that halo is more strongly correlated with self-ratings of self-esteem than self-ratings of the Big Five, which suggests that halo is not a mere rating bias but actually reflects overly positive self-evaluations. Finally, Study 4 demonstrated that the halo bias in Big Five ratings is stable over short retest intervals. Taken together, the results suggest that the halo-alpha-beta model integrates the main findings in structural analyses of Big Five correlations. Accordingly, halo bias in self-ratings is a reliable and stable bias in individuals’ perceptions of their own attributes. Implications of the present findings for the assessment of Big Five personality traits in monomethod studies are discussed. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

14.
The construct validity of developmental ratings of managerial performance was assessed by using 2 data sets, each based on a different 360° rating instrument. Specifically, the authors investigated the nature of the constructs measured by developmental ratings, the structural relationships among those constructs, and the generalizability of results across 4 rater perspectives (boss, peer, subordinate, and self). A structure with 4 lower order factors (Technical Skills, Administrative Skills, Human Skills, and Citizenship Behaviors) and 2 higher order factors (Task Performance and Contextual Performance) was tested against competing models. Results consistently supported the lower order constructs, but the higher order structure was problematic, indicating that the structure of ratings is not yet well understood. Multisample analyses indicated few practically significant differences in factor structures across perspectives. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

15.
144 deputy sheriffs were rated on 9 job performance dimensions with 2 rating scales by 2 raters. Results indicate that the rating scales (the Multiple Item Appraisal Form and the Global Dimension Appraisal Form) developed in this study were able to minimize the major problems often associated with performance ratings (i.e., leniency error, restriction of range, and low reliability). A multitrait/multimethod analysis indicated that the rating scales possessed high convergent and discriminant validity. A multitrait/multirater analysis indicated that although the interrater agreement and the degree of rated discrimination on different traits by different raters were good, there was a substantial rater bias, or strong halo effect. This halo effect in the ratings, however, may really be a legitimate general factor rather than an error. (11 ref) (PsycINFO Database Record (c) 2010 APA, all rights reserved)

16.
The purpose of this study was to test competing theories regarding the relationship between true halo (actual dimensional correlations) and halo rater error (effects of raters' general impressions on specific ratee qualities) at both the individual and group level of analysis. Consistent with the prevailing general impression model of halo rater error, results at both the individual and group level analyses indicated a null (vs. positive or negative) true halo-halo rater error relationship. Results support the ideas that (a) the influence of raters' general impressions is homogeneous across rating dimensions despite wide variability in levels of true halo; (b) in assigning ratings, raters rely both on recalled observations of actual ratee behaviors and on general impressions of ratees in assigning dimensional ratings; and (c) these 2 processes occur independently of one another. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
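The general impression model being tested can be written compactly (the notation here is assumed): the observed rating of a ratee on dimension d is

X_{d} = T_{d} + \lambda G + e_{d}

where T_d is true performance on dimension d (the intercorrelations among the T_d constitute true halo), G is the rater's general impression, and λ is its weight, homogeneous across dimensions (halo rater error). A null true halo-halo rater error relationship means λ does not covary with the size of the intercorrelations among the T_d.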

17.
Three methods of assessing subgroup bias in performance measurement commonly found in the literature are identified. After a review of these approaches, findings are reported from analyses of data collected in the US Army's Project A (J. P. Campbell, 1987). Correlations between nonrating performance measures and supervisor ratings were generally not moderated by race, but correlations between nonrating indicators of negative performance and ratings assigned by peers were. In addition, significant interactions between rater and ratee race on performance ratings were not eliminated when variance in the nonrating measures was removed from the ratings provided by Black and White raters. Conclusions about the magnitude and nature of bias in supervisor and peer ratings are discussed. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

18.
130 undergraduates rated 33 paragraphs describing the performance of supermarket checkers for one of the following purposes: merit raise, development, or retention. The paragraphs were assembled using previously scaled behavioral anchors describing 5 dimensions of performance. The authors conclude that (a) purpose of the rating was a more important variable in explaining the overall variability in ratings than was rater training; (b) training raters to evaluate for some purposes led to more accurate evaluations than training for other purposes; and (c) rater strategy varied with purpose of the rating (i.e., identical dimensions were weighted, combined, and integrated differently as a function of purpose). (24 ref) (PsycINFO Database Record (c) 2010 APA, all rights reserved)

19.
The effects of rater source, rater and ratee race, rater and ratee sex, and job type were investigated on ratings collected for 8,642 first-term Army enlisted personnel. Ratings were made on 10 behaviorally based dimensions developed for evaluating all first-term soldiers. Results of between-Ss analyses similar to those conducted in past research revealed significant main effects and interactions for sex, race, rater source, and job type, but the variance accounted for by these effects was minimal. Repeated measures analyses were also performed, with each ratee evaluated by one Black and one White rater for the race effects analyses and one female and one male rater for the sex effects analyses. These analyses, which unconfounded rater bias and actual performance differences, yielded results similar to those obtained with the between-Ss design. Implications of the findings are discussed. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

20.
The purpose of this study was to test whether a multisource performance appraisal instrument exhibited measurement invariance across different groups of raters. Multiple-groups confirmatory factor analysis as well as item response theory (IRT) techniques were used to test for invariance of the rating instrument across self, peer, supervisor, and subordinate raters. The results of the confirmatory factor analysis indicated that the rating instrument was invariant across these rater groups. The IRT analysis yielded some evidence of differential item and test functioning, but it was limited to the effects of just 3 items and was trivial in magnitude. Taken together, the results suggest that the rating instrument could be regarded as invariant across the rater groups, thus supporting the practice of directly comparing their ratings. Implications for research and practice are discussed, as well as for understanding the meaning of between-source rating discrepancies. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
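For a dichotomous item under the two-parameter logistic model, invariance across rater groups g means the item parameters carry no group index:

P(X_{ij} = 1 \mid \theta_{j}, g) = \frac{1}{1 + \exp[-a_{i}(\theta_{j} - b_{i})]}, \quad a_{i}^{(g)} = a_{i},\; b_{i}^{(g)} = b_{i} \text{ for all } g

Differential item functioning is the failure of that equality for some item i. Appraisal items are typically polytomous, so a graded-response analogue would apply in practice, but the invariance logic is the same; the study found such effects for only 3 items and at trivial magnitude.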
