Similar articles
Found 20 similar articles (search time: 750 ms)
1.
Rater contamination may occur when the same rater makes both prediction and criterion ratings; technique contamination may occur when both the prediction and criterion ratings are on the same form. 400 Army officers provided ratings on a graphic scale and two versions of a forced-choice scale. Each officer rated 20 of his fellow officers on the graphic scale, and eight days later rerated two of them on the graphic scale and one of the forced-choice scales. Intercorrelations for the same raters on the same (graphic) form were .69-.82. Intercorrelations for the same rater on different forms were .52-.57. For different raters, intercorrelations were below .3, whether on the same or different forms. It is concluded that the rater is the principal source of contamination. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

2.
We evaluated the effects of a voice output communication aid (VOCA) and naturalistic teaching procedures on the communicative interactions of young children with autism. A teacher and three assistants were taught to use naturalistic teaching strategies to provide opportunities for VOCA use in the context of regularly occurring classroom routines. Naturalistic teaching procedures and VOCA use were introduced in multiple probe fashion across 4 children and two classroom routines (snack and play). As the procedures were implemented, all children showed increases in communicative interactions using VOCAs. Also, there was no apparent reductive effect of VOCA use within the naturalistic teaching paradigm on other communicative behaviors. Teachers' ratings of children's VOCA communication, as well as ratings of a person unfamiliar with the children, supported the contextual appropriateness of the VOCA. Probes likewise indicated that the children used the VOCAs for a variety of different messages including requests, yes and no responses, statements, and social comments. Results are discussed in regard to the potential benefits of a VOCA when combined with naturalistic teaching procedures. Future research needs are also discussed, focusing on more precise identification of the attributes of VOCA use for children with autism, as well as for their support personnel.

3.
PURPOSE: To evaluate whether clinical-teaching skills could be improved by providing teachers with augmented student feedback. METHOD: A randomized, controlled trial in 1994 included 42 attending physicians and 39 residents from the Department of Medicine at the Indiana University School of Medicine who taught 110 students on medicine ward rotations for one-month periods. Before teaching rotations, intervention group teachers received norm-referenced, graphic summaries of their teaching performances as rated by students. At mid-month, intervention group teachers received students' ratings augmented by individualized teaching-effectiveness guidelines based on the Stanford Faculty Development Program framework. Linear models were used to analyze the students' mean ratings of teaching behaviors at mid-month and end-of-month. Independent variables included performance ratings, intervention status, teacher status, teaching experience, and interactions with baseline ratings. RESULTS: Complex interactions with baseline performance were found for most teaching categories at mid-month and end-of-month. The intervention-group teachers who had high baseline performance scores had higher student ratings than did the control group teachers with similar baseline scores; the intervention group teachers who had low baseline performance scores were rated lower than were the control group teachers with comparable baseline scores. The residents who had medium or high baseline scores were rated higher than were the attending physicians with comparable baseline scores; the performance of the residents who had low baseline scores was similar to that of the attending physicians with comparable baseline scores. CONCLUSION: Baseline performance is important for targeting those teachers most likely to benefit from augmented student feedback. Potential deterioration in teaching performance warrants a reconsideration of distributing students' ratings to teachers with low baseline performance scores.

4.
A technique alternative to the conventional ratings of engineers by their supervisors was studied. A 20-triad forced-choice rating scale was constructed. 33 engineers were rated by their supervisors using this device. The reliability of these ratings was .90. An item analysis showed 19 of the 20 triads to have strong discriminating power between high and low scorers. The same Ss were also rated in 8 different areas on a 4-point scale. The reliability of the 2nd rating scale was .87. The 2 scales correlated .73 with each other. These findings support previous research concerned with the more general applicability of the forced-choice technique for the determination of criterion scores. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

5.
Attempted to verify the value of classical and modified forced-choice rating systems. A double modification was performed: (a) only items of a general nature, i.e., those concerning all jobs in a large group of jobs, were used; and (b) only items of neutral attractiveness were used. The lists of classical and modified forced-choice items were constructed separately for 250 blue-collar workers, 110 white-collar workers, and 97 supervisors. The modified forced-choice system was superior for the blue- and white-collar workers. The comparison of the mean ratings of each group also indicated superior separation with the modified forced-choice system. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

6.
Rated 900 counselor statements on a persuasion scale that assesses counselor conviction and client agreement. The 24 highest-rated and 15 lowest-rated persuasive statements were fed into a graphic level recorder. Differences in the graphs of the high- and low-persuasive statements indicate that loudness is a characteristic of persuasion. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

7.
Do behavioral observation scales measure observation?   Total citations: 1 (self-citations: 0; citations by others: 1)
G. Latham and K. Wexley (see record 1980-02200-001) have claimed that behavioral observation scales (BOS) pose a simpler task for the rater than do either behaviorally anchored rating scales or graphic rating scales; with BOS, the rater need only observe and record behavior and need not make complex judgments about performance. Research on person memory suggests that recall for behaviors is structured by the same trait inferences and judgments that BOS are designed to avoid. In 2 experiments, 91 undergraduates rated videotaped lectures; data from the 1st experiment were used to construct BOS measuring clarity and speaking style. In the 2nd experiment, Ss used the BOS and a graphic rating scale to rate videotaped lectures in immediate and delayed rating conditions. As expected, the correlations between BOS ratings and judgmental ratings of performance were stronger when demands were placed on rater's recall. It is suggested that recall of behaviors is determined by the degree to which certain behaviors are representative of general judgments made about Ss being rated, and that BOS measure traitlike judgments rather than behavioral observation. (27 ref) (PsycINFO Database Record (c) 2010 APA, all rights reserved)

8.
9.
Examined the generalizability and validity of student (N = 485) ratings by studying within-class and between-classes correlations of ratings with other variables for regular faculty teaching lecture courses as well as for graduate assistants teaching recitation sections. Results indicate that most ratings were highly generalizable but only some were related to learning and that certain aspects of both generalizability and validity varied with instructor's role and with level of data. The implications of these findings for the evaluation of teaching are discussed with reference to 2 alternative paradigms: construct validity and criterion development. (22 ref) (PsycINFO Database Record (c) 2010 APA, all rights reserved)

10.
An attempt was made to determine the extent to which the interval properties of attitude scales constructed by the method of successive intervals are dependent upon the stimulus spacing properties of the statement group that is judged. 4 stimulus spacing conditions were used. 312 Ss were asked to judge on a 9-category scale sets of statements that had been extracted from the Thurstone and Chave scale measuring attitudes toward the church. The results of the study showed a straight line fit could accommodate scale values coming from paired experimental groups. Dispersion estimates did not permit a linear fit. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

11.
College instructors in 329 classes evaluated their own teaching effectiveness with the same 35-item rating form used by their students. There was student–instructor agreement in courses taught by teaching assistants (r = .46), undergraduate courses taught by faculty (r = .41), and graduate level courses (r = .39). Separate factor analyses of the student and instructor ratings demonstrated that the same 9 evaluation factors (e.g., work load, organization, interaction) underlay both sets of ratings. A multitrait–multimethod analysis supported convergent and divergent validity of these rating factors. Not only were correlations between student and instructor ratings on the same factors statistically significant for each of 9 factors, but correlations between their ratings on different factors were low. Findings demonstrate student–instructor agreement on evaluation of teaching effectiveness, support the validity of student ratings for both graduate and undergraduate courses, and emphasize the importance of using multifactor rating scales derived through the application of factor analysis. (28 ref) (PsycINFO Database Record (c) 2010 APA, all rights reserved)

12.
Previous attempts at statistically controlling for bias in ratings have been unsuccessful due to the inability to separate valid from invalid halo. By identifying item validities, the author statistically removed ratings on invalid items from ratings on valid items in the prediction of forced-choice ratings. Using this procedure, 11 studies were conducted in 2 organizations, with ratings done for 3 purposes, using 5 rating forms to evaluate 5 levels of jobs by 4 levels of raters. The multiple tests provided consistently high relationships between the forced-choice ratings and the statistically controlled behavioral checklist ratings. (22 ref) (PsycINFO Database Record (c) 2010 APA, all rights reserved)

13.
"… an attempt to integrate the results of previous studies by analyzing a comprehensive list of appearance scales derived from… the literature and from insights of people making decisions in a forced-choice test situation." 100 items were taken randomly from a larger pool, but later randomly reduced to 40. To 12 appearance scales was added a discrimination index. A correlation matrix, generated from mean item ratings, was factor analyzed to yield 5 factors. "The finding of a general factor, supported by previous studies, brings an element of economy to forced-choice scale construction tending to support the pairing of items on only one appearance index." (PsycINFO Database Record (c) 2010 APA, all rights reserved)

14.
To investigate the effects of Ss' attitudes and of response language on judgments of attitude statements, 62 university students rated 20 statements on the issue of the use of hallucinogenic drugs in terms of personal acceptability, and on 4 other rating scales. Two types of scales were used: A+ scales, where the antidrug end was marked by an evaluatively positive label and the prodrug end by an evaluatively negative label; and P+ scales, where the antidrug end was negatively labeled and the prodrug end positively labeled. In Condition 1, Ss were given only A+ scales; in Condition 2, only P+ scales; and in Conditions 3 and 4, 2 A+ and 2 P+ scales. Results confirm the accentuation theory prediction that "anti"-Ss should give more polarized ratings than "pro" Ss on A+ scales and less polarized ratings than "pro" Ss on P+ scales. This was so regardless of whether scale type was a between-Ss factor (comparison of Conditions 1 and 2) or, as in previous studies, a within-Ss factor (Conditions 3 and 4). Previous findings of a tendency for more anti ratings overall on A+ scales, and more pro ratings on P+ scales, were contradicted but it is argued that this may be due to Ss finding the statements mainly unacceptable. Other findings concern choice of adjectives to describe similar and dissimilar others. (French abstract) (26 ref) (PsycINFO Database Record (c) 2010 APA, all rights reserved)

15.
"A group of Ss was given the Taylor MAS and 6 weeks later the Heinman forced-choice version of the scale. Skin conductance measures were obtained for each S under 2 conditions: A rest period; and a task period involving shock threat for some Ss, and no shock threat for others. The results indicated that the forced-choice scale was positively related to the readings taken under rest condition, and negatively related to changes in conductance obtained under the threat of shock. The MAS did not correlate with any of the conductance readings." 18 references. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

16.
Peer ratings made during a structured small group interaction (using the Group Assessment of Interpersonal Traits) were used to select participants from a pool of 103 female undergraduates who had volunteered for a human service practicum. Participants with the 32 highest and 30 lowest scores on behavioral measures of empathy, warmth, and openness (therapeutic talent) were randomly assigned to 3 training conditions: problem-solving interviewing, diagnostic interviewing, or no training. Each participant then served as an understanding listener in a problem-focused dyadic interview. Ratings made by interviewees and by 2 independent, objective raters were higher for those initially selected as having high therapeutic talent. Those noted as high in therapeutic talent also performed better as measured by objective behavioral ratings of the content of their interview statements. There were no systematic training effects. The implications of these results for the selection and training of nonprofessional mental health workers are discussed. (33 ref) (PsycINFO Database Record (c) 2010 APA, all rights reserved)

17.
Bayesian reasoning can be improved by representing information in frequency formats rather than in probabilities. This thesis opens up applications in medicine, law, statistics education, and other fields. The beneficial effect is no longer in dispute, but rather its cause and its boundary conditions. C. Lewis and G. Keren (1999) argued that the effect of frequency formats is due to "joint statements" rather than to "frequency statements." However, they overlooked the fact that our thesis is about frequency formats, not just any kind of frequency statements. We show that joint statements alone cannot account for the effect. B. A. Mellers and A. P. McGraw (1999) proposed a boundary condition under which the beneficial effect is reduced. In a reanalysis of our original data, we found this reduction for the problem they used but not for any other problem. We conclude by summarizing results indicating that teaching frequency representations fosters insight into Bayesian reasoning. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

18.
The list-strength effect arises when increasing the strength of some items in a list reduces memory for the remaining items. The list-strength effect was investigated under conditions of rapid visual presentation. Randomized and blocked formats were used for the mixed lists. Performance was measured with both yes–no and forced-choice recognition procedures. Overall no evidence for a list-strength effect in recognition was found except under conditions that may promote reverse rehearsal borrowing. Two experiments were conducted to determine why performance on the yes–no tests was greater than on the forced-choice tests. Repeated testing with the yes–no procedure promoted more effective encoding than the forced-choice procedure. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

19.
"A forced-choice rating scale designed to determine the extent of a person's productive research behavior was developed at a Midwestern physical science foundation. Of the two experimental scales developed the better form showed an interrater reliability coefficient of .62 and a validity of .60. When the ratings of two raters were averaged the validity of the scale increased to .74." (PsycINFO Database Record (c) 2010 APA, all rights reserved)

20.
A revised system for numerically coding mixed standard scale (MSS) response combinations is proposed, and the psychometric implications of the new system are examined in the context of a comparative empirical study. The MSS ratings and graphic scale ratings obtained from 18 police supervisors for a total of 92 police patrol officers suggest that the proposed system does not substantially alter the findings of such comparative studies. The data further suggest that contradictions in the literature regarding the relative psychometric strengths and weaknesses of MSS ratings are probably not a function of the coding system originally proposed by F. Blanz and E. E. Ghiselli (1972) for translating MSS responses into numerical ratings. Adoption of the revised system is nevertheless recommended on the basis of (a) anticipated reliability increments and (b) greater "face validity" of the coding system, thereby rendering the MSS format potentially more acceptable to raters and ratees. (12 ref) (PsycINFO Database Record (c) 2010 APA, all rights reserved)
