Validity concerns and usefulness of student ratings of instruction.   Total citations: 6 (self-citations: 0, citations by others: 6)
The validity of student rating measures of instructional quality was severely questioned in the 1970s. By the early 1980s, however, most expert opinion viewed student rating measures as valid and as worthy of widespread use. In retrospect, older discriminant-validity concerns were not so much resolved as they were displaced from research attention by accumulating evidence for convergent validity. This article introduces a Current Issues section that gives new attention to validity concerns associated with student ratings. The section's 4 articles deal, respectively, with (a) conceptual structure (are student ratings unidimensional or multidimensional?), (b) convergent validity (how well do ratings correlate with other indicators of effective teaching?), (c) discriminant validity (are ratings influenced by factors other than teaching effectiveness?), and (d) consequential validity (are ratings used effectively in personnel development and evaluation?). Although all 4 articles favor the use of ratings, they disagree on controversial points associated with interpretation and use of ratings data. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

Navigating student ratings of instruction.   Total citations: 8 (self-citations: 0, citations by others: 8)
Many colleges and universities have adopted the use of student ratings of instruction as one (often the most influential) measure of instructional effectiveness. In this article, the authors present evidence that although effective instruction may be multidimensional, student ratings of instruction measure general instructional skill, which is a composite of 3 subskills: delivering instruction, facilitating interactions, and evaluating student learning. The authors subsequently report the results of a meta-analysis of the multisection validity studies that indicate that student ratings are moderately valid; however, administrative, instructor, and course characteristics influence student ratings of instruction. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

The authors argue that the multisection validation design is the strongest design for addressing the degree to which student ratings predict teacher-produced learning. Results of several dozen multisection validity studies appear inconsistent. Unfortunately, prior quantitative reviews did not answer questions about the diversity of findings. The authors explore sensitivity of the prior analyses to identify true explanatory characteristics, generalizability of the findings across dimensions of teaching, and adequacy of the analyses to identify potential explanatory characteristics. They conclude that prior analyses lack adequate statistical power, explanatory characteristics vary with the dimension of teaching being validated, and a host of other study features remain to be investigated. Those features are identified through nomological coding of 43 validity studies. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

Reanalyzed data from 2 studies of the Dr. Fox effect with 419 undergraduates (R. G. Williams and J. E. Ware; see PA, Vol 55:8285 and Vol 60:12778). Factor analysis identified 5 evaluation factors that varied in the way they were affected by experimental manipulations of instructor expressiveness (IE) and content coverage in 3 incentive conditions. For Ss in the incentive condition most like the actual classroom, the Dr. Fox effect was not supported in that (a) IE only affected ratings of Instructor Enthusiasm (the factor most logically related to the manipulation) and (b) content coverage affected ratings of Instructor Knowledge (the factor most logically related to that manipulation) and examination performance, but not ratings of Instructor Enthusiasm. However, when Ss were not given the incentive to learn, IE had a greater impact than did content coverage on each of the rating factors (supporting the Dr. Fox effect) and examination performance. (23 ref) (PsycINFO Database Record (c) 2010 APA, all rights reserved)

This paper describes the underlying structure of student ratings of instruction, using three units of analysis. Data were analyzed using individual ratings, class means, and deviations from class means. Course characteristics were then used as independent variables in the prediction of factor scores. Results indicated that the underlying structure of class means is different from the structure yielded by the other units of analysis. Course characteristics described a small percentage of the variance of the factor scores. The usual robust factor structure of ratings, assumed to be due to shared implicit theories about instructor behavior or item similarity, was questioned. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
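The three units of analysis named in the abstract above can be illustrated with a small sketch. The ratings below are made-up data, and `split_units` is a hypothetical helper; the abstract does not describe the authors' actual procedure beyond naming the three units, and the factor analysis itself is not reproduced here.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical data: (class_id, rating on a 1-5 scale). Not from the study.
ratings = [("A", 4), ("A", 5), ("A", 3), ("B", 2), ("B", 3), ("B", 1)]

def split_units(rows):
    """Return individual ratings, class means, and deviations from class means."""
    by_class = defaultdict(list)
    for class_id, rating in rows:
        by_class[class_id].append(rating)
    # One mean per class: between-class information only.
    class_means = {c: mean(values) for c, values in by_class.items()}
    # Raw ratings: between-class and within-class information mixed together.
    individual = [rating for _, rating in rows]
    # Each rating minus its class mean: within-class information only.
    deviations = [rating - class_means[c] for c, rating in rows]
    return individual, class_means, deviations

individual, class_means, deviations = split_units(ratings)
```

Because the deviations remove every between-class difference and the class means remove every within-class difference, a structure found in one of these data sets need not appear in the other.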

D. P. Hoyt et al. (1973, 1978) recommend the use of norm groups, controlled for class size and student motivation, to adjust student ratings of instruction and make comparisons among teachers more appropriate. The method that Hoyt employs for generating norm groups relies heavily on the use of volunteer Ss. This study made a student evaluation-of-instruction questionnaire available to 169 teachers over a 2-yr period. 69 instructors volunteered to use the questionnaire. Volunteers were compared with nonvolunteers on a previously collected mandatory student evaluation questionnaire. The hypothesis that volunteer teachers' student ratings were reliably superior to the ratings of nonvolunteer teachers was confirmed. The implications for the generation and use of norm groups are discussed. (23 ref) (PsycINFO Database Record (c) 2010 APA, all rights reserved)

Numerous studies have examined the relationship between grades and student ratings of instruction in order to help determine whether students can provide unbiased appraisals. The results of these studies have been inconsistent, and there has been little explanation of the diverse findings. Analysis of the previous research suggests that some of the variability in results can be explained by distinguishing between individual and class effects. The present study directly compared individual and class effects using large samples of both undergraduate and graduate students (Ns = 5,893 and 21,648, respectively). Class effects were generally stronger than individual effects, and the statistical manipulation involved in changing units of analysis did not account for this difference. This suggests that the grades–rating relationships may be attributable to different elements in the rating process for each level of analysis. (51 ref) (PsycINFO Database Record (c) 2010 APA, all rights reserved)

The authors surveyed students (N = 396) and faculty (N = 156) at a 2-year college to determine their views toward publishing student ratings of instruction. Students favored published ratings of instruction, whereas faculty did not. Students cited many advantages of published ratings and rated the likelihood of potential benefits as high relative to faculty. In contrast, faculty cited numerous disadvantages of published ratings and rated the likelihood of potential costs as high relative to students. The authors discuss reasons for the contrasting views of students and faculty and offer suggestions for reconciling them. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

Examined the generalizability and validity of student (N = 485) ratings by studying within-class and between-classes correlations of ratings with other variables for regular faculty teaching lecture courses as well as for graduate assistants teaching recitation sections. Results indicate that most ratings were highly generalizable but only some were related to learning and that certain aspects of both generalizability and validity varied with instructor's role and with level of data. The implications of these findings for the evaluation of teaching are discussed with reference to 2 alternative paradigms: construct validity and criterion development. (22 ref) (PsycINFO Database Record (c) 2010 APA, all rights reserved)

Investigated the extent to which the student's perception of the purpose for evaluating an instructor, the instructor's expressiveness, and the density of content presented in a lecture influenced student ratings and student achievement. 161 college students were randomly assigned to view lectures that systematically differed in lecturer expressiveness and density of content. The perceived purpose for evaluating the instructor had no effect on the Ss' ratings. All 5 student-rating subscale scores were significantly higher for the expressive lectures than for the nonexpressive lectures. On the dimension of instructor explanations, medium-content lectures received higher ratings than high-content lectures. (12 ref) (PsycINFO Database Record (c) 2010 APA, all rights reserved)

Examined how the side effects of initial and final lecture quality on end-of-course student ratings can be predicted from seemingly unrelated gain–loss theory. Also investigated was the effect on ratings of student belief that the instructor will use midterm rating feedback to improve teaching. Using videotaped lectures in a 2 × 2 × 2 laboratory analog study, the present authors manipulated Lecture 1 (good, poor), Lecture 2 (good, poor), and whether 131 college students were told that feedback to the instructor about Lecture 1 would be used to improve teaching (yes, no). With Lecture 2 ratings as the principal measure, ratings varied moderately and inversely with Lecture 1 quality (negative primacy effect), greatly and directly with Lecture 2 quality (positive recency effect), and trivially with feedback. The primacy/recency findings confirm gain–loss predictions and illustrate how gain–loss theory can be interpreted as primacy/recency effects. Implications for expectancy research and field research on instructors using midterm ratings to improve instruction in the final portion of the course are discussed. (32 ref) (PsycINFO Database Record (c) 2010 APA, all rights reserved)

Measures of learning and retention were obtained in ongoing college-level classes along with student ratings of instruction. Analyses of the resulting data from 90 undergraduates show that Ss who studied for and took a test not only achieved more but also retained their learning longer than Ss who "studied in order to learn rather than for a test." However, ratings of the method of instruction were lower when Ss were tested. It is concluded that testing is valuable in the learning process, but teachers who test might expect less positive evaluations from their students. (15 ref) (PsycINFO Database Record (c) 2010 APA, all rights reserved)

Studied the relationship between grades professors give their students and ratings those students give their professors, using multivariate techniques and a large sample size (2,360 of 2,449 course sections taught in the spring semester of 1973) to avoid weaknesses of previous studies. Results show the following: (a) Only one factor was present among the 8 rating items; (b) the correlation between average student grade in each course section and average student rating of the teacher of that section was .35; (c) average grade was the best predictor of average rating; and (d) when average grade was added to several other available predictors, it significantly improved the multiple correlation from .25 to .39. Findings suggest that students' grades probably do influence their ratings of faculty, accounting for about 9% of the total variance. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
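The "about 9%" figure in the abstract above is consistent with the gain in explained variance when average grade joins the other predictors, that is, the difference between the squared multiple correlations. A minimal sketch of that arithmetic follows, assuming this is how the figure was obtained (the abstract does not state the computation, and `incremental_variance` is an illustrative name):

```python
def incremental_variance(r_before: float, r_after: float) -> float:
    """Gain in proportion of variance explained: R^2 after minus R^2 before."""
    return r_after ** 2 - r_before ** 2

# Multiple correlations reported in the abstract: .25 without grade, .39 with it.
delta_r2 = incremental_variance(0.25, 0.39)
print(f"{delta_r2:.4f}")  # 0.0896, i.e. about 9% of the variance
```

Note that this is a different quantity from the square of the simple grade–rating correlation (.35 squared is about .12), which ignores the other predictors.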

Samples of about 200 undergraduate courses were investigated in each of 3 consecutive academic terms. Course survey forms assessed evaluative ratings, expected grades, and course workloads. A covariance structure model was developed in exploratory fashion for the 1st term's data, and then successfully cross-validated in each of the next 2 terms. The 2 major features of the successful model were that (a) courses that gave higher grades were better liked (a positive path from expected grades to evaluative ratings), and (b) courses that gave higher grades had lighter workloads (a negative relation between expected grades and workload). These findings support the conclusion that instructors' grading leniency influences ratings. This effect of grading leniency also importantly qualifies the standard interpretation that student ratings are relatively pure indicators of instructional quality. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

Conclusions reached in previous research about the magnitude and nature of personality–performance linkages have been based almost exclusively on self-report measures of personality. The purpose of this study is to address this void in the literature by conducting a meta-analysis of the relationship between observer ratings of the five-factor model (FFM) personality traits and overall job performance. Our results show that the operational validities of FFM traits based on observer ratings are higher than those based on self-report ratings. In addition, the results show that when based on observer ratings, all FFM traits are significant predictors of overall performance. Further, observer ratings of FFM traits show meaningful incremental validity over self-reports of corresponding FFM traits in predicting overall performance, but the reverse is not true. We conclude that the validity of FFM traits in predicting overall performance is higher than previously believed, and our results underscore the importance of disentangling the validity of personality traits from the method of measurement of the traits. (PsycINFO Database Record (c) 2011 APA, all rights reserved)

Black-and-white photographs of teachers, controlled for race, age, sex, and attractiveness, were rated on 7 factors of teacher performance by 150 students in Grades 2, 5, 7, 11, and 13. Across all developmental levels and on all factors, ratings of unattractive teachers tended to be lower. At all developmental levels, older teachers tended to receive lower ratings than younger teachers. Sex of the teacher appeared to be a more influential factor at Grades 11 and 13. Interactions showed that unattractive middle-aged female teachers and unattractive old male teachers frequently received lower ratings. (21 ref) (PsycINFO Database Record (c) 2010 APA, all rights reserved)

In a recent publication, Mowrer has described a method for obtaining palmar prints. Presumably densitometer measurements of such prints vary directly with emotional upset and relaxation. This document describes a study that was designed to test directly the hypothesis that real-life stress can be related to the index of palmar sweat. All measurements were taken during a period of stress rather than during a temporally adjacent period. Briefly, the palmar prints of a group of 34 college students were taken during an examination and contrasted with control prints of the same Ss which were taken either two weeks prior to the examination or two weeks after the examination. The method utilized for taking the palmar prints was that described by Mowrer. The results are described. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

This study used a 2 × 2 × 2 factorial experiment to examine student satisfaction with eight processes of collecting student ratings of instruction by varying (a) method (group interviews vs. individual standardized rating forms), (b) timing (midterm vs. end of course), and (c) amount of instructor reaction to student ratings (restricted vs. extended). Consistent with predictions drawn from reactance and social comparison theories, students were more satisfied with interview methods at midterm followed by extended instructor reaction than with traditional approaches for collecting student opinions about instruction (i.e., standardized rating forms administered at the end of a course). (PsycINFO Database Record (c) 2010 APA, all rights reserved)

Examined the validity of observer (supervisor, coworker, and customer) ratings and self-ratings of personality measures. Results based on a sample of 105 sales representatives supported the 2 hypotheses tested. First, supervisor, coworker, and customer ratings of the 2 job-relevant personality dimensions (conscientiousness and extraversion) were valid predictors of performance ratings, and the magnitude of the validities was at least as large as for self-ratings. Second, supervisor, coworker, and customer ratings accounted for significant variance in the criterion measure beyond self-ratings alone for the relevant dimensions. Overall, the results suggest that validities of personality measures based on self-assessments alone may underestimate the true validity of personality constructs. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

Previous "educational seduction" research (D. H. Naftulin et al., 1973; J. E. Ware and R. G. Williams, 1975, 1977; Williams and Ware, 1976, 1977) suggests that teacher differences in expressiveness controlled the degree to which lecture content affected student ratings differently from student achievement. The present experiment with 245 university students attempted to replicate statistically this Expressiveness × Content × Measures interaction in a factorial design which investigated 4 simulated classes. The interaction was found for the high-incentive/no-study-opportunity class and the high-incentive/study-opportunity class, which most resembles typical classes, but not for the low-incentive/study-opportunity class or the low-incentive/no-study-opportunity class, which most resembles educational seduction research. In only the high-incentive/no-study-opportunity class did probes of the interaction replicate educational seduction research in which content affected ratings and achievement similarly only for low expressiveness. (18 ref) (PsycINFO Database Record (c) 2010 APA, all rights reserved)
