Similar literature
20 similar articles found
1.
Recent legal developments appear to sanction the use of psychometrically unsound procedures for examining differential item functioning (DIF) on standardized tests. More appropriate approaches involve the use of item response theory (IRT). However, many IRT-based DIF studies have used F. M. Lord's (see record 1987-17535-001) joint maximum likelihood procedure, which can lead to incorrect and misleading results. A Monte Carlo simulation was conducted to evaluate the effectiveness of two other methods of parameter estimation: marginal maximum likelihood estimation and Bayes modal estimation. Sample size and data dimensionality were manipulated in the simulation. Results indicated that both estimation methods (a) provided more accurate parameter estimates and less inflated Type I error rates than joint maximum likelihood, (b) were robust to multidimensionality, and (c) produced more accurate parameter estimates and higher rates of identifying DIF with larger samples. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

2.
Drinking behavior in preadolescence is a significant predictor of both short- and long-term negative consequences. This study examined the psychometric properties of 1 known risk factor for drinking in this age group, alcohol expectancies, within an item response theory framework. In a sample of middle school youths (N = 1,273), the authors tested differential item functioning (DIF) in positive and negative alcohol expectancies across grade, gender, and ethnicity. Multiple-indicator multiple-cause model analyses tested differences in alcohol use as a potential explanation for observed DIF across groups. Results showed that most expectancy items did not exhibit DIF. For items where DIF was indicated, differences in alcohol use did not explain differences in item parameters. Positive and negative expectancies also systematically differed in the location parameter. Latent variable scale scores of both positive and negative expectancies were associated with drinking behavior cross-sectionally, while only positive expectancies predicted drinking prospectively. Improving the measurement of alcohol expectancies can help researchers better assess this important risk factor for drinking in this population, particularly the identification of those with either very high positive or very low negative alcohol expectancies. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

3.
In this article, the authors developed a common strategy for identifying differential item functioning (DIF) items that can be implemented in both the mean and covariance structures method (MACS) and item response theory (IRT). They proposed examining the loadings (discrimination) and the intercept (location) parameters simultaneously using the likelihood ratio test with a free-baseline model and Bonferroni-corrected critical p values. They compared the relative efficacy of this approach with alternative implementations for various types and amounts of DIF, sample sizes, numbers of response categories, and amounts of impact (latent mean differences). Results indicated that the proposed strategy was considerably more effective than an alternative approach involving a constrained-baseline model. Both MACS and IRT performed similarly well in the majority of experimental conditions. As expected, MACS performed slightly worse in dichotomous conditions but better than IRT in polytomous cases where sample sizes were small. Also, contrary to popular belief, MACS performed well in conditions where DIF was simulated on item thresholds (item means), and its accuracy was not affected by impact. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

4.
5.
In this article, the authors have 2 aims. First, hierarchical, nonhierarchical, and nonstandard log-linear models are defined. Second, application scenarios are presented for nonhierarchical and nonstandard models, with illustrations of where these scenarios can occur. Parameters can be interpreted in regard to their formal meaning and in regard to their magnitude. The interpretation of the meaning of parameters is the main focus of this article. Design matrices are used to describe the hypotheses tested in models and to illustrate cases in which parameters are interpretable. Also, design matrices are used to show where and how nonstandard models differ from standard hierarchical models. Coding schemes are discussed, in particular, dummy coding and effects coding. Data examples are given with data and models discussed in the literature. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
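The two coding schemes named in this abstract can be illustrated with a minimal sketch. It is not taken from the article: the three-level factor, the choice of reference level, and the function names `dummy_coding` and `effects_coding` are all assumptions made for illustration.

```python
def dummy_coding(levels, reference):
    """Dummy coding: one 0/1 column per non-reference level.

    The reference level is coded as all zeros, so each parameter is a
    contrast between one level and the reference level.
    """
    others = [lv for lv in levels if lv != reference]
    return {lv: [1 if lv == o else 0 for o in others] for lv in levels}


def effects_coding(levels, reference):
    """Effects coding: same columns, but the reference level is all -1.

    Each column then sums to zero over the levels, so each parameter is a
    deviation from the unweighted mean over levels.
    """
    others = [lv for lv in levels if lv != reference]
    return {
        lv: [-1] * len(others) if lv == reference
        else [1 if lv == o else 0 for o in others]
        for lv in levels
    }


# Hypothetical three-level factor; "c" is chosen as the reference level.
levels = ["a", "b", "c"]
dummy = dummy_coding(levels, reference="c")
effects = effects_coding(levels, reference="c")

for lv in levels:
    print(lv, dummy[lv], effects[lv])
```

The two design matrices differ only in the row for the reference level, which is enough to change what each parameter means.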

6.
Statistical methods based on item response theory (IRT) were used to bidirectionally evaluate the measurement equivalence of translated American and German intelligence tests. Items that displayed differential item functioning (DIF) were identified, and content analysis was used to determine probable sources of DIF, either cultural or linguistic. The benefits of using an IRT analysis in examining the fidelity of translated tests are described. In addition, the influence of cultural differences on test translations and the use of DIF items to elucidate cultural differences are discussed. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

7.
8.
Strategies for the selection of log-linear models   Total citations: 1, self-citations: 0, citations by others: 1
For multidimensional contingency tables, strategies have been proposed to build log-linear models using either stepwise methods or standardized estimates of the parameters of the saturated model. Brown (1976) proposed a two-step procedure to screen effects and then test a subset of models. Alternate methods of model building are discussed with respect to the final choice of model and with respect to intermediate information available to the data analyst during the selection process.

9.
Comments on R. W. Motta et al.'s (see record 1994-04005-001) conclusion that the use of human figure drawings (HFDs) in psychological testing is invalid. Motta et al. completely ignore strong positive evidence that narrowly prescribed uses of HFDs are valid for assessing certain aspects of personality and intellectual functioning. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

10.
In this study, an item response theory-based differential functioning of items and tests (DFIT) framework (N. S. Raju, W. J. van der Linden, & P. F. Fleer, 1995) was applied to a Likert-type scale. Several differential item functioning (DIF) analyses compared the item characteristics of a 10-item satisfaction scale for Black and White examinees and for female and male examinees. F. M. Lord's (1980) chi-square and the extended signed area (SA) measures were also used. The results showed that the DFIT indices consistently performed in the expected manner. The results from Lord's chi-square and the SA procedures were somewhat varied across comparisons. A discussion of these results along with an illustration of an item with significant DIF and suggestions for future DIF research are presented. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

11.
The present analyses examined age-related measurement bias in responses to items on the revised Beck Depression Inventory (BDI) in depressed late-life patients versus midlife patients. Item response theory (IRT) models were used to equate the scale and to differentiate true-group differences from bias in measurement in the 2 samples. Baseline BDI data (218 late life and 613 midlife) were used for the present analysis. IRT results indicated that late-life patients tended to report fewer cognitive symptoms, especially at low to average levels of depression. Conversely, they tended to report more somatic symptoms, especially at higher levels of depression. Adjusted cutoff scores in the late-life group are provided, and possible reasons for age-related differences in the performance of the BDI are discussed. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

12.
I. L. Janis and L. Mann (1977) proposed a decisional balance sheet of incentives as a general schema for representing both the cognitive and motivational aspects of human decision making. In the present study, a 24-item paper-and-pencil measure was constructed to study the decision-making process in smoking cessation among 960 current or previous smokers. Two scales, labeled the Pros of Smoking and the Cons of Smoking, were successful in differentiating between 5 groups representing stages of change in the quitting process. The 2 scales were also successful when employed as predictors of smoking status at a 6-month follow-up. The decisional balance scale appears to be a powerful construct of potentially wide application in behavior change. (35 ref) (PsycINFO Database Record (c) 2010 APA, all rights reserved)

13.
14.
Research on emotion regulation has shown cognitive reappraisal to be positively correlated with better psychological functioning. Prior research has failed to account for contextual influences on this important relationship. We examined how this relationship plays out across two United States ethnic groups that represent different contexts of oppression: Puerto Ricans, experiencing distal oppression (societal level) but not proximal oppression (immediate environment), and Latino Americans, experiencing both. We also captured individual beliefs regarding oppression of one's group and implications of that oppression by measuring oppressed minority ideology (OMI). Results confirmed our hypothesis that the relationship between reappraisal and psychological functioning would be moderated by the context of oppression (as measured by ethnic group membership and OMI). For Latino Americans high on OMI, reappraisal was negatively associated with psychological functioning. For Puerto Ricans, regardless of OMI, this relationship remained positive, suggesting a possible benefit for minorities in being surrounded by in-group members. (PsycINFO Database Record (c) 2011 APA, all rights reserved)

15.
84 university counseling center clients (61 women and 23 men) self-reporting childhood physical, sexual, or emotional abuse (n = 30) or no childhood abuse (n = 54) completed 3 measures of psychological functioning. Multivariate analysis of variance revealed that clients reporting abuse were more depressed (with the mean Beck Depression Inventory score in the borderline clinical depression range), had more symptomatology (with the mean Global Severity Index of the Brief Symptom Inventory at about the average level of a psychiatric outpatient population), and scored higher on the Borderline Personality scale of the Millon Clinical Multiaxial Inventory (with the mean base-rate score near the cutoff score for presence of borderline personality features). 19 clients reporting emotional abuse only did not differ on any measure from 11 clients reporting sexual or multiple forms of abuse. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

16.
S. L. Bem's definition of psychological androgyny as the integration of both masculinity and femininity within a single individual obscures a potentially important distinction between those individuals who score high on both masculinity and femininity and those who score low on both. To assess the importance of this distinction, the Bem Sex-Role Inventory was administered to 375 male and 290 female undergraduates, along with a variety of other pencil-and-paper questionnaires, and in addition, the results of Bem's earlier laboratory studies were reanalyzed with the low-low scorers separated out. High-high and low-low scorers did not differ significantly on the Attitudes Toward Women Scale, Rotter's Internal-External Locus of Control Scale, the Mach IV Scale, or the Attitudes Toward Problem-Solving Scale, nor did they differ significantly in 2 of Bem's 3 previous studies. Nevertheless, low-low scorers were significantly lower in self-esteem (Texas Social Behavior Inventory) than high-high scorers, they displayed significantly less responsiveness toward a kitten, and, among men, they reported significantly less self-disclosure (Jourard's Self-Disclosure Scale). Although the results are not consistent, it is concluded that a distinction between high-high and low-low scorers does seem to be warranted. (29 ref) (PsycINFO Database Record (c) 2010 APA, all rights reserved)

17.
Change in adult intellectual performance was assessed with longitudinal data from the Intergenerational Studies at the Institute of Human Development. Wechsler Intelligence data from two age cohorts spanning ages 18 to 61 were analyzed at the subtest and item level. Hotelling's T² analyses of sets of equivalent items from Wechsler subtests were used to determine whether change in response occurred between pairwise combinations of occasions of test administration. We used Bowker's test to analyze data at the item level to determine the direction of change in performance. Consistent improvement in performance occurred between the ages of 18–40 and 18–54. Between the ages of 40 and 61, results showed mostly improved performance on the Information, Comprehension, and Vocabulary subtests, mixed change on the Picture Completion subtest, and decline on the Digit Symbol and Block Design subtests. The pattern of mixed change on the Picture Completion subtest indicated improvement on the easy items and decline on the difficult items. Decline in performance on the Block Design test occurred only for the most difficult items. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
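As a rough illustration of the item-level analysis mentioned in this abstract, the sketch below computes Bowker's symmetry statistic for a square table that cross-classifies responses to one item on two occasions. The counts are invented, the function name `bowker_statistic` is hypothetical, and skipping empty off-diagonal pairs (with a matching reduction in degrees of freedom) is one common convention assumed here, not a detail taken from the study.

```python
def bowker_statistic(table):
    """Return (statistic, degrees of freedom) for Bowker's test of symmetry.

    table[i][j] counts respondents in category i on the first occasion and
    category j on the second. Under symmetry, table[i][j] and table[j][i]
    have equal expected counts, and the statistic is approximately
    chi-square distributed.
    """
    k = len(table)
    statistic = 0.0
    df = 0
    for i in range(k):
        for j in range(i + 1, k):
            pair_total = table[i][j] + table[j][i]
            if pair_total > 0:  # assumed convention: skip empty pairs
                statistic += (table[i][j] - table[j][i]) ** 2 / pair_total
                df += 1
    return statistic, df


# Invented counts: rows are the earlier occasion, columns the later one.
responses = [
    [20, 5, 1],
    [10, 30, 4],
    [3, 8, 19],
]
statistic, df = bowker_statistic(responses)
print(statistic, df)
```

For a 2 x 2 table (for example, pass or fail on each occasion) the statistic reduces to McNemar's test, and the direction of change can be read from which off-diagonal cell is larger.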

18.
The Mattis Dementia Rating Scale (MDRS) is a commonly used cognitive measure designed to assess the course of decline in progressive dementias. However, little information is available about possible systematic racial bias on the items presented in this test. We investigated race as a potential source of test bias and differential item functioning in 40 pairs of African American and Caucasian dementia patients (N = 80), matched on age, education, and gender. Principal component analysis revealed similar patterns and magnitudes across component loadings for each racial group, indicating no clear evidence of test bias on account of race. Results of an item analysis of the MDRS revealed differential item functioning across groups on only 4 of 36 items, which may potentially be dropped to produce a modified MDRS that may be less sensitive to cultural factors. Given the absence of test bias because of race, the observed racial differences on the total MDRS score are most likely associated with group differences in dementia severity. We conclude that the MDRS shows no appreciable evidence of test bias and minimal differential item functioning (item bias) because of race, suggesting that the MDRS may be used in both African American and Caucasian dementia patients to assess dementia severity.

19.
The population-dependent concept of reliability is used in test score models such as classical test theory and the binomial error model, whereas in item response models, the population-independent concept of information is used. Reliability and information apply to both test score and item response models. Information is a conditional definition of precision, that is, the precision for a given subject; reliability is an unconditional definition, that is, the precision for a population of subjects. Information and reliability do not distinguish test score and item response models. The main distinction is that the parameters are specific for the test and the subject in test score models, whereas in item response models, the item parameters are separated from the subject parameters. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
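The idea of information as conditional precision can be made concrete with a small sketch. It assumes a two-parameter logistic (2PL) item response model, which this abstract does not name. The item parameters and the function names are hypothetical.

```python
import math


def p_correct(theta, a, b):
    """2PL probability of a correct response at ability theta.

    a is the item discrimination and b the item location (assumed model).
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of one 2PL item at theta: a^2 * P * (1 - P)."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)


def total_information(theta, items):
    """Information of the whole test at theta: the sum over its items."""
    return sum(item_information(theta, a, b) for a, b in items)


def conditional_se(theta, items):
    """Standard error of the ability estimate for a subject at theta."""
    return 1.0 / math.sqrt(total_information(theta, items))


# Hypothetical (discrimination, location) pairs for a three-item test.
items = [(1.0, -1.0), (1.5, 0.0), (0.8, 1.0)]

for theta in (-2.0, 0.0, 2.0):
    print(theta, total_information(theta, items), conditional_se(theta, items))
```

Because the information depends on theta, the precision is stated separately for each subject. A reliability coefficient would instead average the error variance over a population of subjects.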

20.
This study demonstrated the application of an innovative item response theory (IRT) based approach to evaluating measurement equivalence, comparing a newly developed Spanish version of the Posttraumatic Stress Disorder Checklist-Civilian Version (PCL-C) with the established English version. Basic principles and practical issues faced in the application of IRT methods for instrument evaluation are discussed. Data were derived from a study of the mental health consequences of community violence in both Spanish speakers (n = 102) and English speakers (n = 284). Results of differential item functioning (DIF) analyses revealed that the 2 versions were not fully equivalent on an item-by-item basis in that 6 of the 17 items displayed uniform DIF. No bias was observed, however, at the level of the composite PCL-C scale score, indicating that the 2 language versions can be combined for scale-level analyses. (PsycINFO Database Record (c) 2010 APA, all rights reserved)


Copyright©北京勤云科技发展有限公司  京ICP备09084417号