Similar articles
 Found 20 similar articles; search took 218 ms
1.
2.
Mixed models take the dependency between observations based on the same cluster into account by introducing 1 or more random effects. Common item response theory (IRT) models introduce latent person variables to model the dependence between responses of the same participant. Assuming a distribution for the latent variables, these IRT models are formally equivalent to nonlinear mixed models. It is shown how a variety of IRT models can be formulated as particular instances of nonlinear mixed models. The unifying framework offers the advantage that relations between different IRT models become explicit and that it is rather straightforward to see how existing IRT models can be adapted and extended. The approach is illustrated with a self-report study on anger. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
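
As an illustration of the equivalence described in this abstract, the Rasch model is one standard instance: a logistic model with a fixed effect per item and a normally distributed random intercept per person. The notation below (person p, item i, item difficulty beta_i, latent trait theta_p) is an assumption made for this sketch and is not taken from the article itself.

```latex
\[
  \operatorname{logit} P(Y_{pi} = 1 \mid \theta_p) = \theta_p - \beta_i,
  \qquad \theta_p \sim N(0, \sigma_\theta^2)
\]
```

Here the random effect theta_p plays the role of the latent person variable, and the responses of one person form the cluster.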

3.
The authors discuss the applicability of nonparametric item response theory (IRT) models to the construction and psychometric analysis of personality and psychopathology scales, and they contrast these models with parametric IRT models. They describe the fit of nonparametric IRT to the Depression content scale of the Minnesota Multiphasic Personality Inventory-2 (J. N. Butcher, W. G. Dahlstrom, J. R. Graham, A. Tellegen, & B. Kaemmer, 1989). They also show how nonparametric IRT models can easily be applied and how misleading results from parametric IRT models can be avoided. They recommend the use of nonparametric IRT modeling prior to using parametric logistic models when investigating personality data. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

4.
Person-fit statistics have been proposed to investigate the fit of an item score pattern to an item response theory (IRT) model. The author investigated how these statistics can be used to detect different types of misfit. Intelligence test data were analyzed using person-fit statistics in the context of the G. Rasch (1960) model and R. J. Mokken's (1971, 1997) IRT models. The effect of the choice of an IRT model to detect misfitting item score patterns and the usefulness of person-fit statistics for diagnosis of misfit are discussed. Results showed that different types of person-fit statistics can be used to detect different kinds of person misfit. Parametric person-fit statistics had more power than nonparametric person-fit statistics. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

5.
The goal of this study was to explore similarities and differences in person-fit assessment under item response theory (IRT) and covariance structure analysis (CSA) measurement models. The responses of 3,245 individuals who completed 3 personality scales were analyzed under an IRT model and a CSA model. The authors then computed person-fit statistics for individual examinees under both IRT and CSA models. To be specific, for each examinee, the authors computed a standardized person-fit index for the IRT models, called Zl; in addition, an individual's contribution to chi-square, called IND{chi}, was used as a person-fit indicator for CSA models. Findings indicated that these indices are relatively free of confounds with examinee trait level. However, the relationship between Zl and IND{chi} values was small, suggesting that the indices identify different examinees as not fitting a model. Implications of the results and directions for future inquiry are discussed. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
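
To make the idea of a standardized person-fit index concrete, here is a minimal sketch. It assumes that Zl refers to the standardized log-likelihood statistic often written l_z, and that the items follow a two-parameter logistic model. The item parameters and response patterns are hypothetical and are not taken from the study.

```python
import math

def p_2pl(theta, a, b):
    """Probability of a keyed response under a two-parameter logistic model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def lz_statistic(responses, theta, items):
    """Standardized log-likelihood person-fit statistic.

    responses: list of 0/1 item scores
    theta:     trait value for the person (treated as known in this sketch)
    items:     list of (a, b) pairs; hypothetical item parameters
    """
    log_lik = expected = variance = 0.0
    for u, (a, b) in zip(responses, items):
        p = p_2pl(theta, a, b)
        q = 1.0 - p
        log_lik += u * math.log(p) + (1 - u) * math.log(q)   # observed log-likelihood
        expected += p * math.log(p) + q * math.log(q)        # its expectation under the model
        variance += p * q * math.log(p / q) ** 2             # its variance under the model
    return (log_lik - expected) / math.sqrt(variance)

# Hypothetical five-item scale with equal discriminations.
items = [(1.0, b) for b in (-2.0, -1.0, 0.0, 1.0, 2.0)]
consistent = [1, 1, 1, 0, 0]  # endorses easy items, not hard ones
aberrant = [0, 0, 1, 1, 1]    # endorses hard items, not easy ones

print(lz_statistic(consistent, 0.0, items))
print(lz_statistic(aberrant, 0.0, items))
```

Large negative values flag response patterns that are unlikely under the fitted model. In practice theta is estimated rather than known, which affects the reference distribution of the statistic.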

6.
The Rutgers Alcohol Problem Index (RAPI; H. R. White & E. W. Labouvie, 1989) is a frequently used measure of alcohol-related consequences in adolescents and college students, but psychometric evaluations of the RAPI are limited and it has not been validated with college students. This study used item response theory (IRT) to examine the RAPI on students (N = 895; 65% female, 35% male) assessed in both high school and college. A series of 2-parameter IRT models were computed, examining differential item functioning across gender and time points. A reduced 18-item measure demonstrating strong clinical utility is proposed, with scores of 8 or greater implying greater need for treatment. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

7.
Two-choice response times are a common type of data, and much research has been devoted to the development of process models for such data. However, the practical application of these models is notoriously complicated, and flexible methods are largely nonexistent. We combine a popular model for choice response times, the Wiener diffusion process, with techniques from psychometrics in order to construct a hierarchical diffusion model. Chief among these techniques is the application of random effects, with which we allow for unexplained variability among participants, items, or other experimental units. These techniques lead to a modeling framework that is highly flexible and easy to work with. Among the many novel models this statistical framework provides are a multilevel diffusion model, regression diffusion models, and a large family of explanatory diffusion models. We provide examples and the necessary computer code. (PsycINFO Database Record (c) 2011 APA, all rights reserved)

8.
A framework is presented to model instances of local dependence between items within the context of unidimensional item response theory (IRT). A distinction is made between item main effects and item interactions. Four types of models for interdependent items are considered, on the basis of the distinction between order dependency and combination dependency on the one hand, and dimension-dependent versus constant interaction on the other hand. For each of the 4 model types, variants of the 1-parameter logistic model can be formulated as well as variants of the 2-parameter logistic model. A number of existing IRT models for polytomous items that are variants of the partial credit model may be reconsidered in these terms. Two examples are given to demonstrate the approach. (PsycINFO Database Record (c) 2011 APA, all rights reserved)

9.
Popular methods for fitting unidimensional item response theory (IRT) models to data assume that the latent variable is normally distributed in the population of respondents, but this can be unreasonable for some variables. Ramsay-curve IRT (RC-IRT) was developed to detect and correct for this nonnormality. The primary aims of this article are to introduce RC-IRT less technically than it has been described elsewhere; to evaluate RC-IRT for ordinal data via simulation, including new approaches for model selection; and to illustrate RC-IRT with empirical examples. The empirical examples demonstrate the utility of RC-IRT for real data, and the simulation study indicates that when the latent distribution is skewed, RC-IRT results can be more accurate than those based on the normal model. Along with a plot of candidate curves, the Hannan-Quinn criterion is recommended for model selection. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

10.
Investigated the utility of confirmatory factor analysis (CFA) and item response theory (IRT) models for testing the comparability of psychological measurements. Both procedures were used to investigate whether mood ratings collected in Minnesota and China were comparable. Several issues were addressed. The 1st issue was that of establishing a common measurement scale across groups, which involves full or partial measurement invariance of trait indicators. It is shown that using CFA or IRT models, test items that function differentially as trait indicators across groups need not interfere with comparing examinees on the same trait dimension. Second, the issue of model fit was addressed. It is proposed that person-fit statistics be used to judge the practical fit of IRT models. Finally, topics for future research are suggested. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

11.
Despite advances in psychometric theory and analytic techniques, a number of issues regarding the assessment of depression remain unresolved, including the relative effectiveness of response options (option effectiveness), the ability of existing measures to detect differences in depressive severity (scale discriminability), and the extent to which certain groups of individuals use items and options differently (differential item functioning). One part of the article introduces the fundamentals of nonparametric item response models; the 2nd part of the article illustrates how item response models can be applied to address specific psychometric issues. Although the article focuses on the assessment of depression, the problems addressed in this article are present in virtually every field of psychological research, and the techniques offered may be applied broadly. Analytic techniques based on item response models are not only helpful in identifying and ultimately resolving many of these issues, they are essential to ensure that traits, abilities, and conditions, such as depression, are assessed fairly and equitably. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

12.
The authors use multiple-sample longitudinal data from different test batteries to examine propositions about changes in constructs over the life span. The data come from 3 classic studies on intellectual abilities in which, in combination, 441 persons were repeatedly measured as many as 16 times over 70 years. They measured cognitive constructs of vocabulary and memory using 8 age-appropriate intelligence test batteries and explored possible linkage of these scales using item response theory (IRT). They simultaneously estimated the parameters of both IRT and latent curve models based on a joint model likelihood approach (i.e., NLMIXED and WINBUGS). They included group differences in the model to examine potential interindividual differences in levels and change. The resulting longitudinal invariant Rasch test analyses lead to a few new methodological suggestions for dealing with repeated constructs based on changing measurements in developmental studies. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

13.
Ambiguous response formats predict correlations from -.467 to -1 between opposite items, depending on whether the respondent's interpretation of the format is unipolar or bipolar. The authors present a procedure to investigate the proper interpretation in each case. It consists of applying nonparametric and parametric item response theory models (the Mokken and the graded response models) to pairs of opposite items in order to find the locations of the response options along the latent scale and, therefore, identify the response format construction. The authors tested this procedure on 4 samples (Ns=142-1,150) and 2 item pairs ("relaxed"-"tense" and "optimistic"-"pessimistic"). The results revealed that respondents constructed the formats as bipolar and supported the bipolarity of the item pairs. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

14.
In this article, the authors developed a common strategy for identifying differential item functioning (DIF) items that can be implemented in both the mean and covariance structures method (MACS) and item response theory (IRT). They proposed examining the loadings (discrimination) and the intercept (location) parameters simultaneously using the likelihood ratio test with a free-baseline model and Bonferroni corrected critical p values. They compared the relative efficacy of this approach with alternative implementations for various types and amounts of DIF, sample sizes, numbers of response categories, and amounts of impact (latent mean differences). Results indicated that the proposed strategy was considerably more effective than an alternative approach involving a constrained-baseline model. Both MACS and IRT performed similarly well in the majority of experimental conditions. As expected, MACS performed slightly worse in dichotomous conditions but better than IRT in polytomous cases where sample sizes were small. Also, contrary to popular belief, MACS performed well in conditions where DIF was simulated on item thresholds (item means), and its accuracy was not affected by impact. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
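
The testing step described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes one loading and one intercept are constrained per studied item (2 degrees of freedom, as for a dichotomous item), it takes the number of Bonferroni comparisons to be the number of items tested, and all log-likelihood values are hypothetical. With 2 degrees of freedom the chi-square upper tail has the closed form exp(-x/2), so no statistics library is needed.

```python
import math

def lr_test_df2(loglik_free, loglik_constrained):
    """Likelihood ratio test of a constrained model against a free baseline.

    Assumes the constrained model fixes exactly 2 parameters (one loading,
    one intercept), so the statistic is referred to a chi-square with 2 df.
    """
    g2 = max(0.0, -2.0 * (loglik_constrained - loglik_free))
    p_value = math.exp(-g2 / 2.0)  # chi-square upper tail for df = 2
    return g2, p_value

def flag_dif(loglik_free, logliks_constrained, alpha=0.05):
    """Flag items whose constraint significantly worsens fit.

    logliks_constrained maps each item name to the log-likelihood of the
    model in which only that item's parameters are constrained equal
    across groups. The critical p value is Bonferroni corrected.
    """
    critical_p = alpha / len(logliks_constrained)
    return {
        item: lr_test_df2(loglik_free, ll)[1] < critical_p
        for item, ll in logliks_constrained.items()
    }

# Hypothetical log-likelihoods for a free baseline and three constrained models.
loglik_free = -5000.0
logliks_constrained = {"item1": -5000.8, "item2": -5009.5, "item3": -5002.0}
flags = flag_dif(loglik_free, logliks_constrained)
print(flags)
```

For polytomous items more parameters are constrained per item, so the degrees of freedom and the tail computation would differ.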

15.
The population-dependent concept of reliability is used in test score models such as classical test theory and the binomial error model, whereas in item response models, the population-independent concept of information is used. Reliability and information apply to both test score and item response models. Information is a conditional definition of precision, that is, the precision for a given subject; reliability is an unconditional definition, that is, the precision for a population of subjects. Information and reliability do not distinguish test score and item response models. The main distinction is that the parameters are specific for the test and the subject in test score models, whereas in item response models, the item parameters are separated from the subject parameters. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

16.
The purpose of this article is to illustrate the power of item response theory (IRT) for the item analysis of measurement instruments in psychology. Through illustration, we show that IRT latent variable models fit data from a wide variety of sources and that interpretation of the features of these fitted models leads to interesting insights into the psychology underlying the data. The illustrations involve personality and attitude measurement as well as the evaluation of cognitive proficiency. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

17.
A conventional way to analyze item responses in multiple tests is to apply unidimensional item response models separately, one test at a time. This unidimensional approach, which ignores the correlations between latent traits, yields imprecise measures when tests are short. To resolve this problem, one can use multidimensional item response models that use correlations between latent traits to improve measurement precision of individual latent traits. The improvements are demonstrated using 2 empirical examples. It appears that the multidimensional approach improves measurement precision substantially, especially when tests are short and the number of tests is large. To achieve the same measurement precision, the multidimensional approach needs less than half of the comparable items required for the unidimensional approach. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

18.
The authors describe the initial development of the Wagner Assessment Test (WAT), an instrument designed to assess critical thinking, using the 5-faceted view popularized by the Watson-Glaser Critical Thinking Appraisal (WGCTA; G. B. Watson & E. M. Glaser, 1980). The WAT was designed to reduce the degree of successful guessing relative to the WGCTA by increasing the number of response alternatives (i.e., 80% of WGCTA items are 2-alternative, multiple-choice), a change that was hypothesized to result in more desirable test information and standard-error functions. Analyses using the 3-parameter logistic item response theory (IRT) model in a sample of undergraduates (N = 407) supported this prediction, even when the WAT item pool was shortened to match the length of the WGCTA. Convergent validity between full-pool IRT score estimates was r = .69. Implications for subsequent research on IRT-based measurement of critical thinking are discussed. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
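
The reasoning that more response alternatives yield more information can be illustrated with the 3-parameter logistic model. The sketch below assumes that the lower asymptote c equals one divided by the number of alternatives (pure random guessing). The abstract does not state how many alternatives the WAT items have, so the 5-alternative case and the values of a and b are hypothetical.

```python
import math

def p_3pl(theta, a, b, c):
    """Probability of a correct response under the 3-parameter logistic model."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def info_3pl(theta, a, b, c):
    """Item information under the 3-parameter logistic model."""
    p = p_3pl(theta, a, b, c)
    return a * a * ((p - c) / (1.0 - c)) ** 2 * (1.0 - p) / p

# Same hypothetical discrimination and difficulty, different guessing levels.
a, b = 1.0, 0.0
two_alternatives = info_3pl(0.0, a, b, c=1.0 / 2.0)   # c = .50
five_alternatives = info_3pl(0.0, a, b, c=1.0 / 5.0)  # c = .20 (hypothetical)
print(two_alternatives, five_alternatives)
```

With everything else held equal, the item with the lower guessing parameter carries more information at the same trait level, which translates into a smaller standard error.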

19.
Exemplar-similarity models such as the exemplar-based random walk (EBRW) model (Nosofsky & Palmeri, 1997b) were designed to provide a formal account of multidimensional classification choice probabilities and response times (RTs). At the same time, a recurring theme has been to use exemplar models to account for old–new item recognition and to explain relations between classification and recognition. However, a major gap in research is that the models have not been tested on their ability to provide a theoretical account of RTs and other aspects of performance in the classic Sternberg (1966) short-term memory-scanning paradigm, perhaps the most venerable of all recognition-RT tasks. The present research fills that gap by demonstrating that the EBRW model accounts in natural fashion for a wide variety of phenomena involving diverse forms of short-term memory scanning. The upshot is that similar cognitive operating principles may underlie the domains of multidimensional classification and short-term old–new recognition. (PsycINFO Database Record (c) 2011 APA, all rights reserved)

20.
The Psychopathy Checklist-Revised (PCL-R) is an important measure in both applied and research settings. Evidence for its validity is mostly derived from male Caucasian participants. PCL-R ratings of 359 Caucasian and 356 African American participants were compared using confirmatory factor analysis (CFA) and item response theory (IRT) analyses. Previous research has indicated that 13 items of the PCL-R can be described by a 3-factor hierarchical model. This model was replicated in this sample. No cross-group difference in factor structure could be found using CFA; the structure of psychopathy is the same in both groups. IRT methods indicated significant but small differences in the performance of 5 of the 20 PCL-R items. No significant differential test functioning was found, indicating that the item differences canceled each other out. It is concluded that the PCL-R can be used, in an unbiased way, with African American participants. (PsycINFO Database Record (c) 2010 APA, all rights reserved)


Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.), 京ICP备09084417号