Similar articles
 20 similar articles found (search time: 31 ms)
1.
2.
The Rasch family of models displays several well-documented properties that distinguish it from the general item response theory (IRT) family of measurement models. This paper describes an additional unique property of Rasch models, referred to as the property of item information constancy. This property asserts that the area under the information function for Rasch models is always equal to the number of response categories minus one, regardless of the values of the item location parameters. The implication of the property of item information constancy is that, for a given number of response categories, all items following a Rasch model contribute equally to the height of the test information function across the entire latent continuum.
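
The constancy property can be checked numerically for the simplest case. The sketch below assumes a dichotomous Rasch item (two response categories, so the expected area is one), whose information function is P(1 - P). The function names, integration limits, and step count are illustrative choices, not taken from the paper, and the polytomous case is not covered here.

```python
import math

def rasch_probability(theta, difficulty):
    """Probability of a correct response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

def item_information(theta, difficulty):
    """Item information of a dichotomous Rasch item: P * (1 - P)."""
    p = rasch_probability(theta, difficulty)
    return p * (1.0 - p)

def information_area(difficulty, lower=-30.0, upper=30.0, steps=20000):
    """Trapezoidal approximation of the area under the information function.

    The limits and step count are assumptions chosen so that the
    truncated tails are negligible for moderate item locations.
    """
    width = (upper - lower) / steps
    total = 0.5 * (item_information(lower, difficulty)
                   + item_information(upper, difficulty))
    for k in range(1, steps):
        total += item_information(lower + k * width, difficulty)
    return total * width

# Areas for three hypothetical item locations; each should be close to 1.
areas = {b: information_area(b) for b in (-2.0, 0.0, 1.5)}
```

Because the derivative of P with respect to theta equals P(1 - P), the integral is simply the total rise of P from 0 to 1, which is why the item location does not matter.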

3.
A new class of parametric IRT models for dichotomous item scores (total citations: 2; self-citations: 0; citations by others: 2)
A new class of parametric IRT models for dichotomously scored items is presented. The new class of models is a subclass of both the class of models defined by the four-parameter logistic item response function and the nonparametric Double Monotonicity (DM) model. Three special cases of this new class of models are discussed. One of these special cases is shown to be the one-parameter logistic Rasch model. Both specific objectivity at the interval level of measurement and the sufficiency of the total score for the latent trait are shown to be measurement properties of the whole new class of models. For maximum likelihood estimation of the model parameters, both a joint and a conditional likelihood function are proposed.
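
For orientation, the four-parameter logistic item response function and its Rasch special case can be sketched as follows. The parameter names a, b, c, and d (slope, location, lower asymptote, upper asymptote) follow a common convention and are assumptions here. The sketch shows only the enclosing 4PL family, not the specific new subclass the abstract proposes, whose restrictions are not given.

```python
import math

def four_pl(theta, a=1.0, b=0.0, c=0.0, d=1.0):
    """Four-parameter logistic item response function.

    a: slope, b: location, c: lower asymptote, d: upper asymptote.
    """
    return c + (d - c) / (1.0 + math.exp(-a * (theta - b)))

def rasch(theta, b):
    """One-parameter logistic (Rasch) item response function."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# With a = 1, c = 0, d = 1 the 4PL function reduces to the Rasch function.
difference = four_pl(0.7, b=0.2) - rasch(0.7, 0.2)
```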

4.
Local item dependence (LID) can emerge when the test items are nested within common stimuli or item groups. This study proposes a three-level hierarchical generalized linear model (HGLM) to model LID when LID is due to such contextual effects. The proposed three-level HGLM was examined by analyzing simulated data sets and was compared with the Rasch-equivalent two-level HGLM that ignores such a nested structure of test items. The results demonstrated that the proposed model could capture LID and estimate its magnitude. Also, the two-level HGLM resulted in larger mean absolute differences between the true and the estimated item difficulties than those from the proposed three-level HGLM. Furthermore, it was demonstrated that the proposed three-level HGLM estimated the ability distribution variance unaffected by the LID magnitude, while the two-level HGLM with no LID consideration increasingly underestimated the ability variance as the LID magnitude increased.

5.
In this paper, we present a way to extend the Hierarchical Generalized Linear Model (HGLM; Kamata, 2001; Raudenbush, 1995) to include the many forms of measurement models available under the formulation known as the Multidimensional Random Coefficients Multinomial Logit (MRCML) Model (Adams, Wilson and Wang, 1997), and apply that to growth modeling. First, we review two different traditions in modeling growth studies: the first is based in the hierarchical linear modeling (HLM) tradition, and the second, which is the topic of this paper, is rooted in the Rasch measurement tradition; this is the linear Latent Growth Item Response Model (LG-IRM). Going beyond the linear case, the LG-IRM approach allows us to considerably extend the range of models available in the HLM tradition to incorporate several of the extensions of IRT models that are used in creating explanatory item response models (EIRM; De Boeck and Wilson, 2004). We next present a number of extensions, including polynomial growth modeling, differential item functioning (DIF) effects, growth functions that can be approximated by polynomial expressions, provision for polytomous responses, person and item covariates (and time-varying covariates), and multiple dimensions of growth. We provide two empirical examples to illustrate several of the models, using the ConQuest software (Wu, Adams, Wilson and Haldane, 2008) to carry out the analyses. We also provide several simulations to investigate the success of the estimation procedures.

6.
This paper describes the development and validation of a democratic learning style scale intended to fill a gap in Sternberg's theory of mental self-government and the associated learning style inventory (Sternberg, 1988, 1997). The scale was constructed as an 8-item scale with a 7-category response scale. The scale was developed following an adapted version of DeVellis' (2003) guidelines for scale development. The validity of the Democratic Learning Style Scale was assessed by item analysis using graphical loglinear Rasch models (Kreiner and Christensen, 2002, 2004, 2006). The item analysis confirmed that the full 8-item revised Democratic Learning Style Scale fitted a graphical loglinear Rasch model with no differential item functioning but weak to moderate uniform local dependence between two items. In addition, a reduced 6-item version of the scale fitted the pure Rasch model with a rating scale parameterization. The revised Democratic Learning Style Scale can therefore be regarded as a sound measurement scale meeting requirements of both construct validity and objectivity.

7.
This research describes some of the similarities and differences between additive conjoint measurement (a type of fundamental measurement) and the Rasch model. There are many similarities between the two frameworks; however, their differences are nontrivial. For instance, while conjoint measurement specifies measurement scales using a data-free, non-numerical axiomatic frame of reference, the Rasch model specifies measurement scales using a numerical frame of reference that is, by definition, data dependent. In order to circumvent difficulties that can be realistically imposed by this data dependence, this research formalizes new non-parametric item response models. These models are probabilistic measurement theory models in the sense that they explicitly integrate the axiomatic ideas of measurement theory with the statistical ideas of order-restricted inference and Markov Chain Monte Carlo. The specifications of these models are rather flexible, as they can represent any one of several models used in psychometrics, such as Mokken's (1971) monotone homogeneity model, Scheiblechner's (1995) isotonic ordinal probabilistic model, or the Rasch (1960) model. The proposed non-parametric item response models are applied to analyze both real and simulated data sets.

8.
9.
In test analysis involving the Rasch model, a large degree of importance is placed on the "objective" measurement of individual abilities and item difficulties. The degree to which the objectivity properties are attained, of course, depends on the degree to which the data fit the Rasch model. It is therefore important to utilize fit statistics that accurately and reliably detect the person-item response inconsistencies that threaten the measurement objectivity of persons and items. Given this argument, it is somewhat surprising that far more emphasis is placed on the objective measurement of persons and items than on the measurement quality of Rasch fit statistics. This paper provides a critical analysis of the residual fit statistics of the Rasch model, arguably the most often used fit statistics, in an effort to illustrate that the task of Rasch fit analysis is not as simple and straightforward as it appears to be. The faulty statistical properties of the residual fit statistics do not allow either a convenient or a straightforward approach to Rasch fit analysis. For instance, given a residual fit statistic, the use of a single minimum critical value for misfit diagnosis across different testing situations, where the situations vary in sample and test properties, leads to both the overdetection and underdetection of misfit. To improve this situation, it is argued that psychometricians need to implement residual-free Rasch fit statistics that are based on the number of Guttman response errors, or use indices that are statistically optimal in detecting measurement disturbances.
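
As a rough illustration of the raw quantity behind residual-free statistics, the sketch below counts Guttman response errors in one person's dichotomous response vector. It assumes the common pairwise definition (an easier item answered incorrectly while a harder item is answered correctly). The function name and example data are hypothetical, and the exact statistic advocated in the paper is not specified in the abstract.

```python
def guttman_errors(responses, difficulties):
    """Count Guttman errors in a dichotomous (0/1) response vector.

    An error is a pair of items in which the easier item is failed (0)
    and the harder item is passed (1). Items are first ordered from
    easiest to hardest using the supplied difficulties.
    """
    order = sorted(range(len(responses)), key=lambda i: difficulties[i])
    errors = 0
    failures_seen = 0
    for i in order:
        if responses[i] == 0:
            failures_seen += 1
        else:
            # Every earlier (easier) failure forms an error pair with this pass.
            errors += failures_seen
    return errors

# Hypothetical item difficulties, listed from easiest to hardest.
difficulties = [-1.0, 0.0, 1.0, 2.0]
perfect_pattern = guttman_errors([1, 1, 0, 0], difficulties)
reversed_pattern = guttman_errors([0, 0, 1, 1], difficulties)
```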

10.
The link between the hierarchical generalized linear model (HGLM) and the Rasch model's parameterization has already been demonstrated by several researchers. Extensions have been described that include higher clustering levels to model more appropriately the contextual effects that are frequently encountered in educational research. However, pure hierarchies are relatively rare and instead cross-classified data structures are more frequently encountered. Cross-classified random effect modeling (CCREM) is still not commonly used. Use of CCREM in combination with the multilevel measurement model (MMM) has been recently introduced and is described further in the current study. Specifically, the link between the MMM and the CCREM MMM (termed "CCMMM" model) is provided. A dataset was simulated to demonstrate interpretation of the CCMMM model's parameters and to compare results under a CCMMM versus HGLM analysis. An Appendix is provided to demonstrate SAS GLIMMIX code used to estimate HGLM and CCMMM models' parameters.

11.
Using data from the PISA 2006 field trial, Rasch item response models are used to demonstrate that extreme response tendency was exhibited differentially across culturally distinct countries when answering Likert-type attitude items. A single attitude scale is examined across eight culturally distinct countries in this paper. Two avenues to ameliorate this tendency are explored: first, using dichotomous variants of the items, and second, incorporating the country-specific response tendency into the Rasch item response model. Analysis of the item variants reveals similar scale outcomes and correlations with achievement but preference for the Likert variant when test information is considered. A hierarchical analysis using facet models reveals that the data fit significantly better in a model that incorporates an interaction effect between the country and the item delta parameters. The implications for reporting attitudes measured with Likert items across cultures are outlined.

12.
Ueno, Maomi; Fuchimoto, Kazuma; Tsutsumi, Emiko. Behaviormetrika, 2021, 48(2): 409-424

This paper presents a review of advanced technologies for e-testing using an artificial intelligence approach. First, this paper introduces state-of-the-art uniform test assembly methods to guarantee examinee test score equivalence even if different examinees with the same ability take different tests. More formally, each uniform test form has equivalent measurement accuracy but with a different set of items. To increase the number of assembled tests, some test assembly methods allow any two of the uniform tests to share items, provided the number of common items stays below a limit that the user sets as a test constraint. This situation is designated as the overlapping condition. However, methods used with an overlapping condition are often adversely affected by biased item exposure frequency and by decreased reliability of items and tests. Second, this paper introduces state-of-the-art uniform test form assembly with a constraint of item exposure. Most earlier studies of e-testing employ item response theory (IRT) to obtain each examinee's test score. However, IRT has several strict assumptions. Recently, Deep-IRT, which employs deep learning to relax the assumptions, has attracted attention. Finally, this paper introduces Deep-IRT models.

13.
14.
Despite its 55-year presence in the field of mathematical psychology, the theory of unidimensional unfolding remains an enigma for many psychometricians and applied practitioners. This paper, the first of a three-part series, aims to introduce unidimensional unfolding theory. The paper begins with a simple hypothetical example presenting an idealised distinction between responses to cumulative and unfolding dichotomous items. This is followed by an accessible presentation of the theory of unidimensional unfolding as first articulated by Clyde H. Coombs (1950, 1964). The concept of the single-peaked preference function (Coombs and Avrunin, 1977), which underpins unfolding theory, is then presented. The article then progresses to the class of Rasch (1960) based IRT models developed by Andrich (1995) and Luo (2001). These models are shown to propose arguments not inconsistent with Coombs's (1964) original theory. The presumption of additive structure in psychological attributes is concluded to be the key weakness of the theories of unidimensional unfolding discussed.
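
The distinction between cumulative and unfolding items can be made concrete with two response functions. The sketch below assumes a Rasch-type logistic curve for the cumulative item and a hyperbolic-cosine form for the unfolding item. The latter is one single-peaked form chosen for illustration, and whether it matches the exact models discussed in the paper is an assumption. All names and the evaluation grid are hypothetical.

```python
import math

def cumulative_probability(theta, delta):
    """Cumulative (Rasch-type) item: endorsement rises monotonically with theta."""
    return 1.0 / (1.0 + math.exp(-(theta - delta)))

def unfolding_probability(theta, delta, gamma=1.0):
    """Single-peaked item: endorsement is highest when theta equals delta.

    Uses a hyperbolic-cosine form; gamma (an illustrative unit parameter)
    controls the height and width of the peak.
    """
    return math.exp(gamma) / (math.exp(gamma) + 2.0 * math.cosh(theta - delta))

# Evaluate both curves on a symmetric grid around the item location 0.0.
grid = [-3.0, -1.5, 0.0, 1.5, 3.0]
cumulative_curve = [cumulative_probability(t, 0.0) for t in grid]
unfolding_curve = [unfolding_probability(t, 0.0) for t in grid]
```

On this grid the cumulative curve only increases, while the unfolding curve rises to its maximum at the item location and falls off symmetrically on either side.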

15.
16.
Local independence in the Rasch model can be violated in two generic ways that are generally not distinguished clearly in the literature. In this paper we distinguish between a violation of unidimensionality, which we call trait dependence, and a specific violation of statistical independence, which we call response dependence, both of which violate local independence. Distinct algebraic formulations for trait and response dependence are developed as violations of the dichotomous Rasch model, data are simulated with varying degrees of dependence according to these formulations, and then analysed according to the Rasch model assuming no violations. Relative to the case of no violation, it is shown that trait and response dependence result in opposite effects on the unit of scale as manifested in the range and standard deviation of the scale and the standard deviation of person locations. In the case of trait dependence the scale is reduced; in the case of response dependence it is increased. Again, relative to the case of no violation, the two violations also have opposite effects on the person separation index (analogous to Cronbach's alpha reliability index of traditional test theory in value and construction): it decreases for data with trait dependence; it increases for data with response dependence. A standard way of accounting for dependence is to combine the dependent items into a higher-order polytomous item. This typically results in a decreased person separation index and Cronbach's alpha, compared with analysing items as discrete, independent items. This occurs irrespective of the kind of dependence in the data, and so further contributes to the two violations not being distinguished clearly. In an attempt to begin to distinguish between them statistically, this paper articulates the opposite effects of these two violations in the dichotomous Rasch model.

17.
The invariance of the estimated parameters across variation in the incidental parameters of a sample is one of the most important properties of Rasch measurement models. This is the property that allows the equating of test forms and the use of computer adaptive testing. It necessarily follows that in Rasch models, if the data fit the model, then the estimation of the parameter of interest must be invariant across sub-samples of the items or persons. This study investigates the degree to which the INFIT and OUTFIT item fit statistics in WINSTEPS detect violations of the invariance property of Rasch measurement models. The test in this study is an 80-item multiple-choice test used to assess mathematics competency. The WINSTEPS analysis of the dichotomous results, based on a sample of 2,000 from a very large number of students who took the exam, indicated that only 7 of the 80 items misfit using the 1.3 mean square criterion advocated by Linacre and Wright. Subsequent calibration of separate samples of 1,000 students from the upper and lower third of the person raw score distribution, followed by a t-test comparison of the item calibrations, indicated that the item difficulties for 60 of the 80 items were more than 2 standard errors apart. The separate calibration t-values ranged from +21.00 to -7.00, with the t-test value of 41 of the 80 comparisons either larger than +5 or smaller than -5. Clearly these data do not exhibit the invariance of the item parameters expected if the data fit the model. Yet the INFIT and OUTFIT mean squares are completely insensitive to the lack of invariance in the item parameters. If the OUTFIT ZSTD from WINSTEPS were used with a critical value of |t| > 2.0, then 56 of the 60 items identified by the separate calibration t-test would be identified as misfitting. A fourth measure of misfit, the between ability-group item fit statistic, identified 69 items as misfitting when a critical value of t > 2.0 was used. Clearly, relying solely on the INFIT and OUTFIT mean squares in WINSTEPS to assess the fit of the data to the model would cause one to miss one of the most important threats to the usefulness of the measurement model.
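
The two kinds of check contrasted in this abstract can be sketched in a few lines. The code below assumes the standard textbook definitions: OUTFIT as the mean of squared standardized residuals, INFIT as the information-weighted mean square, and the separate-calibration comparison as the difference between two difficulty estimates divided by their pooled standard error. It is not WINSTEPS code, and the function names and example numbers are hypothetical.

```python
import math

def rasch_probability(theta, difficulty):
    """Probability of a correct response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

def fit_mean_squares(responses, thetas, difficulty):
    """Return (outfit, infit) mean squares for one dichotomous item.

    responses: 0/1 scores of each person on the item.
    thetas: the corresponding person ability estimates.
    """
    squared_standardized = 0.0
    squared_residuals = 0.0
    variances = 0.0
    for x, theta in zip(responses, thetas):
        p = rasch_probability(theta, difficulty)
        variance = p * (1.0 - p)
        residual_sq = (x - p) ** 2
        squared_standardized += residual_sq / variance
        squared_residuals += residual_sq
        variances += variance
    outfit = squared_standardized / len(responses)  # unweighted mean square
    infit = squared_residuals / variances           # information-weighted
    return outfit, infit

def separate_calibration_t(d_low, se_low, d_high, se_high):
    """t-statistic comparing one item's difficulty from two separate calibrations."""
    return (d_low - d_high) / math.sqrt(se_low ** 2 + se_high ** 2)

# Hypothetical estimates for one item from lower- and upper-group calibrations.
t_value = separate_calibration_t(1.0, 0.3, 0.0, 0.4)
```

With these made-up numbers the two estimates are 1.0 logits apart with a pooled standard error of 0.5, so the statistic sits exactly at the 2-standard-error threshold mentioned in the abstract.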

18.
Fox, Jean-Paul; Pimentel, Jonald; Glas, Cees. Behaviormetrika, 2006, 33(1): 27-42

A fixed-effect item response theory (IRT) model is developed for modeling group-specific item parameters. Two applications are presented. In the first, the proposed model is used to detect whether a response mechanism is ignorable using the splitter item technique. The second application is the detection of differential item functioning, in which the fixed-effect item parameters can model item parameter differences between groups. Simulation studies are presented to show the feasibility and performance of the method in both applications.

19.
The current study investigates the performance of two Rasch measurement programs and their parameter estimations on the linear logistic test model (LLTM; Fischer, 1973). These two programs, LinLog (Whitely & Nieh, 1981) and FACETS (Linacre, 2002), are used to investigate within-item complexity factors in a spatial memory measurement tool. LinLog uses conditional maximum likelihood to estimate person and item parameters and is an LLTM-specific program. FACETS is usually reserved for the many-facet Rasch model (MFRM; Linacre, 1989); however, in the case of specifically designed within-item solution processes, a multifaceted approach makes good sense. It is possible to consider each dimension within the item as a separate facet, just as if there were multiple raters for each item. Simulations of 500 and 1000 persons expand the original data set (114 persons) to better examine each estimation technique. LinLog and FACETS analyses show strikingly similar results in both the simulation and original data conditions, indicating that the FACETS program produces accurate LLTM parameter estimates.
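
The core idea of the LLTM, that each item difficulty is a weighted sum of a smaller set of basic parameters for the cognitive operations the item requires, can be sketched as below. The weight matrix, the basic parameter values, and the function name are hypothetical illustrations. They are not taken from the spatial memory instrument in the study, and neither LinLog nor FACETS is involved.

```python
def lltm_difficulties(q_matrix, basic_parameters, constant=0.0):
    """Item difficulties implied by the LLTM decomposition.

    Each difficulty is the sum over operations k of q[i][k] * eta[k],
    plus an optional normalization constant.
    """
    return [
        sum(weight * eta for weight, eta in zip(row, basic_parameters)) + constant
        for row in q_matrix
    ]

# Hypothetical design: 3 items, 3 complexity factors (operations).
# Entry [i][k] is how many times item i involves factor k.
q_matrix = [
    [1, 0, 0],
    [1, 1, 0],
    [1, 1, 2],
]
basic_parameters = [-0.5, 0.8, 0.3]  # hypothetical difficulty of each factor

item_difficulties = lltm_difficulties(q_matrix, basic_parameters)
```

In this made-up design the three items come out at roughly -0.5, 0.3, and 0.9 logits, so an item becomes harder as it involves more of the difficult factors.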

20.
Variable construction requires careful attention to substantive issues: a theory guiding its development, a hierarchy of illustrative items constructed to define the variable, the subsequent production of item difficulties and person measures, and the analysis of fit. Rasch measurement practitioners should give careful attention to these matters, so practical suggestions are given for designing variables based on theory, item construction, and Rasch models for the analysis of data. Variable maps are emphasized to guide variable construction and interpret the results.
