Similar literature
 20 similar articles found
1.
A measure of the tendency to mismanage money was developed in an evaluation of a representative payee program for individuals with serious mental illnesses. A conceptual model was composed to guide item development, and items were tested, revised, added, and rejected in three waves of data collection. Rasch analyses were used to examine measurement properties. The resulting Money Mismanagement Measure (M3) consisted of 28 items with a Rasch person reliability of .72. Restriction of range was likely responsible for the low Rasch reliability. Validity analyses supported the construct validity of the M3. Subsequently, a cross-validation study was conducted on an untreated sample not as susceptible to range restriction. In that sample the M3 produced a Rasch person reliability of .85 with good validity. The M3 fills a gap that can facilitate research in the understudied area of money mismanagement.

2.
The increasing survival rate of infants with complicated birth and perinatal histories has created the need for a test of functional motor performance that can identify children under four months of age whose delayed development could be addressed with physical therapy. This paper describes a Rasch analysis of the psychometric qualities of the Test of Infant Motor Performance (TIMP) for the purpose of reducing the length of the test while maintaining its precision as a measurement device. Following analysis of fit statistics, item-to-total correlations, redundancy of item difficulty measures, and consideration of clinically relevant features of test items from analysis of 1732 tests, the TIMP was reduced from 59 to 42 items forming a functional motor scale for prematurely born infants. The resulting person separation index was 4.85 and the item separation index was 23.79.
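
As a reading aid for the separation indices quoted in this abstract, the sketch below converts a separation index G into the corresponding reliability using the commonly cited relation R = G^2 / (1 + G^2). The abstract does not state this relation, so it is an assumption here, and the function names are hypothetical.

```python
import math


def separation_to_reliability(separation):
    """Convert a separation index G into a reliability, R = G^2 / (1 + G^2).

    Assumed relation; it is not given in the abstract itself.
    """
    return separation ** 2 / (1.0 + separation ** 2)


def reliability_to_separation(reliability):
    """Inverse conversion, G = sqrt(R / (1 - R)), defined for 0 <= R < 1."""
    return math.sqrt(reliability / (1.0 - reliability))


if __name__ == "__main__":
    # Separation indices reported for the shortened TIMP.
    print(round(separation_to_reliability(4.85), 3))   # person separation
    print(round(separation_to_reliability(23.79), 3))  # item separation
```

Under this assumed relation, a person separation of 4.85 corresponds to a reliability of roughly 0.96.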

3.
The Rasch measurement model using dichotomous scoring of item response data from a newly created Mobility Scale administered to elderly independent living individuals is presented. The dichotomous scoring model, item calibration, person calibration, logit scale, normative scale score, reliability, and validity are explained. Results indicated that additional activity statements need to be written and tested to improve the Mobility Scale instrument.
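
The dichotomous scoring model and logit scale mentioned in this abstract can be illustrated with a minimal sketch of the standard dichotomous Rasch model, in which the probability of a positive response depends only on the difference between the person measure and the item difficulty, both in logits. The function names are hypothetical and nothing here is taken from the Mobility Scale data.

```python
import math


def rasch_probability(theta, difficulty):
    """P(X = 1) under the dichotomous Rasch model; arguments are in logits."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def log_odds(probability):
    """Map a probability back to the logit scale."""
    return math.log(probability / (1.0 - probability))


if __name__ == "__main__":
    # A person located exactly at the item difficulty has even odds.
    print(rasch_probability(0.0, 0.0))
    # The log-odds of success recover the person-item difference.
    print(log_odds(rasch_probability(1.5, 0.5)))
```

The second call shows why the scale is called a logit scale: the log-odds of a positive response equal the person measure minus the item difficulty.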

4.
The current study investigates the performance of two Rasch measurement programs and their parameter estimations on the linear logistic test model (LLTM; Fischer, 1973). These two programs, LinLog (Whitely & Nieh, 1981) and FACETS (Linacre, 2002), are used to investigate within-item complexity factors in a spatial memory measure tool. LinLog uses conditional maximum likelihood to estimate person and item parameters and is an LLTM-specific program. FACETS is usually reserved for the many-facet Rasch model (MFRM; Linacre, 1989); however, in the case of specifically designed within-item solution processes, a multifaceted approach makes good sense. It is possible to consider each dimension within the item as a separate facet, just as if there were multiple raters for each item. Simulations of 500 and 1000 persons expand the original data set (114 persons) to better examine each estimation technique. LinLog and FACETS analyses show strikingly similar results in both the simulation and original data conditions, indicating that the FACETS program produces accurate LLTM parameter estimates.
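
To make the LLTM idea in this abstract concrete, the sketch below composes each item difficulty as a weighted sum of basic (complexity-factor) parameters, which is the defining decomposition of the LLTM. The weight matrix, the parameter values, and the function name are hypothetical; they are not taken from the study, and neither LinLog nor FACETS is being reproduced here.

```python
def lltm_item_difficulties(q_matrix, basic_parameters, constant=0.0):
    """Item difficulty b_i = sum_k q_ik * eta_k + c under the LLTM.

    q_matrix[i][k] is the weight of complexity factor k in item i;
    basic_parameters[k] is the difficulty contribution eta_k of factor k.
    """
    return [
        sum(weight * eta for weight, eta in zip(row, basic_parameters)) + constant
        for row in q_matrix
    ]


if __name__ == "__main__":
    # Hypothetical design: three items, two within-item complexity factors.
    q_matrix = [
        [1, 0],
        [1, 1],
        [2, 1],
    ]
    basic_parameters = [0.5, -0.2]  # illustrative values in logits
    print(lltm_item_difficulties(q_matrix, basic_parameters))
```

Treating each factor as a facet, as the abstract describes for FACETS, amounts to estimating the same additive contributions under a different program.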

5.
The aim is to show that it is possible to parameterize discrimination for sets of items, rather than individual items, without destroying conditions for sufficiency in a form of the Rasch model. The form of the model is obtained by formalizing the relationship between discrimination and the unit of a metric. The raw score vector across item sets is the sufficient statistic for the person parameter. Simulation studies are used to show the implementation of conditional estimation solution equations based on the relevant form of the Rasch model. The model is also applied to two numeracy tests attempted by a group of common persons in a large-scale testing program. The results show improved fit compared with the Rasch model in its standard form. They also show that the units of the scales were more accurately equated. The paper discusses implications for applied measurement using Rasch models and contrasts the approach with the application of the two-parameter logistic (2PL) model.

6.
The invariance of the estimated parameters across variation in the incidental parameters of a sample is one of the most important properties of Rasch measurement models. This is the property that allows the equating of test forms and the use of computer adaptive testing. It necessarily follows that in Rasch models, if the data fit the model, then the estimation of the parameter of interest must be invariant across sub-samples of the items or persons. This study investigates the degree to which the INFIT and OUTFIT item fit statistics in WINSTEPS detect violations of the invariance property of Rasch measurement models. The test in this study is an 80-item multiple-choice test used to assess mathematics competency. The WINSTEPS analysis of the dichotomous results, based on a sample of 2000 from a very large number of students who took the exam, indicated that only 7 of the 80 items misfit using the 1.3 mean square criterion advocated by Linacre and Wright. Subsequent calibration of separate samples of 1,000 students from the upper and lower third of the person raw score distribution, followed by a t-test comparison of the item calibrations, indicated that the item difficulties for 60 of the 80 items were more than 2 standard errors apart. The separate calibration t-values ranged from +21.00 to -7.00, with the t-test value of 41 of the 80 comparisons either larger than +5 or smaller than -5. Clearly these data do not exhibit the invariance of the item parameters expected if the data fit the model. Yet the INFIT and OUTFIT mean squares are completely insensitive to the lack of invariance in the item parameters. If the OUTFIT ZSTD from WINSTEPS were used with a critical value of | t | > 2.0, then 56 of the 60 items identified by the separate calibration t-test would be identified as misfitting. A fourth measure of misfit, the between ability-group item fit statistic, identified 69 items as misfitting when a critical value of t > 2.0 was used. Clearly, relying solely on the INFIT and OUTFIT mean squares in WINSTEPS to assess the fit of the data to the model would cause one to miss one of the most important threats to the usefulness of the measurement model.
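
The separate-calibration comparison described in this abstract can be sketched as follows. The abstract does not give the formula, so this assumes the usual form in which the difference between two independent calibrations of the same item is divided by the square root of the sum of their squared standard errors. The function name and the example values are hypothetical.

```python
import math


def separate_calibration_t(difficulty_a, se_a, difficulty_b, se_b):
    """Standardized difference between two calibrations of one item.

    Assumed form: t = (d_a - d_b) / sqrt(se_a^2 + se_b^2).
    """
    return (difficulty_a - difficulty_b) / math.sqrt(se_a ** 2 + se_b ** 2)


if __name__ == "__main__":
    # Hypothetical item calibrated separately in high and low scoring groups.
    t_value = separate_calibration_t(1.0, 0.3, 0.0, 0.4)
    # Flag the item when the calibrations differ by more than 2 in magnitude.
    print(t_value, abs(t_value) > 2.0)
```

An item is flagged when the magnitude of this value exceeds the chosen critical value, which is 2.0 in the abstract.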

7.
Variable construction requires careful attention to substantive issues: a theory guiding its development, a hierarchy of illustrative items constructed to define the variable, the subsequent production of item difficulties and person measures, and the analysis of fit. Rasch measurement practitioners should give careful attention to these matters, so practical suggestions are given for designing variables based on theory, item construction, and Rasch models for the analysis of data. Variable maps are emphasized to guide variable construction and interpret the results.

8.
In test analysis involving the Rasch model, a large degree of importance is placed on the "objective" measurement of individual abilities and item difficulties. The degree to which the objectivity properties are attained, of course, depends on the degree to which the data fit the Rasch model. It is therefore important to utilize fit statistics that accurately and reliably detect the person-item response inconsistencies that threaten the measurement objectivity of persons and items. Given this argument, it is somewhat surprising that far more emphasis is placed on the objective measurement of persons and items than on the measurement quality of Rasch fit statistics. This paper provides a critical analysis of the residual fit statistics of the Rasch model, arguably the most often used fit statistics, in an effort to illustrate that the task of Rasch fit analysis is not as simple and straightforward as it appears to be. The faulty statistical properties of the residual fit statistics do not allow either a convenient or a straightforward approach to Rasch fit analysis. For instance, given a residual fit statistic, the use of a single minimum critical value for misfit diagnosis across different testing situations, where the situations vary in sample and test properties, leads to both the overdetection and underdetection of misfit. To improve this situation, it is argued that psychometricians need to implement residual-free Rasch fit statistics that are based on the number of Guttman response errors, or use indices that are statistically optimal in detecting measurement disturbances.
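
The Guttman response errors that this abstract proposes as a basis for fit statistics can be counted as in the sketch below. It assumes dichotomous responses already sorted from the easiest to the hardest item and counts every pair in which the easier item is failed while a harder item is passed. The function name is hypothetical, and the residual-free statistics the paper builds on such counts are not reproduced.

```python
def guttman_errors(responses_easy_to_hard):
    """Count Guttman errors in one person's 0/1 response pattern.

    A pair (easier item, harder item) is an error when the easier item
    is scored 0 and the harder item is scored 1.
    """
    errors = 0
    for position, easier in enumerate(responses_easy_to_hard):
        for harder in responses_easy_to_hard[position + 1:]:
            if easier == 0 and harder == 1:
                errors += 1
    return errors


if __name__ == "__main__":
    print(guttman_errors([1, 1, 0, 0]))  # perfectly ordered pattern
    print(guttman_errors([0, 0, 1, 1]))  # fully reversed pattern
```

A perfectly ordered pattern has no errors, and the count grows as the pattern departs from the item difficulty order.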

9.
One of the assumptions of many latent trait models is local independence. This assumption specifies that, after controlling for the underlying trait, item responses are independent. Given the lack of studies of model robustness against such violations, it appears that this assumption is frequently taken for granted. Therefore, this study investigated the robustness of Rasch item and person estimates with simulated data under varying numbers of items, sample sizes, and levels of item redundancy. Item and person reliabilities, the standard deviations of the person and item estimates, the root mean squared differences and mean signed differences among person and item estimates, the correlations between person estimates, and the percentage of person estimates shifting by more than .50 logits were used to evaluate the impact of item redundancy. Both norm- and criterion-referenced interpretations may be influenced by the imputation of redundancy into the data. However, it appears that the amount of redundancy needs to be considerable before such interpretations would be adversely impacted. Suggestions for further simulation research are provided.

10.
This article contains information on the Rasch measurement partial credit model: what it is, how it differs from other Rasch models, when to use it, and how to use it. The calibration of instruments with increasingly complex items is described, starting with dichotomous items and moving on to polychotomous items using a single rating scale, mixed polychotomous items using multiple rating scales, and instruments in which each item has its own rating scale. It also introduces a procedure for aligning rating scale categories to be used when more than one rating scale is used in a single instrument. Pivot anchoring is defined, and an illustration of its use with the mental health scale of the SF-36, which contains positively and negatively worded items, is provided. Finally, it describes the effect of pivot anchoring on step calibrations, the item hierarchy, and person measures.
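
As a minimal sketch of the partial credit model discussed in this abstract, the code below computes category probabilities for one item from a person measure and that item's own step (threshold) calibrations, using the standard form in which the log-numerator of category k is the sum of (theta - delta_j) over the first k steps. The function name and example values are hypothetical, and pivot anchoring is not implemented.

```python
import math


def pcm_category_probabilities(theta, thresholds):
    """Category probabilities for one partial credit item.

    thresholds[j] is the step calibration delta_(j+1) in logits; the
    returned list has len(thresholds) + 1 entries, for categories 0..m.
    """
    cumulative = [0.0]  # the log-numerator for category 0 is zero
    for delta in thresholds:
        cumulative.append(cumulative[-1] + (theta - delta))
    numerators = [math.exp(value) for value in cumulative]
    total = sum(numerators)
    return [value / total for value in numerators]


if __name__ == "__main__":
    # Hypothetical three-category item with steps at -1 and +1 logits.
    print(pcm_category_probabilities(0.0, [-1.0, 1.0]))
```

With a single threshold the function reduces to the dichotomous Rasch model, which matches the progression from dichotomous to polychotomous items described in the abstract.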

11.
12.
This research describes some of the similarities and differences between additive conjoint measurement (a type of fundamental measurement) and the Rasch model. There appear to be many similarities between the two frameworks; however, their differences are nontrivial. For instance, while conjoint measurement specifies measurement scales using a data-free, non-numerical axiomatic frame of reference, the Rasch model specifies measurement scales using a numerical frame of reference that is, by definition, data dependent. In order to circumvent difficulties that can be realistically imposed by this data dependence, this research formalizes new non-parametric item response models. These models are probabilistic measurement theory models in the sense that they explicitly integrate the axiomatic ideas of measurement theory with the statistical ideas of order-restricted inference and Markov Chain Monte Carlo. The specifications of these models are rather flexible, as they can represent any one of several models used in psychometrics, such as Mokken's (1971) monotone homogeneity model, Scheiblechner's (1995) isotonic ordinal probabilistic model, or the Rasch (1960) model. The proposed non-parametric item response models are applied to analyze both real and simulated data sets.

13.
A large number of papers and technical reports are published every year describing research in which Rasch models are used. It has been observed, however, that not all authors describe the application of Rasch measurement with the same thoroughness. Some authors may leave out important pieces of information; for example, they may fail to investigate person or item fit, or may even fail to discuss the reliability of measurement. As a result, editorial guidelines have been published to suggest an informal minimum of thoroughness with which authors should describe the application of Rasch measurement in their papers. This study presents stages for the development of a scale to investigate the comprehensiveness with which individual papers describe the application of Rasch models in practical settings. The scale is used to evaluate how comprehensively the papers published by the Journal of Applied Measurement present the application of Rasch models.

14.
A multidimensional Rasch analysis of gender differences in PISA mathematics   (total citations: 1; self-citations: 0; citations by others: 1)
Since the 1970s, much attention has been devoted to the male advantage in standardized mathematics tests in the United States. Although girls are found to perform as well as boys in math classes, they are consistently outperformed on standardized math tests. This study compared the males and females in the United States, all 15-year-olds, by their performance on the PISA 2003 mathematics assessment. A multidimensional Rasch model was used for item calibration and ability estimation on the basis of four math domains: Space and Shape, Change and Relationships, Quantity, and Uncertainty. Results showed that the effect sizes of performance differences are small, all below .20, but consistent, in favor of boys. Space and Shape displayed the largest gender gap, which supports the findings from many previous studies. Quantity showed the least amount of gender difference, which may be explained by the hypothesis that girls perform better on tasks that they are familiar with through classroom practice.

15.
In the Rasch model for items with more than two ordered response categories, the thresholds that define the successive categories are an integral part of the structure of each item in that the probability of the response in any category is a function of all thresholds, not just the thresholds between any two categories. This paper describes a method of estimation for the Rasch model that takes advantage of this structure. In particular, instead of estimating the thresholds directly, it estimates the principal components of the thresholds, from which threshold estimates are then recovered. The principal components are estimated using a pairwise maximum likelihood algorithm which specialises to the well-known algorithm for dichotomous items. The method of estimation has three advantageous properties. First, by considering items in all possible pairs, sufficiency in the Rasch model is exploited with the person parameter conditioned out in estimating the item parameters, and by analogy to the pairwise algorithm for dichotomous items, the estimates appear to be consistent, though unlike for the dichotomous case, no formal proof has yet been provided. Second, the estimate of each item parameter is a function of frequencies in all categories of the item rather than just a function of frequencies of two adjacent categories. This stabilizes estimates in the presence of low-frequency data. Third, the procedure accounts readily for missing data. All of these properties are important when the model is used for constructing variables from large-scale data sets which must account for structurally missing data. A simulation study shows that the quality of the estimates is excellent.
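
The statement that the person parameter is conditioned out when items are considered in pairs can be checked numerically for the dichotomous case. The sketch below assumes the standard dichotomous Rasch model and local independence; given that exactly one of two items is answered correctly, the probability that it is item i depends only on the two item difficulties. All names and values are hypothetical, and this is not the paper's principal-components algorithm for polytomous items.

```python
import math


def rasch_probability(theta, difficulty):
    """P(X = 1) under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def pairwise_conditional(theta, b_i, b_j):
    """P(item i right, item j wrong | exactly one of the two is right)."""
    p_i = rasch_probability(theta, b_i)
    p_j = rasch_probability(theta, b_j)
    only_i = p_i * (1.0 - p_j)
    only_j = (1.0 - p_i) * p_j
    return only_i / (only_i + only_j)


def pairwise_closed_form(b_i, b_j):
    """The same conditional probability written without the person parameter."""
    return math.exp(-b_i) / (math.exp(-b_i) + math.exp(-b_j))


if __name__ == "__main__":
    for theta in (-2.0, 0.0, 3.0):
        # The value is the same for every theta, up to rounding error.
        print(theta, pairwise_conditional(theta, 0.5, -0.3))
    print(pairwise_closed_form(0.5, -0.3))
```

Because the conditional probability is free of the person parameter, item parameters can be estimated from pairwise frequencies alone, which is the property the abstract exploits.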

16.
The purpose of this research is twofold. The first is to extend the work of Smith (1992, 1996) and Smith and Miao (1991, 1994) in comparing item fit statistics and principal component analysis as tools for assessing the unidimensionality requirement of Rasch models. The second is to demonstrate methods to explore how violations of the unidimensionality requirement influence person measurement. For the first study, rating scale data were simulated to represent varying degrees of multidimensionality and the proportion of items contributing to each component. The second study used responses to a 24-item Attention Deficit Hyperactivity Disorder scale obtained from 317 college undergraduates. The simulation study reveals that both an iterative item fit approach and principal component analysis of standardized residuals are effective in detecting items simulated to contribute to multidimensionality. The methods presented in Study 2 demonstrate the potential impact of multidimensionality on norm- and criterion-referenced person measure interpretations. The results provide researchers with quantitative information to assist with the qualitative judgment as to whether the impact of multidimensionality is severe enough to warrant removing items from the analysis.

17.
This paper revisits a half-century-long theoretical controversy associated with the use of magnitude estimation scaling (MES) and category rating scaling (CRS) procedures in measurement. The MES procedure in this study involved instructing participants to write a number that matched their impression of the difficulty of a test item. Participants were not restricted in the range of numbers they could choose for their scale. They also had the choice of disclosing their individual scale. After the MES task was completed, participants were given a blank copy of the test to rate the perceived difficulty of each item using a researcher-imposed categorical rating scale from 1 (very easy) to 6 (very difficult). The MES and CRS data were both analyzed using the Rasch rating scale model. Additionally, the MES data were examined with the Rasch partial credit model. Results indicate that knowing each person's scale is associated with smaller errors of measurement.

18.
Local independence in the Rasch model can be violated in two generic ways that are generally not distinguished clearly in the literature. In this paper we distinguish between a violation of unidimensionality, which we call trait dependence, and a specific violation of statistical independence, which we call response dependence, both of which violate local independence. Distinct algebraic formulations for trait and response dependence are developed as violations of the dichotomous Rasch model, data are simulated with varying degrees of dependence according to these formulations, and then analysed according to the Rasch model assuming no violations. Relative to the case of no violation, it is shown that trait and response dependence result in opposite effects on the unit of scale as manifested in the range and standard deviation of the scale and the standard deviation of person locations. In the case of trait dependence the scale is reduced; in the case of response dependence it is increased. Again, relative to the case of no violation, the two violations also have opposite effects on the person separation index (analogous to Cronbach's alpha reliability index of traditional test theory in value and construction): it decreases for data with trait dependence; it increases for data with response dependence. A standard way of accounting for dependence is to combine the dependent items into a higher-order polytomous item. This typically results in a decreased person separation index and Cronbach's alpha, compared with analysing items as discrete, independent items. This occurs irrespective of the kind of dependence in the data, and so further contributes to the two violations not being distinguished clearly. In an attempt to begin to distinguish between them statistically, this paper articulates the opposite effects of these two violations in the dichotomous Rasch model.

19.
The purpose of this research was to develop survey instruments to evaluate diabetes knowledge and self-efficacy in a diverse population, and to investigate the psychometric properties of data obtained with these instruments using Rasch measurement. Two hundred and fifty-five urban-dwelling participants with diabetes were recruited to complete surveys through independent interviews. To evaluate the association of health literacy with metabolic control, formal literacy and hemoglobin A1c fingerstick testing were performed. Rasch analysis of the data yielded item and person calibrations for self-efficacy and knowledge, with variable maps created to provide both norm- and criterion-referenced interpretations. Knowledge scale person separation reliability was 0.50 and item separation reliability was 0.98, while self-efficacy scale person separation reliability was 0.72 with item separation reliability of 0.92. Statistically significant partial correlations were observed between knowledge and health literacy (r = 0.41, p<.001), and self-efficacy and hemoglobin A1c (r = -0.33, p<.001). However, there was no significant correlation between diabetes knowledge and hemoglobin A1c (r = 0.035, p = 0.29), or health literacy and A1c (r = 0.022, p = 0.36). Diabetes knowledge varied, with non-English-speaking individuals having lower measures than English speakers (t(252) = -4.86, p<.001). Non-English-speaking individuals also had lower self-efficacy measures than English speakers (t(251) = -2.68, p = .008). Current knowledge deficits and perceptions of self-management may be estimated visually through variable mapping, which may help in individualizing informational needs for people with diabetes.

20.
The purpose of this study was to evaluate the effects of item grouping on local independence and item invariance, the characteristics of items scaled under the Rasch model that make them sample-free. Item fit and calibration for attitude items presented in a grouped versus random order were examined. It was hypothesized that grouping items to facilitate interpretation central to a construct may result in a failure of invariance. Data were 107 responses to a 40-item mail survey of teachers' opinions about the Ontario Ministry's grade 9 literacy test. Effects of grouping and item phrasing on invariance were found. Results, however, generally support the use of grouping of items to provide a higher person separation, and potentially higher quality data.
