Found 20 similar documents (search time: 15 ms)
1.
The invariance of the estimated parameters across variation in the incidental parameters of a sample is one of the most important properties of Rasch measurement models. This is the property that allows the equating of test forms and the use of computer adaptive testing. It necessarily follows that in Rasch models, if the data fit the model, then the estimate of the parameter of interest must be invariant across sub-samples of the items or persons. This study investigates the degree to which the INFIT and OUTFIT item fit statistics in WINSTEPS detect violations of the invariance property of Rasch measurement models. The test in this study is an 80-item multiple-choice test used to assess mathematics competency. The WINSTEPS analysis of the dichotomous results, based on a sample of 2,000 from a very large number of students who took the exam, indicated that only 7 of the 80 items misfit using the 1.3 mean square criterion advocated by Linacre and Wright. Subsequent calibration of separate samples of 1,000 students from the upper and lower thirds of the person raw score distribution, followed by a t-test comparison of the item calibrations, indicated that the item difficulties for 60 of the 80 items were more than 2 standard errors apart. The separate calibration t-values ranged from +21.00 to -7.00, with the t-values of 41 of the 80 comparisons either larger than +5 or smaller than -5. Clearly these data do not exhibit the invariance of the item parameters expected if the data fit the model. Yet the INFIT and OUTFIT mean squares are completely insensitive to the lack of invariance in the item parameters. If the OUTFIT ZSTD from WINSTEPS were used with a critical value of | t | > 2.0, then 56 of the 60 items identified by the separate calibration t-test would be identified as misfitting. A fourth measure of misfit, the between ability-group item fit statistic, identified 69 items as misfitting when a critical value of t > 2.0 was used. Clearly, relying solely on the INFIT and OUTFIT mean squares in WINSTEPS to assess the fit of the data to the model would cause one to miss one of the most important threats to the usefulness of the measurement model.
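The separate calibration comparison described in this abstract can be sketched as follows. The abstract does not give its formula; this minimal sketch assumes the commonly used form t = (d_low - d_high) / sqrt(SE_low² + SE_high²), where the difficulties and standard errors come from two independent calibrations of the same item. The function name and all item values are hypothetical and are not the study's data.

```python
import math


def separate_calibration_t(d_low, se_low, d_high, se_high):
    """t statistic for the difference between two independent calibrations
    of one item (assumed formula, difficulties and errors in logits)."""
    return (d_low - d_high) / math.sqrt(se_low ** 2 + se_high ** 2)


# Hypothetical calibrations: (difficulty, SE) from the lower-scoring sample,
# then (difficulty, SE) from the upper-scoring sample.
items = {
    "item_01": (0.50, 0.08, 0.45, 0.09),
    "item_02": (1.20, 0.10, 0.40, 0.12),
}

# Flag items whose two calibrations differ by more than 2 standard errors.
flagged = [
    name
    for name, calibration in items.items()
    if abs(separate_calibration_t(*calibration)) > 2.0
]
print(flagged)
```

Under these made-up values only the second item exceeds the | t | > 2.0 threshold, which is the kind of flag the abstract reports for 60 of the 80 items.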
2.
Empirically based item selection guidelines are presented for moving the cut score on equated tests consisting of n dichotomous items calibrated assuming the Rasch model. The cut score on a test form B, c(B), may be made higher than test form A's cut score, c(A), in the following ways: (1) select items for test form B such that the variance of test form B's item difficulties, sigma(2)(B), will be equal to test form A's sigma(2)(A), but test form B's mean item difficulty, mu(B), will be less than that of test form A, mu(A); (2) given c(A) > n/2, select items for test form B such that mu(B) s(2)(A). To make c(B) lower than c(A), the direction of the changes listed above for the two tests' item difficulty sigma(2) and mu should be reversed. Derivations of the lemmas that underlie the guidelines are provided, as well as a simulated example.
3.
This paper examines the impact of differential item functioning (DIF), missing item values, and different methods for handling missing item values on theta estimates with data simulated from the partial credit model and Andrich's rating scale model. Both Rasch family models are commonly used when obtaining an estimate of a respondent's attitude. The degree of missing data, DIF magnitude, and the percentage of DIF items were varied in MCAR data conditions in which the focal group was 10% of the total population. Four methods for handling missing data were compared: complete-case analysis, mean substitution, hot-decking, and multiple imputation. Bias, RMSE, means, and standard errors of the theta estimates for the focal group were adversely affected by the amount and magnitude of DIF items. RMSE and fidelity coefficients for both the reference and focal group were adversely impacted by the amount of missing data. While all methods of handling missing data performed fairly similarly, multiple imputation and hot-decking showed slightly better performance.
4.
5.
Rasch model-based vertical scaling was evaluated in a simulation study with respect to recovery of item parameters, the linking constant, the population mean (grade-to-grade growth), the population standard deviation (grade-to-grade variability), and the separation of grade distributions by effect size. The simulated vertical scale had five different grades with five different test levels. Controlled factors were data collection design, linking method, and sample size. For item parameters, linking constant, and population mean, the counter-balanced single group (CBSG) design with the mean/mean (or fixed item) method and concurrent calibration performed best. Recovery of the population standard deviation did not show systematic improvement with increasing sample size across the different data collection designs and linking methods. For the separation of grade distributions, CBSG with the mean/mean (or fixed item) method performed best. The average absolute differences from the true parameters were less than 0.1 logits across the different linking methods. In general, the differences between linking methods were smaller than those between sample sizes.
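As an illustration of the mean/mean method named in this abstract, a minimal sketch follows. It assumes the usual Rasch formulation in which the linking constant is the difference between the mean anchor-item difficulties on the base scale and on the new form's scale; the function names and difficulty values are hypothetical and are not taken from the study.

```python
from statistics import mean


def mean_mean_constant(anchor_base, anchor_new):
    """Linking constant: mean anchor difficulty on the base scale minus
    mean anchor difficulty on the new form's scale (logits)."""
    return mean(anchor_base) - mean(anchor_new)


def link(difficulties, constant):
    """Shift new-form difficulties onto the base scale."""
    return [b + constant for b in difficulties]


# Hypothetical anchor items calibrated separately on both forms.
anchor_base = [-0.5, 0.0, 0.7]
anchor_new = [-1.0, -0.5, 0.2]

shift = mean_mean_constant(anchor_base, anchor_new)

# The first three new-form items are the anchors; the fourth is unique.
linked = link([-1.0, -0.5, 0.2, 1.1], shift)
print(shift, linked)
```

Because the Rasch model fixes the slope, a single additive constant is enough here; models with discrimination parameters would also need a scale factor.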
6.
Component parameter monitoring reliability is estimated when the errors of measurement are dependent on the parameters. The solution amounts to a standard classical evaluation of monitoring reliability with statistical independence between the errors of measurement and the parameters. Translated from Izmeritel'naya Tekhnika, No. 2, pp. 10–12, February, 1994.
7.
Consider a heteroscedastic regression model Y=m(X)+σ(X)ε, where m(X)=E(Y|X) and σ²(X)=Var(Y|X) are unknown, and the error ε is independent of the covariate X. We propose a new type of test statistic for testing whether the regression curve m(⋅) belongs to some parametric family of regression functions. The proposed test statistic measures the distance between the empirical distribution functions of the parametric and of the nonparametric residuals. The asymptotic theory of the proposed test is developed, and the proposed testing procedure is illustrated by means of a small simulation study and the analysis of a data set.
8.
Babiar TC 《Journal of applied measurement》2011,12(2):144-164
Traditionally, women and minorities have not been fully represented in science and engineering. Numerous studies have attributed these differences to gaps in science achievement as measured by various standardized tests. Rather than describe mean group differences in science achievement across multiple cultures, this study focused on an in-depth item-level analysis across two countries: Spain and the United States. This study investigated eighth-grade gender differences on science items across the two countries. A secondary purpose of the study was to explore the nature of gender differences using the many-faceted Rasch model as a way to estimate gender DIF. A secondary analysis of data from the Third International Mathematics and Science Study (TIMSS) was used to address three questions: 1) Does gender DIF in science achievement exist? 2) Is there a relationship between gender DIF and characteristics of the science items? 3) Do the relationships between item characteristics and gender DIF in science items replicate across countries? Participants included 7,087 eighth-grade students from the United States and 3,855 students from Spain who participated in TIMSS. The Facets program (Linacre and Wright, 1992) was used to estimate gender DIF. The results of the analysis indicate that the content of the item seemed to be related to gender DIF. The analysis also suggests that there is a relationship between gender DIF and item format. No pattern of gender DIF related to cognitive demand was found. The general pattern of gender DIF was similar across the two countries used in the analysis. The strength of item-level analysis as opposed to group mean difference analysis is that gender differences can be detected at the item level, even when no mean differences can be detected at the group level.
9.
10.
The paper discusses the pitch error of the tooth rotor in a magnetoelectric angle converter or the error of measurement when the stator has one, two, or four one-tooth readout heads. Translated from Izmeritel'naya Tekhnika, No. 3, pp. 38–40, March, 1995.
11.
Experimental research on the propagation of mixed-type fatigue cracks was carried out on specimens of rail steel. The direction of fatigue crack growth was studied under three kinds of loading: sign-variable (alternating) cross shear, and cross and longitudinal shear in a compressive stress field. One general feature of fatigue crack development was established for all specimens studied: the direction of crack growth coincides with the trajectory of the principal stress that is minimal in absolute value.
12.
Experimental research on the propagation of mixed-type fatigue cracks was carried out on specimens of rail steel. The direction of fatigue crack growth was studied under three kinds of loading: sign-variable (alternating) cross shear, and cross and longitudinal shear in a compressive stress field. One general feature of fatigue crack development was established for all specimens studied: the direction of crack growth coincides with the trajectory of the principal stress that is minimal in absolute value.
13.
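Influence of ferroelectric cathode geometric parameters on diode electron emission   Total citations: 1, self-citations: 0, citations by others: 1
This paper mainly discusses the influence of the trigger electric field pulse width and of the cathode geometric parameters on electron emission from a ferroelectric diode. Electron emission experiments under excitation with different pulse widths show that the optimal working width of the trigger pulse lies in the range 150-250 ns. A study of electron emission from ferroelectric cathode plates with different geometric parameters under positive-pulse excitation indicates that, under typical experimental conditions (trigger voltage below 4 kV/mm) and for the same trigger gradient field, the thicker the cathode and the finer and denser the strip-grid electrode, the greater the emission current density. In polarization-reversal emission, the leading-edge and trailing-edge emission modes can coexist.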
14.
15.
16.
17.
The diffusion coefficients of ions and radionuclides in cementitious materials are the basic parameters used to evaluate the state of degradation of structures. In this article, three different tracers (two ions and a radionuclide) were tested on the same mortar formulations (sand volume fractions from 0 to 60%) by through-diffusion, to determine the effective diffusion coefficient of each tracer in each formulation. The aim of this study is to prove the validity of the formation factor equation relating the effective diffusivity of a tracer in a cementitious material to its diffusion coefficient in pure water. This result is extremely interesting because once the geometric formation factor of a material is known, it is possible to determine the effective diffusion coefficient of any other diffusing species in this material.
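The use of a formation factor described in this abstract can be sketched as follows. The abstract does not state the equation; this sketch assumes the common definition F = D_0 / D_e, where D_0 is the tracer's diffusion coefficient in pure water and D_e its effective diffusion coefficient in the material. The function names and all numerical values are hypothetical, not measurements from the article.

```python
def formation_factor(d_water, d_effective):
    """Geometric formation factor, assuming F = D_0 / D_e."""
    return d_water / d_effective


def effective_diffusivity(d_water, factor):
    """Effective diffusion coefficient of a species, D_e = D_0 / F."""
    return d_water / factor


# Hypothetical tracer measured on one mortar formulation (m^2/s).
d0_tracer = 2.0e-9
de_tracer = 4.0e-12
F = formation_factor(d0_tracer, de_tracer)

# Predicted effective diffusivity of another hypothetical species
# in the same material, from its free-water diffusion coefficient.
d0_other = 1.0e-9
de_other = effective_diffusivity(d0_other, F)
print(F, de_other)
```

The point of the sketch is that F is treated as a property of the material alone, so one measured tracer is enough to estimate D_e for other species, provided the assumed relation holds.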
18.
19.
To determine the dynamic strength characteristics of steels used in the manufacture of thick-walled pipes operating under pulsed internal loading, the authors developed experimental equipment and a method of determining the strength characteristics of the material. The results of tests on a number of structural steels are presented. Translated from Problemy Prochnosti, No. 2, pp. 46–51, February, 1993.