Similar articles
Found 20 similar articles.
1.
This research examined empirical evidence for a new construct, Functional Caregiving (FC), which is a theory about mothers' caregiving of their adult children with intellectual disabilities. A sample of 108 biological mothers and primary caregivers rated survey items about their confidence to perform caregiving tasks. Rasch rating scale analysis found that 61 items defined an empirical construct with three caregiving levels: Advocacy, Personal Caregiving, and Community. Item separation was 3.11 with high reliability (.91), and mother separation was 2.93 with reliability .90. Both items and mothers showed adequate INFIT and OUTFIT values. Item invariance was confirmed between older and younger mothers, and principal components analysis of item residuals did not reveal any major dimensionality threats. Item decomposition analysis showed FC content theory to account for 58 percent of item calibration variance (R² = .58, F = 42.3, p < .001). These results have important practical implications for health and social services, as well as family caregiving, interdisciplinary practices, and health policy development.
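The separation and reliability figures quoted above are linked by the standard Rasch relationship R = G² / (1 + G²), where G is the separation index. The sketch below only checks the arithmetic of the reported values; the function names are illustrative and do not come from the paper or from any Rasch software.

```python
def reliability_from_separation(g: float) -> float:
    """Separation reliability implied by a separation index G: R = G^2 / (1 + G^2)."""
    return g * g / (1.0 + g * g)


def separation_from_reliability(r: float) -> float:
    """Inverse relationship: G = sqrt(R / (1 - R)), valid for 0 <= R < 1."""
    return (r / (1.0 - r)) ** 0.5


# Values reported in the abstract (items: G = 3.11, mothers: G = 2.93).
for label, g in [("items", 3.11), ("mothers", 2.93)]:
    print(label, round(reliability_from_separation(g), 2))
```

Both reported reliabilities (.91 and .90) agree with the reported separations to two decimals under this formula.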

2.
In test analysis involving the Rasch model, a large degree of importance is placed on the "objective" measurement of individual abilities and item difficulties. The degree to which the objectivity properties are attained, of course, depends on the degree to which the data fit the Rasch model. It is therefore important to utilize fit statistics that accurately and reliably detect the person-item response inconsistencies that threaten the measurement objectivity of persons and items. Given this argument, it is somewhat surprising that far more emphasis is placed on the objective measurement of persons and items than on the measurement quality of Rasch fit statistics. This paper provides a critical analysis of the residual fit statistics of the Rasch model, arguably the most often used fit statistics, in an effort to illustrate that the task of Rasch fit analysis is not as simple and straightforward as it appears to be. The faulty statistical properties of the residual fit statistics do not allow either a convenient or a straightforward approach to Rasch fit analysis. For instance, given a residual fit statistic, the use of a single minimum critical value for misfit diagnosis across different testing situations, where the situations vary in sample and test properties, leads to both the overdetection and underdetection of misfit. To improve this situation, it is argued that psychometricians need to implement residual-free Rasch fit statistics that are based on the number of Guttman response errors, or use indices that are statistically optimal in detecting measurement disturbances.
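As a rough illustration of what "the number of Guttman response errors" means, the sketch below counts, for one person, every pair of items in which an easier item was failed while a harder item was passed. It assumes dichotomous (0/1) scores already sorted from easiest to hardest item. It shows only the raw count, not the specific fit statistics the paper proposes, and the function name is hypothetical.

```python
def guttman_errors(responses):
    """Count Guttman errors in a 0/1 response vector.

    `responses` must be ordered from the easiest item to the hardest.
    A Guttman error is a pair (easier item scored 0, harder item scored 1).
    """
    errors = 0
    zeros_seen = 0  # failures on easier items encountered so far
    for x in responses:
        if x == 0:
            zeros_seen += 1
        else:
            # this success pairs with every earlier failure
            errors += zeros_seen
    return errors


print(guttman_errors([1, 1, 1, 0, 0]))  # perfect Guttman pattern
print(guttman_errors([0, 0, 1, 1]))     # fully reversed pattern
```

A perfect Guttman pattern gives zero errors, and the count grows as successes on hard items accumulate after failures on easy ones.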

3.
As a practical matter, Spirituality and Quality of Life in the health sciences are usually measured separately. Theoretical foundations for this distinction, however, are not strong. In this research, an empirical investigation was conducted into their joint calibration with a Rasch model. Functional Assessment of Cancer Therapy-General (28 items), a cancer health-related quality of life measure (HRQOL), and Functional Assessment of Chronic Illness - Spiritual Well-Being (12 items), a measure of religious and existential well-being (Spirituality), were co-calibrated with a Rasch model implemented with WINSTEPS software for ratings from 545 breast cancer patients. The results show a hierarchical integration of QOL and Spirituality items on a common variable, and both patient separation (2.66) and reliability (.88) improve after co-calibration. Principal Component Analysis of co-calibrated item residuals did not show major threats to dimensionality, and joint calibration explains item variance comparable to separate calibrations (51.9%). Although patient measures (logits) based on separate and co-calibration are within two standard errors, ethnic and racial group values shift after co-calibration.

4.
The purpose of this study was to evaluate the effects of item grouping on local independence and item invariance, the characteristics of items scaled under the Rasch model that make them sample-free. Item fit and calibration for attitude items presented in a grouped versus random order were examined. It was hypothesized that grouping items to facilitate interpretation central to a construct may result in a failure of invariance. Data were 107 responses to a 40-item mail survey of teachers' opinions about the Ontario Ministry's grade 9 literacy test. Effects of grouping and item phrasing on invariance were found. Results, however, generally support the use of grouping of items to provide a higher person separation, and potentially higher quality data.

5.
6.
There has been some discussion among researchers as to the benefits of using one calibration process over the other during equating. Although the literature is rife with the pros and cons of the different methods, hardly any research has been done on anchoring (i.e., fixing item parameters to their pre-determined values on an established scale) as a method that is commonly used by psychometricians in large-scale assessments. This simulation research compares the fixed form of calibration with the concurrent method (where calibration of the different forms on the same scale is accomplished by a single run of the calibration process, treating all non-included items on the forms as missing or not reached), using the dichotomous Rasch (Rasch, 1960) and the Rasch partial credit (Masters, 1982) models, and the WINSTEPS (Linacre, 2003) computer program. Contrary to the belief and some researchers' contention that the concurrent run with larger n-counts for the common items would provide greater accuracy in the estimation of item parameters, the results of this paper indicate that the greater accuracy of one method over the other is confounded by the sample size, the number of common items, etc., and there is no real benefit in using one method over the other in the calibration and equating of parallel test forms.

7.
The analysis of fit, whether viewed from the perspective of the fit of the data to the measurement model, or the fit of the measurement model to the data, is an important part of using latent trait models. In the case of the Rasch model, all of the desirable characteristics of the model (interval item and person measures, asymptotic standard errors, parameter invariance across subsets of persons or items, to name a few) are predicated on the requirement that the data fit the model. To the extent that the data do not fit the model, these properties hold to a lesser degree. The analysis of fit is of primary importance if the interpretation of the calibration results is to be useful. This article explores the nature of fit and provides a historical overview of fit indices. It then focuses on a particular family of fit indices that are based on the Pearsonian chi-square approach to fit, in an attempt to show why it is necessary to use a family of standardized fit indices to completely understand the relationship between the data and the model.

8.
The item infit and outfit mean square errors (MSE) and their t-transformed statistics are widely used to screen poorly fitting items. The t-transformed statistics, however, do not follow the standard normal distribution so that hypothesis testing of item fit based on the conventional critical values is likely to be inaccurate (Wang and Chen, 2005). The MSE statistics are effect-size measures of misfit and have an expected value of unity when the data fit the model's expectation. Unfortunately, most computer programs for item response analysis do not report confidence intervals of the item infit and outfit MSE, mainly because their sampling distributions are analytically intractable. Hence, the user is left without interval estimates of the magnitudes of misfit. In this study, we developed a FORTRAN 90 computer program in conjunction with the commercial program WINSTEPS (Linacre, 2001) that yields confidence intervals of the item infit and outfit MSE using the bootstrap method. The utility of the program is demonstrated through three illustrations of simulated data sets.
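To make the quantities concrete, here is a minimal sketch of the infit and outfit mean squares for one dichotomous Rasch item, together with a percentile bootstrap interval. It is not the authors' FORTRAN 90 program. The resampling scheme (resampling persons with replacement while holding the person and item estimates fixed) is an assumption made here for illustration, since the abstract does not say which bootstrap variant was used, and all function names are hypothetical.

```python
import math
import random


def rasch_p(theta, b):
    """Probability of a correct response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def item_fit(responses, thetas, b):
    """Return (infit, outfit) mean squares for one item with difficulty b."""
    sq_resid = 0.0   # sum of squared raw residuals
    variances = 0.0  # sum of model variances p(1 - p)
    z2 = 0.0         # sum of squared standardized residuals
    for x, theta in zip(responses, thetas):
        p = rasch_p(theta, b)
        w = p * (1.0 - p)
        sq_resid += (x - p) ** 2
        variances += w
        z2 += (x - p) ** 2 / w
    infit = sq_resid / variances      # information-weighted mean square
    outfit = z2 / len(responses)      # unweighted mean square
    return infit, outfit


def bootstrap_ci(responses, thetas, b, n_boot=1000, alpha=0.05, seed=0):
    """Percentile intervals for (infit, outfit); persons are resampled (assumed scheme)."""
    rng = random.Random(seed)
    n = len(responses)
    infits, outfits = [], []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        i, o = item_fit([responses[j] for j in idx], [thetas[j] for j in idx], b)
        infits.append(i)
        outfits.append(o)

    def percentile_pair(values):
        values = sorted(values)
        lo = values[int((alpha / 2) * (len(values) - 1))]
        hi = values[int((1 - alpha / 2) * (len(values) - 1))]
        return lo, hi

    return percentile_pair(infits), percentile_pair(outfits)
```

Both statistics have an expected value of 1 when the data fit the model, which is why an interval that excludes 1 is read as evidence of misfit.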

9.
Occupational therapists do not have a comprehensive, objective method for measuring how persons with tetraplegia perform activities of daily living (ADL) in their homes and communities, because SCI ADL performance is usually determined in rehabilitation. The ADL Habits Survey (ADLHS) is designed specifically to address this knowledge gap by surveying performance on relevant and meaningful activities in homes and communities. After a comprehensive task analysis and pilot development, 30 activities were selected that emphasize a broad range of hand and wrist, reaching, and grasping movements in compound activities. A sample of 49 persons with cervical spinal cord injuries responded to items. The sample was predominantly male, median age was 41 years, and ASIA motor classification levels ranged from C2 through C8/T1 with majority concentration in C4, C5, or C6 (68%). Each participant report was rated by an occupational therapist using a seven-category rating scale, and the item-by-participant response matrix (30 × 49) was analyzed with a Rasch model for rating scales. Results showed excellent participant separation (>4) and very high reliability (>.95), and both item and participant fit values were adequate (standardized INFIT less than 3 in absolute value). With only two exceptions, all participants fit the Rasch rating scale model, and only one item, "Light housekeeping", presented significant fit issues. Principal Components Analysis of item residuals did not reveal serious threats to unidimensionality. A between-group fit comparison of participants with more versus less movement found invariant item calibrations, and ANOVA of participant measures found statistically significant differences across ASIA motor classification levels. These ADLHS results offer occupational therapists a new method for measuring ADL that is potentially more sensitive to functional changes in tetraplegia than most instruments in common use. Accommodation of step disorder with a three-category rating scale did not diminish measurement properties.

10.
Is it possible to establish a consistent, stable relationship between the structure of number and additive amounts of mindfulness practice? A bank of thirty items, constructed from a review of the literature and from novice practitioners' journal responses to mindfulness practice, comprised the instrument. A convenience sample of students in a teacher education program participated. The WINSTEPS Rasch measurement software was used for all analyses. Measurement separation reliability was 0.92 and item separation reliability was 0.98, with satisfactory model fit. The 30 items measure a single construct of mindfulness practice. Construct validity was supported by the meaningfulness of the items perceived as easy to hard. The same scale was produced when the items were calibrated separately on the T1 and T2 groups (R² = 0.83). The experimental group's T2 measures were significantly different from both its own T1 measures and the control group's T1 and T2 measures. ANOVA showed significance for variance between the experimental and control groups for T2 (F = 43.66, 151 d.f., p < .001) for a nearly two-logit (20 unit) difference (48.9 vs. 68.0). The study is innovative in its demonstration of mindfulness practice as a measurable variable.

11.
The partial credit model (PCM) is commonly employed to parameterize items and individuals using responses to a set of polytomous items. Because the PCM does not include a discrimination parameter, it may encounter substantial lack of fit to the data in certain situations. To determine the impact of model misfit on the estimation of person and item parameters using the PCM, a simulation study was conducted in which data were generated according to the generalized partial credit model, and the bias and efficiency of the resulting person and item parameter estimates were assessed. The results suggest that small amounts of unsystematic misfit do not lead to dramatic levels of bias or loss of efficiency of the estimators, but large levels of unsystematic misfit and moderate levels of systematic misfit result in substantial loss of efficiency and bias of the estimators.
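The difference between the two models comes down to one parameter: the generalized partial credit model multiplies each step term by an item discrimination a, and the PCM is the special case a = 1. The sketch below computes category probabilities under that standard formulation. The function name and argument layout are illustrative and are not taken from the study's simulation code.

```python
import math


def gpcm_probs(theta, deltas, a=1.0):
    """Category probabilities for one polytomous item.

    theta  : person location
    deltas : step parameters delta_1..delta_m (scores run 0..m)
    a      : item discrimination; a = 1.0 gives the partial credit model
    """
    # Cumulative sums of a * (theta - delta_j); the score-0 term is 0 by convention.
    cum = [0.0]
    for d in deltas:
        cum.append(cum[-1] + a * (theta - d))
    mx = max(cum)  # subtract the maximum for numerical stability
    expv = [math.exp(c - mx) for c in cum]
    total = sum(expv)
    return [e / total for e in expv]


# Same item and person, with and without a discrimination different from 1.
print(gpcm_probs(0.5, [-1.0, 0.0, 1.0]))          # PCM (a = 1)
print(gpcm_probs(0.5, [-1.0, 0.0, 1.0], a=1.8))   # GPCM with a hypothetical a
```

Fitting the PCM to data generated with a different from 1 is the kind of misfit the simulation study examines.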

12.
The Assessment of Growth Hormone Deficiency in Adults (AGHDA) questionnaire was previously designed, translated and validated in several European countries to evaluate the impact of the disease on Quality of Life. This study aimed to test the metric properties of the Spanish version by means of Rasch analysis. A sample of 356 consecutive adult patients with untreated GHD was included in the study. Patients responded to the questionnaire at baseline and again 12 months later. Answers were analyzed following the dichotomous logistic response model. Parameter estimates, model-data fit and separation statistics were computed. The invariance of the item parameters across time was tested in the follow-up. Rasch results were additionally employed to ascertain score changes through the calculation of the Reliable Change Index (RCI). Items varied in severity from 8.3 to 16.8 units (SE = 0.4 to 0.5) and fit to define a unidimensional variable. The item separation index (SI = 5.2) indicates a good and reliable (0.9) separation of items along the variable that they define. Moreover, results showed the AGHDA conforms to the model expectation of item parameter invariance between administrations. The substantial (2.3) and reliable (0.8) person SI also suggests the sample was well targeted by the questionnaire. According to the RCI, 84% of the patients did not show a significant transition in their measures. Results denote that the items of the AGHDA succeeded in defining a scale characterized by the interval level of its measures, suggesting the questionnaire could be a useful complement to the clinical evaluation of GHD patients at both group and individual level.
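The abstract does not give the formula used for the Reliable Change Index. One common way to apply the idea to Rasch person measures, assumed here purely for illustration, divides the change in a person's measure by the combined standard error of the two measures and flags changes beyond ±1.96. The function names and the example values below are hypothetical.

```python
import math


def reliable_change_index(m1, m2, se1, se2):
    """Change in a person's measure divided by the standard error of the difference.

    Assumed form: RCI = (m2 - m1) / sqrt(se1^2 + se2^2).
    """
    return (m2 - m1) / math.sqrt(se1 ** 2 + se2 ** 2)


def is_reliable_change(m1, m2, se1, se2, critical=1.96):
    """True when the change exceeds the critical value in either direction."""
    return abs(reliable_change_index(m1, m2, se1, se2)) > critical


# Hypothetical baseline and 12-month measures with their standard errors.
print(is_reliable_change(10.0, 12.0, 0.5, 0.5))
print(is_reliable_change(10.0, 11.0, 0.5, 0.5))
```

Under this form, a patient whose measure moves by less than about 1.96 combined standard errors would be counted among those showing no significant transition.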

13.
The classification of rheumatic diseases is challenging because these diseases have protean and frequently overlapping clinical and laboratory manifestations. This problem is typified by the difficulty of classification and differentiation of two prototypic multi-system autoimmune diseases, Systemic Lupus Erythematosus (SLE) and Mixed Connective Tissue Disease (MCTD). The researchers submitted medical risk factor data represented by instrument or laboratory measures and physician judgments (12 key features for SLE) from 43 patients diagnosed with SLE and 12 key features for MCTD from 51 patients diagnosed with MCTD to the WINSTEPS Rasch analysis program. Using Rasch model parameterization, and fit and residuals analyses, the researchers identified separate dimensions for MCTD and SLE, thereby lending support to the position that MCTD is its own separate disease, distinct from SLE.

14.
Using data from the PISA 2006 field trial, Rasch item response models are used to demonstrate that extreme response tendency was exhibited differentially across culturally distinct countries when answering Likert type attitude items. A single attitude scale is examined across eight culturally distinct countries in this paper. Two avenues to ameliorate this tendency are explored: first using dichotomous variants of the items, and second incorporating the country specific response tendency into the Rasch item response model. Analysis of the item variants reveals similar scale outcomes and correlations with achievement but preference for the Likert variant when test information is considered. A hierarchical analysis using facet models reveals that the data fit significantly better in a model that incorporates an interaction effect between the country and the item delta parameters. The implications for reporting attitudes measured with Likert items across cultures are outlined.

15.
The Rasch family of models displays several well-documented properties that distinguish them from the general item response theory (IRT) family of measurement models. This paper describes an additional unique property of Rasch models, referred to as the property of item information constancy. This property asserts that the area under the information function for Rasch models is always equal to the number of response categories minus one, regardless of the values of the item location parameters. The implication of the property of item information constancy is that, for a given number of response categories, all items following a Rasch model contribute equally to the height of the test information function across the entire latent continuum.
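The constancy property can be checked numerically. The sketch below assumes the partial credit parameterization, takes the item information at θ to be the model variance of the item score, and integrates it with the trapezoid rule over a wide interval. The integration limits, step count, and function names are choices made here for illustration and are not from the paper.

```python
import math


def pcm_probs(theta, deltas):
    """Partial credit model category probabilities for scores 0..m."""
    cum = [0.0]
    for d in deltas:
        cum.append(cum[-1] + (theta - d))
    mx = max(cum)  # subtract the maximum for numerical stability
    expv = [math.exp(c - mx) for c in cum]
    total = sum(expv)
    return [e / total for e in expv]


def item_information(theta, deltas):
    """Item information at theta: the variance of the item score under the model."""
    p = pcm_probs(theta, deltas)
    mean = sum(k * pk for k, pk in enumerate(p))
    return sum((k - mean) ** 2 * pk for k, pk in enumerate(p))


def information_area(deltas, lo=-40.0, hi=40.0, steps=8000):
    """Trapezoid-rule estimate of the area under the item information function."""
    h = (hi - lo) / steps
    total = 0.5 * (item_information(lo, deltas) + item_information(hi, deltas))
    for i in range(1, steps):
        total += item_information(lo + i * h, deltas)
    return total * h


# Dichotomous item (2 categories) and a 4-category item with arbitrary steps.
print(information_area([0.0]))
print(information_area([-1.0, 0.5, 2.0]))
```

The areas come out numerically close to 1 and 3 respectively, the number of categories minus one, and shifting the step values leaves them unchanged. This follows because the information is the derivative of the expected score, which rises from 0 to the maximum score across the continuum.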

16.
This paper reports the use of a Rasch measurement model, the Extended Logistic Model of Rasch (Andrich, 1988), to explore the construct of a general motor ability in young children. Data were collected from 332 five and six year old children performing 24 motor skills, including run, hop, balance and ball skills. The data were categorised based on threshold estimates provided by the measurement model. Gender differences in performances on items were hypothesised to contribute to initial item and person misfit for the total sample. The data for boys and for girls were separated and independently analysed resulting in improved item and person fit. Two different, unidimensional scales for boys and for girls were created.

17.
This article contains information on the Rasch measurement partial credit model: what it is, how it differs from other Rasch models, when to use it, and how to use it. The calibration of instruments with increasingly complex items is described, starting with dichotomous items and moving on to polychotomous items using a single rating scale, and mixed polychotomous items using multiple rating scales, and instruments in which each item has its own rating scale. It also introduces a procedure for aligning rating scale categories to be used when more than one rating scale is used in a single instrument. Pivot anchoring is defined and an illustration of its use with the mental health scale of the SF-36 that contains positive and negative worded items is provided. It finally describes the effect of pivot anchoring on step calibrations, the item hierarchy, and person measures.

18.
The aim is to show that it is possible to parameterize discrimination for sets of items, rather than individual items, without destroying conditions for sufficiency in a form of the Rasch model. The form of the model is obtained by formalizing the relationship between discrimination and the unit of a metric. The raw score vector across item sets is the sufficient statistic for the person parameter. Simulation studies are used to show the implementation of conditional estimation solution equations based on the relevant form of the Rasch model. The model is also applied to two numeracy tests attempted by a group of common persons in a large-scale testing program. The results show improved fit compared with the Rasch model in its standard form. They also show the units of the scales were more accurately equated. The paper discusses implications for applied measurement using Rasch models and contrasts the approach with the application of the two parameter logistic (2PL) model.

19.
Variable construction requires careful attention to substantive issues: a theory guiding its development, a hierarchy of illustrative items constructed to define the variable, the subsequent production of item difficulties and person measures, and the analysis of fit. Rasch measurement practitioners should give careful attention to these matters, so practical suggestions are given for designing variables based on theory, item construction, and Rasch models for the analysis of data. Variable maps are emphasized to guide variable construction and interpret the results.

20.
The Rasch measurement model using dichotomous scoring of item response data from a newly created Mobility Scale administered to elderly independent living individuals is presented. The dichotomous scoring model, item calibration, person calibration, logit scale, normative scale score, reliability, and validity are explained. Results indicated that additional activity statements need to be written and tested to improve the Mobility Scale instrument.
