Similar articles
 20 similar articles found (search time: 218 ms)
1.
A conventional way to analyze item responses in multiple tests is to apply unidimensional item response models separately, one test at a time. This unidimensional approach, which ignores the correlations between latent traits, yields imprecise measures when tests are short. To resolve this problem, one can use multidimensional item response models that use correlations between latent traits to improve measurement precision of individual latent traits. The improvements are demonstrated using 2 empirical examples. It appears that the multidimensional approach improves measurement precision substantially, especially when tests are short and the number of tests is large. To achieve the same measurement precision, the multidimensional approach needs less than half of the comparable items required for the unidimensional approach. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
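The gain in precision described in this abstract can be illustrated with a deliberately simplified normal-theory analogy. This is a sketch under stated assumptions, not the item response model used in the article: two latent traits are taken to be standard bivariate normal with correlation `rho`, and each test is reduced to a single noisy score with a known error variance. The function name and parameters are hypothetical.

```python
def posterior_variance_trait1(rho, err_var1, err_var2):
    """Posterior variance of trait 1 given noisy scores on both tests.

    Assumed model (illustration only): traits ~ N(0, [[1, rho], [rho, 1]]),
    score_i = trait_i + error_i, error_i ~ N(0, err_var_i), |rho| < 1.
    """
    det = 1.0 - rho * rho
    # Posterior precision = prior precision + measurement precision.
    p11 = 1.0 / det + 1.0 / err_var1
    p22 = 1.0 / det + 1.0 / err_var2
    p12 = -rho / det
    # Element (1, 1) of the inverse of the 2x2 posterior precision matrix.
    return p22 / (p11 * p22 - p12 * p12)


# With rho = 0 the second test carries no information about trait 1.
separate = posterior_variance_trait1(0.0, 1.0, 1.0)
# With correlated traits the second test reduces the uncertainty.
joint = posterior_variance_trait1(0.8, 1.0, 1.0)
print(separate, joint)
```

Under these assumptions the posterior variance shrinks as the correlation grows, which mirrors the qualitative claim of the abstract; the size of the gain for real item response data depends on the model and the items.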

2.
[Correction Notice: An erratum for this article was reported in Vol 4(2) of Psychological Methods (see record 2008-09593-001). This article contained errors in a series of equations. The corrected equations are provided.] Models of the Wiener simplex type are described for a single participant's change in a multiwave study. Individual test score change models for continuous scores are based on classical test theory and for discrete scores on the binomial error model. Subsequently, these models are generalized to individual item response change models. In addition, specific models are specified for the item change parameters. The emphasis is on single-subject change models, but they can be extended to population models by making assumptions at the population level. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

3.
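Test information is an important concept in item response theory and an important means of predicting and evaluating the size of a test's measurement error. For the logistic model and the normal ogive model, both widely used in testing, this paper gives the corresponding expressions for the test information function and applies them to the analysis of examination paper quality, providing a reference for the quantitative analysis and selection of examination papers.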

4.
Generalized linear item response theory is discussed, which is based on the following assumptions: (1) A distribution of the response occurs according to given item format; (2) the item responses are explained by 1 continuous or nominal latent variable and p latent as well as observed variables that are continuous or nominal; (3) the responses to the different items of a test are independently distributed given the values of the explanatory variables; and (4) a monotone differentiable function g of the expected item response τ is needed such that a linear combination of the explanatory variables is a predictor of g(τ). It is shown that most of the well-known psychometric models are special cases of the generalized theory and that concepts such as differential item functioning, specific objectivity, reliability, and information can be subsumed under the generalized theory. (PsycINFO Database Record (c) 2011 APA, all rights reserved)

5.
Reports an error in "The measurement of individual change" by Gideon J. Mellenbergh and Wulfert P. van den Brink (Psychological Methods, 1998[Dec], Vol 3[4], 470-485). This article contained errors in a series of equations. The corrected equations are provided in the erratum. (The following abstract of the original article appeared in record 1998-11538-005.) Models of the Wiener simplex type are described for a single participant's change in a multiwave study. Individual test score change models for continuous scores are based on classical test theory and for discrete scores on the binomial error model. Subsequently, these models are generalized to individual item response change models. In addition, specific models are specified for the item change parameters. The emphasis is on single-subject change models, but they can be extended to population models by making assumptions at the population level. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

6.
7.
"Two forms of a 20-item test of creativity were developed through analyses of item response data of 345 engineering students at Purdue University. Three scores were developed for the test: Fluency score, Flexibility score, and Originality score. Investigations of the validity, reliability, interscorer agreement, relationships with other tests, and 'face validity' of the Creativity scores were made with 64 product development engineers and process engineers in a large automobile accessories manufacturing company." Significant validity was found. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

8.
Person-fit statistics have been proposed to investigate the fit of an item score pattern to an item response theory (IRT) model. The author investigated how these statistics can be used to detect different types of misfit. Intelligence test data were analyzed using person-fit statistics in the context of the G. Rasch (1960) model and R. J. Mokken's (1971, 1997) IRT models. The effect of the choice of an IRT model to detect misfitting item score patterns and the usefulness of person-fit statistics for diagnosis of misfit are discussed. Results showed that different types of person-fit statistics can be used to detect different kinds of person misfit. Parametric person-fit statistics had more power than nonparametric person-fit statistics. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
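To make the idea of a nonparametric person-fit statistic concrete, here is a small sketch that counts Guttman errors in one person's item score pattern. The abstract does not say which statistics the author used, so the choice of this count is an assumption, and the function name and data are hypothetical. A Guttman error is a pair of items where the person fails the easier item but passes the harder one.

```python
def guttman_errors(scores, p_values):
    """Count Guttman errors in a dichotomous item score pattern.

    scores   -- 0/1 item scores for one person
    p_values -- proportion correct per item in the group (higher = easier)
    """
    # Order the items from easiest to hardest.
    order = sorted(range(len(scores)), key=lambda i: -p_values[i])
    errors = 0
    zeros_seen = 0
    for i in order:
        if scores[i] == 0:
            zeros_seen += 1
        else:
            # Each earlier (easier) failed item forms an error pair
            # with this passed (harder) item.
            errors += zeros_seen
    return errors


p_values = [0.9, 0.8, 0.4, 0.2]  # hypothetical item proportions correct
print(guttman_errors([1, 1, 0, 0], p_values))  # perfectly consistent pattern
print(guttman_errors([0, 0, 1, 1], p_values))  # fully reversed pattern
```

A large count relative to the maximum possible for the person's total score suggests an aberrant pattern; normed variants of this count exist, but they are outside this sketch.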

9.
Current interest in the assessment of measurement equivalence emphasizes 2 major methods of analysis. The authors offer a comparison of a linear method (confirmatory factor analysis) and a nonlinear method (differential item and test functioning using item response theory) with an emphasis on their methodological similarities and differences. The 2 approaches test for the equality of true scores (or expected raw scores) across 2 populations when the latent (or factor) score is held constant. Both approaches can provide information about when measurement nonequivalence exists and the extent to which it is a problem. An empirical example is used to illustrate the 2 approaches. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

10.
"On the basis of the psychoanalytic concept of primary process, it was predicted that schizophrenics would show a greater tendency than normals to treat both antonyms and homonyms as synonyms. The instrument used was a multiple-choice paper-and-pencil test. Each item required the subject to select a synonym from among three alternatives, a synonym, either an antonym or a homonym, and a third word which was neither… . The schizophrenics exceeded the normal subjects on this corrected score for both the antonym task and the homonym task. A possible non-Freudian interpretation of the data in terms of learning theory generalization is discussed." From Psyc Abstracts 36:01:3JQ55B. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

11.
Source monitoring refers to the discrimination of the origin of information. Multinomial models of source monitoring (W. H. Batchelder & D. M. Riefer, 1990) are theories of the decision processes involved in source monitoring that provide separate parameters for source discrimination, item detection, and response biases. Three multinomial models of source monitoring based on different models of decision in a simple detection paradigm (one-high-threshold, low-threshold, and two-high-threshold models) were subjected to empirical tests. With a 3 (distractor similarity) × 3 (source similarity) factorial design, the effect of difficulty of item detection and source discrimination on corresponding model parameters was examined. Only the source-monitoring model that is based on a two-high-threshold model of item recognition provides an accurate analysis of the data. Consequences for the use of multinomial models in the study of source monitoring are discussed. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

12.
The time course of availability of associative and item information was examined by using a response signal procedure. Associative information discriminates between a studied pair of words and a pair with words from two different studied pairs. Item information is sufficient to discriminate between a studied pair and a pair not studied. In two experiments, discriminations that require associative information are delayed relative to those based on item information. Two additional experiments discount alternative explanations in terms of the time to encode the test items or task strategies. Examination of the global memory models of G. Gillund and R. M. Shiffrin, D. L. Hintzman, and B. B. Murdock (see PA, Vols 71:8340, 76:10832, and 69:4936, respectively) shows that the models treat item and associative information inseparably. Modifications to these models which can produce separate contributions for item and associative information do not predict any difference in their availability. Two possible mechanisms for the delayed availability of associative information are considered: the involvement of recall in recognition and the time required to form a compound cue. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

13.
Short tests containing at most 15 items are used in clinical and health psychology, medicine, and psychiatry for making decisions about patients. Because short tests have large measurement error, the authors ask whether they are reliable enough for classifying patients into a treatment and a nontreatment group. For a given certainty level, proportions of correct classifications were computed for varying test length, cut-scores, item scoring, and choices of item parameters. Short tests were found to classify at most 50% of a group consistently. Results were much better for tests containing 20 or 40 items. Small differences were found between dichotomous and polytomous (5 ordered scores) items. It is recommended that short tests for high-stakes decision making be used in combination with other information so as to increase reliability and classification consistency. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

14.
We have developed a set of naming and recognition tests for evaluating the retrieval of lexical and conceptual knowledge for actions. As a first step, normative information about 280 items was collected for the following variables: (1) the naming responses elicited by each item, (2) the degree to which the image of each item agreed with a target name, (3) the familiarity of each depicted action, and (4) the visual complexity of each item. This information was used to develop administration and scoring procedures for a standardized test of action naming. The effectiveness and reliability of these procedures were evaluated in a second experiment. In a third experiment, five tests were developed to probe the retrieval of conceptual knowledge: (1) independently of the production of a naming response, (2) in response to pictorial and nonpictorial stimuli, (3) in terms of the attributes associated with specific actions, and (4) in terms of similarities and differences between various actions.

15.
Two models that can be used for exploratory factor analysis of items with a dichotomous response format are discussed: threshold models and multidimensional item response models. The models arise from different traditions: The threshold model is rooted in the factor analytic tradition, whereas the multidimensional item response model has its foundation in item response theory. Despite the different origins, it can be proved that both models are the same. Subsequently, the generalized multidimensional Rasch model is introduced. This model can be used for confirmatory factor analysis of items with a dichotomous response format. Stated otherwise, it is the confirmatory counterpart of the (exploratory) threshold and multidimensional item response models. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

16.
In assessments of attitudes, personality, and psychopathology, unidimensional scale scores are commonly obtained from Likert scale items to make inferences about individuals' trait levels. This study approached the issue of how best to combine Likert scale items to estimate test scores from the practitioner's perspective: Does it really matter which method is used to estimate a trait? Analyses of 3 data sets indicated that commonly used methods could be classified into 2 groups: methods that explicitly take account of the ordered categorical item distributions (i.e., partial credit and graded response models of item response theory, factor analysis using an asymptotically distribution-free estimator) and methods that do not distinguish Likert-type items from continuously distributed items (i.e., total score, principal component analysis, maximum-likelihood factor analysis). Differences in trait estimates were found to be trivial within each group. Yet the results suggested that inferences about individuals' trait levels differ considerably between the 2 groups. One should therefore choose a method that explicitly takes account of item distributions in estimating unidimensional traits from ordered categorical response formats. Consequences of violating distributional assumptions were discussed. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

17.
Evaluated the effects of increasing scale reliability and of priming dimensional activation on the construct validity of personality test item differential response latencies. 93 undergraduates were computer-administered items from the 3 lengthened scales of the Personality Research Form. Findings show that differential response latencies can demonstrate remarkably strong evidence for their construct validity. In addition, the priming of personality traits may enhance the construct validity of their corresponding differential response latencies. Results support a model of test item responding where differential item response latencies reflect the interaction of an individual's schema with test item content. (French abstract) (PsycINFO Database Record (c) 2011 APA, all rights reserved)

18.
Examined, using item response theory, the measurement qualities of the Mississippi Scale for Combat-Related Posttraumatic Stress Disorder, with data taken from the 2,348 veteran participants in the National Vietnam Veterans Readjustment Study. Using F. Samejima's (1969) graded response model, estimates of each item's discrimination and difficulty parameters were derived, and item and test information functions were then computed. Various item information patterns and sample items are discussed in terms of improved assessment of posttraumatic stress disorder (PTSD). (PsycINFO Database Record (c) 2010 APA, all rights reserved)
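The graded response model named in this abstract can be sketched as follows. In its usual logistic form, each ordered category boundary has its own threshold, the cumulative probability of responding at or above a boundary is a logistic curve, and category probabilities are differences of adjacent cumulative probabilities. The parameter values below are invented for illustration and are not estimates from the study.

```python
import math


def grm_category_probs(theta, a, thresholds):
    """Category probabilities under a logistic graded response model.

    theta      -- latent trait level
    a          -- item discrimination
    thresholds -- increasing boundary parameters b_1 < ... < b_(m-1)
    Returns m probabilities, one per ordered response category.
    """
    # Cumulative probabilities P(X >= k), padded with 1 at the bottom
    # and 0 at the top so that adjacent differences give each category.
    cumulative = [1.0]
    cumulative += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cumulative.append(0.0)
    return [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]


# Hypothetical 4-category item with discrimination 1.5.
probs = grm_category_probs(theta=0.5, a=1.5, thresholds=[-1.0, 0.0, 1.0])
print(probs, sum(probs))
```

The thresholds must be in increasing order; otherwise some category probabilities would come out negative.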

19.
RATIONALE AND OBJECTIVES: Evaluation of uncued multiple-choice questions (UMCQ) was compared with traditional multiple-choice questions (MCQ) for assessing medical student performance during radiology electives. Methods for analyzing and improving the quality of UMCQ examinations are described. METHODS: The authors compared the performance of radiology medical students on similarly constructed MCQ and UMCQ tests. For the UMCQ examination, the reliability (coefficient alpha), standard error of measurement, item difficulty index, and corrected item-to-total test coefficient (point biserial correlation) were analyzed. RESULTS: Students' level of performance was lower on UMCQs (mean percent correct score = 68.9 +/- 10.2 standard deviation [SD]) than on MCQs (mean percent correct score = 75.6 +/- 12.4 SD). Coefficient alpha for the UMCQ format was .7690 (standard error of measurement mean = 4.89%). Analysis of the item difficulty index and point biserial correlation for each test item provided information for improving the quality of the UMCQ examination. CONCLUSIONS: Because the UMCQ measures students' abilities to recall critical information without providing cues, this format can be used to overcome some of the limitations of conventional MCQs. With simple computerization, analysis of UMCQ testing instruments provides important feedback to both the examinees and the examiner.
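The four classical item-analysis quantities listed in this abstract can be computed from a persons-by-items matrix of 0/1 scores. The sketch below uses the textbook definitions with population variances; the tiny data set is made up, and nothing here reproduces the authors' software or results.

```python
import math
from statistics import mean, pvariance


def coefficient_alpha(data):
    """Cronbach's coefficient alpha for a persons-by-items score matrix."""
    k = len(data[0])
    item_vars = [pvariance([row[j] for row in data]) for j in range(k)]
    total_var = pvariance([sum(row) for row in data])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def standard_error_of_measurement(data):
    """SEM = SD of total scores times sqrt(1 - reliability)."""
    total_sd = math.sqrt(pvariance([sum(row) for row in data]))
    return total_sd * math.sqrt(1 - coefficient_alpha(data))


def item_difficulty(data, j):
    """Item difficulty index: proportion of persons answering item j correctly."""
    return mean(row[j] for row in data)


def corrected_item_total(data, j):
    """Point-biserial correlation of item j with the total of the other items."""
    item = [row[j] for row in data]
    rest = [sum(row) - row[j] for row in data]
    mi, mr = mean(item), mean(rest)
    cov = mean((x - mi) * (y - mr) for x, y in zip(item, rest))
    return cov / math.sqrt(pvariance(item) * pvariance(rest))


# Hypothetical scores: 4 examinees (rows) by 3 items (columns).
data = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(coefficient_alpha(data), standard_error_of_measurement(data))
print(item_difficulty(data, 0), corrected_item_total(data, 1))
```

The functions assume that total scores and item scores vary across examinees; a constant column or identical rows would make a variance zero and the ratios undefined.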

20.
The Mattis Dementia Rating Scale (MDRS) is a commonly used cognitive measure designed to assess the course of decline in progressive dementias. However, little information is available about possible systematic racial bias on the items presented in this test. We investigated race as a potential source of test bias and differential item functioning in 40 pairs of African American and Caucasian dementia patients (N = 80), matched on age, education, and gender. Principal component analysis revealed similar patterns and magnitudes across component loadings for each racial group, indicating no clear evidence of test bias on account of race. Results of an item analysis of the MDRS revealed differential item functioning across groups on only 4 of 36 items, which may potentially be dropped to produce a modified MDRS that may be less sensitive to cultural factors. Given the absence of test bias because of race, the observed racial differences on the total MDRS score are most likely associated with group differences in dementia severity. We conclude that the MDRS shows no appreciable evidence of test bias and minimal differential item functioning (item bias) because of race, suggesting that the MDRS may be used in both African American and Caucasian dementia patients to assess dementia severity.
