20 similar documents found (search time: 984 ms)
1.
2.
The quartile method and the iterative method are both robust statistical approaches for calculating the target standard deviation of proficiency-testing data, and their results are used directly to judge whether the data reported by laboratories are satisfactory. Using 195 sets of proficiency-testing data covering more than 100 test items, the results of the two robust methods were compared. Both simulated and real data show that when the data are approximately normally distributed the two methods agree closely, but for test items whose data clearly depart from normality the standard deviation obtained by the quartile method is too strict, markedly lowering the satisfaction rate of the reported results and producing statistical "false rejection" errors. Taking classical statistics after removal of outliers as the reference, the reasonableness of the two methods was assessed. For solid samples the two methods give similar standard deviations, with a mean relative deviation of 5.7%; for liquid samples the mean relative deviation is 13.8%, a substantially larger difference. The iterative method both reduces the influence of outliers on the statistical results and avoids the subjectivity introduced by choosing among different outlier-rejection rules. It is recommended that the iterative method be incorporated into China's evaluation standard system for proficiency-testing data as soon as possible.
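The two robust estimators compared in this abstract can be sketched in Python. The quartile method takes the normalized interquartile range (0.7413 × IQR) as the robust standard deviation, while the iterative method follows the general scheme of Algorithm A in ISO 13528 (median start, winsorization, re-estimation). Exact constants and quartile conventions vary between standards, so this is an illustrative sketch, not the evaluation procedure used in the study:

```python
import statistics

def niqr_sigma(data):
    """Quartile method: robust sigma = 0.7413 x IQR (normalized IQR).

    Quartile conventions differ slightly between standards; this uses
    Python's default 'exclusive' method.
    """
    q1, _, q3 = statistics.quantiles(sorted(data), n=4)
    return 0.7413 * (q3 - q1)

def algorithm_a(data, tol=1e-6, max_iter=100):
    """Iterative robust mean/sigma in the style of ISO 13528 Algorithm A."""
    xs = sorted(data)
    x_star = statistics.median(xs)
    s_star = 1.483 * statistics.median([abs(v - x_star) for v in xs])
    for _ in range(max_iter):
        delta = 1.5 * s_star
        # Winsorize: pull values beyond x* +/- delta back to the boundary,
        # so outliers influence but do not dominate the estimates.
        w = [min(max(v, x_star - delta), x_star + delta) for v in xs]
        new_x = statistics.fmean(w)
        new_s = 1.134 * statistics.stdev(w)
        converged = abs(new_x - x_star) < tol and abs(new_s - s_star) < tol
        x_star, s_star = new_x, new_s
        if converged:
            break
    return x_star, s_star
```

On a data set with one gross outlier, both robust estimates stay close to the bulk of the results, whereas the classical standard deviation is inflated by the outlier, which is the behavior the abstract's comparison turns on.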
3.
4.
5.
6.
7.
8.
By statistically analyzing the results of an interlaboratory-comparison proficiency test on the physical properties of cement among testing laboratories in the nonferrous-metals construction industry, this paper objectively assesses the testing capability of the industry's laboratories and the issues requiring attention in individual test items, providing useful guidance for improving testing capability in the industry.
9.
《冶金分析》2008,28(8)
The 中实国金 International Laboratory Proficiency Testing Research Center (NIL) specializes in laboratory proficiency testing and measurement audits. Adhering to a quality policy of being scientific, rigorous, objective, and impartial, it was accredited as a proficiency-testing provider by the China National Accreditation Service for Conformity Assessment in August 2006 (No.CNASPT0002), becoming the first independent third-party legal entity in China qualified as a proficiency-testing provider. NIL organizes domestic and international proficiency-testing activities in accordance with ISO/IEC Guide 43:1997, Proficiency Testing by Interlaboratory Comparisons, and the general requirements of ILAC G13:2000, issuing objective and impartial evaluation reports to participating laboratories; the results of its proficiency tests and measurement audits can serve as important evidence for accreditation and qualification bodies in judging the testing capability of laboratories and inspection bodies.
10.
Proficiency testing is an activity that evaluates participants' capability against pre-established criteria by means of interlaboratory comparisons. The quartile method and the iterative method are both robust statistical methods for analyzing proficiency-testing data; they can be used to calculate the target standard deviation and to evaluate whether the data reported by laboratories are satisfactory. The results of the two robust methods were compared on data from multiple proficiency-testing schemes. The analysis shows that when the ratio of the normalized interquartile range of the reported data to the reproducibility standard deviation from an empirical model (the H value) exceeds 2, or the robust coefficient of variation CV exceeds 0.05, testing capability differs considerably among laboratories and the proficiency-testing data are widely dispersed: the iterative calculation requires more than one iteration, the ratio of the normalized interquartile range to the robust standard deviation from the iterative method exceeds 1, and the quartile method loosens the evaluation criterion.
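The numerical screening criteria quoted in this abstract can be written down directly. The thresholds (H > 2, robust CV > 0.05, NIQR/robust SD > 1) are the ones stated above; the function name and input layout are illustrative:

```python
def dispersion_check(niqr, sigma_R, robust_mean, robust_sd):
    """Flag dispersed proficiency-testing data using the abstract's criteria.

    niqr     -- normalized interquartile range of the reported data
    sigma_R  -- reproducibility standard deviation from an empirical model
    robust_mean, robust_sd -- robust location/scale from the iterative method
    """
    H = niqr / sigma_R             # H > 2: large between-lab differences
    cv = robust_sd / robust_mean   # robust CV > 0.05: dispersed data
    ratio = niqr / robust_sd       # > 1: quartile method loosens the criterion
    return {"H": H, "CV": cv, "NIQR/robust_sd": ratio,
            "dispersed": H > 2 or cv > 0.05}
```

In the dispersed regime flagged here, the quartile-based target standard deviation exceeds the iterative one, which is how the quartile method ends up relaxing the evaluation criterion.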
11.
Using proficiency testing (including measurement audits and comparison experiments) to assess technical competence in test items is one means of laboratory quality control. Many domestic and international standards and specifications deal with the analysis and evaluation of proficiency-testing results, among which judging results by the En number is a basic method. Because real test objects are complex, certain situations are often overlooked when the En number is used for judging results, and the reasonableness of the judgment also deserves attention; these issues are closely tied to the application of measurement uncertainty. Drawing on examples from metallurgical testing, this paper explains the points to note when judging results by the En number, and also analyzes and discusses the application of measurement uncertainty in actual testing, in the hope of offering useful ideas.
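The En number referred to above is conventionally defined as the difference between the laboratory and reference values divided by the root-sum-square of their expanded uncertainties, with |En| ≤ 1 judged satisfactory. A minimal sketch (function name mine):

```python
import math

def en_number(x_lab, x_ref, U_lab, U_ref):
    """En = (x_lab - x_ref) / sqrt(U_lab**2 + U_ref**2).

    U_lab and U_ref are the expanded (typically k = 2) uncertainties of the
    laboratory and reference values; |En| <= 1 is conventionally satisfactory.
    """
    return (x_lab - x_ref) / math.sqrt(U_lab**2 + U_ref**2)

# Note how an overstated U_lab shrinks |En|: a laboratory reporting an
# unrealistically large uncertainty can "pass" on a poor result -- one of
# the often-overlooked situations this abstract warns about.
```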
12.
Based on laboratory participation in proficiency-testing schemes for heavy-metal elements in soil (Ni, Cu, Zn, Pb, Cd, Cr, Hg, As) during 2013–2018, the evaluation results and the standards and test methods used for each element were comprehensively compared, and the level and development trend of soil heavy-metal testing in China were analyzed. The results show that from 2013 to 2018 the competence of China's soil heavy-metal testing laboratories improved year by year: the unsatisfactory rate in proficiency-testing evaluations fell from 8.6% in 2013 to 3.8% in 2018; the average satisfaction rate was 92.9% for accredited laboratories versus 88.3% for non-accredited ones; national standards and environmental standards dominated the test methods used, adopted by 57% and 20% of laboratories respectively, followed by industry standards for geology and mining, urban construction, and agriculture, while adoption of international standards is gradually growing and has reached 4%.
13.
14.
In proficiency testing for ambient-air measurements, because of the influence of gas containers, filling methods, and similar factors, gas samples are usually prepared bottle by bottle, so the assigned value differs somewhat from bottle to bottle; gas proficiency-testing results are therefore usually evaluated by the En-value method. With this method, the laboratory's measurement uncertainty directly affects the evaluation result, so correct evaluation of measurement uncertainty by the laboratory is a prerequisite for the En method to be used correctly and reasonably. Taking a proficiency-testing scheme for sulfur dioxide in air as an example, the relationship between the En value and uncertainty was analyzed to determine the effective range of laboratory measurement uncertainty. On this basis, the standard deviations for proficiency assessment of the two laboratory groups were 0.64 μmol/mol and 1.23 μmol/mol, and the effective uncertainty ranges were 0.34–1.92 μmol/mol and 0.66–3.69 μmol/mol, providing a reference for the effective use of the En method in evaluating laboratory results and for guiding laboratories in correctly evaluating measurement uncertainty.
15.
JB Bjorner S Kreiner JE Ware MT Damsgaard P Bech 《Canadian Metallurgical Quarterly》1998,51(11):1189-1202
Statistical analyses of Differential Item Functioning (DIF) can be used for rigorous translation evaluations. DIF techniques test whether each item functions in the same way, irrespective of the country, language, or culture of the respondents. For a given level of health, the score on any item should be independent of nationality. This requirement can be tested through contingency-table methods, which are efficient for analyzing all types of items. We investigated DIF in the Danish translation of the SF-36 Health Survey, using two general population samples (USA, n = 1,506; Denmark, n = 3,950). DIF was identified for 12 out of 35 items. These results agreed with independent ratings of translation quality, but the statistical techniques were more sensitive. When included in scales, the items exhibiting DIF had only a little impact on conclusions about cross-national differences in health in the general population. However, if used as single items, the DIF items could seriously bias results from cross-national comparisons. Also, the DIF items might have larger impact on cross-national comparison of groups with poorer health status. We conclude that analysis of DIF is useful for evaluating questionnaire translations.
16.
Change in adult intellectual performance was assessed with longitudinal data from the Intergenerational Studies at the Institute of Human Development. Wechsler Intelligence data from two age cohorts spanning ages 18 to 61 were analyzed at the subtest and item level. Hotelling T² analyses on sets of equivalent items from Wechsler subtests were studied to determine if change in response occurred between pairwise combinations of occasions of test administrations. We used Bowker's test to analyze data at the item level to determine the direction of change in performance. Consistent improvement in performance occurred between the ages of 18–40 and 18–54. Between the ages of 40 and 61, results showed mostly improved performance on the Information, Comprehension, and Vocabulary subtests, mixed change on the Picture Completion subtest, and decline on the Digit Symbol and Block Design subtests. The pattern of mixed change on the Picture Completion subtest indicated improvement on the easy items and decline on the difficult items. Decline in performance on the Block Design test occurred only for the most difficult items. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
17.
To address the large number, complex relationships, and inconvenient use of the technical standards that guide production in special-steel enterprises, an MES-oriented solution for the standardized, digital management of special-steel enterprise standards is proposed. The solution digitizes the management of standard content closely related to production and inspection, and can also monitor and feed back the execution of standards in real time. The system was designed and implemented; its deployment has reduced missed inspections, erroneous inspections, and misjudgments in the production process, so that the standards are well implemented.
18.
Oltman Philip K.; Stricker Lawrence J.; Barrows Thomas S. 《Canadian Metallurgical Quarterly》1990,75(1):21
Multidimensional scaling was used to analyze item response data for the Test of English as a Foreign Language (TOEFL) to uncover the dimensions underlying the test. Four dimensions were identified for samples varying in native language and level of English proficiency: 3 corresponded to the test's sections and 1 was an end-of-test phenomenon. Dimensions were predominantly defined by easy items and were most important for low-scoring examinees. The dimensions' importance did not differ across language groups, except for the end-of-test dimension. Major conclusions were that (a) the TOEFL measures the intended constructs; (b) the test assesses the same constructs in each language group, but the constructs are more differentiated for low-scorers; and (c) easy and difficult items differ in what they measure. Multidimensional scaling appears to be a useful method for item-level analyses of test structure. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
19.
A rating technique measuring feelings of knowing and a guilty knowledge polygraph test were used to distinguish between laboratory Ss who were either simulating amnesia or were genuinely amnesic to information contained in an account of a rape. Ss were 40 university students. Each S denying knowledge of an item rated from 1 to 7 the likelihood of recalling the item if given more time, a hint, or the item amongst similar items. In a 2nd interview with the polygraph, questions of which that S had denied memory were asked. Analyses revealed no differences between the feeling-of-knowing ratings given by genuine or simulating amnesics but found that skin resistance changes occurred more frequently to critical items on the guilty knowledge test with simulators than with those who were genuinely amnesic. (French abstract) (PsycINFO Database Record (c) 2010 APA, all rights reserved)
20.
Reise Steven P.; Ventura Joseph; Keefe Richard S. E.; Baade Lyle E.; Gold James M.; Green Michael F.; Kern Robert S.; Mesholam-Gately Raquelle; Nuechterlein Keith H.; Seidman Larry J.; Bilder Robert 《Canadian Metallurgical Quarterly》2011,23(1):245
A psychometric analysis of 2 interview-based measures of cognitive deficits was conducted: the 21-item Clinical Global Impression of Cognition in Schizophrenia (CGI-CogS; Ventura et al., 2008), and the 20-item Schizophrenia Cognition Rating Scale (SCoRS; Keefe et al., 2006), which were administered on 2 occasions to a sample of people with schizophrenia. Traditional psychometrics, bifactor analysis, and item response theory methods were used to explore item functioning and dimensionality and to compare instruments. Despite containing similar item content, responses to the CGI-CogS demonstrated superior psychometric properties (e.g., higher item intercorrelations, better spread of ratings across response categories) relative to the SCoRS. The authors argue that these differences arise mainly from the differential use of prompts and how the items are phrased and scored. Bifactor analysis demonstrated that although both measures capture a broad range of cognitive functioning (e.g., working memory, social cognition), the common variance on each is overwhelmingly explained by a single general factor. Item response theory analyses of the combined pool of 41 items showed that measurement precision is peaked in the mild to moderate range of cognitive impairment. Finally, simulated adaptive testing revealed that only about 10 to 12 items are necessary to achieve latent trait level estimates with reasonably small standard errors for most individuals. This suggests that these interview-based measures of cognitive deficits could be shortened without loss of measurement precision. (PsycINFO Database Record (c) 2011 APA, all rights reserved)