Similar Documents
20 similar documents found.
1.
Software process assessments are by now a prevalent tool for process improvement and contract risk assessment in the software industry. Given that scores are assigned to processes during an assessment, a process assessment can be considered a subjective measurement procedure. As with any subjective measurement procedure, the reliability of process assessments has important implications for the utility of assessment scores, and therefore reliability can be taken as a criterion for evaluating an assessment's quality. The particular type of reliability of interest in this paper is interrater agreement. Thus far, empirical evaluations of the interrater agreement of assessments have used Cohen's Kappa coefficient. Once a Kappa value has been derived, the next question is "how good is it?" Benchmarks for interpreting the obtained values of Kappa are available from the social sciences and medical literature. However, the applicability of these benchmarks to the software process assessment context is not obvious. In this paper we develop a benchmark for interpreting Kappa values using data from ratings of 70 process instances collected from assessments of 19 different projects in 7 different organizations in Europe during the SPICE Trials (an international effort to empirically evaluate the emerging ISO/IEC 15504 International Standard for Software Process Assessment). The benchmark indicates that Kappa values below 0.45 are poor, and values above 0.62 constitute substantial agreement and should be the minimum aimed for. This benchmark can be used to decide how good an assessment's reliability is.
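A minimal sketch (not the SPICE Trials tooling) of how an unweighted Cohen's Kappa for two assessors can be computed and then read against the benchmark above; the rating data and the four-category N/P/L/F adequacy scale used here are illustrative assumptions.

```python
from collections import Counter

def cohens_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two raters scoring the same process instances."""
    n = len(ratings_a)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum((freq_a[c] / n) * (freq_b[c] / n) for c in set(freq_a) | set(freq_b))
    return (observed - expected) / (1 - expected)

def interpret(kappa):
    """Benchmark reported above; values between the thresholds are left unclassified here."""
    if kappa < 0.45:
        return "poor"
    if kappa > 0.62:
        return "substantial"
    return "between the benchmark thresholds"

# Hypothetical ratings of eight process instances on a four-point N/P/L/F scale.
rater1 = ["F", "L", "P", "N", "L", "F", "P", "L"]
rater2 = ["F", "L", "L", "N", "L", "F", "P", "P"]
k = cohens_kappa(rater1, rater2)
print(f"kappa = {k:.2f} -> {interpret(k)}")   # kappa = 0.65 -> substantial
```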

2.
Nursing personnel are at high risk of work-related musculoskeletal disorders, especially back symptoms. Handling patients has been established as one of the factors playing an important role in the etiology of occupational low back pain. The aim of this study was to develop an instrument for patient handling assessment and to determine its validity and reliability. Instrument validity was established on the basis of content and construct validity. Reliability was estimated through homogeneity, stability (test-retest) and equivalence (interrater) tests. Reliability estimated by internal consistency reached a Cronbach's alpha coefficient of 0.81. Pearson's correlation coefficient for test-retest reliability was r = 0.92. There was excellent agreement between observers, according to the Kappa value (Kappa = 0.92). Interobserver (interrater) reliability was assessed by Pearson's correlation coefficient, reaching an r value of 0.84. The agreement between the two observers was also fairly good (Kappa = 0.84). The results of the current study show that the instrument appears to be reliable and valid for patient handling assessment.
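A minimal sketch (not the authors' analysis) of the Cronbach's alpha statistic reported above (0.81), computed from hypothetical item scores for a patient-handling instrument; the number of items and the scores are made up for illustration.

```python
import numpy as np

def cronbach_alpha(scores):
    """scores: 2-D array, rows = respondents, columns = instrument items."""
    scores = np.asarray(scores, dtype=float)
    n_items = scores.shape[1]
    item_variances = scores.var(axis=0, ddof=1)      # variance of each item
    total_variance = scores.sum(axis=1).var(ddof=1)  # variance of the total score
    return (n_items / (n_items - 1)) * (1 - item_variances.sum() / total_variance)

# 6 respondents x 4 items, scored 1-5 (made-up numbers).
data = [[4, 5, 4, 4],
        [3, 3, 4, 3],
        [5, 5, 5, 4],
        [2, 3, 2, 3],
        [4, 4, 5, 4],
        [3, 2, 3, 3]]
print(f"Cronbach's alpha = {cronbach_alpha(data):.2f}")   # about 0.92 for these made-up scores
```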

3.
This paper examines the interrater reliability of a quantitative observational method for assessing the non-neutral postures required by work tasks. Two observers independently evaluated 70 jobs in an automotive manufacturing facility, using a procedure that included observations of 18 postures of the upper extremities and back. Interrater reliability was evaluated using percent agreement, kappa, intraclass correlation coefficients and generalized linear mixed modeling. Interrater agreement ranged from 26% for right shoulder elevation to 99% for left wrist flexion, but agreement was at best moderate when assessed with kappa. Percent agreement is an inadequate measure, because it does not account for chance and can lead to inflated estimates of reliability. The use of more appropriate statistical methods may lead to greater insight into sources of variability in reliability and validity studies and may help to develop more effective ergonomic exposure assessment methods. Interrater reliability was acceptable for some of the postural observations in this study.
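To illustrate the point about percent agreement, here is a minimal sketch (with made-up ratings, not the study's data) showing how a rarely observed posture can yield high raw agreement yet only modest chance-corrected kappa.

```python
from collections import Counter

def percent_agreement(a, b):
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohens_kappa(a, b):
    n = len(a)
    po = percent_agreement(a, b)
    fa, fb = Counter(a), Counter(b)
    pe = sum((fa[c] / n) * (fb[c] / n) for c in set(fa) | set(fb))
    return (po - pe) / (1 - pe)

# 20 jobs: "absent" (0) is by far the most common rating for this posture.
obs1 = [0] * 16 + [1, 1, 0, 0]
obs2 = [0] * 16 + [1, 0, 1, 0]
print(f"percent agreement = {percent_agreement(obs1, obs2):.0%}")  # 90%
print(f"kappa             = {cohens_kappa(obs1, obs2):.2f}")       # 0.44
```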

4.
K-means is a well-known and widely used partitional clustering method. While there have been considerable research efforts to characterize the key features of the K-means clustering algorithm, further investigation is needed to understand how data distributions affect the performance of K-means clustering. To that end, in this paper we provide a formal and organized study of the effect of skewed data distributions on K-means clustering. Along this line, we first formally show that K-means tends to produce clusters of relatively uniform size, even if the input data have varied "true" cluster sizes. In addition, we show that some clustering validation measures, such as the entropy measure, may not capture this uniformizing effect and can provide misleading information about clustering performance. Viewed in this light, we propose the coefficient of variation (CV) as a necessary criterion for validating clustering results. Our findings reveal that K-means tends to produce clusters in which the variation of cluster sizes, as measured by CV, lies in a range of about 0.3-1.0. Specifically, for data sets with large variation in "true" cluster sizes (e.g., CV ≫ 1.0), K-means reduces the variation in the resultant cluster sizes to less than 1.0. In contrast, for data sets with small variation in "true" cluster sizes (e.g., CV ≪ 0.3), K-means increases the variation in the resultant cluster sizes to more than 0.3. In other words, in both cases K-means produces clustering results that deviate from the "true" cluster size distributions.
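A minimal sketch of the CV criterion on synthetic data with deliberately skewed "true" cluster sizes; the generator, cluster sizes and K-means settings are illustrative assumptions, and the exact numbers will vary, but the CV of the K-means cluster sizes typically falls well below the CV of the true sizes (and below 1.0), as described above.

```python
import numpy as np
from sklearn.cluster import KMeans

def coefficient_of_variation(sizes):
    sizes = np.asarray(sizes, dtype=float)
    return sizes.std() / sizes.mean()

# Three overlapping Gaussian clusters with highly skewed "true" sizes (CV > 1.0).
rng = np.random.default_rng(0)
spec = [((0.0, 0.0), 2.5, 900), ((3.0, 3.0), 0.6, 60), ((-3.0, 3.0), 0.6, 40)]
X = np.vstack([rng.normal(center, spread, size=(n, 2)) for center, spread, n in spec])
true_sizes = [n for _, _, n in spec]

labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
kmeans_sizes = np.bincount(labels)

print("CV of true cluster sizes   :", round(coefficient_of_variation(true_sizes), 2))    # about 1.2
print("CV of K-means cluster sizes:", round(coefficient_of_variation(kmeans_sizes), 2))  # well below 1.0
```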

5.
This work introduces a heuristic index (the "tolerance distance") to define the "closeness" of two variable categories in multiple correspondence analysis (MCA). The index is a weighted Euclidean distance in which the weights are based on the "importance" of each MCA axis, and variable categories are considered to be associated when their distance falls below the tolerance distance. The approach was applied to renal transplantation data. The analysed variables were allograft survival and 13 of its putative predictors. A bootstrap-based stability analysis was employed to assess the reliability of the results. The method identified associations previously detected within the database, such as that between the race of donors and recipients, and that between HLA match and Cyclosporine use. A hierarchical clustering algorithm was also applied to the same data, allowing for interpretations similar to those based on MCA. The defined tolerance distance can thus be used as an index of "closeness" in MCA, thereby decreasing the subjectivity of interpreting MCA results.
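A minimal sketch of the weighted-distance idea (not the authors' implementation): axis weights are taken here to be each MCA axis's share of explained inertia, one plausible reading of axis "importance", and two categories are flagged as associated when their weighted distance falls below a chosen tolerance. All coordinates, weights and the threshold are made up.

```python
import numpy as np

def weighted_distance(p, q, axis_weights):
    p, q, w = map(np.asarray, (p, q, axis_weights))
    return np.sqrt(np.sum(w * (p - q) ** 2))

explained_inertia = np.array([0.42, 0.27, 0.18, 0.13])   # per MCA axis (made up)
weights = explained_inertia / explained_inertia.sum()    # axis "importance" weights

cat_hla_match = np.array([0.9, -0.2, 0.1, 0.05])         # hypothetical category coordinates
cat_cyclosporine = np.array([0.8, -0.1, 0.3, -0.10])
tolerance = 0.25                                          # hypothetical tolerance distance

d = weighted_distance(cat_hla_match, cat_cyclosporine, weights)
print(f"weighted distance = {d:.3f}", "-> associated" if d < tolerance else "-> not associated")
```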

6.
The computer program PROPOV-K allows the computation of an unweighted kappa coefficient for expressing interrater agreement in the general case in which multiple raters (not necessarily fixed in number) formulate a variable number of multiple diagnoses for each subject. PROPOV-K assesses agreement among lists of multiple diagnoses composed of non-ordered categories. It calculates a kappa coefficient based on estimating the proportion of agreement between two diagnostic formulations as the ratio of the number of agreements on specific categories to the number of different specific categories mentioned in the two diagnostic lists. When multiple raters formulate a variable number of multiple diagnoses for each subject, the use of a kappa coefficient has been of limited practical value to researchers, since no generally available computer programs exist for this case. The purpose of this paper is to present a FORTRAN computer program that computes a kappa coefficient for the case described above and to illustrate its use with examples involving multiple psychiatric and multiple physical diagnoses, respectively.
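PROPOV-K itself is a FORTRAN program; the following is a small Python sketch of the agreement proportion it is described as using for one pair of diagnostic lists: the number of specific categories both raters mention, divided by the number of different specific categories mentioned across the two lists. The diagnoses are invented for illustration.

```python
def agreement_proportion(diagnoses_a, diagnoses_b):
    """Proportion of agreement between two unordered diagnostic lists."""
    a, b = set(diagnoses_a), set(diagnoses_b)
    return len(a & b) / len(a | b)

rater1 = {"major depression", "hypertension", "type 2 diabetes"}
rater2 = {"major depression", "hypertension"}
print(f"proportion of agreement = {agreement_proportion(rater1, rater2):.2f}")  # 2 / 3 = 0.67
```

The kappa coefficient then corrects such proportions, pooled over subjects and rater pairs, for chance agreement.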

7.
In many countries, design standards still prescribe one-dimensional cantilever models for the structural calculation of buildings, which contradicts the experience of destructive earthquakes. This work provides a rationale for the transition, within such standards, from one-dimensional to three-dimensional models of different complexity. Discretely-continuous and discrete models of buildings have been developed as unified three-dimensional systems with floors deforming in their own plane.

We consider models of seismic excitation that take into account the effect of the seismic wave travelling through the ground and the non-uniformity of the oscillation field along an extended building or structure. We point out paradoxes that arise in calculations when three-dimensional models of buildings are combined with "zero-dimensional" (normative) models of the seismic action of the soil at their foundations. In connection with this problem, we correct the formula for determining seismic forces.

Variational methods for assembling the governing equations have been developed for calculating the oscillations of buildings as dynamic systems of large dimension. In the structural analysis of buildings, hydraulic structures, bridges and ship hulls, it has proved expedient to choose, as admissible displacements, the eigenvectors of the stiffness matrices of the planar elements separated out of the three-dimensional object. This is the key to "rolling up" the large systems of governing equations with thousands of unknowns and to "compressing" the three-dimensional object in one or two directions.

To simplify the calculations, we use the so-called principle of partial symmetry, associated with the deformation mechanisms of the cross-sections or longitudinal sections of a three-dimensional structure. The principle can be regarded as a set of transformations that leave mathematical objects (tensors) invariant. Mathematically, this symmetry splits the governing equations into independent blocks.

To refine the solutions sequentially, a hierarchical chain of mathematical simulation models of different levels was constructed, in which the eigenvectors of the planar elements are treated as deformation hypotheses. We have developed the program package "PRIS" to automate the calculation of buildings with three-dimensional models. Within this package, the spectral methods of calculation are generalized to universal three-dimensional building models, so that the design standards of different countries can be applied.


8.
Diversity among the members of a team of classifiers is deemed to be a key issue in classifier combination. However, measuring diversity is not straightforward because there is no generally accepted formal definition. We have found and studied ten statistics which can measure diversity among binary classifier outputs (correct or incorrect vote for the class label): four averaged pairwise measures (the Q statistic, the correlation, the disagreement and the double fault) and six non-pairwise measures (the entropy of the votes, the difficulty index, the Kohavi-Wolpert variance, the interrater agreement, the generalized diversity, and the coincident failure diversity). Four experiments have been designed to examine the relationship between the accuracy of the team and the measures of diversity, and among the measures themselves. Although there are proven connections between diversity and accuracy in some special cases, our results raise some doubts about the usefulness of diversity measures in building classifier ensembles in real-life pattern recognition problems.
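A minimal sketch of two of the pairwise statistics named above, the Q statistic and the disagreement measure, computed from the correct/incorrect (1/0) votes of two classifiers on the same test items; the vote vectors are made up.

```python
def pairwise_diversity(correct_i, correct_j):
    """Q statistic and disagreement measure for two classifiers' correctness vectors."""
    n11 = sum(a == 1 and b == 1 for a, b in zip(correct_i, correct_j))
    n00 = sum(a == 0 and b == 0 for a, b in zip(correct_i, correct_j))
    n10 = sum(a == 1 and b == 0 for a, b in zip(correct_i, correct_j))
    n01 = sum(a == 0 and b == 1 for a, b in zip(correct_i, correct_j))
    q = (n11 * n00 - n01 * n10) / (n11 * n00 + n01 * n10)   # Q statistic (Yule's Q)
    disagreement = (n01 + n10) / len(correct_i)              # disagreement measure
    return q, disagreement

clf_a = [1, 1, 1, 0, 1, 0, 1, 1, 0, 1]
clf_b = [1, 0, 1, 1, 1, 0, 0, 1, 1, 1]
q, dis = pairwise_diversity(clf_a, clf_b)
print(f"Q = {q:.2f}, disagreement = {dis:.2f}")   # Q = 0.11, disagreement = 0.40
```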

9.
An interactive program has been developed which simulates several representative industrial processes. Specifically, the program generates product quality characteristic values which are concurrently monitored by standard control charting methods. The program requires the user to specify initial process parameter values and subsequent process adjustments; the latter are necessary whenever the process is deemed to be "out of control". The effectiveness of these decisions is measured by economic criteria. The use of the software promotes a "hands-on" approach, which better prepares students to achieve quality improvements in an industrial environment through systematic and scientific evaluation.
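The simulator itself is not reproduced here; the sketch below only illustrates the kind of monitoring it exercises: Shewhart X-bar limits estimated from in-control subgroups, then used to flag later subgroup means. Process parameters, subgroup size and the injected mean shift are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
subgroup_size, mu, sigma = 5, 10.0, 0.5

# Phase I: estimate the centre line and 3-sigma control limits while in control
# (using the known subgroup standard deviation is a simplification).
phase1 = rng.normal(mu, sigma, size=(25, subgroup_size))
centre = phase1.mean()
ucl = centre + 3 * sigma / np.sqrt(subgroup_size)
lcl = centre - 3 * sigma / np.sqrt(subgroup_size)

# Phase II: monitor new subgroups; a mean shift is injected to simulate a fault.
phase2 = rng.normal(mu + 0.8, sigma, size=(10, subgroup_size))
for i, mean in enumerate(phase2.mean(axis=1), start=1):
    if not lcl <= mean <= ucl:
        print(f"subgroup {i}: mean {mean:.2f} outside ({lcl:.2f}, {ucl:.2f}) -> out of control")
```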

10.
A digital integrator design is proposed which has one or more built-in "potentiometers". It is intended primarily for use as the elementary module of a digital differential analyser (DDA) for process control and the solution of differential equations. By reducing the number of machine elements in this way, the "patching" problem is considerably eased, especially when electrically programmable interconnections are necessary.
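The paper describes a hardware module; the following is only a software sketch of the rectangular-integration principle commonly used in DDA integrators, with the built-in "potentiometer" modelled as a constant scale factor on the integrand. Register widths and the scaling scheme are assumptions, not the proposed design.

```python
class DDAIntegrator:
    def __init__(self, modulus=2**16, pot=1.0, y0=0):
        self.modulus = modulus      # capacity of the remainder register R
        self.pot = pot              # built-in "potentiometer" (0 < pot <= 1)
        self.y = y0                 # integrand register Y
        self.r = 0                  # remainder register R

    def step(self, dy=0):
        """Apply one dx pulse; return the dz overflow pulse (0 or 1)."""
        self.y += dy                         # update the integrand
        self.r += int(self.pot * self.y)     # accumulate the scaled integrand into R
        if self.r >= self.modulus:           # overflow of R emits an output pulse
            self.r -= self.modulus
            return 1
        return 0

# Integrate a constant Y over many dx pulses; the potentiometer halves the output rate.
integ = DDAIntegrator(modulus=1000, pot=0.5, y0=40)
pulses = sum(integ.step() for _ in range(200))
print("dz pulses emitted:", pulses)   # 200 * 40 * 0.5 / 1000 = 4
```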

11.
In this paper, we propose an index that measures the level of agreement between an individual opinion and a collective opinion when both are expressed as rankings of a set of alternatives. The index is an interesting weighted version of the well-known Kendall rank correlation index. Its originality arises from the fact that it accounts for the relevance of the specific positions of the alternatives in an individual order when quantifying the agreement of that individual order with a temporary collective order. The paper also introduces a new consensus measure model whose core is the proposed agreement index. We present an illustrative example to describe the consensus process. Using this new index, convergence to a consensus solution is faster than with Kendall's index.
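A minimal sketch comparing ordinary Kendall rank correlation with one plausible position-weighted variant, in which disagreements involving alternatives near the top of the collective order count more; the paper's exact weighting is not reproduced here, and the rankings are made up.

```python
from itertools import combinations

def kendall_tau(rank_ind, rank_col):
    """rank_*: dict alternative -> position (1 = best); assumes no ties."""
    s = pairs = 0
    for a, b in combinations(list(rank_col), 2):
        concordant = ((rank_ind[a] - rank_ind[b]) > 0) == ((rank_col[a] - rank_col[b]) > 0)
        s += 1 if concordant else -1
        pairs += 1
    return s / pairs

def weighted_agreement(rank_ind, rank_col):
    """Pairs weighted by 1/min(collective position), so top positions weigh more."""
    num = den = 0.0
    for a, b in combinations(list(rank_col), 2):
        w = 1.0 / min(rank_col[a], rank_col[b])
        concordant = ((rank_ind[a] - rank_ind[b]) > 0) == ((rank_col[a] - rank_col[b]) > 0)
        num += w if concordant else -w
        den += w
    return num / den

collective = {"x1": 1, "x2": 2, "x3": 3, "x4": 4}
individual = {"x1": 2, "x2": 1, "x3": 3, "x4": 4}   # swaps the two best alternatives
print("Kendall tau        :", round(kendall_tau(individual, collective), 2))        # 0.67
print("weighted agreement :", round(weighted_agreement(individual, collective), 2)) # 0.54
```

In this example the two indices differ because the only disagreement is a swap of the two best-ranked alternatives, which the position-weighted index penalizes more heavily.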

12.
Hypertext is a widely used form of information presentation which has gained popularity in the last few years due to the ease with which information can be portrayed and accessed. However, the very flexibility of hypertext creates problems of user disorientation and cognitive overhead. Designers have tried to combat these problems by providing users with a myriad of navigational tools, the most widely employed being a map and an index. The primary objective of this study was to evaluate the usefulness of these navigational tools in alleviating the problems associated with using a hypertext system. For the purposes of this study, a hypertext package on "Ancient Civilizations" was developed using HyperCard. The experiment explored three variables: type of navigational tool available to the user (Map, Index, and Combination of map and index); hypertext size (Small stack, Large stack); and trials (Before, After). The results of the study indicate significant differences in the use and effect of the navigational tools.

13.
Real-time control of drilling was carried out by measuring the thrust force and determining its gradient. Using a microcomputer-based feedback control system, experiments were carried out under different cutting conditions to test the effectiveness of the thrust force gradient in predicting failure. The system was able to predict failure due to the excessive wear commonly encountered with 5 and 8 mm drills. With such drills, excessive wear at the outer corner led to an increase in the local temperature, which in turn increased the wear. This led to very high temperatures (>600°C), causing local welding of the drill material to the peripheral surface of the hole being drilled. Furthermore, the high temperatures reduced the compressive yield strength of the drill material, causing sub-surface fracture to occur under the influence of the cutting loads. This cyclic phenomenon of "seizure" due to local welding and "release" due to shear fracture (i.e. "stick-slip") caused sharp fluctuations in the thrust force under constant feed.

This paper discusses the effectiveness of the control system described above in predicting failure due to the excessive wear common to large drills. The system is also contrasted with another, based on vibration measurements, which has been used successfully to predict failure due to the fracture common with small drills. The paper also reviews other experimental sensor schemes reported in the literature. Finally, it proposes a framework for an "intelligent" machining process control system driven by multiple sensors, which would facilitate untended machining.
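A minimal sketch of the monitoring idea (not the authors' control system): the thrust-force signal is sampled, its gradient is estimated and smoothed over a short window, and an alarm is raised when the gradient exceeds a threshold. The sampling rate, window, threshold and the synthetic force signal are illustrative assumptions.

```python
import numpy as np

def thrust_gradient_alarms(force, dt, window=100, threshold=100.0):
    """Return sample indices where the smoothed thrust-force gradient (N/s) exceeds the threshold."""
    grad = np.gradient(np.asarray(force, dtype=float), dt)
    smoothed = np.convolve(grad, np.ones(window) / window, mode="same")
    return np.flatnonzero(smoothed > threshold)

dt = 0.01                                                 # 100 Hz sampling (assumed)
t = np.arange(0.0, 10.0, dt)
rng = np.random.default_rng(2)
force = 300.0 + 5.0 * t + rng.normal(0.0, 2.0, t.size)    # slow rise during normal wear
force[t > 8.0] += 200.0 * (t[t > 8.0] - 8.0)              # sharp rise as wear accelerates

alarms = thrust_gradient_alarms(force, dt)
if alarms.size:
    print(f"first alarm at t = {t[alarms[0]]:.2f} s")     # near t = 8 s for this signal
else:
    print("no alarm raised")
```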


14.
Mobile ad hoc NETworks (MANETs) are becoming more popular because they do not require any fixed infrastructure and communication among processors can be established quickly. For this reason, potential MANET applications include military use, search and rescue, and meetings or conferences. The fault-tolerance and reliability of a MANET is therefore an important issue that needs to be considered. The problem of reaching agreement in a distributed system is one of the most important areas of research in the design of fault-tolerant systems. With an agreement, each correct processor can cope with the influence of faulty components in the network and provide a reliable solution. In this research, a MANET with a dual failure mode is considered. The proposed protocol uses the minimum number of rounds of message exchange to reach a common agreement and can tolerate the maximum number of allowable faulty components while inducing all correct processors to reach a common agreement within the MANET.

15.
The integration of CAD and CAM is one of the weightiest of the so far unsolved (or only partially solved) problems that are proving to be grave obstacles to the computer-integrated manufacturing systems that we have all envisaged over a number of years. There are two main reasons for this. One is the failure to apply a design methodology conducive to integration; the other is the lack of a clearly suitable principle around which the integration should take place. The lack of a methodical, overall system design directed towards integration from the outset is due not so much to the absence of suitable methodologies (in fact, quite a few have been developed), as to their failure to gain acceptance in industrial practice.

As regards the integrative principle that can provide the core around which a CAD/CAM system can be built, opinions differ widely. One fashionable trend considers geometric modelling to be the "all-saving" principle. Others allot this role to process planning, family-of-parts classification, databases and their management, or distributed systems architectures and their implementations. Following their various traditions, various countries are pursuing different courses based on these and other principles.

It is apparent (and appreciated by all the countries concerned) that none of the methods that they have separately or jointly developed is as yet suitable for the fool-proof design and implementation of the "factory of the future". However, they all have something to offer and have allowed spectacular progress to be made. There is widespread agreement that only the synthesis of the extant approaches, the deepening of our theoretical understanding and, above all, the acquisition and sharing of much practical experience can lead to a usable "science" of integration.


16.
A hybrid method for robust car plate character recognition
Image-based car plate recognition is an indispensable part of an intelligent traffic system. The quality of the images taken of car plates, especially Chinese car plates, may however sometimes be very poor, due to operating conditions and distortion caused by poor photographic environments. Furthermore, there exist some "similar" characters, such as "8" and "B", or "7" and "T", which are less distinguishable in the presence of noise and/or distortion. To achieve robust and high recognition performance, this paper proposes a two-stage hybrid recognition system combining statistical and structural recognition methods. Car plate images are skew-corrected and normalized before recognition. In the first stage, four statistical sub-classifiers recognize the input character independently, and their results are combined using the Bayes method. If the output of the first stage contains characters that belong to prescribed sets of similar characters, a structural recognition method is used to classify these character images further: they are preprocessed once more, structural features are extracted from them, and these features are fed into a decision-tree classifier. Finally, a genetic algorithm is employed to obtain optimum system parameters. Experiments show that the recognition system is efficient and robust. As part of an intelligent traffic system, it has been in successful commercial use.
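A minimal sketch of the two-stage routing logic described above (not the authors' system): posterior vectors from several statistical sub-classifiers are combined with a product (naive-Bayes-style) rule, and if the winning character falls in a prescribed similarity set such as {"8", "B"} or {"7", "T"}, the image would be handed to the structural second stage. All classifiers, class lists and probabilities here are dummies.

```python
import numpy as np

CLASSES = ["7", "8", "B", "T"]
SIMILARITY_SETS = [{"8", "B"}, {"7", "T"}]

def combine_bayes(posteriors):
    """Combine sub-classifier posterior vectors by a normalized product rule."""
    combined = np.prod(np.asarray(posteriors), axis=0)
    return combined / combined.sum()

def needs_structural_stage(label):
    return any(label in s for s in SIMILARITY_SETS)

# Dummy outputs of four statistical sub-classifiers for one character image.
posteriors = [[0.05, 0.50, 0.40, 0.05],
              [0.10, 0.45, 0.40, 0.05],
              [0.05, 0.40, 0.50, 0.05],
              [0.05, 0.55, 0.35, 0.05]]
combined = combine_bayes(posteriors)
label = CLASSES[int(np.argmax(combined))]
print("stage-1 decision:", label,
      "-> send to structural classifier" if needs_structural_stage(label) else "-> accept")
```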

17.
G-networks with resets
Erol Gelenbe, Jean-Michel Fourneau. Performance Evaluation, 2002, 49(1-4): 179-191.
Gelenbe networks (G-networks) are product form queuing networks which, in addition to ordinary customers, contain unusual entities such as “negative customers” which eliminate normal customers and “triggers” which move other customers from some queue to another. These models have generated much interest in the literature since the early 1990s. The present paper discusses a novel model for a reliable system composed of N unreliable systems, which can hinder or enhance each other’s reliability. Each of the N systems also tests other systems at random; it is able to reset another system if it is itself in working condition and discovers that the other system has failed, so that the global reliability of the system is enhanced. This paper shows how an extension of G-networks that includes novel “reset” customers can be used to model this behavior. We then show that a general G-network model with resets has product form, and prove existence and uniqueness of its solution.
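For context, the classical G-network product form (the result without resets, which the reset model above extends) can be summarized as follows; this is the standard form of Gelenbe's result, not the reset extension derived in the paper.

```latex
% \Lambda_i and \lambda_i are the external arrival rates of positive and negative
% customers at queue i, r_i is its service rate, and p^{+}_{ji}, p^{-}_{ji} are
% the probabilities that a customer leaving queue j joins queue i as a positive
% or a negative customer.
\[
  \pi(k_1,\dots,k_N) \;=\; \prod_{i=1}^{N} (1-q_i)\, q_i^{\,k_i},
  \qquad
  q_i \;=\; \frac{\lambda^{+}_i}{r_i + \lambda^{-}_i},
\]
\[
  \lambda^{+}_i \;=\; \Lambda_i + \sum_{j} q_j\, r_j\, p^{+}_{ji},
  \qquad
  \lambda^{-}_i \;=\; \lambda_i + \sum_{j} q_j\, r_j\, p^{-}_{ji}.
\]
```

The paper's contribution is to show that an analogous product form still holds, with existence and uniqueness of the solution, when "reset" customers are added.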

18.
Alain Mille. Annual Reviews in Control, 2006, 30(2): 223-232.
CBR (case-based reasoning) is an original AI paradigm based on adapting the solutions of past problems in order to solve new, similar problems. A case is therefore a problem together with its solution, and cases are stored in a case library. The reasoning process follows a cycle that facilitates "learning" from newly solved cases. The approach can also be viewed as a lazy learning method when applied to classification tasks. CBR is applied to various tasks such as design, planning, diagnosis and information retrieval. This paper goes a step further in reusing past unstructured experience by considering traces of computer use as experience knowledge containers for situation-based problem solving.
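A minimal sketch of the retrieve-reuse-retain steps of the CBR cycle mentioned above, with plain nearest-neighbour retrieval over numeric problem descriptors and a trivial "adaptation" that copies the retrieved solution; the case library and features are invented, and this is not the paper's trace-based approach.

```python
import math

case_library = [
    {"problem": (2.0, 0.5), "solution": "plan A"},
    {"problem": (7.5, 1.0), "solution": "plan B"},
    {"problem": (5.0, 3.0), "solution": "plan C"},
]

def retrieve(new_problem):
    """Retrieve the most similar past case (Euclidean nearest neighbour)."""
    return min(case_library, key=lambda c: math.dist(c["problem"], new_problem))

def solve(new_problem):
    case = retrieve(new_problem)                   # retrieve
    solution = case["solution"]                    # reuse (trivially adapt) its solution
    case_library.append({"problem": new_problem, "solution": solution})  # retain the new case
    return solution

print(solve((6.8, 1.2)))                           # closest case -> "plan B"
```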

19.
The integration of computers within the manufacturing environment has long been a method of enhancing productivity. Their use in many facets of a manufacturing enterprise has given industries the ability to deliver low-cost, high-quality, competitive products. As computer technology advances, we find more and more uses for new hardware and software in the enterprise. Over time, we have seen many "islands" of computer integration: distinct, fully functional hardware and software installations are a common base for many industries. Unfortunately, these islands are just that, separate, distinct and functional, but non-integrated. The lack of integration within these information systems makes it difficult for end users to see the same manufacturing data. We are finding the need for a "single image", real-time information system to provide the enterprise with the data required to plan, justify, design, manufacture and deliver products to the customer. Unfortunately, many industries have a large installed base of hardware and software, and replacement of current systems is not a cost-justified business decision. An alternative is the migration of current systems to a more integrated solution. Migration to a computer-integrated manufacturing (CIM)-based architecture would provide that single-image, real-time information system.

The effort and skills necessary for the implementation of a CIM-based architecture require active participation from two key organizations: manufacturing and information systems (I/S). The manufacturing engineers, process engineers and other manufacturing resources are the cornerstone for obtaining requirements. The ability to use I/S effectively is a critical success factor in the implementation of CIM; I/S has to be viewed as an equal partner, not just as a service organization. Manufacturing management needs to understand the justification process of integrating computer systems and the "real" cost of integration versus the cost of non-integrated manufacturing systems. The active participation of both organizations during all phases of CIM implementation will result in an effective and useful integrated information system.


20.
Unsupervised texture segmentation using Gabor filters
This paper presents a texture segmentation algorithm inspired by the multi-channel filtering theory for visual information processing in the early stages of the human visual system. The channels are characterized by a bank of Gabor filters that nearly uniformly covers the spatial-frequency domain, and a systematic filter selection scheme is proposed, which is based on reconstruction of the input image from the filtered images. Texture features are obtained by subjecting each (selected) filtered image to a nonlinear transformation and computing a measure of "energy" in a window around each pixel. A square-error clustering algorithm is then used to integrate the feature images and produce a segmentation. A simple procedure to incorporate spatial information in the clustering process is proposed. A relative index is used to estimate the "true" number of texture categories.
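A minimal sketch of the pipeline described above (Gabor filter bank, nonlinear transformation, local "energy", clustering) on a synthetic two-texture image; the hand-built filter bank, the tanh nonlinearity, the window size and K-means stand in for the paper's specific choices and are assumptions.

```python
import numpy as np
from scipy.ndimage import convolve, uniform_filter
from sklearn.cluster import KMeans

def gabor_kernel(frequency, theta, sigma=3.0, size=15):
    """Real (cosine) Gabor kernel at a given spatial frequency and orientation."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x**2 + y**2) / (2 * sigma**2))
    return envelope * np.cos(2 * np.pi * frequency * xr)

# Synthetic image: horizontal stripes on the left half, vertical stripes on the right.
yy, xx = np.mgrid[0:64, 0:128]
image = np.where(xx < 64, np.sin(2 * np.pi * yy / 8), np.sin(2 * np.pi * xx / 8))

# Feature images: tanh of each filtered image, then local "energy" in a window.
features = []
for theta in (0.0, np.pi / 2):
    filtered = convolve(image, gabor_kernel(frequency=1 / 8, theta=theta))
    features.append(uniform_filter(np.abs(np.tanh(2.0 * filtered)), size=9))
feature_matrix = np.stack(features, axis=-1).reshape(-1, len(features))

# Square-error (K-means) clustering of the per-pixel feature vectors.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(feature_matrix)
segmentation = labels.reshape(image.shape)
print("pixels per segment:", np.bincount(labels))   # roughly two equal regions
```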
