Learning often occurs through comparing. In classification learning, most existing methods compare data groups by comparing either raw instances or learned classification rules against each other. This paper takes a different approach, conceptual equivalence: two groups are equivalent if their underlying concepts are equivalent, even though their instance spaces do not necessarily overlap and their rule sets do not necessarily look alike. A new comparison methodology is proposed that learns a representation of each group's underlying concept and cross-examines each group's instances against the other group's concept representation. The innovation is fivefold. First, the method quantifies the degree of conceptual equivalence between two groups. Second, it can trace the source of any discrepancy at two levels: an abstract level of underlying concepts and a specific level of instances. Third, it applies to numeric as well as categorical data. Fourth, it circumvents direct comparisons between (possibly very many) rules, which demand substantial effort. Fifth, it reduces dependency on the accuracy of the employed classification algorithms. Empirical evidence suggests that this methodology is effective yet simple to use in scenarios such as noise cleansing and concept-change learning.
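As an illustration of the cross-examination idea (not the paper's actual algorithm), the sketch below trains a simple threshold classifier on each group and scores the other group's instances against it; the averaged cross-accuracy serves as a crude degree of conceptual equivalence. The stump learner, function names, and simulated data are all assumptions made for this sketch.

```python
import numpy as np

def fit_stump(X, y):
    """Learn a one-feature threshold classifier, a stand-in for any learner."""
    best_j, best_t, best_acc = 0, 0.0, -1.0
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            acc = float(np.mean((X[:, j] > t) == y))
            if acc > best_acc:
                best_j, best_t, best_acc = j, t, acc
    return best_j, best_t

def predict(model, X):
    j, t = model
    return (X[:, j] > t).astype(int)

def conceptual_equivalence(Xa, ya, Xb, yb):
    """Cross-exam: score each group's instances with the other group's
    learned concept; average the two accuracies into one degree."""
    ma, mb = fit_stump(Xa, ya), fit_stump(Xb, yb)
    acc_ab = float(np.mean(predict(ma, Xb) == yb))  # A's concept judges B
    acc_ba = float(np.mean(predict(mb, Xa) == ya))  # B's concept judges A
    return (acc_ab + acc_ba) / 2.0

# Two groups drawn from the same concept (label = x0 > 0) over
# different instance ranges, so their instance spaces differ.
rng = np.random.default_rng(0)
Xa = rng.uniform(-1.0, 1.0, size=(200, 2))
ya = (Xa[:, 0] > 0).astype(int)
Xb = rng.uniform(-2.0, 2.0, size=(200, 2))
yb = (Xb[:, 0] > 0).astype(int)
score = conceptual_equivalence(Xa, ya, Xb, yb)
```

Because both groups realize the same concept, the cross-accuracies, and hence the equivalence score, come out close to 1 despite the differing instance spaces.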
The population of health care costs is typically skewed, heteroscedastic, and may include zero costs. Without proper accounting for these distributional features, the resulting predictions may be biased, and wrong inferences about the distribution of patients' health care costs may be drawn. Welsh and Zhou [A.H. Welsh, X.H. Zhou, Estimating the retransformed mean in a heteroscedastic two-part model, J. Stat. Plan. Inference 136 (2006) 860–881] proposed a semi-parametric regression model that addresses these features. In this paper we present a software program implementing this statistical method, providing clinical researchers with better predictions of health care costs.
Our program computes two mean estimators, their asymptotic standard deviations, confidence intervals, and optional bootstrap confidence intervals. It offers both a user-friendly interactive mode and a more efficient, flexible batch mode. It is written in the free statistical computing language R and runs on a wide variety of platforms.
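The flavor of a two-part mean estimate with a bootstrap confidence interval can be sketched as follows. This is a hypothetical Python illustration (the actual program is in R), with simulated zero-inflated, skewed cost data and a simple percentile bootstrap; it is not the Welsh–Zhou estimator itself.

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulated cost data: ~30% zero costs, positive part lognormal (skewed).
# These data and parameters are assumptions for illustration only.
n = 1000
is_zero = rng.random(n) < 0.3
costs = np.where(is_zero, 0.0, rng.lognormal(mean=7.0, sigma=1.0, size=n))

# Two-part mean estimator: P(cost > 0) * E[cost | cost > 0].
pos = costs[costs > 0]
mean_hat = (len(pos) / n) * pos.mean()

# Percentile bootstrap confidence interval for the mean, resampling
# whole observations so zeros and positive costs stay coupled.
B = 2000
boot = np.array([rng.choice(costs, size=n, replace=True).mean()
                 for _ in range(B)])
ci_lo, ci_hi = np.percentile(boot, [2.5, 97.5])
```

With only an intercept, the two-part estimator reduces to the overall sample mean; the value of the two-part decomposition shows up once covariates and heteroscedastic retransformation enter, which this sketch omits.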
Tractor driving imposes considerable physical and mental stress on the operator. If the operator's seat is not comfortable, work performance may suffer and the risk of accidents increases. An optimal tractor seat design may be achieved by integrating anthropometric data with the other technical features of the design. This paper reviews the existing information on tractor seat design that considers anthropometric and biomechanical factors, and gives an approach for seat design based on anthropometric data. The anthropometric dimensions of agricultural workers, i.e. popliteal height sitting (5th percentile), hip breadth sitting (95th percentile), buttock–popliteal length (5th percentile), interscye breadth (5th and 95th percentiles) and sitting acromion height (5th percentile), need to be taken into consideration for the design of seat height, seat pan width, seat pan length, seat backrest width and seat backrest height, respectively. The seat dimensions recommended for tractor operator comfort, based on anthropometric data of 5434 Indian male agricultural workers, were as follows: seat height of 380 mm, seat pan width of 420–450 mm, seat backrest width of 380–400 mm (bottom) and 270–290 mm (top), seat pan length of 370±10 mm, seat pan tilt of 5–7° backward and seat backrest height of 350 mm.
Relevance to industry
The approach presented in this paper for tractor seat design based on anthropometric considerations will help tractor seat designers develop and introduce seats suited to the requirements of the user population. This will not only enhance the comfort of tractor operators but may also help reduce their occupational health problems.
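The percentile-based design rules above can be sketched numerically. The sample below is simulated, not the paper's survey of 5434 workers; only the mapping from anthropometric percentile to seat dimension follows the text.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical anthropometric samples in mm (invented means and spreads,
# NOT the paper's survey data).
popliteal_height = rng.normal(400.0, 20.0, size=5000)
hip_breadth = rng.normal(330.0, 25.0, size=5000)

# Design rules from the text: seat height from the 5th percentile of
# popliteal height sitting (so short users' feet reach the floor pan);
# seat pan width from the 95th percentile of hip breadth sitting
# (so nearly all users fit between the seat edges).
seat_height = np.percentile(popliteal_height, 5)
seat_pan_width = np.percentile(hip_breadth, 95)
```

The asymmetry is the point of the rule: clearance dimensions take the 95th percentile, reach dimensions the 5th, so each dimension accommodates roughly 95% of the user population.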
Large-scale simulation of separation phenomena in solids, such as fracture, branching, and fragmentation, requires a scalable data-structure representation of the evolving model. Such phenomena can be successfully modeled by means of cohesive models of fracture, which are versatile and effective tools for computational analysis. A common approach to inserting cohesive elements in finite element meshes consists of adding discrete special interfaces (cohesive elements) between bulk elements. The insertion of cohesive elements along bulk element interfaces for fragmentation simulation imposes changes in the topology of the mesh. This paper presents a unified topology-based framework for supporting adaptive fragmentation simulations that handles two- and three-dimensional models with finite elements of any order. We represent the finite element model using a compact and "complete" topological data structure, which is capable of retrieving all adjacency relationships needed for the simulation. Moreover, we introduce a new topology-based algorithm that systematically classifies fractured facets (i.e., facets along which fracture has occurred). The algorithm follows a set of procedures that consistently perform all the topological changes needed to update the model. The proposed topology-based framework is general and ensures that the model representation always remains valid during fragmentation, even when very complex crack patterns are involved. The framework's correctness and efficiency are illustrated by arbitrary insertion of cohesive elements in various finite element meshes of self-similar geometries, including both two- and three-dimensional models. These computational tests clearly show linear scaling in time, which is a key feature of the present data-structure representation. The effectiveness of the proposed approach is also demonstrated by dynamic fracture analysis through finite element simulations of actual engineering problems.
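A minimal sketch of the topological change described above: inserting a cohesive element between two bulk triangles by duplicating the nodes of their shared facet. The data layout and function are illustrative assumptions; the paper's framework additionally maintains full adjacency information and handles 3-D, higher-order elements, and facet classification.

```python
# Two bulk triangles (node-index lists) sharing the edge (1, 2).
nodes = [(0.0, 0.0), (1.0, 0.0), (0.5, 1.0), (1.5, 1.0)]
tris = [[0, 1, 2], [1, 3, 2]]

def insert_cohesive(nodes, tris, tb, edge):
    """Duplicate the nodes of `edge` and re-point triangle `tb` at the
    copies; return the 4-node cohesive (interface) element connectivity."""
    new_ids = []
    for n in edge:
        nodes.append(nodes[n])               # geometrically coincident copy
        new_ids.append(len(nodes) - 1)
    # One side of the crack now references the duplicated nodes.
    tris[tb] = [new_ids[edge.index(n)] if n in edge else n for n in tris[tb]]
    # The interface element joins the original facet to its duplicate;
    # it has zero thickness until the cohesive law opens it.
    return [edge[0], edge[1], new_ids[1], new_ids[0]]

cohesive = insert_cohesive(nodes, tris, 1, (1, 2))
```

After insertion the mesh has two coincident node pairs along the former shared edge; the cohesive element's traction-separation law then governs how far they may open.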
Nowadays, the manufacturing industry is adopting modern information technologies in order to optimize its business processes and to achieve integration with geographically dispersed supply chain partners, expanding the physical limits of its business globally. The growth of the World Wide Web and the software technologies arising from it make this goal attainable. This paper discusses how modern information technology, particularly ISO 10303 (STEP) and the eXtensible Markup Language (XML), can be used jointly to support communication between different partners. An approach to support enterprise operation through efficient data exchange, adopting modern data modeling techniques for integration among partners worldwide and using the web as a communication layer, is described. The paper focuses on the ship repair industry as a case study of these problems and suggests a feasible solution.
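As a toy illustration of XML-based product-data exchange over the web, the snippet below serializes a hypothetical ship repair record and parses it back, as a receiving partner would. The element names and record fields are invented for illustration and do not follow the actual ISO 10303-28 (STEP/XML) schema.

```python
import xml.etree.ElementTree as ET

# A hypothetical ship repair product record; field names are invented.
product = {"material": "AH36", "thickness_mm": "12"}

# Sending side: wrap the record in XML for transport over the web layer.
root = ET.Element("Product", id="HULL-PLATE-17")
for key, value in product.items():
    ET.SubElement(root, key).text = value
payload = ET.tostring(root, encoding="unicode")

# Receiving partner: parse the payload back into structured data.
parsed = ET.fromstring(payload)
```

The appeal of the STEP-plus-XML pairing is exactly this round trip: STEP supplies a shared product-data semantics, while XML supplies a transport syntax that any partner's tooling can parse.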
Data services via wireless networks and mobile devices have experienced rapid growth worldwide. We investigated the factors influencing adoption of wireless mobile data services (WMDS) in China and tested our model for explaining adoption intentions there. We argued that individuals form their intention to adopt WMDS under the influence of the wireless mobile technology, the social environment, personal innovativeness in IT, trust awareness, and facilitating conditions. We examined the simultaneous effects of these five influences on beliefs in the context of wireless Internet data services via mobile phones. Survey data were collected from 1432 participants in several metropolitan cities across China. The findings suggest that WMDS adoption intention in China is determined by consumers' perceived usefulness and perceived ease of use of WMDS. Theoretical and practical implications are discussed.
The aim of our study was to further develop an understanding of social capital in organizational knowledge sharing. We first developed a measurement tool and then a theoretical framework in which three social capital factors (social network, social trust, and shared goals) were combined with the theory of reasoned action; their relationships were examined using confirmatory factor analysis. In a survey of 190 managers from Hong Kong firms, we confirmed that a social network and shared goals significantly contributed to a person's volition to share knowledge and directly contributed to the perceived social pressure of the organization. Social trust, however, showed no direct effect on the attitude toward or subjective norm of sharing knowledge.
To enable the evolution towards electronically assisted healthcare, future medical implants require sensors and processing circuitry to inform patient and doctor about the rehabilitation status. An important class of systems is those in which implant strain is monitored through strain gauges. Since batteries inside the human body are avoided as much as possible, a transcutaneous power link is used to power the implant wirelessly. The same RF link provides an elegant way of establishing bi-directional data communication between the external base station and the medical device. This paper describes a front-end IC that manages both power reception and bi-directional data communication. It includes an on-chip clock generation circuit to drive additional digital processing circuits. A new architecture based on a current-driven data demodulation principle is introduced. It is able to detect an AM signal with a modulation depth of a mere 4%, which is better than recent comparable systems in the field. The IC is fabricated in a 0.35 μm high-voltage CMOS technology and consumes only 0.56 mA.
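What a 4% modulation depth means for demodulation can be illustrated with a simulated envelope detector. The sketch below is a behavioral model, not the IC's current-driven circuit: it recovers bits from a carrier whose amplitude dips by only 4%. All signal parameters (carrier frequency, bit rate, sample rate) are assumptions chosen for the simulation.

```python
import numpy as np

fs, fc, bit_rate = 100_000, 5_000, 100          # Hz; illustrative values
bits = np.array([1, 0, 1, 1, 0, 0, 1, 0, 1, 1])
spb = fs // bit_rate                            # samples per bit
t = np.arange(len(bits) * spb) / fs

# 4% amplitude modulation: '0' bits dip the carrier envelope to 0.96.
amp = np.repeat(np.where(bits == 1, 1.0, 0.96), spb)
sig = amp * np.cos(2 * np.pi * fc * t)

# Envelope detector: rectify, then average over one carrier period.
period = fs // fc
env = np.convolve(np.abs(sig), np.ones(period) / period, mode="same")

# Per-bit decision against the midpoint of the observed envelope range.
env_per_bit = env.reshape(len(bits), spb).mean(axis=1)
thresh = (env_per_bit.min() + env_per_bit.max()) / 2
decoded = (env_per_bit > thresh).astype(int)
```

The per-bit envelope levels differ by only a few percent, which is why a demodulator resolving a 4% depth reliably, as the reported current-driven architecture does, is a meaningful circuit achievement.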
This paper investigates a new method to solve the inverse problem of Rutherford backscattering (RBS) data. The inverse problem is to determine sample structure information from measured spectra, which can be cast as a function approximation problem. We propose using radial basis function (RBF) neural networks to approximate the inverse function. Each RBS spectrum, which may contain up to 128 data points, is compressed by principal component analysis, so that the dimensionality of the input data and the complexity of the network are reduced significantly. Our theoretical consideration is tested by numerical experiments with the example of a SiGe thin film sample and the corresponding backscattering spectra. A comparison of the RBF method with multilayer perceptrons reveals that the former performs better at extracting structural information from spectra. Furthermore, the proposed method properly handles redundancies caused by constraints on the output variables. This study presents the first RBF-based method for inverse RBS data analysis.
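The PCA-plus-RBF pipeline can be sketched as follows. This toy version substitutes a synthetic one-parameter forward model for real RBS spectra and trains the RBF output weights by ridge regression, a common shortcut; the forward model, kernel width, and training scheme are all assumptions of the sketch, not details taken from the paper.

```python
import numpy as np

channels = np.arange(128)

def spectrum(d):
    """Toy forward model: a 128-channel 'spectrum' whose peak position
    shifts with the sample parameter d (a stand-in for RBS physics)."""
    return np.exp(-0.5 * ((channels - d) / 10.0) ** 2)

# Training set: known parameters and their simulated spectra.
d_train = np.linspace(30.0, 100.0, 100)
X = np.array([spectrum(d) for d in d_train])

# PCA compression via SVD: project the 128-dim spectra onto k components.
mu = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
k = 5
Z = (X - mu) @ Vt[:k].T                          # 100 x 5 compressed inputs

# RBF network: Gaussian units centered on the training points, linear
# output weights fit by ridge regression (an assumed training shortcut).
def rbf(Za, Zb, gamma=0.5):
    d2 = ((Za[:, None, :] - Zb[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)

Phi = rbf(Z, Z)
w = np.linalg.solve(Phi + 1e-3 * np.eye(len(Z)), d_train)

# Inverse prediction for an unseen spectrum (true parameter 64.0).
z_test = (spectrum(64.0) - mu) @ Vt[:k].T
d_pred = (rbf(z_test[None, :], Z) @ w)[0]
```

The compression step is what keeps the network small: the RBF units operate on 5 principal-component scores rather than 128 raw channels, mirroring the dimensionality reduction described in the abstract.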
The problem of missing values in software measurement data used in empirical analysis has led to numerous proposed solutions. Imputation procedures, for example, have been proposed to 'fill in' the missing values with plausible alternatives. We present a comprehensive study of imputation techniques using real-world software measurement datasets. Two datasets with dramatically different properties were utilized in this study, with missing values injected according to three different missingness mechanisms (MCAR, MAR, and NI). We consider the occurrence of missing values in multiple attributes, and compare three procedures: Bayesian multiple imputation, k-nearest neighbor imputation, and mean imputation. We also examine the relationship between noise in the dataset and the performance of the imputation techniques, which has not been addressed previously. Our comprehensive experiments demonstrate conclusively that Bayesian multiple imputation is an extremely effective imputation technique.
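Two of the compared procedures, mean and k-nearest-neighbor imputation, can be sketched under a MCAR mechanism. The dataset below is simulated, not one of the paper's software measurement datasets, and Bayesian multiple imputation is omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 500
x1 = rng.normal(0.0, 1.0, n)
x2 = 2.0 * x1 + rng.normal(0.0, 0.1, n)   # strongly correlated "metric"

# MCAR injection: each x2 value goes missing with probability 0.2,
# independently of all variable values.
missing = rng.random(n) < 0.2
obs = ~missing

# Mean imputation: replace every missing value with the observed mean.
mean_imp = np.full(n, x2[obs].mean())

# kNN imputation (k=5): average x2 over the 5 complete cases nearest in x1.
k = 5
knn_imp = np.empty(n)
for i in np.where(missing)[0]:
    nearest = np.argsort(np.abs(x1[obs] - x1[i]))[:k]
    knn_imp[i] = x2[obs][nearest].mean()

# Compare root-mean-square error against the withheld true values.
rmse_mean = np.sqrt(np.mean((mean_imp[missing] - x2[missing]) ** 2))
rmse_knn = np.sqrt(np.mean((knn_imp[missing] - x2[missing]) ** 2))
```

Because x2 is strongly predictable from x1, kNN imputation exploits the correlation while mean imputation ignores it, so the kNN error is far lower here; under weaker correlations or non-MCAR mechanisms the gap narrows, which is the kind of dependence the study quantifies.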
Taghi M. Khoshgoftaar is a professor in the Department of Computer Science and Engineering, Florida Atlantic University, and the Director of the Empirical Software Engineering and Data Mining and Machine Learning Laboratories. His research interests are in software engineering, software metrics, software reliability and quality engineering, computational intelligence, computer performance evaluation, data mining, machine learning, and statistical modeling. He has published more than 300 refereed papers in these areas. He is a member of the IEEE, IEEE Computer Society, and IEEE Reliability Society. He was the Program Chair and General Chair of the IEEE International Conference on Tools with Artificial Intelligence in 2004 and 2005, respectively. He has served on the technical program committees of various international conferences, symposia, and workshops. He has also served as North American Editor of the Software Quality Journal, and is on the editorial boards of the journals Software Quality and Fuzzy Systems.
Jason Van Hulse received the Ph.D. degree in Computer Engineering from the Department of Computer Science and Engineering at Florida Atlantic University in 2007, the M.A. degree in Mathematics from Stony Brook University in 2000, and the B.S. degree in Mathematics from the University at Albany in 1997. His research interests include data mining and knowledge discovery, machine learning, computational intelligence, and statistics. He has published numerous peer-reviewed research papers in various conferences and journals, and is a member of the IEEE, IEEE Computer Society, and ACM. He has worked in the data mining and predictive modeling field at First Data Corp. since 2000, and is currently Vice President, Decision Science.