Found 20 similar documents; search took 13 ms
1.
Oscar Marbán Javier Segovia Ernestina Menasalvas Covadonga Fernández-Baizán 《Information Systems》2009
The number, variety and complexity of projects involving data mining or knowledge discovery in databases activities have recently increased at such a pace that aspects related to their development process need to be standardized for results to be integrated, reused and interchanged in the future. Data mining projects are quickly becoming engineering projects, and current standard processes, like CRISP-DM, need to be revisited to incorporate this engineering viewpoint. This is the central motivation of this paper, which makes the point that experience gained about the software development process over almost 40 years could be reused and integrated to improve data mining processes. Consequently, this paper proposes to reuse ideas and concepts underlying the IEEE Std 1074 and ISO 12207 software engineering model processes to redefine and add to the CRISP-DM process and make it a data mining engineering standard.
2.
3.
A step toward STEP-compatible engineering data management: the data models of product structure and engineering changes Cited by: 1 (self-citations: 0, citations by others: 1)
In an iterative design process, there is a large amount of engineering data to be processed. Well-managed engineering data can help companies stay competitive in the market. It has been recognized that a product data model is the basis for establishing an engineering database. To fully support the complete product data representation in its life cycle, an international product data representation and exchange standard, STEP, is applied to model the representation of a product. In this paper, the architecture of an engineering data management (EDM) system is described, which consists of an integrated product database. Six STEP-compatible data models are constructed to demonstrate the integrability of the EDM system using a common data modeling format. These data models are product definition, product structure, shape representation, engineering change, approval, and production scheduling. They are defined according to the integrated resources of STEP/ISO 10303 (Parts 41-44), which support a complete product information representation and a standard data format. Thus, application systems, such as CAD/CAM and MRP systems, can interact with the EDM system by accessing the database based on the STEP data exchange standard.
4.
Variant design for mechanical artifacts: A state-of-the-art survey Cited by: 11 (self-citations: 0, citations by others: 11)
J. E. Fowler 《Engineering with Computers》1996,12(1):1-15
Variant design refers to the technique of adapting existing design specifications to satisfy new design goals and constraints. Specific support of variant design techniques in current computer aided design systems would help to realize a rapid response manufacturing environment. A survey of approaches supporting variant design is presented. Capabilities used in current commercial computer aided design systems are discussed along with approaches used in recent research efforts. Information standards applicable to variant design are also identified. Barriers to variant design in current systems are identified and ideas are presented for augmentation of current systems to support variant design.
5.
《Ergonomics》2012,55(1-3):188-196
Hand signs are considered one of the important ways to enter information into computers for certain tasks. Computers receive sensor data of hand signs for recognition. When using hand signs as computer inputs, we need to (1) train computer users in the sign language so that their hand signs can be easily recognized by computers, and (2) design the computer interface to avoid the use of confusing signs, improving user input performance and user satisfaction. For user training and computer interface design, it is important to know which signs can be easily recognized by computers and which signs computers cannot distinguish. This paper presents a data mining technique to discover distinct patterns of hand signs from sensor data. Based on these patterns, we derive a group of signs that computers cannot distinguish. Such information can in turn assist in user training and computer interface design.
6.
This paper describes data mining and data warehousing techniques that can improve the performance and usability of Intrusion
Detection Systems (IDS). Current IDS do not provide support for historical data analysis and data summarization. This paper
presents techniques to model network traffic and alerts using a multi-dimensional data model and star schemas. This data model was used to perform network security analysis and detect denial of service attacks. Our data model can also
be used to handle heterogeneous data sources (e.g. firewall logs, system calls, net-flow data) and enable up to two orders
of magnitude faster query response times for analysts as compared to the current state of the art. We have used our techniques
to implement a prototype system that is being successfully used at Army Research Labs. Our system has helped the security
analyst in detecting intrusions and in historical data analysis for generating reports on trend analysis.
Recommended by: Ashfaq Khokhar
7.
A Kansei mining system for affective design Cited by: 4 (self-citations: 0, citations by others: 4)
Affective design has received much attention from both academia and industries. It aims at incorporating customers' affective needs into design elements that deliver customers' affective satisfaction. The main challenge for affective design originates from difficulties in mapping customers' subjective impressions, namely Kansei, to perceptual design elements. This paper intends to develop an explicit decision support to improve the Kansei mapping process by reusing knowledge from past sales records and product specifications. As one of the important applications of data mining, association rule mining lends itself to the discovery of useful patterns associated with the mapping of affective needs. A Kansei mining system is developed to utilize valuable affect information latent in customers' impressions of existing affective designs. The goodness of association rules is evaluated according to their achievements of customers' expectations. Conjoint analysis is applied to measure the expected and achieved utilities of a Kansei mapping relationship. Based on goodness evaluation, mapping rules are further refined to empower the system with useful inference patterns. The system architecture and implementation issues are discussed in detail. An application of Kansei mining to mobile phone affective design is presented.
8.
Kuo-Chen Hung Peter Julian Terence Chien Warren Tsu-huei Jin 《Expert systems with applications》2010,37(1):202-213
Concept design is an important wealth-creating activity in companies and infrastructure. However, the design process is very complex, and the information required during the conceptual stage is incomplete, imprecise, and fuzzy. Hence, fuzzy set theory should be used to handle linguistic problems at this stage. This paper presents a fuzzy integrated approach to assess the performance of design concepts. The criteria ratings, relative weights and performance levels are captured by fuzzy numbers, and the overall performance of each alternative is calculated through an enhanced fuzzy weighted average (FWA) approach. A practical numerical example is provided to demonstrate the usefulness of this study. In addition, to make computing and ranking results easier and to increase recruiting productivity, this paper develops a computer-based decision support system to help make decisions more efficiently.
9.
This paper develops an efficient methodology to perform reliability-based design optimization (RBDO) by decoupling the optimization
and reliability analysis iterations that are nested in traditional formulations. This is achieved by approximating the reliability
constraints based on the reliability analysis results. The proposed approach does not use inverse first-order reliability
analysis as other existing decoupled approaches, but uses direct reliability analysis. This strategy allows a modular approach
and the use of more accurate methods, including Monte-Carlo-simulation (MCS)-based methods for highly nonlinear reliability
constraints where first-order reliability approximation may not be accurate. The use of simulation-based methods also enables
system-level reliability estimates to be included in the RBDO formulation. The efficiency of the proposed RBDO approach is
further improved by identifying the potentially active reliability constraints at the beginning of each reliability analysis.
A vehicle side impact problem is used to examine the proposed method, and the results show the usefulness of the proposed
method.
10.
A regression-based approach for mining user movement patterns from random sample data Cited by: 1 (self-citations: 0, citations by others: 1)
Mobile computing systems usually express a user movement trajectory as a sequence of areas that capture the user movement trace. Given a set of user movement trajectories, user movement patterns refer to the sequences of areas through which a user frequently travels. In an attempt to obtain user movement patterns for mobile applications, prior studies explore the problem of mining user movement patterns from the movement logs of mobile users. These movement logs generate a data record whenever a mobile user crosses base station coverage areas. However, this type of movement log does not exist in the system and thus generates extra overheads. By exploiting an existing log, namely, call detail records, this article proposes a Regression-based approach for mining User Movement Patterns (abbreviated as RUMP). This approach views call detail records as random sample trajectory data, and thus, user movement patterns are represented as movement functions in this article. We propose algorithm LS (standing for Large Sequence) to extract the call detail records that capture frequent user movement behaviors. By exploring the spatio-temporal locality of continuous movements (i.e., a mobile user is likely to be in nearby areas if the time interval between consecutive calls is small), we develop algorithm TC (standing for Time Clustering) to cluster call detail records. Then, by utilizing regression analysis, we develop algorithm MF (standing for Movement Function) to derive movement functions. Experimental studies involving both synthetic and real datasets show that RUMP is able to derive user movement functions close to the frequent movement behaviors of mobile users.
11.
Christian Van der Velden Cees Bil Xinghuo Yu Adrian Smith 《Innovations in Systems and Software Engineering》2007,3(2):117-128
This paper discusses the development of an intelligent routing system for automating design of electrical wiring harnesses
and pipes in aircraft. The system employs knowledge based engineering (KBE) methods and technologies for capturing and implementing
rules and engineering knowledge relating to the routing process. The system reads a mesh of three dimensional structure and
obstacles falling within a given search space and connects source and target terminals satisfying a knowledge base of design
rules and best practices. Routed paths are output as computer aided design (CAD) readable geometry, and a finite element (FE)
mesh consisting of geometry, routed paths and a knowledge layer providing detail of the rules and knowledge implemented in
the process. Use of this intelligent routing system provides structure to the routing design process and has potential to
deliver significant savings in time and cost.
12.
In privacy-preserving data mining (PPDM), a widely used method for achieving data mining goals while preserving privacy is based on k-anonymity. This method, which protects subject-specific sensitive data by anonymizing it before it is released for data mining, demands that every tuple in the released table should be indistinguishable from no fewer than k subjects. The most common approach for achieving compliance with k-anonymity is to replace certain values with less specific but semantically consistent values. In this paper we propose a different approach for achieving k-anonymity by partitioning the original dataset into several projections such that each one of them adheres to k-anonymity. Moreover, any attempt to rejoin the projections results in a table that still complies with k-anonymity. A classifier is trained on each projection and subsequently, an unlabelled instance is classified by combining the classifications of all classifiers. Guided by classification accuracy and k-anonymity constraints, the proposed data mining privacy by decomposition (DMPD) algorithm uses a genetic algorithm to search for optimal feature set partitioning. Ten separate datasets were evaluated with DMPD in order to compare its classification performance with other k-anonymity-based methods. The results suggest that DMPD performs better than existing k-anonymity-based algorithms and that there is no necessity for applying domain dependent knowledge. Using multiobjective optimization methods, we also examine the tradeoff between the two conflicting objectives in PPDM: privacy and predictive performance.
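The k-anonymity condition described in this abstract can be illustrated with a minimal sketch. The function name `is_k_anonymous`, the toy table, and its column names are hypothetical and are not taken from the paper; the sketch only checks the condition on a full table and on two column projections, and does not implement the DMPD algorithm or its genetic search.

```python
from collections import Counter

def is_k_anonymous(rows, attributes, k):
    """Return True if every combination of values over `attributes`
    occurs in at least k rows (hypothetical helper, not from the paper)."""
    counts = Counter(tuple(row[a] for a in attributes) for row in rows)
    return all(c >= k for c in counts.values())

# Hypothetical toy table, invented for illustration only.
table = [
    {"age": "30-39", "zip": "130**", "job": "nurse"},
    {"age": "30-39", "zip": "130**", "job": "clerk"},
    {"age": "30-39", "zip": "148**", "job": "nurse"},
    {"age": "30-39", "zip": "148**", "job": "clerk"},
]

# The full table is not 2-anonymous: every (age, zip, job) triple is unique.
full_ok = is_k_anonymous(table, ["age", "zip", "job"], k=2)

# Each projection onto a subset of columns is 2-anonymous on its own.
proj_a_ok = is_k_anonymous(table, ["age", "zip"], k=2)
proj_b_ok = is_k_anonymous(table, ["age", "job"], k=2)
```

In this toy case the two projections pass the check while the full table does not, which is the situation a decomposition-based approach tries to exploit. Whether rejoining the projections stays k-anonymous is a separate property that this sketch does not test.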
13.
The object-oriented approach has been the most popular software design methodology for the past twenty-five years. Several design patterns and principles are defined to improve the design quality of object-oriented software systems. In addition, designers can use unique design motifs that are designed for the specific application domains. Another commonly used technique is cloning and modifying some parts of the software while creating new modules. Therefore, object-oriented programs can include many identical design structures. This work proposes a sub-graph mining-based approach for detecting identical design structures in object-oriented systems. By identifying and analyzing these structures, we can obtain useful information about the design, such as commonly-used design patterns, most frequent design defects, domain-specific patterns, and reused design clones, which could help developers to improve their knowledge about the software architecture. Furthermore, problematic parts of frequent identical design structures are appropriate refactoring opportunities because they affect multiple areas of the architecture. Experiments with several open-source and industrial projects show that we can successfully find many identical design structures within a project (intra-project) and between different projects (inter-project). We observe that usually most of the detected identical structures are an implementation of common design patterns; however, we also detect various anti-patterns, domain-specific patterns, reused design parts and design-level clones.
14.
A novel hash-based approach for mining frequent itemsets over data streams requiring less memory space Cited by: 2 (self-citations: 1, citations by others: 1)
In recent times, data are generated as a form of continuous data streams in many applications. Since handling data streams
is necessary and discovering knowledge behind data streams can often yield substantial benefits, mining over data streams
has become one of the most important issues. Many approaches for mining frequent itemsets over data streams have been proposed.
These approaches often consist of two procedures including continuously maintaining synopses for data streams and finding
frequent itemsets from the synopses. However, most of the approaches assume that the synopses of data streams can be saved
in memory and ignore the fact that the information of the non-frequent itemsets kept in the synopses may cause memory utilization
to be significantly degraded. In this paper, we consider compressing the information of all the itemsets into a structure
with a fixed size using a hash-based technique. This hash-based approach skillfully summarizes the information of the whole
data stream by using a hash table, provides a novel technique to estimate the support counts of the non-frequent itemsets,
and keeps only the frequent itemsets for speeding up the mining process. Therefore, the goal of optimizing memory space utilization
can be achieved. The correctness guarantee, error analysis, and parameter setting of this approach are presented and a series
of experiments is performed to show the effectiveness and the efficiency of this approach.
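The idea of summarizing itemset counts in a fixed-size hash structure can be sketched as follows. This is a generic hashed counter written for illustration, not the structure or estimator proposed in the paper; the class name `HashedItemsetCounter`, the bucket count, and the `max_size` parameter are all assumptions. Because different itemsets may collide in one bucket, the estimate is an upper bound on the true support count.

```python
import zlib
from itertools import combinations

class HashedItemsetCounter:
    """Fixed-size table of hashed itemset counts (illustrative sketch only)."""

    def __init__(self, num_buckets=1024):
        # Memory use is fixed by num_buckets, however long the stream is.
        self.buckets = [0] * num_buckets

    def _index(self, itemset):
        # Sort the items so the same itemset always maps to the same bucket.
        key = "\x1f".join(sorted(itemset)).encode("utf-8")
        return zlib.crc32(key) % len(self.buckets)

    def add_transaction(self, items, max_size=2):
        # Count every itemset of up to max_size items in this transaction.
        unique = sorted(set(items))
        for size in range(1, max_size + 1):
            for itemset in combinations(unique, size):
                self.buckets[self._index(itemset)] += 1

    def estimate(self, itemset):
        # Collisions can only inflate a bucket, so this never undercounts.
        return self.buckets[self._index(itemset)]

# Hypothetical stream of transactions, invented for illustration.
counter = HashedItemsetCounter(num_buckets=64)
for transaction in [["milk", "bread"], ["milk", "bread", "eggs"], ["eggs"]]:
    counter.add_transaction(transaction)

pair_estimate = counter.estimate(("bread", "milk"))
```

A smaller table saves memory at the cost of more collisions and looser estimates; the paper's error analysis and parameter setting address that trade-off for its own structure, which this sketch does not reproduce.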
15.
Hemalatha Sathyanarayanamurthy Ratna Babu Chinnam 《Computers & Industrial Engineering》2009,57(3):996-1007
It is routine in probabilistic engineering design to conduct modeling studies to determine the influence of an input variable (or a combination) on the output variable(s). The output or the response can then be fine-tuned by changing the design parameters based on this information. However, simply fine-tuning the output to the desired or target value is not adequate. Robust design principles suggest that we study not only the mean response for a given input vector but also the variance in the output attributed to noise and other unaccounted factors. Given our desire to reduce variability in any process, it is also important to understand which of the input factors affect the variability in the output the most. Given the significant computational overhead associated with most Computer Aided Engineering models, it is becoming popular to conduct such analysis through surrogate models built using a variety of metamodeling techniques. In this regard, existing literature on metamodeling and sensitivity analysis techniques provides useful insights into the various scenarios that they suit the best. However, few studies simultaneously consider the combination of metamodeling and sensitivity analysis and the environments in which they operate the best. This paper aims to help close this gap by basing the study on multiple metrics and using two test problems. Two test functions have been used to build metamodels, using three popular metamodeling techniques: Kriging, Radial-Basis Function (RBF) networks, and Support Vector Machines (SVMs). The metamodels are then used for sensitivity analysis, using two popular sensitivity analysis methods, Fourier Amplitude Sensitivity Test (FAST) and Sobol, to determine the influence of variance in the input variables on the variance of the output variables. The advantages and disadvantages of the different metamodeling techniques, in combination with the sensitivity analysis methods, in determining the extent to which the variabilities in the input affect the variabilities in the output are analyzed.
16.
In recent years, the influences of design patterns on software quality have attracted increasing attention in the area of
software engineering, as design patterns encapsulate valuable knowledge to resolve design problems, and more importantly to
improve the design quality. One of the key challenges in object-oriented design is how to apply appropriate design patterns
during the system development. In this paper, design pattern is analyzed from different perspectives to see how it can facilitate
design activities, handle non-functional requirement, solve design problems and resolve design conflicts. Based on the analysis,
various kinds of applicability of design patterns are explored and integrated with a goal-driven approach to guiding developers
to construct the object-oriented design model in a systematic manner. There are three benefits to the proposed approach: making
it easy to meet requirements, helping resolve design conflicts, and facilitating improvement of the design quality.
17.
A novel multi-objective genetic algorithm (GA)-based rule-mining method for affective product design is proposed to discover a set of rules relating design attributes with customer evaluation based on survey data. The proposed method can generate approximate rules to consider the ambiguity of customer assessments. The generated rules can be used to determine the lower and upper limits of the affective effect of design patterns. For a rule-mining problem, the proposed multi-objective GA approach could simultaneously consider the accuracy, comprehensibility, and definability of approximate rules. In addition, the proposed approach can deal with categorical attributes and quantitative attributes, and determine the interval of quantitative attributes. Categorical and quantitative attributes in affective product design should be considered because they are commonly used to define the design profile of a product. In this paper, a two-stage rule-mining approach is proposed to generate rules with a simple chromosome design in the first stage of rule mining. In the second stage of rule mining, entire rule sets are refined to determine solutions considering rule interaction. A case study on mobile phones is used to demonstrate and validate the performance of the proposed rule-mining method. The method can discover rule sets with good support and coverage rates from the survey data.
18.
Doh-Soon Kwak 《Expert systems with applications》2012,39(3):2590-2596
Due to the rapid development of information technologies, abundant data have become readily available. Data mining techniques have been used for process optimization in many manufacturing processes in automotive, LCD, semiconductor, and steel production, among others. However, a large number of missing values occur in the data set due to several causes (e.g., data discarded by gross measurement errors, measurement machine breakdown, routine maintenance, sampling inspection, and sensor failure), which frequently complicates the application of data mining to the data set. This study proposes a new procedure for optimizing processes called missing values-Patient Rule Induction Method (m-PRIM), which handles the missing-values problem systematically and yields considerable process improvement, even if a significant portion of the data set has missing values. A case study in a semiconductor manufacturing process is conducted to illustrate the proposed procedure.
19.
Searching for simplified farmers' crop choice models for integrated watershed management in Thailand: A data mining approach Cited by: 1 (self-citations: 0, citations by others: 1)
This study used the C4.5 data mining algorithm to model farmers' crop choice in two watersheds in Thailand. Previous attempts in the Integrated Water Resource Assessment and Management Project to model farmers' crop choice produced large sets of decision rules. In order to produce simplified models of farmers' crop choice, data mining operations were applied for each soil series in the study areas. The resulting decision trees were much smaller in size. Land type, water availability, tenure, capital, labor availability as well as non-farm and livestock income were found to be important considerations in farmers' decision models. Profitability was also found important although it was represented in approximate ranges. Unlike the general wisdom on farmers' crop choice, these decision trees came with threshold values and sequential order of the important variables. The decision trees were validated using the remaining unused set of data, and their accuracy in predicting farmers' decisions was around 84%. Because of their simple structure, the decision trees produced in this study could be useful to analysts of water resource management as they can be integrated with biophysical models for sustainable watershed management.
20.
Modeling adiabatic temperature rise during concrete hydration: A data mining approach Cited by: 1 (self-citations: 0, citations by others: 1)
Alexandre G. Evsukoff Eduardo M.R. Fairbairn Étore F. Faria Marcos M. Silvoso Romildo D. Toledo Filho 《Computers & Structures》2006,84(31-32):2351-2362
This paper presents a data mining approach for modeling the adiabatic temperature rise during concrete hydration. The model was developed based on experimental data obtained in the last thirty years for several mass concrete constructions in Brazil, including some of the largest hydroelectric power plants in operation in the world. The input of the model is a variable data set corresponding to the binder physical and chemical properties and concrete mixture proportions. The output is a set of three parameters that determine a function capable of describing the adiabatic temperature rise during concrete hydration. The comparison between experimental data and modeling results shows the accuracy of the proposed approach and that data mining is a potential tool to predict thermal stresses in the design of massive concrete structures.