Similar literature
 Found 20 similar documents (search took 13 ms)
1.
The number, variety and complexity of projects involving data mining or knowledge discovery in databases have recently increased at such a pace that aspects of their development process need to be standardized so that results can be integrated, reused and interchanged in the future. Data mining projects are quickly becoming engineering projects, and current standard processes, like CRISP-DM, need to be revisited to incorporate this engineering viewpoint. This is the central motivation of this paper, which argues that experience gained about the software development process over almost 40 years could be reused and integrated to improve data mining processes. Consequently, this paper proposes to reuse ideas and concepts underlying the IEEE Std 1074 and ISO 12207 software engineering model processes to redefine and add to the CRISP-DM process and make it a data mining engineering standard.

2.
3.
In an iterative design process, there is a large amount of engineering data to be processed. Well-managed engineering data can keep companies competitive in the market. It has been recognized that a product data model is the basis for establishing an engineering database. To fully support complete product data representation throughout the product life cycle, an international product data representation and exchange standard, STEP, is applied to model the representation of a product. In this paper, the architecture of an engineering data management (EDM) system is described, which consists of an integrated product database. Six STEP-compatible data models are constructed to demonstrate the integrability of the EDM system using a common data modeling format. These data models are product definition, product structure, shape representation, engineering change, approval, and production scheduling. They are defined according to the integrated resources of STEP/ISO 10303 (Parts 41-44), which support a complete product information representation and a standard data format. Thus, application systems, such as CAD/CAM and MRP systems, can interact with the EDM system by accessing the database based on the STEP data exchange standard.

4.
Variant design for mechanical artifacts: A state-of-the-art survey   Total citations: 11, self-citations: 0, citations by others: 11
Variant design refers to the technique of adapting existing design specifications to satisfy new design goals and constraints. Specific support of variant design techniques in current computer aided design systems would help to realize a rapid response manufacturing environment. A survey of approaches supporting variant design is presented. Capabilities used in current commercial computer aided design systems are discussed along with approaches used in recent research efforts. Information standards applicable to variant design are also identified. Barriers to variant design in current systems are identified and ideas are presented for augmentation of current systems to support variant design.

5.
Ergonomics, 2012, 55(1-3): 188-196
Hand signs are considered one of the important ways to enter information into computers for certain tasks. Computers receive sensor data of hand signs for recognition. When using hand signs as computer inputs, we need to (1) train computer users in the sign language so that their hand signs can be easily recognized by computers, and (2) design the computer interface to avoid the use of confusing signs, improving user input performance and user satisfaction. For user training and computer interface design, it is important to know which signs computers can easily recognize and which signs they cannot distinguish. This paper presents a data mining technique to discover distinct patterns of hand signs from sensor data. Based on these patterns, we derive a group of signs that computers cannot distinguish. Such information can in turn assist in user training and computer interface design.

6.
This paper describes data mining and data warehousing techniques that can improve the performance and usability of Intrusion Detection Systems (IDS). Current IDS do not provide support for historical data analysis and data summarization. This paper presents techniques to model network traffic and alerts using a multi-dimensional data model and star schemas. This data model was used to perform network security analysis and detect denial of service attacks. Our data model can also be used to handle heterogeneous data sources (e.g. firewall logs, system calls, net-flow data) and enable up to two orders of magnitude faster query response times for analysts as compared to the current state of the art. We have used our techniques to implement a prototype system that is being successfully used at Army Research Labs. Our system has helped the security analyst in detecting intrusions and in historical data analysis for generating reports on trend analysis. Recommended by: Ashfaq Khokhar
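To make the star-schema idea in the abstract above concrete, here is a minimal sketch using Python's standard `sqlite3` module. The table names (`dim_time`, `dim_source`, `fact_alert`), their columns and the sample rows are all hypothetical. The abstract does not describe the actual schema, so this only illustrates the general pattern of one fact table joined to dimension tables.

```python
import sqlite3

# In-memory database; all table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_time (time_id INTEGER PRIMARY KEY, hour INTEGER);
CREATE TABLE dim_source (source_id INTEGER PRIMARY KEY, ip TEXT);
CREATE TABLE fact_alert (
    time_id INTEGER REFERENCES dim_time(time_id),
    source_id INTEGER REFERENCES dim_source(source_id),
    alert_count INTEGER
);
INSERT INTO dim_time VALUES (1, 9), (2, 10);
INSERT INTO dim_source VALUES (1, '10.0.0.5'), (2, '10.0.0.9');
INSERT INTO fact_alert VALUES (1, 1, 3), (1, 2, 1), (2, 1, 40), (2, 2, 2);
""")

# Roll the fact table up along the time dimension: total alerts per hour.
alerts_per_hour = conn.execute("""
    SELECT t.hour, SUM(f.alert_count)
    FROM fact_alert AS f JOIN dim_time AS t ON f.time_id = t.time_id
    GROUP BY t.hour ORDER BY t.hour
""").fetchall()

# Slice by source as well: which source dominates each hour.
alerts_by_source = conn.execute("""
    SELECT t.hour, s.ip, f.alert_count
    FROM fact_alert AS f
    JOIN dim_time AS t ON f.time_id = t.time_id
    JOIN dim_source AS s ON f.source_id = s.source_id
    ORDER BY f.alert_count DESC
""").fetchall()
```

A sudden jump in the per-hour total, concentrated on one source, is the kind of summary an analyst might inspect when looking for denial of service traffic. The thresholds and detection logic used in the paper are not given in the abstract and are not modeled here.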

7.
A Kansei mining system for affective design   Total citations: 4, self-citations: 0, citations by others: 4
Affective design has received much attention from both academia and industry. It aims at incorporating customers' affective needs into design elements that deliver customers' affective satisfaction. The main challenge for affective design originates from difficulties in mapping customers' subjective impressions, namely Kansei, to perceptual design elements. This paper intends to develop an explicit decision support to improve the Kansei mapping process by reusing knowledge from past sales records and product specifications. As one of the important applications of data mining, association rule mining lends itself to the discovery of useful patterns associated with the mapping of affective needs. A Kansei mining system is developed to utilize valuable affect information latent in customers' impressions of existing affective designs. The goodness of association rules is evaluated according to how well they meet customers' expectations. Conjoint analysis is applied to measure the expected and achieved utilities of a Kansei mapping relationship. Based on goodness evaluation, mapping rules are further refined to empower the system with useful inference patterns. The system architecture and implementation issues are discussed in detail. An application of Kansei mining to mobile phone affective design is presented.
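As a minimal sketch of the association rule mining mentioned in the abstract above, the following Python computes the support and confidence of a single rule. The records, the design attributes (such as `slim_body`) and the Kansei words (such as `elegant`) are hypothetical. The goodness evaluation via conjoint analysis that the paper describes is not reproduced.

```python
def support(records, items):
    """Fraction of records that contain every item in `items`."""
    items = set(items)
    return sum(items <= record for record in records) / len(records)


def confidence(records, antecedent, consequent):
    """Estimated P(consequent | antecedent) over the records."""
    joint = support(records, set(antecedent) | set(consequent))
    base = support(records, antecedent)
    return joint / base if base else 0.0


# Hypothetical records: design elements plus the Kansei word customers reported.
records = [
    {"slim_body", "metal_case", "elegant"},
    {"slim_body", "plastic_case", "elegant"},
    {"thick_body", "metal_case", "sturdy"},
    {"slim_body", "metal_case", "elegant"},
]

# Rule under test: slim_body -> elegant
rule_support = support(records, {"slim_body", "elegant"})
rule_confidence = confidence(records, {"slim_body"}, {"elegant"})
```

A rule miner would typically keep only rules whose support and confidence exceed chosen minimums. Those minimums are a tuning decision and are not stated in the abstract.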

8.
Concept design is an important wealth-creating activity in companies and infrastructure. However, the process of designing is very complex. Moreover, the information required during the conceptual stage is incomplete, imprecise, and fuzzy. Hence, fuzzy set theory should be used to handle linguistic problems at this stage. This paper presents a fuzzy integrated approach to assess the performance of design concepts. Criteria ratings, relative weights and performance levels are captured by fuzzy numbers, and the overall performance of each alternative is calculated through an enhanced fuzzy weighted average (FWA) approach. A practical numerical example is provided to demonstrate the usefulness of this study. In addition, to simplify computing and ranking the results and thereby increase recruiting productivity, this paper develops a computer-based decision support system that helps make decisions more efficiently.

9.
This paper develops an efficient methodology to perform reliability-based design optimization (RBDO) by decoupling the optimization and reliability analysis iterations that are nested in traditional formulations. This is achieved by approximating the reliability constraints based on the reliability analysis results. Unlike other existing decoupled approaches, the proposed approach does not use inverse first-order reliability analysis but uses direct reliability analysis. This strategy allows a modular approach and the use of more accurate methods, including Monte-Carlo-simulation (MCS)-based methods for highly nonlinear reliability constraints where first-order reliability approximation may not be accurate. The use of simulation-based methods also enables system-level reliability estimates to be included in the RBDO formulation. The efficiency of the proposed RBDO approach is further improved by identifying the potentially active reliability constraints at the beginning of each reliability analysis. A vehicle side impact problem is used to examine the proposed method, and the results show the usefulness of the proposed method.

10.
Mobile computing systems usually express a user movement trajectory as a sequence of areas that capture the user movement trace. Given a set of user movement trajectories, user movement patterns refer to the sequences of areas through which a user frequently travels. In an attempt to obtain user movement patterns for mobile applications, prior studies explore the problem of mining user movement patterns from the movement logs of mobile users. These movement logs generate a data record whenever a mobile user crosses base station coverage areas. However, this type of movement log does not exist in the system and thus generates extra overhead. By exploiting an existing log, namely, call detail records, this article proposes a Regression-based approach for mining User Movement Patterns (abbreviated as RUMP). This approach views call detail records as random sample trajectory data, and thus, user movement patterns are represented as movement functions in this article. We propose algorithm LS (standing for Large Sequence) to extract the call detail records that capture frequent user movement behaviors. By exploring the spatio-temporal locality of continuous movements (i.e., a mobile user is likely to be in nearby areas if the time interval between consecutive calls is small), we develop algorithm TC (standing for Time Clustering) to cluster call detail records. Then, by utilizing regression analysis, we develop algorithm MF (standing for Movement Function) to derive movement functions. Experimental studies involving both synthetic and real datasets show that RUMP is able to derive user movement functions close to the frequent movement behaviors of mobile users.
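The regression step described in the abstract above can be illustrated with an ordinary least squares fit of position against time. This is only a sketch under simplifying assumptions: the abstract does not give the functional form used by algorithm MF, so a straight line per coordinate is assumed here. The sample records and the names `fit_line` and `movement_function` are hypothetical.

```python
def fit_line(ts, vs):
    """Ordinary least squares fit of v = intercept + slope * t."""
    n = len(ts)
    mean_t = sum(ts) / n
    mean_v = sum(vs) / n
    var_t = sum((t - mean_t) ** 2 for t in ts)
    if var_t == 0:
        raise ValueError("need at least two distinct time stamps")
    slope = sum((t - mean_t) * (v - mean_v) for t, v in zip(ts, vs)) / var_t
    return mean_v - slope * mean_t, slope


def movement_function(samples):
    """Fit x(t) and y(t) separately and return a function t -> (x, y)."""
    ts = [t for t, _, _ in samples]
    ax, bx = fit_line(ts, [x for _, x, _ in samples])
    ay, by = fit_line(ts, [y for _, _, y in samples])
    return lambda t: (ax + bx * t, ay + by * t)


# Hypothetical cluster of call detail records: (hour of day, x, y) of the serving cell.
samples = [(8.0, 0.0, 1.0), (9.0, 2.0, 2.0), (10.0, 4.0, 3.0), (11.0, 6.0, 4.0)]
position_at = movement_function(samples)
predicted = position_at(12.0)
```

In the approach the abstract outlines, a fit like this would be applied per time cluster (the output of algorithm TC) rather than to all records at once, so that each cluster gets its own movement function.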

11.
This paper discusses the development of an intelligent routing system for automating the design of electrical wiring harnesses and pipes in aircraft. The system employs knowledge based engineering (KBE) methods and technologies for capturing and implementing rules and engineering knowledge relating to the routing process. The system reads a mesh of three-dimensional structure and obstacles falling within a given search space and connects source and target terminals while satisfying a knowledge base of design rules and best practices. Routed paths are output as computer aided design (CAD) readable geometry, and a finite element (FE) mesh consisting of geometry, routed paths and a knowledge layer providing detail of the rules and knowledge implemented in the process. Use of this intelligent routing system provides structure to the routing design process and has potential to deliver significant savings in time and cost.

12.
In privacy-preserving data mining (PPDM), a widely used method for achieving data mining goals while preserving privacy is based on k-anonymity. This method, which protects subject-specific sensitive data by anonymizing it before it is released for data mining, demands that every tuple in the released table should be indistinguishable from no fewer than k subjects. The most common approach for achieving compliance with k-anonymity is to replace certain values with less specific but semantically consistent values. In this paper we propose a different approach for achieving k-anonymity by partitioning the original dataset into several projections such that each one of them adheres to k-anonymity. Moreover, any attempt to rejoin the projections results in a table that still complies with k-anonymity. A classifier is trained on each projection and subsequently, an unlabelled instance is classified by combining the classifications of all classifiers. Guided by classification accuracy and k-anonymity constraints, the proposed data mining privacy by decomposition (DMPD) algorithm uses a genetic algorithm to search for optimal feature set partitioning. Ten separate datasets were evaluated with DMPD in order to compare its classification performance with other k-anonymity-based methods. The results suggest that DMPD performs better than existing k-anonymity-based algorithms and there is no necessity for applying domain dependent knowledge. Using multiobjective optimization methods, we also examine the tradeoff between the two conflicting objectives in PPDM: privacy and predictive performance.
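The k-anonymity requirement stated in the abstract above can be checked with a short grouping routine. The sketch below is a minimal illustration, not the DMPD algorithm: it only tests whether a given table, or a given projection of it, satisfies k-anonymity. The table, the column names and the function name `is_k_anonymous` are hypothetical.

```python
from collections import Counter


def is_k_anonymous(rows, quasi_identifiers, k):
    """True if every quasi-identifier combination occurs in at least k rows."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return all(count >= k for count in groups.values())


# Hypothetical released table; "age" and "zip" are already generalized values.
rows = [
    {"age": "30-39", "zip": "021**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "021**", "diagnosis": "asthma"},
    {"age": "40-49", "zip": "021**", "diagnosis": "flu"},
    {"age": "40-49", "zip": "021**", "diagnosis": "diabetes"},
]

ok_for_2 = is_k_anonymous(rows, ["age", "zip"], k=2)
ok_for_3 = is_k_anonymous(rows, ["age", "zip"], k=3)
```

A decomposition-based method such as the one the abstract describes would run a check of this kind on each candidate projection (a subset of columns). The genetic search over feature set partitions is outside the scope of this sketch.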

13.
The object-oriented approach has been the most popular software design methodology for the past twenty-five years. Several design patterns and principles are defined to improve the design quality of object-oriented software systems. In addition, designers can use unique design motifs that are designed for specific application domains. Another commonly used technique is cloning and modifying some parts of the software while creating new modules. Therefore, object-oriented programs can include many identical design structures. This work proposes a sub-graph mining-based approach for detecting identical design structures in object-oriented systems. By identifying and analyzing these structures, we can obtain useful information about the design, such as commonly-used design patterns, most frequent design defects, domain-specific patterns, and reused design clones, which could help developers to improve their knowledge about the software architecture. Furthermore, problematic parts of frequent identical design structures are appropriate refactoring opportunities because they affect multiple areas of the architecture. Experiments with several open-source and industrial projects show that we can successfully find many identical design structures within a project (intra-project) and between different projects (inter-project). We observe that most of the detected identical structures are usually implementations of common design patterns; however, we also detect various anti-patterns, domain-specific patterns, reused design parts and design-level clones.

14.
In recent times, data are generated in the form of continuous data streams in many applications. Since handling data streams is necessary and discovering knowledge behind data streams can often yield substantial benefits, mining over data streams has become one of the most important issues. Many approaches for mining frequent itemsets over data streams have been proposed. These approaches often consist of two procedures: continuously maintaining synopses for data streams and finding frequent itemsets from the synopses. However, most of the approaches assume that the synopses of data streams can be saved in memory and ignore the fact that the information of the non-frequent itemsets kept in the synopses may cause memory utilization to be significantly degraded. In this paper, we consider compressing the information of all the itemsets into a structure with a fixed size using a hash-based technique. This hash-based approach skillfully summarizes the information of the whole data stream by using a hash table, provides a novel technique to estimate the support counts of the non-frequent itemsets, and keeps only the frequent itemsets for speeding up the mining process. Therefore, the goal of optimizing memory space utilization can be achieved. The correctness guarantee, error analysis, and parameter setting of this approach are presented and a series of experiments is performed to show the effectiveness and the efficiency of this approach.
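The idea of summarizing all itemsets in a fixed-size hashed structure can be sketched as follows. This is a simplified stand-in, not the paper's data structure: the abstract does not specify the table layout or the estimation technique, so a single array of counters is assumed, in which hash collisions can only inflate a count. The class name `HashedItemsetCounter`, the table size and the `max_len` cap are hypothetical.

```python
import zlib
from itertools import combinations


class HashedItemsetCounter:
    """Fixed-size table of counters indexed by a hash of each itemset."""

    def __init__(self, size=1024, max_len=2):
        self.counts = [0] * size   # memory use is fixed by `size`
        self.max_len = max_len     # longest itemset that is tracked

    def _slot(self, itemset):
        # zlib.crc32 is used because it is stable across runs,
        # unlike Python's built-in hash() for strings.
        key = "\x1f".join(sorted(itemset)).encode("utf-8")
        return zlib.crc32(key) % len(self.counts)

    def add(self, transaction):
        """Count every itemset of up to max_len items in one transaction."""
        items = sorted(set(transaction))
        for n in range(1, self.max_len + 1):
            for itemset in combinations(items, n):
                self.counts[self._slot(itemset)] += 1

    def estimate(self, itemset):
        """Upper bound on the support count of an itemset of at most max_len items."""
        return self.counts[self._slot(itemset)]


# Hypothetical stream of transactions (items are strings).
stream = [["a", "b"], ["a", "c"], ["a", "b", "c"], ["b"]]
counter = HashedItemsetCounter(size=1024)
for transaction in stream:
    counter.add(transaction)
```

Because counters are shared, `estimate` never undercounts a tracked itemset, and it overcounts only when two itemsets fall into the same slot. A larger `size` trades memory for accuracy. The error analysis and parameter setting reported in the paper concern its own structure and do not carry over to this sketch.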

15.
It is routine in probabilistic engineering design to conduct modeling studies to determine the influence of an input variable (or a combination) on the output variable(s). The output or the response can then be fine-tuned by changing the design parameters based on this information. However, simply fine-tuning the output to the desired or target value is not adequate. Robust design principles suggest that we study not only the mean response for a given input vector but also the variance in the output attributed to noise and other unaccounted factors. Given our desire to reduce variability in any process, it is also important to understand which of the input factors affect the variability in the output the most. Given the significant computational overhead associated with most Computer Aided Engineering models, it is becoming popular to conduct such analysis through surrogate models built using a variety of metamodeling techniques. In this regard, existing literature on metamodeling and sensitivity analysis techniques provides useful insights into the scenarios that each suits best. However, few studies have simultaneously considered the combination of metamodeling and sensitivity analysis and the environments in which they operate best. This paper aims to help close this gap by basing the study on multiple metrics and using two test problems. Two test functions have been used to build metamodels, using three popular metamodeling techniques: Kriging, Radial-Basis Function (RBF) networks, and Support Vector Machines (SVMs). The metamodels are then used for sensitivity analysis, using two popular sensitivity analysis methods, Fourier Amplitude Sensitivity Test (FAST) and Sobol, to determine the influence of variance in the input variables on the variance of the output variables. The advantages and disadvantages of the different metamodeling techniques, in combination with the sensitivity analysis methods, in determining the extent to which the variabilities in the input affect the variabilities in the output are analyzed.

16.
In recent years, the influence of design patterns on software quality has attracted increasing attention in the area of software engineering, as design patterns encapsulate valuable knowledge to resolve design problems and, more importantly, to improve design quality. One of the key challenges in object-oriented design is how to apply appropriate design patterns during system development. In this paper, design patterns are analyzed from different perspectives to see how they can facilitate design activities, handle non-functional requirements, solve design problems and resolve design conflicts. Based on the analysis, various kinds of applicability of design patterns are explored and integrated with a goal-driven approach to guide developers in constructing the object-oriented design model in a systematic manner. There are three benefits to the proposed approach: making it easy to meet requirements, helping resolve design conflicts, and facilitating improvement of the design quality.

17.
A novel multi-objective genetic algorithm (GA)-based rule-mining method for affective product design is proposed to discover a set of rules relating design attributes with customer evaluation based on survey data. The proposed method can generate approximate rules to consider the ambiguity of customer assessments. The generated rules can be used to determine the lower and upper limits of the affective effect of design patterns. For a rule-mining problem, the proposed multi-objective GA approach could simultaneously consider the accuracy, comprehensibility, and definability of approximate rules. In addition, the proposed approach can deal with categorical attributes and quantitative attributes, and determine the interval of quantitative attributes. Categorical and quantitative attributes in affective product design should be considered because they are commonly used to define the design profile of a product. In this paper, a two-stage rule-mining approach is proposed to generate rules with a simple chromosome design in the first stage of rule mining. In the second stage of rule mining, entire rule sets are refined to determine solutions considering rule interaction. A case study on mobile phones is used to demonstrate and validate the performance of the proposed rule-mining method. The method can discover rule sets with good support and coverage rates from the survey data.

18.
Due to the rapid development of information technologies, abundant data have become readily available. Data mining techniques have been used for process optimization in many manufacturing processes in automotive, LCD, semiconductor, and steel production, among others. However, a large number of missing values occur in the data set due to several causes (e.g., data discarded because of gross measurement errors, measurement machine breakdown, routine maintenance, sampling inspection, and sensor failure), which frequently complicate the application of data mining to the data set. This study proposes a new procedure for optimizing processes called missing values-Patient Rule Induction Method (m-PRIM), which handles the missing-values problem systematically and yields considerable process improvement, even if a significant portion of the data set has missing values. A case study in a semiconductor manufacturing process is conducted to illustrate the proposed procedure.

19.
This study used the C4.5 data mining algorithm to model farmers' crop choice in two watersheds in Thailand. Previous attempts in the Integrated Water Resource Assessment and Management Project to model farmers' crop choice produced large sets of decision rules. In order to produce simplified models of farmers' crop choice, data mining operations were applied for each soil series in the study areas. The resulting decision trees were much smaller in size. Land type, water availability, tenure, capital, labor availability as well as non-farm and livestock income were found to be important considerations in farmers' decision models. Profitability was also found to be important, although it was represented in approximate ranges. Unlike the general wisdom on farmers' crop choice, these decision trees came with threshold values and a sequential order of the important variables. The decision trees were validated using the remaining unused set of data, and their accuracy in predicting farmers' decisions was around 84%. Because of their simple structure, the decision trees produced in this study could be useful to analysts of water resource management as they can be integrated with biophysical models for sustainable watershed management.
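C4.5 chooses the attribute to split on by gain ratio, that is, information gain divided by split information. The sketch below computes that criterion for categorical attributes only. The survey rows and attribute values are hypothetical, and the thresholding of continuous attributes and the pruning that C4.5 also performs are omitted.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(c / total * log2(c / total) for c in Counter(labels).values())


def gain_ratio(rows, attribute, target):
    """Information gain of splitting on `attribute`, divided by split information."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    total = len(rows)
    # Weighted entropy of the target after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy([row[target] for row in rows]) - remainder
    # Split information is the entropy of the attribute's own value distribution.
    split_info = entropy([row[attribute] for row in rows])
    return gain / split_info if split_info else 0.0


# Hypothetical survey rows: two candidate attributes and the crop chosen.
rows = [
    {"water": "irrigated", "tenure": "owner", "crop": "rice"},
    {"water": "irrigated", "tenure": "tenant", "crop": "rice"},
    {"water": "rainfed", "tenure": "owner", "crop": "maize"},
    {"water": "rainfed", "tenure": "tenant", "crop": "maize"},
]

water_ratio = gain_ratio(rows, "water", "crop")
tenure_ratio = gain_ratio(rows, "tenure", "crop")
```

In this toy table water availability separates the crops perfectly while tenure carries no information, so a tree builder using this criterion would place `water` at the root. The ordering of variables in the study's actual trees is reported in the paper, not derived here.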

20.
This paper presents a data mining approach for modeling the adiabatic temperature rise during concrete hydration. The model was developed based on experimental data obtained in the last thirty years for several mass concrete constructions in Brazil, including some of the largest hydroelectric power plants in operation in the world. The input of the model is a variable data set corresponding to the binder physical and chemical properties and concrete mixture proportions. The output is a set of three parameters that determine a function capable of describing the adiabatic temperature rise during concrete hydration. The comparison between experimental data and modeling results shows the accuracy of the proposed approach and that data mining is a potential tool to predict thermal stresses in the design of massive concrete structures.
