Similar Documents
20 similar documents found
1.
Data analysis often involves finding models that can explain patterns in data and reduce possibly large data sets to more compact model‐based representations. In statistics, many methods are available to compute model information; among others, regression models are widely used to explain data. However, regression analysis typically searches for the best model based on the global distribution of the data. On the other hand, a data set may be partitioned into subsets, each requiring an individual model. While automatic data subsetting methods exist, they often require parameters or domain knowledge to work well. We propose a system for visual‐interactive regression analysis of scatter plot data, supporting both global and local regression modeling. We introduce a novel regression lens concept that allows a user to interactively select a portion of the data, on which regression analysis is run in interactive time. The lens gives encompassing visual feedback on the quality of candidate models as it is navigated across the input data. While our regression lens can be used for fully interactive modeling, we also provide user guidance suggesting appropriate models and data subsets by means of regression quality scores. We show, by means of use cases, that our regression lens is an effective tool for user‐driven regression modeling and supports model understanding.
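A minimal sketch of the lens idea, assuming a circular lens over a scatter plot and ordinary least-squares polynomial fits; the function `lens_fit` and its signature are illustrative, not the authors' API:

```python
# Fit a local model only to the points inside a user-positioned lens
# and report a quality score, as visual feedback while the lens moves.
import numpy as np

def lens_fit(x, y, cx, cy, radius, degree=1):
    """Fit a polynomial to the points inside the lens centered at
    (cx, cy) and return the model plus an R^2 quality score."""
    inside = (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2
    xs, ys = x[inside], y[inside]
    coeffs = np.polyfit(xs, ys, degree)            # local regression model
    resid = ys - np.polyval(coeffs, xs)
    r2 = 1.0 - (resid ** 2).sum() / ((ys - ys.mean()) ** 2).sum()
    return coeffs, r2

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 500)
y = np.where(x < 5, 2 * x, -x + 15) + rng.normal(0, 0.5, 500)  # two local trends
print(lens_fit(x, y, cx=2.5, cy=5.0, radius=2.5)[1])  # high R^2 on the left subset
```

A global fit over this data would score poorly; the lens restricted to one subset recovers a clean local model, which is the point of the technique.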

2.
Object models, or class diagrams, are widely used for capturing information system requirements in terms of classes with attributes and operations, and relationships among those classes. Although numerous guidelines are available for object modeling as part of requirements modeling, developing quality object models has always been considered a challenging task, especially for novice systems analysts in business environments. This paper presents an approach that can be used to support the development of quality object models. The approach is implemented as a knowledge-based extension to an open source CASE tool that offers recommendations for improving the quality of object models. The knowledge component of this system incorporates an ontology of quality problems commonly found in object models, based on a conceptual model quality framework, the findings of related empirical studies, and a set of analysis patterns. The results obtained from an empirical evaluation of the prototype demonstrate the utility of this system, especially with respect to recommendations related to the model completeness aspect of semantic quality.

3.
The fast advancement of technology has led to increased interest in using information technology to provide feedback based on observations of learning behavior. This work outlines a novel approach for analyzing behavioral learner data through the application of process mining techniques, specifically targeting a complex problem-solving process. We realize this in the context of one particular learning case, namely domain modeling. This work extends our previous research on process-mining analysis of the domain modeling behavior of novices with new insights from a replication study, enhanced with an extra observation of how novices verify and validate models. The findings include a set of typical modeling and validation patterns that can be used to improve teaching guidance for domain modeling courses. From a scientific viewpoint, the results contribute to our knowledge of the cognitive aspects of the problem-solving behavior of novices in the area of domain modeling, specifically regarding process-oriented feedback as opposed to the traditional outcome feedback (Is a solution correct? Why, or why not?) usually applied in this type of course. Ultimately, the outcomes of the work can be inspirational beyond the area of domain modeling, as learning event data is becoming readily available through virtual learning environments and other information systems.

4.
There are types of information systems, like those that produce group recommendations or a market segmentation, in which it is necessary to aggregate large amounts of data about a group of users in order to filter the data. Group modeling is the process that combines multiple user models into a single model representing the knowledge available about the preferences of the users in a group. In group recommendation, group modeling allows a system to derive a group preference for each item. Different strategies lead to completely different group models, so the strategy used to model a group has to be evaluated in the domain in which the group recommender system operates. This paper evaluates group modeling strategies in a group recommendation scenario in which groups are detected by clustering the users. Once users are clustered and groups are formed, different strategies are tested in order to find the one that gives a group recommender system the best accuracy. Experimental results show that the strategy used to build the group models strongly affects the performance of a group recommender system. An interesting property derived from our study is that clustering and group modeling are strongly connected. Indeed, the modeling strategy takes the same role that the centroid has when users are clustered, by producing group preferences that are equally distant from the preferences of every user. This 'continuity' between the two tasks is essential in order to build accurate group recommendations.
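The classic group-modeling strategies from the literature can be expressed compactly; a hedged sketch, assuming a users-by-items matrix of predicted preferences (the exact strategy set evaluated in the paper may differ):

```python
# Three standard strategies for merging per-user rating profiles into
# one group profile: average, least misery, most pleasure.
import numpy as np

def group_model(user_ratings: np.ndarray, strategy: str) -> np.ndarray:
    """user_ratings: (n_users, n_items) matrix of predicted preferences."""
    if strategy == "average":        # group preference = mean rating
        return user_ratings.mean(axis=0)
    if strategy == "least_misery":   # nobody should hate the item
        return user_ratings.min(axis=0)
    if strategy == "most_pleasure":  # someone should love the item
        return user_ratings.max(axis=0)
    raise ValueError(strategy)

group = np.array([[5, 2, 4], [4, 1, 5], [3, 3, 4]], dtype=float)
for s in ("average", "least_misery", "most_pleasure"):
    print(s, group_model(group, s))
```

Each strategy produces a different group preference vector from the same user models, which is why the choice must be validated per domain, as the paper argues.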

5.
Computer-integrated manufacturing requires models of manufacturing processes to be implemented on the computer. Process models are required for designing adaptive control systems and for selecting optimal parameters during process planning. Mechanistic models developed from the principles of machining science are useful for implementing on a computer. However, in spite of the progress being made in mechanistic process modeling, accurate models are not yet available for many manufacturing processes. Empirical models derived from experimental data still play a major role in manufacturing process modeling. Generally, statistical regression techniques are used for developing such models. However, these techniques suffer from several disadvantages: the structure (the significant terms) of the regression model needs to be decided a priori, and the techniques cannot be used to incrementally improve models as new data become available. This limitation is particularly crucial in light of advances in sensor technology that allow economical on-line collection of manufacturing data. In this paper, we explore the use of artificial neural networks (ANN) for developing empirical models from experimental data for a machining process. These models are compared with polynomial regression models to assess the applicability of ANN as a model-building tool for computer-integrated manufacturing.
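A sketch of the kind of comparison described, using scikit-learn on synthetic machining-style data; the feature names, network size, and data generator are assumptions, and `partial_fit` illustrates the incremental-update advantage claimed for ANNs:

```python
# Compare a second-order polynomial regression against a small neural
# network as empirical process models (illustrative, not the paper's data).
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(1)
X = rng.uniform([50, 0.1], [200, 0.5], size=(80, 2))   # e.g. speed, feed (assumed)
y = 0.02 * X[:, 0] * X[:, 1] + 0.1 * X[:, 1] ** 2 + rng.normal(0, 0.05, 80)

poly = make_pipeline(PolynomialFeatures(degree=2), LinearRegression()).fit(X, y)
ann = MLPRegressor(hidden_layer_sizes=(8,), max_iter=5000, random_state=0).fit(X, y)

# An ANN can be refined incrementally as new sensor data arrives, which a
# plain least-squares polynomial fit does not support directly:
ann.partial_fit(X[:10], y[:10])
print(poly.score(X, y), ann.score(X, y))
```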

6.
System identification using Kautz models
In this paper, the problem of approximating a linear time-invariant stable system by a finite weighted sum of given exponentials is considered. System identification schemes using Laguerre models are extended and generalized to Kautz models, which correspond to representations using several different possible complex exponentials. In particular, linear regression methods to estimate this sort of model from measured data are analyzed. The advantages of the proposed approach are the simplicity of the resulting identification scheme and the capability of modeling resonant systems using few parameters. The subsequent analysis is based on the result that the corresponding linear regression normal equations have a block Toeplitz structure. Several results on transfer function estimation are extended to discrete Kautz models, for example asymptotic frequency-domain variance expressions.
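The core estimation step can be illustrated in a few lines: approximating an impulse response by a weighted sum of given exponentials via linear least squares. This is a simplified stand-in (the true Kautz functions are orthonormalized filters, not raw exponentials), with the pole values assumed for illustration:

```python
import numpy as np

n = np.arange(50)
h_true = 0.8 ** n * np.cos(0.5 * n) + 0.05 * 0.6 ** n  # a resonant impulse response

# Given exponentials: one complex-conjugate pole pair (as cos/sin modes)
# plus a real pole; estimating the weights is then a linear regression.
Phi = np.column_stack([0.8 ** n * np.cos(0.5 * n),
                       0.8 ** n * np.sin(0.5 * n),
                       0.6 ** n])
w, *_ = np.linalg.lstsq(Phi, h_true, rcond=None)

# For Kautz bases the normal equations Phi.T @ Phi @ w = Phi.T @ h_true
# have a (block) Toeplitz structure that fast solvers can exploit.
print(w, np.linalg.norm(h_true - Phi @ w))
```

The complex-conjugate pair is what lets a resonant system be captured with only a handful of parameters, the advantage the paper highlights.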

7.
Information sources such as relational databases, spreadsheets, XML, JSON, and Web APIs contain a tremendous amount of structured data that can be leveraged to build and augment knowledge graphs. However, they rarely provide a semantic model to describe their contents. Semantic models of data sources represent the implicit meaning of the data by specifying the concepts and the relationships within the data. Such models are the key ingredients needed to automatically publish the data into knowledge graphs. Manually modeling the semantics of data sources requires significant effort and expertise, and although desirable, building these models automatically is a challenging problem. Most of the related work focuses on semantic annotation of the data fields (source attributes). However, constructing a semantic model that explicitly describes the relationships between the attributes, in addition to their semantic types, is critical. We present a novel approach that exploits the knowledge from a domain ontology and the semantic models of previously modeled sources to automatically learn a rich semantic model for a new source. This model represents the semantics of the new source in terms of the concepts and relationships defined by the domain ontology. Given some sample data from the new source, we leverage the knowledge in the domain ontology and the known semantic models to construct a weighted graph that represents the space of plausible semantic models for the new source. Then, we compute the top k candidate semantic models and suggest to the user a ranked list of semantic models for the new source. The approach takes user corrections into account to learn more accurate semantic models on future data sources. Our evaluation shows that our method generates expressive semantic models for data sources and services with minimal user input. These precise models make it possible to automatically integrate the data across sources and provide rich support for source discovery and service composition. They also make it possible to automatically publish semantic data into knowledge graphs.
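The final ranking step admits a toy sketch: given scored candidate semantic types per source attribute, enumerate combinations and keep the top k by total weight. The paper's weighted-graph construction is far richer; the attribute and type names here are purely illustrative:

```python
# Rank candidate semantic models by total weight of their type assignments.
import heapq
from itertools import product

candidates = {                       # attribute -> [(semantic type, weight)]
    "name":  [("Person.name", 0.9), ("City.name", 0.4)],
    "birth": [("Person.birthDate", 0.8), ("Event.date", 0.5)],
}

def top_k_models(candidates, k=3):
    attrs = list(candidates)
    scored = ((sum(w for _, w in combo),
               dict(zip(attrs, (t for t, _ in combo))))
              for combo in product(*candidates.values()))
    return heapq.nlargest(k, scored, key=lambda s: s[0])

for score, model in top_k_models(candidates):
    print(round(score, 2), model)
```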

8.
9.
To ensure a consistent design representation for serving multidisciplinary analysis, this research study proposes an intelligent modeling system that automatically generates multiphysics simulation models to support multidisciplinary design optimization processes, using a knowledge-based engineering approach. A key element of this system is a multiphysics information model (MIM), which integrates the design and simulation knowledge from multiple engineering domains. The intelligent modeling system defines classes with attributes to represent various aspects of physical entities. Moreover, it uses functions to capture non-physical information, such as control architecture, simulation test maneuvers, and simulation procedures. The challenge of system coupling and the interactions among the disciplines are taken into account during the process of knowledge acquisition. Depending on the domain requirements, the intelligent modeling system extracts the required knowledge from the MIM and uses it first to instantiate submodels and second to construct the multiphysics simulation model by combining all submodels. The objective of this research is to reduce the time and effort needed for modeling complex systems and to provide a consistent and concurrent design environment to support multidisciplinary design optimization. The development of an unstable, unmanned aerial vehicle, a multirotor UAV, is selected as the test case. The intelligent modeling system is demonstrated by modeling thirty thousand multirotor UAV designs with different topologies and by ensuring the automatic development of a consistent control system dedicated to each individual design. Moreover, the resulting multiphysics simulation model of the multirotor UAV is validated by comparison with the flight data of an actual quadrotor UAV. The results show that the multiphysics simulation model matches the test data well and indicate that high-fidelity models can be generated with the automatic model generation process.

10.
Model management research investigates the formulation, analysis, and interpretation of models. This paper focuses on the formulation aspects of modeling so that the task can be supported by decision support system (DSS) environments. Given the knowledge-intensive nature of the formulation process, the development of a modeling tool requires explicating the knowledge pertaining to modeling. This involves comprehending not only the static knowledge about model components (e.g., decision variables, coefficients, associated indices, and constants), but also the process knowledge required to construct models from model pieces. The proposed top-down approach configures equations by exploiting the structural modeling knowledge inherent in equation components. The possible representation of equations at various abstraction levels is introduced, the aim being to uncover the structural model components together with the process knowledge required for their appropriate configuration. As part of developing this conceptual model, the role of semantic and syntactic information in model building is investigated. The paper proposes an approach where the formulation semantics are captured as a simple 'action-resource' view, which composes models by identifying and piecing together the equation components. The process of equation construction is illustrated with examples from the linear programming (LP) modeling domain. The proposed top-down approach is contrasted with a bottom-up method.

11.
This paper presents an expert system-based procedure for the creation of airframe finite element models from the geometric model available in a computer-aided design system. The objectives of the approach presented are to computerize a process that is currently carried out in a semimanual manner, to incorporate previous modeling knowledge into the system, and to provide a clear path toward an ever-increasing level of automation in the process of creating analysis models for this class of structure. A main feature of the system developed is the combined use of algorithmic procedures and expert knowledge operating on data provided by previously run design software to produce an entirely different form of model to be used as input to a numerical analysis.

12.
KAVE: a tool for knowledge acquisition to support artificial ventilation
A decision support system for artificial ventilation is being developed. One of the fundamental goals for this system is that it can be applied when a domain expert is not present. Such a system requires a rich knowledge base. The knowledge acquisition process is often considered to be the bottleneck in acquiring such a complete knowledge base. Since no single available method, for example interviewing domain experts, is sufficient for removing this bottleneck, we have chosen a combination of different methods. The different backgrounds of knowledge engineers and domain experts can cause communication restrictions and difficulties between them; for example, they might not understand each other's knowledge domains, which affects how the knowledge is formulated. To solve this problem we needed a tool that supports both the knowledge engineer and the domain expert from the initial phase of developing the knowledge base onward. We have developed a knowledge acquisition system called KAVE to elicit knowledge from domain experts and store it in the knowledge base. KAVE is based on a domain-specific conceptual model that is the result of cooperation between knowledge engineers and domain experts during the identification, design, and structuring of knowledge for this domain. KAVE includes a patient simulator to help validate knowledge in the knowledge base and a knowledge editor to facilitate refinement and maintenance of the knowledge base.

13.
In software product line engineering, customers mostly concentrate on the functionalities of the target product during product configuration. The quality attributes of a target product, such as security and performance, are often not assessed until the final product is generated. However, it can be very costly to fix the problem if the generated product turns out not to satisfy the customers' quality requirements. Although the quality of a generated product is affected by all the life cycles of product development, feature-based product configuration is the first stage where the estimation or prediction of quality attributes should be considered. The key issue in predicting the quality attributes of a product configured from feature models is measuring the interdependencies between functional features and quality attributes. Existing approaches have several limitations on this issue, such as requiring real products for the measurement or heavy involvement of domain experts. To overcome these limitations, we propose a systematic approach for modeling quality attributes in feature models based on domain experts' judgments, using the analytic hierarchy process (AHP), and for conducting quality-aware product configuration based on the captured quality knowledge. Domain experts' judgments are used to avoid generating real products for quality evaluation, and AHP is used to reduce the domain experts' effort involved in the judgments. A prototype tool is developed to implement the concepts of the proposed approach, and a formal evaluation is carried out based on a large-scale case study.
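A minimal sketch of the AHP step, assuming a standard Saaty-style pairwise-comparison matrix and principal-eigenvector weighting; the quality attributes and judgment values are hypothetical:

```python
# Derive quality-attribute weights from an expert's pairwise comparisons.
import numpy as np

# Expert judgment: security vs performance vs usability (1-9 scale,
# reciprocal matrix; values are hypothetical).
A = np.array([[1.0, 3.0, 5.0],
              [1/3, 1.0, 2.0],
              [1/5, 1/2, 1.0]])

eigvals, eigvecs = np.linalg.eig(A)
k = np.argmax(eigvals.real)
w = eigvecs[:, k].real
w = w / w.sum()                                   # normalized attribute weights
ci = (eigvals.real[k] - len(A)) / (len(A) - 1)    # consistency index
print(dict(zip(["security", "performance", "usability"], w.round(3))), round(ci, 4))
```

The consistency index flags contradictory judgments, which is one way AHP keeps expert effort both low and checkable.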

14.
One of the most impressive characteristics of human perception is its domain adaptation capability. Humans can recognize objects and places simply by transferring knowledge from their past experience. Inspired by this, current research in robotics is addressing a great challenge: building robots able to sense and interpret the surrounding world by reusing information previously collected, gathered by other robots, or obtained from the web. But how can a robot automatically understand what is useful among a large amount of information and perform knowledge transfer? In this paper we address the domain adaptation problem in the context of visual place recognition. We consider the scenario where a robot equipped with a monocular camera explores a new environment. In this situation, traditional approaches based on supervised learning perform poorly, as no annotated data are provided in the new environment and the models learned from data collected in other places are inappropriate due to the large variability of visual information. To overcome these problems we introduce a novel transfer learning approach. With our algorithm the robot is given only some training data (annotated images collected in different environments by other robots) and is able to decide whether, and how much, this knowledge is useful in the current scenario. At the core of our approach is a transfer risk measure that quantifies the similarity between the given and the new visual data. To improve performance, we also extend our framework to take multiple visual cues into account. Our experiments on three publicly available datasets demonstrate the effectiveness of the proposed approach.
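The abstract does not spell out the transfer risk measure, so the following is only a loose sketch of the weighting idea, comparing feature histograms from the source and target environments with a Hellinger distance; all names and distributions are assumptions, not the paper's method:

```python
# Score how similar previously collected visual data is to the new
# environment; a low score suggests the source data is safe to reuse.
import numpy as np

def transfer_risk(source_feats, target_feats, bins=20):
    """Lower risk = source data more reusable in the new environment."""
    lo = min(source_feats.min(), target_feats.min())
    hi = max(source_feats.max(), target_feats.max())
    p, _ = np.histogram(source_feats, bins=bins, range=(lo, hi))
    q, _ = np.histogram(target_feats, bins=bins, range=(lo, hi))
    p = p / p.sum()
    q = q / q.sum()
    return np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2))  # Hellinger

rng = np.random.default_rng(4)
office = rng.normal(0.0, 1.0, 1000)      # descriptor values from a source robot
corridor = rng.normal(0.2, 1.1, 1000)    # descriptor values from the new place
print(transfer_risk(office, corridor))   # low value -> reuse the source data
```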

15.
Content delivery networks (CDNs) are an important part of the Internet's infrastructure. Current methods for identifying CDN domain names rely mainly on character features of the domain names, HTTP keywords, and DNS records, and their identification coverage is limited. To address the problem of large-scale identification of CDN domain names, a CDN domain name identification technique based on a knowledge graph of the domain name system is proposed. Based on the characteristics of the domain name system, ontology modeling, data acquisition, and knowledge graph construction …

16.
Many problems in machine learning and computer vision consist of predicting multi-dimensional output vectors given a specific set of input features. In many of these problems there exist inherent temporal and spatial dependencies between the output vectors, as well as repeating output patterns and input–output associations, that can provide more robust and accurate predictors when modeled properly. With this intrinsic motivation, we propose a novel Output-Associative Relevance Vector Machine (OA-RVM) regression framework that augments traditional RVM regression by learning non-linear input and output dependencies. Instead of depending solely on the input patterns, OA-RVM models output covariances within a predefined temporal window, thus capturing past, current, and future context. As a result, output patterns manifested in the training data are captured within a formal probabilistic framework and subsequently used during inference. As a proof of concept, we target the highly challenging problem of dimensional and continuous prediction of emotions, and evaluate the proposed framework on multiple nonverbal cues, namely facial expressions, shoulder movements, and audio cues. We demonstrate the advantages of the proposed OA-RVM regression by performing subject-independent evaluation using the SAL database, which consists of naturalistic conversational interactions. The experimental results show that OA-RVM regression outperforms the traditional RVM and SVM regression approaches in terms of the accuracy of the prediction (evaluated using the root mean squared error) and the structure of the prediction (evaluated using the correlation coefficient), generating more accurate and robust prediction models.
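A hedged sketch of the output-associative idea, with ridge regression standing in for the RVM (which scikit-learn does not provide): a first stage predicts the output per frame, and a second stage re-predicts each frame from a temporal window of first-stage outputs covering past, current, and future context:

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(2)
T = 400
y = np.sin(np.linspace(0, 12, T))              # a smooth, continuous affect trace
X = y[:, None] + rng.normal(0, 0.4, (T, 3))    # noisy multi-cue input features

stage1 = Ridge().fit(X, y)                     # input -> output, per frame
y1 = stage1.predict(X)

# Second stage: re-predict each frame from a 5-frame window of first-stage
# outputs, exploiting the temporal structure of the output sequence.
win = np.column_stack([np.roll(y1, s) for s in (-2, -1, 0, 1, 2)])
stage2 = Ridge().fit(win[2:-2], y[2:-2])
y2 = stage2.predict(win[2:-2])

rmse = lambda a, b: np.sqrt(np.mean((a - b) ** 2))
print(rmse(y1[2:-2], y[2:-2]), rmse(y2, y[2:-2]))  # stage 2 should be lower
print(np.corrcoef(y2, y[2:-2])[0, 1])              # structural agreement
```

The two printed metrics mirror the paper's evaluation: root mean squared error for accuracy, correlation coefficient for the structure of the prediction.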

17.
Establishing interschema semantic knowledge between corresponding elements in a cooperating OWL-based multi-information server grid environment requires deep knowledge, not only about the structure of the data represented in each server, but also about the commonly occurring differences in the intended semantics of this data. The same information can be represented in various incompatible structures and, more importantly, the same structure can be used to represent data with many diverse and incompatible semantics. In a grid environment, interschema semantic knowledge can only be detected if both the structural and semantic properties of the schemas of the cooperating servers are made explicit and formally represented in a way that a computer system can process. Unfortunately, there is very often a lack of such knowledge, and the schemas of the underlying grid information servers (ISs), being semantically weak as a consequence of the limited expressiveness of traditional data models, do not help the acquisition of this knowledge. The solution to overcome this limitation is primarily to upgrade the semantic level of the IS local schemas through a semantic enrichment process, by augmenting the local schemas of grid ISs into semantically enriched schema models, and then to use these models to detect and represent correspondences between classes belonging to different schemas. In this paper, we investigate the possibility of using OWL-based domain ontologies both for building semantically rich schema models and for expressing interschema knowledge and reasoning about it. We believe that the use of OWL/RDF in this setting has two important advantages. On the one hand, it enables a semantic approach for interschema knowledge specification, by concentrating on expressing conceptual and semantic correspondences between both the conceptual (intensional) definition and the set of instances (extension) of classes represented in different schemas. On the other hand, it is exactly this semantic nature of our approach that allows us to devise reasoning mechanisms for discovering and reusing interschema knowledge when the need arises to compare and combine it.
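A small sketch of what formally represented interschema knowledge can look like, using rdflib to assert an owl:equivalentClass correspondence between classes from two server schemas; the namespaces and class names are illustrative:

```python
# Express an interschema correspondence as an OWL axiom so it becomes
# machine-processable knowledge rather than an implicit convention.
from rdflib import Graph, Namespace
from rdflib.namespace import RDF, OWL

S1 = Namespace("http://grid.example/server1#")   # hypothetical schema URIs
S2 = Namespace("http://grid.example/server2#")

g = Graph()
g.add((S1.Client, RDF.type, OWL.Class))
g.add((S2.Customer, RDF.type, OWL.Class))
# Interschema knowledge: the two classes denote the same concept.
g.add((S1.Client, OWL.equivalentClass, S2.Customer))
print(g.serialize(format="turtle"))
```

An OWL reasoner can then propagate this axiom, e.g. classifying instances of one class as instances of the other, which is the reuse the paper aims at.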

18.
Software metrics-based quality classification models predict a software module as either fault-prone (fp) or not fault-prone (nfp). Timely application of such models can assist in directing quality improvement efforts to modules that are likely to be fp during operations, thereby cost-effectively utilizing the software quality testing and enhancement resources. Since several classification techniques are available, a comparative study of some commonly used classification techniques can be useful to practitioners. We present a comprehensive evaluation of the relative performances of seven classification techniques and/or tools: logistic regression, case-based reasoning, classification and regression trees (CART), tree-based classification with S-PLUS, and the Sprint-Sliq, C4.5, and Treedisc algorithms. The expected cost of misclassification (ECM) is introduced as a single, unified measure for comparing the performances of different software quality classification models. A function of the costs of Type I (an nfp module misclassified as fp) and Type II (a fp module misclassified as nfp) misclassifications, ECM is computed for different cost ratios. Evaluating software quality classification models in the presence of varying cost ratios is important, because the usefulness of a model depends on the system-specific costs of misclassification. Moreover, models should be compared and preferred for cost ratios that fall within the range of interest for the given system and project domain. Software metrics were collected from four successive releases of a large legacy telecommunications system. A two-way ANOVA randomized complete block design modeling approach is used, in which the system release is treated as a block and the modeling method as a factor. It is observed that the predictive performance of the models differs significantly across the system releases, implying that in the software engineering domain prediction models are influenced by the characteristics of the data and the system being modeled. Multiple pairwise comparisons are performed to evaluate the relative performances of the seven models for the cost ratios of interest to the case study. In addition, the performance of the seven classification techniques is compared with a classification based on lines of code. The comparative approach presented in this paper can also be applied to other software systems.
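The ECM measure translates directly into code; a sketch assuming C_I is normalized to 1, with hypothetical misclassification counts chosen so the preferred model flips as the cost ratio grows:

```python
def expected_cost(n_type1, n_type2, n_total, cost_ratio):
    """ECM = (C_I * n_TypeI + C_II * n_TypeII) / N, with C_I normalized
    to 1 and cost_ratio = C_II / C_I (the system-specific cost ratio)."""
    return (1.0 * n_type1 + float(cost_ratio) * n_type2) / n_total

# Two hypothetical models over 1000 modules; which one is preferred
# changes with the cost ratio, which is why models must be compared over
# the cost-ratio range relevant to the system and project domain.
for ratio in (10, 25, 50):
    ecm_a = expected_cost(n_type1=10, n_type2=20, n_total=1000, cost_ratio=ratio)
    ecm_b = expected_cost(n_type1=150, n_type2=12, n_total=1000, cost_ratio=ratio)
    print(ratio, round(ecm_a, 3), round(ecm_b, 3))
```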

19.
The analysis and mining of traffic video sequences to discover important but previously unknown knowledge, such as vehicle identification, traffic flow, queue detection, incident detection, and the spatio-temporal relations of the vehicles at intersections, provides an economical approach for daily traffic monitoring operations. To meet such demands, a multimedia data mining framework is proposed in this paper. The proposed framework analyzes traffic video sequences using background subtraction, image/video segmentation, vehicle tracking, and modeling with the multimedia augmented transition network (MATN) model and multimedia input strings, in the domain of traffic monitoring at traffic intersections. The spatio-temporal relationships of the vehicle objects in each frame are discovered and accurately captured and modeled. This additional level of sophistication in spatio-temporal tracking creates a capability for automation, which alone can significantly influence and enhance current data processing and implementation strategies for several problems in traffic operations. Three real-life traffic video sequences obtained from different sources and under different weather conditions are used to illustrate the effectiveness and robustness of the proposed framework by demonstrating how it can be applied to traffic applications to answer spatio-temporal queries.
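A sketch of the first stage of such a framework, background subtraction, using OpenCV's MOG2 as a stand-in for the paper's specific segmentation method; the video path is hypothetical:

```python
# Segment moving vehicles per frame via background subtraction; the
# resulting blobs feed tracking and MATN-style spatio-temporal modeling.
import cv2

cap = cv2.VideoCapture("intersection.avi")   # hypothetical input video
subtractor = cv2.createBackgroundSubtractorMOG2(history=500, detectShadows=True)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = subtractor.apply(frame)           # 255 = foreground, 127 = shadow
    # Connected components give per-vehicle blobs for tracking.
    n, labels = cv2.connectedComponents((mask == 255).astype("uint8"))
    print(n - 1, "candidate vehicle blobs")
cap.release()
```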

20.
This paper addresses the task of identifying nonlinear dynamic systems from measured data. The discrete-time variant of this task is commonly reformulated as a regression problem. As tree ensembles have proven to be a successful predictive modeling approach, we investigate the use of tree ensembles for solving the regression problem. While different variants of tree ensembles have been proposed and used, they are mostly limited to using regression trees as base models. We introduce ensembles of fuzzified model trees with split attribute randomization and evaluate them for nonlinear dynamic system identification. Models of dynamic systems that are built for control purposes are usually evaluated by a more stringent procedure based on the output, i.e., simulation error. Taking this into account, we perform ensemble pruning to optimize the output error of the tree ensemble models. The proposed Model-Tree Ensemble method is empirically evaluated using input–output data disturbed by noise. It is compared to representative state-of-the-art approaches on one synthetic dataset with artificially introduced noise and one real-world noisy dataset. The evaluation shows that the method is suitable for modeling dynamic systems and produces models whose output error performance is comparable to that of the other approaches. The method is also resilient to noise, as its performance does not deteriorate even when up to 20% noise is added.
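The prediction-error vs. simulation-error distinction is easy to demonstrate; in this sketch a ridge ARX model stands in for the fuzzified model-tree ensemble, and the plant is synthetic:

```python
# One-step prediction uses measured past outputs; free-run simulation
# feeds the model's own outputs back, a more stringent test.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(3)
T = 300
u = rng.uniform(-1, 1, T)
y = np.zeros(T)
for t in range(1, T):                      # a mildly nonlinear plant
    y[t] = 0.8 * y[t-1] + 0.4 * np.tanh(u[t-1]) + rng.normal(0, 0.02)

X = np.column_stack([y[:-1], u[:-1]])      # regressors: y(t-1), u(t-1)
model = Ridge().fit(X, y[1:])

y_pred = model.predict(X)                  # one-step prediction
y_sim = np.zeros(T); y_sim[0] = y[0]
for t in range(1, T):                      # free-run simulation: feed back y_sim
    y_sim[t] = model.predict([[y_sim[t-1], u[t-1]]])[0]

rmse = lambda a, b: np.sqrt(np.mean((a - b) ** 2))
print(rmse(y_pred, y[1:]), rmse(y_sim[1:], y[1:]))  # simulation error is larger
```

Because simulation compounds model errors over time, optimizing it directly, as the paper does via ensemble pruning, is a harder target than one-step prediction error.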
