首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
In recent years, generalization-based data mining techniques have become an interesting topic for many data scientists. Generalized itemset mining is an exploration technique that focuses on extracting high-level abstractions and correlations in a database. However, the problem that domain experts must always deal with is how to manage and interpret a large number of extracted patterns from a massive database of transactions. In generalized pattern mining, taxonomies that contain abstraction information for each dataset are defined, so the number of frequent patterns can grow enormously. Therefore, exploiting knowledge turns into a difficult and costly process. In this article, we introduce an approach that uses cardinality-based constraints with transaction id and numeric encoding to mine generalized patterns. We applied transaction id to support the computation of each frequent itemset as well as to encode taxonomies into a numeric type using two simple rules. We also attempted to apply the combination of cardinality cons- traints and closed or maximal patterns. Experiments show that our optimizations significantly improve the performance of the original method, and the importance of comprehensive information within closed and maximal patterns is worth considering in generalized frequent pattern mining.  相似文献   

2.
3.
In our previous work, we introduced a computational architecture that effectively supports the tasks of continuous monitoring and of aggregation querying of complex domain meaningful time-oriented concepts and patterns (temporal abstractions), in environments featuring large volumes of continuously arriving and accumulating time-oriented raw data. Examples include provision of decision support in clinical medicine, making financial decisions, detecting anomalies and potential threats in communication networks, integrating intelligence information from multiple sources, etc. In this paper, we describe the general, domain-independent but task-specific problem-solving method underling our computational architecture, which we refer to as incremental knowledge-based temporal abstraction (IKBTA). The IKBTA method incrementally computes temporal abstractions by maintaining persistence and validity of continuously computed temporal abstractions from arriving time-stamped data. We focus on the computational framework underlying our reasoning method, provide well-defined semantic and knowledge requirements for incremental inference, which utilizes a logical model of time, data, and high-level abstract concepts, and provide a detailed analysis of the computational complexity of our approach.  相似文献   

4.
Additive ensembles of admissible heuristics constitute the most general form of exploiting the individual strengths of numerous admissible heuristics in optimal planning. However, the same set of heuristics can be additively composed in infinitely many ways and the quality of the resulting heuristic estimate depends directly on the choice of the composition. Focusing on abstraction heuristics, we describe a procedure that takes a deterministic planning problem, a forward-search state, and a set of abstraction-based admissible heuristics, and derives an optimal additive composition of these heuristics with respect to the given state. Most importantly, we show that this procedure is polynomial-time for arbitrary sets of all abstraction heuristics with which we are acquainted, including explicit abstractions such as pattern databases (regular or constrained) and merge-and-shrink, and implicit abstractions such as fork-decomposition and abstractions based on tractable constraint optimization over tree-shaped constraint networks.  相似文献   

5.
In this paper we investigate how standard model checkers can be applied to checking refinement relationships between Z specifications. The major obstacle to such a use are the (potentially) infinite data domains in specifications. Consequently, we examine the application of data abstraction techniques for reducing the infinite to a finite state space. Since data abstractions do, however, decrease the amount of information in a specification, refinement can—in general—not be proven on the abstractions anymore, it can only be disproved. The model checker can thus be used to generate counter examples to a refinement relationship. Here, we show how abstract specifications can be systematically constructed (from a given data abstraction) and how a standard model checker (FDR) can be applied to find counter examples in case when refinement is absent. We especially discuss the applicability of the construction method: it constructs abstract specifications which are either upward or downward simulations of the original specifications, and depending on the operations in the specification and the data abstraction chosen, such a construction might succeed or fail. The construction abstracts both the input/output as well as the state.  相似文献   

6.
Internet users may suffer the empty or too little answer problem when they post a strict query to the Web database. To address this problem, we develop a general framework to enable automatically query relaxation and top-k result ranking. Our framework consists of two processing steps. The first step is query relaxation. Based on the user original query, we speculate how much the user cares about each specified attribute by measuring its specified value distribution in the database. The rare distribution of the specified value of the attribute indicates the attribute may important for the user. According to the attribute importance, the original query is then rewritten as a relaxed query by expanding each query criterion range. The relaxed degree on each specified attribute is varied with the attribute weight adaptively. The most important attribute is relaxed with the minimum degree so that the answer returned by the relaxed query can be most relevant to the user original intention. The second step is top-k result ranking. In this step, we first generate user contextual preferences from query history and then use them to create a priori orders of tuples during the off-line pre-processing. Only a few representative orders are saved, each corresponding to a set of contexts. Then, these orders and associated contexts are used at querying time to expeditiously provide top-k relevant answers by using the top-k evaluation algorithm. Results of a preliminary user study demonstrate our query relaxation, and top-k result ranking methods can capture the users preferences effectively. The efficiency and effectiveness of our approach is also demonstrated.  相似文献   

7.
In this article we investigate an attribute-oriented induction approach for acquisition of abstract knowledge from data stored in a fuzzy database environment. We utilize a proximity-based fuzzy database schema as the medium carrying the original information, where lack of precise information about an entity can be reflected via multiple attribute values, and the classical equivalence relation is replaced with the broader fuzzy proximity relation. We analyze in detail the process of attribute-oriented induction by concept hierarchies, utilizing the original properties of fuzzy databases to support this established data mining technique. In our approach we take full advantage of the implicit knowledge about the similarity of original attribute values, included by default in the investigated fuzzy database schemas. © 2007 Wiley Periodicals, Inc. Int J Int Syst 22: 763–779, 2007.  相似文献   

8.
Directed model checking is a well-established approach for detecting error states in concurrent systems. A popular variant to find shortest error traces is to apply the A\(^*\) search algorithm with distance heuristics that never overestimate the real error distance. An important class of such distance heuristics is the class of pattern database heuristics. Pattern database heuristics are built on abstractions of the system under consideration. In this paper, we propose downward pattern refinement, a systematic approach for the construction of pattern database heuristics for concurrent systems of timed automata. First, we propose a general framework for pattern databases in the context of timed automata and show that desirable theoretical properties hold for the resulting pattern database. Afterward, we formally define a concept to measure the accuracy of abstractions. Based on this concept, we propose an algorithm for computing succinct abstractions that are still accurate to produce informed pattern databases. We evaluate our approach on large and complex industrial problems. The experiments show the practical potential of the resulting pattern database heuristic.  相似文献   

9.
Predicate abstraction and counterexample-guided abstraction refinement (CEGAR) have enabled finite-state model checking of software written in mainstream programming languages. This combination of techniques has been successful in analysing system-level sequential C code. In contrast, there is little evidence of fruitful applications of CEGAR to shared-variable concurrent software. We attribute this gap to the lack of abstraction strategies that permit a scalable analysis of the resulting multi-threaded Boolean programs. The goal of this paper is to close this gap. We have developed a symmetry-aware CEGAR technique: it takes into account the replicated structure of programs that consist of many threads executing the same procedure, and generates a Boolean program template whose multi-threaded execution soundly overapproximates the original concurrent program. State explosion during model checking parallel instantiations of this template can now be absorbed by exploiting symmetry. We have implemented our method in a tool, SymmPa, and demonstrate its superior performance over alternative approaches on a range of synchronisation programs.  相似文献   

10.
Efficient attribute reduction in large, incomplete decision systems is a challenging problem; existing approaches have time complexities no less than O(∣C2U2). This paper derives some important properties of incomplete information systems, then constructs a positive region-based algorithm to solve the attribute reduction problem with a time complexity no more than O(∣C2U∣log∣U∣). Furthermore, our approach does not change the size of the original incomplete system. Numerical experiments show that the proposed approach is indeed efficient, and therefore of practical value to many real-world problems. The proposed algorithm can be applied to both consistent and inconsistent incomplete decision systems.  相似文献   

11.
A large number of design decisions are made during the conceptual design of a part. However, there are few representation and reasoning tools for decision support during conceptual design. The conceptual design stage is characterized by a lack of complete geometric information. Existing geometric modelers require complete geometric information, while a functional reasoning methodology using a <verb, noun > representation is typically too terse. In this paper, we present a new representation called sketching abstraction for conceptual design, using the function-form relations in a design. The functionally critical part of the geometry is presented using a set of functional features, while the rest of the geometry is abstracted as a set of linkages. Part functionality is correlated with the sketching abstraction using data structures called function-form matrices. The sketching abstraction is annotated using a set of primitives, and a set of grammar rules are used to extract canonical relationships between the functional features. The sketching abstraction can be used for extracting designs that are geometrically dissimilar but functionally similar, thus providing the designer with ideas for design alternatives. The sketching abstraction can also be used to carry out domain-dependent manufacturability evaluation checks. A further use of sketching abstractions is to initiate the development of a process plan for manufacturing. Sketching abstractions are related to the solid model of a part. Thus, this representation provides a link between pure functional and pure geometric representations. The domain of application is stamped metal parts. We present the part functionality and the features used in this domain. We also illustrate the use of sketching abstractions for conceptual design, manufacturability evaluation and preliminary process planning.  相似文献   

12.
Given a 3-valued abstraction of a program (possibly generated using static program analysis and predicate abstraction) and a temporal logic formula, generalized model checking (GMC) checks whether there exists a concretization of that abstraction that satisfies the formula. In this paper, we revisit generalized model checking for linear time (LTL) properties. First, we show that LTL GMC is 2EXPTIME-complete in the size of the formula and polynomial in the model, where the degree of the polynomial depends on the formula, instead of EXPTIME-complete and quadratic as previously believed. The standard definition of GMC depends on a definition of concretization which is tailored for branching-time model checking. We then study a simpler linear completeness preorder for relating program abstractions. We show that LTL GMC with this weaker preorder is only EXPSPACE-complete in the size of the formula, and can be solved in linear time and logarithmic space in the size of the model. Finally, we identify classes of formulas for which the model complexity of standard GMC is reduced.  相似文献   

13.
Three prevalent abstractions in temporal information are examined by using the machinery of first order logic. The abstraction of time allows one to concentrate on temporal objects only as they relate to other temporal objects in time. It is represented by a functional relationship between temporal objects and time intervals. The abstraction of identity allows one to concentrate on how an observed phenomenon relates to other phenomena in terms of their being manifestations of the same object. It is represented by a functional relationship between temporal phenomena and “completed” temporal objects. The abstraction of circumstance embodies a focus of attention on particular configurations or states of groups of temporal phenomena. It is represented by functional relationships between thesis groups and other objects called “events” or “states”.A novel concept, called absolute/relative abstraction, is used to formalize the abstractions of time and identity. The abstraction of circumstance, on the other hand, is an example of aggregation. The significance and use of thesis abstractions in the representation and processing of historical information is discussed.  相似文献   

14.
We are considering knowledge discovery from data describing a piece of real or abstract world. The patterns being induced put in evidence some laws hidden in the data. The most natural representation of patterns-laws is by “if..., then...” decision rules relating some conditions with some decisions. The same representation of patterns is used in multi-attribute classification, thus the data searched for discovery of these patterns can be seen as classification data. We adopt the classification perspective to present an original methodology of inducing general laws from data and representing them by so-called monotonic decision rules. Monotonicity concerns relationships between values of condition and decision attributes, e.g. the greater the mass (condition attribute), the greater the gravity (decision attribute), which is a specific feature of decision rules discovered from data using the Dominance-based Rough Set Approach (DRSA). While in DRSA one has to suppose a priori the presence or absence of positive or negative monotonicity relationships which hold in the whole evaluation space, in this paper, we show that DRSA can be adapted to discover rules from any kind of input classification data, exhibiting monotonicity relationships which are unknown a priori and hold in some parts of the evaluation space only. This requires a proper non-invasive transformation of the classification data, permitting representation of both positive and negative monotonicity relationships that are to be discovered by the proposed methodology. Reported results of a computational experiment confirm that the proposed methodology leads to decision rules whose predictive ability is similar to the best classification predictors. It has, however, a unique advantage over all competitors because the monotonic decision rules can be read as laws characterizing the analyzed phenomena in terms of easily understandable “if..., then...” decision rules, while other predictor models have no such straightforward interpretation.  相似文献   

15.
There has been considerable progress in the domain of software verification over the last few years. This advancement has been driven, to a large extent, by the emergence of powerful yet automated abstraction techniques such as predicate abstraction. However, the state-space explosion problem in model checking remains the chief obstacle to the practical verification of real-world distributed systems. Even in the case of purely sequential programs, a crucial requirement to make predicate abstraction effective is to use as few predicates as possible. This is because, in the worst case, the state-space of the abstraction generated (and consequently the time and memory complexity of the abstraction process) is exponential in the number of predicates involved. In addition, for concurrent programs, the number of reachable states could grow exponentially with the number of components. We attempt to address these issues in the context of verifying concurrent (message-passing) C programs against safety specifications. More specifically, we present a fully automated compositional framework which combines two orthogonal abstraction techniques (predicate abstraction for data and action-guided abstraction for events) within a counterexample-guided abstraction refinement scheme. In this way, our algorithm incrementally increases the granularity of the abstractions until the specification is either established or refuted. Additionally, a key feature of our approach is that if a property can be proved to hold or not hold based on a given finite set of predicates $\mathcal{P}$ , the predicate refinement procedure we propose in this article finds automatically a minimal subset of $\mathcal{P}$ that is sufficient for the proof. This, along with our explicit use of compositionality, delays the onset of state-space explosion for as long as possible. We describe our approach in detail, and report on some very encouraging experimental results obtained with our tool MAGIC.  相似文献   

16.

Learning from patient records may aid medical knowledge acquisition and decision making. Decision tree induction, based on ID3, is a well-known approach of learning from examples. In this article we introduce a new data representation formalism that extends the original ID3 algorithm. We propose a new algorithm, ID+, which adopts this representation scheme. ID+ provides the capability of modeling dependencies between attributes or attribute values and of handling multiple values per attribute. We demonstrate our work via a series of medical knowledge acquisition experiments that are based on a ''real-world'' application of acute abdominal pain in children. In the context of these experiments, we compare ID+ with C4.5, NewId, and a Naive Bayesian classifier. Results demonstrate that the rules acquired via ID+ improve decision tree clinical comprehensibility and complement explanations supported by the Naive Bayesian classifier, while in terms of classification, accuracy decrease is marginal.  相似文献   

17.
In this paper we present an approach for supporting the semi-automated architectural abstraction of architectural models throughout the software life-cycle. It addresses the problem that the design and implementation of a software system often drift apart as software systems evolve, leading to architectural knowledge evaporation. Our approach provides concepts and tool support for the semi-automatic abstraction of architecture component and connector views from implemented systems and keeping the abstracted architecture models up-to-date during software evolution. In particular, we propose architecture abstraction concepts that are supported through a domain-specific language (DSL). Our main focus is on providing architectural abstraction specifications in the DSL that only need to be changed, if the architecture changes, but can tolerate non-architectural changes in the underlying source code. Once the software architect has defined an architectural abstraction in the DSL, we can automatically generate architectural component views from the source code using model-driven development (MDD) techniques and check whether architectural design constraints are fulfilled by these models. Our approach supports the automatic generation of traceability links between source code elements and architectural abstractions using MDD techniques to enable software architects to easily link between components and the source code elements that realize them. It enables software architects to compare different versions of the generated architectural component view with each other. We evaluate our research results by studying the evolution of architectural abstractions in different consecutive versions of five open source systems and by analyzing the performance of our approach in these cases.  相似文献   

18.
Temporal abstraction is the task of abstracting higher‐level concepts from time‐stamped data in a context‐sensitive manner. We have developed and implemented a formal knowledge‐based framework for decomposing and solving that task that supports acquisition, maintenance, reuse, and sharing of temporal‐abstraction knowledge. We present the logical model underlying the representation and runtime formation of interpretation contexts. Interpretation contexts are relevant for abstraction of time‐oriented data and are induced by input data, concluded abstractions, external events, goals of the temporal‐abstraction process, and certain combinations of interpretation contexts. Knowledge about interpretation contexts is represented as a context ontology and as a dynamic induction relation over interpretation contexts and other proposition types. Induced interpretation contexts are either basic, composite, generalized, or nonconvex. We provide two examples of applying our model using an implemented system; one in the domain of clinical medicine (monitoring of diabetes patients) and one in the domain of traffic engineering (evaluation of traffic‐control actions). We discuss several distinct advantages to the explicit separation of interpretation‐context propositions from the propositions inducing them and from the abstractions created within them. This revised version was published online in June 2006 with corrections to the Cover Date.  相似文献   

19.
We present a theoretical framework and a case study for reusing the same conceptual and computational methodology for both temporal abstraction and linear (unidimensional) space abstraction, in a domain (evaluation of traffic-control actions) significantly different from the one (clinical medicine) in which the method was originally used. The method, known asknowledge-based temporal abstraction, abstracts high-level concepts and patterns from time-stamped raw data using a formal theory of domain-specific temporal-abstraction knowledge. We applied this method, originally used to interpret time-oriented clinical data, to the domain of traffic control, in which the monitoring task requires linear pattern matching along both space and time. First we reused the method for creation of unidimensional spatial abstractions over highways, given sensor measurements along each highway measured at the same time point. Second, we reused the method to create temporal abstractions of the traffic behaviour, for the same space segments, but during consecutive time points. We defined the corresponding temporal-abstraction and spatial-abstraction domain-specific knowledge. Our results suggest that (1) the knowledge-base temporal-abstraction method is reusable over time and unidimensional space as well as over significantly different domains; (2) the method can be generalised into a knowledge-based linear-abstraction method, which solves tasks requiring abstraction of data along any linear distance measure; and (3) a spatiotemporal-abstraction method can be assembled, from two copies of the generalised method and a spatial-decomposition mechanism, and is applicable to tasks requiring abstraction of time-oriented data into meaningful spatiotemporal patterns over a linear, decomposable space, such as traffic over a set of highways.  相似文献   

20.
This paper studies compositional reasoning theories for stochastic systems. A specification theory combines notions of specification and implementation with satisfaction and refinement relations, and a set of operators that together support stepwise design. One of the first behavioral specification theories introduced for stochastic systems is the one of Interval Markov Chains (IMCs), which are Markov Chains whose probability distributions are replaced by a conjunction of intervals. In this paper, we show that IMCs are not closed under conjunction, which gives a formal proof of a conjecture made in several recent works.In order to leverage this problem, we suggested to work with Constraint Markov Chains (CMCs) that is another specification theory where intervals are replaced with general constraints. Contrary to IMCs, one can show that CMCs enjoy the closure properties of a specification theory. In addition, we propose aggressive abstraction procedures for CMCs. Such abstractions can be used either to combat the state-space explosion problem, or to simplify complex constraints. In particular, one can show that, under some assumptions, the behavior of any CMC can be abstracted by an IMC.Finally, we propose an algorithm for counter-example generation, in case a refinement of two CMCs does not hold. We present a tool that implements our results. Implementing CMCs is a complex process and relies on recent advances made in decision procedures for theory of reals.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号