首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到10条相似文献,搜索用时 125 毫秒
1.
Pushing Convertible Constraints in Frequent Itemset Mining   总被引:1,自引:0,他引:1  
Recent work has highlighted the importance of the constraint-based mining paradigm in the context of frequent itemsets, associations, correlations, sequential patterns, and many other interesting patterns in large databases. Constraint pushing techniques have been developed for mining frequent patterns and associations with antimonotonic, monotonic, and succinct constraints. In this paper, we study constraints which cannot be handled with existing theory and techniques in frequent pattern mining. For example, avg(S)v, median(S)v, sum(S)v (S can contain items of arbitrary values, {<, <, , } and v is a real number.) are customarily regarded as tough constraints in that they cannot be pushed inside an algorithm such as Apriori. We develop a notion of convertible constraints and systematically analyze, classify, and characterize this class. We also develop techniques which enable them to be readily pushed deep inside the recently developed FP-growth algorithm for frequent itemset mining. Results from our detailed experiments show the effectiveness of the techniques developed.  相似文献   

2.
We propose a method for constructing regression trees with range and region splitting. We present an efficient algorithm for computing the optimal two-dimensional region that minimizes the mean squared error of an objective numeric attribute in a given database. As two-dimensional regions, we consider a class R of grid-regions, such as x-monotone, rectilinear-convex, and rectangular, in the plane associated with two numeric attributes. We compute the optimal region R. We propose to use a test that splits data into those that lie inside the region R and those that lie outside the region in the construction of regression trees. Experiments confirm that the use of region splitting gives compact and accurate regression trees in many domains.  相似文献   

3.
Our starting point is a definition of conditional event EH which differs from many seemingly similar ones adopted in the relevant literature since 1935, starting with de Finetti. In fact, if we do not assign the same third value u (undetermined) to all conditional events, but make it depend on EH, it turns out that this function t(EH) can be taken as a general conditional uncertainty measure, and we get (through a suitable – in a sense, compulsory – choice of the relevant operations among conditional events) the natural axioms for many different (besides probability) conditional measures.  相似文献   

4.
We consider the parallel time complexity of logic programs without function symbols, called logical query programs, or Datalog programs. We give a PRAM algorithm for computing the minimum model of a logical query program, and show that for programs with the polynomial fringe property, this algorithm runs in time that is logarithmic in the input size, assuming that concurrent writes are allowed if they are consistent. As a result, the linear and piecewise linear classes of logic programs are inN C. Then we examine several nonlinear classes in which the program has a single recursive rule that is an elementary chain. We show that certain nonlinear programs are related to GSM mappings of a balanced parentheses language, and that this relationship implies the polynomial fringe property; hence such programs are inN C Finally, we describe an approach for demonstrating that certain logical query programs are log space complete forP, and apply it to both elementary single rule programs and nonelementary programs.Supported by NSF Grant IST-84-12791, a grant of IBM Corporation, and ONR contract N00014-85-C-0731.  相似文献   

5.
Exploratory data mining and analysis requires a computing environment which provides facilities for the user-friendly expression and rapid execution of scientific queries. In this paper, we address research issues in the parallelization of scientific queries containing complex user-defined operations. In a parallel query execution environment, parallelizing a query execution plan involves determining how input data streams to evaluators implementing logical operations can be divided to be processed by clones of the same evaluator in parallel. We introduced the concept of relevance window that characterizes data lineage and data partitioning opportunities available for an user-defined evaluator. In addition, we developed a query parallelization framework by extending relational parallel query optimization algorithms to allow the parallelization characteristics of user-defined evaluators to guide the process of query parallelization in an extensible query processing environment. We demonstrated the utility of our system by performing experiments mining cyclonic activity, blocking events, and the upward wave-energy propagation features from several observational and model simulation datasets.  相似文献   

6.
We construct a nearest-neighbor interaction whose ground states encode the solutions to the NP-complete problem independent set for cubic planar graphs. The important difference to previously used Hamiltonians in adiabatic quantum computing is that our Hamiltonian is spatially local. Due to its special structure our Hamiltonian can be easily simulated by Ising interactions between adjacent particles on a 2D rectangular lattice. We describe the required pulse sequences. Our methods could help to implement adiabatic quantum computing by physically reasonable Hamiltonians like short-range interactions. Therefore, this universal resource Hamiltonian can be used for different graphs by applying suitable control operations. This is in contrast to a previous proposal where the Hamiltonians have to be wired in hardware for each graph. PACS: 03.67.Lx  相似文献   

7.
A well-known problem in default logic is the ability of naive reasoners to explain bothg and ¬g from a set of observations. This problem is treated in at least two different ways within that camp.One approach is examination of the various explanations and choosing among them on the basis of various explanation comparators. A typical comparator is choosing the explanation that depends on the most specific observation, similar to the notion of narrowest reference class.Others examine default extensions of the observations and choose whatever is true in any extension, or what is true in all extensions or what is true in preferred extensions. Default extensions are sometimes thought of as acceptable models of the world that are discarded as more knowledge becomes available.We argue that the notions of specificity and extension lack clear semantics. Furthermore, we show that the problems these ideas were supposed to solve can be handled easily within a probabilistic framework.  相似文献   

8.
ART: A Hybrid Classification Model   总被引:1,自引:0,他引:1  
This paper presents a new family of decision list induction algorithms based on ideas from the association rule mining context. ART, which stands for Association Rule Tree, builds decision lists that can be viewed as degenerate, polythetic decision trees. Our method is a generalized Separate and Conquer algorithm suitable for Data Mining applications because it makes use of efficient and scalable association rule mining techniques.  相似文献   

9.
Summary For a family of languages , CAL() is defined as the family of images of under nondeterministic two-way finite state transducers, while FINITE · VISIT() is the closure of under deterministic two-way finite state transducers; CAL0()= and for n0, CAL n+1()=CAL n (CAL()). For any semiAFL , if FINITE · VISIT() CAL(), then CAL n () forms a proper hierarchy and for every n0, FINITE · VISIT(CALn()) CAL n+1() FINITE · VISIT(CAL n+1()). If is a SLIP semiAFL or a weakly k-iterative full semiAFL or a semiAFL contained in any full bounded AFL, then FINITE · VISIT() CAL() and in the last two cases, FINITE · VISIT(). If is a substitution closed full principal semiAFL and FINITE · VISIT(), then FINITE · VISIT() CAL(). If is a substitution closed full principal semiAFL generated by a language without an infinite regular set and 1 is a full semiAFL, then is contained in CALm(1) if and only if it is contained in 1. Among the applications of these results are the following. For the following families , CAL n () forms a proper hierarchy: =INDEXED, =ETOL, and any semiAFL contained in CF. The family CF is incomparable with CAL m (NESA) where NESA is the family of one-way nonerasing stack languages and INDEXED is incomparable with CAL m (STACK) where STACK is the family of one-way stack languages.This work was supported in part by the National Science Foundation under Grants No. DCR74-15091 and MCS-78-04725  相似文献   

10.
The mobile communication revolution has led to pervasive connectedness—as evidenced by the explosive growth of instant messaging in the home, and more recently, the enterprise–and, together with the convergence of mobile computing, provides a basis for extending collaborative environments toward truly ubiquitous immersion. Leveraging the true anytime/anywhere access afforded by mobile computing, it becomes possible to develop applications that not only are capable of responding to users whenever/wherever, on demand, but that also may actively seek out and engage users when the need arises. Thus, immersive environments need no longer be thought of strictly in terms of physical immersion with clearly discernable enter and exit events, but rather they may be extended, through mobile-enabled computing, toward ubiquity in terms of both time and space. Based on Media Synchronicity Theory, potential benefits are envisioned, particularly in the case of collaborative learning environments, from shortened response cycles and increased real time interaction opportunities. At the same time, a number of challenging issues must be addressed in designing such an environment to ensure user acceptance and to maximize realization of the potential. Third Generation (3G) Threaded Discussion has been conceptualized as an environment, well suited to mobile learning (m-learning) that could leverage mobile-enabled ubiquity to achieve a degree of extended immersion and thereby accrue the associated collaboration benefits. Exploring this conceptualization serves to help surface both the opportunities and the challenges associated with such environments and to identify promising design approaches, such as the use of intelligent agents.This revised version was published online in March 2005 with corrections to the cover date  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号