首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 953 毫秒
1.
Let Wn denote any bipartite graph obtained by adding some edges to the n-dimensional hypercube Qn, and let S and T be any two sets of k vertices in different partite sets of Wn. In this paper, we show that the graph Wn has k vertex-disjoint (S,T)-paths containing all vertices of Wn if and only if k=2n−1 or the graph Wn−(ST) has a perfect matching. Moreover, if the graph Wn−(ST) has a perfect matching M, then the graph Wn has k vertex-disjoint (S,T)-paths containing all vertices of Wn and all edges in M. And some corollaries are given.  相似文献   

2.
Mining spatial association rules in image databases   总被引:2,自引:0,他引:2  
In this paper, we propose a novel spatial mining algorithm, called 9DLT-Miner, to mine the spatial association rules from an image database, where every image is represented by the 9DLT representation. The proposed method consists of two phases. First, we find all frequent patterns of length one. Next, we use frequent k-patterns (k ? 1) to generate all candidate (k + 1)-patterns. For each candidate pattern generated, we scan the database to count the pattern’s support and check if it is frequent. The steps in the second phase are repeated until no more frequent patterns can be found. Since our proposed algorithm prunes most of impossible candidates, it is more efficient than the Apriori algorithm. The experiment results show that 9DLT-Miner runs 2-5 times faster than the Apriori algorithm.  相似文献   

3.
Given a set W of k sequences and a tree T with k leaves labeled with a unique sequence in W, a label tree is used to assign a sequence label to each internal node of the tree T. The cost of a tree edge is defined as the distance, such as the Hamming distance or the Levenshtein (edit) distance, between the sequence labels of a pair of nodes in the edge. The bottleneck edge of a label tree is the edge with the maximum cost in the label tree. The bottleneck tree alignment problem is concerned with the determination of a label tree with minimum cost of the bottleneck edge. A lifted label tree is a label tree in which the sequence label of each internal node is equal to the sequence label of some child of the node. The bottleneck lifted tree alignment problem involves the minimization of cost of the bottleneck edge among all possible lifted label trees of the tree T. This paper shows that when the distance function is a metric, the bottleneck tree alignment problem is NP-complete even when the tree structure resembles a binary tree. For special cases, an exact algorithm is used to solve the bottleneck lifted tree alignment problem in polynomial time. A 2(h-1)-approximation algorithm is used to solve the bottleneck tree alignment problem, where h is the height of the tree T. It is observed that the bound is existentially tight. Finally, this paper shows that any lifted label tree is an optimal solution to the bottleneck tree alignment problem if the distance function is an ultrametric.  相似文献   

4.
In this paper we study a dynamic version of capacity maximization in the physical model of wireless communication. In our model, requests for connections between pairs of points in Euclidean space of constant dimension d arrive iteratively over time. When a new request arrives, an online algorithm needs to decide whether or not to accept the request and to assign one out of k channels and a transmission power to the request. Accepted requests must satisfy constraints on the signal-to-interference-plus-noise (SINR) ratio. The objective is to maximize the number of accepted requests. Using competitive analysis we study algorithms using distance-based power assignments, for which the power of a request relies only on the distance between the points. Such assignments are inherently local and particularly useful in distributed settings. We first focus on the case of a single channel. For request sets with spatial lengths in [1,Δ] and duration in [1,Γ] we derive a lower bound of Ω(Γ d/2) on the competitive ratio of any deterministic online algorithm using a distance-based power assignment. Our main result is a near-optimal deterministic algorithm that is O(Γ(d/2)+ε )-competitive, for any constant ε>0. Our algorithm for a single channel can be generalized to k channels. It can be adjusted to yield a competitive ratio of O(k?Γ 1/k(d/2k″)+ε ) for any factorization (k′,k″) such that k′?k″=k. This illustrates the effectiveness of multiple channels when dealing with unknown request sequences. In particular, for Θ(log?Γ?log?Δ) channels this yields an O(log?Γ?log?Δ)-competitive algorithm. Additionally, we show how this approach can be turned into a randomized algorithm, which is O(log?Γ?log?Δ)-competitive even for a single channel.  相似文献   

5.
A problem open for many years is whether there is an FPT algorithm that given a graph G and parameter k, either: (1) determines that G has no k-Dominating Set, or (2) produces a dominating set of size at most g(k), where g(k) is some fixed function of k. Such an outcome is termed an FPT approximation algorithm. We describe some results that begin to provide some answers. We show that there is no such FPT algorithm for g(k) of the form k+c (where c is a fixed constant, termed an additive FPT approximation), unless FPT=W[2]. We answer the analogous problem completely for the related Independent Dominating Set (IDS) problem, showing that IDS does not admit an FPT approximation algorithm, for any g(k), unless FPT=W[2].  相似文献   

6.
The leading partitional clustering technique, k-modes, is one of the most computationally efficient clustering methods for categorical data. However, the performance of the k-modes clustering algorithm which converges to numerous local minima strongly depends on initial cluster centers. Currently, most methods of initialization cluster centers are mainly for numerical data. Due to lack of geometry for the categorical data, these methods used in cluster centers initialization for numerical data are not applicable to categorical data. This paper proposes a novel initialization method for categorical data which is implemented to the k-modes algorithm. The method integrates the distance and the density together to select initial cluster centers and overcomes shortcomings of the existing initialization methods for categorical data. Experimental results illustrate the proposed initialization method is effective and can be applied to large data sets for its linear time complexity with respect to the number of data objects.  相似文献   

7.
LetW itk(n) be the minimax complexity of selecting thek largest elements ofn numbersx 1,x 2,...,x n by pairwise comparisonsx i :x j . It is well known thatW 2(n) =n?2+ [lgn], andW k (n) = n + (k?1)lg n +O(1) for all fixed k ≥ 3. In this paper we studyW k (n), the minimax complexity of selecting thek largest, when tests of the form “Isx i the median of {x i ,x j ,x t }?” are also allowed. It is proved thatW2(n) =n?2+ [lgn], andW k (n) =n + (k?1)lg2 n +O(1) for all fixedk≥3.  相似文献   

8.
An optimal probabilistic-planning algorithm solves a problem, usually modeled by a Markov decision process, by finding an optimal policy. In this paper, we study the k best policies problem. The problem is to find the k best policies of a discrete Markov decision process. The k best policies, k?>?1, cannot be found directly using dynamic programming. Naïvely, finding the k-th best policy can be Turing reduced to the optimal planning problem, but the number of problems queried in the naïve algorithm is exponential in k. We show empirically that solving k best policies problem by using this reduction requires unreasonable amounts of time even when k?=?3. We then provide two new algorithms. The first is a complete algorithm, based on our theoretical contribution that the k-th best policy differs from the i-th policy, for some i?k, on exactly one state. The second is an approximate algorithm that skips many less useful policies. We show that both algorithms have good scalability. We also show that the approximate algorithms runs much faster and finds interesting, high-quality policies.  相似文献   

9.
A k-core Ck of a tree T is subtree with exactly k leaves for k?nl, where nl the number of leaves in T, and minimizes the sum of the distances of all nodes from Ck. In this paper first we propose a distributed algorithm for constructing a rooted spanning tree of a dynamic graph such that root of the tree is located near the center of the graph. Then we provide a distributed algorithm for finding k-core of that spanning tree. The spanning tree is constructed in two stages. In the first stage, a forest of trees is generated. In the next stage these trees are connected to form a single rooted tree. An interesting aspect of the first stage of proposed spanning algorithm is that it implicitly constructs the (convex) hull of those nodes which are not already included in the spanning forest. The process is repeated till all non root nodes of the graph have chosen a unique parent. We implemented the algorithms for finding spanning tree and its k-core. A core can be quite useful for routing messages in a dynamic network consisting of a set of mobile devices.  相似文献   

10.
Let P1,…,Pk be a collection of disjoint point sets in R2 in general position. We prove that for each 1?i?k we can find a plane spanning tree Ti of Pi such that the edges of T1,…,Tk intersect at most , where n is the number of points in P1∪?∪Pk. If the intersection of the convex hulls of P1,…,Pk is nonempty, we can find k spanning cycles such that their edges intersect at most (k−1)n times, this bound is tight. We also prove that if P and Q are disjoint point sets in general position, then the minimum weight spanning trees of P and Q intersect at most 8n times, where |PQ|=n (the weight of an edge is its length).  相似文献   

11.
Mining regional co-location patterns with kNNG   总被引:2,自引:0,他引:2  
Spatial co-location pattern mining discovers the subsets of features of which the events are frequently located together in geographic space. The current research on this topic adopts a distance threshold that has limitations in spatial data sets with various magnitudes of neighborhood distances, especially for mining of regional co-location patterns. In this paper, we propose a hierarchical co-location mining framework accounting for both variety of neighborhood distances and spatial heterogeneity. By adopting k-nearest neighbor graph (kNNG) instead of distance threshold, we propose “distance variation coefficient” as a new measure to drive the mining operations and determine an individual neighborhood relationship graph for each region. The proposed mining algorithm outputs a set of regions with each of them an individual set of regional co-location patterns. The experimental results on both synthetic and real world data sets show that our framework is effective to discover these regional co-location patterns.  相似文献   

12.
This paper presents an efficient algorithm, called pattern reduction (PR), for reducing the computation time of k-means and k-means-based clustering algorithms. The proposed algorithm works by compressing and removing at each iteration patterns that are unlikely to change their membership thereafter. Not only is the proposed algorithm simple and easy to implement, but it can also be applied to many other iterative clustering algorithms such as kernel-based and population-based clustering algorithms. Our experiments—from 2 to 1000 dimensions and 150 to 10,000,000 patterns—indicate that with a small loss of quality, the proposed algorithm can significantly reduce the computation time of all state-of-the-art clustering algorithms evaluated in this paper, especially for large and high-dimensional data sets.  相似文献   

13.
In this paper, we propose a novel algorithm, called 9DSPA-Miner, to mine frequent patterns from an image database, where every image is represented by the 9D-SPA representation. Our proposed method consists of three phases. First, we scan the database once and create an index structure. Next, the index structure is scanned to find all frequent patterns of length two. Finally, we use the frequent k-patterns (k ? 2) to generate candidate (k + 1)-patterns and check if the support of each candidate generated is not less than the user-specified minimum support threshold by using the index structure. Then, the steps in the third phase are repeated until no more frequent patterns can be found. Since the 9DSPA-Miner algorithm uses the characteristics of the 9D-SPA representation to prune most of impossible candidates, the experiment results demonstrate that it is more efficient and scalable than the modified Apriori method.  相似文献   

14.
We give an algorithm to compute the subset partial order (called the subset graph) for a family F of sets containing k sets with N elements in total and domain size n. Our algorithm requires O(nk2/logk) time and space on a Pointer Machine. When F is dense, i.e. N=Θ(nk), the algorithm requires O(N2/log2N) time and space. We give a construction for a dense family whose subset graph is of size Θ(N2/log2N), indicating the optimality of our algorithm for dense families. The subset graph can be dynamically maintained when F undergoes set insertions and deletions in O(nk/logk) time per update (that is sub-linear in N for the case of dense families). If we assume words of b?k bits, allow bits to be packed in words, and use bitwise operations, the above running time and space requirements can be reduced by a factor of blog(k/b+1)/logk and b2log(k/b+1)/logk respectively.  相似文献   

15.
杨皓  段磊  胡斌  邓松  王文韬  秦攀 《软件学报》2015,26(11):2994-3009
对比序列模式能够表达序列数据集合间的差异,在商品推荐、用户行为分析和电力供应预测等领域有广泛的应用.已有的对比序列模式挖掘算法需要用户设定正例支持度阈值和负例支持度阈值.在不具备足够先验知识的情况下,用户难以设定恰当的支持度阈值,从而可能错失一些对比显著的模式.为此,提出了带间隔约束的top-k对比序列模式挖掘算法kDSP-Miner(top-k distinguishing sequential patterns with gap constraint miner).kDSP-Miner中用户只需设置期望发现的对比最显著的模式个数,从而避免了直接设置对比支持度阈值.相应地,挖掘算法更容易使用,并且结果更易于解释.同时,为了提高算法执行效率,设计了若干剪枝策略和启发策略.进一步设计了kDSP-Miner的多线程版本,以提高其对高维序列元素情况的处理能力.通过在真实世界数据集上的详实实验,验证了算法的有效性和执行效率.  相似文献   

16.
The problem of optimal non-hierarchical clustering is addressed. A new algorithm combining differential evolution and k-means is proposed and tested on eight well-known real-world data sets. Two criteria (clustering validity indexes), namely TRW and VCR, were used in the optimization of classification. The classification of objects to be optimized is encoded by the cluster centers in differential evolution (DE) algorithm. It induced the problem of rearrangement of centers in the population to ensure an efficient search via application of evolutionary operators. A new efficient heuristic for this rearrangement was also proposed. The plain DE variants with and without the rearrangement were compared with corresponding hybrid k-means variants. The experimental results showed that hybrid variants with k-means algorithm are essentially more efficient than the non-hybrid ones. Compared to a standard k-means algorithm with restart, the new hybrid algorithm was found more reliable and more efficient, especially in difficult tasks. The results for TRW and VCR criterion were compared. Both criteria provided the same optimal partitions and no significant differences were found in efficiency of the algorithms using these criteria.  相似文献   

17.
We present an efficient parameterized algorithm for the (k,t)-set packing problem, in which we are looking for a collection of k disjoint sets whose union consists of t elements. The complexity of the algorithm is O(2O(t)nNlogN). For the special case of sets of bounded size, this improves the O(k(ck)n) algorithm of Jia et al. [J. Algorithms 50 (1) (2004) 106].  相似文献   

18.
The leading partitional clustering technique, k-modes, is one of the most computationally efficient clustering methods for categorical data. However, in the k-modes-type algorithms, the performance of their clustering depends on initial cluster centers and the number of clusters needs be known or given in advance. This paper proposes a novel initialization method for categorical data which is implemented to the k-modes-type algorithms. The proposed method can not only obtain the good initial cluster centers but also provide a criterion to find candidates for the number of clusters. The performance and scalability of the proposed method has been studied on real data sets. The experimental results illustrate that the proposed method is effective and can be applied to large data sets for its linear time complexity with respect to the number of data points.  相似文献   

19.
This article presents a novel type of queries in spatial databases, called the direction-aware bichromatic reverse k nearest neighbor(DBRkNN) queries, which extend the bichromatic reverse nearest neighbor queries. Given two disjoint sets, P and S, of spatial objects, and a query object q in S, the DBRkNN query returns a subset P′ of P such that k nearest neighbors of each object in P′ include q and each object in P′ has a direction toward q within a pre-defined distance. We formally define the DBRkNN query, and then propose an efficient algorithm, called DART, for processing the DBRkNN query. Our method utilizes a grid-based index to cluster the spatial objects, and the B+-tree to index the direction angle. We adopt a filter-refinement framework that is widely used in many algorithms for reverse nearest neighbor queries. In the filtering step, DART eliminates all the objects that are away from the query object more than a pre-defined distance, or have an invalid direction angle. In the refinement step, remaining objects are verified whether the query object is actually one of the k nearest neighbors of them. As a major extension of DART, we also present an improved algorithm, called DART+, for DBRkNN queries. From extensive experiments with several datasets, we show that DART outperforms an R-tree-based naive algorithm in both indexing time and query processing time. In addition, our extension algorithm, DART+, also shows significantly better performance than DART.  相似文献   

20.
We provide in this article a branch-and-bound algorithm that solves the problem of finding the k closest pairs of points (p,q), p?∈?P,q?∈?Q, considering two sets of points in the euclidean plane P,Q stored in external memory assuming that only one of the sets has a spatial index. This problem arises naturally in many scenarios, for instance when the set without an index is the answer to a spatial query. The main idea of our algorithm is to partition the space occupied by the set without an index into several cells or subspaces and to make use of the properties of a set of metrics defined on two Minimum Bounding Rectangles (MBRs). We evaluated our algorithm for different values of k by means of a series of experiments that considered both synthetical and real world datasets. We compared the performance of our algorithm with that of techniques that either assume that both datasets have a spatial index or that none has an index. The results show that our algorithm needs only between a 0.3 and a 35 % of the disk accesses required by such techniques. Our algorithm also shows a good scalability, both in terms of k and of the size of the data set.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号