Similar Documents
Found 20 similar documents (search time: 967 ms).
1.
J. Garcke, M. Griebel, M. Thess. Computing, 2001, 67(3): 225-253
O(h_n^{-1} n^{d-1}) instead of O(h_n^{-d}) grid points and unknowns are involved. Here d denotes the dimension of the feature space and h_n = 2^{-n} gives the mesh size. To be precise, we suggest using the sparse grid combination technique [42], where the classification problem is discretized and solved on a certain sequence of conventional grids with uniform mesh sizes in each coordinate direction. The sparse grid solution is then obtained from the solutions on these different grids by linear combination. In contrast to other sparse grid techniques, the combination method is simpler to use and can be parallelized in a natural and straightforward way. We describe the sparse grid combination technique for the classification problem in terms of the regularization network approach. We then give implementation details and discuss the complexity of the algorithm. It turns out that the method scales only linearly with the number of instances, i.e. the amount of data to be classified. Finally we report on the quality of the classifier built by our new method. Here we consider standard test problems from the UCI repository and problems with huge synthetic data sets in up to 9 dimensions. It turns out that our new method achieves correctness rates which are competitive with those of the best existing methods. Received April 25, 2001
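For orientation, the combination technique solves the problem on a family of anisotropic grids indexed by a multi-index l and adds the partial solutions with alternating binomial weights. A sketch of the standard combination formula, written from memory of the technique cited as [42] (the paper's exact index set may differ):

```latex
% f_l: solution on the grid with mesh size 2^{-l_i} in coordinate direction i
f_n^{c}(\mathbf{x}) = \sum_{q=0}^{d-1} (-1)^{q} \binom{d-1}{q}
    \sum_{|\mathbf{l}|_1 = n + (d-1) - q} f_{\mathbf{l}}(\mathbf{x})
```

Each grid in the sum is uniform but coarse in most directions, hence cheap, which is what makes the method naturally parallel across grids.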

2.
In this paper, we consider the problem of generating all maximal cliques in a sparse graph in polynomial delay. Given a graph G = (V, E) with n vertices and m edges, the latest and fastest polynomial delay algorithm for sparse graphs enumerates all maximal cliques in O(Δ^4) time delay, where Δ is the maximum degree of the vertices. However, it requires O(n·m) preprocessing time. We improve it in two aspects. First, our algorithm needs no preprocessing; it is therefore a truly polynomial delay algorithm. Second, our algorithm enumerates all maximal cliques in O(Δ·H^3) time delay, where H is the so-called H-value of the graph, i.e., the smallest integer satisfying |{v ∈ V : δ(v) ≥ H}| ≤ H, with δ(v) the degree of vertex v. In real-world network data, H is usually small and much smaller than Δ.
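To make the H-value concrete, here is a minimal sketch (our own illustration, not code from the paper) computing it from the degree sequence:

```python
def h_value(degrees):
    """Smallest integer H such that at most H vertices have degree >= H."""
    degrees = sorted(degrees, reverse=True)
    for H in range(len(degrees) + 1):
        # vertices with degree >= H form a prefix of the sorted sequence
        if sum(1 for d in degrees if d >= H) <= H:
            return H

# A star K_{1,9} has degrees [9, 1, ..., 1]: only one vertex has degree >= 2,
# so H = 2 even though the maximum degree is 9.
print(h_value([9] + [1] * 9))  # -> 2
```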

3.
An O(|E| log^2 n) algorithm is presented to construct the visibility graph for a collection of n nonintersecting line segments, where |E| is the number of edges in the visibility graph. This algorithm is much faster than the O(n^2)-time and O(n^2)-space algorithms by Asano et al., and by Welzl, on sparse visibility graphs. Thus we partially resolve an open problem raised by Welzl. Further, our algorithm uses only O(n) working storage.

4.
This paper presents several deterministic algorithms for selecting the kth largest record from a set of n records on any n-node hypercubic network. All of the algorithms are based on the selection algorithm of Cole and Yap, as well as on various sorting algorithms for hypercubic networks. Our fastest algorithm runs in O(lg n lg* n) time, very nearly matching the trivial lower bound. Previously, the best upper bound known for selection was O(lg n lg lg n). A key subroutine in our O(lg n lg* n) time selection algorithm is a sparse version of the Sharesort algorithm that sorts n records using p processors in O(lg n (lg lg p - lg lg(p/n))^2) time. Received March 23, 1994; revised October 30, 1997.

5.
In this paper we treat the static dictionary problem, very well known in computer science. It consists in storing a set S of m elements drawn from the range [1...n] so that membership queries on S's elements can be handled in O(1) time. It can be approached as a table compression problem in which a size-n table has m ones and the other elements are zeros. We focus our attention on the sparse case (m ≪ n). We use a simple algorithm to solve the problem and make an average-case analysis of the total space required when the input derives from the uniform probability distribution. We also find some conditions able to minimize storage requirements. We then propose and analyze a new algorithm able to reduce storage requirements drastically, to O(m^{4/3}). Received December 1, 1997; revised March 1, 1998.

6.
Concept Decompositions for Large Sparse Text Data Using Clustering
Unlabeled document collections are becoming increasingly common and available; mining such data sets represents a major contemporary challenge. Using words as features, text documents are often represented as high-dimensional and sparse vectors; a few thousand dimensions and a sparsity of 95 to 99% are typical. In this paper, we study a certain spherical k-means algorithm for clustering such document vectors. The algorithm outputs k disjoint clusters, each with a concept vector that is the centroid of the cluster normalized to have unit Euclidean norm. As our first contribution, we empirically demonstrate that, owing to the high dimensionality and sparsity of the text data, the clusters produced by the algorithm have a certain fractal-like and self-similar behavior. As our second contribution, we introduce concept decompositions to approximate the matrix of document vectors; these decompositions are obtained by taking the least-squares approximation onto the linear subspace spanned by all the concept vectors. We empirically establish that the approximation errors of the concept decompositions are close to the best possible, namely, to truncated singular value decompositions. As our third contribution, we show that the concept vectors are localized in the word space, are sparse, and tend towards orthonormality. In contrast, the singular vectors are global in the word space and are dense. Nonetheless, we observe the surprising fact that the linear subspaces spanned by the concept vectors and the leading singular vectors are quite close in the sense of small principal angles between them. In conclusion, the concept vectors produced by the spherical k-means algorithm constitute a powerful sparse and localized basis for text data sets.
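A minimal dense-matrix sketch of the spherical k-means iteration described above (our own illustration; names are ours, rows of X are assumed nonzero, and a real implementation would operate on sparse matrices):

```python
import numpy as np

def spherical_kmeans(X, k, iters=50, seed=0):
    """Cluster unit-normalized document vectors by cosine similarity; each
    concept vector is the cluster centroid renormalized to unit Euclidean norm."""
    rng = np.random.default_rng(seed)
    X = X / np.linalg.norm(X, axis=1, keepdims=True)          # unit documents
    concepts = X[rng.choice(len(X), size=k, replace=False)]   # init from data
    for _ in range(iters):
        labels = (X @ concepts.T).argmax(axis=1)  # most similar concept vector
        for j in range(k):
            members = X[labels == j]
            if len(members):
                c = members.sum(axis=0)
                concepts[j] = c / np.linalg.norm(c)           # concept vector
    return labels, concepts
```

The concept decomposition then approximates the document matrix by its least-squares projection onto the span of the k concept vectors.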

7.
Dissecting Euclidean d-space with the power diagram of n weighted point sites partitions a given m-point set into clusters, one cluster for each region of the diagram. In this manner, an assignment of points to sites is induced. We show the equivalence of such assignments to constrained Euclidean least-squares assignments. As a corollary, there always exists a power diagram whose regions partition a given d-dimensional m-point set into clusters of prescribed sizes, no matter where the sites are placed. Another consequence is that constrained least-squares assignments can be computed by finding suitable weights for the sites. In the plane, this takes roughly O(n^2 m) time and optimal space O(m), which improves on previous methods. We further show that a constrained least-squares assignment can be computed by solving a specially structured linear program in n+1 dimensions. This leads to an algorithm for iteratively improving the weights, based on the gradient-descent method. Besides having the obvious optimization property, least-squares assignments are shown to be useful in solving a certain transportation problem, and in finding a least-squares fitting of two point sets where translation and scaling are allowed. Finally, we extend the concept of a constrained least-squares assignment to continuous distributions of points, thereby obtaining existence results for power diagrams with prescribed region volumes. These results are related to Minkowski's theorem for convex polytopes. The aforementioned iterative method for approximating the desired power diagram applies to continuous distributions as well. Received May 30, 1995; revised June 25, 1996.
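The induced assignment is simple to state in code. A small sketch (ours, not the paper's): each point goes to the site minimizing the power distance ||x - s_i||^2 - w_i, and tuning the weights w_i is what steers the cluster sizes:

```python
import numpy as np

def power_assign(points, sites, weights):
    """points: (m, d); sites: (n, d); weights: (n,).
    Returns, for each point, the index of the site with smallest power distance;
    the regions of constant assignment are exactly the power diagram's cells."""
    sq_dist = ((points[:, None, :] - sites[None, :, :]) ** 2).sum(axis=2)
    return (sq_dist - weights[None, :]).argmin(axis=1)
```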

8.
L. Guo, H. Chen. Computing, 2006, 77(2): 205-221
In this paper, an H^1-Galerkin mixed finite element method is proposed for the 1-D regularized long wave (RLW) equation u_t + u_x + uu_x - δu_{xxt} = 0. The existence of unique solutions of the semi-discrete and fully discrete H^1-Galerkin mixed finite element methods is proved, and optimal error estimates are established. Our method can simultaneously approximate the scalar unknown and the vector flux effectively, without requiring the LBB consistency condition. Finally, some numerical results are provided to illustrate the efficacy of our method.

9.
We consider the problem of computing a minimum cycle basis of an undirected non-negative edge-weighted graph G with m edges and n vertices. In this problem, a {0,1} incidence vector is associated with each cycle, and the vector space over GF(2) generated by these vectors is the cycle space of G. A set of cycles is called a cycle basis of G if it forms a basis for its cycle space. A cycle basis where the sum of the weights of the cycles is minimum is called a minimum cycle basis of G. Minimum cycle bases are useful in a number of contexts, e.g. the analysis of electrical networks and structural engineering. The previous best algorithm for computing a minimum cycle basis has running time O(m^ω n), where ω is the best exponent of matrix multiplication; it is presently known that ω < 2.376. We exhibit an O(m^2 n + mn^2 log n) algorithm. When the edge weights are integers, we have an O(m^2 n) algorithm. For unweighted graphs which are reasonably dense, our algorithm runs in O(m^ω) time. For any ε > 0, we also design a (1+ε)-approximation algorithm. The running time of this algorithm is O((m^ω/ε) log(W/ε)) for reasonably dense graphs, where W is the largest edge weight. A preliminary version of this paper appeared in Kavitha et al. (31st International Colloquium on Automata, Languages and Programming (ICALP), pp. 846-857, 2004). T. Kavitha and K.E. Paluch were at the Max-Planck-Institut für Informatik, Saarbrücken, Germany, while this work was done.
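The GF(2) view is easy to make concrete. A sketch (ours, not from the paper): each cycle becomes an edge-incidence bit vector packed into an integer, and independence is tested by Gaussian elimination over GF(2):

```python
def gf2_rank(cycle_vectors):
    """Rank over GF(2) of {0,1} incidence vectors, each packed as a Python int
    (bit i set iff edge i lies on the cycle). A set of cycles is a cycle basis
    of a connected graph iff it has rank m - n + 1."""
    pivots = {}  # leading-bit position -> basis vector with that leading bit
    for v in cycle_vectors:
        while v:
            lead = v.bit_length() - 1
            if lead not in pivots:
                pivots[lead] = v
                break
            v ^= pivots[lead]  # eliminate the leading bit, keep reducing
    return len(pivots)

# Two copies of the same triangle (edges e0, e1, e2) are dependent: rank 1.
print(gf2_rank([0b111, 0b111]))  # -> 1
```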

10.
We report our development of a vision-based motion planning system for an autonomous motorcycle designed for desert terrain, where uniform road surfaces and lane markings are not present. The motion planning is based on a vision vector space (V2-Space), a set of unit vectors representing local collision-free directions in the image coordinate system. The V2-Space is constructed by extracting the vectors based on the similarity of adjacent pixels, which captures both the color information and the directional information from prior vehicle tire tracks and pedestrian footsteps. We report how the V2-Space is constructed to reduce the impact of varying lighting conditions in outdoor environments. We also show how the V2-Space can be used to incorporate vehicle kinematic, dynamic, and time-delay constraints in motion planning to fit the highly dynamic requirements of the motorcycle. The combined algorithm of V2-Space construction and motion planning runs in O(n) time, where n is the number of pixels in the captured image. Experiments show that our algorithm outputs correct robot motion commands more than 90% of the time.

11.
Generalized linear models with L1 and L2 regularization are a widely used technique for solving classification, class probability estimation, and regression problems. With the numbers of both features and examples growing rapidly in fields like text mining and clickstream data analysis, parallelization and the use of cluster architectures become important. We present a novel algorithm for fitting regularized generalized linear models in a distributed environment. The algorithm splits data between nodes by features, uses coordinate descent on each node, and uses a line search to merge results globally. A convergence proof is provided. A modification of the algorithm addresses the slow-node problem. For the important particular case of logistic regression, we empirically compare our program with several state-of-the-art approaches that rely on different algorithmic and data splitting methods. Experiments demonstrate that our approach is scalable and superior when training on large and sparse datasets.
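To illustrate the per-node building block, here is a single-node coordinate-descent sketch for the L1-penalized least-squares special case (our own simplification; the paper's algorithm handles general GLM losses, splits features across nodes, and merges the per-node steps with a global line search):

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, iters=100):
    """Cyclic coordinate descent for (1/2n)||y - Xw||^2 + lam*||w||_1.
    Assumes no all-zero feature column."""
    n, d = X.shape
    w = np.zeros(d)
    r = y - X @ w                    # residual, kept up to date incrementally
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(iters):
        for j in range(d):
            r += X[:, j] * w[j]      # remove coordinate j's contribution
            rho = X[:, j] @ r        # correlation of feature j with residual
            w[j] = soft_threshold(rho, lam * n) / col_sq[j]
            r -= X[:, j] * w[j]      # restore with the updated coefficient
    return w
```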

12.
J. H. Reif. Algorithmica, 2001, 29(3): 487-510
This paper is concerned with the problem of computing the characteristic polynomial of a matrix. In a large number of applications, the matrices are symmetric and sparse: with O(n) non-zero entries. The problem has an efficient sequential solution in this case, requiring O(n^2) work by use of the sparse Lanczos method. A major remaining open question is to find a polylog time parallel algorithm with matching work bounds. Unfortunately, the sparse Lanczos method cannot be parallelized to run faster than time Ω(n) using n processors. Let M(n) be the processor bound to multiply two n × n matrices in O(log n) parallel time. Giesbrecht [G2] gave the best previous polylog time parallel algorithms for the characteristic polynomial of a dense matrix, with O(M(n)) processors. There is no known improvement to this processor bound in the case where the matrix is sparse. Often, in addition to being symmetric and sparse, the matrix has a sparsity graph (which has edges between indices of the matrix with non-zero entries) that has small separators. This paper gives a new algorithm for computing the characteristic polynomial of a sparse symmetric matrix, assuming that the sparsity graph is s(n)-separable: it has a separator of size s(n) = O(n^γ), for some γ with 0 < γ < 1, whose deletion leaves connected components of at most αn vertices, for some 0 < α < 1, each with the same property. We derive an interesting algebraic version of Nested Dissection, which constructs a sparse factorization of the matrix A - λI_n, where A is the input matrix and I_n is the n × n identity matrix. While Nested Dissection is commonly used to minimize the fill-in in the solution of sparse linear systems, our innovation is to use the separator structure to bound also the work for manipulation of rational functions in the recursively factored matrices. The matrix elements are assumed to be over an arbitrary field. We compute the characteristic polynomial of a sparse symmetric matrix in polylog time using P(n)(n + M(s(n))) ≤ P(n)(n + s(n)^{2.376}) processors, where P(n) is the processor bound to multiply two degree-n polynomials in O(log n) parallel time using a PRAM (P(n) = O(n) if the field supports an FFT of size n, and O(n log log n) otherwise [CK]). Our method requires only that the matrix be symmetric and non-singular (it need not be positive definite, as is usual for Nested Dissection techniques). For the frequently occurring case where the matrix has small separator size, our polylog parallel algorithm has work bounds competitive with the best known sequential algorithms (i.e., the Ω(n^2) work of sparse Lanczos methods); for example, when the sparsity graph is a planar graph, s(n) ≤ O(√n), and we require polylog time with only P(n)·n^{1.188} processors. Received September 26, 1997; revised June 5, 1999.

13.
Standard support vector machine (SVM) training algorithms have O(l^3) computational and O(l^2) space complexity, where l is the training set size; they are thus computationally infeasible on very large data sets. To alleviate the computational burden in SVM training, we propose an algorithm to train SVMs on a set of bound vectors that is extracted based on Fisher projection. For linearly separable problems we use the linear Fisher discriminant to compute the projection line, while for non-linearly separable problems we use the kernel Fisher discriminant. In each case, we select as bound vectors a certain ratio of samples whose projections are adjacent to those of the other class. Theoretical analysis shows that the proposed algorithm has low computational and space complexity. Extensive experiments on several classification benchmarks demonstrate the effectiveness of our approach.
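A sketch of the linear-case selection step (our own reading of the idea; the names, the ridge term for invertibility, and the fixed ratio are assumptions, not details from the paper):

```python
import numpy as np

def select_bound_vectors(X, y, ratio=0.2):
    """Project onto the linear Fisher discriminant direction and keep, from each
    class, the given fraction of samples whose projections lie nearest the
    other class; these serve as the bound vectors for SVM training."""
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    Sw = (np.cov(X0, rowvar=False) * (len(X0) - 1)
          + np.cov(X1, rowvar=False) * (len(X1) - 1)
          + 1e-6 * np.eye(X.shape[1]))          # within-class scatter + ridge
    w = np.linalg.solve(Sw, m1 - m0)            # Fisher direction
    p0, p1 = X0 @ w, X1 @ w                     # 1-D projections
    k0, k1 = max(1, int(ratio * len(X0))), max(1, int(ratio * len(X1)))
    # class 0 projects low and class 1 high along w, so the boundary samples
    # are class 0's largest projections and class 1's smallest
    return np.vstack([X0[np.argsort(p0)[-k0:]], X1[np.argsort(p1)[:k1]]])
```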

14.
Given a set P of polygons in three-dimensional space, two points p and q are said to be visible from each other with respect to P if the line segment joining them does not intersect any polygon in P. A point p is said to be completely visible from an area source S if p is visible from every point in S. The completely visible region CV(S, P) from S with respect to P is defined as the set of all points in three-dimensional space that are completely visible from S. We present two algorithms for computing CV(S, P) for P with a total of n vertices and a convex polygonal source S with m vertices. Our first result is a divide-and-conquer algorithm which runs in O(m^2 n^2 α(mn)) time and space, where α(mn) is the inverse of Ackermann's function. We next give an incremental algorithm for computing CV(S, P) in O(m^2 n + mn^2 α(n)) time and O(mn + n^2) space. We also prove that CV(S, P) consists of Θ(mn + n^2) surface elements such as vertices, edges, and faces. Received November 16, 1995; revised November 11, 1996.

15.
We give an O(log^4 n)-time, O(n^2)-processor CRCW PRAM algorithm to find a hamiltonian cycle in a strong semicomplete bipartite digraph B, provided that a factor of B (i.e., a collection of vertex-disjoint cycles covering the vertex set of B) is computed in a preprocessing step. The factor is found (if it exists) using a bipartite matching algorithm, hence placing the whole algorithm in the class Random-NC. We show that any parallel algorithm which can check the existence of a hamiltonian cycle in a strong semicomplete bipartite digraph in time O(r(n)) using p(n) processors can be used to check the existence of a perfect matching in a bipartite graph in time O(r(n) + n^2/p(n)) using p(n) processors. Hence, our problem belongs to the class NC if and only if perfect matching in bipartite graphs belongs to NC. We also consider the problem of finding a hamiltonian path in a semicomplete bipartite digraph.

16.
We present efficient algorithms for computing very sparse low distortion spanners in distributed networks and prove some non-trivial lower bounds on the tradeoff between time, sparseness, and distortion. All of our algorithms assume a synchronized distributed network, where relatively short messages may be communicated in each time step. Our first result is a fast distributed algorithm for finding an O(2^{log* n} log n)-spanner with size O(n). Besides being nearly optimal in time and distortion, this algorithm appears to be the first that constructs an O(n)-size skeleton without requiring unbounded length messages or time proportional to the diameter of the network. Our second result is a new class of efficiently constructible (α, β)-spanners called Fibonacci spanners whose distortion improves with the distance being approximated. At their sparsest Fibonacci spanners can have nearly linear size, namely O(n (log log n)^φ), where φ = (1 + √5)/2 is the golden ratio. As the distance increases the multiplicative distortion of a Fibonacci spanner passes through four discrete stages, moving from logarithmic to log-logarithmic, then into a period where it is constant, tending to 3, followed by another period tending to 1. On the lower bound side we prove that many recent sequential spanner constructions have no efficient counterparts in distributed networks, even if the desired distortion only needs to be achieved on the average or for a tiny fraction of the vertices. In particular, any distance preservers, purely additive spanners, or spanners with sublinear additive distortion must either be very dense, slow to construct, or have very weak guarantees on distortion.

17.
The Snaer program calculates the posterior mean and variance of variables, on some of which we have data (with precisions), on some of which we have prior information (with precisions), and on some of which prior indicator ratios (with precisions) are available. The variables must satisfy a number of exact restrictions. The system is both large and sparse. Two aspects of the statistical and computational development are a practical procedure for solving a linear integer system, and a stable linearization routine for ratios. The numerical method for solving large sparse linear least-squares estimation problems is tested and found to perform well, even when the n×k design matrix is large (nk = O(10^8)).

18.
The greedy algorithm produces high-quality spanners and, therefore, is used in several applications. However, even for points in d-dimensional Euclidean space, the greedy algorithm has near-cubic running time. In this paper, we present an algorithm that computes the greedy spanner for a set of n points in a metric space with bounded doubling dimension in O(n^2 log n) time. Since computing the greedy spanner has an Ω(n^2) lower bound, the time complexity of our algorithm is optimal to within a logarithmic factor.
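The greedy spanner itself has a two-line definition, which this naive sketch implements (ours, for illustration; it is far slower than the paper's O(n^2 log n) algorithm): scan pairs by increasing distance and add an edge only if the spanner built so far does not already connect the pair within t times their distance.

```python
import heapq, math

def greedy_spanner(points, t):
    """Return the edge list of the greedy t-spanner of a Euclidean point set."""
    n = len(points)
    pairs = sorted((math.dist(points[u], points[v]), u, v)
                   for u in range(n) for v in range(u + 1, n))
    adj = {u: [] for u in range(n)}

    def spanner_dist(src, dst, bound):
        # Dijkstra over the current spanner edges, pruned at `bound`
        best, heap = {src: 0.0}, [(0.0, src)]
        while heap:
            d, u = heapq.heappop(heap)
            if u == dst:
                return d
            if d > best.get(u, math.inf) or d > bound:
                continue
            for v, w in adj[u]:
                if d + w < best.get(v, math.inf):
                    best[v] = d + w
                    heapq.heappush(heap, (d + w, v))
        return math.inf

    edges = []
    for d, u, v in pairs:
        if spanner_dist(u, v, t * d) > t * d:   # no t-approximate path yet
            adj[u].append((v, d))
            adj[v].append((u, d))
            edges.append((u, v))
    return edges
```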

19.
In this paper we give a fully dynamic approximation scheme for maintaining all-pairs shortest paths in planar networks. Given an error parameter ε, our algorithm maintains approximate all-pairs shortest paths in an undirected planar graph G with nonnegative edge lengths. The approximate paths are guaranteed to be accurate to within a 1+ε factor. The time bound for both query and update in our algorithm is O(ε^{-1} n^{2/3} log^2 n log D), where n is the number of nodes in G and D is the sum of its edge lengths. The time bound for queries is worst case, while that for updates is amortized. Our approximation algorithm is based upon a novel technique for approximately representing all-pairs shortest paths among a selected subset of the nodes by a sparse substitute graph. Received January 1995; revised February 1997.

20.
C. C. McGeoch. Algorithmica, 1995, 13(5): 426-441
The essential subgraph H of a weighted graph or digraph G contains an edge (v, w) if that edge is uniquely the least-cost path between its vertices. Let s denote the number of edges of H. This paper presents an algorithm for solving all-pairs shortest paths on G that requires O(ns + n^2 log n) worst-case running time. In general the time is equivalent to that of solving n single-source problems using only edges in H. For general models of random graphs and digraphs G, s = O(n log n) almost surely. The subgraph H is optimal in the sense that it is the smallest subgraph sufficient for solving shortest-path problems in G. Lower bounds on the largest-cost edge of H and on the diameter of H and G are obtained for general randomly weighted graphs. Experimental results produce some new conjectures about essential subgraphs and distances in graphs with uniform edge costs. Much of this research was carried out while the author was a Visiting Fellow at the Center for Discrete Mathematics and Theoretical Computer Science (DIMACS).
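The defining membership test for H is easy to code directly. A sketch (ours, undirected case): edge (u, v) is essential iff every path avoiding the edge itself costs strictly more than its weight, checked with one Dijkstra run that forbids the edge:

```python
import heapq, math

def in_essential_subgraph(adj, u, v):
    """adj: {vertex: [(neighbor, weight), ...]}, undirected, with edge (u, v).
    True iff (u, v) is the unique least-cost path between its endpoints."""
    w_uv = dict(adj[u])[v]
    best, heap = {u: 0.0}, [(0.0, u)]
    while heap:
        d, x = heapq.heappop(heap)
        if d > best.get(x, math.inf):
            continue
        for y, w in adj[x]:
            if (x, y) in ((u, v), (v, u)):
                continue                 # forbid the edge under test
            if d + w < best.get(y, math.inf):
                best[y] = d + w
                heapq.heappush(heap, (d + w, y))
    return w_uv < best.get(v, math.inf)  # strictly beats every detour
```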
