首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
In the uniform circuit model of computation, the width of a boolean circuit exactly characterizes the “space” complexity of the computed function. Looking for a similar relationship in Valiant’s algebraic model of computation, we propose width of an arithmetic circuit as a possible measure of space. In the uniform setting, we show that our definition coincides with that of VPSPACE at polynomial width. We introduce the class VL as an algebraic variant of deterministic log-space L; VL is a subclass of VP. Further, to define algebraic variants of non-deterministic space-bounded classes, we introduce the notion of “read-once” certificates for arithmetic circuits. We show that polynomial-size algebraic branching programs (an algebraic analog of NL) can be expressed as read-once exponential sums over polynomials in ${{\sf VL}, {\it i.e.}\quad{\sf VBP} \in \Sigma^R \cdot {\sf VL}}$ . Thus, read-once exponential sums can be viewed as a reasonable way of capturing space-bounded non-determinism. We also show that Σ R ·VBPVBP, i.e. VBPs are stable under read-once exponential sums. Though the best upper bound we have for Σ R ·VL itself is VNP, we can obtain better upper bounds for width-bounded multiplicatively disjoint (md-) circuits. Without the width restriction, md- arithmetic circuits are known to capture all of VP. We show that read-once exponential sums over md- constant-width arithmetic circuits are within VP and that read-once exponential sums over md- polylog-width arithmetic circuits are within VQP. We also show that exponential sums of a skew formula cannot represent the determinant polynomial.  相似文献   

2.
We report progress on the NL versus UL problem.
  • We show that counting the number of s-t paths in graphs where the number of s-v paths for any v is bounded by a polynomial can be done in FUL: the unambiguous log-space function class. Several new upper bounds follow from this including ${{{ReachFewL} \subseteq {UL}}}$ and ${{{LFew} \subseteq {UL}^{FewL}}}$
  • We investigate the complexity of min-uniqueness—a central notion in studying the NL versus UL problem. In this regard we revisit the class OptL[log n] and introduce UOptL[log n], an unambiguous version of OptL[log n]. We investigate the relation between UOptL[log n] and other existing complexity classes.
  • We consider the unambiguous hierarchies over UL and UOptL[log n]. We show that the hierarchy over UOptL[log n] collapses. This implies that ${{{ULH} \subseteq {L}^{{promiseUL}}}}$ thus collapsing the UL hierarchy.
  • We show that the reachability problem over graphs embedded on 3 pages is complete for NL. This contrasts with the reachability problem over graphs embedded on 2 pages, which is log-space equivalent to the reachability problem in planar graphs and hence is in UL.
  •   相似文献   

    3.
    We explore relationships between circuit complexity, the complexity of generating circuits, and algorithms for analyzing circuits. Our results can be divided into two parts:
    1. Lower bounds against medium-uniform circuits. Informally, a circuit class is “medium uniform” if it can be generated by an algorithmic process that is somewhat complex (stronger than LOGTIME) but not infeasible. Using a new kind of indirect diagonalization argument, we prove several new unconditional lower bounds against medium-uniform circuit classes, including: ? For all k, P is not contained in P-uniform SIZE(n k ). That is, for all k, there is a language \({L_k \in {\textsf P}}\) that does not have O(n k )-size circuits constructible in polynomial time. This improves Kannan’s lower bound from 1982 that NP is not in P-uniform SIZE(n k ) for any fixed k. ? For all k, NP is not in \({{\textsf P}^{\textsf NP}_{||}-{\textsf {uniform SIZE}}(n^k)}\) .This also improves Kannan’s theorem, but in a different way: the uniformity condition on the circuits is stronger than that on the language itself. ? For all k, LOGSPACE does not have LOGSPACE-uniform branching programs of size n k .
    2. Eliminating non-uniformity and (non-uniform) circuit lower bounds. We complement these results by showing how to convert any potential simulation of LOGTIME-uniform NC 1 in ACC 0/poly or TC 0/poly into a medium-uniform simulation using small advice. This lemma can be used to simplify the proof that faster SAT algorithms imply NEXP circuit lower bounds and leads to the following new connection: ? Consider the following task: given a TC 0 circuit C of n O(1) size, output yes when C is unsatisfiable, and output no when C has at least 2 n-2 satisfying assignments. (Behavior on other inputs can be arbitrary.) Clearly, this problem can be solved efficiently using randomness. If this problem can be solved deterministically in 2 n-ω(log n) time, then \({{\textsf{NEXP}} \not \subset {\textsf{TC}}^0/{\rm poly}}\) .
    Another application is to derandomize randomized TC 0 simulations of NC 1 on almost all inputs: ?Suppose \({{\textsf{NC}}^1 \subseteq {\textsf{BPTC}}^0}\) . Then, for every ε > 0 and every language L in NC 1, there is a LOGTIME?uniform TC 0 circuit family of polynomial size recognizing a language L′ such that L and L′ differ on at most \({2^{n^{\epsilon}}}\) inputs of length n, for all n.  相似文献   

    4.
    We consider transactional memory contention management in the context of balanced workloads, where if a transaction is writing, the number of write operations it performs is a constant fraction of its total reads and writes. We explore the theoretical performance boundaries of contention management in balanced workloads from the worst-case perspective by presenting and analyzing two new polynomial time contention management algorithms. We analyze the performance of a contention management algorithm by comparison with an optimal offline contention management algorithm to provide a competitive ratio. The first algorithm Clairvoyant is $O(\sqrt{s})$ -competitive, where s is the number of shared resources. This algorithm depends on explicitly knowing the conflict graph at each time step of execution. The second algorithm Non-Clairvoyant is $O(\sqrt{s} \cdot \log n)$ -competitive, with high probability, which is only a O(log?n) factor worse, but does not require knowledge of the conflict graph, where n is the number of transactions. Both of these algorithms are greedy. We also prove that the performance of Clairvoyant is close to optimal, since there is no polynomial time contention management algorithm for the balanced transaction scheduling problem that is better than $O((\sqrt{s})^{1-\varepsilon})$ -competitive for any constant ε>0, unless NP?ZPP. To our knowledge, these results are significant improvements over the best previously known O(s) competitive ratio bound.  相似文献   

    5.
    Given a DNF formula f on n variables, the two natural size measures are the number of terms or size s(f) and the maximum width of a term w(f). It is folklore that small DNF formulas can be made narrow: if a formula has m terms, it can be ${\epsilon}$ -approximated by a formula with width ${{\rm log}(m/{\epsilon})}$ . We prove a converse, showing that narrow formulas can be sparsified. More precisely, any width w DNF irrespective of its size can be ${\epsilon}$ -approximated by a width w DNF with at most ${(w\, {\rm log}(1/{\epsilon}))^{O(w)}}$ terms. We combine our sparsification result with the work of Luby & Velickovic (1991, Algorithmica 16(4/5):415–433, 1996) to give a faster deterministic algorithm for approximately counting the number of satisfying solutions to a DNF. Given a formula on n variables with poly(n) terms, we give a deterministic ${n^{\tilde{O}({\rm log}\, {\rm log} (n))}}$ time algorithm that computes an additive ${\epsilon}$ approximation to the fraction of satisfying assignments of f for ${\epsilon = 1/{\rm poly}({\rm log}\, n)}$ . The previous best result due to Luby and Velickovic from nearly two decades ago had a run time of ${n^{{\rm exp}(O(\sqrt{{\rm log}\, {\rm log} n}))}}$ (Luby & Velickovic 1991, in Algorithmica 16(4/5):415–433, 1996).  相似文献   

    6.
    A space-bounded Stack Machine is a regular Turing Machine with a read-only input tape, several space-bounded read-write work tapes, and an unbounded stack. Stack Machines with a logarithmic space bound have been connected to other classical models of computation, such as polynomial-time Turing Machines (P) (Cook in J Assoc Comput Mach 18:4–18, 1971) and polynomial size, polylogarithmic depth, bounded fan-in circuits (NC) e.g., Borodin et al. (SIAM J Comput 18, 1989). In this paper, we present significant new lower bounds and techniques for Stack Machines. This comes in the form of a trade-off lower bound between space and number of passes over the input tape. Specifically, we give an explicit permuted inner product function such that any Stack Machine computing this function requires either ${\Omega (N^{1/4 - \epsilon})}$ space or ${\Omega (N^{1/4 - \epsilon})}$ number of passes for every constant ${\epsilon > 0}$ , where N is the input size. In the case of logarithmic space Stack Machines, this yields an unconditional ${\Omega (N^{1/4 - \epsilon})}$ lower bound for the number of passes. To put this result in perspective, we note that Stack Machines with logarithmic space and a single pass over the input can compute Parity, Majority, as well as certain languages outside NC. The latter follows from Allender (J Assoc Comput Mach 36:912–928, 1989), conditional on the widely believed complexity assumption that PSPACE ${\subsetneq}$ EXP. Our technique is a novel communication complexity reduction, thereby extending the already wide range of models of computation for which communication complexity can be used to obtain lower bounds. Informally, we show that a k-player number-in-hand (NIH) communication protocol for a base function f can efficiently simulate a space- and pass-bounded Stack Machine for a related function F, which consists of several “permuted” instances of f, bundled together by a combining function h. Trade-off lower bounds for Stack Machines then follow from known communication complexity lower bounds. The framework for this reduction was given by Beame & Huynh-Ngoc (2008), who used it to obtain similar trade-off lower bounds for Turing Machines with a constant number of pass-bounded external tapes. We also prove that the latter cannot efficiently simulate Stack Machines, conditional on the complexity assumption that E ${\not \subset}$ PSPACE. It is the treatment of an unbounded stack which constitutes the main technical novelty in our communication complexity reduction.  相似文献   

    7.
    We give a self-reduction for the Circuit Evaluation problem (CircEval) and prove the following consequences.
    1. Amplifying size–depth lower bounds. If CircEval has Boolean circuits of n k size and n 1?δ depth for some k and δ, then for every ${\epsilon > 0}$ , there is a δ′ > 0 such that CircEval has circuits of ${n^{1 + \epsilon}}$ size and ${n^{1- \delta^{\prime}}}$ depth. Moreover, the resulting circuits require only ${\tilde{O}(n^{\epsilon})}$ bits of non-uniformity to construct. As a consequence, strong enough depth lower bounds for Circuit Evaluation imply a full separation of P and NC (even with a weak size lower bound).
    2. Lower bounds for quantified Boolean formulas. Let c, d > 1 and e < 1 satisfy c < (1 ? e d )/d. Either the problem of recognizing valid quantified Boolean formulas (QBF) is not solvable in TIME[n c ], or the Circuit Evaluation problem cannot be solved with circuits of n d size and n e depth. This implies unconditional polynomial-time uniform circuit lower bounds for solving QBF. We also prove that QBF does not have n c -time uniform NC circuits, for all c < 2.
      相似文献   

    8.
    Prolate elements are a “plug-compatible” modification of spectral elements in which Legendre polynomials are replaced by prolate spheroidal wave functions of order zero. Prolate functions contain a“bandwidth parameter” $c \ge 0 $ c ≥ 0 whose value is crucial to numerical performance; the prolate functions reduce to Legendre polynomials for $c\,=\,0$ c = 0 . We show that the optimal bandwidth parameter $c$ c not only depends on the number of prolate modes per element $N$ N , but also on the element widths $h$ h . We prove that prolate elements lack $h$ h -convergence for fixed $c$ c in the sense that the error does not go to zero as the element size $h$ h is made smaller and smaller. Furthermore, the theoretical predictions that Chebyshev and Legendre polynomials require $\pi $ π degrees of freedom per wavelength to resolve sinusoidal functions while prolate series need only 2 degrees of freedom per wavelength are asymptotic limits as $N \rightarrow \infty $ N → ∞ ; we investigate the rather different behavior when $N \sim O(4-10)$ N ~ O ( 4 ? 10 ) as appropriate for spectral elements and prolate elements. On the other hand, our investigations show that there are certain combinations of $N,\,h$ N , h and $c>0$ c > 0 where a prolate basis clearly outperforms the Legendre polynomial approximation.  相似文献   

    9.
    This paper describes the Automated Reasoning for Mizar ( $\textsf{Miz}\mathbb{AR}$ ) service, which integrates several automated reasoning, artificial intelligence, and presentation tools with Mizar and its authoring environment. The service provides ATP assistance to Mizar authors in finding and explaining proofs, and offers generation of Mizar problems as challenges to ATP systems. The service is based on a sound translation from the Mizar language to that of first-order ATP systems, and relies on the recent progress in application of ATP systems in large theories containing tens of thousands of available facts. We present the main features of $\textsf{Miz}\mathbb{AR}$ services, followed by an account of initial experiments in finding proofs with the ATP assistance. Our initial experience indicates that the tool offers substantial help in exploring the Mizar library and in preparing new Mizar articles.  相似文献   

    10.
    Zeev Nutov 《Algorithmica》2012,63(1-2):398-410
    We consider the (undirected) Node Connectivity Augmentation (NCA) problem: given a graph J=(V,E J ) and connectivity requirements $\{r(u,v): u,v \in V\}$ , find a minimum size set I of new edges (any edge is allowed) such that the graph JI contains r(u,v) internally-disjoint uv-paths, for all u,vV. In Rooted NCA there is sV such that r(u,v)>0 implies u=s or v=s. For large values of k=max? u,vV r(u,v), NCA is at least as hard to approximate as Label-Cover and thus it is unlikely to admit an approximation ratio polylogarithmic in k. Rooted NCA is at least as hard to approximate as Hitting-Set. The previously best approximation ratios for the problem were O(kln?n) for NCA and O(ln?n) for Rooted NCA. In this paper we give an approximation algorithm with ratios O(kln?2 k) for NCA and O(ln?2 k) for Rooted NCA. This is the first approximation algorithm with ratio independent of?n, and thus is a constant for any fixed k. Our algorithm is based on the following new structural result which is of independent interest. If $\mathcal{D}$ is a set of node pairs in a graph?J, then the maximum degree in the hypergraph formed by the inclusion minimal tight sets separating at least one pair in $\mathcal{D}$ is O(? 2), where ? is the maximum connectivity in J of a pair in $\mathcal{D}$ .  相似文献   

    11.
    The Hamiltonian Cycle problem is the problem of deciding whether an n-vertex graph G has a cycle passing through all vertices of G. This problem is a classic NP-complete problem. Finding an exact algorithm that solves it in ${\mathcal {O}}^{*}(\alpha^{n})$ time for some constant α<2 was a notorious open problem until very recently, when Björklund presented a randomized algorithm that uses ${\mathcal {O}}^{*}(1.657^{n})$ time and polynomial space. The Longest Cycle problem, in which the task is to find a cycle of maximum length, is a natural generalization of the Hamiltonian Cycle problem. For a claw-free graph G, finding a longest cycle is equivalent to finding a closed trail (i.e., a connected even subgraph, possibly consisting of a single vertex) that dominates the largest number of edges of some associated graph H. Using this translation we obtain two deterministic algorithms that solve the Longest Cycle problem, and consequently the Hamiltonian Cycle problem, for claw-free graphs: one algorithm that uses ${\mathcal {O}}^{*}(1.6818^{n})$ time and exponential space, and one algorithm that uses ${\mathcal {O}}^{*}(1.8878^{n})$ time and polynomial space.  相似文献   

    12.
    We introduce two new natural decision problems, denoted as ? RATIONAL NASH and ? IRRATIONAL NASH, pertinent to the rationality and irrationality, respectively, of Nash equilibria for (finite) strategic games. These problems ask, given a strategic game, whether or not it admits (i) a rational Nash equilibrium where all probabilities are rational numbers, and (ii) an irrational Nash equilibrium where at least one probability is irrational, respectively. We are interested here in the complexities of ? RATIONAL NASH and ? IRRATIONAL NASH. Towards this end, we study two other decision problems, denoted as NASH-EQUIVALENCE and NASH-REDUCTION, pertinent to some mutual properties of the sets of Nash equilibria of two given strategic games with the same number of players. The problem NASH-EQUIVALENCE asks whether or not the two sets of Nash equilibria coincide; we identify a restriction of its complementary problem that witnesses ? RATIONAL NASH. The problem NASH-REDUCTION asks whether or not there is a so called Nash reduction: a suitable map between corresponding strategy sets of players that yields a Nash equilibrium of the former game from a Nash equilibrium of the latter game; we identify a restriction of NASH-REDUCTION that witnesses ? IRRATIONAL NASH. As our main result, we provide two distinct reductions to simultaneously show that (i) NASH-EQUIVALENCE is co- $\mathcal{NP}$ -hard and ? RATIONAL NASH is $\mathcal{NP}$ -hard, and (ii) NASH-REDUCTION and ? IRRATIONAL NASH are both $\mathcal{NP}$ -hard, respectively. The reductions significantly extend techniques previously employed by Conitzer and Sandholm (Proceedings of the 18th Joint Conference on Artificial Intelligence, pp. 765–771, 2003; Games Econ. Behav. 63(2), 621–641, 2008).  相似文献   

    13.
    Given a graph with n vertices, k terminals and positive integer weights not larger than c, we compute a minimum Steiner Tree in $\mathcal{O}^{\star}(2^{k}c)$ time and $\mathcal{O}^{\star}(c)$ space, where the $\mathcal{O}^{\star}$ notation omits terms bounded by a polynomial in the input-size. We obtain the result by defining a generalization of walks, called branching walks, and combining it with the Inclusion-Exclusion technique. Using this combination we also give $\mathcal{O}^{\star}(2^{n})$ -time polynomial space algorithms for Degree Constrained Spanning Tree, Maximum Internal Spanning Tree and #Spanning Forest with a given number of components. Furthermore, using related techniques, we also present new polynomial space algorithms for computing the Cover Polynomial of a graph, Convex Tree Coloring and counting the number of perfect matchings of a graph.  相似文献   

    14.
    This paper studies notions of locality that are inherent to the specification of distributed tasks by identifying fundamental relationships between the various scales of computation, from the individual process to the whole system. A locality property called projection-closed is identified. This property completely characterizes tasks that are wait-free checkable, where a task $T =(\mathcal{I },\mathcal{O },\varDelta )$ T = ( I , O , Δ ) is said to be checkable if there exists a distributed algorithm that, given $s\in \mathcal{I }$ s ∈ I and $t\in \mathcal{O }$ t ∈ O , determines whether $t\in \varDelta {(s)}$ t ∈ Δ ( s ) , i.e., whether $t$ t is a valid output for $s$ s according to the specification of $T$ T . Projection-closed tasks are proved to form a rich class of tasks. In particular, determining whether a projection-closed task is wait-free solvable is shown to be undecidable. A stronger notion of locality is identified by considering tasks whose outputs “look identical” to the inputs at every process: a task $T= (\mathcal{I },\mathcal{O },\varDelta )$ T = ( I , O , Δ ) is said to be locality-preserving if $\mathcal{O }$ O is a covering complex of $\mathcal{I }$ I . We show that this topological property yields obstacles for wait-free solvability different in nature from the classical impossibility results. On the other hand, locality-preserving tasks are projection-closed, and thus they are wait-free checkable. A classification of locality-preserving tasks in term of their relative computational power is provided. This is achieved by defining a correspondence between subgroups of the edgepath group of an input complex and locality-preserving tasks. This correspondence enables to demonstrate the existence of hierarchies of locality-preserving tasks, each one containing, at the top, the universal task (induced by the universal covering complex), and, at the bottom, the trivial identity task.  相似文献   

    15.
    We study the problem of maximizing the weighted number of just-in-time jobs on a single machine with position-dependent processing times. Unlike the vast majority of the literature, we do not restrict ourselves to a specific model of processing time function. Rather, we assume that the processing time function can be of any functional structure that is according to one of the following two cases. The first is the case where the job processing times are under a learning effect, i.e., each job processing time is a non-increasing function of its position in the sequence. In the second case, an aging effect is assumed, i.e., each job processing time is a non-decreasing function of its position in the sequence. We prove that the problem is strongly $\mathcal{N }\mathcal{P }$ N P -hard under a learning effect, even if all the weights are identical. When there is an aging effect, we introduce a dynamic programming (DP) procedure that solves the problem with arbitrary weights in $O(n^{3})$ O ( n 3 ) time (where $n$ n is the number of jobs). For identical weights, a faster optimization algorithm that runs in $O(n\log n)$ O ( n log n ) time is presented. We also extend the analysis to the case of scheduling on a set of $m$ m parallel unrelated machines and provide a DP procedure that solves the problem in polynomial time, given that $m$ m is fixed and that the jobs are under an aging effect.  相似文献   

    16.
    Raz’s parallel repetition theorem (SIAM J Comput 27(3):763–803, 1998) together with improvements of Holenstein (STOC, pp 411–419, 2007) shows that for any two-prover one-round game with value at most ${1- \epsilon}$ 1 - ? (for ${\epsilon \leq 1/2}$ ? ≤ 1 / 2 ), the value of the game repeated n times in parallel on independent inputs is at most ${(1- \epsilon)^{\Omega(\frac{\epsilon^2 n}{\ell})}}$ ( 1 - ? ) Ω ( ? 2 n ? ) , where ? is the answer length of the game. For free games (which are games in which the inputs to the two players are uniform and independent), the constant 2 can be replaced with 1 by a result of Barak et al. (APPROX-RANDOM, pp 352–365, 2009). Consequently, ${n=O(\frac{t \ell}{\epsilon})}$ n = O ( t ? ? ) repetitions suffice to reduce the value of a free game from ${1- \epsilon}$ 1 - ? to ${(1- \epsilon)^t}$ ( 1 - ? ) t , and denoting the input length of the game by m, it follows that ${nm=O(\frac{t \ell m}{\epsilon})}$ n m = O ( t ? m ? ) random bits can be used to prepare n independent inputs for the parallel repetition game. In this paper, we prove a derandomized version of the parallel repetition theorem for free games and show that O(t(m?)) random bits can be used to generate correlated inputs, such that the value of the parallel repetition game on these inputs has the same behavior. That is, it is possible to reduce the value from ${1- \epsilon}$ 1 - ? to ${(1- \epsilon)^t}$ ( 1 - ? ) t while only multiplying the randomness complexity by O(t) when m = O(?). Our technique uses strong extractors to “derandomize” a lemma of Raz and can be also used to derandomize a parallel repetition theorem of Parnafes et al. (STOC, pp 363–372, 1997) for communication games in the special case that the game is free.  相似文献   

    17.
    The parallel complexity class $\textsf{NC}$ 1 has many equivalent models such as polynomial size formulae and bounded width branching programs. Caussinus et al. (J. Comput. Syst. Sci. 57:200–212, 1992) considered arithmetizations of two of these classes, $\textsf{\#NC}$ 1 and $\textsf{\#BWBP}$ . We further this study to include arithmetization of other classes. In particular, we show that counting paths in branching programs over visibly pushdown automata is in $\textsf{FLogDCFL}$ , while counting proof-trees in logarithmic width formulae has the same power as $\textsf{\#NC}$ 1. We also consider polynomial-degree restrictions of $\textsf{SC}$ i , denoted $\textsf{sSC}$ i , and show that the Boolean class $\textsf{sSC}$ 1 is sandwiched between $\textsf{NC}$ 1 and $\textsf{L}$ , whereas $\textsf{sSC}$ 0 equals $\textsf{NC}$ 1. On the other hand, the arithmetic class $\textsf{\#sSC}$ 0 contains $\textsf{\#BWBP}$ and is contained in $\textsf{FL}$ , and $\textsf{\#sSC}$ 1 contains $\textsf{\#NC}$ 1 and is in $\textsf{SC}$ 2. We also investigate some closure properties of the newly defined arithmetic classes.  相似文献   

    18.
    We study the problem of computing canonical forms for graphs and hypergraphs under Abelian group action and show tight complexity bounds. Our approach is algebraic. We transform the problem of computing canonical forms for graphs to the problem of computing canonical forms for associated algebraic structures, and we develop parallel algorithms for these associated problems.
    1. In our first result we show that the problem of computing canonical labelings for hypergraphs of color class size 2 is logspace Turing equivalent to solving a system of linear equations over the field $\mathbb {F} _{2}$ . This implies a deterministic NC 2 algorithm for the problem.
    2. Similarly, we show that the problem of canonical labeling graphs and hypergraphs under arbitrary Abelian permutation group action is fairly well characterized by the problem of computing the integer determinant. In particular, this yields deterministic NC 3 and randomized NC 2 algorithms for the problem.
      相似文献   

    19.
    We present a data structure that stores a sequence s[1..n] over alphabet [1..σ] in $n\mathcal{H}_{0}(s) + o(n)(\mathcal {H}_{0}(s){+}1)$ bits, where $\mathcal{H}_{0}(s)$ is the zero-order entropy of s. This structure supports the queries access, rank and select, which are fundamental building blocks for many other compressed data structures, in worst-case time ${\mathcal{O} ( {\lg\lg\sigma} )}$ and average time ${\mathcal{O} ( {\lg\mathcal{H}_{0}(s)} )}$ . The worst-case complexity matches the best previous results, yet these had been achieved with data structures using $n\mathcal{H}_{0}(s)+o(n\lg \sigma)$ bits. On highly compressible sequences the o(nlgσ) bits of the redundancy may be significant compared to the $n\mathcal{H}_{0}(s)$ bits that encode the data. Our representation, instead, compresses the redundancy as well. Moreover, our average-case complexity is unprecedented. Our technique is based on partitioning the alphabet into characters of similar frequency. The subsequence corresponding to each group can then be encoded using fast uncompressed representations without harming the overall compression ratios, even in the redundancy. The result also improves upon the best current compressed representations of several other data structures. For example, we achieve (i) compressed redundancy, retaining the best time complexities, for the smallest existing full-text self-indexes; (ii) compressed permutations π with times for π() and π ?1() improved to loglogarithmic; and (iii) the first compressed representation of dynamic collections of disjoint sets. We also point out various applications to inverted indexes, suffix arrays, binary relations, and data compressors. Our structure is practical on large alphabets. Our experiments show that, as predicted by theory, it dominates the space/time tradeoff map of all the sequence representations, both in synthetic and application scenarios.  相似文献   

    20.
    Query result clustering has attracted considerable attention as a means of providing users with a concise overview of results. However, little research effort has been devoted to organizing the query results for entities which refer to real-world concepts, e.g., people, products, and locations. Entity-level result clustering is more challenging because diverse similarity notions between entities need to be supported in heterogeneous domains, e.g., image resolution is an important feature for cameras, but not for fruits. To address this challenge, we propose a hybrid relationship clustering algorithm, called Hydra, using co-occurrence and numeric features. Algorithm Hydra captures diverse user perceptions from co-occurrence and disambiguates different senses using feature-based similarity. In addition, we extend Hydra into ${\mathsf{Hydra }_\mathsf{gData }}$ Hydra gData with different sources, i.e., entity types and crowdsourcing. Experimental results show that the proposed algorithms achieve effectiveness and efficiency in real-life and synthetic datasets.  相似文献   

    设为首页 | 免责声明 | 关于勤云 | 加入收藏

    Copyright©北京勤云科技发展有限公司  京ICP备09084417号