Similar literature
 20 similar documents found; search time 31 ms
1.
Suffix arrays are a powerful data structure for pattern detection and matching. In previous work, we presented a novel algorithm (COV), which is the only algorithm that detects all repeated patterns in a time series by using the actual suffix array. However, the requirement to store the actual suffix strings, even on external media, makes suffix arrays impractical for very large time series. We have already proved that the concept of the Longest Expected Repeated Pattern (LERP) allows the actual suffixes to be stored in linear space O(n) on external media. Repeated pattern detection using LERP has analogous time complexity, which makes the analysis of large time series feasible and limited only by the size of the external media rather than by memory. Yet there are cases where hardware limitations may be an obstacle to analyzing very large time series of size comparable to hard disk capacity. With the Moving LERP (MLERP) method introduced in this paper, it is possible to analyze very large time series (tens or hundreds of thousands of times larger than what LERP can analyze) by making maximal use of the available hardware. Further, when empirical knowledge of the distribution of repeated pattern lengths is available, MLERP can achieve better time performance than the standard LERP method, and much better performance than any other pattern matching algorithm or brute-force techniques, which are infeasible in a practical (human) time frame. Thus, we argue that MLERP is a very useful tool for detecting all repeated patterns in a time series regardless of its size and hardware limitations.
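
As a rough illustration of the idea behind this abstract, the sketch below finds repeated patterns with a naively built suffix array. It is not the COV, LERP, or MLERP algorithm: the function names and the `max_len` parameter are assumptions introduced here, with `max_len` standing in for a LERP-style cap on how many characters of each suffix are compared.

```python
def suffix_array(s, max_len=None):
    """Sort suffix start positions, comparing at most max_len characters
    of each suffix when max_len is given (a LERP-like truncation)."""
    if max_len is None:
        return sorted(range(len(s)), key=lambda i: s[i:])
    return sorted(range(len(s)), key=lambda i: s[i:i + max_len])


def repeated_patterns(s, max_len=None):
    """Map every substring that occurs at least twice to its start positions."""
    sa = suffix_array(s, max_len)
    found = {}
    for a, b in zip(sa, sa[1:]):
        # Longest common prefix of two neighbouring suffixes.
        limit = min(len(s) - a, len(s) - b)
        if max_len is not None:
            limit = min(limit, max_len)
        lcp = 0
        while lcp < limit and s[a + lcp] == s[b + lcp]:
            lcp += 1
        # Every prefix of the common part is a repeated pattern.
        for k in range(1, lcp + 1):
            found.setdefault(s[a:a + k], set()).update((a, b))
    return {p: sorted(pos) for p, pos in found.items()}


patterns = repeated_patterns("abcabcab")
print(patterns["abc"])   # start positions of "abc"
print(sorted(repeated_patterns("abcabcab", max_len=2)))
```

Sorting whole suffixes this way takes quadratic space in the worst case, which is the storage problem the abstract describes; capping the compared length bounds the stored text per suffix, at the cost of reporting only patterns up to that length.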

2.
Hoare logic [1] is a logic used to specify the semantics of programming languages, and it has been extended to separation logic to reason about mutable heap structures [2]. In a model M of Hoare logic, each program α induces an M-computable function \(f_\alpha^M\) on the universe of M, and the M-recursive functions are defined on M. It is proved that the class of all M-computable functions \(f_\alpha^M\) induced by programs is equal to the class of all M-recursive functions. Moreover, each M-recursive function is \(\Sigma_1^{N^M}\)-definable in M, where the universal quantifier is a number quantifier ranging over the standard part of a nonstandard model M.

3.
4.
Existing definitions of the relativizations of \({\bf NC}^1\), L and NL do not preserve the inclusions \({{\bf NC}^1 \subseteq {\bf L}, {\bf NL}\subseteq {\bf AC}^1}\). We start by giving the first definitions that preserve them. Here, for L and NL, we define their relativizations using Wilson’s stack oracle model, but limit the height of the stack to a constant (instead of log(n)). We show that the collapse of any two classes in \({\{{\bf AC}^0 (m), {\bf TC}^0, {\bf NC}^1, {\bf L}, {\bf NL}\}}\) implies the collapse of their relativizations. Next we exhibit an oracle α that makes \({\bf AC}^k(\alpha)\) a proper hierarchy. This strengthens and clarifies the separations of the relativized theories in Takeuti (1995). The idea is that a circuit whose nested depth of oracle gates is bounded by k cannot correctly compute the (k + 1)-fold composition of every oracle function. Finally, we develop theories that characterize the relativizations of subclasses of P by modifying theories previously defined by the second two authors. A function is provably total in a theory iff it is in the corresponding relativized class, and hence the oracle separations imply separations for the relativized theories.

5.
An approach to the stabilization of nonlinear oscillations in multidimensional spaces is proposed on the basis of V.I. Zubov’s stability theory for invariant sets. As a special case, the derived controls make it possible to excite self-oscillating regimes in specified state subspaces \(R^{2k} \subset R^{2n}\) with simultaneous oscillation damping on the Cartesian products \(R^{2n-2k}\).

6.
In (Bauwens and Shen, J. Symb. Log. 79(2), 620–632, 2013) a short proof is given that some strings have maximal plain Kolmogorov complexity but not maximal prefix-free complexity. We argue that the proof technique is useful to simplify existing proofs and to solve open questions. We present a short proof of a result due to Robert Solovay that relates plain and prefix complexity:

7.
We consider the problem of mining web access patterns with a super-pattern constraint. This constraint requires that the sequential patterns in the sequence database contain a particular set of patterns as sub-patterns. One common application of this constraint is web usage mining, which mines user access behavior on the web. In this paper, we introduce an efficient strategy for mining web access patterns with a super-pattern constraint that requires only one database scan. First, we present the MWAPC (Mining Web Access Patterns based on super-pattern Constraint) algorithm, in which each frequent pattern is checked to see whether it contains at least one pattern from a user-defined set of patterns. Then we develop an effective algorithm, called EMWAPC, that prunes the search space at the beginning of the mining process and, based on three proposed propositions, avoids checking the constraints one by one. We have conducted experiments on real web log databases. The experimental results show that the proposed algorithms outperform the previous methods.
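
The per-pattern check described in this abstract can be sketched as follows. This is a minimal illustration, not the MWAPC or EMWAPC algorithm: it assumes that "contains as a sub-pattern" means order-preserving subsequence containment, and the function names and sample page names are hypothetical.

```python
def contains_subpattern(pattern, sub):
    """True if `sub` occurs in `pattern` as an order-preserving subsequence."""
    remaining = iter(pattern)
    # `item in remaining` advances the iterator, so matches must occur in order.
    return all(item in remaining for item in sub)


def satisfies_super_pattern_constraint(pattern, required):
    """True if `pattern` contains at least one of the user-defined patterns."""
    return any(contains_subpattern(pattern, sub) for sub in required)


required = [["home", "cart"], ["search", "checkout"]]
visit = ["home", "search", "cart", "help"]
print(satisfies_super_pattern_constraint(visit, required))
```

Checking every frequent pattern this way is the one-by-one test that the abstract says EMWAPC avoids through early pruning.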

8.
A \(B_4\)-valued propositional logic is proposed in this paper, in which there are three unary logical connectives ~1, ~2, ¬ and two binary logical connectives ∧, ∨. A Gentzen-type deduction system is given that is sound and complete with respect to the \(B_4\)-valued semantics, where \(B_4\) is a Boolean algebra.

9.
This paper presents a novel online learning algorithm for eight popular nonlinear (i.e., kernel) classifiers, based on classic stochastic gradient descent in the primal domain. In particular, the online learning algorithm is derived for the following classifiers: L1 and L2 support vector machines with both a quadratic regularizer \(w^T w\) and the \(l_1\) regularizer \(|w|_1\); regularized huberized hinge loss; regularized kernel logistic regression; regularized exponential loss with the \(l_1\) regularizer \(|w|_1\); and least squares support vector machines. The online learning algorithm is aimed primarily at designing classifiers for large datasets. The novel learning model is accurate, fast and extremely simple (i.e., it comprises only a few lines of code). Comparisons of the performance of the proposed algorithm with a state-of-the-art support vector machine algorithm on a few real datasets are shown.

10.
In the problem of finding the stabilizing solution of the algebraic Riccati equation, the resolvent \(\Theta(s) = (sI_{2n} - H)^{-1}\) of the 2n × 2n Hamiltonian matrix H of the algebraic Riccati equation allows us to reduce the problem to a linear matrix equation. In [1], the constructions necessary for this, and a theorem on the existence and representation of the stabilizing solutions of an algebraic Riccati equation, were proposed. In this paper, methods of constructing the resolvent and the linear reduction matrix defined by it, which are necessary for applying the theorem, are proposed, together with algorithms for constructing the stabilizing solution of the algebraic Riccati equation.

11.
Model-based testing has mainly focused on models where concurrency is interpreted as interleaving (like the ioco theory for labeled transition systems), which may be too coarse when one wants concurrency to be preserved in the implementation. In order to test such concurrent systems, we choose to use Petri nets as specifications and define a concurrent conformance relation named co-ioco. We present a test generation algorithm based on Petri net unfolding that is able to build a complete test suite w.r.t. our co-ioco conformance relation. In addition, we propose several coverage criteria for selecting finite prefixes of an unfolding in order to build manageable test suites.

12.
Many scholarly writings today are available in electronic formats. With universities around the world choosing to make digital versions of their dissertations, theses, project reports, and related files and data sets available online, an overwhelming amount of information is becoming available on almost any particular topic. How will users decide which dissertation, or subsection of a dissertation, to read to get the required information on a particular topic? What kind of services can such digital libraries provide to make knowledge discovery easier? In this paper, we investigate these issues, using as a case study the Networked Digital Library of Theses and Dissertations (NDLTD), a rapidly growing collection that already has about 800,000 Electronic Theses and Dissertations (ETDs) from universities around the world. We propose the design of a scalable, Web Services-based tool, KDWebS (Knowledge Discovery System based on Web Services), to facilitate automated knowledge discovery in NDLTD. We also provide some preliminary proof-of-concept results to demonstrate the efficacy of the approach.

13.
The Longest Previous non-overlapping Factor table (LPnF) stores, for each position of a string, the maximal length of factors occurring both there and in the preceding part of the string. The notion is a slight variant of the previously described LPF table, which is used for text compression. The LPnF table is an essential element in the design of efficient algorithms on strings, as it is related to a certain type of Ziv-Lempel factorisation used for this purpose. We show how to compute the LPnF table in linear time from the suffix array of the string when it is drawn from an integer alphabet. The algorithm is a non-immediate extension of the LPF computation, and it requires no sophisticated data structure other than the suffix array of the input string.
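
To make the definition concrete, the following sketch computes the LPnF table directly from the definition above. It is a brute-force illustration only, not the linear-time suffix-array algorithm of the paper, and the function name `lpnf_naive` is an assumption introduced here.

```python
def lpnf_naive(s):
    """LPnF[i] = length of the longest factor starting at i that also occurs
    at some earlier position j and ends before i (no overlap)."""
    n = len(s)
    table = [0] * n
    for i in range(n):
        best = 0
        for j in range(i):
            k = 0
            # Extend while inside s and while the earlier copy stays left of i.
            while i + k < n and j + k < i and s[j + k] == s[i + k]:
                k += 1
            best = max(best, k)
        table[i] = best
    return table


print(lpnf_naive("aaaa"))
print(lpnf_naive("abab"))
```

The condition `j + k < i` is what separates LPnF from the plain LPF table, where the earlier occurrence may overlap position i.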

14.
15.
Query optimization in Big Data has become a promising research direction due to the popularity of massive data analytical systems such as Hadoop. It is becoming hard to execute JOIN queries efficiently on Hive, the query language on top of Hadoop, over limited Big Data storage. In our previous work, the HiveQL Optimization for JOIN query over Multi-session Environment (HOME) system was introduced on top of Hadoop to improve its performance by storing intermediate results to avoid repeated computations. Time overheads and Big Data storage limitations are considered the main drawbacks of the HOME system, especially when additional physical storage is used or extra virtualized storage is rented. In this paper, an index-based system for reusing data, called indexing HiveQL Optimization for JOIN over Multi-session Big Data Environment (iHOME), is proposed to overcome the HOME overheads by storing only the indexes of the joined rows instead of storing the full intermediate results directly. Moreover, the proposed iHOME system addresses eight cases of JOIN queries, which are classified into three groups: Similar-to-iHOME, Compute-on-iHOME, and Filter-of-iHOME. According to the experimental results of the iHOME system using the TPC-H benchmark, the execution time of the eight JOIN queries using iHOME on Hive is reduced. Also, the stored data size in the iHOME system is reduced relative to the HOME system, which saves Big Data storage. Thus, as the stored data size increases, the iHOME system maintains space scalability and overcomes the storage limitation.

16.
Human action recognition is a hot research topic; however, changes in shape, high variability of appearance, dynamic backgrounds, potential occlusions in different actions and the imaging limits of 2D sensors make it difficult. To address these problems, we pay more attention to the depth channel and to the fusion of different features. We first extract different features from depth image sequences, and then propose a multi-feature mapping and dictionary learning model (MMDLM) to discover the relationship between these features, in which two dictionaries and a feature mapping function are learned simultaneously. These dictionaries can fully characterize the structural information of the different features, while the feature mapping function is a regularization term that can reveal the intrinsic relationship between the two features. Large-scale experiments on two public depth datasets, MSRAction3D and DHA, show that the performance of the different depth features differs considerably, but that the features are complementary. Further, feature fusion by MMDLM is efficient and effective on both datasets, with performance comparable to state-of-the-art methods.

17.
This paper presents an agent-based simulator for environmental land change that includes efficient and parallel auto-tuning. This simulator extends the Multi-Agent System for Environmental simulation (MASE) by introducing rationality to agents using a mentalistic approach, the Belief-Desire-Intention (BDI) model, and is thus named MASE-BDI. Because the manual tuning of simulation parameters is an error-prone, labour-intensive and computing-intensive task, an auto-tuning approach with efficient multi-objective optimization algorithms is also introduced. Further, parallelization techniques are employed to speed up the auto-tuning process by deploying it on parallel systems. MASE-BDI is compared to MASE using the Brazilian Cerrado biome case. MASE-BDI reduces the simulation execution times by at least 82× and slightly improves the simulation quality. The auto-tuning algorithms, by evaluating less than 0.00115% of a search space with 6 million parameter combinations, are able to quickly tune the simulation model, regardless of the objective used. Moreover, the experimental results show that executing the tuning in parallel leads to speedups of approximately 11× compared to sequential execution in a hardware setting with 16 CPU cores.

18.
What do the k-core structures of real-world graphs look like? What are the common patterns and the anomalies? How can we exploit them for applications? A k-core is the maximal subgraph in which all vertices have degree at least k. This concept has been applied to such diverse areas as hierarchical structure analysis, graph visualization, and graph clustering. Here, we explore pervasive patterns related to k-cores that emerge in graphs from diverse domains. Our discoveries are: (1) Mirror Pattern: coreness (i.e., maximum k such that each vertex belongs to the k-core) is strongly correlated with degree. (2) Core-Triangle Pattern: degeneracy (i.e., maximum k such that the k-core exists) obeys a 3-to-1 power law with respect to the count of triangles. (3) Structured Core Pattern: degeneracy-cores are not cliques but have non-trivial structures such as core–periphery and communities. Our algorithmic contributions show the usefulness of these patterns: (1) Core-A, which measures the deviation from Mirror Pattern, successfully spots anomalies in real-world graphs. (2) Core-D, a single-pass streaming algorithm based on Core-Triangle Pattern, accurately estimates degeneracy up to 12\(\times\) faster than its competitor. (3) Core-S, inspired by Structured Core Pattern, identifies influential spreaders up to 17\(\times\) faster than its competitors with comparable accuracy.
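
The coreness and degeneracy notions defined in this abstract can be computed with the textbook "peeling" procedure, sketched below. This is a generic illustration, not the Core-A, Core-D, or Core-S algorithms; the function name and the adjacency-dictionary input format are assumptions made for the example.

```python
import heapq


def coreness(adj):
    """adj maps each vertex to the set of its neighbours (undirected graph).
    Returns a dict giving the largest k such that the vertex lies in the k-core."""
    degree = {v: len(neighbours) for v, neighbours in adj.items()}
    heap = [(d, v) for v, d in degree.items()]
    heapq.heapify(heap)
    removed = set()
    core = {}
    k = 0
    while heap:
        d, v = heapq.heappop(heap)
        if v in removed or d != degree[v]:
            continue  # stale heap entry left over from an earlier degree update
        k = max(k, d)          # coreness never decreases along the peeling order
        core[v] = k
        removed.add(v)
        for u in adj[v]:
            if u not in removed:
                degree[u] -= 1
                heapq.heappush(heap, (degree[u], u))
    return core


# A triangle (0, 1, 2) with one pendant vertex 3 attached to vertex 2.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
core = coreness(graph)
print(core)
print(max(core.values()))  # degeneracy of the graph
```

Comparing `core[v]` with `len(graph[v])` for each vertex gives the two quantities that the Mirror Pattern relates.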

19.
This paper presents the raSAT SMT solver, which aims to handle polynomial constraints over both reals and integers with simple, unified methodologies. Its three main features are (1) a raSAT loop for inequalities, which adds testing to interval constraint propagation to accelerate SAT detection, (2) non-constructive reasoning for equations over reals based on the generalized intermediate value theorem, and (3) soundness of floating-point arithmetic, which is guaranteed by (a) rounding up/down over-approximations of intervals, and (b) confirmation of a satisfying instance detected by testing, using the iRRAM package, which guarantees error bounds.

20.
Suffix automata and factor automata are efficient data structures for representing the full index of a set of strings. They are minimal deterministic automata representing the set of all suffixes or substrings of a set of strings. This paper presents a novel analysis of the size of the suffix automaton or factor automaton of a set of strings. It shows that the suffix automaton or factor automaton of a set of strings U has at most 2Q − 2 states, where Q is the number of nodes of a prefix-tree representing the strings in U. This bound significantly improves over 2‖U‖ − 1, the bound given by Blumer et al. [A. Blumer, J. Blumer, D. Haussler, R.M. McConnell, A. Ehrenfeucht, Complete inverted files for efficient text retrieval and analysis, Journal of the ACM 34 (1987) 578–589], where ‖U‖ is the sum of the lengths of all strings in U. More generally, we give novel and general bounds for the size of the suffix or factor automaton of an automaton as a function of the size of the original automaton and the maximal length of a suffix shared by the strings it accepts. We also describe in detail a linear-time algorithm for constructing the suffix automaton S or factor automaton F of U in time O(|S|). Our algorithm applies in fact to any input suffix-unique automaton and strictly generalizes the standard on-line construction of a suffix automaton for a single input string. Our algorithm can also be used straightforwardly to generate the suffix oracle or factor oracle of a set of strings, which has been shown to have various useful properties in string-matching. Our analysis suggests that the use of factor automata of automata can be practical for large-scale applications, a fact that is further supported by the results of our experiments applying factor automata to a music identification task with more than 15,000 songs.
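
For orientation, the sketch below shows the standard on-line construction of a suffix automaton for a single input string, which this abstract says the paper's algorithm generalizes. It is the textbook single-string version, not the paper's algorithm for sets of strings or automata; the function names and the list-based state layout are assumptions made for the example.

```python
def build_suffix_automaton(s):
    """Return (transitions, suffix_links, lengths); state 0 is the initial state."""
    nxt, link, length = [{}], [-1], [0]
    last = 0
    for ch in s:
        cur = len(nxt)
        nxt.append({})
        link.append(-1)
        length.append(length[last] + 1)
        p = last
        # Add the new transition along the suffix-link chain.
        while p != -1 and ch not in nxt[p]:
            nxt[p][ch] = cur
            p = link[p]
        if p == -1:
            link[cur] = 0
        else:
            q = nxt[p][ch]
            if length[p] + 1 == length[q]:
                link[cur] = q
            else:
                # Split state q by cloning it with a shorter maximal length.
                clone = len(nxt)
                nxt.append(dict(nxt[q]))
                link.append(link[q])
                length.append(length[p] + 1)
                while p != -1 and nxt[p].get(ch) == q:
                    nxt[p][ch] = clone
                    p = link[p]
                link[q] = clone
                link[cur] = clone
        last = cur
    return nxt, link, length


def is_factor(nxt, t):
    """True if t is a substring of the indexed string."""
    state = 0
    for ch in t:
        if ch not in nxt[state]:
            return False
        state = nxt[state][ch]
    return True


transitions, _, _ = build_suffix_automaton("abcbc")
print(len(transitions), is_factor(transitions, "cbc"), is_factor(transitions, "ca"))
```

Ignoring which states are final, every path from state 0 spells a factor, so the same structure answers substring queries; for a single string of length n ≥ 2 the state count stays within the 2n − 1 bound quoted in the abstract.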
