蒋瑜 (Jiang Yu). 《控制与决策》 (Control and Decision), 2019, 34(6): 1253-1258
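The discernibility matrix offers a useful approach to attribute reduction, and the discernibility information tree can effectively eliminate redundant elements of the discernibility matrix and store the matrix in compressed form. However, the discernibility information tree considers neither the role of "core" attributes in eliminating redundant elements of the discernibility matrix nor the role of attribute order in compressing the storage of its non-empty elements. To address this, an improved discernibility information tree is proposed, based on "core" attributes and the attribute order relation; this tree further compresses the storage of the non-empty elements of the discernibility matrix. Finally, simulation results on UCI datasets are given, and they verify the effectiveness of the tree.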
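
To make the terms in this abstract concrete, the following is a minimal sketch of a discernibility matrix, its core attributes, and the removal of redundant entries by absorption. It is not the improved tree from the paper; the function names and the toy decision table are hypothetical, chosen only for illustration.

```python
from itertools import combinations


def discernibility_entries(rows, decisions):
    """Return the non-empty discernibility-matrix entries as frozensets of attribute indices."""
    entries = []
    for i, j in combinations(range(len(rows)), 2):
        if decisions[i] == decisions[j]:
            continue  # objects with the same decision need not be discerned
        diff = frozenset(
            a for a, (u, v) in enumerate(zip(rows[i], rows[j])) if u != v
        )
        if diff:
            entries.append(diff)
    return entries


def core_attributes(entries):
    """Attributes that occur as singleton entries form the core."""
    return {next(iter(e)) for e in entries if len(e) == 1}


def absorb(entries):
    """Drop duplicates and every entry that is a proper superset of another entry."""
    unique = set(entries)
    return {e for e in unique if not any(other < e for other in unique)}


# Hypothetical toy decision table: three condition attributes, one decision.
rows = [(0, 0, 1), (0, 1, 1), (1, 1, 0), (1, 0, 0)]
decisions = [0, 1, 1, 0]
entries = discernibility_entries(rows, decisions)
print(core_attributes(entries), absorb(entries))
```

In this toy table, every entry that contains the core attribute is absorbed by the singleton entry for that attribute, which is the kind of redundancy the abstract says core attributes help eliminate.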

An optimal probabilistic-planning algorithm solves a problem, usually modeled by a Markov decision process, by finding an optimal policy. In this paper, we study the k best policies problem: finding the k best policies of a discrete Markov decision process. The k best policies, k > 1, cannot be found directly using dynamic programming. Naïvely, finding the k-th best policy can be Turing reduced to the optimal planning problem, but the number of problems queried in the naïve algorithm is exponential in k. We show empirically that solving the k best policies problem by using this reduction requires unreasonable amounts of time even when k = 3. We then provide two new algorithms. The first is a complete algorithm, based on our theoretical contribution that the k-th best policy differs from the i-th policy, for some i < k, on exactly one state. The second is an approximate algorithm that skips many less useful policies. We show that both algorithms have good scalability. We also show that the approximate algorithm runs much faster and finds interesting, high-quality policies.

The addition sequence problem asks: given a set of numbers X = {n_1, n_2, ..., n_m}, what is the minimal number of additions needed to compute all m numbers starting from 1? This problem is NP-complete. In this paper, we present a branch and bound algorithm to generate an addition sequence with a minimal number of elements for a set X by using a new strategy. Then we improve the generation by generalizing some results on addition chains (m = 1) to addition sequences and finding what we will call a presumed upper bound for each n_j, 1 ≤ j ≤ m, in the search tree.
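
As an illustration of the problem statement only, here is a small sketch that finds a shortest addition sequence by iterative-deepening depth-first search. This is a plain exhaustive search under simple pruning, not the branch and bound strategy or the presumed upper bound of the paper; the function name is hypothetical, and the targets are assumed to be positive integers.

```python
def shortest_addition_sequence(targets):
    """Return a shortest increasing sequence starting at 1 that contains every
    target and in which each later element is the sum of two earlier elements."""
    targets = set(targets)
    goal = max(targets)

    def search(chain, limit):
        last = chain[-1]
        # The chain is strictly increasing, so a skipped target can never be added.
        if any(t < last and t not in chain for t in targets):
            return None
        if last == goal:
            return list(chain)
        steps_left = limit - (len(chain) - 1)
        # Even repeated doubling cannot reach the goal in the remaining steps.
        if steps_left == 0 or (last << steps_left) < goal:
            return None
        sums = {a + b for a in chain for b in chain}
        for nxt in sorted((s for s in sums if last < s <= goal), reverse=True):
            found = search(chain + [nxt], limit)
            if found is not None:
                return found
        return None

    limit = 0
    while True:  # iterative deepening on the number of additions
        result = search([1], limit)
        if result is not None:
            return result
        limit += 1


seq = shortest_addition_sequence({5, 7})
print(seq, "additions:", len(seq) - 1)
```

Restricting the search to strictly increasing sequences loses nothing, because any valid sequence stays valid after sorting. The running time is still exponential, consistent with the NP-completeness noted in the abstract.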

Various sorting algorithms using parallel architectures have been proposed in the search for more efficient results. This paper introduces the Multi-Sort algorithm for the Multi-Mesh of Trees (MMT) architecture for N = n^4 elements, with more efficient time complexity compared to previous architectures. The shear sort algorithm on the Single Instruction Multiple Data (SIMD) mesh model requires \(4\sqrt{N}+O(\sqrt{N})\) time for sorting N elements arranged on a \(\sqrt{N}\times \sqrt{N}\) mesh, whereas the Multi-Sort algorithm on the SIMD Multi-Mesh (MM) architecture takes O(N^{1/4}) time for sorting the same N elements, which shows that Multi-Sort is a better sorting approach. We have improved the time complexity of the intrablock sort. The communication time complexity for the 2D sort in MM is O(n), whereas in MMT it is O(log n). The time complexity of the compare-exchange step in MMT is the same as that in MM, i.e., O(n). The time complexity of Multi-Sort on MMT is found to be improved relative to that on the Multi-Mesh architecture.
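
For readers unfamiliar with the baseline mentioned above, the following is a sequential sketch of shear sort on a rectangular grid. It only simulates the alternating row and column phases in ordinary Python and says nothing about the parallel running times quoted in the abstract; the Multi-Sort algorithm itself is not shown. The helper names are hypothetical.

```python
import math


def shear_sort(grid):
    """Sort a rectangular grid into snake-like (boustrophedon) row-major order."""
    grid = [list(row) for row in grid]
    rows = len(grid)
    phases = math.ceil(math.log2(rows)) + 1 if rows > 1 else 1

    def sort_rows():
        # Even-indexed rows ascend, odd-indexed rows descend.
        for i, row in enumerate(grid):
            row.sort(reverse=(i % 2 == 1))

    for _ in range(phases):
        sort_rows()
        columns = [sorted(col) for col in zip(*grid)]  # sort every column top-down
        grid[:] = [list(row) for row in zip(*columns)]
    sort_rows()  # a final row phase leaves the grid in snake order
    return grid


def snake_order(grid):
    """Read the grid left-to-right on even rows and right-to-left on odd rows."""
    return [
        x
        for i, row in enumerate(grid)
        for x in (row if i % 2 == 0 else reversed(row))
    ]


example = [[15, 3, 9, 1], [7, 12, 0, 5], [10, 2, 14, 8], [4, 13, 6, 11]]
print(snake_order(shear_sort(example)))
```

On a real mesh, each row or column phase runs in parallel across the processors; here the phases are simply executed one after another.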

An algorithm for indefinite summation of rational functions is proposed. For a given function f(x), it constructs a pair of rational functions g(x) and r(x) such that f(x) = g(x + 1) − g(x) + r(x), where the degree of the denominator of r(x) is minimal, and, when this condition is satisfied, the degree of the denominator of g(x) is also minimal.
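
A small worked instance of the decomposition f(x) = g(x + 1) − g(x) + r(x) may help. The two functions below are textbook examples chosen for illustration and are not taken from the paper.

```latex
% A fully summable case: the remainder r(x) vanishes.
\[
  f(x) = \frac{1}{x(x+1)} = \frac{1}{x} - \frac{1}{x+1}
       = g(x+1) - g(x), \qquad g(x) = -\frac{1}{x}, \quad r(x) = 0 .
\]
% A non-summable case: f(x) = 1/x has no rational g with g(x+1) - g(x) = 1/x,
% so one valid decomposition keeps the whole function in the remainder.
\[
  f(x) = \frac{1}{x}, \qquad g(x) = 0, \quad r(x) = \frac{1}{x} .
\]
```

In the first case, summing f over x = 1, ..., n telescopes to g(n + 1) − g(1) = 1 − 1/(n + 1).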

Although k-anonymity is a good way of publishing microdata for research purposes, it cannot resist several common attacks, such as attribute disclosure and the similarity attack. To resist these attacks, many refinements of k-anonymity have been proposed, with t-closeness being one of the strictest privacy models. While most existing t-closeness models address the case in which the original data have only a single sensitive attribute, data with multiple sensitive attributes are more common in practice. In this paper, we close this gap with two proposed algorithms for multiple sensitive attributes that make the published data satisfy t-closeness. Based on the observation that the values of the sensitive attributes in any equivalence class must be as spread as possible over the entire data to make the published data satisfy t-closeness, the two algorithms use different methods to partition records into groups in terms of sensitive attributes. One uses a clustering method, while the other leverages principal component analysis. Then, according to the similarity of quasi-identifier attributes, records are selected from different groups to construct an equivalence class, which will reduce the loss of information as much as possible during anonymization. Our proposed algorithms are evaluated using a real dataset. The results show that the average speed of the first proposed algorithm is slower than that of the second proposed algorithm but the former can preserve more original information. In addition, compared with related approaches, both proposed algorithms can achieve stronger protection of privacy and reduce less.

Recent years have witnessed the rapid growth of text data, and thus the increasing importance of in-depth analysis of text data for various applications. Text data are often organized in a database with documents labeled by attributes like time and location. Different documents manifest different topics. The topics of the documents may change along the attributes of the documents, and such changes have been the subject of research in the past. However, previous analysis techniques, such as topic detection and tracking, topic lifetime, and burstiness, all focus on the topic behavior of the documents in a given attribute range without contrasting it with the documents in the overall range. This paper introduces the concept of unique topics, referring to those topics that appear frequently only within a small range of documents but not in the whole range. These unique topics may reflect some unique characteristics of documents in this small range not found outside of the range. The paper aims at an efficient pruning-based algorithm that, for a user-given set of keywords and a user-given attribute, finds the maximal ranges along the given attribute and their unique topics that are highly related to the given keyword set. Thorough experiments show that the algorithm is effective in various scenarios.

Rapid advances in image acquisition and storage technology underline the need for real-time algorithms that are capable of solving large-scale image processing and computer-vision problems. The minimum s-t cut problem, which is a classical combinatorial optimization problem, is a prominent building block in many vision and imaging algorithms such as video segmentation, co-segmentation, stereo vision, multi-view reconstruction, and surface fitting, to name a few. That is why finding a real-time algorithm which optimally solves this problem is of great importance. In this paper, we introduce to computer vision Hochbaum's pseudoflow (HPF) algorithm, which optimally solves the minimum s-t cut problem. We compare the performance of HPF, in terms of execution times and memory utilization, with three leading published algorithms: (1) Goldberg's and Tarjan's Push-Relabel; (2) Boykov's and Kolmogorov's augmenting paths; and (3) Goldberg's partial augment-relabel. While the common practice in computer vision is to use either BK or PRF algorithms for solving the problem, our results demonstrate that, in general, the HPF algorithm is more efficient and utilizes less memory than these three algorithms. This strongly suggests that HPF is a great option for many real-time computer-vision problems that require solving the minimum s-t cut problem.
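
To show what solving the minimum s-t cut problem means in practice, here is a compact sketch based on shortest augmenting paths (the Edmonds-Karp method). It is deliberately not HPF, nor any of the three algorithms compared in the paper, and it makes no claim about their relative speed. The function name and the toy graph are hypothetical.

```python
from collections import deque


def min_st_cut(capacity, s, t):
    """capacity maps directed edges (u, v) to capacities.
    Returns (cut value, set of nodes on the source side of a minimum cut)."""
    residual, adj = {}, {s: set(), t: set()}
    for (u, v), c in capacity.items():
        residual[(u, v)] = residual.get((u, v), 0) + c
        residual.setdefault((v, u), 0)  # reverse edge for undoing flow
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    def reachable_from_source():
        """Breadth-first search in the residual graph; returns the parent map."""
        parent = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        return parent

    flow = 0
    while True:
        parent = reachable_from_source()
        if t not in parent:
            # No augmenting path is left: the reachable nodes form the source side.
            return flow, set(parent)
        bottleneck, v = float("inf"), t
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[(parent[v], v)])
            v = parent[v]
        v = t
        while parent[v] is not None:
            u = parent[v]
            residual[(u, v)] -= bottleneck
            residual[(v, u)] += bottleneck
            v = u
        flow += bottleneck


toy = {("s", "a"): 3, ("s", "b"): 2, ("a", "b"): 1, ("a", "t"): 2, ("b", "t"): 3}
print(min_st_cut(toy, "s", "t"))
```

In segmentation-style applications, the nodes would be pixels plus two terminals, and the returned source side would be read as one of the two labels.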

We examine the orthographic-n-point problem (OnP), which extends the perspective-n-point problem to telecentric cameras. Given a set of 3D points and their corresponding 2D points under orthographic projection, the OnP problem is the determination of the pose of the 3D point cloud with respect to the telecentric camera. We show that the OnP problem is equivalent to the unbalanced orthogonal Procrustes problem for non-coplanar 3D points and to the sub-Stiefel Procrustes problem for coplanar 3D points. To solve the OnP problem, we apply existing algorithms for the respective Procrustes problems and also propose novel algorithms. Furthermore, we evaluate the algorithms to determine their robustness and speed and conclude which algorithms are preferable in real applications. Finally, we evaluate which algorithm is most suitable as a minimal solver in a RANSAC scheme.

We introduce the novel concept of knowledge states. The knowledge state approach can be used to construct competitive randomized online algorithms and study the trade-off between competitiveness and memory. Many well-known algorithms can be viewed as knowledge state algorithms. A knowledge state consists of a distribution of states for the algorithm, together with a work function which approximates the conditional obligations of the adversary. When a knowledge state algorithm receives a request, it then calculates one or more "subsequent" knowledge states, together with a probability of transition to each. The algorithm uses randomization to select one of those subsequents to be the new knowledge state. We apply this method to randomized k-paging. The optimal minimum competitiveness of any randomized online algorithm for the k-paging problem is the kth harmonic number, \(H_{k}=\sum^{k}_{i=1}\frac{1}{i}\). Existing algorithms which achieve that optimal competitiveness must keep bookmarks, i.e., memory of the names of pages not in the cache. An \(H_{k}\)-competitive randomized algorithm for that problem which uses O(k) bookmarks is presented, settling an open question by Borodin and El-Yaniv. In the special cases where k=2 and k=3, solutions are given using only one and two bookmarks, respectively.
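
The bound H_k is easy to evaluate for the special cases named above. The snippet below is only a numeric illustration of the formula in the abstract, computed exactly with rational arithmetic; the function name is hypothetical.

```python
from fractions import Fraction


def harmonic(k):
    """Return the k-th harmonic number H_k = 1 + 1/2 + ... + 1/k as an exact fraction."""
    return sum((Fraction(1, i) for i in range(1, k + 1)), Fraction(0))


for k in (2, 3):
    print(k, harmonic(k), float(harmonic(k)))
```

So the optimal competitiveness is 3/2 for k = 2 and 11/6 for k = 3, the two cases for which the abstract reports solutions with one and two bookmarks.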

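Representing the discernibility information of all objects in an information system with a discernibility matrix has opened a new direction for research on attribute reduction. However, the traditional discernibility matrix uses core attributes to eliminate redundant entries only after construction is complete, overlooking the role that core attributes can play while the matrix is being built. To address this problem, this paper makes the following contributions: 1) The construction of the discernibility matrix is optimized: before computing the discernibility information of any two objects, their values on the core attributes are compared first; if they differ, the corresponding entry is recorded directly as Φ and the remaining condition attributes are not examined. 2) The concept of weighted attribute significance is proposed, which jointly considers the proportion of non-empty entries of the discernibility matrix in which each condition attribute appears (called macro significance) and the degree to which each attribute contributes to discerning objects (called micro significance); an example illustrates the soundness of this measure. 3) Because the optimized matrix still contains many redundant entries and empty sets, a discernibility information tree based on the optimized discernibility matrix and weighted attribute significance is proposed, building on the concept of the discernibility information tree. All non-empty entries of the optimized discernibility matrix are sorted by weighted attribute significance, so that attributes of higher significance are shared by more nodes; during construction, entries that do not contain a core attribute are mapped to a path in the tree, whereas entries that contain a core attribute are ignored. Finally, HSDI-tree, a reduction algorithm based on the optimized discernibility matrix and the improved discernibility information tree, is proposed. On five UCI datasets, the reduction results and node counts of HSDI-tree are compared with those of the CDI-tree, DI-tree and IDI-tree algorithms; the experimental results show that HSDI-tree effectively finds a minimal attribute reduct and achieves better space compression.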

We consider the k-Server problem under the advice model of computation when the underlying metric space is sparse. On one side, we introduce Θ(1)-competitive algorithms for a wide range of sparse graphs. These algorithms require advice of (almost) linear size. We show that for graphs of size N and treewidth α, there is an online algorithm that receives O(n(log α + log log N)) bits of advice and optimally serves any sequence of length n. We also prove that if a graph admits a system of μ collective tree (q, r)-spanners, then there is a (q + r)-competitive algorithm which requires O(n(log μ + log log N)) bits of advice. Among other results, this gives a 3-competitive algorithm for planar graphs, when provided with O(n log log N) bits of advice. On the other side, we prove that advice of size Ω(n) is required to obtain a 1-competitive algorithm for sequences of length n even for the 2-server problem on a path metric of size N ≥ 3. Through another lower bound argument, we show that at least \(\frac {n}{2}(\log \alpha - 1.22)\) bits of advice are required to obtain an optimal solution for metric spaces of treewidth α, where 4 ≤ α < 2k.

Representative skyline computation is a fundamental issue in the database area, which has attracted much attention in recent years. A notable definition of representative skyline is the distance-based representative skyline (DBRS). Given an integer k, a DBRS includes k representative skyline points that aim at minimizing the maximal distance between a non-representative skyline point and its nearest representative. In 2D space, the state-of-the-art algorithm to compute the DBRS is based on dynamic programming (DP), which takes O(k m^2) time, where m is the number of skyline points. Clearly, such a DP-based algorithm cannot be used for handling large-scale datasets due to the quadratic time cost. To overcome this problem, in this paper, we propose a new approximate algorithm called ARS, and a new exact algorithm named PSRS, based on a carefully designed parametric search technique. We show that the ARS algorithm can guarantee a solution that is at most ε larger than the optimal solution. The proposed ARS and PSRS algorithms run in O(k log^2 m log(T/ε)) and O(k^2 log^3 m) time respectively, where T is no more than the maximal distance between any two skyline points. We also propose an improved exact algorithm, called PSRS+, based on an effective lower and upper bounding technique. We conduct extensive experimental studies over both synthetic and real-world datasets, and the results demonstrate the efficiency and effectiveness of the proposed algorithms.
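
The DBRS objective can be illustrated with a simple decision procedure: for a fixed radius, sweep the 2D skyline and count how many representatives are needed. The sketch below pairs that sweep with a binary search over pairwise distances. It is an assumption-laden baseline for small inputs, not the ARS, PSRS or PSRS+ algorithms of the paper, and the function names are hypothetical. It assumes the input is already a 2D skyline sorted by x ascending (hence y descending), where distances grow monotonically along the skyline.

```python
import math
from itertools import combinations


def centers_needed(skyline, radius):
    """Greedy sweep: number of representatives needed so that every skyline point
    lies within `radius` of some representative."""
    count, i, m = 0, 0, len(skyline)
    while i < m:
        c = i  # push the representative as far right as point i allows
        while c + 1 < m and math.dist(skyline[i], skyline[c + 1]) <= radius:
            c += 1
        j = c  # then cover as many points to its right as possible
        while j + 1 < m and math.dist(skyline[c], skyline[j + 1]) <= radius:
            j += 1
        count += 1
        i = j + 1
    return count


def best_radius(skyline, k):
    """Smallest achievable maximal distance using at most k representatives (k >= 1)."""
    candidates = sorted(
        {0.0} | {math.dist(p, q) for p, q in combinations(skyline, 2)}
    )
    lo, hi = 0, len(candidates) - 1
    while lo < hi:  # feasibility is monotone in the radius
        mid = (lo + hi) // 2
        if centers_needed(skyline, candidates[mid]) <= k:
            hi = mid
        else:
            lo = mid + 1
    return candidates[lo]


points = [(0, 4), (1, 3), (2, 2), (3, 1), (4, 0)]  # hypothetical skyline
print(centers_needed(points, 1.5), best_radius(points, 2))
```

Enumerating all pairwise distances costs quadratic time and space, so this baseline does not avoid the quadratic cost the abstract identifies as the obstacle for large datasets.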

The recognition of primitives in digital geometry is deeply linked with separability problems. This framework leads us to consider the following problem of pattern recognition: given a finite lattice set \(S\subset \mathbb {Z}^d\) and a positive integer n, is it possible to separate S from \(\mathbb {Z}^d \setminus S\) by n half-spaces? In other words, does there exist a polyhedron P defined by at most n half-spaces satisfying \(P\cap \mathbb {Z}^d = S\)? The difficulty comes from the infinite number of constraints generated by all the points of \(\mathbb {Z}^d\setminus S\). It makes the decidability of the problem non-straightforward since the classical algorithms of polyhedral separability cannot be applied in this framework. We conjecture that the problem is nevertheless decidable and prove it under some assumptions: in arbitrary dimension, if the interior of the convex hull of S contains at least one lattice point or if the dimension d is 2 or if the dimension \(d=3\) and S is not in a specific configuration of lattice width 0 or 1. The proof strategy is to reduce the set of outliers \(\mathbb {Z}^d\setminus S\) to its minimal elements according to a partial order "is in the shadow of." These minimal elements are called the lattice jewels of S. We prove that under some assumptions, the set S admits only a finite number of lattice jewels. The result about the decidability of the problem is a corollary of this fundamental property.

A deterministic parallel LL parsing algorithm is presented. The algorithm is based on a transformation from a parsing problem to parallel reduction. First, a nondeterministic version of a parallel LL parser is introduced. Then, it is transformed into the deterministic version, the LLP parser. The deterministic LLP(q,k) parser uses two kinds of information to select the next operation: a lookahead string of length up to k symbols and a lookback string of length up to q symbols. Deterministic parsing is available for LLP grammars, a subclass of LL grammars. Since the presented deterministic and nondeterministic parallel parsers are both based on parallel reduction, they are suitable for most parallel architectures.

In the problem of the stabilizing solution of the algebraic Riccati equation, the resolvent \(\Theta(s) = (sI_{2n} - H)^{-1}\) of the 2n × 2n Hamiltonian matrix H of the algebraic Riccati equation allows us to reduce the problem to a linear matrix equation. In [1], the constructions necessary for this and the theorem of existence and representation of the stabilizing solutions to an algebraic Riccati equation were proposed. In this paper, methods of constructing the resolvent and the linear reduction matrix defined by it, which are necessary for the application of the theorem, are proposed, together with algorithms for constructing the stabilizing solution of the algebraic Riccati equation.

With the popularity of mobile computing and social media, various kinds of online event-based social network (EBSN) platforms, such as Meetup, Plancast and Whova, are gaining in prominence. A fundamental task of managing EBSN platforms is to recommend suitable social events to potential users according to the following three factors: spatial locations of events and users, attribute similarities between events and users, and friend relationships among users. However, none of the existing approaches considers all the aforementioned influential factors when they recommend users to proper events. Furthermore, the existing recommendation strategies neglect the bottleneck cases of the global recommendation. Thus, it is impossible for the existing recommendation solutions to be fair in real-world scenarios. In this paper, we first formally define the problem of bottleneck-aware social event arrangement (BSEA), which is proven to be NP-hard. To solve the BSEA problem approximately, we devise two greedy heuristic algorithms, Greedy and Random+Greedy, and a local-search-based optimization technique. In particular, the Greedy algorithm is more effective but less efficient than the Random+Greedy algorithm in most cases. Moreover, a variant of the BSEA problem, called the Extended BSEA problem, is studied, and the above solutions can be extended to address this variant easily. Finally, we conduct extensive experiments on real and synthetic datasets which verify the efficiency and effectiveness of our proposed algorithms.

In this paper, an efficient technique for the optimal design of a digital infinite impulse response (IIR) filter with minimum passband error (e_p), minimum stopband error (e_s), high stopband attenuation (A_s), and freedom from the limit cycle effect is proposed using the cuckoo search (CS) algorithm. In the proposed method, the error function, which is multimodal and non-differentiable in the heuristic surface, is constructed as the mean squared difference between the designed and desired responses in the frequency domain, and is optimized using the CS algorithm. The computational efficiency of the proposed technique for exploration in the search space is examined, and during exploration, stability of the filter is maintained by considering a lattice representation of the denominator polynomials, which requires less computational complexity and improves the exploration ability in the search space for designing filters with more taps. A comparative study of the proposed method with other algorithms is made, and the obtained results show that a 90% reduction in errors is achieved using the proposed method. However, computational complexity in terms of CPU time is increased as compared to other existing algorithms.

The emergence of the MapReduce (MR) framework for scaling data mining and machine learning algorithms provides for Volume, while handling of Variety and Velocity needs to be skilfully crafted in algorithms. So far, scalable clustering algorithms have focused solely on Volume, taking advantage of the MR framework. In this paper we present a MapReduce algorithm, data aware scalable clustering (DASC), which is capable of handling the 3 Vs of big data by virtue of being (i) single scan and distributed to handle Volume, (ii) incremental to cope with Velocity and (iii) versatile in handling numeric and categorical data to accommodate Variety. The DASC algorithm incrementally processes an infinitely growing data set stored on a distributed file system and delivers a quality clustering scheme while ensuring recency of patterns. The up-to-date synopsis is preserved by the algorithm for the data seen so far. Each new data increment is processed and merged with the synopsis. Since the synopsis itself may grow very large in size, the algorithm stores it as a file. This makes the DASC algorithm truly scalable. Exclusive clusters are obtained on demand by applying the connected component analysis (CCA) algorithm over the synopsis. CCA presents a subtle roadblock to effective parallelism during clustering. This problem is overcome by accomplishing the task in two stages. In the first stage, hyperclusters are identified based on prevailing data characteristics. The second stage utilizes this knowledge to determine the degree of parallelism, thereby making DASC data aware. Hyperclusters are distributed over the available compute nodes for discovering embedded clusters in parallel. The staged approach to clustering yields the dual advantage of improved parallelism and the desired complexity in the \(\mathcal {MRC}^0\) class. The DASC algorithm is empirically compared with the incremental Kmeans and Scalable Kmeans++ algorithms. Experimentation on real-world and synthetic data with approximately 1.2 billion data points demonstrates the effectiveness of the DASC algorithm. Empirical observations of DASC execution are in consonance with the theoretical analysis with respect to stability in resource utilization and execution time.

We study two natural extensions of Constraint Satisfaction Problems (CSPs). Balance-Max-CSP requires that in any feasible assignment each element in the domain is used an equal number of times. An instance of Hard-Max-CSP consists of soft constraints and hard constraints, and the goal is to maximize the weight of satisfied soft constraints while satisfying all the hard constraints. These two extensions contain many fundamental problems not captured by CSPs, and challenge traditional theories about CSPs in a more general framework. Max-2-SAT and Max-Horn-SAT are the only two nontrivial classes of Boolean CSPs that admit a robust satisfiability algorithm, i.e., an algorithm that finds an assignment satisfying at least a (1 − g(ε)) fraction of constraints given a (1 − ε)-satisfiable instance, where g(ε) → 0 as ε → 0, and g(0) = 0. We prove the inapproximability of these problems with balance or hard constraints, showing that each variant changes the nature of the problems significantly (in different ways). For instance, deciding whether an instance of 2-SAT admits a balanced assignment is NP-hard, and for Max-2-SAT with hard constraints, it is hard to find a constant-factor approximation even on (1 − ε)-satisfiable instances (in particular, the version with hard constraints does not admit a robust satisfiability algorithm). We also study hardness results for a certain CSP over a larger domain capturing ordering constraints: we show that hard constraints rule out constant-factor approximation algorithms. All our hardness results are almost optimal: they completely rule out algorithms with certain properties, or can be matched by simple extensions to existing algorithms.
