期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

<Emphasis Type="Italic">k</Emphasis>-most suitable locations selection

Yu-Chi Chung I-Fang Su Chiang Lee 《GeoInformatica》2018,22(4):661-692

Choosing the best location for starting a business or expanding an existing enterprize is an important issue. A number of location selection problems have been discussed in the literature. They often apply the Reverse Nearest Neighbor as the criterion for finding suitable locations. In this paper, we apply the Average Distance as the criterion and propose the so-called k-most suitable locations (k-MSL) selection problem. Given a positive integer k and three datasets: a set of customers, a set of existing facilities, and a set of potential locations. The k-MSL selection problem outputs k locations from the potential location set, such that the average distance between a customer and his nearest facility is minimized. In this paper, we formally define the k-MSL selection problem and show that it is NP-hard. We first propose a greedy algorithm which can quickly find an approximate result for users. Two exact algorithms are then proposed to find the optimal result. Several pruning rules are applied to increase computational efficiency. We evaluate the algorithms’ performance using both synthetic and real datasets. The results show that our algorithms are able to deal with the k-MSL selection problem efficiently. 相似文献

2.

Knowledge State Algorithms

Wolfgang Bein Lawrence L. Larmore John Noga Rüdiger Reischuk 《Algorithmica》2011,60(3):653-678

We introduce the novel concept of knowledge states. The knowledge state approach can be used to construct competitive randomized online algorithms and study the trade-off between competitiveness and memory. Many well-known algorithms can be viewed as knowledge state algorithms. A knowledge state consists of a distribution of states for the algorithm, together with a work function which approximates the conditional obligations of the adversary. When a knowledge state algorithm receives a request, it then calculates one or more “subsequent” knowledge states, together with a probability of transition to each. The algorithm uses randomization to select one of those subsequents to be the new knowledge state. We apply this method to randomized k-paging. The optimal minimum competitiveness of any randomized online algorithm for the k-paging problem is the kth harmonic number, \(H_{k}=\sum^{k}_{i=1}\frac{1}{i}\). Existing algorithms which achieve that optimal competitiveness must keep bookmarks, i.e., memory of the names of pages not in the cache. An H _k-competitive randomized algorithm for that problem which uses O(k) bookmarks is presented, settling an open question by Borodin and El-Yaniv. In the special cases where k=2 and k=3, solutions are given using only one and two bookmarks, respectively. 相似文献

3.

Efficient distance-based representative skyline computation in 2D space

Rui Mao Taotao Cai Rong-Hua Li Jeffery Xu Yu Jianxin Li 《World Wide Web》2017,20(4):621-638

Representative skyline computation is a fundamental issue in database area, which has attracted much attention in recent years. A notable definition of representative skyline is the distance-based representative skyline (DBRS). Given an integer k, a DBRS includes k representative skyline points that aims at minimizing the maximal distance between a non-representative skyline point and its nearest representative. In the 2D space, the state-of-the-art algorithm to compute the DBRS is based on dynamic programming (DP) which takes O(k m ²) time complexity, where m is the number of skyline points. Clearly, such a DP-based algorithm cannot be used for handling large scale datasets due to the quadratic time cost. To overcome this problem, in this paper, we propose a new approximate algorithm called ARS, and a new exact algorithm named PSRS, based on a carefully-designed parametric search technique. We show that the ARS algorithm can guarantee a solution that is at most ?? larger than the optimal solution. The proposed ARS and PSRS algorithms run in O(klog2mlog(T/??)) and O(k ² log3m) time respectively, where T is no more than the maximal distance between any two skyline points. We also propose an improved exact algorithm, called PSRS+, based on an effective lower and upper bounding technique. We conduct extensive experimental studies over both synthetic and real-world datasets, and the results demonstrate the efficiency and effectiveness of the proposed algorithms. 相似文献

4.

The Number of Tree Stars Is <Emphasis Type="Italic">O</Emphasis><Superscript>*</Superscript>(1.357<Superscript><Emphasis Type="Italic">k</Emphasis></Superscript>)

Bernhard Fuchs Walter Kern Xinhui Wang 《Algorithmica》2007,49(3):232-244

Every rectilinear Steiner tree problem admits an optimal tree T ^* which is composed of tree stars. Moreover, the currently fastest algorithms for the rectilinear Steiner tree problem proceed by composing an optimum tree T ^* from tree star components in the cheapest way. The efficiency of such algorithms depends heavily on the number of tree stars (candidate components). Fößmeier and Kaufmann (Algorithmica 26, 68–99, 2000) showed that any problem instance with k terminals has a number of tree stars in between 1.32^k and 1.38^k (modulo polynomial factors) in the worst case. We determine the exact bound O ^*(ρ ^k) where ρ≈1.357 and mention some consequences of this result. 相似文献

5.

On the Advice Complexity of the k-server Problem Under Sparse Metrics

Sushmita Gupta Shahin Kamali Alejandro López-Ortiz 《Theory of Computing Systems》2016,59(3):476-499

We consider the k-Server problem under the advice model of computation when the underlying metric space is sparse. On one side, we introduce Θ(1)-competitive algorithms for a wide range of sparse graphs. These algorithms require advice of (almost) linear size. We show that for graphs of size N and treewidth α, there is an online algorithm that receives O (n(log α + log log N))^* bits of advice and optimally serves any sequence of length n. We also prove that if a graph admits a system of μ collective tree (q, r)-spanners, then there is a (q + r)-competitive algorithm which requires O (n(log μ + log log N)) bits of advice. Among other results, this gives a 3-competitive algorithm for planar graphs, when provided with O (n log log N) bits of advice. On the other side, we prove that advice of size Ω(n) is required to obtain a 1-competitive algorithm for sequences of length n even for the 2-server problem on a path metric of size N ≥ 3. Through another lower bound argument, we show that at least \(\frac {n}{2}(\log \alpha - 1.22)\) bits of advice is required to obtain an optimal solution for metric spaces of treewidth α, where 4 ≤ α < 2k. 相似文献

6.

Multiple <Emphasis Type="Italic">k</Emphasis> nearest neighbor search

Yu-Chi Chung I-Fang Su Chiang Lee Pei-Chi Liu 《World Wide Web》2017,20(2):371-398

The problem of kNN (k Nearest Neighbor) queries has received considerable attention in the database and information retrieval communities. Given a dataset D and a kNN query q, the k nearest neighbor algorithm finds the closest k data points to q. The applications of kNN queries are board, not only in spatio-temporal databases but also in many areas. For example, they can be used in multimedia databases, data mining, scientific databases and video retrieval. The past studies of kNN query processing did not consider the case that the server may receive multiple kNN queries at one time. Their algorithms process queries independently. Thus, the server will be busy with continuously reaccessing the database to obtain the data that have already been acquired. This results in wasting I/O costs and degrading the performance of the whole system. In this paper, we focus on this problem and propose an algorithm named COrrelated kNN query Evaluation (COKE). The main idea of COKE is an “information sharing” strategy whereby the server reuses the query results of previously executed queries for efficiently processing subsequent queries. We conduct a comprehensive set of experiments to analyze the performance of COKE and compare it with the Best-First Search (BFS) algorithm. Empirical studies indicate that COKE outperforms BFS, and achieves lower I/O costs and less running time. 相似文献

7.

A family of Hamiltonian and Hamiltonian connected graphs with fault tolerance

Y-Chuang Chen Yong-Zen Huang Lih-Hsing Hsu Jimmy J. M. Tan 《The Journal of supercomputing》2010,54(2):229-238

Processor (vertex) faults and link (edge) faults may happen when a network is used, and it is meaningful to consider networks (graphs) with faulty processors and/or links. A k-regular Hamiltonian and Hamiltonian connected graph G is optimal fault-tolerant Hamiltonian and Hamiltonian connected if G remains Hamiltonian after removing at most k?2 vertices and/or edges and remains Hamiltonian connected after removing at most k?3 vertices and/or edges. In this paper, we investigate in constructing optimal fault-tolerant Hamiltonian and optimal fault-tolerant Hamiltonian connected graphs. Therefore, some of the generalized hypercubes, twisted-cubes, crossed-cubes, and Möbius cubes are optimal fault-tolerant Hamiltonian and optimal fault-tolerant Hamiltonian connected. 相似文献

8.

Faster Parameterized Algorithms for <Emphasis Type="SmallCaps">Minimum Fill-in</Emphasis>

Hans L. Bodlaender Pinar Heggernes Yngve Villanger 《Algorithmica》2011,61(4):817-838

We present two parameterized algorithms for the Minimum Fill-in problem, also known as Chordal Completion: given an arbitrary graph G and integer k, can we add at most k edges to G to obtain a chordal graph? Our first algorithm has running time \(\mathcal {O}(k^{2}nm+3.0793^{k})\), and requires polynomial space. This improves the base of the exponential part of the best known parameterized algorithm time for this problem so far. We are able to improve this running time even further, at the cost of more space. Our second algorithm has running time \(\mathcal {O}(k^{2}nm+2.35965^{k})\) and requires \(\mathcal {O}^{\ast}(1.7549^{k})\) space. To achieve these results, we present a new lemma describing the edges that can safely be added to achieve a chordal completion with the minimum number of edges, regardless of k. 相似文献

9.

Improved Approximation Algorithm for k-level Uncapacitated Facility Location Problem (with Penalties)

Jaroslaw Byrka Shanfei Li Bartosz Rybicki 《Theory of Computing Systems》2016,58(1):19-44

We study the k-level uncapacitated facility location problem (k-level UFL) in which clients need to be connected with paths crossing open facilities of k types (levels). In this paper we first propose an approximation algorithm that for any constant k, in polynomial time, delivers solutions of cost at most α _k times OPT, where α _k is an increasing function of k, with \(\lim _{k\to \infty } \alpha _{k} = 3\). Our algorithm rounds a fractional solution to an extended LP formulation of the problem. The rounding builds upon the technique of iteratively rounding fractional solutions on trees (Garg, Konjevod, and Ravi SODA’98) originally used for the group Steiner tree problem. We improve the approximation ratio for k-level UFL for all k ≥ 3, in particular we obtain the ratio equal 2.02, 2.14, and 2.24 for k = 3,4, and 5. 相似文献

10.

The direction-constrained <Emphasis Type="Italic">k</Emphasis> nearest neighbor query

Min-Joong Lee Dong-Wan Choi SangYeon Kim Ha-Myung Park Sunghee Choi Chin-Wan Chung 《GeoInformatica》2016,20(3):471-502

Finding k nearest neighbor objects in spatial databases is a fundamental problem in many geospatial systems and the direction is one of the key features of a spatial object. Moreover, the recent tremendous growth of sensor technologies in mobile devices produces an enormous amount of spatio-directional (i.e., spatially and directionally encoded) objects such as photos. Therefore, an efficient and proper utilization of the direction feature is a new challenge. Inspired by this issue and the traditional k nearest neighbor search problem, we devise a new type of query, called the direction-constrained k nearest neighbor (DCkNN) query. The DCkNN query finds k nearest neighbors from the location of the query such that the direction of each neighbor is in a certain range from the direction of the query. We develop a new index structure called MULTI, to efficiently answer the DCkNN query with two novel index access algorithms based on the cost analysis. Furthermore, our problem and solution can be generalized to deal with spatio-circulant dimensional (such as a direction and circulant periods of time such as an hour, a day, and a week) objects. Experimental results show that our proposed index structure and access algorithms outperform two adapted algorithms from existing kNN algorithms. 相似文献

11.

Optimizing top-<Emphasis Type="Italic">k</Emphasis> retrieval: submodularity analysis and search strategies

Chaofeng Sha Keqiang Wang Dell Zhang Xiaoling Wang Aoying Zhou 《Frontiers of Computer Science》2016,10(3):477-487

The key issue in top-k retrieval, finding a set of k documents (from a large document collection) that can best answer a user’s query, is to strike the optimal balance between relevance and diversity. In this paper, we study the top-k retrieval problem in the framework of facility location analysis and prove the submodularity of that objective function which provides a theoretical approximation guarantee of factor 1?\(\frac{1}{e}\) for the (best-first) greedy search algorithm. Furthermore, we propose a two-stage hybrid search strategy which first obtains a high-quality initial set of top-k documents via greedy search, and then refines that result set iteratively via local search. Experiments on two large TREC benchmark datasets show that our two-stage hybrid search strategy approach can supersede the existing ones effectively and efficiently. 相似文献

12.

An all-pair quantum SVM approach for big data multiclass classification

Arit Kumar Bishwas Ashish Mani Vasile Palade 《Quantum Information Processing》2018,17(10):282

In this paper, we discuss a quantum approach for the all-pair multiclass classification problem. In an all-pair approach, there is one binary classification problem for each pair of classes, and so there are k(k???1)/2 classifiers for a k-class classification problem. As compared to the classical multiclass support vector machine that can be implemented with polynomial run time complexity, our approach exhibits exponential speedup due to quantum computing. The quantum all-pair algorithm can also be used with other classification algorithms, and a speedup gain can be achieved as compared to their classical counterparts. 相似文献

13.

Linear Time Algorithms for Generalizations of the Longest Common Substring Problem

Michael Arnold Enno Ohlebusch 《Algorithmica》2011,60(4):806-818

In its simplest form, the longest common substring problem is to find a longest substring common to two or multiple strings. Using (generalized) suffix trees, this problem can be solved in linear time and space. A first generalization is the k -common substring problem: Given m strings of total length n, for all k with 2≤k≤m simultaneously find a longest substring common to at least k of the strings. It is known that the k-common substring problem can also be solved in O(n) time (Hui in Proc. 3rd Annual Symposium on Combinatorial Pattern Matching, volume 644 of Lecture Notes in Computer Science, pp. 230–243, Springer, Berlin, 1992). A further generalization is the k -common repeated substring problem: Given m strings T ⁽¹⁾,T ⁽²⁾,…,T ^(m) of total length n and m positive integers x ₁,…,x _m, for all k with 1≤k≤m simultaneously find a longest string ω for which there are at least k strings \(T^{(i_{1})},T^{(i_{2})},\ldots,T^{(i_{k})}\) (1≤i ₁<i ₂<???<i _k≤m) such that ω occurs at least \(x_{i_{j}}\) times in \(T^{(i_{j})}\) for each j with 1≤j≤k. (For x ₁=???=x _m=1, we have the k-common substring problem.) In this paper, we present the first O(n) time algorithm for the k-common repeated substring problem. Our solution is based on a new linear time algorithm for the k-common substring problem. 相似文献

14.

Locally adaptive <Emphasis Type="Italic">k</Emphasis> parameter selection for nearest neighbor classifier: one nearest cluster

Faruk Bulut Mehmet Fatih Amasyali 《Pattern Analysis & Applications》2017,20(2):415-425

The k nearest neighbors (k-NN) classification technique has a worldly wide fame due to its simplicity, effectiveness, and robustness. As a lazy learner, k-NN is a versatile algorithm and is used in many fields. In this classifier, the k parameter is generally chosen by the user, and the optimal k value is found by experiments. The chosen constant k value is used during the whole classification phase. The same k value used for each test sample can decrease the overall prediction performance. The optimal k value for each test sample should vary from others in order to have more accurate predictions. In this study, a dynamic k value selection method for each instance is proposed. This improved classification method employs a simple clustering procedure. In the experiments, more accurate results are found. The reasons of success have also been understood and presented. 相似文献

15.

On Fixed Cost k-Flow Problems

MohammadTaghi Hajiaghayi Rohit Khandekar Guy Kortsarz Zeev Nutov 《Theory of Computing Systems》2016,58(1):4-18

In the Fixed Cost k-Flow problem, we are given a graph G = (V, E) with edge-capacities {u _e∣e ∈ E} and edge-costs {c _e∣e ∈ E}, source-sink pair s, t ∈ V, and an integer k. The goal is to find a minimum cost subgraph H of G such that the minimum capacity of an st-cut in H is at least k. By an approximation-preserving reduction from Group Steiner Tree problem to Fixed Cost k-Flow, we obtain the first polylogarithmic lower bound for the problem; this also implies the first non-constant lower bounds for the Capacitated Steiner Network and Capacitated Multicommodity Flow problems. We then consider two special cases of Fixed Cost k-Flow. In the Bipartite Fixed-Cost k-Flow problem, we are given a bipartite graph G = (A ∪ B, E) and an integer k > 0. The goal is to find a node subset S ? A ∪ B of minimum size |S| such G has k pairwise edge-disjoint paths between S ∩ A and S ∩ B. We give an \(O(\sqrt {k\log k})\) approximation for this problem. We also show that we can compute a solution of optimum size with Ω(k/polylog(n)) paths, where n = |A| + |B|. In the Generalized-P2P problem we are given an undirected graph G = (V, E) with edge-costs and integer charges {b _v : v ∈ V}. The goal is to find a minimum-cost spanning subgraph H of G such that every connected component of H has non-negative charge. This problem originated in a practical project for shift design [11]. Besides that, it generalizes many problems such as Steiner Forest, k-Steiner Tree, and Point to Point Connection. We give a logarithmic approximation algorithm for this problem. Finally, we consider a related problem called Connected Rent or Buy Multicommodity Flow and give a log^3+?? n approximation scheme for it using Group Steiner Tree techniques. 相似文献

16.

The problem of finding the maximum multiple flow in the divisible network and its special cases

A.?V.?Smirnov Email author 《Automatic Control and Computer Sciences》2016,50(7):527-535

In the article the problem of finding the maximum multiple flow in the network of any natural multiplicity k is studied. There are arcs of three types: ordinary arcs, multiple arcs and multi-arcs. Each multiple and multi-arc is a union of k linked arcs, which are adjusted with each other. The network constructing rules are described. The definitions of a divisible network and some associated subjects are stated. The important property of the divisible network is that every divisible network can be partitioned into k parts, which are adjusted on the linked arcs of each multiple and multi-arc. Each part is the ordinary transportation network. The main results of the article are the following subclasses of the problem of finding the maximum multiple flow in the divisible network. 1. The divisible networks with the multi-arc constraints. Assume that only one vertex is the ending vertex for a multi-arc in s network parts. In this case the problem can be solved in a polynomial time. 2. The divisible networks with the weak multi-arc constraints. Assume that only one vertex is the ending vertex for a multi-arc in k-1 network parts (1 ≤ s < k ? 1) and other parts have at least two such vertices. In that case the multiplicity of the maximum multiple flow problem can be decreased to k - s. 3. The divisible network of the parallel structure. Assume that the divisible network component, which consists of all multiple arcs, can be partitioned into subcomponents, each of them containing exactly one vertex-beginning of a multi-arc. Suppose that intersection of each pair of subcomponents is the only vertex-network source x ₀. If k=2, the maximum flow problem can be solved in a polynomial time. If k ≥ 3, the problem is NP-complete. The algorithms for each polynomial subclass are suggested. Also, the multiplicity decreasing algorithm for the divisible network with weak multi-arc constraints is formulated. 相似文献

17.

Efficient preconditioning for noisy separable nonnegative matrix factorization problems by successive projection based low-rank approximations

Tomohiko Mizutani Mirai Tanaka 《Machine Learning》2018,107(4):643-673

The successive projection algorithm (SPA) can quickly solve a nonnegative matrix factorization problem under a separability assumption. Even if noise is added to the problem, SPA is robust as long as the perturbations caused by the noise are small. In particular, robustness against noise should be high when handling the problems arising from real applications. The preconditioner proposed by Gillis and Vavasis (SIAM J Optim 25(1):677–698, 2015) makes it possible to enhance the noise robustness of SPA. Meanwhile, an additional computational cost is required. The construction of the preconditioner contains a step to compute the top-k truncated singular value decomposition of an input matrix. It is known that the decomposition provides the best rank-k approximation to the input matrix; in other words, a matrix with the smallest approximation error among all matrices of rank less than k. This step is an obstacle to an efficient implementation of the preconditioned SPA. To address the cost issue, we propose a modification of the algorithm for constructing the preconditioner. Although the original algorithm uses the best rank-k approximation, instead of it, our modification uses an alternative. Ideally, this alternative should have high approximation accuracy and low computational cost. To ensure this, our modification employs a rank-k approximation produced by an SPA based algorithm. We analyze the accuracy of the approximation and evaluate the computational cost of the algorithm. We then present an empirical study revealing the actual performance of the SPA based rank-k approximation algorithm and the modified preconditioned SPA. 相似文献

18.

Diversified top-<Emphasis Type="Italic">k</Emphasis> clique search

Long Yuan Lu Qin Xuemin Lin Lijun Chang Wenjie Zhang 《The VLDB Journal The International Journal on Very Large Data Bases》2016,25(2):171-196

Maximal clique enumeration is a fundamental problem in graph theory and has been extensively studied. However, maximal clique enumeration is time-consuming in large graphs and always returns enormous cliques with large overlaps. Motivated by this, in this paper, we study the diversified top-k clique search problem which is to find top-k cliques that can cover most number of nodes in the graph. Diversified top-k clique search can be widely used in a lot of applications including community search, motif discovery, and anomaly detection in large graphs. A naive solution for diversified top-k clique search is to keep all maximal cliques in memory and then find k of them that cover most nodes in the graph by using the approximate greedy max k-cover algorithm. However, such a solution is impractical when the graph is large. In this paper, instead of keeping all maximal cliques in memory, we devise an algorithm to maintain k candidates in the process of maximal clique enumeration. Our algorithm has limited memory footprint and can achieve a guaranteed approximation ratio. We also introduce a novel light-weight \(\mathsf {PNP}\)-\(\mathsf {Index}\), based on which we design an optimal maximal clique maintenance algorithm. We further explore three optimization strategies to avoid enumerating all maximal cliques and thus largely reduce the computational cost. Besides, for the massive input graph, we develop an I/O efficient algorithm to tackle the problem when the input graph cannot fit in main memory. We conduct extensive performance studies on real graphs and synthetic graphs. One of the real graphs contains 1.02 billion edges. The results demonstrate the high efficiency and effectiveness of our approach. 相似文献

19.

Mining collective knowledge: inferring functional labels from online review for business

Feifan Fan Wayne Xin Zhao Ji-Rong Wen Ge Xu Edward Y. Chang 《Knowledge and Information Systems》2017,50(3):723-750

Uncertain graph has been widely used to represent graph data with inherent uncertainty in structures. Reliability search is a fundamental problem in uncertain graph analytics. This paper investigates on a new problem with broad real-world applications, the top-k reliability search problem on uncertain graphs, that is, finding the k vertices v with the highest reliabilities of connections from a source vertex s to v. Note that the existing algorithm for the threshold-based reliability search problem is inefficient for the top-k reliability search problem. We propose a new algorithm to efficiently solve the top-k reliability search problem. The algorithm adopts two important techniques, namely the BFS sharing technique and the offline sampling technique. The BFS sharing technique exploits overlaps among different sampled possible worlds of the input uncertain graph and performs a single BFS on all possible worlds simultaneously. The offline sampling technique samples possible worlds offline and stores them using a compact structure. The algorithm also takes advantages of bit vectors and bitwise operations to improve efficiency. In addition, we generalize the top-k reliability search problem from single-source case to the multi-source case and show that the multi-source case of the problem can be equivalently converted to the single-source case of the problem. Moreover, we define two types of the reverse top-k reliability search problems with different semantics on uncertain graphs. We propose appropriate solutions for both of them. Extensive experiments carried out on both real and synthetic datasets verify that the optimized algorithm outperforms the baselines by 1–2 orders of magnitude in execution time while achieving comparable accuracy. Meanwhile, the optimized algorithm exhibits linear scalability with respect to the size of the input uncertain graph. 相似文献

20.

On ruin probability minimization under excess reinsurance

Yu. G. Grigor’ev Dinh Le Son 《Automation and Remote Control》2007,68(6):1039-1054

The problem of ruin probability minimization in the Cramer-Lundberg risk model under excess reinsurance is studied. Together with traditional maximization of the Lundberg characteristic coefficient R is considered the problem of direct calculation of insurer’s ruin probability ? _r (x) as an initial-capital function x under the prescribed level of net-retention r. To solve this problem, we propose the excess variant of the Cramer integral equation which is an equivalent to the Hamilton-Jacobi-Bellman equation. The continuation method is used for solving this equation; by means of it is found the analytical solution to the Markov risk model. We demonstrated on a series of standard examples that with any admissible value of x the ruin probability ? _x (r): = ? _r (x) is usually a unimodal function r. A comparison of the analytic representation of ruin probability ? _r(x) with its asymptotic approximation with x → ∞ was conducted. 相似文献