Similar Literature
 Found 20 similar documents; search took 15 ms
1.
2.
The k nearest neighbors (kNN) problem is to find the k nearest neighbors of a query point in a given data set. In this paper, a novel fast kNN search method using an orthogonal search tree is proposed. The proposed method creates an orthogonal search tree for a data set using an orthonormal basis evaluated from the data set. To find the kNN of a query point, the projection values of the query point onto the orthogonal vectors of the orthonormal basis are combined with a node elimination inequality to prune unlikely nodes. For a node that cannot be eliminated, a point elimination inequality is further used to reject impossible data points. Experimental results show that the proposed method performs well in finding kNN for query points and always requires less computation time than available kNN search algorithms, especially for a data set with a large number of data points or a large standard deviation.
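
The abstract does not give its elimination inequalities, so the following is only a minimal sketch of the general idea, assuming a single projection axis instead of a full orthogonal search tree. It relies on the standard fact that for a unit vector u, |u·q − u·p| ≤ ‖q − p‖, so a point whose projection gap already exceeds the current k-th best distance cannot be a nearest neighbor. The function name `knn_with_projection_pruning` and its parameters are hypothetical.

```python
import heapq
import math


def knn_with_projection_pruning(points, query, k, axis):
    """Return (k nearest points, number of points skipped by the bound).

    Illustrative only: one projection axis, no tree. Points are tuples.
    """
    # Normalise the axis so that |u . (q - p)| <= ||q - p|| holds.
    norm = math.sqrt(sum(a * a for a in axis))
    unit = [a / norm for a in axis]

    def proj(p):
        return sum(u * x for u, x in zip(unit, p))

    q_proj = proj(query)
    # Visit points with a small projection gap first so the bound tightens early.
    order = sorted(points, key=lambda p: abs(proj(p) - q_proj))

    heap = []  # max-heap through negated distances: (-distance, point)
    skipped = 0
    for p in order:
        gap = abs(proj(p) - q_proj)
        if len(heap) == k and gap > -heap[0][0]:
            skipped += 1  # lower bound exceeds the k-th best distance
            continue
        d = math.dist(p, query)
        if len(heap) < k:
            heapq.heappush(heap, (-d, p))
        elif d < -heap[0][0]:
            heapq.heapreplace(heap, (-d, p))

    nearest = [p for _, p in sorted((-nd, p) for nd, p in heap)]
    return nearest, skipped
```

Skipped points never incur a full distance computation, which is where a projection-based method saves time; a real implementation would use several basis vectors and prune whole tree nodes as well.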

3.
Rich metadata in high-performance computing (HPC) systems contains extended information about users, jobs, data files, and their relationships. Property graphs are a promising data model to represent heterogeneous rich metadata flexibly. Specifically, a property graph can use vertices to represent different entities and edges to record the relationships between vertices with unique annotations. The high-volume HPC use case, with millions of entities and relationships, naturally requires an out-of-core distributed property graph database, which must support live updates (to ingest production information in real time), low-latency point queries (for frequent metadata operations such as permission checking), and large-scale traversals (for provenance data mining). Among these needs, large-scale property graph traversals are particularly challenging for distributed graph storage systems. Most existing graph systems implement a “level-synchronous” breadth-first search algorithm that relies on global synchronization in each traversal step. This performs well in many problem domains; but a rich metadata management system is characterized by imbalanced graphs, long traversal lengths, and concurrent workloads, each of which has the potential to introduce or exacerbate stragglers (i.e., abnormally slow steps or servers in a graph traversal) that lead to low overall throughput for synchronous traversal algorithms. Previous research indicated that the straggler problem can be mitigated by using asynchronous traversal algorithms, and many graph-processing frameworks have successfully demonstrated this approach. Such systems require the graph to be loaded into a separate batch-processing framework instead of being iteratively accessed, however. In this work, we investigate a general asynchronous graph traversal engine that can operate atop a rich metadata graph in its native format. 
We outline a traversal-aware query language and key optimizations (traversal-affiliate caching and execution merging) necessary for efficient performance. We further explore the effect of different graph partitioning strategies on the traversal performance for both synchronous and asynchronous traversal engines. Our experiments show that the asynchronous graph traversal engine is more efficient than its synchronous counterpart in the case of HPC rich metadata processing, where more servers are involved and larger traversals are needed. Moreover, the asynchronous traversal engine is more adaptive to different graph partitioning strategies.

4.
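Untrusted service providers in location-based services (LBSs) continuously collect users' personal data, which threatens user privacy. Location privacy protection in LBSs has therefore attracted wide attention in both academia and industry. Most existing location privacy protection methods for road networks are based on depth-first or breadth-first graph traversal and must repeatedly scan the global topology of the road network, so anonymization is inefficient. To address this problem, a network Voronoi diagram (NVD) is used to partition the road network in advance into independent network Voronoi cells, turning the repeated traversals of the global road network required by traditional methods into accesses to the local road-network information within a network Voronoi cell. According to the number of mobile users and the number of road segments that a cell covers, network Voronoi cells are divided into three classes (unsafe cells, safe-medium cells, and safe-large cells), and efficient location anonymization algorithms adapted to the characteristics of each class are proposed. Finally, extensive experiments on real data sets verify that the proposed algorithms guarantee a 100% anonymization success rate and an efficient anonymization time of 0.34 ms while sacrificing only 0.01% more query cost than traditional algorithms, achieving a good balance between privacy protection strength and algorithm performance.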

5.
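In recent years, more and more machine learning algorithms have been applied to intrusion detection. In network intrusion detection systems (NIDS), however, as network scale and speed increase, general machine learning algorithms struggle to meet the real-time requirements of intrusion detection, which is one of the main bottlenecks hindering the further practical use of machine learning in this field. To improve the usability and real-time performance of network intrusion detection systems, a NIDS based on the self-organizing feature map (SOFM) is proposed, and on this basis a fast nearest neighbor search algorithm, VENNS, aimed at improving intrusion detection efficiency is implemented to reduce the time overhead of system training and detection. Using the DARPA 1999 intrusion detection evaluation data, a comprehensive performance evaluation and comparative analysis of the system is carried out. Experiments show that the system achieves a high detection rate while maintaining a low false alarm rate, and that its efficiency improves greatly: the training time overhead is about 1/4 of that before the improvement, and the detection time overhead is about 1/7.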

6.
Genetic algorithms (GAs) have been applied to solve the 2-page crossing number problem successfully, but since they work with one global population, the search time and space are limited. Parallelisation provides an attractive prospect to improve the efficiency and solution quality of GAs. This paper investigates the complexity of parallel genetic algorithms (PGAs) based on two evaluation measures: computation time to communication time and population size to chromosome size. Moreover, the paper unifies the framework of PGA models with the function PGA (subpopulation size, cluster size, migration period, topology), and explores the performance of PGAs for the 2-page crossing number problem.

7.
Nowadays, mixed-model assembly lines are used increasingly as a result of the diversification of customer demand. An important problem in this field is determining the sequence in which products enter the line. Before determining the best sequence of products, a new procedure is introduced to choose important orders for entering the shop floor. The orders are sorted using an analytical hierarchy process (AHP) approach based on three criteria: the critical ratio of each order (CRo), the significance degree of the customer, and innovation in a product; the last criterion is presented for the first time. In this research, six objective functions are considered. Minimizing total utility work cost, total setup cost, and total production rate variation cost are objectives that were presented previously; another objective is minimizing total idle cost; and two new objectives, minimizing total operator error cost and total tardiness cost, are presented for the first time. The total tardiness cost objective tries to choose a sequence of products that minimizes the tardiness cost for customers with high priority. First, GAMS software is used to check the feasibility of the model. GAMS could not search the entire solution space, so the problem is solved in two stages, and because the problem is NP-hard, particle swarm optimization (PSO) and simulated annealing (SA) algorithms are used. 
For small-sized problems, to compare the exact method with the proposed algorithms, the problem must be solved with the meta-heuristic algorithms in two stages, as with GAMS, whereas for large-sized problems, the problem can be solved with the proposed algorithms in two ways (one stage and two stages). The computational results and pairwise comparisons (based on the sign test) show that GAMS is suitable for solving small-sized problems, whereas for large-sized problems the objective function is better when solved in one stage than in two stages; it is therefore proposed to solve large-sized problems in one stage. The PSO algorithm is also better than the SA algorithm in terms of objective function and pairwise comparisons.

8.
This study introduces a new fast motion estimation (ME) method based on both adaptive search range adjustment and matching point decimation. In particular, the authors present a maximum matching error constraint in the matching phase that can eliminate an impossible candidate block much earlier than a conventional partial distortion elimination (PDE) scheme. The constraint is computed during the matching error computation based on the sum of absolute differences (SAD) between two blocks. The basic idea of the proposed scheme is to adjust a given search range adaptively and to eliminate invalid matching blocks early. The adaptive search range adjustment is first performed by analysing the contents of a scene. Next, a maximum partial matching error in reordered sub-blocks of an optimal block is obtained, and it is set as a trigger to eliminate invalid blocks for ME. The main contributions of the proposed scheme are that (i) it can reduce a search range adaptively based on the analysis of scene contents; (ii) it can make an early decision for an impossible candidate before complete SAD computation; (iii) the proposed constraint can reduce the computational cost considerably for SAD calculation; and (iv) the proposed matching ideas can be applied to conventional PDE algorithms without significant changes. In order to evaluate the proposed scheme, several baseline approaches are described and compared. The experimental results show that the proposed algorithm can reduce the computational cost of ME by more than 86% at the cost of a 0.02%dB quality degradation against the conventional PDE algorithm.
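
For readers unfamiliar with the baseline that this abstract improves on, the sketch below shows conventional partial distortion elimination, not the maximum matching error constraint proposed in the paper. The SAD is accumulated row by row and abandoned as soon as the partial sum reaches the best SAD found so far. The function names and the list-of-rows block representation are assumptions made for illustration.

```python
def sad_with_pde(block, candidate, best_so_far):
    """Sum of absolute differences with row-wise early termination.

    Returns None as soon as the partial sum reaches best_so_far,
    because the candidate can no longer beat the current best match.
    """
    partial = 0
    for row_a, row_b in zip(block, candidate):
        partial += sum(abs(a - b) for a, b in zip(row_a, row_b))
        if partial >= best_so_far:
            return None
    return partial


def best_match(block, candidates):
    """Return (index, SAD) of the candidate block with the smallest SAD."""
    best_index, best_sad = None, float("inf")
    for i, cand in enumerate(candidates):
        sad = sad_with_pde(block, cand, best_so_far=best_sad)
        if sad is not None:
            best_index, best_sad = i, sad
    return best_index, best_sad
```

The earlier a good match is found, the more rows are skipped for later candidates, which is why search order and sub-block reordering matter in schemes of this kind.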

9.
This paper focuses on minimizing the total completion time in two-machine group scheduling problems with sequence-dependent setups that are typically found in discrete parts manufacturing. As the problem is characterized as strongly NP-hard, three search algorithms based on tabu search are developed for solving industry-size scheduling problems. Four different lower bounding mechanisms are developed to identify a lower bound for all problems attempted, and the largest of the four is aptly used in the evaluation of the percentage deviation of the search algorithms to assess their efficacy. The problem sizes are classified as small, medium and large, and to accommodate the variability that might exist in the sequence-dependent setup times on both machines, three different scenarios are considered. Such finer levels of classification have resulted in the generation of nine different categories of problem instances, thus facilitating the performance of a very detailed statistical experimental design to assess the efficacy and efficiency of the three search algorithms. The search algorithm based on long-term memory with maximal frequencies either recorded a statistically better makespan or one that is indifferent when compared with the other two with all three scenarios and problem sizes. Hence, it is recommended for solving the research problem. Under the three scenarios, the average percentage deviation for all sizes of problem instances solved has been remarkably low. In particular, a mathematical programming based lower bounding mechanism, which focuses on converting (reducing) the original sequence-dependent group scheduling problem with several jobs in each group to a sequence-dependent job scheduling problem, has served well in identifying a high quality lower bound for the original problem, making it possible to evaluate a lower average percentage deviation for the search algorithm. 
Also, a 16–17-fold reduction in average computation time for solving a large problem instance with the recommended search algorithm compared with identifying just the lower bound of (not solving) the same instance by the mathematical programming based mechanism speaks strongly in favor of the search algorithm for solving industry-size group scheduling problems.

10.
Three commonly used traversal methods for binary trees (forests) are pre-order, in-order and post-order. It is well known that sequential algorithms for these traversals take O(N) time, where N is the total number of nodes. This paper establishes a one-to-one correspondence between the set of nodes that possess a right sibling and the set of leaf nodes for any forest. For the case of pre-order traversal, this result is shown to provide an alternate characterization that leads to a simple and elegant parallel algorithm of time complexity O(log N) with or without read-conflicts on an N processor SIMD shared memory model, where N is the total number of nodes in a forest.

11.
Complete coverage path planning (CCPP), specifically, the efficiency and completeness of coverage of robots, is one of the major problems in autonomous mobile robotics. This study proposes a path planning technique to solve global time optimization. Conventional algorithms related to template-based coverage can minimize the time required to cover particular cells. The minimal turning path is mostly based on the shape and size of the cell. Conventional algorithms can determine the optimum time path inside a cell; however, these algorithms cannot ensure that the total time determined for the coverage path is the global optimum. This study presents an algorithm that can convert a CCPP problem into a flow network by exact cell decomposition. The total time cost to reach the edge of a flow network is the sum of the time to cover the current cell and the time to shift in adjacent cells. The time cost determines a minimum-cost path from the start node to the final node through the flow network, which is capable of visiting each node exactly once through the network search algorithm. Search results show that the time-efficient coverage can obtain the global optimum. Simulation and experimental results demonstrate that the proposed algorithm operates in a time-efficient manner.

12.
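Objective: Existing algorithms that combine region merging with graph cuts do not take the fuzzy characteristics of ore-rock images into account, which leads to low segmentation accuracy and running efficiency and to fuzzy edges that cannot be segmented effectively. To address this, the maximum fuzzy 2-partition entropy, computed by fast recursion, is used to set the likelihood energy of a graph cut model whose vertices are regions. Method: First, a bilateral filter and the watershed algorithm are used to preprocess the ore-rock image and divide it into a number of regions with good consistency. Then, the fuzzy membership functions of the object and the background, obtained when computing the maximum fuzzy 2-partition entropy of the image, are used to design the likelihood energy of the graph cut energy function, so that the energy function is closer to the true situation of a fuzzy image. To improve the efficiency of searching for the maximum fuzzy 2-partition entropy, a recursive algorithm with time complexity O(n²) is proposed that turns the computation of the fuzzy entropy into a recursive process and retains the non-repeated recursion results for the subsequent exhaustive search. Finally, the designed graph cut algorithm is used to label the regions and complete the segmentation. Results: The segmentation accuracy of the proposed algorithm is about 23% higher than that of other combined region merging and graph cut algorithms; the error rate of the ore-rock particle count after segmentation, relative to manual counting, is about 2%; and the running time is about 60% shorter than that of other algorithms. Conclusion: The proposed algorithm effectively improves the efficiency of ore-rock image segmentation while ensuring accuracy, and provides an important guide for the engineering practice of efficient automated ore-rock image segmentation.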

13.
Parallel Algorithms for Discovery of Association Rules   Cited by: 2 (self-citations: 0, citations by others: 2)
Discovery of association rules is an important data mining task. Several parallel and sequential algorithms have been proposed in the literature to solve this problem. Almost all of these algorithms make repeated passes over the database to determine the set of frequent itemsets (a subset of database items), thus incurring high I/O overhead. In the parallel case, most algorithms perform a sum-reduction at the end of each pass to construct the global counts, also incurring high synchronization cost. In this paper we describe new parallel association mining algorithms. The algorithms use novel itemset clustering techniques to approximate the set of potentially maximal frequent itemsets. Once this set has been identified, the algorithms make use of efficient traversal techniques to generate the frequent itemsets contained in each cluster. We propose two clustering schemes based on equivalence classes and maximal hypergraph cliques, and study two lattice traversal techniques based on bottom-up and hybrid search. We use a vertical database layout to cluster related transactions together. The database is also selectively replicated so that the portion of the database needed for the computation of associations is local to each processor. After the initial set-up phase, the algorithms do not need any further communication or synchronization. The algorithms minimize I/O overheads by scanning the local database portion only twice. Once in the set-up phase, and once when processing the itemset clusters. Unlike previous parallel approaches, the algorithms use simple intersection operations to compute frequent itemsets and do not have to maintain or search complex hash structures. Our experimental testbed is a 32-processor DEC Alpha cluster inter-connected by the Memory Channel network. We present results on the performance of our algorithms on various databases, and compare it against a well known parallel algorithm. 
The best new algorithm outperforms it by an order of magnitude.
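
The "vertical database layout" and "simple intersection operations" mentioned in this abstract can be illustrated with a small sequential sketch. It is not the parallel algorithm itself: there is no clustering, replication, or lattice traversal, and the function names are hypothetical. Each item maps to the set of transaction IDs that contain it, and the support of an itemset is the size of the intersection of those sets.

```python
from itertools import combinations


def vertical_layout(transactions):
    """Map each item to the set of transaction IDs (tids) containing it."""
    tidlists = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidlists.setdefault(item, set()).add(tid)
    return tidlists


def frequent_pairs(transactions, min_support):
    """Return {(a, b): support} for every frequent 2-itemset.

    Support is counted by intersecting tid sets, with no hash tree
    and no repeated pass over the transactions.
    """
    tidlists = vertical_layout(transactions)
    frequent_items = sorted(
        item for item, tids in tidlists.items() if len(tids) >= min_support
    )
    result = {}
    for a, b in combinations(frequent_items, 2):
        common = tidlists[a] & tidlists[b]
        if len(common) >= min_support:
            result[(a, b)] = len(common)
    return result
```

Larger itemsets follow the same pattern: intersect the tid set of a frequent itemset with that of one more item and compare the size with the support threshold.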

14.
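Block-matching motion estimation is a key technique widely adopted in international video coding standards. Many fast block-matching methods reduce computation by limiting the number of search points, but compared with the full search algorithm they are highly prone to matching errors. This paper introduces a multilevel successive elimination algorithm (MSEA) that applies a new decision condition, and on this basis proposes a new algorithm that introduces a scaled partial distortion elimination (PDE) technique for scaling the accumulated partial error and the current minimum error. Experiments show that, compared with exhaustive search algorithms of the same lineage such as the full search algorithm (FS), the successive elimination algorithm (SEA), and MSEA, the proposed algorithm greatly improves search efficiency. Compared with MSEA, it saves about 75% of the operations per macroblock on average. The algorithm greatly increases video coding speed while preserving image quality.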
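
As background for the SEA family named in this abstract, here is a minimal single-level sketch; it is not the multilevel or scaled-PDE variant the paper proposes. It uses the standard bound |sum(A) − sum(B)| ≤ SAD(A, B): a candidate whose block-sum difference already reaches the current minimum SAD is rejected without computing its SAD. The function names are hypothetical, and a practical encoder would precompute the candidate sums instead of recomputing them.

```python
def block_sum(block):
    """Sum of all pixel values in a block given as a list of rows."""
    return sum(sum(row) for row in block)


def sad(a, b):
    """Full sum of absolute differences between two equally sized blocks."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def sea_search(block, candidates):
    """Exhaustive search with successive elimination.

    Returns (best index, best SAD, number of full SAD computations).
    """
    target = block_sum(block)
    best_index, best_sad, full_sads = None, float("inf"), 0
    for i, cand in enumerate(candidates):
        # Lower bound on the SAD; if it cannot beat the best, skip.
        if abs(target - block_sum(cand)) >= best_sad:
            continue
        full_sads += 1
        s = sad(block, cand)
        if s < best_sad:
            best_index, best_sad = i, s
    return best_index, best_sad, full_sads
```

Because the bound never exceeds the true SAD, the search returns the same minimum as a full search while evaluating fewer candidates in full.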

15.
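何丽, 赵富强, 饶俊. 《计算机应用》 (Journal of Computer Applications), 2013, 33(1): 250-253
To improve the time efficiency of Web service composition, a Web service composition method based on service communities and service chains is proposed. A service community discovery method based on information centrality is applied to the constructed service network to divide the Web service network into different service communities. A community service chain discovery algorithm and a service-chain-based Web service composition algorithm are then constructed; these algorithms convert all composable associations between Web services within a service community into service chains, realizing a Web service composition process based on community service chains and quality of service (QoS) pruning. Experimental results show that, compared with the traditional Web service composition method based on depth-first graph traversal, the method based on community service chains improves response time by 46% on average across five test sets, and by 67% in the best case. Community service chains can effectively reduce the service search space for the current service request and improve the time efficiency of service composition.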

16.
Artificial Intelligence (AI) techniques are utilized widely in the field of Expert Systems (ES), as applied to robotics, video games, self-driving vehicles and so on. Pathfinding algorithms are a class of heuristic algorithms based on AI techniques which are used in ES as decision-making functions for solving problems that would otherwise require human competence or expertise. ES fields that use pathfinding algorithms and operate in real time face many challenges: for example, time constraints, optimality, and the memory overhead of storing the paths that are found. For these algorithms to work, appropriate problem-specific maps must be constructed. In relation to this, the uniform-cost grid set-up is the most appropriate for ES applications. In this method, each node in a graph is represented as a tile, and the weight “between” tiles is set to a constant value, usually 1. In the state-of-the-art heuristic algorithms used with this data structure, multiplying the heuristic function by a weight greater than one is a well-known technique. In this paper, we present three new techniques using various weights to accelerate heuristic search of grid maps. The first technique is based on the iteration of a heuristic search algorithm associated with a weight-set w. The second technique is based on the length between the start node and the goal node, which is then associated with w. The last technique is based on the travel cost and is associated with a weight-set α. These techniques are applicable to a wide class of heuristic search algorithms. Therefore, we implement them here within the A*, Bidirectional A* (Bi-A*) and Jump Point Search (JPS) algorithms, thus obtaining a family of new algorithms. Furthermore, the use of these new algorithms results in significant improvements over current search algorithms. 
We evaluate them in path-planning benchmarks and show the amended JPS technique's greater stability, across weight values, over the other two techniques. However, it is also shown that this technique yields poor results in terms of solution cost.
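
The "well-known technique" of multiplying the heuristic by a constant weight can be sketched as plain weighted A* on a uniform-cost grid. This is only the baseline, not any of the three weighting schemes proposed in the paper. The grid encoding (`#` for a wall), the 4-connected moves, and the function name `weighted_astar` are assumptions for illustration.

```python
import heapq


def weighted_astar(grid, start, goal, w=1.0):
    """Weighted A* on a 4-connected grid with unit step cost.

    grid is a list of equal-length strings; '#' marks a blocked tile.
    Returns (path cost or None, number of expanded nodes).
    Nodes are ordered by f = g + w * h, with h the Manhattan distance.
    """
    def h(cell):
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(w * h(start), 0, start)]
    best_g = {start: 0}
    expanded = 0
    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if g > best_g.get(cell, float("inf")):
            continue  # stale heap entry, a cheaper route was found later
        if cell == goal:
            return g, expanded
        expanded += 1
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < len(grid) and 0 <= nc < len(grid[0])
            if inside and grid[nr][nc] != "#":
                ng = g + 1
                if ng < best_g.get((nr, nc), float("inf")):
                    best_g[(nr, nc)] = ng
                    heapq.heappush(open_heap, (ng + w * h((nr, nc)), ng, (nr, nc)))
    return None, expanded
```

With w = 1 this is ordinary A* and returns an optimal cost; with w > 1 the search tends to expand fewer nodes but may return a longer path, which matches the trade-off between speed and solution cost that the abstract reports.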

17.
Nearest neighbour search is a widely used technique in pattern recognition. During the last three decades a large number of fast algorithms have been proposed. In this work we are interested in algorithms that can be used with any dissimilarity function provided that it fits the mathematical notion of distance. Some of these algorithms organize the data, at preprocessing time, in a tree structure that is traversed at search time to find the nearest neighbour. The speedup is obtained using pruning rules that avoid the traversal of some parts of the tree. In this work two new decomposition methods to build the tree and three new pruning rules are explored. The behaviour of our proposal is studied through experiments with synthetic and real data.

18.
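Two evaluation algorithms for interpolatory subdivision surfaces on regular quadrilateral meshes are proposed. The algorithms are based on the m-ary decomposition of the parameters and the construction of a matrix sequence: the basis function values are obtained from the product of the matrices corresponding to the parameter decomposition sequence, giving the weights of the corresponding control points on the initial mesh and thus evaluating the interpolatory subdivision surface. Algorithm 1 is based on 2D subdivision masks, and Algorithm 2 is based on the tensor product. Numerical experiments show that the algorithms are efficient and have low storage requirements.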

19.
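§1. Introduction. Many large-scale scientific and engineering computing problems reduce to solving large sparse systems of linear equations. Therefore, with today's rapid development of high-performance parallel computers, it is particularly important to study efficient parallel algorithms for large sparse linear systems oriented toward parallel computing environments. For the large sparse linear system Ax = b, (1)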

20.
Track-before-detect (TBD) algorithms are used in tracking systems where the object's signal is below the noise floor (low-SNR objects). They require many computations and memory transfers for real-time signal processing. GPGPUs, as parallel processing devices, are well suited to TBD algorithms. Finding optimal or suboptimal code is not possible at a low level because documentation for low-level programming of GPGPUs is lacking. High-level code optimization is therefore necessary, and an evolutionary approach based on a single parent and a single child, that is, a local search approach, is considered. A brute-force search is not feasible, because there are N! code variants, where N is the number of motion vector components. The proposed evolutionary operator, LREI (local random extraction and insertion), reorders source code to reduce computation time through better organization of memory transfers and of the texture cache content. A starting point based on sorting and the minimal execution time metric is proposed. The unbiased random and biased sorting techniques are compared experimentally. Tests show a significant improvement in computation speed, about 8% over the conventional CUDA code. The optimization takes about 1 h (1,000 iterations) for the sample code of the considered recursive spatio-temporal TBD algorithm.
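
The abstract does not specify the LREI operator in detail, so the following is only a guess at its general shape: a single-parent, single-child local search in which one element of an ordering is extracted at random and reinserted at a random position, and the child replaces the parent when it is no worse. The `cost` callable is a stand-in assumption; in the paper the metric is measured GPU execution time, which is not reproduced here.

```python
import random


def extract_and_insert(order, rng):
    """Return a copy of order with one random element moved elsewhere."""
    child = list(order)
    item = child.pop(rng.randrange(len(child)))
    child.insert(rng.randrange(len(child) + 1), item)
    return child


def local_search(order, cost, iterations, seed=0):
    """Single-parent, single-child search over orderings.

    cost is any callable that scores an ordering (lower is better).
    Returns (best ordering found, its cost).
    """
    rng = random.Random(seed)
    parent, parent_cost = list(order), cost(order)
    for _ in range(iterations):
        child = extract_and_insert(parent, rng)
        child_cost = cost(child)
        if child_cost <= parent_cost:  # accept ties to keep moving
            parent, parent_cost = child, child_cost
    return parent, parent_cost


def adjacency_cost(order):
    """Toy stand-in metric: large jumps between neighbours are penalised."""
    return sum(abs(a - b) for a, b in zip(order, order[1:]))
```

A call such as `local_search([5, 1, 4, 2, 3, 0], adjacency_cost, 1000)` returns a reordering that is never worse than the starting point under the toy metric; with N! possible orderings, this kind of search explores only a small, locally improving part of the space.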
