Similar literature
 20 similar documents found
1.
    
The WAND processing strategy is a dynamic pruning algorithm designed for large-scale Web search engines, where fast response to queries is critical. WAND reduces the amount of computation by scoring only documents that may become part of the top‐k document results. In this paper, we present two parallel strategies for the WAND algorithm and compare their performance on GPUs. In our first strategy (named size‐based), the posting lists are evenly partitioned among thread blocks. Our second strategy (named range‐based) partitions the posting lists according to document identifier intervals; thus, partitions may have different sizes. We also propose three threshold sharing policies, named Local, Safe‐R, and Safe‐WR, which emulate the global pruning technique of the WAND algorithm. We evaluated our proposals with different amounts of work, from short to extra‐large queries, using single-query processing and batches of queries. Results show that the size‐based strategy reports the highest speedups but at the cost of low-quality results. The range‐based algorithm retrieves the exact top‐k documents and maintains a good speedup. Moreover, both strategies scale as the amount of work increases. In addition, there is no significant difference in the performance of the three threshold sharing policies.
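The difference between the two partitioning strategies in this abstract can be illustrated with a small CPU-side sketch. It is not the authors' GPU implementation: the function names, the representation of a posting list as a sorted list of document IDs, and the equal-width docID intervals are all assumptions made for illustration.

```python
from bisect import bisect_left

def size_based_partition(postings, num_blocks):
    """Split one sorted posting list into num_blocks chunks of near-equal length."""
    base, extra = divmod(len(postings), num_blocks)
    chunks, start = [], 0
    for b in range(num_blocks):
        end = start + base + (1 if b < extra else 0)
        chunks.append(postings[start:end])
        start = end
    return chunks

def range_based_partition(postings, num_blocks, max_doc_id):
    """Split one sorted posting list by equal-width docID intervals (assumed)."""
    width = -(-(max_doc_id + 1) // num_blocks)  # ceiling division
    chunks = []
    for b in range(num_blocks):
        lo = bisect_left(postings, b * width)        # first docID >= interval start
        hi = bisect_left(postings, (b + 1) * width)  # first docID >= interval end
        chunks.append(postings[lo:hi])
    return chunks

postings = [1, 2, 3, 4, 50, 90]
print(size_based_partition(postings, 3))       # [[1, 2], [3, 4], [50, 90]]
print(range_based_partition(postings, 3, 99))  # [[1, 2, 3, 4], [50], [90]]
```

In this sketch the range-based split applies the same docID intervals to every posting list, so all postings of a given document fall into the same partition, while the partitions themselves can differ in size, as the abstract notes.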

2.
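For content-based retrieval in large-scale multimedia databases, high-dimensional index structures are a key research problem. This paper proposes an effective high-dimensional index structure, the adaptive approximation tree, describes its structure, and gives algorithms for construction and retrieval. It combines the advantages of tree structures and sequential scanning, and adapts its structure to the data distribution: when the dimensionality is low or the data distribution is highly skewed it takes the form of a tree, and when the dimensionality is high or the data are densely distributed it takes the form of a sequential scan, so as to achieve better retrieval efficiency. Structurally, MBRs are stored in compressed form to save storage space; algorithmically, the method exploits the coexistence of MBSs and MBRs in the space partitioning to avoid a large amount of complex computation, which greatly improves retrieval efficiency.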

3.
    
Automatic test data generation is a very popular domain in the field of search‐based software engineering. Traditionally, the main goal has been to maximize coverage. However, other objectives can be defined, such as the oracle cost, which is the cost of executing the entire test suite and the cost of checking the system behavior. Indeed, in very large software systems, the cost of testing the system can be an issue, and it then makes sense to consider two conflicting objectives: maximizing the coverage and minimizing the oracle cost. This is what we did in this paper. We mainly compared two approaches to deal with the multi‐objective test data generation problem: a direct multi‐objective approach and a combination of a mono‐objective algorithm together with multi‐objective test case selection optimization. Concretely, in this work, we used four state‐of‐the‐art multi‐objective algorithms and two mono‐objective evolutionary algorithms followed by a multi‐objective test case selection based on Pareto efficiency. The experimental analysis compares these techniques on two different benchmarks. The first one is composed of 800 Java programs created through a program generator. The second benchmark is composed of 13 real programs extracted from the literature. In the direct multi‐objective approach, the results indicate that the oracle cost can be properly optimized; however, the full branch coverage of the system poses a great challenge. Regarding the mono‐objective algorithms, although they need a second phase of test case selection for reducing the oracle cost, they are very effective in maximizing the branch coverage. Copyright © 2011 John Wiley & Sons, Ltd.
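Selection "based on Pareto efficiency" over the two objectives named in this abstract can be sketched as a non-dominated filter. This is a minimal illustration, not the paper's selection procedure; the tuple layout `(name, coverage, cost)` and the sample values are hypothetical.

```python
def pareto_front(candidates):
    """Return the candidates that no other candidate dominates.

    Each candidate is (name, coverage, cost); coverage is maximized
    and cost (standing in for the oracle cost) is minimized.
    """
    def dominates(a, b):
        no_worse = a[1] >= b[1] and a[2] <= b[2]
        strictly_better = a[1] > b[1] or a[2] < b[2]
        return no_worse and strictly_better

    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]

# Hypothetical test suites: (name, branch coverage, oracle cost).
suites = [("A", 0.90, 40), ("B", 0.80, 10), ("C", 0.80, 25), ("D", 0.60, 12)]
print(pareto_front(suites))  # [('A', 0.9, 40), ('B', 0.8, 10)]
```

Suites C and D are dropped because B offers at least as much coverage at a lower cost; A and B remain because neither is better on both objectives.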

4.
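Full-text search engines are important for retrieving scientific and technical documents because they provide the broadest range of information and respond actively to the keywords entered by users. The technology focuses on analyzing and looking up every word in the full text, with the goal of obtaining the entire content of an article. It therefore greatly facilitates subsequent research and development.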

5.
This paper describes a novel approach to estimating the size of database query results using neural networks. Using the proposed approach, three-layer neural networks are constructed and trained to learn the cumulative distribution functions of attribute values in relations. With a trained network, the estimate of the query result size can be obtained instantly by simply computing the network output from the given query predicates. The basic computational model using a cumulative distribution function to compute the query result size is described. The network construction and training are discussed. Comprehensive experiments were conducted to study the effectiveness of the proposed approach. The results indicate that the approach produces estimates with accuracies that are comparable with or higher than those reported in the literature.
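The computational model mentioned in this abstract, in which a cumulative distribution function (CDF) yields the result size, can be sketched for a range predicate as N · (F(hi) − F(lo)). The sketch below swaps the trained neural network for an exact empirical CDF, so it only illustrates the formula, not the paper's estimator; all names and the sample data are hypothetical.

```python
from bisect import bisect_right

def make_empirical_cdf(values):
    """Return F(x) = fraction of values <= x (a stand-in for the trained network)."""
    sorted_vals = sorted(values)
    n = len(sorted_vals)

    def cdf(x):
        return bisect_right(sorted_vals, x) / n

    return cdf

def estimate_range_size(cdf, num_rows, lo, hi):
    """Estimated number of rows satisfying lo < attribute <= hi."""
    return num_rows * (cdf(hi) - cdf(lo))

# Hypothetical attribute column with 10 rows.
ages = [18, 21, 21, 25, 30, 30, 30, 42, 57, 63]
cdf = make_empirical_cdf(ages)
print(estimate_range_size(cdf, len(ages), 20, 30))  # about 6 rows
```

With a learned approximation of F in place of `make_empirical_cdf`, the same two evaluations of F would give the estimate without scanning the relation.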

6.
K.  Wen-Syan  M.   《Data & Knowledge Engineering》2000,35(3):259-298
Since media-based evaluation yields similarity values, the result of a multimedia database query, Q(Y1,…,Yn), is defined as an ordered list SQ of n-tuples of the form X1,…,Xn. The query Q itself is composed of a set of fuzzy and crisp predicates, constants, variables, and conjunction, disjunction, and negation operators. Since many multimedia applications require partial matches, SQ includes results which do not satisfy all predicates. Due to the ranking and partial match requirements, traditional query processing techniques do not apply to multimedia databases. In this paper, we first focus on the problem of “given a multimedia query which consists of multiple fuzzy and crisp predicates, providing the user with a meaningful final ranking”. More specifically, we study the problem of merging similarity values in queries with multiple fuzzy predicates. We describe the essential multimedia retrieval semantics, compare these with the known approaches, and propose a semantics which captures the requirements of the multimedia retrieval problem. We then build on these results in answering the related problem of “given a multimedia query which consists of multiple fuzzy and crisp predicates, finding an efficient way to process the query.” We develop an algorithm to efficiently process queries with unordered fuzzy predicates (sub-queries). Although this algorithm can work with different fuzzy semantics, it benefits from the statistical properties of the semantics proposed in this paper. We also present experimental results for evaluating the proposed algorithm in terms of quality of results and search space reduction.

7.
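Diao Pengfei, Li Shusen, Jiang Xuesong. Control and Decision (控制与决策), 2021, 36(12): 2910-2918
To improve search performance on dynamic multi-objective problems, a dynamic multi-objective algorithm based on multi-population decomposition and prediction is proposed. First, an evolutionary vector generation strategy is proposed: a set of uniformly distributed parallel vectors is generated from the solutions of the preferred objectives, and each subproblem is optimized with the gravitational search algorithm, ensuring the accuracy of the corresponding solutions and the uniformity of their distribution. Second, an interpolation generation strategy is designed: based on the objective-space values of the solutions to the evolutionary-vector subproblems, more non-dominated solutions are generated by linear interpolation, ensuring the diversity and uniformity of the solution set. Third, after an environmental change, a search population is predicted and generated by exploiting the similarity between the solutions of adjacent subproblems, which speeds up the search. A comparison with five competing algorithms on ten standard dynamic test functions shows that the proposed algorithm achieves good distribution and convergence on dynamic multi-objective problems.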

8.
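Site Selection in Distributed Query Processing   Cited by: 2 (self-citations: 0; citations by others: 2)
Building on a query model and a cost model, this paper analyzes several typical algorithms for site selection in distributed query processing, points out their characteristics and scope of application, and finally raises some issues for further research.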

9.
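Wu Jing, Jing Ning, Chen Luo. Journal of Software (软件学报), 2000, 11(2): 265-270
In database research, path search and spatial query processing are regarded as two unrelated areas. However, processing path queries with spatial constraints requires the database system to provide both path computation and spatial query processing. To handle spatial constraints in path computation, two classes of processing strategies are considered: (1) whether spatial operations are preprocessed before path computation; (2) whether spatial objects are preselected before path computation. Based on these two classes of strategies, and using existing spatial join, R-tree spatial search, and spatial object clustering techniques, four integrated methods for processing spatial path queries are proposed.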

10.
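Gravitational Search Algorithm Based on Improved Tent Chaos   Cited by: 1 (self-citations: 0; citations by others: 1)
Compared with traditional optimization algorithms, the gravitational search algorithm (GSA) converges quickly and has strong exploration capability, but it is prone to premature convergence and local optima, and its search capability is comparatively weak. To address this, a gravitational search algorithm based on improved Tent chaos (ITC-GSA) is proposed. First, the Tent chaotic map is improved and used to initialize the population; the randomness, ergodicity, and regularity of Tent chaotic sequences make the initial population random and ergodic within the feasible region, strengthening the algorithm's global search capability. Second, a dynamic adjustment strategy for the gravitational constant G is introduced to improve convergence speed and accuracy. Third, a maturity index is designed to judge population maturity, and Tent chaotic search is used to suppress premature convergence and help the population escape local optima. Finally, simulation experiments on 10 benchmark functions show that the proposed algorithm effectively overcomes GSA's tendency toward premature convergence and local optima, and improves convergence speed and optimization accuracy.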
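Chaotic population initialization of the kind this abstract describes can be sketched with a plain skew Tent map. The abstract does not say how the map is improved, so this sketch makes no attempt to reproduce that improvement; the breakpoint `a = 0.7`, the seed `x0`, and the function names are assumptions. The breakpoint is kept away from 0.5 because the symmetric Tent map collapses to 0 after a few dozen iterations in binary floating point.

```python
def tent_sequence(x0, length, a=0.7):
    """Iterate the skew Tent map on [0, 1]: x/a if x < a, else (1 - x)/(1 - a)."""
    xs, x = [], x0
    for _ in range(length):
        x = x / a if x < a else (1.0 - x) / (1.0 - a)
        xs.append(x)
    return xs

def tent_init_population(pop_size, bounds, x0=0.37, a=0.7):
    """Map one chaotic sequence onto the feasible region given by bounds.

    bounds is a list of (lower, upper) pairs, one per dimension.
    """
    dim = len(bounds)
    chaos = tent_sequence(x0, pop_size * dim, a)
    population = []
    for i in range(pop_size):
        row = chaos[i * dim:(i + 1) * dim]
        population.append([lo + c * (hi - lo) for c, (lo, hi) in zip(row, bounds)])
    return population

# Five agents in a 2-D search space.
for agent in tent_init_population(5, [(-5.0, 5.0), (0.0, 1.0)]):
    print(agent)
```

The resulting list can be used wherever a uniformly random initial population would otherwise be drawn.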

11.
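In a system that uses standardized codes, pinyin-based retrieval is adopted to solve the problem of matching newly added records exactly to codes. Pinyin retrieval is used mainly on the employment-information collection page: it fuzzy-matches the collected data and loads the matches into the page so that the user can select the entry that best meets the requirements. To improve matching precision, the system uses a double fuzzy query method, which addresses both query efficiency and query precision when there is a large amount of data to be searched. The system has performed well since it was put into use.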

12.
We introduce a semantic data model to capture the hierarchical, spatial, temporal, and evolutionary semantics of images in pictorial databases. This model mimics the user's conceptual view of the image content, providing the framework and guidelines for preprocessing to extract image features. Based on the model constructs, a spatial evolutionary query language (SEQL), which provides direct image object manipulation capabilities, is presented. With semantic information captured in the model, spatial evolutionary queries are answered efficiently. Using an object-oriented platform, a prototype medical-image management system was implemented at UCLA to demonstrate the feasibility of the proposed approach.

13.
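Taking the solution of some NP problems (such as the TSP and the knapsack problem) as examples, this paper analyzes the essential differences between quantum search algorithms running on quantum computers and evolutionary search algorithms running on classical computers. It also discusses ways of combining them, in particular quantum-driven evolutionary algorithms running on classical computers.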

14.
    
The use of metaheuristic search techniques for the automatic generation of test data has been a burgeoning interest for many researchers in recent years. Previous attempts to automate the test generation process have been limited, having been constrained by the size and complexity of software, and the basic fact that, in general, test data generation is an undecidable problem. Metaheuristic search techniques offer much promise in regard to these problems. Metaheuristic search techniques are high‐level frameworks, which utilize heuristics to seek solutions for combinatorial problems at a reasonable computational cost. To date, metaheuristic search techniques have been applied to automate test data generation for structural and functional testing; the testing of grey‐box properties, for example safety constraints; and also non‐functional properties, such as worst‐case execution time. This paper surveys some of the work undertaken in this field, discussing possible new future directions of research for each of its different individual areas. Copyright © 2004 John Wiley & Sons, Ltd.

15.
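Solving the Planar Location Problem with a Gravitational Search Algorithm   Cited by: 1 (self-citations: 0; citations by others: 1)
To solve the planar location problem, a solution method based on the gravitational search algorithm is presented. The algorithm uses the law of universal gravitation for global search and a neighborhood search method for local search, balancing global and local optimization. Extensive experiments and comparisons with existing solution methods confirm the feasibility and effectiveness of the algorithm.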
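The global-search half of this approach can be sketched with a basic gravitational search loop. Several things here are assumptions rather than details from the abstract: the problem is taken to be a single-facility weighted-distance (Weber) problem, the parameters `g0`, `alpha`, `agents`, and `iters` are illustrative, and the neighborhood local search is omitted entirely.

```python
import math
import random

def weber_cost(p, demand):
    """Sum of weighted Euclidean distances from site p to demand points (x, y, w)."""
    return sum(w * math.hypot(p[0] - x, p[1] - y) for x, y, w in demand)

def gsa_minimize(cost, bounds, agents=20, iters=200, g0=100.0, alpha=20.0, seed=0):
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(agents)]
    vel = [[0.0] * dim for _ in range(agents)]
    best_p, best_f = None, float("inf")
    for t in range(iters):
        fit = [cost(p) for p in pos]
        for p, f in zip(pos, fit):  # remember the best site seen so far
            if f < best_f:
                best_p, best_f = list(p), f
        worst, best = max(fit), min(fit)
        # Normalized masses: lower cost means a heavier agent.
        raw = [(worst - f) / (worst - best) if worst > best else 1.0 for f in fit]
        total = sum(raw)
        mass = [m / total for m in raw]
        g = g0 * math.exp(-alpha * t / iters)  # decaying gravitational constant
        # Only the k best agents attract the others; k shrinks from `agents` to 1.
        k = max(1, round(agents - (agents - 1) * t / iters))
        kbest = sorted(range(agents), key=lambda i: fit[i])[:k]
        acc = [[0.0] * dim for _ in range(agents)]
        for i in range(agents):
            for j in kbest:
                if j == i:
                    continue
                dist = math.dist(pos[i], pos[j])
                for d in range(dim):
                    pull = g * mass[j] * (pos[j][d] - pos[i][d]) / (dist + 1e-12)
                    acc[i][d] += rng.random() * pull
        for i in range(agents):
            for d, (lo, hi) in enumerate(bounds):
                vel[i][d] = rng.random() * vel[i][d] + acc[i][d]
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))  # clamp to bounds
    for p in pos:  # evaluate the final positions as well
        f = cost(p)
        if f < best_f:
            best_p, best_f = list(p), f
    return best_p, best_f

# Hypothetical instance: four equally weighted demand points on a square.
demand = [(0, 0, 1), (10, 0, 1), (0, 10, 1), (10, 10, 1)]
site, total = gsa_minimize(lambda p: weber_cost(p, demand), [(0.0, 10.0), (0.0, 10.0)])
print(site, total)
```

For this symmetric instance the true optimum is the centre (5, 5) with cost 4·√50 ≈ 28.28, which gives a reference point for judging how close a run gets.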

16.
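As data management demands keep growing, reducing and controlling the energy consumption of data centers has become a challenging problem. The DBMS is core data-center software, and energy-efficient query processing and optimization is an important topic within it. This paper proposes a new energy cost estimation model and, by estimating the time and energy costs of query plans, examines how different optimization objectives affect query processing under different hardware conditions. Experiments show that on traditional hardware, performance-oriented and energy-oriented optimization yield the same results; on new hardware the results differ, so the energy efficiency of the database system can be improved.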

17.
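With the rapid development of the Internet, crowdsourced photos produced by mobile devices can be used in many important application scenarios to obtain useful information, for example restoring the affected area after an earthquake or handling major accidents. However, these scenarios are often resource-constrained in bandwidth, device storage, processing power, and so on, which limits the number of crowdsourced photos that can be collected. Achieving the best reconstruction of a target from crowdsourced photos under limited resources is therefore a major challenge. By collecting and processing the geographic and geometric data of photos to form photo metadata, and under constrained computing resources, a utility-optimal selection algorithm for crowdsourced photos that takes metadata as input is proposed to achieve the best reconstruction of the target. Because the algorithm's input is metadata rather than pixels, it can analyze crowdsourced photos efficiently in resource-limited scenarios. Photo utility is used to measure how well the target area is covered, and a method for computing photo utility is proposed. Finally, simulation experiments are designed, and the results confirm the effectiveness and superiority of the algorithm.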

18.
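To address the problem that K-means clustering results are highly sensitive to the cluster centers and easily fall into local optima, a K-means clustering algorithm based on an improved gravitational search is proposed. First, an adaptive mechanism is introduced to control the decay factor of the gravitational coefficient, improving the algorithm's global exploration and local exploitation capabilities. Then, an immune clonal selection mechanism is introduced so that the algorithm can effectively escape local optima; experiments on 12 benchmark test functions verify the effectiveness and superiority of the improved gravitational search algorithm. Finally, by combining the improved gravitational search algorithm with K-means, a new clustering algorithm, A2F-GSA-Kmeans, is proposed, and experiments on 6 test data sets show that it achieves good clustering quality.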

19.
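Boolean functions are cryptographic functions widely used in cryptography, error-correcting codes, spread-spectrum communication, and other fields, and finding Boolean functions with good properties has long been an important problem in cryptography. A new algorithm for searching for Boolean functions is designed based on the gravitational search algorithm. Imitating the law of universal gravitation, the algorithm represents Boolean functions as mass points in n-dimensional space and searches using the cryptographic properties of Boolean functions as the objective fitness function. Experimental results show that, with the newly designed fitness function, the algorithm can directly generate balanced Boolean functions that satisfy several cryptographic criteria, including first-order resiliency, the first-order propagation criterion, high nonlinearity, high algebraic degree, and low autocorrelation. A computer search algorithm that directly generates 2-output balanced Boolean functions is also given.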
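Two of the cryptographic properties named in this abstract, balancedness and nonlinearity, can be computed from a truth table with the Walsh-Hadamard transform, using the standard identity NL(f) = (2^n − max|W_f|) / 2. The sketch below shows only these two ingredients; how the paper combines them into its fitness function is not stated in the abstract, and the function names are illustrative.

```python
def walsh_spectrum(truth_table):
    """Fast Walsh-Hadamard transform of (-1)^f(x); the table length must be 2**n."""
    w = [1 - 2 * bit for bit in truth_table]
    h = 1
    while h < len(w):
        for start in range(0, len(w), 2 * h):
            for i in range(start, start + h):
                a, b = w[i], w[i + h]
                w[i], w[i + h] = a + b, a - b  # butterfly step
        h *= 2
    return w

def is_balanced(truth_table):
    """A function is balanced when it outputs 0 and 1 equally often."""
    return 2 * sum(truth_table) == len(truth_table)

def nonlinearity(truth_table):
    """Hamming distance from f to the nearest affine function."""
    return (len(truth_table) - max(abs(v) for v in walsh_spectrum(truth_table))) // 2

# f(x) = x3*x2 XOR x1*x0 on 4 variables (a bent function), and the linear f(x) = x0.
bent = [((x >> 3) & (x >> 2) & 1) ^ ((x >> 1) & x & 1) for x in range(16)]
linear = [x & 1 for x in range(16)]
print(nonlinearity(bent), is_balanced(bent))      # 6 False
print(nonlinearity(linear), is_balanced(linear))  # 0 True
```

The example shows why a search has to trade these criteria off: the bent function reaches the maximum nonlinearity for four variables but is not balanced, while the linear function is balanced but has nonlinearity 0.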

20.
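A Survey of the Ant Colony Algorithm and Its Improved Variants   Cited by: 6 (self-citations: 0; citations by others: 6)
The ant colony algorithm is a simulated evolutionary algorithm with many desirable properties, and it has successfully solved many complex combinatorial optimization problems. However, the ant colony algorithm is not perfect. This paper introduces the model of the ant colony algorithm and its existing problems, surveys the various improved forms of the algorithm, and finally forecasts its future research directions.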
