Similar literature
Found 20 similar documents; search took 31 ms
1.
The Journal of Supercomputing - Data warehouses are very large databases and play a key role in intelligent decision making in enterprises. The bitmap join indexes selection problem is crucial in the...

2.
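A bitmap join index is an effective indexing mechanism in data warehouses for optimizing the performance of joins between tables. In large-memory analytical processing scenarios, a bitmap join index must not only balance the memory and CPU overhead of the index, but also account for the performance gains and data access latency of the processor platform. This paper proposes a service-based bitmap join index management mechanism with three main features: a self-managing index mechanism that is independent of the database; a top-K keyword bitmap join index mechanism under storage space constraints; and a processor-conscious bitmap join index technique. The index service turns the index from a data structure built into the database into an index service layer outside the database. A workload analysis module for user queries and an index service management module replace the traditional model in which database administrators manage indexes by hand, while coprocessors and memory cloud technology improve the performance and flexibility of the index service. Experimental results show that the index service mechanism effectively improves index storage and access efficiency, and that, backed by the strong parallel processing capability of general-purpose GPUs, both the performance of the bitmap join index service and the overall query processing performance of the database improve significantly.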
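
To make the core idea concrete, here is a minimal sketch of a bitmap join index over a toy star schema. It is not the service-based mechanism described in the abstract: the tables, the column names, and the helpers `build_bitmap_join_index` and `rows_from_bitmap` are all hypothetical, and bitmaps are held as plain Python integers rather than compressed or GPU-resident structures.

```python
from collections import defaultdict

# Hypothetical star schema: two dimension tables keyed by id, and fact rows
# that reference them through foreign keys.
customers = {1: {"region": "EU"}, 2: {"region": "US"}, 3: {"region": "EU"}}
products = {10: {"category": "book"}, 11: {"category": "toy"}}
sales = [  # (customer_id, product_id, amount)
    (1, 10, 5.0),
    (2, 10, 7.5),
    (3, 11, 2.0),
    (1, 11, 4.0),
]


def build_bitmap_join_index(facts, fk_pos, dimension, attribute):
    """Map each dimension attribute value to a bitmap over fact row positions.

    Bit i of a bitmap is set when fact row i joins to a dimension row
    whose attribute has that value. Bitmaps are stored as Python ints.
    """
    index = defaultdict(int)
    for row_id, fact in enumerate(facts):
        value = dimension[fact[fk_pos]][attribute]
        index[value] |= 1 << row_id
    return dict(index)


def rows_from_bitmap(bitmap):
    """Return the fact row positions whose bits are set, in ascending order."""
    rows = []
    row_id = 0
    while bitmap:
        if bitmap & 1:
            rows.append(row_id)
        bitmap >>= 1
        row_id += 1
    return rows


region_index = build_bitmap_join_index(sales, 0, customers, "region")
category_index = build_bitmap_join_index(sales, 1, products, "category")

# Star query "sales of toys to EU customers": a bitwise AND stands in for the joins.
matching = rows_from_bitmap(region_index["EU"] & category_index["toy"])
total = sum(sales[i][2] for i in matching)
```

The join work is paid once at index build time; at query time the predicate on each dimension becomes a bitmap lookup, and combining predicates is a bitwise operation over the fact table's row positions.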

3.
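Database query optimization plays an important role in improving query efficiency and strengthening database performance. To address the low efficiency of multi-table join queries in large databases, an improved query optimization algorithm based on particle swarm optimization is proposed. To match the characteristics of multi-table join queries, particles use a tree-based encoding, and a model for computing the execution cost of a database query is proposed. Experiments show that query strategies optimized with particle swarm optimization have a lower execution cost than the original query strategies, which effectively improves the query efficiency of the system.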
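
The abstract does not give its cost model, so the following is only a sketch of what a join-order cost function can look like. It assumes a common textbook simplification: the cost of a left-deep plan is the sum of its intermediate result sizes, each estimated from table cardinalities and pairwise join selectivities. The statistics and the names `plan_cost` and `best_order` are hypothetical, and an exhaustive search is swapped in where the paper uses particle swarm optimization.

```python
from itertools import permutations

# Hypothetical statistics: table cardinalities and pairwise join selectivities.
cardinality = {"A": 1000, "B": 100, "C": 10}
selectivity = {frozenset(("A", "B")): 0.01, frozenset(("B", "C")): 0.1}


def plan_cost(order):
    """Cost of a left-deep plan: the sum of its intermediate result sizes."""
    joined = [order[0]]
    size = float(cardinality[order[0]])
    cost = 0.0
    for table in order[1:]:
        size *= cardinality[table]
        for other in joined:
            # Missing pairs have no join predicate, i.e. a cross product.
            size *= selectivity.get(frozenset((other, table)), 1.0)
        joined.append(table)
        cost += size
    return cost


def best_order(tables):
    """Exhaustive baseline: evaluate every left-deep order and keep the cheapest."""
    return min(permutations(tables), key=plan_cost)
```

Exhaustive search grows factorially with the number of tables, which is the reason metaheuristics such as particle swarm optimization are applied to this problem; a function like `plan_cost` would then serve as the fitness function.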

4.
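孙一凡 (Sun Yifan), 张纪会 (Zhang Jihui). 《控制与决策》 (Control and Decision), 2023, 38(10): 2764-2772
To further improve the performance of particle swarm optimization on discrete optimization problems, and to address the sticky binary particle swarm optimization algorithm's lack of global search ability, tendency to become trapped in local optima, and slow convergence, a new adaptive parameter strategy and a particle divergence index are proposed and combined with a simulated annealing mechanism to improve the algorithm's search ability. To test the algorithm, simulation experiments are run on knapsack problem instance libraries of different dimensions and UCI feature selection instance libraries of different scales, and the experimental data are analyzed statistically. The experiments and analysis show that the proposed algorithm outperforms the compared algorithms in solution accuracy, stability, and convergence speed.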

5.
On-line analytical processing (OLAP) refers to the technologies that allow users to efficiently retrieve data from the data warehouse for decision-support purposes. Data warehouses tend to be extremely large; it is quite possible for a data warehouse to be hundreds of gigabytes to terabytes in size (Chaudhuri and Dayal, 1997). Queries tend to be complex and ad hoc, often requiring computationally expensive operations such as joins and aggregation. Given this, we are interested in developing strategies for improving query processing in data warehouses by exploring the applicability of parallel processing techniques. In particular, we exploit the natural partitionability of a star schema and render it even more efficient by applying DataIndexes, a storage structure that serves both as an index and as data and lends itself naturally to vertical partitioning of the data. DataIndexes are derived from the various special-purpose access mechanisms currently supported in commercial OLAP products. Specifically, we propose a declustering strategy that incorporates both task and data partitioning, and present the Parallel Star Join (PSJ) Algorithm, which provides a means to perform a star join in parallel using efficient operations involving only rowsets and projection columns. We compare the performance of the PSJ Algorithm with two parallel query processing strategies. The first is a parallel join strategy utilizing the Bitmap Join Index (BJI), arguably the state-of-the-art OLAP join structure in use today. For the second strategy we choose a well-known parallel join algorithm, namely the pipelined hash algorithm. To assist in the performance comparison, we first develop a cost model of the disk access and transmission costs for all three approaches.

6.
The optimal mapping of tasks to processors is one of the challenging issues in heterogeneous computing systems. This article addresses the task scheduling problem in distributed systems using a discrete particle swarm optimization (DPSO) algorithm with various neighborhood topologies. DPSO is a recent population-based metaheuristic. In DPSO, the particles in a swarm fly through the N-dimensional search space by learning from both the personal best position and a neighborhood best position. Each particle in the swarm belongs to a specific topology for communicating with neighboring particles. The neighborhood topology affects the performance of DPSO significantly, because it determines the rate at which information spreads through the swarm. The proposed DPSO algorithm uses a dynamic topology, a binary heap tree, for communication between the particles in the swarm. The performance of the proposed topology is compared with other topologies such as star, ring, fully connected, binary tree, and Von Neumann. Three well-known performance measures, makespan, mean flow time, and reliability cost, are used for the comparison. Computational simulation results indicate that the DPSO algorithm improves significantly when the binary heap tree topology is used for communication among the particles in the swarm.

7.
Data warehouse (DW) technology was developed to support the integration of external data sources (EDSs) for advanced data analysis by On-Line Analytical Processing (OLAP) applications. Since the contents and structures of integrated EDSs may evolve over time, the content and schema of a DW must evolve too in order to correctly reflect the evolution of the EDSs. To manage DW evolution, we developed the multiversion data warehouse (MVDW) approach. In this approach, different states of a DW are represented by a sequence of persistent DW versions that correspond either to the real-world state or to a simulation scenario. Typically, OLAP applications execute star queries that join multiple fact and dimension tables. An important optimization technique for this kind of query is based on join indexes. Since in the MVDW fact and dimension data are physically distributed among multiple DW versions, standard join indexes need extensions. In this paper we present the concept of a multiversion join index (MVJI) applicable to indexing dimension and fact tables in the MVDW. The MVJI has a two-level structure, where the upper level is used for indexing attributes and the lower level is used for indexing DW versions. The paper also presents a theoretical upper-bound (pessimistic) analysis of the MVJI performance characteristic with respect to I/O operations. The analysis is followed by an experimental evaluation. It shows that the MVJI improves system performance for queries addressing multiple DW versions with exact-match and range predicates.

8.
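In a multiversion data warehouse, dimension instances of different data warehouse versions can share storage. Building a bitmap join index directly between dimension tables and the fact table produces a large number of useless index entries and hurts query efficiency. This paper gives a formal definition of a data warehouse version and a shared storage scheme for dimension instances, and on that basis designs the query optimization algorithm DWVOQ, which builds version views of dimension instances and join indexes between those views and fact instances to reduce index space cost and improve index query efficiency.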

9.
One of the important research and technological problems in data warehouse query optimization concerns star queries. So far, most research has focused on optimizing such queries by means of join indexes, bitmap join indexes, or various multidimensional indexes. These structures neither support navigation along dimension hierarchies well nor optimize joins with the Time dimension, which in practice is used in most star queries. In this paper we propose an index, called TimeHOBI, for optimizing star queries that compute aggregates along dimension hierarchies. TimeHOBI, created on a dimension hierarchy, is composed of (1) a Hierarchically Organized Bitmap Index (HOBI), where one bitmap index is maintained for each dimension level, and (2) a Time Index (TI) that implicitly encodes time in every dimension. HOBI allows a quick search for fact rows satisfying predicates defined on different levels of dimension hierarchies. With the support of TI, joining a fact table with the Time dimension is avoided. Thus, TimeHOBI supports a broad class of star queries. In this paper we explain how query execution plans for star queries can profit from TimeHOBI. We show, based on experiments, the efficiency of TimeHOBI for different classes of queries, as compared to HOBI and a traditional bitmap index. Based on the experiments, we also demonstrate how sensitive TimeHOBI is to variable query selectivity. We also analyze the maintenance time of TimeHOBI as compared to HOBI and a traditional bitmap index. The experiments in the paper were conducted on a real dataset from Allegro.pl, the biggest East-European Internet auction platform. The experiments show that TimeHOBI can be successfully applied to the optimization of star queries, as it offers a promising performance improvement.

10.
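Research on a test selection optimization method based on particle swarm optimization   Total citations: 4, self-citations: 3, citations by others: 1
Test selection optimization is a key problem in optimizing the diagnostic design of complex electronic equipment. It is a typical set covering problem and is NP-hard. To address the shortcomings of existing optimization methods, and based on an analysis of the test selection problem, a test selection optimization method based on binary particle swarm optimization is proposed: the candidate test set is encoded as binary particles, a particle fitness function is constructed, and the particle swarm search yields a fast solution. Compared with traditional methods, the method searches quickly and optimizes effectively, and it has been applied in engineering practice.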
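
As an illustration of the encoding described above, the sketch below applies a plain binary particle swarm optimization (sigmoid transfer function) to a tiny set covering instance. It is not the paper's method: the fault-test data, the penalty-based `fitness`, the parameter values, and the choice to seed one particle with every test selected are all assumptions made for this example.

```python
import math
import random

# Hypothetical fault-test dependency data: covers[t] is the set of faults
# that candidate test t can detect.
covers = [{0, 1}, {1, 2}, {2, 3}, {0, 3}, {0, 1, 2, 3}]
faults = {0, 1, 2, 3}
PENALTY = len(covers) + 1  # one missed fault costs more than selecting every test


def fitness(bits):
    """Number of selected tests plus a penalty per undetected fault (lower is better)."""
    detected = set()
    for selected, cover in zip(bits, covers):
        if selected:
            detected |= cover
    return sum(bits) + PENALTY * len(faults - detected)


def bpso(n_particles=8, iterations=30, seed=0, w=0.7, c1=1.5, c2=1.5, v_max=4.0):
    """Binary PSO: each particle is a 0/1 vector marking which tests are selected."""
    rng = random.Random(seed)
    n = len(covers)
    # Assumption: seed one particle with "select every test" so that the
    # best-known solution is feasible from the start.
    swarm = [[1] * n]
    swarm += [[rng.randint(0, 1) for _ in range(n)] for _ in range(n_particles - 1)]
    velocity = [[0.0] * n for _ in range(n_particles)]
    pbest = [p[:] for p in swarm]
    gbest = min(pbest, key=fitness)[:]
    for _ in range(iterations):
        for i, particle in enumerate(swarm):
            for j in range(n):
                v = (w * velocity[i][j]
                     + c1 * rng.random() * (pbest[i][j] - particle[j])
                     + c2 * rng.random() * (gbest[j] - particle[j]))
                velocity[i][j] = max(-v_max, min(v_max, v))
                # Sigmoid transfer: the velocity sets the probability that the bit is 1.
                prob_one = 1.0 / (1.0 + math.exp(-velocity[i][j]))
                particle[j] = 1 if rng.random() < prob_one else 0
            if fitness(particle) < fitness(pbest[i]):
                pbest[i] = particle[:]
                if fitness(particle) < fitness(gbest):
                    gbest = particle[:]
    return gbest
```

Because the penalty for a single undetected fault exceeds the cost of selecting every test, and the swarm starts with a full selection, the returned vector always detects every fault in this instance; how few tests it uses depends on the random search.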

11.
This paper presents an evolving ant direction particle swarm optimization algorithm for solving the optimal power flow problem with non-smooth and non-convex generator cost characteristics. In this method, ant colony search is used to find a suitable velocity updating operator for particle swarm optimization, and the ant colony parameters are evolved using a genetic algorithm approach. Five velocity updating operators are used to update the velocities for particle swarm optimization. The power flow problem is solved by the Newton–Raphson method. The feasibility of the proposed method was tested on the IEEE 30-bus, IEEE 39-bus and IEEE 57-bus systems with three different objective functions. Several cases were investigated to test and validate the effectiveness of the proposed method in finding the optimal solution. Simulation results show that the proposed method provides better results than classical particle swarm optimization and other methods recently reported in the literature. An innovative statistical analysis based on central tendency measures and dispersion measures was carried out on the bus voltage profiles and voltage stability indices.

12.
An important problem in the study of evolutionary algorithms is how to continuously predict promising solutions while simultaneously escaping from local optima. In this paper, we propose an elitist probability schema (EPS) for the first time, to the best of our knowledge. Our schema is an index of binary strings that expresses the similarity of an elitist population at every string position. EPS expresses the accumulative effect of fitness selection with respect to the coding similarity of the population. For each generation, EPS can quantify the coding similarity of the population objectively and quickly. One of our key innovations is that EPS can continuously predict promising solutions while simultaneously escaping from local optima in most cases. To demonstrate the abilities of the EPS, we designed an elitist probability schema genetic algorithm and an elitist probability schema compact genetic algorithm. These algorithms are estimation of distribution algorithms (EDAs). We provided a fair comparison with the persistent elitist compact genetic algorithm (PeCGA), quantum-inspired evolutionary algorithm (QEA), and particle swarm optimization (PSO) for the 0–1 knapsack problem. The proposed algorithms converged more quickly than PeCGA, QEA, and PSO, especially for the large knapsack problem. Furthermore, the computation time of the proposed algorithms was less than that of some EDAs that are based on building explicit probability models, and was approximately the same as that of QEA and PSO. This is acceptable for evolutionary algorithms, and satisfactory for EDAs. The proposed algorithms are successful with respect to convergence performance and computation time, which implies that EPS is satisfactory.

13.
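Database query optimization based on particle swarm optimization   Total citations: 1, self-citations: 0, citations by others: 1
This paper studies the application of particle swarm optimization to database query optimization. To address the difficulty of information retrieval and the low query efficiency in large databases, a database query optimization scheme based on particle swarm optimization is proposed. The algorithm introduces a cost model for database query execution plans that mainly covers the ordering of multiple joins and the selection of replicas, defines the query execution cost precisely, and uses the proposed particle swarm algorithm to optimize and solve this execution cost problem, so that fewer groups are needed and data is located more precisely. Results on a worked example show that any teacher can be located accurately through attribute performance and violation behavior, and that the number of groups is reduced, which optimizes database queries.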

14.
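Research on implementation techniques for large data warehouses   Total citations: 2, self-citations: 0, citations by others: 2
Large data warehouses are an effective way to store massive amounts of data, but implementing them raises many problems. Based on an analysis of these problems, solution strategies for implementing large data warehouses are proposed, and several key technologies are studied: efficient computation of data cubes, incremental update maintenance, index optimization, failure recovery, schema design, cost models for query optimization, and the definition and management of metadata.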

15.
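张伟 (Zhang Wei), 黄卫民 (Huang Weimin). 《自动化学报》 (Acta Automatica Sinica), 2022, 48(10): 2585-2599
In multi-objective particle swarm optimization, balancing convergence and diversity is the key to obtaining a well-distributed, high-accuracy Pareto front. Most existing methods guide the particle search with a single strategy, so convergence and diversity are insufficient on complex problems. To solve this, a multi-strategy adaptive multi-objective particle swarm optimization algorithm based on population partitioning is proposed. The convergence contribution of the particles is used to monitor the state of the algorithm and adaptively adjust the particles' exploration and exploitation. To set accurate search strategies for particles of differing performance, a multi-strategy method for selecting the global best particle and a multi-strategy mutation method are proposed: according to a convergence evaluation index, the population is divided into three regions, and particle performance is tied to the search process to improve the search efficiency of each particle in the population. To address stagnation and entrapment in local optima caused by personal best particles that fail to guide the flight direction effectively, a personal best selection method with a memory interval is proposed, which makes personal best selection more reliable and speeds up convergence. The external archive is maintained with a fused indicator that contains two performance measures, which avoids deleting particles with good convergence when the archive is maintained by particle density alone, a deletion that degrades the population and weakens exploitation. Simulation results show that, compared with several other multi-objective optimization algorithms, the algorithm has good convergence and diversity.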

16.
The feature selection process constitutes a commonly encountered problem of global combinatorial optimization. This process reduces the number of features by removing irrelevant, noisy, and redundant data, while still yielding acceptable classification accuracy. Feature selection is a preprocessing technique of great importance in data analysis, information retrieval, pattern classification, and data mining applications. This paper presents a novel optimization algorithm called catfish binary particle swarm optimization (CatfishBPSO), in which the so-called catfish effect is applied to improve the performance of binary particle swarm optimization (BPSO). This effect results from introducing new particles into the search space (“catfish particles”), which are initialized at extreme points of the search space and replace the particles with the worst fitness when the fitness of the global best particle has not improved for a number of consecutive iterations. In this study, the K-nearest neighbor (K-NN) method with leave-one-out cross-validation (LOOCV) was used to evaluate the quality of the solutions. CatfishBPSO was applied to 10 classification problems taken from the literature and compared with other feature selection methods. Experimental results show that CatfishBPSO simplifies the feature selection process effectively, and either obtains higher classification accuracy or uses fewer features than other feature selection methods.

17.
The problem of near-optimal test point set selection with imperfect tests is solved by using the heuristic particle swarm optimization (HPSO) algorithm. First, to describe the uncertainty of each test, the testability analysis model and indexes such as fault detection rate, fault isolation rate, and false alarm rate are redefined. A heuristic function is then established to evaluate the detection and isolation capability and the uncertainty of each test point, which provides heuristic information to improve the search efficiency of particle swarm optimization (PSO). The heuristic function and the least-test-cost principle are used as the basis for designing a fitness function of the PSO algorithm for test point selection. Finally, the HPSO algorithm is used to select the optimal test point set for two practical systems. Simulation and experiment results show that the method can determine the global optimal test point set accurately and effectively while meeting the requirements of the testability indexes at the least cost.

18.
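To balance convergence and diversity in optimization algorithms for many-objective optimization problems and to increase selection pressure, this paper proposes a many-objective particle swarm optimization algorithm based on an objective space mapping strategy (many-objective particle swarm optimization algorithm based on objective space mapping ...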

19.
The rapidly increasing scale of data warehouses is challenging today’s data analytical technologies. A conventional data analytical platform processes data warehouse queries using a star schema: it normalizes the data into a fact table and a number of dimension tables, and during query processing it selectively joins the tables according to users’ demands. This model is space economical. However, it faces two problems when applied to big data. First, join is an expensive operation, which prevents a parallel database or a MapReduce-based system from achieving efficiency and scalability simultaneously. Second, join operations have to be executed repeatedly, while numerous join results can actually be reused by different queries. In this paper, we propose a new query processing framework for data warehouses. It pushes the join operations partially to the pre-processing phase and partially to the post-processing phase, so that data warehouse queries can be transformed into massively parallelized filter-aggregation operations on the fact table. In contrast to conventional query processing models, our approach is efficient, scalable and stable despite the large number of tables involved in the join. It is especially suitable for a large-scale parallel data warehouse. Our empirical evaluation on Hadoop shows that our framework exhibits linear scalability and outperforms some existing approaches by an order of magnitude.
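
The following sketch shows why moving joins out of query time turns a star query into a filter-aggregation scan. It covers only the pre-processing half of the idea, under the simplifying assumption that every dimension attribute is resolved into the fact rows ahead of time; the abstract's framework also defers part of the join to post-processing, which is not modeled here. The tables and the functions `prejoin`, `filter_aggregate`, and `run_query` are hypothetical, and partitions are processed in a loop rather than on Hadoop.

```python
from collections import Counter

# Hypothetical star schema.
stores = {1: {"country": "DE"}, 2: {"country": "FR"}}
items = {10: {"category": "book"}, 11: {"category": "toy"}}
facts = [  # (store_id, item_id, revenue)
    (1, 10, 3.0),
    (2, 10, 4.0),
    (1, 11, 5.0),
    (2, 11, 6.0),
    (1, 10, 1.0),
]


def prejoin(fact_rows):
    """Pre-processing: resolve the dimension attributes once, before any query runs."""
    return [
        {"country": stores[s]["country"], "category": items[i]["category"], "revenue": r}
        for s, i, r in fact_rows
    ]


def filter_aggregate(partition, predicate, group_key):
    """One worker's share of a query: a scan with a filter and a partial SUM."""
    partial = Counter()
    for row in partition:
        if predicate(row):
            partial[row[group_key]] += row["revenue"]
    return partial


def run_query(wide_rows, predicate, group_key, n_partitions=2):
    """Split the pre-joined rows, aggregate each partition, then merge the partial sums."""
    partitions = [wide_rows[k::n_partitions] for k in range(n_partitions)]
    result = Counter()
    for partition in partitions:  # iterations are independent and could run in parallel
        result.update(filter_aggregate(partition, predicate, group_key))
    return dict(result)


wide = prejoin(facts)
revenue_by_country = run_query(wide, lambda row: row["category"] == "book", "country")
```

Each partition needs no data from any other partition, so the scan scales out by adding workers; the trade-off in this simplified version is the extra space taken by the denormalized rows.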

20.
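In network fault diagnosis, a large number of irrelevant or redundant features lowers diagnostic accuracy, so the initial features must be selected. Wrapper-mode feature selection incurs a heavy computational load from the classification algorithm. To reduce this load, this paper proposes a fault feature selection method based on support-vector binary particle swarm optimization (SVB-BPSO). The algorithm uses an SVM as the classifier. It first selects the set of support vectors (SVs) by training the SVM on all samples, and uses only the SV set in the wrapped classification training. It then trains with the average distance between support vectors of different classes as the SVM parameter. Finally, based on the classification results, it uses BPSO to search the feature space globally for the optimal feature set. Experiments on the DARPA dataset show that the proposed method reduces the computational load of wrapper-mode feature selection while achieving high classification accuracy and a clear dimensionality reduction.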
