Similar literature
 20 similar articles found (search time: 15 ms)
1.
As a widely used parallel computing framework for big data processing, Hadoop MapReduce emphasizes high data throughput over low job-execution latency. However, a growing number of big data applications built on MapReduce require quick response times. Improving the performance of MapReduce jobs, especially short jobs, is therefore of great practical significance and has attracted increasing attention from both academia and industry. Much effort has been devoted to improving Hadoop performance at the level of job scheduling or job parameter optimization. In this paper, we explore an approach that improves the performance of the Hadoop MapReduce framework by optimizing the job and task execution mechanism. First, by analyzing the job and task execution mechanism in the MapReduce framework, we reveal two critical limitations on job execution performance. We then propose two major optimizations to the MapReduce job and task execution mechanisms: first, we optimize the setup and cleanup tasks of a MapReduce job to reduce the time spent in the initialization and termination stages of the job; second, instead of using the loose heartbeat-based communication mechanism to transmit all messages between the JobTracker and TaskTrackers, we introduce an instant messaging communication mechanism to accelerate performance-sensitive task scheduling and execution. Finally, we implement SHadoop, an optimized and fully compatible version of Hadoop that aims to shorten the execution time of MapReduce jobs, especially short jobs. Experimental results show that, compared with standard Hadoop, SHadoop achieves a stable performance improvement of around 25% on average across comprehensive benchmarks without losing scalability or speedup. Our optimization work has passed a production-level test at Intel and has been integrated into the Intel Distributed Hadoop (IDH).
To the best of our knowledge, this work is the first effort to optimize the execution mechanism inside the map/reduce tasks of a job. Its advantage is that it can complement job scheduling optimizations to further improve job execution performance.

2.
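This paper proposes a method for building an efficient, large-scale chemical structure search database system on top of the Oracle database management system using distributed, multithreaded parallel processing. Using the same structure search algorithm and different module combination mechanisms, four chemical structure search database systems were built: single-machine single-threaded, single-machine multithreaded, distributed single-threaded, and distributed multithreaded. Structure search experiments were run on the same set of chemical structures under each of the four implementations. The results show that, of the four, the distributed multithreaded parallel approach has the highest search efficiency, with good stability (the same as the other three implementations). The method has been successfully applied in the TASS (Target Activity and Structure System) software system developed by 微芯公司.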

3.
Effective task assignment is essential for achieving high performance in heterogeneous distributed computing systems. This paper proposes a new technique for minimizing the parallel application time cost of task assignment based on the honeybee mating optimization (HBMO) algorithm. The HBMO approach combines the power of simulated annealing, genetic algorithms, and an effective local search heuristic to find the best possible solution to the problem within an acceptable amount of computation time. The performance of the proposed HBMO algorithm is shown by comparing it with three existing task assignment techniques on a large number of randomly generated problem instances. Experimental results indicate that the proposed HBMO algorithm outperforms the competing algorithms.

4.
Peer-to-peer grid computing is an attractive computing paradigm for high-throughput applications. However, both the volatility caused by the autonomy of volunteers (i.e., resource providers) and the heterogeneous properties of volunteers are challenging problems for scheduling. It is therefore necessary to develop a scheduling mechanism that adapts to a dynamic peer-to-peer grid computing environment. In this paper, we propose a Mobile Agent based Adaptive Group Scheduling Mechanism (MAAGSM). The MAAGSM classifies volunteers and constructs volunteer groups, then schedules according to volunteer properties such as volunteer autonomy failures, volunteer availability, and volunteering service time. In addition, the MAAGSM uses mobile agent technology to adaptively apply the scheduling, fault tolerance, and replication algorithms suited to each volunteer group. Furthermore, we demonstrate that the MAAGSM improves performance by evaluating the scheduling mechanism in Korea@Home.
SungJin Choi is a Ph.D. student in the Department of Computer Science and Engineering at Korea University. His research interests include mobile agents, peer-to-peer computing, grid computing, and distributed systems. Mr. Choi received an M.S. in computer science from Korea University. He is a student member of the IEEE. MaengSoon Baik is a senior research member at the SAMSUNG SDS Research & Develop Center. His research interests include mobile agents, grid computing, server virtualization, storage virtualization, and utility computing. Dr. Baik received a Ph.D. in computer science from Korea University. JoonMin Gil is a professor in the Department of Computer Science Education at Catholic University of Daegu, Korea. His recent research interests include grid computing, distributed and parallel computing, Internet computing, P2P networks, and wireless networks. Dr. Gil received his Ph.D. in computer science from Korea University. He is a member of the IEEE and the IEICE. SoonYoung Jung is a professor in the Department of Computer Science Education at Korea University. His research interests include grid computing, web-based education systems, database systems, knowledge management systems, and mobile computing. Dr. Jung received his Ph.D. in computer science from Korea University. ChongSun Hwang is a professor in the Department of Computer Science and Engineering at Korea University. His research interests include distributed systems, distributed algorithms, and mobile computing. Dr. Hwang received a Ph.D. in statistics and computer science from the University of Georgia.

5.
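Gao Lan, Zhao Yuchen, Zhang Weigong, Wang Jing, Qian Depei. Journal of Software (《软件学报》), 2024, 35(2): 1028-1047
Parallel computing has become the mainstream trend. In parallel computing systems, synchronization is one of the key design elements and is critical to fully exploiting hardware performance. In recent years, the GPU (graphics processing unit), as the most widely used accelerator, has developed rapidly, and many applications place higher demands on GPU thread synchronization. However, existing GPU systems struggle to efficiently support the complex thread synchronization found in real applications. Although researchers have proposed many methods to support GPU thread synchronization and have made considerable progress, the GPU's distinctive architecture and parallel model mean that research on GPU thread synchronization still faces many challenges. This survey classifies thread synchronization in GPU parallel programming according to the purpose and granularity of synchronization. On this basis, focusing on how GPU thread synchronization is expressed and executed, it first analyzes and summarizes the key problems and challenges: synchronization is hard to express efficiently, error-prone, and inefficient to execute. Then, organized by synchronization granularity, it reviews recent academic and industrial research on competitive and cooperative synchronization of GPU threads from two perspectives, expression methods and performance optimization methods, and analyzes and summarizes the existing approaches. Finally, it points out future research trends and prospects for GPU thread synchronization and suggests possible research directions as a reference for researchers in the field.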

6.
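This paper analyzes the multithreaded synchronization mechanisms and synchronization objects of the Windows NT operating system. Using the development of a synchronous communication program between a detector and a theodolite as an example, it discusses how to implement synchronous communication between an application and a device driver through a shared event, and presents the implementation principle and the concrete steps for writing the synchronization driver.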

7.
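To improve the efficiency of parallel application systems, this paper studies compressed communication for large sparse matrices. By quantitatively analyzing the relationship among matrix sparsity, network bandwidth, and processor computing power during compressed matrix communication, a formula for the lower bound on sparsity is derived. By analyzing the efficiency the algorithm achieves at different sparsity levels, the functional relationship between sparsity and communication efficiency in compressed communication is summarized. A compressed communication algorithm for sparse matrices is designed and implemented for a reservoir numerical simulation application. The results show that the algorithm clearly improves the efficiency of sparse matrix communication.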

8.
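Hu Jun. Computer Engineering and Design (《计算机工程与设计》), 2007, 28(24): 5921-5923, 5927
Granular computing is a new soft computing paradigm that covers all theories, methods, and techniques related to granularity. This paper proposes a bit representation of information granules, which converts cumbersome set operations into logical operations on binary numbers that are better suited to computers, and on this basis proposes a new attribute reduction algorithm. Experimental results show that the algorithm is more time-efficient than other current algorithms of the same kind.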
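
The idea of replacing set operations with bitwise logic can be illustrated with a minimal sketch. It is not the paper's algorithm: the function names, the five-object universe, and the encoding (bit i marks the i-th object) are assumptions made here for illustration only.

```python
def granule_to_bits(granule, universe):
    """Encode a subset of `universe` as an int whose i-th bit marks universe[i].

    Hypothetical helper: the abstract does not specify the encoding.
    """
    index = {obj: i for i, obj in enumerate(universe)}
    bits = 0
    for obj in granule:
        bits |= 1 << index[obj]
    return bits


def bits_to_granule(bits, universe):
    """Decode a bitmask back into the set of objects it represents."""
    return {obj for i, obj in enumerate(universe) if (bits >> i) & 1}


def is_subset(x, y):
    """x is a subset of y exactly when x has no bit set outside y."""
    return (x & ~y) == 0


# Illustrative universe and granules (made-up data).
universe = ["x1", "x2", "x3", "x4", "x5"]
a = granule_to_bits({"x1", "x2", "x4"}, universe)
b = granule_to_bits({"x2", "x4", "x5"}, universe)

intersection = a & b   # set intersection becomes bitwise AND
union = a | b          # set union becomes bitwise OR
difference = a & ~b    # set difference becomes AND with the complement
```

With this encoding, each set operation on granules is a single integer instruction per machine word, which is the kind of saving the abstract attributes to the bit representation.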

9.
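Parallel task partitioning has long been a research focus in high-performance computing. For a cloud environment used to process seismic data, this paper takes a task running-time estimation model as the objective function and proposes an improved particle swarm optimization algorithm to solve the seismic data task partitioning problem. Simulation experiments show that the improved algorithm strengthens global search capability, improves convergence speed and accuracy, and effectively improves task execution efficiency in the cloud environment.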

10.
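Adaptive harmony search algorithm and its application to rough set attribute reduction   Total citations: 1, self-citations: 0, citations by others: 1
To address the shortcomings of the improved harmony search (IHS) algorithm, an adaptive harmony search (AHS) algorithm is proposed. The algorithm uses the maximum difference of the variable function values in the harmony memory to adjust PAR and bw, thereby improving search efficiency on multidimensional problems. The AHS algorithm is tested on five standard benchmark functions and applied to attribute reduction in rough sets. Simulation results demonstrate the effectiveness and practicality of the algorithm.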

11.
In computer vision, moving object detection and tracking are the most important preliminary steps for higher-level video analysis applications. In this context, background subtraction (BS) is a well-known video processing method based on frame differencing. The basic idea is to subtract the current frame from a background image and to classify each pixel as foreground or background by comparing the difference with a threshold. The moving object is thus detected and tracked by frame differencing and by learning an updated background model. Simulated annealing (SA) is an optimization technique used in soft computing within artificial intelligence. The p-median problem is a basic model of discrete location theory in operational research (OR) and is an NP-hard combinatorial optimization problem. Its main aim is to find p facility locations that minimize the total weighted distance between demand points (nodes) and their closest facilities. SA is used as a probabilistic metaheuristic to solve the p-median problem. In this paper, an SA-based hybrid method called entropy-based SA (EbSA) is developed to optimize the performance of BS, which is used to detect and track object(s) in videos. The SA modification to the BS method (SA-BS) is proposed to determine the optimal threshold for foreground-background (i.e., bi-level) segmentation and to learn the background model for object detection. At these segmentation and learning stages, all of the optimization problems considered in this study are treated as p-median problems. The performance of SA-BS and regular BS is measured on four video clips, and the results are evaluated quantitatively as the overall results of the given method.
The performance results and statistical analysis (i.e., the Wilcoxon median test) show that the proposed method is preferable to the regular BS method. The contribution of this study is also discussed.
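
The basic background subtraction step described above (difference against a background image, then a per-pixel threshold) can be sketched as follows. This is only the plain BS baseline, not the SA-BS or EbSA method: the fixed `threshold`, the running-average update with learning rate `alpha`, and the function names are assumptions introduced here, whereas the paper selects the threshold by optimization.

```python
def subtract_background(frame, background, threshold):
    """Return a binary mask: 1 where |frame - background| exceeds threshold.

    `frame` and `background` are equally sized lists of rows of grayscale
    intensities. The threshold is a fixed number here (an assumption).
    """
    return [
        [1 if abs(p - b) > threshold else 0 for p, b in zip(frow, brow)]
        for frow, brow in zip(frame, background)
    ]


def update_background(frame, background, alpha=0.05):
    """Blend the current frame into the background model.

    A simple running average, used here as a stand-in for the background
    learning stage; `alpha` is a hypothetical learning rate.
    """
    return [
        [(1 - alpha) * b + alpha * p for p, b in zip(frow, brow)]
        for frow, brow in zip(frame, background)
    ]


# Made-up 2x2 grayscale example: one pixel changes sharply.
background = [[10, 10], [10, 10]]
frame = [[12, 200], [10, 9]]

mask = subtract_background(frame, background, threshold=30)
new_background = update_background(frame, background, alpha=0.5)
```

In this toy frame only the pixel that jumps from 10 to 200 is marked as foreground; small fluctuations stay below the threshold, which is why the choice of threshold matters for segmentation quality.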

12.
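Existing task offloading strategies usually make offloading decisions within a single time slot and ignore the intrinsic relationships among multiple offloading slots, so they cannot offload tasks according to their actual needs. To address this, a deep reinforcement learning based strategy that lets tasks apply for offloading a second time (DQN-TSAO) is proposed. First, a three-tier cloud-edge-device architecture that supports a second offloading application is proposed, and a task offloading priority model, a delay model, and an energy consumption model are built. Then, with the goal of minimizing system energy consumption, the energy optimization problem is recast as a Markov decision process that maximizes the cumulative offloading reward. Finally, the DQN-TSAO algorithm extracts task offloading features in each time slot, so that tasks obtain the best offloading decisions across multiple slots through continual interaction with the environment. Simulation results show that the DQN-TSAO algorithm effectively reduces total system energy consumption over a period of time.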

13.
Cloud computing is becoming a profitable technology because it offers cost-effective IT solutions globally. A well-designed task scheduling algorithm ensures optimal utilization of cloud resources and dynamically reduces execution time. This research article deals with the scheduling of inter-dependent subtasks on unrelated parallel computing machines in a cloud computing environment. It considers two variants of the problem, based on two different objective functions: the first minimizes the total completion time, while the second minimizes the makespan. Heuristic and meta-heuristic (HEART) based algorithms are proposed to solve the task scheduling problems. These algorithms exploit the properties of the list scheduling algorithm for the unrelated parallel machine scheduling problem. A mixed integer linear programming (MILP) formulation is provided for both variants of the problem, and the optimal solution is obtained by solving the MILP formulation using A Mathematical Programming Language (AMPL) software. Extensive numerical experiments have been performed to evaluate the performance of the proposed algorithms. The solutions obtained by the proposed algorithms are found to outperform those of existing algorithms. The proposed algorithms can be used by cloud computing service providers (CCSPs) to improve resource utilization and reduce operating cost.
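
To make the setting concrete, here is a minimal sketch of greedy list scheduling for dependent tasks on unrelated machines. It is not the HEART algorithm or the MILP model from the article: the earliest-finish-time rule, the function name `list_schedule`, and the three-task instance are assumptions chosen for illustration, and the priority list is assumed to already respect the precedence constraints.

```python
def list_schedule(order, proc_time, preds):
    """Assign each task, in list order, to the machine that finishes it earliest.

    order: task ids in priority order, assumed to be a valid topological order.
    proc_time[task][m]: processing time of `task` on machine m (unrelated
        machines, so times may differ arbitrarily between machines).
    preds[task]: tasks that must finish before `task` may start.
    Returns (assignment, finish) dictionaries keyed by task id.
    """
    n_machines = len(next(iter(proc_time.values())))
    machine_free = [0] * n_machines  # time at which each machine becomes idle
    assignment, finish = {}, {}
    for task in order:
        # A task is ready once all of its predecessors have finished.
        ready = max((finish[p] for p in preds.get(task, ())), default=0)
        best = min(
            range(n_machines),
            key=lambda m: max(machine_free[m], ready) + proc_time[task][m],
        )
        start = max(machine_free[best], ready)
        finish[task] = start + proc_time[task][best]
        machine_free[best] = finish[task]
        assignment[task] = best
    return assignment, finish


# Made-up instance: task "c" depends on "a" and "b"; two machines.
proc_time = {"a": [2, 4], "b": [3, 1], "c": [2, 2]}
preds = {"c": ["a", "b"]}
assignment, finish = list_schedule(["a", "b", "c"], proc_time, preds)

makespan = max(finish.values())                # objective of the second variant
total_completion_time = sum(finish.values())   # objective of the first variant
```

The two objectives named in the abstract are both read off the same schedule: the makespan is the latest finish time, and the total completion time is the sum of all finish times.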

14.
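Cai Yong, Li Sheng. Journal of Computer Applications (《计算机应用》), 2016, 36(3): 628-632
To address the high hardware cost and low development efficiency of traditional parallel computing methods for fast structural topology optimization, a full-process parallel computing strategy for the bi-directional evolutionary structural optimization (BESO) method based on Matlab and the graphics processing unit (GPU) is proposed. First, the advantages, disadvantages, and applicability of three ways of implementing GPU parallel computing in the Matlab programming environment are discussed. Second, built-in functions with direct parallelism are used to parallelize the vector and dense matrix computations in the topology optimization algorithm; MEX functions calling the CUSOLVER library are used to solve the sparse finite element equations quickly; and parallel thread execution (PTX) code is used to parallelize optimization decisions such as element sensitivity analysis. Numerical examples show that developing GPU parallel programs directly in Matlab not only gives high programming efficiency but also avoids differences in numerical precision between programming languages, so that the GPU parallel program achieves a considerable speedup while leaving the computed results unchanged.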

15.
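To address the poor performance of current dimensionality reduction algorithms caused by the distribution of samples in high-dimensional space, an adaptively weighted t-distributed stochastic neighbor embedding (t-SNE) algorithm is proposed. The algorithm normalizes the Euclidean distance between two sample points in high-dimensional space and groups the distances by their distribution. When computing the similarity probability between sample points in high-dimensional space, it applies adaptive weighting separately for three cases (near, fairly near, and far), replacing the absolute Euclidean distance with a weighted relative distance so as to measure more faithfully the similarity of each group of samples in high-dimensional space. Dimensionality reduction experiments on high-dimensional brain network state observation matrices show that the clustering visualization of adaptively weighted t-SNE is better than that of other dimensionality reduction algorithms. Compared with traditional t-SNE, the DBI clustering index decreased by 28.39% on average and the DI index increased by 161.84% on average, and problems such as dispersion, overlap, and scattered points were effectively eliminated.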

16.
Boiler combustion optimization is a key measure for improving the energy efficiency and reducing the pollutant emissions of power units. However, the time variability of boiler combustion systems and the lack of adaptive regression models pose great challenges for applying boiler combustion optimization. A recent approach to these issues is to model boiler combustion systems with the least squares support vector machine (LS-SVM), a computationally attractive machine learning technique with a fairly transparent training process and topological structure. In this paper, we propose an adaptive algorithm for the LS-SVM model, namely the adaptive least squares support vector machine (ALS-SVM), with the aim of developing an adaptive boiler combustion model. The fundamental mechanism of the proposed algorithm is introduced first, followed by a detailed discussion of its key functional components, including online updating of model parameters. A case study using a time-varying nonlinear function is then provided for model validation; the results illustrate that adaptive LS-SVM models fit the varying characteristics accurately after being updated with the ALS-SVM method. Building on this introduction and case study, the potential of applying the proposed ALS-SVM method in a boiler combustion optimization system is discussed, and a real-life fossil fuel power plant is used to demonstrate its feasibility. Results show that the proposed adaptive model with the ALS-SVM method is able to track the time-varying characteristics of a boiler combustion system.

17.
A new dynamic clustering approach (DCPSO), based on particle swarm optimization, is proposed. This approach is applied to image segmentation. The proposed approach automatically determines the “optimum” number of clusters and simultaneously clusters the data set with minimal user interference. The algorithm starts by partitioning the data set into a relatively large number of clusters to reduce the effects of initial conditions. Using binary particle swarm optimization, the “best” number of clusters is selected. The centers of the chosen clusters are then refined via the K-means clustering algorithm. The proposed approach was applied to both synthetic and natural images. The experiments conducted show that the proposed approach generally found the “optimum” number of clusters on the tested images. Genetic algorithm and random search versions of dynamic clustering are presented and compared to the particle swarm version.

18.
This paper presents a comparative analysis of three versions of an evolutionary algorithm in which the decision maker's preferences are incorporated using an outranking relation and preference parameters associated with the ELECTRE TRI method. The aim is to use the preference information supplied by the decision maker to guide the search toward the regions where solutions more in accordance with his/her preferences are located, thus narrowing the scope of the search and reducing the computational effort. An example dealing with a pertinent problem in electrical distribution networks is used to compare the different versions of the algorithm and to illustrate how meaningful information can be elicited from a decision maker and used in the operational framework of an evolutionary algorithm to provide decision support in real-world problems.

19.
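The principal component vectors obtained by principal component analysis (PCA) are not sparse enough and contain many nonzero elements. To address this, a reweighting method is used to optimize PCA, and a new method for extracting features from high-dimensional data, reweighted sparse principal component analysis (RSPCA), is proposed. First, the reweighted l1 optimization framework and the LASSO regression model are introduced into the mathematical model of PCA to build a new dimensionality reduction model. Then, the model is solved using alternating minimization, singular value decomposition, least angle regression, and related methods. Finally, the algorithm is validated in face recognition experiments, in which K-fold cross-validation is used on the ORL face dataset with both PCA and RSPCA. The results show that RSPCA, while obtaining sparser solutions, performs no worse than PCA, with an average recognition accuracy of 95.1%; compared with the best-performing sPCA-rSVD algorithm, its recognition accuracy is 6.2 percentage points higher. Applied to the practical task of handwritten digit recognition, it achieves an average recognition accuracy of 96.4%. This demonstrates the strength of the proposed method in face recognition and handwritten digit recognition.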

20.
The purpose of this paper is to investigate the relationship between adverse events and infrastructure development investments in an active war theater by using soft computing techniques, including fuzzy inference systems (FIS), artificial neural networks (ANNs), and adaptive neuro-fuzzy inference systems (ANFIS), where accurate predictions are directly beneficial from an economic and humanistic point of view. Fourteen developmental and economic improvement projects were selected as independent variables. A total of four outputs reflecting the adverse events were estimated: the number of people killed, wounded, or hijacked, and the total number of adverse events. The results obtained from analysis and testing demonstrate that ANN, FIS, and ANFIS are useful modeling techniques for predicting the number of adverse events based on historical development or economic project data. When model accuracy was calculated using the mean absolute percentage error (MAPE) for each model, ANN had better predictive accuracy than the FIS and ANFIS models, as demonstrated by the experimental results. For the purpose of allocating resources and developing regions, the results can be summarized by examining the relationship between adverse events and infrastructure development in an active war theater, with emphasis on predicting the occurrence of events. We conclude that the importance of infrastructure development projects varied with the specific region and time period.
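
For reference, the MAPE used above to compare the models is the mean of the absolute relative errors, expressed as a percentage. The sketch below uses the standard definition; the function name and the sample numbers are made up for illustration and are not data from the study.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Standard definition: 100/n * sum(|(a_i - p_i) / a_i|).
    It is undefined when any actual value is zero, so that case is rejected.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Illustrative values only (not taken from the paper).
error = mape([100, 200], [110, 180])
```

Here both predictions are off by 10% of the actual value, so the MAPE is 10%. A lower MAPE means better predictive accuracy, which is the sense in which the abstract ranks ANN above FIS and ANFIS.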
