Similar literature
 Found 20 similar articles (search time: 15 ms)
1.
In this paper, a programming model is presented which enables scalable parallel performance on multi-core shared memory architectures. The model has been developed for application to a wide range of numerical simulation problems. Such problems involve time stepping or iteration algorithms where synchronization of multiple threads of execution is required. It is shown that traditional approaches to parallelism including message passing and scatter-gather can be improved upon in terms of speed-up and memory management. Using spatial decomposition to create orthogonal computational tasks, a new task management algorithm called H-Dispatch is developed. This algorithm makes efficient use of memory resources by limiting the need for garbage collection and takes optimal advantage of multiple cores by employing a “hungry” pull strategy. The technique is demonstrated on a simple finite difference solver and results are compared to traditional MPI and scatter-gather approaches. The H-Dispatch approach achieves near linear speed-up with results for efficiency of 85% on a 24-core machine. It is noted that the H-Dispatch algorithm is quite general and can be applied to a wide class of computational tasks on heterogeneous architectures involving multi-core and GPGPU hardware.
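The "hungry" pull strategy mentioned in this abstract can be illustrated with a generic work queue from which idle workers pull their next task. The sketch below is not the H-Dispatch algorithm itself, whose details the abstract does not give; the names `run_pull_dispatch` and `relax_block` are hypothetical, and the per-block averaging sweep merely stands in for a finite difference kernel. In standard CPython builds with the global interpreter lock, threads do not run CPU-bound Python code in parallel, so the sketch shows the scheduling pattern only.

```python
import queue
import threading


def run_pull_dispatch(tasks, worker_fn, num_workers=4):
    """Apply worker_fn to every task; each idle worker pulls its next task."""
    tasks = list(tasks)
    pending = queue.Queue()
    for index, task in enumerate(tasks):
        pending.put((index, task))

    results = [None] * len(tasks)

    def worker():
        while True:
            try:
                index, task = pending.get_nowait()
            except queue.Empty:
                return  # no work left, so this worker stops
            results[index] = worker_fn(task)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()  # synchronization point, e.g. the end of one time step
    return results


def relax_block(block):
    """Hypothetical task: one averaging sweep over the interior of a 1D block."""
    return [(block[i - 1] + block[i + 1]) / 2.0 for i in range(1, len(block) - 1)]
```

Because workers take tasks only when they are free, a slow block does not hold up blocks assigned to other workers, which is the load-balancing benefit usually attributed to pull-based dispatch.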

2.
Tiled multi-core architectures have become an important kind of multi-core design because of their good scalability and low power consumption. Stream programming has been productively applied to a number of important application domains. It provides an attractive way to exploit parallelism. However, the architectural characteristics of tiled multi-cores (large numbers of cores, a memory hierarchy, and exposed communication between tiles) present a performance challenge for stream programs. In this paper, we present StreamTMC, an efficient stream compilation framework that optimizes the execution of stream applications for the tiled multi-core. This framework is composed of three optimization phases. First, a software pipelining schedule is constructed to exploit the parallelism. Second, an efficient hybrid SPM and cache buffer allocation algorithm and a data copy elimination mechanism are proposed to improve the efficiency of data access. Last, a communication-aware mapping is proposed to reduce the network communication and synchronization overhead. We implement the StreamTMC compiler on Godson-T, a 64-core tiled architecture, and conduct an experimental study to verify its effectiveness. The experimental results indicate that StreamTMC can achieve an average of 58% improvement over the performance before optimization.

3.
The computation of eigenvalues and eigenvectors of symmetric tridiagonal matrices arises frequently in applications, often as one of the steps in the solution of Hermitian and symmetric eigenproblems. While several accurate and efficient methods for the tridiagonal eigenproblem exist, their corresponding implementations usually target uni-processors or large distributed memory systems. Our new eigensolver MR3-SMP is instead specifically designed for multi-core and many-core general purpose processors, which today have effectively replaced uni-processors. We show that in most cases MR3-SMP is faster and achieves better speedups than state-of-the-art eigensolvers for uni-processors and distributed-memory systems.

4.
5.
Frequent pattern mining (FPM) is an important data mining paradigm to extract informative patterns like itemsets, sequences, trees, and graphs. However, no practical framework for integrating the FPM tasks has been attempted. In this paper, we describe the design and implementation of the Data Mining Template Library (DMTL) for FPM. DMTL utilizes a generic data mining approach, where all aspects of mining are controlled via a set of properties. It uses a novel pattern property hierarchy to define and mine different pattern types. This property hierarchy can be thought of as a systematic characterization of the pattern space, i.e., a meta-pattern specification that allows the analyst to specify new pattern types by extending this hierarchy. Furthermore, in DMTL all aspects of mining are controlled by a set of different mining properties. For example, the kind of mining approach to use, the kind of data types and formats to mine over, and the kind of back-end storage manager to use are all specified as a list of properties. This provides tremendous flexibility to customize the toolkit for various applications. Flexibility of the toolkit is exemplified by the ease with which support for a new pattern can be added. Experiments on synthetic and public datasets are conducted to demonstrate the scalability provided by the persistent back-end in the library. DMTL has been publicly released as open-source software, and has been downloaded by numerous researchers from all over the world.

6.
In other industries, the idea of building corporate culture by establishing a common level of “best practice” is widely known and used. The architecture concept directly supports this goal for our industry and can help us improve problem areas dominated by organizational and social issues, such as health care organizations, educational systems, and so on. Our proposed reference model for architecture specification and development is organized around a set of aspects that structure concepts and rules; these, in turn, specify a conceptual architecture. We have added principles and guidelines to the concepts and rules to give a more complete picture of the architecture and to provide a place to store and communicate successfully applied design patterns and other knowledge related to the architecture. Adding architectural elements is a step toward a more constructive type of architecture representation. Our current research is focused on further refining these concepts and developing a formal specification of the architecture reference model. We are continuing to test our ideas in case studies, such as applying our model to the OSCA architecture and the application machine concept. We are also developing a prototype architecture editor, and we are testing different tools to learn more about integrating them into a real infrastructure and to learn what typical services an infrastructure must provide.

7.
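Closed sequential pattern mining algorithm   Total citations: 3, self-citations: 1, citations by others: 2
A new algorithm, PosD, is proposed for mining closed sequential patterns. The algorithm uses position data to preserve the order information of items and, based on position data lists that record the order relationships among items, introduces two pruning methods: backward super-pattern and same position data. To ensure that the lattice storage is correct and compact, some special cases are also handled. Experimental results show that on medium-to-large databases and with small support thresholds, the algorithm is more efficient than the CloSpan algorithm.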

8.
Scientific progress in recent years has led to the generation of huge amounts of biological data, most of which remains unanalyzed. Mining the data may provide insights into various realms of biology, such as finding co-occurring biosequences, which are essential for biological data mining and analysis. Data mining techniques like sequential pattern mining may reveal implicitly meaningful patterns among the DNA or protein sequences. If biologists hope to unlock the potential of sequential pattern mining in their field, it is necessary to move away from traditional sequential pattern mining algorithms, because they have difficulty handling a small number of items and long sequences in biological data, such as gene and protein sequences. To address the problem, we propose an approach called Depth-First SPelling (DFSP) algorithm for mining sequential patterns in biological sequences. The algorithm’s processing speed is faster than that of PrefixSpan, its leading competitor, and it is superior to other sequential pattern mining algorithms for biological sequences.

9.
Pattern Analysis and Applications - Frequent pattern (itemset) mining is one of the established approaches for knowledge discovery. Minimizing the number of database scans (I/O overhead) is a...

10.
11.
Real space teems with potential feature patterns with instances that frequently appear in the same locations. As a member of the data-mining family, co-location can effectively find such feature patterns in space. However, given the constant expansion of data, efficiency and storage problems become difficult issues to address. Here, we propose a maximal-framework algorithm based on two improved strategies. First, we adopt a degeneracy-based maximal clique mining method to yield candidate maximal co-locations to achieve high-speed performance. Motivated by graph theory with parameterized complexity, we regard the prevalent size-2 co-locations as a sparse undirected graph and subsequently find all maximal cliques in this graph. Second, we introduce a hierarchical verification approach to construct a condensed instance tree for storing large instance cliques. This strategy further reduces computing and storage complexities. We use both synthetic and real facility data to compare the computational time and storage requirements of our algorithm with those of two other competitive maximal algorithms: “order-clique-based” and “MAXColoc”. The results show that our algorithm is both more efficient and requires less storage space than the other two algorithms.
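The first strategy in this abstract, treating prevalent size-2 co-locations as edges of a sparse undirected graph and enumerating its maximal cliques, can be sketched with the well-known Bron-Kerbosch algorithm with pivoting, started from a degeneracy ordering of the vertices. This is an assumption about what "degeneracy-based" refers to, not the authors' implementation; the function names are hypothetical, the graph is assumed to have no self-loops, and the prevalence checks and condensed instance tree of the paper are not covered. The ordering routine is written for clarity rather than linear-time performance.

```python
def degeneracy_order(adj):
    """Order vertices by repeatedly removing a vertex of minimum remaining degree."""
    degrees = {v: len(neighbors) for v, neighbors in adj.items()}
    remaining = set(adj)
    order = []
    while remaining:
        v = min(remaining, key=lambda u: (degrees[u], u))
        order.append(v)
        remaining.remove(v)
        for w in adj[v]:
            if w in remaining:
                degrees[w] -= 1
    return order


def bron_kerbosch_pivot(adj, r, p, x, out):
    """Report every maximal clique that extends r using candidates p, excluding x."""
    if not p and not x:
        out.append(frozenset(r))
        return
    # Pivot on the vertex with the most neighbors in p to prune branches.
    pivot = max(p | x, key=lambda u: len(adj[u] & p))
    for v in list(p - adj[pivot]):
        bron_kerbosch_pivot(adj, r | {v}, p & adj[v], x & adj[v], out)
        p = p - {v}
        x = x | {v}


def maximal_cliques(edges):
    """Return all maximal cliques of the undirected graph given as edge pairs."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    order = degeneracy_order(adj)
    position = {v: i for i, v in enumerate(order)}
    out = []
    for v in order:
        later = {w for w in adj[v] if position[w] > position[v]}
        earlier = adj[v] - later
        bron_kerbosch_pivot(adj, {v}, later, earlier, out)
    return out
```

For example, with feature types `A` to `D` and prevalent pairs `A-B`, `A-C`, `B-C`, and `C-D`, the candidate maximal co-locations are `{A, B, C}` and `{C, D}`.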

12.
Since Agrawal and Srikant proposed sequential pattern mining in 1995, many scholars have worked to improve the efficiency and reduce the processing time of the algorithms. This study proposes a fuzzy AprioriSome algorithm for fuzzy sequential pattern mining, integrated with a clustering technique, the K-means algorithm. Two experiments, performed using transaction data provided by a securities firm and foodmarket data from SQL Server 2000, demonstrate the strength of fuzzy AprioriSome sequential pattern mining in mining large quantities of transaction data.

13.
In this paper we show that frequent closed itemset mining and biclustering, the two most prominent application fields in pattern discovery, can be reduced to the same problem when dealing with binary (0–1) data. FCPMiner, a new powerful pattern mining method, is then introduced to mine such data efficiently. The uniqueness of the proposed method is its extendibility to non-binary data. The mining method is coupled with a novel visualization technique and a pattern aggregation method to detect the most meaningful, non-overlapping patterns. The proposed methods are rigorously tested on both synthetic and real data sets.

14.
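By introducing the concept of page level, taking full account of the time users spend browsing each page and the browsing preferences they show in path selection, and combining these with the hierarchical structure of the Web site, an improved algorithm for mining Web users' preferred browsing patterns is proposed. Concrete examples and experimental data show that the new model finds users' preferred browsing patterns more accurately and thereby reveals users' interests and hobbies.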

15.
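《微型机与应用》 (Microcomputer & Its Applications), 2016, (22): 22-25
Because high-utility pattern mining is relatively complex, improving the efficiency of its mining algorithms is a research hotspot in data mining. HUP-miner is a typical high-utility pattern mining algorithm based on the vertical pattern class; although it effectively reduces the total number of utility lists, partitioning the itemsets means that the utility lists require more space. To address this problem, an improved algorithm, IHUI-miner, is proposed on the basis of the HUI-miner algorithm; it takes full account of the associations among itemsets in the 1-extension set and reduces the number of utility lists. Experimental results show that IHUI-miner outperforms both HUP-miner and HUI-miner in time efficiency and in reducing the number of utility lists.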

16.
The practical relevance of process mining is increasing as more and more event data become available. Process mining techniques aim to discover, monitor and improve real processes by extracting knowledge from event logs. The two most prominent process mining tasks are: (i) process discovery: learning a process model from example behavior recorded in an event log, and (ii) conformance checking: diagnosing and quantifying discrepancies between observed behavior and modeled behavior. The increasing volume of event data provides both opportunities and challenges for process mining. Existing process mining techniques have problems dealing with large event logs referring to many different activities. Therefore, we propose a generic approach to decompose process mining problems. The decomposition approach is generic and can be combined with different existing process discovery and conformance checking techniques. It is possible to split computationally challenging process mining problems into many smaller problems that can be analyzed easily and whose results can be combined into solutions for the original problems.

17.
18.
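Task scheduling algorithm for multi-core processor parallel systems   Total citations: 6, self-citations: 0, citations by others: 6
Based on the characteristics of multi-core processor parallel systems, a corresponding task scheduling algorithm is proposed. The algorithm adds a task allocation step before task scheduling; reasonable task allocation effectively reduces the communication overhead among processors and makes task scheduling more efficient. The algorithm was implemented in simulation, and experimental data demonstrate its advantages.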

19.
High temperature affects the stability and performance of multi-core processors. A temperature-aware scheduling algorithm for soft real-time multi-core systems, namely LTCEDF (Low Thermal Contribution Early Deadline First), is proposed in this paper. According to the core temperature and thread thermal contribution, LTCEDF performs thread migration and exchange to avoid thermal saturation and to keep temperature equilibrium among all the cores. The core temperature calculation method and the thread thermal contribution prediction method are presented. LTCEDF is simulated on the ATMI simulator platform. Simulation results show that LTCEDF can not only minimize the thermal penalty, but also meet real-time guarantees. Moreover, it can create a more uniform power density map than other thermal-aware algorithms, and significantly reduce thread migration frequency.

20.

We describe a novel, systematic approach to efficiently parallelizing data mining algorithms: starting with the representation of an algorithm as a sequential composition of functions, we formally transform it into a parallel form using higher-order functions for specifying parallelism. We implement the approach as an extension of the industrial-strength Java-based library Xelopes, and we illustrate its use by developing a multi-threaded Java program for the popular naive Bayes classification algorithm. In comparison with the popular MapReduce programming model, our resulting programs enable not only data-parallel, but also task-parallel implementation and a combination of both. Our experiments demonstrate an efficient parallelization and good scalability on multi-core processors.
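The idea of expressing a data mining algorithm as a composition of functions and then parallelizing it with higher-order functions can be sketched for the counting phase of naive Bayes training: a map step counts each chunk of rows independently, and a reduce step merges the partial counts. This sketch is written in Python rather than the Java used in the paper, does not use the Xelopes API, and covers only the data-parallel case; the names `count_chunk`, `merge_counts`, and `train_counts` are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def count_chunk(chunk):
    """Map step: count class labels and (label, feature index, value) triples."""
    counts = Counter()
    for features, label in chunk:
        counts[("class", label)] += 1
        for index, value in enumerate(features):
            counts[(label, index, value)] += 1
    return counts


def merge_counts(left, right):
    """Reduce step: add two partial count tables."""
    return left + right


def train_counts(rows, num_chunks=2):
    """Split rows into chunks, count each chunk in a worker, then merge."""
    chunks = [rows[i::num_chunks] for i in range(num_chunks)]
    with ThreadPoolExecutor(max_workers=num_chunks) as pool:
        partials = list(pool.map(count_chunk, chunks))
    return reduce(merge_counts, partials, Counter())
```

Because merging counts is associative and commutative, the chunks can be processed in any order and on any number of workers without changing the result; the class priors and conditional probabilities of naive Bayes are then simple ratios of the merged counts.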

