Similar documents
 Found 20 similar documents; search took 15 ms
1.
This paper describes a number of optimizations that can be used to support the efficient execution of irregular problems on distributed memory parallel machines. These primitives (1) coordinate interprocessor data movement, (2) manage the storage of, and access to, copies of off-processor data, (3) minimize interprocessor communication requirements, and (4) support a shared name space. We present a detailed performance and scalability analysis of the communication primitives. This performance and scalability analysis is carried out using a workload generator, kernels from real applications, and a large unstructured adaptive application (the molecular dynamics code CHARMM).

2.
Falk G., McQuillan J.M. 《Computer》 1977, 10(11): 22-29
The increasing use of computer data communications over the past several years has spawned a variety of network architectures to support requirements for distributed processing. Developed by various R&D groups,[1-3] by the common carriers,[4-6] by minicomputer and mainframe manufacturers,[7,8] and by the vendors of traditional communications hardware,[9,10] these new architectures represent alternative means to similar ends. This article provides a framework for understanding existing and forthcoming systems, focusing particular attention on the impact of evolving requirements and technologies.

3.
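This paper introduces a data flow analysis technique, the function-information analysis method. The method makes clear that function and information are the two basic elements of the logical model of a real-world data-processing system, and it stresses that top-down, level-by-level decomposition of a system should be grounded in the analysis of function and information. Finally, the paper applies the function-information analysis method to a concrete system-analysis example.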

4.
A longest common subsequence (LCS) of two strings is a common subsequence of the two strings of maximal length. The LCS problem is to find an LCS of two given strings and the length of the LCS (LLCS). In this paper, we present a new linear processor array for solving the LCS problem. The array is based on parallelization of a recent LCS algorithm which consists of two phases, i.e., preprocessing and computation. The computation phase is based on a bit-level dynamic programming approach. Implementations of the preprocessing and computation phases are discussed on the same processor array architecture for the LCS problem. Further, we propose a block processor array architecture which reduces the overall communication and time requirements. Finally, we develop a performance model for estimating the performance of the processor array architecture on Pentium processors.
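As an illustration of the bit-level dynamic programming mentioned in this abstract, the sketch below computes the LLCS sequentially with one commonly used bit-vector recurrence. Whether this matches the paper's computation phase is an assumption, since the abstract does not give the algorithm. The names `build_masks` and `llcs_bitparallel` are hypothetical, and the preprocessing phase is modeled simply as building one match mask per character.

```python
def build_masks(a):
    """Preprocessing: for each character, a bit mask of its positions in a."""
    masks = {}
    for i, ch in enumerate(a):
        masks[ch] = masks.get(ch, 0) | (1 << i)
    return masks


def llcs_bitparallel(a, b):
    """Length of a longest common subsequence of a and b (sequential sketch)."""
    m = len(a)
    full = (1 << m) - 1          # keeps the bit vector to m bits
    masks = build_masks(a)
    v = full                     # all ones: no matches recorded yet
    for ch in b:
        u = v & masks.get(ch, 0)
        # One update per character of b; the number of zero bits in v
        # equals the LLCS of a and the prefix of b processed so far.
        v = ((v + u) | (v - u)) & full
    return m - bin(v).count("1")


if __name__ == "__main__":
    print(llcs_bitparallel("ABCBDAB", "BDCABA"))  # prints 4
```

Each character of the second string costs a constant number of word-level operations per machine word of the first string, which is the kind of regularity that lends itself to a processor array.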

5.
We present a semantics-based technique for analysing probabilistic properties of imperative programs. This consists in a probabilistic version of classical data flow analysis. We apply this technique to pWhile programs, i.e., programs written in a probabilistic version of a simple While language. As a first step we introduce a syntax-based definition of a linear operator semantics (LOS) which is equivalent to the standard structural operational semantics of While. The LOS of a pWhile program can be seen as the generator of a Discrete Time Markov Chain and plays a role similar to that of a collecting or trace semantics for classical While. Probabilistic Abstract Interpretation techniques are then employed in order to define data flow analyses for properties like Parity and Live Variables.

6.
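Checking Program Information-Flow Security with Data Flow Analysis   Total citations: 2, self-citations: 0, citations by others: 2
Program information-flow security is an important research direction in information security. Type-based analysis is an effective way to check it, but it is overly conservative. This paper attempts to apply traditional data flow analysis to the check: data flow analysis tracks the security dependencies among program data in order to verify the program's information-flow security. Compared with type-based methods, data flow analysis analyzes programs more precisely and is more permissive. Finally, the paper proves the soundness of the data flow analysis method.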

7.
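Data Flow Analysis Methods   Total citations: 4, self-citations: 2, citations by others: 4
Data flow analysis is a compile-time technique that gathers semantic information about a program from its code and uses algebraic methods to determine, at compile time, where variables are defined and used. This paper describes data flows, data flow frameworks, and data flow algorithms, and briefly introduces the proposed method of demand-driven interprocedural data flow analysis.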
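To make the idea of a data flow algorithm concrete, here is a minimal sketch of a textbook iterative live-variable analysis on a small control-flow graph. It is not the demand-driven interprocedural method the abstract refers to. The function name `live_variables` and the four-node example graph are assumptions made for illustration.

```python
def live_variables(succ, use, defs):
    """Iterate the live-variable equations to a fixed point.

    succ: node -> list of successor nodes
    use:  node -> set of variables read before any write in the node
    defs: node -> set of variables written in the node
    """
    live_in = {n: set() for n in succ}
    live_out = {n: set() for n in succ}
    changed = True
    while changed:
        changed = False
        for n in succ:
            # out[n] is the union of in[s] over all successors s of n.
            out = set().union(*(live_in[s] for s in succ[n]))
            # in[n] = use[n] plus whatever is live after n and not redefined.
            inn = use[n] | (out - defs[n])
            if out != live_out[n] or inn != live_in[n]:
                live_out[n], live_in[n] = out, inn
                changed = True
    return live_in, live_out


# Hypothetical program:
#   1: x = 1
#   2: y = x + 1
#   3: if y < 10 goto 2
#   4: return y
succ = {1: [2], 2: [3], 3: [2, 4], 4: []}
use = {1: set(), 2: {"x"}, 3: {"y"}, 4: {"y"}}
defs = {1: {"x"}, 2: {"y"}, 3: set(), 4: set()}

live_in, live_out = live_variables(succ, use, defs)
```

In this example `x` stays live across node 3 because the back edge to node 2 reads it again, which is the kind of definition-and-use fact the abstract describes.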

8.
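Distributed disk arrays are very important for improving the reliability, bandwidth, and capacity of data storage. This paper introduces two ways of connecting distributed disk arrays (disks attached in a distributed fashion to computers, and disk arrays attached to the network) and two redundancy strategies used in distributed disk arrays, chained declustering and RAID-x.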

9.
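A Data Partitioning Scheme Based on Pointer Arrays   Total citations: 1, self-citations: 0, citations by others: 1
Data partitioning is a key technique for parallelizing compilers on distributed-memory systems. It takes arrays and the nested loops that contain them as its objects of study, with the fundamental goals of improving data locality and exploiting computational parallelism. Traditional data partitioning schemes are ill-suited to arrays of pointers that point to arrays. This paper proposes a partitioning scheme for such pointer arrays, referred to in the paper as data partitioning of array vectors. It analyzes the characteristics of their data references and, by selecting representative elements, gives a partitioning strategy, filling a gap in existing data partitioning research.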

10.
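Yang Xu (杨旭), He Hu (何虎), Sun Yihe (孙义和). 《计算机学报》 (Chinese Journal of Computers) 2011, 34(1): 182-192
Application demands require today's processors to exploit as much of the instruction-level parallelism present in programs as possible. However, hardware and instruction scheduling techniques for high instruction-level parallelism put enormous pressure on register resources. With a single register file, it is very difficult to sustain both high instruction-level parallelism and a high clock frequency, because when the instruction-level parallelism is high enough, the limit on the number of register-file access ports in a single register file makes...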

11.
Arrays are a common and important class of data in many applications. Arrays can model data such as digital images, digital video, scientific and experimental data, matrices, and finite element grids. Although array manipulations are diverse and domain-specific, they often exhibit structural regularities. This paper describes an algorithm called sub-pushdown to trace data lineage in such array computations. Lineage tracing is a type of data-flow analysis that relates parts of a result array to those parts of the argument (base) arrays that have a bearing on the result array parts. Sub-pushdown can be used to trace data lineage in array-manipulating computations expressed in the Array Manipulation Language (AML) that was introduced previously. Sub-pushdown has several useful features. First, the lineage computation is expressed as an AML query. Second, it is not necessary to evaluate the AML lineage query to compute the array data lineage. Third, sub-pushdown never gives false-negative answers. Sub-pushdown has been implemented as part of the ArrayDB prototype array database system that we have built.

12.
《软件工程师》2019,(12):44-46
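Because data streams are unstable, assigning data stream queries to fixed nodes makes it hard for distributed data stream processing to use computing resources efficiently. To address this, the paper studies distributed data stream processing under big data analysis. The workflow comprises data collection, storage and querying of historical data, real-time processing with Storm, intelligent indexing, and construction of the data model. The experimental results show that, compared with traditional techniques, the proposed technique has a clear advantage in data stream processing efficiency, which generally stays above 75%, and can greatly reduce processing time.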

13.
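This paper analyzes several existing mechanisms for protecting confidentiality in cloud computing systems and proposes a confidentiality risk assessment model based on data flow analysis. A monitor deployed as a loadable kernel module (LKM) intercepts system calls to obtain the system's data flows, which are passed to an evaluator that compares them against the set of data flow patterns supplied by the service in order to assess service confidentiality comprehensively. Experiments show that the method can effectively identify behavior that undermines the confidentiality of services and data in a cloud computing environment.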

14.
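To reduce the number of external leads, piezoresistive array tactile sensors generally use a row-column electrode structure, which introduces noise into tactile scan sampling. As the array size grows, raising the sensor's frequency response directly affects its real-time processing. A new row-scanning sampling circuit was therefore designed to solve both problems.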

15.
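Based on data flow framework theory, this paper proposes a way to apply data flow analysis to Java bytecode. By establishing the relationships between data flow and semilattices, and between data flow and the function call graph, it analyzes type information. Experiments show that the method can analyze the type information in files fairly precisely.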

16.
Simultaneous Multithreading (SMT) is a processor architectural technique that promises to significantly improve the utilization and performance of modern wide-issue superscalar processors. An SMT processor is capable of issuing multiple instructions from multiple threads to a processor's functional units each cycle. Unlike shared-memory multiprocessors, SMT provides and benefits from fine-grained sharing of processor and memory system resources; unlike current uniprocessors, SMT exposes and benefits from inter-thread instruction-level parallelism when hiding long-latency operations. Compiler optimizations are often driven by specific assumptions about the underlying architecture and implementation of the target machine, particularly for parallel processors. For example, when targeting shared-memory multiprocessors, parallel programs are compiled to minimize sharing, in order to decrease high-cost inter-processor communication. Therefore, optimizations that are appropriate for these conventional machines may be inappropriate for SMT, which can benefit from fine-grained resource sharing within the processor. This paper reexamines several compiler optimizations in the context of simultaneous multithreading. We revisit three optimizations in this light: loop-iteration scheduling, software speculative execution, and loop tiling. Our results show that all three optimizations should be applied differently in the context of SMT architectures: threads should be parallelized with a cyclic, rather than a blocked, algorithm; non-loop programs should not be software speculated; and compilers no longer need to be concerned about precisely sizing tiles to match cache sizes. By following these new guidelines, compilers can generate code that improves the performance of programs executing on SMT machines.
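The difference between the blocked and cyclic loop-iteration schedules named in this abstract can be sketched as follows. The helper names `blocked` and `cyclic` are hypothetical, and the code only shows which iterations each thread would receive; it does not model SMT hardware or the paper's measurements.

```python
def blocked(n_iters, n_threads):
    """Give each thread one contiguous chunk of iterations."""
    chunk = -(-n_iters // n_threads)  # ceiling division
    return [
        list(range(t * chunk, min((t + 1) * chunk, n_iters)))
        for t in range(n_threads)
    ]


def cyclic(n_iters, n_threads):
    """Deal iterations out round-robin, one at a time."""
    return [list(range(t, n_iters, n_threads)) for t in range(n_threads)]


if __name__ == "__main__":
    print(blocked(8, 2))  # [[0, 1, 2, 3], [4, 5, 6, 7]]
    print(cyclic(8, 2))   # [[0, 2, 4, 6], [1, 3, 5, 7]]
```

Under the cyclic schedule, threads running at the same time touch neighbouring iterations, and therefore often neighbouring data, which fits the abstract's point that SMT benefits from fine-grained sharing within the processor.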

17.
The combination of static and dynamic software analysis, such as data flow analysis (DFA) and model checking, provides benefits for both disciplines. On the one hand, the information extracted by DFAs about program data may be utilized by model checkers to optimize the state space representation. On the other hand, the expressiveness of logic formulas allows us to consider model checkers as generic data flow analyzers. Following this second approach, we propose in this paper an algorithm to calculate DFAs using on-the-fly resolution of boolean equation systems (BESs). The overall framework includes the abstraction of the input program into an implicit labeled transition system (LTS), independent of the program specification language. Moreover, using BESs as an intermediate representation allowed us to reformulate classical DFAs encountered in the literature, which were previously encoded in terms of μ-calculus formulas with forward and backward modalities. Our work was implemented and integrated into the widespread verification platform CADP, and evaluated on real examples.

18.
Algorithms are presented for detecting errors and anomalies in programs which use synchronization constructs to implement concurrency. The algorithms employ data flow analysis techniques. First used in compiler object code optimization, the techniques have more recently been used in the detection of variable usage errors in single process programs. By adapting these existing algorithms, the same classes of variable usage errors can be detected in concurrent process programs. Important classes of errors unique to concurrent process programs are also described, and algorithms for their detection are presented.

19.
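An Application-Level Checkpointing Mechanism for OpenMP Programs Based on Extended Data Flow Analysis   Total citations: 1, self-citations: 0, citations by others: 1
As multicore processor architectures are used ever more widely in high-performance computing, fault tolerance for shared-memory parallel programs has become a research hotspot. In recent years, checkpointing has become the dominant fault-tolerance mechanism in this field. Some work already exists on checkpointing for OpenMP programs, but the vast majority of those solutions depend on special runtime libraries or hardware platforms. This paper proposes compiler-assisted application-level checkpointing for OpenMP. It is a platform-independent scheme that uses an OpenMP-oriented extended data flow analysis to select only the "necessary" variables to save in the checkpoint image, thereby reducing fault-tolerance overhead, and it runs a non-blocking protocol to maintain the global consistency of checkpoints. The paper discusses the key issues of the mechanism and shows, through experimental evaluation and comparison with similar work, the advantages of the proposed checkpointing mechanism in fault-tolerance performance.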

20.
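This paper describes the architectures of two mainstream high-end disk arrays on the market, explains the role of software in improving storage server performance, compares the two arrays, and finally summarizes the technologies used in high-end disk arrays and predicts future trends.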
