期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

肖增良何锫康立山《计算机工程与应用》2009,45(11):39-41

针对形式化程序验证中的并行调度问题,提出了基于依赖集的算法。通过引入依赖图和依赖集概念,以形式化方式描述程序语句间的依赖关系,然后给出了从语法分析树构造依赖图和依赖集的算法;最后在此基础上设计了并行调度算法并应用于计算机辅助程序验证系统。实验结果表明,该方法具有较高的并行效率。相似文献

2.

并行性挖掘的图文法方法

邹阳吕建曾晓勤《小型微型计算机系统》2011,32(2)

并行性分析技术一般通过对程序的控制与数据流图或相关依赖图的分析来实现,因而需要从程序中抽取出相应依赖图的算法的支持.本文基于上下文相关图文法RGG形式框架,定义了一种任务级的并行编程图语言GPPL来直接描述顺序或并行程序的控制与数据流图,而且设计了相应的并行性分析算法以挖掘GPPL图程序的并行性特征.GPPL图语言可视为并行程序设计与程序代码生成之间的协同语言,从而使并行性挖掘避免了从程序中抽取出相应依赖图的过程.与已有的描述顺序或并行程序的图语言及其分析算法相比,GPPL图程序形式更为简洁和直观,易于设计,描述能力也更强;基于GPPL图的并行性分析算法的分析能力更强,而且具有可扩展性. 相似文献

3.

一种基于非正规域的区域依赖关系分析法 总被引：1，自引：0，他引：1

朱根江谢立《计算机学报》1994,17(3):168-175

在自动并行编译中，并行性的识别主要集中在循环及语句级，而许多程序实际上可通过挖掘子程序级这种“任务“并行性来提高性能。本文提出了基于非正规域的区域依赖分析方法，旨在发掘这类并行性，它能精确地刻划程序中的数据访问区域。克服了现有区域分析技术中趋于保守的弱点，从而提出了并行度，依赖关系的测试算法简单而有效。相似文献

4.

PVM并行程序验证系统的原理与实现 总被引：5，自引：0，他引：5

张兆庆蒋昌俊乔如良叶志宝周杰《计算机学报》1999,22(4):409-414

本文主要介绍ＰＶＭ并行程序验证系统的基本原理和实现技术。首先,我们扼要分析ＰＶＭ程序的构成与特点,然后阐述验证系统的理论模型和验证算法;最后,讨论开发过程中的若干关键技术,本系统的研制可为产行程序的自动转换和分析验证提供了一个可视化的运行环境。相似文献

5.

XQuery并行实现中的任务调度算法

权跃龙廖湖声高红雨《计算机工程与设计》2015,36(1):138-143,183

为提高XQuery语言的处理性能,针对XQuery并行实现中的任务调度问题,提出一种适用于共享内存多线程环境的调度算法。在一种新型调度策略的指导下,能够利用XQuery语言中存在任务并行性、数据并行性和流水线并行性的特点,提高程序并行执行效率;针对流水线并行执行方式,建立一种流水线局部并行自动机模型,通过利用流水线中各节拍之间的空闲等待时间,提高系统资源的利用率。通过实验验证了该算法的可行性和有效性。相似文献

6.

基于并行和预测的方向菱形运动估计算法

魏长虎贾智平程志《计算机应用》2008,28(11):2887-2889

对MPEG-4视频编码中的块运动估计算法进行分析,得出算法中存在的并行性。将多核并行计算的思想应用于方向菱形模板的运动估计算法中,并引入一种预测机制来充分提高并行性,得到基于多核并行预测的方向菱形运动估计算法(PPDDME)。在PC机和Omap5910 两个平台上的验证表明,该算法相比串行算法在保证了压缩质量的前提下,有效提高了编码速度。相似文献

7.

基于区域平均执行时间和数据依赖信息的可能并行区域识别 总被引：1，自引：0，他引：1

张超王蕾向晓娅冯晓兵《计算机学报》2008,31(10)

随着多核处理器逐渐成为处理器发展的新趋势,为了持续提高程序性能,必须并行执行应用程序.传统的自动并行技术能够很好地并行科学计算应用中的规则循环,但对于含有大量函数调用和指针引用的不规则程序,目前还不能有效地对其实施并行.针对这一现状,文中提出了基于区域平均执行时间和数据依赖信息的可能并行区域识别方法来对一些不规则程序实施高效并行,主要贡献如下:(1)自动识别程序中的多种并行性,不仅包括传统并行性分析中的循环迭代间的细粒度并行性,而且也包括传统并行性分析尚不能有效处理的循环体和函数调用点间的粗粒度并行性.对于程序中蕴含的众多并行性,文中基于区域平均执行时间实施收益分析来选择合适的并行区域实施并行;(2)自动识别可能并行区域间数据依赖关系的数量、类型以及导致数据依赖关系的程序变量.基于文中的分析结果,作者使用面向行为的投机并行系统(behavior oriented parallelism)对SPEC2006中的4个测试用例实现了并行化.并行化后的程序在Intel和AMD多核处理器上分别得到了300%和260%的平均性能加速. 相似文献

8.

隧道算法的分布式并行计算模型

余胜生文元桥周敬利《数值计算与计算机应用》2006,27(4):299-306

本文在总结各种隧道算法的共同特点的基础上,从隧道算法钻隧过程的可并行性出发,提出了基于agent的分布式并行隧道算法,并给出了系统原型和agent模型设计．通过在agent中使用多线程技术,增强了算法的并行性．数值实验证明了该算法的可行性、可扩展性和并行效率．相似文献

9.

并行测试任务可并行性分析与应用

胡雷刚肖明清王磊《计算机测量与控制》2008,16(10):1373-1375,1379

针对并行测试系统中不同UUT测试任务组的并行测试效率提高存在差异的现象,提出了测试任务的可并行性概念,以描述测试任务在并行测试过程中的固有属性;首先根据测试任务组的并行测试效率不同的现象,提出了测试任务可并行性的概念;然后给出了衡量可并行性的可并行度指标,用以定量地指导并行测试系统开发过程中仪器资源的配置;最后通过应用实例验证了可并行性概念的合理性、可并行度指标的实用性;建立的可并行性概念不仅丰富了测试领域的理论基础,对并行测试系统的仪器资源的配置也具有指导价值。相似文献

10.

HEVC帧内预测算法加速设计与实现

王飞龙刘新闯刘鹏辛晓斐石鹏飞《计算机应用与软件》2020,37(1):151-156

新一代视频编码标准获得了较高的编码效率,但同时也增加了计算量。HEVC(High Efficiency Video Coding)并行算法能够提高编码速度,开发适用于多核处理器的并行编码算法对于满足高清视频实时传输和大规模实时共享具有十分重要的意义。分析帧内预测算法在处理像素过程中数据之间的依赖关系,进行基于预测模式的细粒度并行性的设计。块与块之间采用流水线处理,减少帧内预测算法的执行时间。利用动态可编程可重构视频阵列处理器,对帧内预测算法进行验证。实验结果表明,相比于HM16.0官方测试标准,信噪比提高了10%,算法的执行时间减少了大约70%。相似文献

11.

Parallel Logic Programming for Problem Solving

Ramiro Varela Arias Camino Rodríguez Vela Jorge Puente Peinador Cesar Alonso González 《International journal of parallel programming》2000,28(3):275-319

We present a new model for parallel evaluation of logic programs. This model can exploit the main sources of parallelism that the language of logic expresses: Independent AND parallelism and OR parallelism, together with a secondary source emerging as a consequence of the Independent AND Parallelism: the producer/consumer parallelism. The efficiency is derived from the use of ordered structures for managing the information generated throughout the search process. The model is suitable for evaluating programs with a high degree of non-determinism because it never generates two processes for solving the same subgoal and hence it can exploit the same real parallelism generating a lower number of processes than other models. As an application example, we consider the Job Shop Scheduling problem. We report experimental results showing that logic programs can be designed that exhibit parallelism, and that the use of heuristic information translates into speedup in obtaining answers. 相似文献

12.

Performance evaluation of unified memory and dynamic parallelism for selected parallel CUDA applications

Łukasz Jarząbek Paweł Czarnul 《The Journal of supercomputing》2017,73(12):5378-5401

The aim of this paper is to evaluate performance of new CUDA mechanisms—unified memory and dynamic parallelism for real parallel applications compared to standard CUDA API versions. In order to gain insight into performance of these mechanisms, we decided to implement three applications with control and data flow typical of SPMD, geometric SPMD and divide-and-conquer schemes, which were then used for tests and experiments. Specifically, tested applications include verification of Goldbach’s conjecture, 2D heat transfer simulation and adaptive numerical integration. We experimented with various ways of how dynamic parallelism can be deployed into an existing implementation and be optimized further. Subsequently, we compared the best dynamic parallelism and unified memory versions to respective standard API counterparts. It was shown that usage of dynamic parallelism resulted in improvement in performance for heat simulation, better than static but worse than an iterative version for numerical integration and finally worse results for Golbach’s conjecture verification. In most cases, unified memory results in decrease in performance. On the other hand, both mechanisms can contribute to simpler and more readable codes. For dynamic parallelism, it applies to algorithms in which it can be naturally applied. Unified memory generally makes it easier for a programmer to enter the CUDA programming paradigm as it resembles the traditional memory allocation/usage pattern. 相似文献

13.

Modeling and formal verification of embedded systems based on a Petri net representation 总被引：2，自引：0，他引：2

Luis Alejandro Petru Zebo 《Journal of Systems Architecture》2003,49(12-15):571

In this paper we concentrate on aspects related to modeling and formal verification of embedded systems. First, we define a formal model of computation for embedded systems based on Petri nets that can capture important features of such systems and allows their representation at different levels of granularity. Our modeling formalism has a well-defined semantics so that it supports a precise representation of the system, the use of formal methods to verify its correctness, and the automation of different tasks along the design process. Second, we propose an approach to the problem of formal verification of embedded systems represented in our modeling formalism. We make use of model checking to prove whether certain properties, expressed as temporal logic formulas, hold with respect to the system model. We introduce a systematic procedure to translate our model into timed automata so that it is possible to use available model checking tools. We propose two strategies for improving the verification efficiency, the first by applying correctness-preserving transformations and the second by exploring the degree of parallelism characteristic to the system. Some examples, including a realistic industrial case, demonstrate the efficiency of our approach on practical applications. 相似文献

14.

A parallel execution model of logic programs

Chen A.C. Wu C.-I. 《Parallel and Distributed Systems, IEEE Transactions on》1991,2(1):79-92

A parallel-execution model that can concurrently exploit AND and OR parallelism in logic programs is presented. This model employs a combination of techniques in an approach to executing logic problems in parallel, making tradeoffs among number of processes, degree of parallelism, and combination bandwidth. For interpreting a nondeterministic logic program, this model (1) performs frame inheritance for newly created goals, (2) creates data-dependency graphs (DDGs) that represent relationships among the goals, and (3) constructs appropriate process structures based on the DDGs. (1) The use of frame inheritance serves to increase modularity. In contrast to most previous parallel models that have a large single process structure, frame inheritance facilitates the dynamic construction of multiple independent process structures, and thus permits further manipulation of each process structure. (2) The dynamic determination of data dependency serves to reduce computational complexity. In comparison to models that exploit brute-force parallelism and models that have fixed execution sequences, this model can reduce the number of unification and/or merging steps substantially. In comparison to models that exploit only AND parallelism, this model can selectively exploit demand-driven computation, according to the binding of the query and optional annotations. (3) The construction of appropriate process structures serves to reduce communication complexity. Unlike other methods that map DDGs directly onto process structures, this model can significantly reduce the number of data sent to a process and/or the number of communication channels connected to a process 相似文献

15.

Verification problems in conceptual workflow specifications 总被引：14，自引：0，他引：14

A.H.M ter Hofstede M.E Orlowska J Rajapakse 《Data & Knowledge Engineering》1998,24(3):239-256

Most of today's business requirements can only be accomplished through integration of various autonomous systems which were initially designed to serve the needs of particular applications. In the literature workflows are proposed to design these kinds of applications. The key tool for designing such applications is a powerful conceptual specification language. Such a language should be capable of capturing interactions and cooperation between component tasks of workflows among others. These include sequential execution, iteration, choice, parallelism and synchronisation. The central focus of this paper is the verification of such process control aspects in conceptual workflow specifications. As is generally agreed upon, that the later in the software development process an error is detected, the more it will cost to correct it; it is thus of vital importance to detect errors as early as possible in the systems-development process. In this paper some typical verification problems in workflow specifications are identified and their complexity is addressed. It will be proven that some fundamental problems are not tractable and we will show what restriction is needed to allow termination problems to be recognized in polynomial time. 相似文献

16.

Expressive program verification via structured specifications

Cristian Gherghina Cristina David Shengchao Qin Wei-Ngan Chin 《International Journal on Software Tools for Technology Transfer (STTT)》2014,16(4):363-380

Conventional specifications typically have a flat structure that is based primarily on the underlying logic. Such specifications lack structures that could provide better guidance to the verification process. In this work, we propose to add three new structures to a specification framework for separation logic to achieve a more precise and better guided verification for pointer-based programs. The newly introduced structures empower users with more control over the verification process in the following ways: (1) case analysis can be invoked to take advantage of disjointedness conditions in the logic, (2) early, as opposed to late, instantiation can minimise the use of existential quantification and (3) novel formulae structuring can provide better reuse of the verification process. Initial experiments have shown that structured specifications can lead to more precise verification without incurring any performance overhead. To support our proposal, we shall illustrate the usage of structured specifications in the context of proving termination and we will briefly outline the impact of our proposal on a recent development focussed on verifying the FreeRTOS scheduler Ferreira et al. (Int. J. Softw. Tools Technol. Trans. 2014). 相似文献

17.

Implementing regularly structured neural networks on the DREAMmachine

Shams S. Gaudiot J.-L. 《Neural Networks, IEEE Transactions on》1995,6(2):407-421

High-throughput implementations of neural network models are required to transfer the technology from small prototype research problems into large-scale "real-world" applications. The flexibility of these implementations in accommodating for modifications to the neural network computation and structure is of paramount importance. The performance of many implementation methods today is greatly dependent on the density and the interconnection structure of the neural network model being implemented. A principal contribution of this paper is to demonstrate an implementation method which exploits maximum amount of parallelism from neural computation, without enforcing stringent conditions on the neural network interconnection structure, to achieve this high implementation efficiency. We propose a new reconfigurable parallel processing architecture, the Dynamically Reconfigurable Extended Array Multiprocessor (DREAM) machine, and an associated mapping method for implementing neural networks with regular interconnection structures. Details of the system execution rate calculation as a function of the neural network structure are presented. Several example neural network structures are used to demonstrate the efficiency of our mapping method and the DREAM machine architecture on implementing diverse interconnection structures. We show that due to the reconfigurable nature of the DREAM machine, most of the available parallelism of neural networks can be efficiently exploited. 相似文献

18.

基于数据分解的并发面向对象程序开发方法 总被引：1，自引：0，他引：1

杨大军吕建《软件学报》2000,11(1):67-72

提出了一种从VDM-SL(Vienna development method-specification language)规约到并发面向对象程序的开发方法,这种方法基于DD-VDM(data decomposition-Vienna development method).在此基础上提出了虚拟原子、服务并行和内部并行等概念,继而提出一种嵌套面向对象结构来体现这些功能.分别从共享量并行系统和分布并行系统的角度讨论了嵌套面向对象结构的实现技术. 相似文献

19.

Static analysis and process model transformation for an advanced business process to Petri net mapping

下载免费PDF全文

Thomas S. Heinze Wolfram Amme Simon Moser 《Software》2018,48(1):161-195

Verification of business processes typically relies on Petri net–based process models. While they allow for natural modeling and analysis of aspects such as parallelism and message exchange, such a process model is seldom complete and precise. This is mainly because the available techniques for deriving a Petri net model from the original model neglect process data in favor of feasible verification. In this paper, we present an approach for deriving more precise process models by leveraging a process‐to‐Petri‐net compiler, which takes as input a business process and generates as output a Petri net model for the process. This can be subsequently used for verification. However, in contrast to a conventional compiler, our compiler's objective is not to create the most efficient code but rather to produce a most precise but still effectively verifiable Petri net–based process model. 相似文献

20.

Data-parallel support for numerical irregular problems

E. L. Zapata O. Plata R. Asenjo G. P. Trabado 《Parallel Computing》1999,25(13-14)

A large class of intensive numerical applications show an irregular structure, exhibiting an unpredictable runtime behavior. Two kinds of irregularity can be distinguished in these applications. First, irregular control structures, derived from the use of conditional statements on data only known at runtime. Second, irregular data structures, derived from computations involving sparse matrices, grids, trees, graphs, etc. Many of these applications exhibit a large amount of parallelism, but the above features usually make that exploiting such parallelism becomes a very difficult task. This paper discusses the effective parallelization of numerical irregular codes, focusing on the definition and use of data-parallel extensions to express the parallelism that they exhibit. We show that the combination of data distributions with storage structures allows to obtain efficient parallel codes. Codes dealing with sparse matrices, finite element methods and molecular dynamics (MD) simulations are taken as working examples. 相似文献