1.

姚震郑启龙陈国良杨晓奇《小型微型计算机系统》2008,29(3):437-443

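Speculative parallel multithreading based on transactional execution is a new parallel programming and compilation technique suited to future multi-core microprocessor architectures. Parallel programs built on it execute in a more complex way, however, so simulating program execution becomes one of the key problems. This paper proposes using binary-level dynamic instrumentation to simulate speculative parallel multithreaded programs functionally. It designs and implements a complete software platform that accurately simulates and monitors the thread-level speculative execution of parallel programs and detects memory-access conflicts, thereby realizing the semantics of speculative parallel multithreading. The platform can also serve as a basis for further study of how speculative multithreaded parallel programs actually execute, and it effectively supports the design and analysis of speculative parallel multithreading compilers.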
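
The memory-access conflict detection mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's platform: the `SpeculativeThread` record, the `find_violations` function, and the addresses are hypothetical, and the check is deliberately conservative (it flags any overlap between an earlier thread's write set and a later thread's read set, without tracking the actual timing of the accesses).

```python
from dataclasses import dataclass, field


@dataclass
class SpeculativeThread:
    """Hypothetical record of one speculative thread's memory accesses."""
    order: int                                  # position in sequential program order
    reads: set = field(default_factory=set)     # addresses read speculatively
    writes: set = field(default_factory=set)    # addresses written speculatively


def find_violations(threads):
    """Return (earlier, later, addresses) triples where a logically later
    thread read an address that a logically earlier thread wrote."""
    ordered = sorted(threads, key=lambda t: t.order)
    violations = []
    for i, earlier in enumerate(ordered):
        for later in ordered[i + 1:]:
            conflict = earlier.writes & later.reads
            if conflict:
                violations.append((earlier.order, later.order, conflict))
    return violations


# Thread 1 reads address 0x20, which the logically earlier thread 0 writes.
t0 = SpeculativeThread(order=0, reads={0x10}, writes={0x20})
t1 = SpeculativeThread(order=1, reads={0x20}, writes={0x30})
t2 = SpeculativeThread(order=2, reads={0x40}, writes={0x50})
violations = find_violations([t2, t0, t1])
print(violations)
```

In a real system a flagged thread would be squashed and re-executed; the sketch only reports the conflict.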

2.

Application of Multithreading in ARM9 Serial Communication

LIU Zhen 《数字社区&智能家居》2008,(23)

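High-speed serial data communication requires storing already-received data in real time while still receiving large volumes of data from the acquisition device. Multithreading resolves the mismatch between the application's execution speed and the serial port's transfer rate when a high-speed ARM processor runs a task. It improves the ARM's responsiveness to the user application, which in turn raises the execution speed of the whole task, preserves data integrity, and improves overall system performance.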

3.

Design and Implementation of the Prophet Speculative Multithreading System

李钟赵银亮杜延宁《计算机科学》2011,38(2):296-301

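Speculative multithreading exploits the thread-level parallelism of applications through speculative execution in order to improve program performance. The technique generally relies on an execution model to detect thread misspeculation that may occur at run time and to apply a suitable mechanism that restores correct execution. Prophet, described here, is a hardware-based speculative multithreading execution model. The paper focuses on how the Prophet execution model addresses the key issues of execution-model design, including Prophet's thread state control and its multi-version cache system. The multi-version cache system buffers speculative data and uses a bus-snooping cache protocol to detect data dependence violations. The paper also reports functional and performance tests of the Prophet execution model on the Olden benchmarks, and its analysis shows that the Prophet system can effectively exploit the thread-level parallelism of applications.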

4.

Network Programming Based on Java Multithreading

陈隽《电脑编程技巧与维护》2009,(22):83-84

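Java's strong security has led to its wide adoption in network programming, and its multithreading mechanism is particularly well regarded. With Java's multithreading API, developers can readily write multithreaded applications, which reduces the difficulty of concurrent and parallel programming and improves execution efficiency. Using a multi-user network chat room with a concurrent server as an example, the paper discusses the application of Java multithreading in network programming.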

5.

Pipeline Parallelism Optimization Techniques on the Cell Heterogeneous Multi-core Processor

曹倩胡长军李士刚《计算机应用研究》2011,28(9):3344-3347

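To address how to exploit heterogeneous multi-core processors to improve program execution efficiency, the paper proposes two optimization techniques for the Cell heterogeneous multi-core processor: thread-synchronized pipeline parallelism and iteration-synchronized pipeline parallelism. These techniques effectively speed up the execution of irregular writes and irregular control structures. Tests on the Cell processor with IS, EP, and LU from the NAS benchmarks and MOLDYN from SPEC2001 show that the pipeline parallel scheme effectively improves the efficiency of critical sections and flush operations and markedly increases program execution speed.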

6.

Trace-Based Just-in-Time Compilation for the SECD Abstract Machine

于成龙廖湖声武辰之苏航《计算机工程与设计》2015,(2):384-391

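To improve overall program execution efficiency, the paper proposes a general trace-based just-in-time compilation technique. While SECD abstract machine instruction sequences are being interpreted, frequently executed program fragments are identified as traces and compiled into Java bytecode, which the Java virtual machine then executes. Any programming language implemented on the SECD abstract machine can use this technique to improve execution efficiency. The paper describes how the technique is implemented, including how to switch between the interpreter environment and the Java bytecode execution environment, and it implements an execution engine framework that uses the technique. Experimental results show that the technique effectively improves program execution efficiency.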
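
Identifying frequently executed fragments is usually done with execution counters. The sketch below shows one common heuristic, counting backward jumps to find likely loop headers; the abstract does not say which heuristic the paper uses, so the heuristic, the `find_hot_targets` function, and the `HOT_THRESHOLD` value are all assumptions made for illustration.

```python
from collections import Counter

HOT_THRESHOLD = 3  # assumed value, chosen only for this example


def find_hot_targets(executed_pcs, threshold=HOT_THRESHOLD):
    """Count backward jumps in an executed instruction-address sequence and
    return the jump targets whose count reaches the threshold."""
    counts = Counter()
    hot = []
    for prev, cur in zip(executed_pcs, executed_pcs[1:]):
        if cur <= prev:              # backward jump: a likely loop header
            counts[cur] += 1
            if counts[cur] == threshold:
                hot.append(cur)      # a trace would start recording here
    return hot


# A loop over addresses 1..3 that iterates four times, then exits to 4.
pcs = [0, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 4]
hot_targets = find_hot_targets(pcs)
print(hot_targets)
```

An interpreter built this way would begin recording a trace at each hot target and hand the recorded instruction sequence to the compiler.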

7.

Design of a PLC Source Program Compilation System Based on ARM-FPGA

蒋玉新蔡启仲李克俭张炜《计算机应用与软件》2013,(9)

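Building on a study of how PLC programs are compiled and executed and of the characteristics of new instructions, the paper proposes a compilation method for a new type of PLC instruction on an ARM-FPGA PLC host architecture. Static compilation converts instruction operands into direct addresses of PLC soft elements, builds a linked list of branch target addresses for branch-type instructions, and produces a new PLC program code sequence. Dynamic compilation, performed while the PLC program executes, converts the direct operand addresses in the new program's instructions into immediate values, which the FPGA module then executes. Compiling and executing PLC user source programs shows that the method makes full use of the FPGA's high-speed parallel processing and increases PLC program execution speed.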

8.

Application of Outer Joins in PL/SQL Data Migration Programs (total citations: 1; self-citations: 0; citations by others: 1)

宋鹏赵球红《计算机系统应用》2008,17(10):121-123

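Because PL/SQL programs used in data migration must run fast, the paper analyzes the cursor query statements in a migration program and proposes optimizing them with outer joins. Doing so reduces the nesting depth of cursors and thereby speeds up the migration program. Data migration practice shows that the method significantly increases the execution speed of programs that migrate massive data sets.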
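
The idea of replacing nested cursors with an outer join can be sketched as follows. This is an illustration only: it uses Python's `sqlite3` module rather than PL/SQL, and the `customers` and `orders` tables and both function names are hypothetical. The nested version issues one inner query per outer row, while the outer-join version fetches the same rows in a single query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0);
""")


def migrate_nested(conn):
    """Nested-cursor style: one inner query per customer row."""
    rows = []
    customers = conn.execute("SELECT id, name FROM customers ORDER BY id").fetchall()
    for cid, name in customers:
        orders = conn.execute(
            "SELECT id, total FROM orders WHERE customer_id = ? ORDER BY id",
            (cid,),
        ).fetchall()
        if not orders:
            rows.append((cid, name, None, None))  # keep customers without orders
        for oid, total in orders:
            rows.append((cid, name, oid, total))
    return rows


def migrate_outer_join(conn):
    """Single query: the LEFT OUTER JOIN keeps customers without orders."""
    return conn.execute("""
        SELECT c.id, c.name, o.id, o.total
        FROM customers c
        LEFT OUTER JOIN orders o ON o.customer_id = c.id
        ORDER BY c.id, o.id
    """).fetchall()


nested_rows = migrate_nested(conn)
joined_rows = migrate_outer_join(conn)
print(joined_rows)
```

Both functions return the same rows; the join version leaves the matching work to the database engine instead of looping in the client.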

9.

A Runtime Error Site Analysis Tool Based on Variable Tracking

张天炯王铮《计算机应用》2014,34(3):857-860

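Runtime errors are errors that arise while a program is running; after one occurs, its cause has to be analyzed with traditional debugging techniques. For certain abnormal behaviors and for multithreaded programs, the real execution environment cannot be reproduced, so traditional debugging and analysis are of limited use. If variable information can be captured while the program executes, the runtime error site is captured as well and can serve as the basis for analyzing the cause. The paper therefore proposes a variable-tracking-based technique for capturing the runtime error site. The technique captures specific variable information according to user requirements, which makes the acquisition of variable information more flexible. Built on this technique, a runtime error site analysis tool (RFST) provides the error site and auxiliary analysis aids for analyzing runtime errors.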

10.

A Runtime Error Site Analysis Tool Based on Variable Tracking

《计算机应用》2014,(3)

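Runtime errors are errors that arise while a program is running; after one occurs, its cause has to be analyzed with traditional debugging techniques. For certain abnormal behaviors and for multithreaded programs, the real execution environment cannot be reproduced, so traditional debugging and analysis are of limited use. If variable information can be captured while the program executes, the runtime error site is captured as well and can serve as the basis for analyzing the cause. The paper therefore proposes a variable-tracking-based technique for capturing the runtime error site. The technique captures specific variable information according to user requirements, which makes the acquisition of variable information more flexible. Built on this technique, a runtime error site analysis tool (RFST) provides the error site and auxiliary analysis aids for analyzing runtime errors.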

11.

Emerging technology enabled energy-efficient GPGPUs register file

《Microprocessors and Microsystems》2017

Modern general-purpose graphics processing units (GPGPUs) employ fine-grained multithreading among thousands of active threads, which leads to a sizable register file (RF) with massive energy consumption. In this study, we explore an energy-efficient GPGPU RF enabled by an emerging technology, the tunnel FET (TFET). TFET is much more energy-efficient than CMOS at low operating voltages, but always running TFET at low voltage (and hence low frequency) causes significant performance degradation. We first design a hybrid CMOS-TFET register file and propose memory-contention-aware TFET register allocation (MEM_RA). MEM_RA allocates TFET-based registers to threads whose execution progress can be delayed to some degree to avoid memory contention with other threads, while CMOS-based registers are still used for threads that require normal execution speed. We further observe that TFET register resources are insufficient for memory-intensive benchmarks when MEM_RA is applied. We then develop TFET-register-utilization-aware block allocation (TUBA) and TFET-register-request-aware warp scheduling (TRWS) mechanisms to use the limited TFET registers effectively and maximize energy savings. Our experimental results show that the proposed techniques achieve a 40% energy reduction (dynamic and leakage combined) in the GPGPU register file with negligible performance overhead.

12.

Entity-life modeling: modeling a thread architecture on the problem environment

Sanden B.I. 《Software, IEEE》2003,20(4):70-78

With Java threads and the wider availability of multiprocessors, more programmers are confronted with multithreading. Concurrent threads let you take advantage of multiprocessors to speed up execution. They are also useful on a single processor, where one thread can compute while others wait for external input. Entity-life modeling is an approach for designing multithread programs.

13.

Multi-core-Based Parallel Particle Filter for Moving Target Tracking

王爱侠李晶皎王青王骄《计算机科学》2012,39(8):296-299

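The large number of per-particle computations in particle filtering gives the algorithm poor real-time performance. Because particle filtering is inherently parallelizable, the OpenMP multithreading library is used to spawn multiple threads, turning the algorithm from single-threaded serial execution into multithreaded parallel execution. Multi-core parallel computing is thus used to implement particle-filter tracking of moving targets. Experimental results show that multi-core parallel computing improves the computational efficiency of the particle filter.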
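
The decomposition this abstract describes, splitting the per-particle work across threads, can be sketched as below. It is a stand-in rather than the paper's code: it uses Python's `concurrent.futures` instead of OpenMP, the one-dimensional Gaussian likelihood and all function names are hypothetical, and under CPython's global interpreter lock pure-Python arithmetic like this is not expected to run faster in threads. The sketch shows only how the weight update is partitioned and recombined.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def likelihood(particle, observation, sigma=1.0):
    """Unnormalized Gaussian likelihood of a 1-D observation (assumed model)."""
    return math.exp(-0.5 * ((particle - observation) / sigma) ** 2)


def weights_serial(particles, observation):
    """Weight update as a single serial loop."""
    return [likelihood(p, observation) for p in particles]


def weights_parallel(particles, observation, workers=4):
    """Same update, with the particle list split into one chunk per worker."""
    size = max(1, math.ceil(len(particles) / workers))
    chunks = [particles[i:i + size] for i in range(0, len(particles), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves chunk order, so results line up with the input.
        parts = list(pool.map(lambda c: weights_serial(c, observation), chunks))
    return [w for part in parts for w in part]


rng = random.Random(0)  # fixed seed so the example is reproducible
particles = [rng.uniform(-5.0, 5.0) for _ in range(1000)]
serial = weights_serial(particles, 0.5)
parallel = weights_parallel(particles, 0.5)
print(serial == parallel)
```

Each particle's weight depends only on that particle and the observation, which is what makes the loop safe to split without synchronization.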

14.

Multithreaded Parallel Downloading of Large Data Files and Mixed Files

韦兴柳钟诚李智蔡德霞陈清媛《计算机工程与应用》2012,48(14):84-89

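When a computer-simulated clinical case training and examination system is in use, the client often needs to download many large data files and mixed audio and video files online, so system response time is a key issue. The paper studies technical approaches to multithreading in RIA and proposes an optimized method for multithreaded parallel downloading of large data files and mixed audio and video files on multi-core computers. Algorithm analysis and experimental results show that the proposed technique accelerates online downloading in the modules of the computer-simulated case system and significantly improves system performance.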

15.

Capturing and Analyzing the Execution Control Flow of OpenMP Applications

Karl Fürlinger Shirley Moore 《International journal of parallel programming》2009,37(3):266-276

An important aspect of understanding the behavior of applications with respect to their performance, overhead, and scalability characteristics is knowledge of their execution control flow. High-level knowledge of which functions or constructs were executed after which other constructs allows reasoning about temporal application characteristics such as cache reuse. This paper describes an approach to capture and visualize the execution control flow of OpenMP applications in a compact way. Our approach does not require a full trace of program execution events but is instead based on a straightforward extension to the summary data already collected by an existing profiling tool. In multithreaded applications each thread may define its own independent flow of control, complicating both the recording and the visualization of the execution dynamics. Our approach allows for full flexibility with respect to independent threads. However, the most common usage models of OpenMP have threads operate in a largely uniform way, synchronizing frequently at sequence points and diverging only to operate on different data items in worksharing constructs. Our approach accounts for this by offering a simplified representation of the execution control flow for threads with similar behavior.

16.

Synchronous C++: a language for interactive applications

Petitpierre C. 《Computer》1998,31(9):65-72

Synchronous C++ defines active objects that contain their own execution threads and can communicate with each other by means of synchronizing method calls. The author shows how to model programs in sC++ and compares sC++ with event driven programming. He focuses on examples in which the dynamic and functional models dominate and the object model is secondary. In doing so, he proposes a mapping between the elements of all three models and sC++ statements. Several other concepts have been proposed to extend OO languages to concurrency: delayed evaluations (a concept that proposes to launch each method on a separate thread, as in Actors), processes orthogonal to the objects (Ada), asynchronous channels and exceptions (Eiffel), and others. However, the researchers who proposed these solutions emphasized the improvements of speed they expected from achieving concurrency, rather than the characteristics that make concurrency suitable for the analysis and the implementation of specifications such as the ones provided by models

17.

Using Fetch Policy to Control the Performance of Individual Threads in Simultaneous Multithreading Processors

孙彩霞张民选《计算机学报》2008,31(2):309-317

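Most current research on fetch policies for simultaneous multithreading (SMT) processors concentrates on optimizing overall performance. This paper proposes a novel SMT fetch policy, Controlling Performance of Individual Thread (CPIT), for controlling the execution of an individual thread. Results show that, across all simulated workloads, CPIT ensures that the controlled thread attains its target performance in more than 94% of cases. In the failing cases, the controlled thread's average performance deviation does not exceed 1.25%. CPIT also has little effect on overall processor performance: compared with ICOUNT, a fetch policy designed to optimize performance, overall performance falls by no more than 3% on average, and the performance of the threads other than the controlled one falls by only 1.75% on average.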

18.

Win32-Based Multithreading and Its Applications

刘红海侯向华蔡勇《计算机工程与设计》2003,24(10):113-115

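Multithreading is increasingly widely used in modern interactive communication and control systems. The paper covers the relationship between threads and processes in modern operating systems, the benefits of introducing multithreading, synchronization between threads, and how to convert a traditional single-process system on Win32 so as to raise its operating efficiency and improve user interaction. Because threads in the same process share memory space and files, they can communicate without calling into the kernel, which greatly improves the efficiency of communication between different units of execution. In addition, creating a new thread inside a multithreaded process is much faster than creating a new process in a design that does not use threads.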
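
The point that threads in one process share memory, and therefore need synchronization, can be shown with a short sketch. It uses Python's `threading` module as a stand-in for the Win32 thread API discussed in the paper; the `worker` function, the `shared_log` list, and the thread and item counts are hypothetical.

```python
import threading

shared_log = []            # lives in the process's memory, visible to every thread
lock = threading.Lock()    # synchronizes access to the shared list


def worker(worker_id, count):
    """Append `count` records to the shared list, one at a time."""
    for i in range(count):
        with lock:         # only one thread mutates the list at a time
            shared_log.append((worker_id, i))


threads = [threading.Thread(target=worker, args=(n, 100)) for n in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()               # wait until every worker has finished

print(len(shared_log))
```

No data is copied or sent between the workers: each one writes directly into the same list, which is the communication shortcut the abstract attributes to threads.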

19.

CMP Support for Large and Dependent Speculative Threads (total citations: 1; self-citations: 0; citations by others: 1)

Colohan C.B. Ailamaki A.C. Steffan J.G. Mowry T.C. 《Parallel and Distributed Systems, IEEE Transactions on》2007,18(8):1041-1054

Thread-level speculation (TLS) has proven to be a promising method of extracting parallelism from both integer and scientific workloads, targeting speculative threads that range in size from hundreds to several thousand dynamic instructions and which have minimal dependences between them. However, recent work has shown that TLS can offer compelling performance improvements when targeting much larger speculative threads of more than 50,000 dynamic instructions per thread with many frequent data dependences between them. To support such large and dependent speculative threads, the hardware must be able to buffer the additional speculative state and must also address the more challenging problem of tolerating the resulting cross-thread data dependences. In this paper, we present chip-multiprocessor (CMP) support for large speculative threads that integrates several previous proposals for TLS hardware. We also present support for subthreads: a mechanism for tolerating cross-thread data dependences by checkpointing speculative execution. Through an evaluation that exploits the proposed hardware support in the database domain, we find that the transaction response time for three of the five transactions from TPC-C (on a simulated four-processor chip-multiprocessor) speeds up by a factor of 1.9 to 2.9.

20.

A Rendering Optimization Algorithm Based on Virtual Reality

李媛媛罗训《计算机系统应用》2019,28(6):178-182

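As virtual reality technology develops, demands on the realism of virtual scenes keep rising. In a virtual scene, however, complex terrain and large amounts of vegetation and buildings make the volume of data to render enormous, so rendering speed has become a major bottleneck for virtual reality. Existing work does not improve rendering speed in Unreal Engine well, and it also suffers from sudden visual jumps ("popping") and poor culling of models outside the field of view. This paper proposes running the game thread and the render thread in parallel, together with a two-level culling algorithm. First, the game thread and the render thread in Unreal Engine are run in parallel to raise rendering speed. Next, a fade-in/fade-out level-of-detail algorithm performs the first level of culling. Finally, a slow culling algorithm performs the second level, improving the culling result. Experiments show that rendering is 40% faster than with serial threads and that, compared with the traditional single-level culling algorithm, the frame rate reaches 55.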