期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Derin Harmanci Vincent Gramoli Pascal Felber Christof Fetzer 《Journal of Parallel and Distributed Computing》2010

Transactional Memory (TM) is a promising abstraction as it hides all synchronization complexities from the programmers of concurrent applications. More particularly, the TM paradigm operated a complexity shift from the application programming to the TM programming. Therefore, expert programmers have now started to look for the ideal TM that will bring, once-for-all, performance to all concurrent applications. Researchers have recently identified numerous issues TMs may suffer from. Surprisingly, no TMs have ever been tested in these scenarios. In this paper, we present the first to date TM testbed. We propose a framework, TMunit, that provides a domain specific language to write rapidly TM workloads so that our test-suite is easily extensible. Our reproducible semantic tests indicate through reproducible counter-examples that existing TMs do not satisfy recent consistency criteria. Our performance tests identify workloads where well-known TMs perform differently. Finally, additional tests indicate some workloads preventing contention managers from progressing. 相似文献

2.

Integrating file operations into transactional memory

Brian Demsky^{Author Vitae} 《Journal of Parallel and Distributed Computing》2011,71(10):1293-1304

Researchers have proposed transactional memory as a concurrency primitive to simplify the development of multi-threaded programs. In this paper we present a new approach for supporting I/O operations in the context of transactional memory. Our approach provides isolation between the file operations of different transactions while allowing multiple transactions to concurrently perform I/O. To ease adoption, our approach attempts to implement the traditional I/O programming interface as closely as possible. We formalize aspects of our approach and use the formalization to reason about the correctness of the approach.We have implemented our approach as a Java library and have integrated it with the DSTM2 transactional memory system. We have evaluated the approach with several benchmarks including JCarder, TupleSoup, a financial transaction benchmark, a parallel sort benchmark, and a parallel grep benchmark. Our experience shows that the approach provides a straightforward mechanism for developers to integrate I/O in a transactional memory environment and that it performs well. 相似文献

3.

Software transactional memory 总被引：1，自引：0，他引：1

Nir Shavit Dan Touitou 《Distributed Computing》1997,10(2):99-116

Summary. As we learn from the literature, flexibility in choosing synchronization operations greatly simplifies the task of designing highly concurrent programs. Unfortunately, existing hardware is inflexible and is at best on the level of a Load–Linked/Store–Conditional operation on a single word. Building on the hardware based transactional synchronization methodology of Herlihy and Moss, we offer software transactional memory (STM), a novel software method for supporting flexible transactional programming of synchronization operations. STM is non-blocking, and can be implemented on existing machines using only a Load–Linked/Store–Conditional operation. We use STM to provide a general highly concurrent method for translating sequential object implementations to non-blocking ones based on implementing a k-word compare&swap STM-transaction. Empirical evidence collected on simulated multiprocessor architectures shows that our method always outperforms the non-blocking translation methods in the style of Barnes, and outperforms Herlihy’s translation method for sufficiently large numbers of processors. The key to the efficiency of our software-transactional approach is that unlike Barnes style methods, it is not based on a costly “recursive helping” policy. Received: January 1996 / Revised: June 1996 / Accepted: August 1996 相似文献

4.

Adaptive thread mapping strategies for transactional memory applications

Márcio Castro Luís Fabrício W. Góes Jean-François Méhaut 《Journal of Parallel and Distributed Computing》2014

Transactional Memory (TM) is a programmer friendly alternative to traditional lock-based concurrency. Although it intends to simplify concurrent programming, the performance of the applications still relies on how frequent they synchronize and the way they access shared data. These aspects must be taken into consideration if one intends to exploit the full potential of modern multicore platforms. Since these platforms feature complex memory hierarchies composed of different levels of cache, applications may suffer from memory latencies and bandwidth problems if threads are not properly placed on cores. An interesting approach to efficiently exploit the memory hierarchy is called thread mapping. However, a single fixed thread mapping cannot deliver the best performance when dealing with a large range of transactional workloads, TM systems and platforms. In this article, we propose and implement in a TM system a set of adaptive thread mapping strategies for TM applications to tackle this problem. They range from simple strategies that do not require any prior knowledge to strategies based on Machine Learning techniques. Taking the Linux default strategy as baseline, we achieved performance improvements of up to 64.4% on a set of synthetic applications and an overall performance improvement of up to 16.5% on the standard STAMP benchmark suite. 相似文献

5.

Keeping it together: The role of transactional situation awareness in team performance

《International Journal of Industrial Ergonomics》2016

It has been argued that communications in teams are a means of transmitting Situation Awareness to improve performance. This study explored the frequency and types of situation awareness transactions in two groups of teams. Twelve teams were grouped into either more effective or less effective teams, based on performance measures. Distributed Situation Awareness theory predicts that Situation Awareness transaction are a medium for co-ordinating teamwork, and that more of these transaction will lead to improved performance. Differences in the frequency and type of transactions were observed between the more effective teams and the less effective teams with the former having a higher frequency of overall communications and, more importantly, a higher number of relevant situation awareness transaction types compared to less effective teams. Situation awareness transactions supported the team in making sense of the situation they found themselves in as it unfolded and enabled team members to perform their discrete tasks and therefore contribute to overall team success.Relevance to industry: Teams are a major feature of most industrial applications of work and communication play an important role in coordinating team work. Communication has been found to be linked to both team performance and situation awareness. Situation awareness is distributed in teams through transactions of information. A study was devised to explore the differences between more effective and less effective teams on a number of situation awareness transactional factors. Analysing the team as a functional unit of situation awareness is presented for future work. 相似文献

6.

Implementation tradeoffs in the design of flexible transactional memory support

Arrvindh Shriraman Sandhya Dwarkadas Michael L. Scott 《Journal of Parallel and Distributed Computing》2010

We present FlexTM (FLEXible Transactional Memory), a high performance TM framework that allows software to determine when (eagerly, lazily, or in a mixed fashion) and how to manage conflicts, while employing hardware to manage transactional state and to track conflicts. FlexTM coordinates four decoupled hardware mechanisms: read and write signatures, which summarize per-thread access sets; per-thread conflict summary tables (CSTs), which identify the processors with which conflicts have occurred; Programmable Data Isolation, which buffers speculative updates in the local cache and uses an overflow table to handle unbounded updates; and Alert-On-Update, which notifies a thread immediately when a specified location is written by another processor. The CSTs enable an STM-inspired commit protocol that manages conflicts in a decentralized manner (no global arbitration) and allows parallel commits. 相似文献

7.

A transactional runtime system for the Cell/BE architecture

Alexandro Baldassin Felipe Goldstein Rodolfo Azevedo 《Journal of Parallel and Distributed Computing》2012

Single-core architectures have hit the end of the road and industry and academia are currently exploiting new multicore design alternatives. In special, heterogeneous multicore architectures have attracted a lot of attention but developing applications for such architectures is not an easy task due to the lack of appropriate tools and programming models. We present the design of a runtime system for the Cell/BE architecture that works with memory transactions. Transactional programs are automatically instrumented by the compiler, shortening development time and avoiding synchronization mistakes usually present in lock-based approaches (such as deadlock). Experimental results conducted with a prototype implementation and the STAMP benchmark show good scalability for applications with moderate to low contention levels, and whose transactions are not too small. For those cases in which a small performance loss is admissible, we believe that the ease of programming provided by transactions greatly pays off. 相似文献

8.

Optimised memory allocation for less false abortion and better performance in hardware transactional memory

Xiuhong Li Altenbek Gulila 《International Journal of Parallel, Emergent and Distributed Systems》2020,35(4):483-491

ABSTRACT

This paper introduces and tackles a special performance hazard in Hardware Transactional Memory (HTM): false abortion. False abortion causes many unnecessary transaction abortions in HTM and can greatly impact the performance, making HTM not that useful when it is adopted as a fast path for Software Transactional Memory. By introducing a new memory allocator design, we are able to put objects that are likely to be accessed together from different threads into different cache lines and thus avoid conflicts of hardware transactions in different threads. Experiments show that our method can reduce 47% of transaction abortion and achieve a speedup of up to 1.67× (averagely 22%), yet only consume 14% more memory, showing great potential to enhance current HTM technology. 相似文献

9.

Improving performance of software transactional memory through contention locality

Ehsan Atoofian 《The Journal of supercomputing》2013,64(2):527-547

In this paper, we introduce contention locality in Transactional Memory (TM) which describes the likelihood that a previously aborted transaction conflicts again in the future. We find that conflicts are highly predictable in TMs and we propose two optimization techniques based on contention locality: The first optimization technique is Speculative Contention Avoidance (SCA). SCA dynamically controls the number of concurrently executing transactions and serializes those transactions that are likely to conflict. As such, SCA reduces contention in TMs and improves performance. The second optimization technique is Adaptive Validation (AV). We show that there is no single validation policy that works well across all applications. AV adjusts validation based on applications’ behavior and improves performance of TMs. In this paper, SCA and AV are evaluated using Transactional Locking II (TL2) and Stamp v0.9.10 benchmark suite. The evaluation reveals that SCA and AV are effective and improve performance significantly. 相似文献

10.

事务存储研究 总被引：1，自引：0，他引：1

黄国睿张平魏广博马航《计算机工程与设计》2010,31(2)

为了研究多核处理器系统上的并行编程问题,开展了对事务存储模型的研究.阐述了事务存储,介绍了事务存储系统的实现方法,利用4种事务存储系统详细阐述了事务存储的实现;重点讨论了6种影响事务存储发展的关键技术,即实现方式、数据结构组织、并发控制,冲突检测、争用管理等;提出了事务存储将向着软硬件结合、提升性能、提高正确性和满足多核应用需求的方向发展. 相似文献

11.

Good programming in transactional memory : Game theory meets multicore architecture

Raphael Eidenbenz Roger Wattenhofer 《Theoretical computer science》2011,412(32):4136-4150

In a multicore transactional memory (TM) system, concurrent execution threads interact and interfere with each other through shared memory. The less interference a thread provokes the better for the system. However, as a programmer is primarily interested in optimizing her individual code’s performance rather than the system’s overall performance, she does not have a natural incentive to provoke as little interference as possible. Hence, a TM system must be designed compatible with good programming incentives (GPI), i.e., writing efficient code for the overall system should coincide with writing code that optimizes an individual thread’s performance. We show that with most contention managers (CM) proposed in the literature so far, TM systems are not GPI compatible. We provide a generic framework for CMs that base their decisions on priorities and explain how to modify Timestamp-like CMs so as to feature GPI compatibility. In general, however, priority-based conflict resolution policies are prone to be exploited by selfish programmers. In contrast, a simple non-priority-based manager that resolves conflicts at random is GPI compatible. 相似文献

12.

Privacy preserving serial publication of transactional data

《Information Systems》2019

相似文献

13.

On the impact of serializing contention management on STM performance

Tomer Heber Danny Hendler Adi Suissa 《Journal of Parallel and Distributed Computing》2012

Transactional memory (TM) is an emerging concurrent programming abstraction. Numerous software-based transactional memory (STM) implementations have been developed in recent years. STM implementations must guarantee transaction atomicity and isolation. In order to ensure progress, an STM implementation must resolve transaction collisions by consulting a contention manager (CM). 相似文献

14.

面向数据中心的事务内存框架设计

下载免费PDF全文

孙勇《计算机工程与应用》2011,47(27):74-76

针对由计算机集群构成的云计算数据中心的特性,提出了一种基于事务内存的分布式编程框架。该框架将云计算任务封装为事务,自动完成所有事务的调度执行、负载均衡和故障恢复;将数据中心的分布式数据封装为事务对象,保证事务访问事务对象时的ACID特性。与同类研究相比,它无需用户关心程序的并行控制,具有简单易用性。该框架已在仿真环境下实现,实验结果表明它具有良好的可扩展性和容错性。相似文献

15.

The performance of message-passing using restricted virtual memory remapping

Shin-Yuan Tzou David P. Anderson 《Software》1991,21(3):251-267

DASH is a distributed operating system kernel. Message-passing (MP) is used for local communication, and the MP system uses virtual memory ( VM) remapping instead of software memory copying for moving large amounts of data between virtual address spaces. Remapping eliminates a potential communication bottleneck and may increase the feasibility of moving services such as file services to the user level. Previous systems that have used VM remapping for message transfer, however, have suffered from high per-operation delay, limiting the use of the technique. The DASH design reduces this delay by restricting the generality of remapping: a fixed part of every space is reserved for remapping, and a page's virtual address does not change when it is moved between spaces. We measured the performance of the DASH kernel for Sun 3/50 workstations, on which memory can be copied at 3·9 MB/s. Using remapping, DASH can move large messages between user spaces at a rate of 39 MB/s if they are not referenced and 24·8 MB/s if each page is referenced. Furthermore, the per-operation delay is low, so VM remapping is beneficial even for messages containing only one page. To further understand the performance of the DASH MP system, we broke an MP operation into short code segments and timed them with microsecond precision. The results show the relative costs of data movement and the other components of MP operations, and allow us to evaluate several specific design decisions. 相似文献

16.

DaSH: A benchmark suite for hybrid dataflow and shared memory programming models

《Parallel Computing》2015

The current trend in development of parallel programming models is to combine different well established models into a single programming model in order to support efficient implementation of a wide range of real world applications. The dataflow model has particularly managed to recapture the interest of the research community due to its ability to express parallelism efficiently. Thus, a number of recently proposed hybrid parallel programming models combine dataflow and traditional shared memory models. Their findings have influenced the introduction of task dependency in the OpenMP 4.0 standard.This article presents DaSH – the first comprehensive benchmark suite for hybrid dataflow and shared memory programming models. DaSH features 11 benchmarks, each representing one of the Berkeley dwarfs that capture patterns of communication and computation common to a wide range of emerging applications. DaSH also includes sequential and shared-memory implementations based on OpenMP and Intel TBB to facilitate easy comparison between hybrid dataflow implementations and traditional shared memory implementations based on work-sharing and/or tasks. Finally, we use DaSH to evaluate three different hybrid dataflow models, identify their advantages and shortcomings, and motivate further research on their characteristics. 相似文献

17.

Efficient execution of speculative threads and transactions with hardware transactional memory

《Future Generation Computer Systems》2014

Thread-level speculation (TLS) was researched to automatically parallelize portions of serial programs for execution, and transactional memory (TM) was studied as a promising alternative of lock for parallel programming due to its simplicity. Both TLS and TM require similar underlying support. In the paper, we present SeTM (sequential transactional memory), a hardware enhanced TM system which supports TLS at minor extra cost. Signature is an effective way to buffer speculative states in TM and TLS. But it cripples TM and TLS performance due to its false-positive in terms of conflict detection, especially for conflict-intensive TLS. SeTM adopts R/W bits and signature concurrently to ameliorate this bad influence. Additionally, SeTM introduces the fast rollback mechanism, which provides fast abort recovery for eager log-based HTM and TLS. The most important contribution of SeTM is the conflict-tolerant mechanism, which tolerates some ambiguous data conflicts in TLS. Finally, in order to achieve an efficient execution for these un-order transactions, we add an extra ordering mechanism for SeTM. With this ordering mechanism, the transactions in TM can also gain the performance improvement with the support of conflict-tolerant mechanism. Our evaluation major on TM and TLS separately. For the TLS applications, six representative benchmarks have been adopted to evaluate the above model. Our experimental results show that our scheme improves the execution performance of most tested codes at a modest hardware cost. For a set of important scientific loops, we report the highest speedup of 6.5 with 15 cores. Besides, experimental results also show good scalability of SeTM system. For the TM applications, with respect to LogTM-SE, the benchmarks from STAMP also gain performance improvement signally. 相似文献

18.

Kokkos: Enabling manycore performance portability through polymorphic memory access patterns

《Journal of Parallel and Distributed Computing》2014,74(12):3202-3216

The manycore revolution can be characterized by increasing thread counts, decreasing memory per thread, and diversity of continually evolving manycore architectures. High performance computing (HPC) applications and libraries must exploit increasingly finer levels of parallelism within their codes to sustain scalability on these devices. A major obstacle to performance portability is the diverse and conflicting set of constraints on memory access patterns across devices. Contemporary portable programming models address manycore parallelism (e.g., OpenMP, OpenACC, OpenCL) but fail to address memory access patterns. The Kokkos C++ library enables applications and domain libraries to achieve performance portability on diverse manycore architectures by unifying abstractions for both fine-grain data parallelism and memory access patterns. In this paper we describe Kokkos’ abstractions, summarize its application programmer interface (API), present performance results for unit-test kernels and mini-applications, and outline an incremental strategy for migrating legacy C++ codes to Kokkos. The Kokkos library is under active research and development to incorporate capabilities from new generations of manycore architectures, and to address a growing list of applications and domain libraries. 相似文献

19.

The local memory access sequence of multiple induction variables on distributed memory machines

Tsung-Chuan Huang^{Author Vitae} Liang-Cheng Shiu Author VitaeAuthor Vitae 《Computers & Electrical Engineering》2004,30(3):231-244

相似文献

20.

The effects of general privacy concerns and transactional privacy concerns on Facebook apps usage

《Information & Management》2016,53(7):868-877

This study elucidates the role of control in the context of information privacy to develop a better understanding of the interactions between general privacy concerns and transactional privacy concerns. We posit that general privacy concerns moderate the effects of information collection and profile control on transactional privacy concerns, which in turn, influence willingness to delegate profile to Facebook apps. We test the research model in the context of Facebook apps installation. Results support our propositions. Theoretical contributions and practical implications for service providers and users are discussed. 相似文献