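  Paid full text   18 articles
  Free   2 articles
Chemical industry   3 articles
Machinery and instrumentation   1 article
Automation technology   16 articles
  2022   1 article
  2018   3 articles
  2017   1 article
  2016   3 articles
  2015   1 article
  2014   1 article
  2013   1 article
  2012   1 article
  2010   3 articles
  2008   3 articles
  2005   1 article
  2004   1 article
20 results found; search time 31 ms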
1.
In the search for new paradigms to simplify multithreaded programming, Transactional Memory (TM) is currently being advocated as a promising alternative to deadlock-prone lock-based synchronization. Future many-core CMP architectures may therefore need to provide hardware support for TM. On the other hand, power dissipation constitutes a first-class consideration in multicore processor designs. In this work, we propose Selective Dynamic Serialization (SDS) as a new technique to improve energy consumption without degrading performance in applications with conflicting transactions by avoiding wasted work due to aborted transactions. Our proposal, which is implemented on top of a hardware transactional memory (HTM) system with an eager conflict management policy, detects and serializes conflicting transactions dynamically (at run-time). In its simplest form, in case of conflict, one transaction is allowed to continue whilst the rest are completely stalled. Once the executing transaction has finished, it wakes up several of the stalled transactions. More elaborate implementations of SDS try to delay this behavior until serialization of transactions is profitable, achieving the best trade-off between performance, energy savings and network traffic. SDS implementations differ from each other in the condition that triggers the serialization mode. We have evaluated several SDS schemes using GEMS, a full-system simulator implementing the LogTM-SE Eager–Eager HTM system, and several benchmarks from the STAMP suite. Results for a 16-core CMP show that SDS obtains reductions of 6% on average in energy consumption (more than 20% in high-contention scenarios) in a wide range of benchmarks without affecting, on average, execution time. At the same time, network traffic is also reduced by 22%.
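The simplest form of serialization described in this abstract (one transaction proceeds, the rest stall, and a commit wakes some of the stalled ones) can be illustrated with a toy software model. This is only a sketch under assumptions: the class name `ToySerializer`, the `wake_count` parameter, and the rule that the first listed transaction wins the conflict are all hypothetical and are not taken from the paper, which describes a hardware mechanism.

```python
from collections import deque


class ToySerializer:
    """Toy model: on a conflict one transaction runs and the others stall."""

    def __init__(self, wake_count=1):
        # Hypothetical knob: how many stalled transactions a commit wakes up.
        self.wake_count = wake_count
        self.stalled = deque()

    def conflict(self, tx_ids):
        # Assumption: the first transaction in the list is allowed to continue.
        winner, *losers = tx_ids
        self.stalled.extend(losers)
        return winner

    def commit(self, tx_id):
        # The committing transaction wakes up to wake_count stalled ones (FIFO).
        n = min(self.wake_count, len(self.stalled))
        return [self.stalled.popleft() for _ in range(n)]


serializer = ToySerializer(wake_count=2)
winner = serializer.conflict([0, 1, 2, 3])  # transaction 0 keeps running
woken = serializer.commit(winner)           # transactions 1 and 2 resume
```

In this model no stalled transaction executes (and therefore none can abort) while the winner runs, which is the source of the avoided wasted work the abstract refers to.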
2.
In glueless shared-memory multiprocessors where cache coherence is usually maintained using a directory-based protocol, the fast access to the on-chip components (caches and network router, among others) contrasts with the much slower main memory. Unfortunately, directory-based protocols need to obtain the sharing status of every memory block before coherence actions can be performed. This information has traditionally been stored in main memory, and therefore these cache coherence protocols are far from optimal. In this work, we propose two alternative designs for the last-level private cache of glueless shared-memory multiprocessors: the lightweight directory and the SGluM cache. Our proposals completely remove directory information from main memory and store it in the home node's L2 cache, thus reducing both the number of accesses to main memory and the directory memory overhead. The main characteristics of the lightweight directory are its simplicity and the significant improvement in execution time for most applications. Its drawback, however, is that the performance of some particular applications could be degraded. The SGluM cache, in contrast, offers more modest improvements in execution time for all the applications by adding some extra structures that cope with the cases in which the lightweight directory fails.
3.
Although directory-based cache-coherence protocols are the best choice when designing chip multiprocessors with tens of cores on-chip, the memory overhead introduced by the directory structure may not scale gracefully with the number of cores. Many approaches aimed at improving the scalability of directories have been proposed. However, they do not bring perfect scalability and usually reduce the directory memory overhead by compressing coherence information, which in turn results in extra unnecessary coherence messages and, therefore, wasted energy and some performance degradation. In this work, we present a distributed directory organization based on duplicate tags for tiled CMP architectures whose size is independent of the number of tiles of the system up to a certain number of tiles. We demonstrate that this number of tiles corresponds to the number of sets in the private caches. Additionally, we show that the area overhead of the proposed directory structure is 0.56% with respect to the on-chip data caches. Moreover, the proposed directory structure keeps the same information as a non-scalable full-map directory. Finally, we propose a mechanism that takes advantage of this directory organization to remove the network traffic caused by replacements. This mechanism reduces total traffic by 15% for a 16-core configuration compared to a traditional directory-based protocol.
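The scaling problem that this abstract attributes to full-map directories can be made concrete with a little arithmetic: a full-map entry needs one presence bit per core for every tracked block. The sketch below assumes a 64-byte cache block (the abstract does not state the block size) and ignores state bits and tags; the function name `full_map_overhead` is illustrative.

```python
def full_map_overhead(n_cores, block_bytes=64):
    """Presence bits per block divided by data bits per block.

    Assumes one presence bit per core and a 64-byte block by default;
    state bits and tag storage are deliberately ignored.
    """
    return n_cores / (block_bytes * 8)


for cores in (16, 64, 256):
    print(f"{cores:4d} cores: {full_map_overhead(cores):.2%} of data storage")
```

Under these assumptions the overhead grows linearly with the core count, which is why proposals whose directory size stays fixed over a range of tile counts are attractive.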
4.
High-performance processor designs have evolved toward architectures that integrate multiple processing cores on the same chip. As the number of cores inside a Chip MultiProcessor (CMP) increases, the interconnection network will have a significant impact on both overall performance and energy consumption, as previous studies have shown. Moreover, wires used in such an interconnect can be designed with varying latency, bandwidth and power characteristics. In this work, we show how messages can be efficiently managed in tiled CMPs, from the point of view of both performance and energy, by combining address compression with a heterogeneous interconnect. In particular, our proposal is based on applying an address compression scheme that dynamically compresses the addresses within coherence messages, allowing for significant area slack. The resulting area slack is exploited for wire latency improvement by using a heterogeneous interconnection network comprised of a small set of very-low-latency wires for critical short messages in addition to baseline wires. Detailed simulations of a 16-core CMP show that our proposal obtains average improvements of 10% in execution time and 38% in the energy-delay² product of the interconnect. Additionally, the sensitivity analysis shows that our proposal performs well when either OoO cores or caches with higher latencies are considered.
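For readers unfamiliar with the metric, the energy-delay² product multiplies energy by the square of the delay, so delay improvements count twice. The short sketch below shows how a relative ED²P value follows from relative energy and delay; the function name and the inputs are illustrative assumptions, not figures from the paper, and the abstract does not say which delay term the authors used.

```python
def relative_ed2p(energy_ratio, delay_ratio):
    """Relative energy-delay^2 product: (E_new / E_old) * (D_new / D_old) ** 2."""
    return energy_ratio * delay_ratio ** 2


# Hypothetical input: unchanged energy and a 10% shorter delay.
ratio = relative_ed2p(energy_ratio=1.0, delay_ratio=0.90)
print(f"ED2P falls to {ratio:.0%} of the baseline")
```

Because the delay term is squared, a 10% delay reduction at constant energy already lowers ED²P by about 19%.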
5.
It is widely accepted that transient failures will appear more frequently in chips designed in the near future due to several factors such as the increased integration scale. On the other hand, chip multiprocessors (CMPs) that integrate several processor cores in a single chip are nowadays the best alternative for making more efficient use of the increasing number of transistors that can be placed in a single die. Hence, it is necessary to design new techniques to deal with these faults in order to build sufficiently reliable CMPs. In this work, we present a coherence protocol aimed at dealing with transient failures that affect the interconnection network of a CMP, thus assuming that the network is no longer reliable. In particular, our proposal extends a token-based cache coherence protocol so that no data can be lost and no deadlock can occur due to any dropped message. Using the GEMS full-system simulator, we compare our proposal against TokenCMP. We show that in the absence of failures our proposal does not introduce overhead in terms of increased execution time over TokenCMP. Additionally, our protocol can tolerate message loss rates much higher than those likely to be found in the real world without increasing execution time by more than 15 percent.
6.
Air may be easily incorporated by vigorous mechanical stirring, with the help of surfactants, of activated geopolymer-yielding suspensions. The cellular structure is stabilized by the viscosity increase caused by curing reactions, configuring an "inorganic gel casting". The present paper is aimed at extending this approach to mullite foams, obtained by the thermal treatment of engineered alkali-activated suspensions. "Green" foams were first obtained by gel casting of a suspension for Na-geopolymer enriched with reactive γ-Al₂O₃ powders. Sodium was later extracted by ionic exchange with ammonium salts. In particular, the removal of Na⁺ ions was achieved by immersion in ammonium nitrate solution overnight, with retention of the cellular structure. Finally, the ion-exchanged foams were successfully converted into pure mullite foams by application of a firing treatment at 1300°C for 1 hour. Preliminary results concerning the extension of the concept to mullite three-dimensional scaffolds are presented as well.
7.
The Journal of Supercomputing - Hardware transactional memory emerged to make parallel programming more accessible. However, the performance pitfall of this technique is squashing speculatively...
8.
9.
One important issue the designer of a scalable shared-memory multiprocessor must deal with is the amount of extra memory required to store the directory information. It is desirable that the directory memory overhead be kept as low as possible, and that it scale very slowly with the size of the machine. Unfortunately, current directory architectures provide scalability at the expense of performance. This work presents a scalable directory architecture that significantly reduces the size of the directory for large-scale configurations of a multiprocessor without degrading performance. First, we propose multilayer clustering as an effective approach to reduce the width of directory entries. Based on this concept, we derive three new compressed sharing codes, some of them with a space complexity of O(log₂(log₂(N))) for an N-node system. Then, we present a novel two-level directory architecture to eliminate the penalty caused by compressed directories in general. The proposed organization consists of a small full-map first-level directory (which provides precise information for the most recently referenced lines) and a compressed second-level directory (which provides in-excess information for all the lines). The proposals are evaluated based on extensive execution-driven simulations (using RSIM) of a 64-node cc-NUMA multiprocessor. Results demonstrate that a system with a two-level directory architecture achieves the same performance as a multiprocessor with a big and nonscalable full-map directory, with a very significant reduction of the memory overhead.
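To give a feel for how a sharing code can reach O(log₂(log₂(N))) bits, the sketch below arranges the N nodes as leaves of a binary tree and stores only the height of the smallest subtree that contains the home node and every sharer. This is an illustrative encoding written for this listing, assuming N is a power of two and that the home node is known when decoding; it is not claimed to be any of the three codes derived in the paper. It also shows why such codes are "in-excess": decoding returns a superset of the real sharers.

```python
def full_map_bits(n_nodes):
    # A full-map entry keeps one presence bit per node.
    return n_nodes


def tree_level_bits(n_nodes):
    # Possible subtree heights are 0..log2(N), i.e. log2(N) + 1 values.
    levels = (n_nodes - 1).bit_length() + 1
    # Bits needed to name one of those heights: ceil(log2(levels)).
    return (levels - 1).bit_length()


def encode_level(home, sharers):
    # Height of the smallest subtree holding the home node and all sharers.
    return max(((home ^ s).bit_length() for s in sharers), default=0)


def decode_level(home, level):
    # Every node in that subtree is treated as a potential sharer.
    base = (home >> level) << level
    return set(range(base, base + (1 << level)))


level = encode_level(home=5, sharers={4, 7})
print(full_map_bits(64), tree_level_bits(64))  # entry width in bits
print(level, sorted(decode_level(5, level)))   # decoded superset of sharers
```

Here nodes 4 and 7 share a line whose home is node 5; the decoded set also includes node 6, so an invalidation would reach one node that never held the line. That extra traffic is the penalty a precise first-level directory is meant to hide.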
10.
The stabilization of inorganic waste of various nature and origin in glasses has been a key strategy for environmental protection in recent decades. When properly formulated, glasses may retain many inorganic contaminants permanently, but it must be acknowledged that some criticism remains, mainly concerning costs and energy use. As a consequence, the sustainability of vitrification largely relies on the conversion of waste glasses into new, usable and marketable glass-based materials, in the form of monolithic and cellular glass-ceramics. The effective conversion in turn depends on the simultaneous control of both starting materials and manufacturing processes. While silica-rich wastes favour glass formation, iron-rich wastes affect the functionalities, influencing the porosity in cellular glass-based materials as well as catalytic, magnetic, optical and electrical properties. Engineered formulations may lead to important reductions of processing times and temperatures in the transformation of waste-derived glasses into glass-ceramics, or even bring interesting shortcuts. Direct sintering of wastes combined with recycled glasses, for example, has been proven a valid low-cost alternative for glass-ceramic manufacturing, for wastes with limited hazardousness. The present paper is aimed at providing an up-to-date overview of the correlation between formulations, manufacturing technologies and properties of the most recent waste-derived, glass-based materials. © 2016 The Authors. Journal of Chemical Technology & Biotechnology published by John Wiley & Sons Ltd on behalf of Society of Chemical Industry.