1.

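周权彪, 张兴军, 梁宁静, 霍文洁, 董小社 《计算机研究与发展》 2018,55(5):1065-1077

The classic flash translation layer (FTL) address-mapping scheme DFTL (demand-based FTL) keeps the global mapping information in flash and caches only the most recently and frequently used mapping entries, which resolves the tension in page-level mapping between a large mapping table and limited cache capacity. However, DFTL does not fully exploit the spatial locality of workloads to raise the cache hit ratio; frequent eviction of dirty mapping entries on cache misses causes a large number of translation-page writes; and it does not address the write amplification caused by migrating valid pages during garbage collection. To address these shortcomings, the paper proposes IRR-FTL (inter-reference recency-based FTL), an address-mapping method based on the reuse distance of cached mapping entries. It sets up translation-page cache slots to exploit workload spatial locality; it uses the reuse distance of cached mapping entries to partition the write-cache mapping table into hot and cold regions in a workload-adaptive way and manages the two regions with different policies, reducing translation-page writes; and it stores hot and cold data separately based on reuse distance to improve garbage-collection efficiency. Experiments on a variety of workloads show that, compared with DFTL, IRR-FTL raises the cache hit ratio by 29.1%, lowers the average response time by 27.3%, and reduces the erase count by 10.7%.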
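The demand-loading idea behind DFTL can be illustrated with a small sketch. This is not the paper's IRR-FTL and not a faithful DFTL: it assumes a plain LRU cache, and the class and attribute names (`CachedMappingTable`, `flash_table`, `translation_page_writes`) are hypothetical. It only shows why evicting dirty mapping entries costs extra translation-page writes.

```python
from collections import OrderedDict


class CachedMappingTable:
    """Toy demand-loaded page-mapping cache (plain LRU; names are hypothetical)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.flash_table = {}        # full logical -> physical map kept "in flash"
        self.cache = OrderedDict()   # lpn -> (ppn, dirty), least recently used first
        self.hits = 0
        self.misses = 0
        self.translation_page_writes = 0

    def _load(self, lpn):
        """Bring one entry into the cache, evicting the LRU entry if the cache is full."""
        if len(self.cache) >= self.capacity:
            victim, (ppn, dirty) = self.cache.popitem(last=False)
            if dirty:
                # A dirty entry has to be written back to its translation page.
                self.flash_table[victim] = ppn
                self.translation_page_writes += 1
        self.cache[lpn] = (self.flash_table.get(lpn), False)

    def _touch(self, lpn):
        """Record a hit or a miss and make lpn the most recently used entry."""
        if lpn in self.cache:
            self.hits += 1
            self.cache.move_to_end(lpn)
        else:
            self.misses += 1
            self._load(lpn)

    def read(self, lpn):
        """Translate a logical page number; returns None if it was never written."""
        self._touch(lpn)
        return self.cache[lpn][0]

    def write(self, lpn, new_ppn):
        """Remap a logical page; the cached entry becomes dirty."""
        self._touch(lpn)
        self.cache[lpn] = (new_ppn, True)
```

With a capacity of two entries, writing three different logical pages already forces one dirty eviction, which is the overhead that hot/cold partitioning of the write cache aims to reduce.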

2.

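The Big Data Era and Data Reuse

《信息与电脑》2018,(5)

As time passes and society changes, people's quality of life has improved considerably, and the arrival of the information age has formally ushered in the "big data era." Today life, work, and study all depend on collecting and processing data, which places new demands on data reuse in Internet technology. Existing data-processing systems can no longer fully meet people's individual needs, so to improve work efficiency and satisfy customer requirements, engineers have begun to focus their research on data reuse.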

3.

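A Data Reuse Algorithm for Cholesky Factorization on Clusters

刘凤, 刘青昆 《微计算机应用》 2011,32(2):15-20

In out-of-core computation, disk I/O has a high startup cost, so file access accounts for a large share of the running time. Reducing the number of file reads can therefore improve efficiency substantially, and data reuse is an effective technique for doing so. The paper splits the data into several files and keeps the file whose Cholesky factorization has just finished in the memory buffer. When the next file is factorized, it is updated with the data of the file that was just factorized. This reduces the number of I/O read operations and thus improves factorization efficiency.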

4.

A CMA Blind Equalization Algorithm Based on Data Reuse

《计算机工程与应用》2008,44(23)

5.

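A Cache Replacement Algorithm Based on Reuse Distance Prediction and Stream Detection

林隽民, 王炜, 乔林, 汤志忠 《计算机研究与发展》 2012,49(5):1049-1060

Traditional cache replacement algorithms perform poorly because they cannot adapt to the streaming access behavior of applications. The paper designs a prediction method based on period detection, analyzes the regularity of memory-access reuse distances and the complexity of streaming accesses, and proposes RDP, an algorithm that uses reuse distance prediction to adapt to both simple and complex stream access patterns. The basic idea of RDP is to predict reuse distances and dynamically maintain reuse distance counters, dynamically adjust the replacement order of cached data, and reduce storage overhead through stream sampling. Experiments show that RDP adapts well to the diverse stream access patterns in programs and outperforms LRU and DIP overall; on a 32MB cache it reduces cache misses by 27.5% on average relative to traditional LRU.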

6.

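An Environmental Data Reuse Method Based on an Interchange Mechanism

谢孔树, 黄晓冬, 温玮, 王存仁 《计算机应用与软件》 2012,29(7):53-55,59

Modeling and simulation of the synthetic natural environment (SNE) is a current focus and difficulty in military modeling and simulation, and reusing SNE data across application domains is the trend. The paper first starts from the layers of synthetic environment data and studies the basic principles and processing flow of a data reuse method based on an interchange mechanism. It then analyzes SEDRIS (Synthetic Environment Data Representation and Interchange Specification) in detail to demonstrate the feasibility and practicality of the method. Finally, based on the shortcomings of SEDRIS in application, it analyzes the future direction of data interchange methods, offering new ideas and insights for data reuse in modeling and simulation.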

7.

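A CMA Blind Equalization Algorithm Based on Data Reuse

刘世刚, 葛临东 《计算机工程与应用》 2008,44(23):105-106

The paper proposes a CMA (constant modulus algorithm) blind equalization algorithm suited to short burst signals. Based on the idea of data reuse, the algorithm greatly shortens the convergence time of CMA blind equalization while leaving the computational complexity essentially unchanged. The paper describes the algorithm's mathematical formulation in detail and analyzes its performance. Simulation results show that the algorithm has real practical value for blind equalization of short burst signals.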
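The abstract does not reproduce the equations. For orientation only, the textbook CMA (2,2) update is given below; the symbols (equalizer weights $\mathbf{w}$, input vector $\mathbf{x}$, step size $\mu$, dispersion constant $R_2$, transmitted symbol $s$) are assumptions and need not match the paper's notation.

```latex
\begin{aligned}
y(n) &= \mathbf{w}^{H}(n)\,\mathbf{x}(n),\\
e(n) &= y(n)\,\bigl(|y(n)|^{2} - R_{2}\bigr), \qquad
R_{2} = \frac{E\bigl[|s|^{4}\bigr]}{E\bigl[|s|^{2}\bigr]},\\
\mathbf{w}(n+1) &= \mathbf{w}(n) - \mu\, e^{*}(n)\,\mathbf{x}(n).
\end{aligned}
```

One common reading of "data reuse" for a burst of $N$ samples, again an assumption rather than the paper's stated scheme, is to run this recursion over the same burst $K$ times, starting each pass from the weights left by the previous one: $\mathbf{w}^{(k)}(1) = \mathbf{w}^{(k-1)}(N+1)$ for $k = 2, \dots, K$.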

8.

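An Environmental Data Reuse Method Based on a Product Mechanism

谢孔树, 黄晓冬, 王存仁, 温玮 《计算机技术与发展》 2012,(4)

Modeling and simulation of the synthetic natural environment (SNE) is a current focus and difficulty in military modeling and simulation, and reusing SNE data across application domains is a developing trend. The paper first starts from the layers of synthetic environment data, studies the basic principles and processing of a data reuse method based on a data-product mechanism, and analyzes and designs a simulation implementation of that mechanism. It then analyzes CDB (Common DataBases) in detail to demonstrate the feasibility and practicality of the method. Finally, based on the shortcomings of CDB in application, it analyzes the future direction of the data-product mechanism, offering new insights for data reuse in modeling and simulation.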

9.

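Object-Oriented Technology and Data-Driven Software Reuse (cited 4 times in total: 0 self-citations, 4 by others)

王忠群 《计算机应用》 1999,19(8):11-13

This paper describes a software reuse technique for information system development, the data-driven mechanism, and demonstrates its effectiveness with an example.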

10.

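A Mahalanobis-Distance-Based Method for Removing Anomalous Big Data in Heterogeneous Networks

董彦佼, 李泽峰, 陈小海 《计算机仿真》 2022,(1):408-411,445

Traditional methods for removing anomalous big data in heterogeneous networks suffer from high data dimensionality and pronounced noise, which lead to a low removal rate for anomalous data and unsatisfactory accuracy. The study proposes a Mahalanobis-distance-based method for removing anomalous big data in heterogeneous networks. It uses an improved Mahalanobis distance to reduce the dimensionality of heterogeneous network data, analyzes the correlations among the data, extracts the principal components of the network data, and generates a Gaussian weighted kernel function with strong noise resistance. From the dimensionality-reduced network data it builds an anomalous big-data information flow model...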
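As background for the distance the abstract builds on, here is a minimal sketch of ordinary Mahalanobis-distance outlier filtering for 2-D points. It is not the paper's improved distance or its dimensionality-reduction step; the function names and the threshold are hypothetical choices made for illustration.

```python
import math


def mahalanobis_2d(points):
    """Return the Mahalanobis distance of each 2-D point from the sample mean."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("covariance matrix is singular")
    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # d^2 = [dx dy] * inverse(cov) * [dx dy]^T, written out for the 2x2 case.
        d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
        distances.append(math.sqrt(max(d2, 0.0)))
    return distances


def drop_outliers(points, threshold):
    """Keep only the points whose Mahalanobis distance is at most threshold."""
    distances = mahalanobis_2d(points)
    return [p for p, d in zip(points, distances) if d <= threshold]
```

Because the distance is scaled by the covariance, a point is judged by how unusual it is relative to the spread and correlation of the data rather than by its raw Euclidean offset.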

11.

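A New Compilation Framework Based on SLP

张素平, 王冬, 丁丽丽, 王鹏翔, 宫一, 于海宁 《计算机应用研究》 2017,34(1)

To address the SLP algorithm's inability to handle efficiently large applications in which parallel code makes up only a small share, this paper proposes and evaluates a new compilation framework based on an improved SLP (superword level parallelism) algorithm. The framework has three stages. First, the improved SLP algorithm converts structurally similar non-isomorphic statements in the code into isomorphic statements wherever possible. Next, taking a global view, it obtains the data reuse model before optimizing the target code. Finally, it combines this with data layout optimization for further performance gains. Extensive experiments show that the framework outperforms the SLP algorithm by about 15.3%.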

12.

Improving Memory Traffic by Assembly-Level Exploitation of Reuses for Vector Registers (cited 1 time in total: 0 self-citations, 1 by others)

Chang Chih-Yung Chen Tzung-Shi Sheu Jang-Ping 《The Journal of supercomputing》2000,17(2):187-204

In this paper, we propose a compilation scheme to analyze and exploit the implicit reuses of vector register data. According to the reuse analysis, we present a translation strategy that translates the vectorized loops into assembly vector codes with exploitation of vector reuses. Experimental results show that our compilation technique can improve the execution time and traffic between shared memory and vector registers. Techniques discussed here are simple, systematic, and easy to implement in conventional vector compilers or translators to enhance the data locality of vector registers.

13.

Yu‐Te Lin Jenq‐Kuen Lee 《Concurrency and Computation》2016,28(5):1629-1654

Multi-core systems equipped with micro processing units and accelerators such as digital signal processors (DSPs) and graphics processing units (GPUs) have become a major trend in processor design in recent years in attempts to meet ever-increasing application performance requirements. Open Computing Language (OpenCL) is one of the programming languages that include new extensions proposed to exploit the computing power of these kinds of processors. Among the newly extended language features, the single-instruction multiple-data (SIMD) linguistics and vector types are added to OpenCL to exploit hardware features of the accelerators. The addition makes it necessary to consider how traditional compiler data flow analysis can be adopted to meet the optimization requirements of vector linguistics. In this paper, we propose a calculus framework to support the data flow analysis of vector constructs for OpenCL programs that compilers can use to perform SIMD optimizations. We model OpenCL vector operations as data access functions in the style of mathematical functions. We then show that the data flow analysis for OpenCL vector linguistics can be performed based on the data access functions. Based on the information gathered from data flow analysis, we illustrate a set of SIMD optimizations on OpenCL programs. The experimental results incorporating our calculus and our proposed compiler optimizations show that the proposed SIMD optimizations can provide average performance improvements of 22% on x86 CPUs and 4% on AMD (Advanced Micro Devices) GPUs. For the selected 15 benchmarks, 11 of them are improved on x86 CPUs, and six of them are improved on AMD GPUs. The proposed framework has the potential to be used to construct other SIMD optimizations on OpenCL programs. Copyright © 2015 John Wiley & Sons, Ltd.

14.

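Cache Miss Ratio Analysis Based on Reuse Distance

付雄, 张昱, 陈意云 《小型微型计算机系统》 2006,27(9):1777-1781

Reuse distance has become an important metric of program cache behavior, but its high complexity and possible memory overflow make it hard to apply. By introducing a maximum cache size, this paper proposes a bounded reuse distance analysis method. The method avoids the memory overflow that general reuse distance analysis can cause and brings reuse distance analysis to linear time complexity. Experiments on a set of integer and floating-point programs demonstrate the feasibility and correctness of cache miss ratio analysis based on this reuse distance analysis.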
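The idea of bounding the analysis by a maximum cache size can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the paper's algorithm: it keeps an LRU stack of at most `max_cache_size` items, so memory stays bounded and each access costs at most `max_cache_size` steps. The function names are hypothetical.

```python
from collections import OrderedDict


def bounded_reuse_distances(trace, max_cache_size):
    """Reuse distance of each access, capped at max_cache_size.

    The distance is the number of distinct other items touched since the
    previous access to the same item. None stands for "at least
    max_cache_size", which includes first-time accesses.
    """
    stack = OrderedDict()  # most recently used item is last
    distances = []
    for item in trace:
        if item in stack:
            # Count the items that were used more recently than `item`.
            distance = 0
            for key in reversed(stack):
                if key == item:
                    break
                distance += 1
            distances.append(distance)
            stack.move_to_end(item)
        else:
            distances.append(None)
            stack[item] = True
            if len(stack) > max_cache_size:
                stack.popitem(last=False)  # forget the least recently used item
    return distances


def lru_miss_ratio(distances, cache_size):
    """Miss ratio of a fully associative LRU cache holding cache_size items."""
    misses = sum(1 for d in distances if d is None or d >= cache_size)
    return misses / len(distances)
```

The result is meaningful for any `cache_size` up to `max_cache_size`: an access hits in a fully associative LRU cache exactly when its reuse distance is smaller than the cache size, so distances beyond the bound never need to be known precisely.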

15.

Performance Metrics and Models for Shared Cache

丁晨, 向晓娅, 包斌, 罗昊, 罗英伟, 汪小林 《计算机科学技术学报》 2014,29(4):692-712

Performance metrics and models are prerequisites for scientific understanding and optimization. This paper introduces a new footprint-based theory and reviews the research in the past four decades leading to the new theory. The review groups the past work into metrics and their models, in particular those of the reuse distance, metrics conversion, models of shared cache, performance and optimization, and other related techniques.

16.

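Multicore Program Interaction: Theory and Applications

丁晨, 袁良 《计算机工程与科学》 2014,36(1):1-5

The efficiency of shared cache use on multicore processors, that is, program locality, is one of the key factors affecting parallel program performance. The paper presents a footprint-based theory of locality. It describes the conversions among miss ratio, reuse distance, and footprint, and uses the composability of footprint to build a locality prediction model for parallel programs.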
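To make the footprint metric concrete, the sketch below computes it by brute force and applies the footprint-to-miss-ratio conversion as it is usually stated in the footprint literature (the miss ratio at cache size c is approximately fp(w+1) - fp(w) for the window length w at which fp(w) reaches c). The function names are hypothetical, the quadratic-time computation is for illustration only, and the conversion is an approximation that relies on assumptions about the trace.

```python
def footprint(trace, window):
    """Average number of distinct items over all windows of the given length."""
    n = len(trace)
    if not 1 <= window <= n:
        raise ValueError("window must be between 1 and len(trace)")
    total = sum(len(set(trace[i:i + window])) for i in range(n - window + 1))
    return total / (n - window + 1)


def miss_ratio_from_footprint(trace, cache_size):
    """Estimate the miss ratio as the footprint growth rate at fp(w) >= cache_size."""
    for w in range(1, len(trace)):
        if footprint(trace, w) >= cache_size:
            return footprint(trace, w + 1) - footprint(trace, w)
    return 0.0  # the whole trace fits in the cache
```

For a trace that cycles through three items, the estimate is 1.0 for a cache of two items and 0.0 for a cache of three, which matches steady-state LRU behavior on that pattern.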

17.

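A Data-Level Parallel Implementation of Odd-Even Merge Sort

张珂良, 李佳佳, 陈钢, 吴百锋 《小型微型计算机系统》 2012,33(6):1343-1349

Odd-even merge sort has great potential for data-level parallelism, and implementing it on a GPU, which offers powerful data-level parallelism, yields a high speedup. Because OpenCL does not support synchronization among work-items in different work-groups, the paper proposes two solutions. One lets the host program control the iteration, which removes the need for synchronization among all work-items entirely. The other uses bucket-partitioning preprocessing to confine the need for synchronization to a single work-group and then relies on the synchronization mechanism available among work-items within one work-group to handle synchronization correctly. Experiments show that programs implemented this way perform markedly better than the sort implementation in the C++ STL.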
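For readers unfamiliar with the algorithm, the following sequential sketch generates the comparators of Batcher's odd-even merge sort network and applies them to a list. It assumes a power-of-two input length and is not the paper's OpenCL implementation; `odd_even_merge_sort` is a hypothetical wrapper name.

```python
def oddeven_merge(lo, hi, r):
    """Yield comparators that merge two sorted halves of indices lo..hi (inclusive)."""
    step = r * 2
    if step < hi - lo:
        yield from oddeven_merge(lo, hi, step)
        yield from oddeven_merge(lo + r, hi, step)
        yield from [(i, i + r) for i in range(lo + r, hi - r, step)]
    else:
        yield (lo, lo + r)


def oddeven_merge_sort_range(lo, hi):
    """Yield comparators that sort indices lo..hi (inclusive endpoints)."""
    if hi - lo >= 1:
        mid = lo + (hi - lo) // 2
        yield from oddeven_merge_sort_range(lo, mid)
        yield from oddeven_merge_sort_range(mid + 1, hi)
        yield from oddeven_merge(lo, hi, 1)


def odd_even_merge_sort(values):
    """Sort a power-of-two-length sequence with the fixed comparator network."""
    n = len(values)
    if n & (n - 1):
        raise ValueError("length must be a power of two")
    data = list(values)
    for i, j in oddeven_merge_sort_range(0, n - 1):
        # Compare-exchange: the smaller value ends up at the lower index.
        if data[i] > data[j]:
            data[i], data[j] = data[j], data[i]
    return data
```

The comparator sequence does not depend on the data, and many comparators touch disjoint positions, so they could be executed in parallel; the sketch simply runs them one after another.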

18.

Cited 1 time in total (0 self-citations, 1 by others)

M.J. Harvey G. De Fabritiis 《Computer Physics Communications》2011,(4):1093-1099

The use of modern, high-performance graphical processing units (GPUs) for acceleration of scientific computation has been widely reported. The majority of this work has used the CUDA programming model supported exclusively by GPUs manufactured by NVIDIA. An industry standardisation effort has recently produced the OpenCL specification for GPU programming. This offers the benefits of hardware-independence and reduced dependence on proprietary tool-chains. Here we describe a source-to-source translation tool, “Swan” for facilitating the conversion of an existing CUDA code to use the OpenCL model, as a means to aid programmers experienced with CUDA in evaluating OpenCL and alternative hardware. While the performance of equivalent OpenCL and CUDA code on fixed hardware should be comparable, we find that a real-world CUDA application ported to OpenCL exhibits an overall 50% increase in runtime, a reduction in performance attributable to the immaturity of contemporary compilers. The ported application is shown to have platform independence, running on both NVIDIA and AMD GPUs without modification. We conclude that OpenCL is a viable platform for developing portable GPU applications but that the more mature CUDA tools continue to provide best performance.

Program summary

Program title: Swan
Catalogue identifier: AEIH_v1_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEIH_v1_0.html
Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
Licensing provisions: GNU Public License version 2
No. of lines in distributed program, including test data, etc.: 17 736
No. of bytes in distributed program, including test data, etc.: 131 177
Distribution format: tar.gz
Programming language: C
Computer: PC
Operating system: Linux
RAM: 256 Mbytes
Classification: 6.5
External routines: NVIDIA CUDA, OpenCL
Nature of problem: Graphical Processing Units (GPUs) from NVIDIA are preferentially programmed with the proprietary CUDA programming toolkit. An alternative programming model promoted as an industry standard, OpenCL, provides similar capabilities to CUDA and is also supported on non-NVIDIA hardware (including multicore x86 CPUs, AMD GPUs and IBM Cell processors). The adaptation of a program from CUDA to OpenCL is relatively straightforward but laborious. The Swan tool facilitates this conversion.
Solution method: Swan performs a translation of CUDA kernel source code into an OpenCL equivalent. It also generates the C source code for entry point functions, simplifying kernel invocation from the host program. A concise host-side API abstracts the CUDA and OpenCL APIs. A program adapted to use Swan has no dependency on the CUDA compiler for the host-side program. The converted program may be built for either CUDA or OpenCL, with the selection made at compile time.
Restrictions: No support for CUDA C++ features
Running time: Nominal

19.

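Design and Implementation of a High-Performance Logical File System

赵奕, 唐荣锋, 陈欢, 熊劲, 马捷 《计算机工程》 2008,34(6):74-76

A server-side file system needs not only large capacity but also high I/O performance under heavy concurrent access. The paper proposes a technique that integrates multiple physical file systems into one logical file system in software. It aggregates the bandwidth and capacity of the disk devices behind each file system and combines the strengths of different file systems in metadata and data processing performance. Performance tests show that the logical file system technique is an effective way to build a high-performance file system that supports highly concurrent access.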

20.

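An Automatic Data Layout Optimization Model for Distributed-Memory Parallel Machines

谢幸, 陈国良, 武继刚 《计算机研究与发展》 2000,37(10):1173-1178

On distributed-memory parallel machines, the quality of the data layout greatly affects application performance. Earlier work generally decomposed automatic data layout optimization approximately into two steps, data alignment optimization and data distribution optimization, and studied their combination only in the one-dimensional case. Building on related work, this paper combines data alignment optimization and data distribution optimization in a single model for the multidimensional case and proposes a unified multidimensional static data layout model. The model avoids heuristic strategies and therefore describes the automatic data layout optimization problem more precisely; it also ...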