1.

Next High Performance and Low Power Flash Memory Package Structure

Jung-Hoon Lee 《计算机科学技术学报》2007,22(4):515-520

In general, NAND flash memory has advantages over NOR flash in power consumption, storage capacity, and erase/write performance. However, the main drawback of NAND flash memory is its slow access time for random read operations. To overcome this drawback, we propose a new NAND flash memory package. We present a high-performance, low-power NAND flash memory system with a dual cache memory. The proposed NAND flash package consists of two parts: a NAND flash memory module and a dual cache module. The new NAND flash memory system can achieve dramatically higher performance and lower power consumption than any conventional NAND-type flash memory module. Our results show that the proposed system can eliminate about 78% of write operations to the flash memory cell and about 70% of read operations from the flash memory cell using only 3 KB of additional cache space. This result indicates high potential for low power consumption and a large performance gain.

2.

HAT: an efficient buffer management method for flash-based hybrid storage systems

Yanfei LV; Bin CUI; Xuexuan CHEN; Jing LI 《Frontiers of Computer Science in China》2014,(3):440-455

Flash solid-state drives (SSDs) provide much faster access to data than traditional hard disk drives (HDDs). The current price and performance of SSDs suggest that they can be adopted as a data buffer between main memory and HDD, and buffer management policies for such hybrid systems have recently attracted growing interest from the research community. In this paper, we propose a novel approach to managing the buffer in flash-based hybrid storage systems, named hotness aware hit (HAT). HAT exploits a page reference queue to record the access history as well as the status of accessed pages, i.e., hot, warm, or cold. Additionally, the page reference queue is further split into hot and warm regions, which generally correspond to the memory and flash, respectively. The HAT approach updates the page status and handles page migration in the memory hierarchy according to the current page status and the hit position in the page reference queue. Compared with existing hybrid storage approaches, the proposed HAT can manage the memory and flash cache layers more effectively. Our empirical evaluation on benchmark traces demonstrates the superiority of the proposed strategy over state-of-the-art competitors.

3.

A robust VBR video transmission method with joint rate control and buffer management

张源海李凯慧许长桥孙利民《自动化学报》2008,34(3):337-343

In this paper we present an adaptive video transmission framework that integrates rate allocation and buffer control at the source with a playback adjustment mechanism at the receiver. The transmission rate is determined by a rate allocation algorithm that uses the program clock reference (PCR) embedded in the video streams to regulate the transmission rate in a refined way. The server side also maintains multiple buffers for packets of different importance levels to trade off random loss for controlled loss according to the source buffer size, the visual impact, and the playback deadline. An over-boundary playback adjustment mechanism based on a proportional-integral (PI) controller is adopted at the receiver to maximize the visual quality of the displayed video according to the overall loss and the receiver buffer occupancy. The performance of the proposed framework is evaluated in terms of peak signal-to-noise ratio (PSNR) in simulations, and the results demonstrate improved average PSNR values as well as better quality of the decoded frames.

4.

I/O performance of an RAID-10 style parallel file system (total citations: 1; self-citations: 0; citations by others: 1)

Dan Feng, Hong Jiang, Yi-Feng Zhu 《计算机科学技术学报》2004,19(6):0-0

Without any additional cost, all the disks on the nodes of a cluster can be connected together through CEFT-PVFS, a RAID-10 style parallel file system, to provide multi-GB/s parallel I/O performance. I/O response time is one of the most important measures of quality of service for a client. When multiple clients submit data-intensive jobs at the same time, the response time experienced by the user is an indicator of the power of the cluster. In this paper, a queuing model is used to analyze in detail the average response time when multiple clients access CEFT-PVFS. The results reveal that response time is a function of several operational parameters. I/O response time decreases as the I/O buffer hit rate for read requests, the write buffer size for write requests, and the number of server nodes in the parallel file system increase, while a higher I/O request arrival rate leads to a longer I/O response time. On the other hand, the collective power of a large cluster supported by CEFT-PVFS is shown to sustain a steady and stable I/O response time over a relatively large range of request arrival rates.

5.

MacroTrend: A Write-Efficient Cache Algorithm for NVM-Based Read Cache

鲍宁柴云鹏秦啸王传雯《计算机科学技术学报》2022,37(1):207-230

Future storage systems are expected to contain a wide variety of storage media and layers due to the rapid development of NVM (non-volatile memory) techniques. For NVM-based read caches, many kinds of NVM devices cannot withstand frequent data updates because of limited write endurance or the high energy consumption of writing. However, traditional cache algorithms have to update cached blocks frequently, because it is difficult for them to predict long-term popularity from limited information about data blocks, such as a single value or a queue that reflects frequency or recency. In this paper, we propose a new MacroTrend (macroscopic trend) prediction method to discover long-term hot blocks through the blocks' macro trends, as illustrated by their access count histograms. A new cache replacement algorithm is then designed based on the MacroTrend prediction to greatly reduce the write amount while improving the hit ratio. We conduct extensive experiments driven by a series of real-world traces and find that, compared with LRU, MacroTrend can significantly reduce the write amounts of NVM cache devices with similar hit ratios, leading to longer NVM lifetime or lower energy consumption.

6.

Approaches for constrained parametric curve interpolation (total citations: 1; self-citations: 0; citations by others: 1)

张彩明杨兴强汪嘉业《计算机科学技术学报》2003,18(5):0-0

The construction of a GC¹ cubic interpolating curve that lies on the same side of a given straight line as the data points is studied. The main task is to choose appropriate approaches to modify the tangent vectors at the data points for the desired curve. Three types of approaches for changing the magnitudes of the tangent vectors are presented. The first modifies the tangent vectors by applying a constraint to the curve segment. The second does so using optimization techniques. The third is a modification of the existing method. Three criteria are presented to compare the three types of approaches with the existing method. Experiments that test the effectiveness of the approaches are included.

7.

Two novel iterative algorithms for interference alignment with symbol extensions in the MIMO interference channel (total citations: 1; self-citations: 0; citations by others: 1)

Chao Wang Ke Deng 《中国科学:信息科学(英文版)》2014,57(4):1-14

Interference alignment (IA) with symbol extensions in the quasi-static flat-fading K-user multiple-input multiple-output (MIMO) interference channel (IC) is considered in this paper. In general, long symbol extensions are required to achieve the optimal fractional degrees of freedom (DOF). However, long symbol extensions over orthogonal dimensions produce structured (diagonal or block diagonal) channel matrices from transmitters to receivers. Most existing approaches are limited in cases where the channels have some special structures, because they align the interference without explicitly preserving the dimensionality of the desired signal. To overcome this common drawback of most existing IA algorithms, two novel iterative algorithms for IA with symbol extensions are proposed. The first algorithm designs transceivers for IA based on the mean square error (MSE) criterion, minimizing the total MSE of the system while preserving the dimensionality of the desired signal. This IA algorithm is formulated as a constrained optimization problem that can be solved by the Lagrangian method. Its convergence is proven as well. Utilizing the reciprocity of alignment, the second algorithm is based on maximizing the multidimensional case of the generalized Rayleigh quotient. It maximizes each receiver's signal to interference plus noise ratio (SINR) while preserving the dimensionality of the desired signal. In simulation results, we show the superiority of the proposed algorithms in four aspects, i.e., average sum rate, the fraction of the interfering signal power in the desired signal subspace, bit error rate (BER), and the relative power of the weakest desired data stream.

8.

Generation and characterization of orthogonal FH sequences for the cognitive network

GUAN Lei LI Zan SI JiangBo HUANG YangChao 《中国科学:信息科学(英文版)》2015,(2):140-150

The existing orthogonal frequency hopping (FH) sequences cannot support a high-throughput, spectrum-efficient cognitive FH (CFH) network because of their small family size, high computational complexity, and short period. To overcome these disadvantages, this paper investigates the generation of orthogonal FH sequences and analyzes their multiple-access performance based on the CFH frequency division multiple-access (FDMA) network model. Using random mapping and a cyclical shift replacement (CSR) scheme, a large family of orthogonal FH sequences with a dynamic number of frequency slots is generated. In this case, external interference can be eliminated by avoiding the interfered frequencies, and mutual interference between packets can be blocked by using orthogonal frequencies. Moreover, the theoretical relationships of the throughput and transmission delay with respect to the number of users and the packet arrival rate are given, which show that the proposed orthogonal FH sequence can support high throughput and short packet transmission delay in a CFH-FDMA network. The simulation results validate our theoretical analysis of the CFH-FDMA network performance and show that the proposed sequence outperforms the widely used no-hit-zone FH sequences in terms of uniformity, randomness, Hamming correlation, complexity, and sensitivity.

9.

Optimizing random write performance of FAST FTL for NAND flash memory

GUO XuFeng WANG YuPing 《中国科学:信息科学(英文版)》2015,58(3):52-65

NAND flash memory has gained popularity as a storage device for consumer electronics due to its higher performance and lower power consumption. In most of these devices, an FTL (Flash Translation Layer) is adopted to emulate a block device interface in order to support conventional disk-based file systems, which makes flash management much easier. Among various FTLs, the FAST (Fully-Associative Sector Translation) FTL has shown superior performance, becoming one of the state-of-the-art approaches. However, the FAST FTL performs poorly when dealing with a huge number of small random writes issued by upper-layer applications such as database transaction processing workloads. The two main reasons are the absence of an efficient selection scheme for reclaiming random log blocks, which leads to a large overhead of full merges, and the sequential log block scheme, which no longer suits random writes because of the large cost of partial merges. To overcome these two defects in the presence of random writes, two techniques are proposed. The first technique reduces full merge costs by adopting a novel random log block selection algorithm, which uses the block associativity and the relevant-valid-page amount of random log blocks as the key block selection criteria. The second technique replaces the sequential log block with a random log block to eliminate the overhead of partial merges. Experimental results show that our optimizations outperform FAST FTL significantly in three aspects: erase counts, page migration amount, and response time. The maximum improvements in these aspects reach 66.8%, 98.2%, and 51.0%, respectively.

10.

Novel Operation Mechanism of Capacitorless DRAM Cell Using Impact Ionization and GIDL Effects

Huibin Tao Jianing Hou Zhibiao Shao 《计算机技术与应用:英文》2013,(7):351-355

A novel operation mechanism of a capacitorless SOI-DRAM (silicon on insulator dynamic random access memory) cell using impact ionization and GIDL (gate-induced drain leakage) effects for the write "1" operation is proposed. A conventional capacitorless DRAM cell with a single charge-generating effect offers either high speed or low power, whereas the proposed DG-FinFET (double-gate fin field effect transistor) cell efficiently integrates the impact ionization and GIDL effects by coupling the front and back gates with an optimal body doping profile and proper bias conditions, yielding both high speed and low power. The simulation results demonstrate ideal characteristics in both cell operation and power consumption. Low power consumption is achieved by using a GIDL current of 0.1 μA while the coupling between the front and back gates restrains the impact ionization current in the first phase. The write operation of the cell completes within 1 ns, owing to the significant current of the impact ionization effect in the second phase. By shortening the second phase, power consumption could be decreased further. The ratio of the read "1" current to the read "0" current is more than 9.38E5. Moreover, the cell has very good retention characteristics.

11.

NAND-SPIN-based processing-in-MRAM architecture for convolutional neural network acceleration

Yinglin ZHAO Jianlei YANG Bing LI Xingzhou CHENG Xucheng YE Xueyan WANG Xiaotao JIA Zhaohao WANG Youguang ZHANG Weisheng ZHAO 《中国科学:信息科学(英文版)》2023,(4):244-260

The performance and efficiency of running large-scale datasets on traditional computing systems exhibit critical bottlenecks due to the existing “power wall” and “memory wall” problems. To resolve those problems, processing-in-memory (PIM) architectures are developed to bring computation logic in or near memory to alleviate the bandwidth limitations during data transmission. NAND-like spintronics memory (NAND-SPIN) is one kind of promising magnetoresistive random-access memory (MRAM) with low write...

12.

PASS: a simple, efficient parallelism-aware solid state drive I/O scheduler

Hong-yan LI Nai-xue XIONG Ping HUANG Chao GUI 《浙江大学学报:C卷英文版》2014,15(5):321-336

Emerging non-volatile memory technologies, especially flash-based solid state drives (SSDs), have increasingly been adopted in the storage stack. They provide numerous advantages over traditional mechanically rotating hard disk drives (HDDs) and tend to replace them. Because HDDs have long been the primary building blocks of storage systems, however, much of the system software has been designed specifically for HDDs and may not be optimal for non-volatile memory media. Therefore, to fully leverage the superior raw performance of these media, the existing upper-layer software has to be re-evaluated or redesigned. To this end, we propose PASS, an optimized I/O scheduler at the Linux block layer that accommodates the shift of underlying storage devices toward flash-based SSDs. PASS takes the rich internal parallelism of SSDs into account when dispatching requests to the device driver in order to achieve high performance. Specifically, it partitions the logical storage space into fixed-size regions (preferably of the component package size) that serve as scheduling units. These scheduling units are serviced in a round-robin manner, and on each turn the chosen unit issues only a batch of either read or write requests, which suppresses excessive mutual interference. Additionally, requests are sorted by their target addresses while waiting in the dispatching queues to exploit the high sequential performance of SSDs. Experimental results with a variety of workloads show that PASS outperforms the four off-the-shelf Linux I/O schedulers by 3% to 41%, while at the same time improving device lifetime significantly by reducing internal write amplification.
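The dispatching policy described in this abstract (fixed-size regions, round-robin service, one read-only or write-only batch per turn, address-sorted queues) can be illustrated with a minimal Python sketch. This is not the PASS implementation: the class name `RegionScheduler`, the region and batch sizes, and the rule for choosing between reads and writes are all assumptions made for illustration.

```python
from collections import deque


class RegionScheduler:
    """Toy region-based scheduler (hypothetical, not the PASS code).

    Requests are grouped by fixed-size logical region. Regions are served
    round-robin, and each turn issues one batch of only reads or only writes.
    """

    def __init__(self, region_size=1024, batch_size=4):
        self.region_size = region_size  # assumed region size, in sectors
        self.batch_size = batch_size    # assumed maximum batch length
        self.queues = {}                # region id -> {"R": [...], "W": [...]}
        self.order = deque()            # round-robin order of active regions

    def submit(self, op, addr):
        """Queue a request; op is "R" or "W", addr is a logical address."""
        region = addr // self.region_size
        if region not in self.queues:
            self.queues[region] = {"R": [], "W": []}
            self.order.append(region)
        self.queues[region][op].append(addr)

    def dispatch(self):
        """Return the next batch as a list of (op, addr); [] when idle."""
        if not self.order:
            return []
        region = self.order.popleft()
        pending = self.queues[region]
        # Assumption: serve the direction with more pending requests.
        op = "R" if len(pending["R"]) >= len(pending["W"]) else "W"
        pending[op].sort()  # address order within the batch
        batch = pending[op][: self.batch_size]
        del pending[op][: self.batch_size]
        if pending["R"] or pending["W"]:
            self.order.append(region)  # region goes to the back of the line
        else:
            del self.queues[region]
        return [(op, a) for a in batch]
```

Because a batch never mixes reads and writes, interference between the two directions is confined to batch boundaries; how the real scheduler picks the direction and sizes the regions is not stated in the abstract.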

13.

Droplet: A virtual brush model to simulate Chinese calligraphy and painting (total citations: 1; self-citations: 0; citations by others: 1)

Xiao-Feng Mi, Min Tang, Jin-Xiang Dong 《计算机科学技术学报》2004,19(3):0-0

This paper proposes a virtual brush model based on droplet operations to simulate Chinese calligraphy and traditional Chinese painting in real time. Two ways of applying the droplet model to virtual calligraphy and painting are discussed in detail. The second droplet model is more elaborate and can produce more vivid results, while being slightly more time-consuming. The proposed droplet virtual brush model enables the simulated painting system to overcome the poor expressive ability of virtual brushes based on particle systems, and it avoids the complex evaluation required by physical brushes based on solid models. Derived from actual calligraphy and painting experience, the model considerably improves the performance of the simulation system thanks to the simplicity of the droplet operation and its powerful expressive ability, and it maintains a painting effect comparable to that of a real brush by supporting special Chinese brush effects such as dry brush, feng, and stroke diffusion.

14.

A Study on Wavelet Data Compression of a Real-Time Monitoring System for Large Hydraulic Machines

王海郑莉媛《计算机科学技术学报》2001,16(3):293-296

The general concept of data compression consists in removing the redundancy existing in data to find a more compact representation. This paper is concerned with a new compression method using second-generation wavelets based on the lifting scheme, which is a simple but powerful wavelet construction method. Its successful application to a real-time monitoring system for large hydraulic machines shows that it is a promising compression method.
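To make the lifting scheme concrete, the sketch below shows one level of an integer Haar transform written as a predict step followed by an update step. The abstract does not say which wavelet the paper uses, so the Haar filter and the function names `haar_lift_forward` and `haar_lift_inverse` are assumptions chosen only because they are the simplest instance of lifting.

```python
def haar_lift_forward(x):
    """One lifting level on an even-length list of ints.

    Returns (approx, detail). Smooth signals give small details,
    which is what makes the representation compressible.
    """
    even, odd = x[0::2], x[1::2]
    # Predict: estimate each odd sample from its even neighbor.
    detail = [o - e for e, o in zip(even, odd)]
    # Update: adjust even samples so they track the pairwise average.
    approx = [e + (d >> 1) for e, d in zip(even, detail)]
    return approx, detail


def haar_lift_inverse(approx, detail):
    """Undo the lifting steps in reverse order; reconstruction is exact."""
    even = [a - (d >> 1) for a, d in zip(approx, detail)]
    odd = [d + e for d, e in zip(detail, even)]
    x = [0] * (2 * len(even))
    x[0::2], x[1::2] = even, odd
    return x
```

Each lifting step is inverted by subtracting what was added, so the transform maps integers to integers and is lossless before any quantization. A compressor would then quantize or entropy-code the detail coefficients; that stage is outside this sketch.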

15.

CSWL: Cross-SSD Wear-Leveling Method in SSD-Based RAID Systems for System Endurance and Performance

杜溢墨肖侬刘芳陈志广《计算机科学技术学报》2013,28(1):28-41

Flash memory has limited erase/program cycles. Hence, to meet their advertised capacity at all times, flash-based solid state drives (SSDs) must prolong their life span through a wear-leveling mechanism. As a very important part of the flash translation layer (FTL), wear leveling is usually implemented in SSD controllers, which is called internal wear leveling. However, there is no wear leveling among SSDs in SSD-based redundant array of independent disks (RAID) systems, so some SSDs wear out faster than others. Once an SSD fails, reconstruction must be triggered immediately, but the cost of this process is so high that both system reliability and availability are seriously affected. We therefore propose cross-SSD wear leveling (CSWL) to enhance the endurance of entire SSD-based RAID systems. Under workloads with random access patterns, parity stripes suffer many more updates, because an update to a data stripe causes the modification of all other related parity stripes. Based on this principle, we introduce an age-driven parity distribution scheme to guarantee wear leveling among flash SSDs and thereby prolong the endurance of RAID systems. Furthermore, age-driven parity distribution benefits performance by maintaining better load balance. With insignificant overhead, CSWL can significantly improve both the life span and the performance of SSD-based RAID.

16.

FastQueue: a high-performance disk queue storage management mechanism

魏青松卢显良周旭《计算机科学》2003,30(10):81-83

High reliability is the primary requirement for a messaging system. A messaging system typically uses a disk queue to temporarily store messages awaiting delivery. Experiments show that disk queue I/O is the primary performance bottleneck in a messaging system. In this paper we present FastQueue, a high-performance disk queue storage management mechanism. FastQueue uses one large file as the disk queue to reduce file management overhead, and stores adjacent messages in adjacent disk blocks. Several messages are written to the disk queue in one large write by Lazy Gathering Write, and several adjacent messages are read into the buffer in one read by Sequential Grouping Prefetch. The Lazy Gathering Write and Sequential Grouping Prefetch policies take full advantage of the disk bandwidth. Experiments show that the performance of FastQueue is more than an order of magnitude higher than that of a traditional disk queue.
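The idea behind Lazy Gathering Write, as described in the abstract, is to batch several messages into a single large write to one queue file. The following is a minimal sketch of that idea only; the class name `GatheringQueueWriter`, the length-prefixed record layout, and the byte threshold are hypothetical and are not taken from the paper.

```python
import io
import struct


class GatheringQueueWriter:
    """Buffer messages and append them to one queue file in large writes."""

    def __init__(self, fileobj, batch_bytes=4096):
        self.f = fileobj
        self.batch_bytes = batch_bytes  # assumed flush threshold
        self.pending = []               # encoded records not yet written
        self.pending_size = 0
        self.write_calls = 0            # counts actual write() calls

    def enqueue(self, message: bytes):
        # Assumed record layout: 4-byte little-endian length, then payload.
        record = struct.pack("<I", len(message)) + message
        self.pending.append(record)
        self.pending_size += len(record)
        if self.pending_size >= self.batch_bytes:
            self.flush()

    def flush(self):
        """Write all pending records with a single write() call."""
        if not self.pending:
            return
        self.f.write(b"".join(self.pending))
        self.write_calls += 1
        self.pending.clear()
        self.pending_size = 0


def read_all(data: bytes):
    """Decode consecutive length-prefixed records from a byte string."""
    messages, pos = [], 0
    while pos < len(data):
        (n,) = struct.unpack_from("<I", data, pos)
        pos += 4
        messages.append(data[pos:pos + n])
        pos += n
    return messages
```

Because adjacent messages land in adjacent bytes of the same file, a reader can fetch a group of them with one sequential read. A real messaging system would also need to force data to stable storage (for example with `os.fsync`) before acknowledging a message, which this sketch omits.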

17.

Predicting RNA Secondary Structure Using Profile Stochastic Context-Free Grammars and Phylogenic Analysis

Xiao-Yong Fang, Zhi-Gang Luo, and Zheng-Hua Wang 《计算机科学技术学报》2008,23(4):582-589

Stochastic context-free grammars (SCFGs) have been applied to predicting RNA secondary structure. The prediction of RNA secondary structure can be facilitated by incorporating comparative sequence analysis. However, most existing SCFG-based methods lack explicit phylogenic analysis of homologous RNA sequences, which is probably why these methods are not ideal in practical applications. Hence, we present a new SCFG-based method that integrates phylogenic analysis with a newly defined profile SCFG. The method can be summarized as follows: 1) we define a new profile SCFG, M, to depict the consensus secondary structure of a multiple RNA sequence alignment; 2) we introduce two distinct hidden Markov models, λ and λ', to perform phylogenic analysis of homologous RNA sequences, where λ is for non-structural regions of the sequence and λ' is for structural regions of the sequence; 3) we merge λ and λ' into M to devise a combined model for the prediction of RNA secondary structure. We tested our method on data sets constructed from the Rfam database. The sensitivity and specificity of our method are higher than those of the predictions by Pfold.

18.

Video key frame extraction by unsupervised clustering and feedback adjustment (total citations: 1; self-citations: 0; citations by others: 1)

ZHUANG Yueting, RUI Yong, Thomas S. Huang 《计算机科学技术学报》1999,14(3):283-287

In video information retrieval, key frame extraction has been recognized as one of the important research issues. Although much progress has been made, the existing approaches are either computationally expensive or ineffective in capturing salient visual content. In this paper, we first discuss the importance of key frame extraction and then briefly review and evaluate the existing approaches. To overcome the shortcomings of the existing approaches, we introduce a new algorithm for key frame extraction based on unsupervised clustering. We also provide a feedback chain to adjust the granularity of the extraction result. The proposed algorithm is both computationally simple and able to capture the visual content. Its efficiency and effectiveness are validated on a large number of real-world videos.
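A rough sketch of clustering-based key frame selection follows. The abstract does not give the feature, the similarity measure, or the clustering rule, so everything here is an assumption: frames are represented by normalized histograms, similarity is histogram intersection, clusters grow by a single sequential pass, and `threshold` stands in for the granularity that the feedback step would adjust.

```python
def similarity(a, b):
    """Histogram intersection; 1.0 means identical normalized histograms."""
    return sum(min(x, y) for x, y in zip(a, b))


def cluster_frames(features, threshold):
    """Sequentially cluster frames by similarity to cluster centroids.

    A frame joins the most similar cluster if the similarity reaches
    `threshold`; otherwise it starts a new cluster.
    """
    clusters = []  # each: {"members": [frame indices], "centroid": [floats]}
    for i, f in enumerate(features):
        best, best_sim = None, -1.0
        for c in clusters:
            sim = similarity(f, c["centroid"])
            if sim > best_sim:
                best, best_sim = c, sim
        if best is not None and best_sim >= threshold:
            n = len(best["members"])
            # Running mean keeps the centroid equal to the member average.
            best["centroid"] = [(m * n + v) / (n + 1)
                                for m, v in zip(best["centroid"], f)]
            best["members"].append(i)
        else:
            clusters.append({"members": [i], "centroid": list(f)})
    return clusters


def key_frames(features, threshold):
    """Pick, per cluster, the member frame closest to the centroid."""
    return [
        max(c["members"], key=lambda i: similarity(features[i], c["centroid"]))
        for c in cluster_frames(features, threshold)
    ]
```

Raising `threshold` splits the video into more clusters and therefore yields more key frames, while lowering it yields fewer; a feedback loop could tune this value until the number of key frames matches what the user wants.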

19.

A two-level rolling method based on dummy scheduling for dynamic scheduling problems with incomplete global information

王冰席裕庚《自动化学报》2006,32(1):9-14

This paper addresses the single-machine scheduling problem with release times, where the objective is to minimize the total completion time. Given that global information is incomplete at each decision time, a two-level rolling scheduling strategy (TRSS) is presented to create the global schedule step by step. The estimated global schedules are established based on a dummy schedule of unknown jobs. The first level is preliminary scheduling based on the predictive window, and the second level is local scheduling of sub-problems based on the rolling window. Performance analysis demonstrates that TRSS can improve the global schedules. Computational results show that the solution quality of TRSS exceeds that of the existing rolling procedure in most cases.

20.

Automatic mesh generation on a regular background grid

刘剑飞《计算机科学技术学报》2002,17(6):0-0

This paper presents an automatic mesh generation procedure on a 2D domain based on a regular background grid. The idea is to devise a robust mesh generation scheme with equal emphasis on quality and efficiency. Instead of a traditional regular rectangular grid, a mesh of equilateral triangles is employed to ensure that triangular elements of the best quality are preserved in the interior of the domain. The boundary is generated by a node/segment insertion process. Nodes are inserted into the background mesh one by one, following the sequence of the domain boundary. The local structure of the mesh is modified based on the Delaunay criterion with the introduction of each node. Boundary segments that are not produced in the node insertion phase are recovered through a systematic element swap process. Two theorems are presented and proved to establish the theoretical basis of the boundary recovery step. Examples are presented to demonstrate the robustness of the proposed technique and the quality of the meshes it generates.