1.

Fast filtering false active subspaces for efficient high dimensional similarity processing

GuoRen Wang Ge Yu JunChang Xin YuHai Zhao EnDe Zhang 《中国科学F辑(英文版)》2009,52(2):286-294

The query space of a similarity query is usually narrowed down by pruning inactive query subspaces, which contain no query results, and keeping active query subspaces, which may contain objects corresponding to the request. However, some active query subspaces may contain no query results at all; these are called false active query subspaces. The performance of query processing clearly degrades in the presence of false active query subspaces. Our experiments show that this problem becomes serious when the data are high dimensional, and that the number of accesses to false active subspaces increases as the dimensionality increases. To solve this problem, this paper proposes a space mapping approach to reduce such unnecessary accesses. A given query space can be refined by filtering within its mapped space. To do so, a mapping strategy called maxgap is proposed to improve the efficiency of the refinement processing. Based on the mapping strategy, an index structure called MS-tree and algorithms for query processing are presented. Finally, the performance of MS-tree is compared with that of other competitors in terms of range queries on a real data set.

2.

Next High Performance and Low Power Flash Memory Package Structure

Jung-Hoon Lee 《计算机科学技术学报》2007,22(4):515-520

In general, NAND flash memory has advantages over NOR flash in low power consumption, storage capacity, and fast erase/write performance. However, the main drawback of NAND flash memory is its slow access time for random read operations. We therefore propose a new NAND flash memory package to overcome this drawback. We present a high performance, low power NAND flash memory system with a dual cache memory. The proposed NAND flash package consists of two parts: a NAND flash memory module and a dual cache module. The new NAND flash memory system can achieve dramatically higher performance and lower power consumption than any conventional NAND-type flash memory module. Our results show that the proposed system can reduce write operations into the flash memory cell by about 78% and read operations from the flash memory cell by about 70%, using only an additional 3 KB of cache space. These values indicate high potential for low power consumption and high performance gain.

3.

Void defect detection in ball grid array X-ray images using a new blob filter

Hyun Do NAM 《浙江大学学报:C卷英文版》2012,(11):840-849

Ball grid arrays (BGAs) have been used in the production of electronic devices and assemblies because of advantages such as small size and high I/O port density. However, BGA voids can degrade the performance of the board and cause failure. In this paper, a novel blob filter is proposed to automatically detect BGA voids present in X-ray images. The proposed blob filter uses the local image gradient magnitude and thus is not influenced by image brightness, void position, or component interference. Average box filters of different sizes are employed to analyze the image at multiple scales, and as a result, the proposed blob filter is robust to void size. Experimental results show that the proposed method obtains a void detection accuracy of up to 93.47% while maintaining a low false ratio. It outperforms another recent algorithm based on edge detection by 40.69% with respect to the average detection accuracy, and by 16.91% with respect to the average false ratio.

4.

A memristor-based architecture combining memory and image processing

Jing Zhou XueJun Yang JunJie Wu Xuan Zhu XuDong Fang Da Huang 《中国科学:信息科学(英文版)》2014,57(5):1-12

Image processing is a memory-access-intensive application and is applied in many fields. Logic operations are among the simplest operations in image processing. During these operations, memory access takes the majority of the total time consumed, which puts great pressure on memory access speed and bandwidth. However, in the traditional von Neumann architecture, memory access is the inherent bottleneck of the system; that is, the speed at which memory supplies data is far lower than the rate at which the processor requests it. The memristor is considered to be the fourth circuit element after the resistor, capacitor, and inductor. It has the capacity for both processing and memory, which supplies a new idea for solving the "memory wall" problem. In this paper, memristors are used to build an architecture combining computing and memory, where the memory has the ability to handle some simple image processing operations. This architecture can effectively reduce memory reads and writes, which saves memory bandwidth and thus improves the efficiency of the system. Logic operations on images are considered in this paper to validate the architecture. The experimental results and theoretical analysis indicate that the architecture can reduce memory access effectively.

5.

Unknown Malicious Document Detection Based on Global Behavior Feature (CSCD)

《信息安全学报》2017,(收录汇总):96-108

Compared with malicious Office documents based on macros, malicious Office documents based on vulnerability exploitation often need no interaction from the target during the attack, and can complete the attack without the target noticing. They have become an important means of Advanced Persistent Threat (APT) attack. Therefore, detecting malicious documents based on vulnerability exploitation, especially exploitation of unknown vulnerabilities, plays an important role in discovering APT attacks. Current malicious document detection methods mainly focus on PDF documents and fall into two categories: static analysis and dynamic analysis. Static analysis is easily evaded by attackers and cannot discover exploits triggered by a remote payload. Dynamic analysis only considers the behaviors of the JavaScript in the PDF or of the document reader's process, ignoring indirect attacks against other processes of the system, which leads to a detection blind spot. To solve these problems, we analyze the attack surface of malicious Office documents, propose a threat model, and implement an unknown malicious document detection method based on global behavior features. While a document is being processed, the behavior features of the whole system are extracted, and only benign document samples are used for training to form a behavior feature database for malicious document detection. To reduce the false alarm rate, we introduce sensitive behavior features into detection. In this paper, 522 benign documents, including DOCX, RTF, and DOC files, are used for training to obtain the behavior feature database, and then 2088 benign document samples and 211 malicious document samples are tested. Of these, 10 malicious samples are manually crafted to simulate several typical attack scenarios. The experimental results show that this method can detect all malicious samples with a very low false positive rate (0.14%) and is able to detect malicious documents that exploit unknown vulnerabilities. Further experiments show that this method can also be used to detect malicious documents exploiting WPS office software. © 2023 Chinese Academy of Sciences. All rights reserved.
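
The benign-only training idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the behavior strings, the function names `build_benign_database` and `is_suspicious`, and in particular the rule that a document is flagged only when a previously unseen behavior is also in a sensitive set, are all assumptions made for illustration. The abstract does not say how sensitive features enter the decision.

```python
def build_benign_database(benign_traces):
    """Collect every behavior feature observed while processing benign documents."""
    database = set()
    for trace in benign_traces:
        database.update(trace)
    return database


def is_suspicious(trace, database, sensitive):
    """Flag a document whose trace has an unseen behavior that is also sensitive.

    Assumption: restricting alarms to sensitive behaviors is one plausible way
    to lower the false alarm rate; the source does not specify the exact rule.
    """
    unseen = set(trace) - database
    return bool(unseen & sensitive)


# Hypothetical behavior features, for illustration only.
benign = [["open_file", "read_registry"], ["open_file", "write_temp"]]
database = build_benign_database(benign)
sensitive = {"create_process", "network_connect"}

print(is_suspicious(["open_file", "create_process"], database, sensitive))
print(is_suspicious(["open_file", "load_font"], database, sensitive))
```

In this sketch the second document shows an unseen behavior (`load_font`) but is not flagged, because that behavior is not in the sensitive set.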

6.

Sparse representation based on projection method in online least squares support vector machines (cited by: 1 total; 0 self-citations; 1 by others)

Lijuan LI Hongye SU Jian CHU 《控制理论与应用(英文版)》2009,7(2):163-168

This paper presents a projection-based sparse approximation algorithm to overcome the non-sparsity of least squares support vector machines (LS-SVM). New inputs are projected onto the subspace spanned by the previous basis vectors (BV); inputs whose squared distance from the subspace exceeds a threshold are added to the BV set, while the others are rejected. This results in a sparse approximation. In addition, a recursive approach to deleting an existing vector from the BV set is proposed. The online LS-SVM, sparse approximation, and BV removal are then combined to produce a sparse online LS-SVM algorithm that can control the size of memory irrespective of the amount of data processed. The suggested algorithm is applied to the online modeling of a pH neutralizing process and of the isomerization plant of a refinery. A detailed comparison of computing time and precision between the suggested algorithm and the non-sparse one is also given. The results show that the proposed algorithm greatly improves sparsity at little cost in precision.
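
The admission rule described above (add an input only if its squared distance from the span of the current basis exceeds a threshold) can be sketched as follows. This is a simplified illustration under stated assumptions: it works on plain Euclidean vectors rather than in the kernel feature space an LS-SVM would use, it stores orthonormalized directions instead of the inputs themselves, and the function names are hypothetical.

```python
def squared_distance_to_span(x, basis):
    """Return (squared distance, residual) of x relative to an orthonormal basis."""
    residual = list(x)
    for q in basis:
        coeff = sum(r * qi for r, qi in zip(residual, q))
        residual = [r - coeff * qi for r, qi in zip(residual, q)]
    return sum(r * r for r in residual), residual


def update_basis(x, basis, threshold):
    """Add x to the basis if it lies far enough from the current span.

    Returns True when x was admitted, False when it was rejected.
    """
    if threshold < 0:
        raise ValueError("threshold must be non-negative")
    dist2, residual = squared_distance_to_span(x, basis)
    if dist2 > threshold:
        norm = dist2 ** 0.5
        basis.append([r / norm for r in residual])  # keep the basis orthonormal
        return True
    return False


basis = []
for point in ([1.0, 0.0, 0.0], [2.0, 0.0, 0.0], [1.0, 1.0, 0.0], [1.0, 1.0, 0.1]):
    admitted = update_basis(point, basis, threshold=0.5)
    print(point, "admitted" if admitted else "rejected")
```

A larger threshold rejects more inputs and so keeps the basis set smaller, which is the trade-off between sparsity and precision that the abstract reports.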

7.

Energy Efficiency of a Multi-Core Processor by Tag Reduction

郑龙 董冕雄 Kaoru Ota 金海 马俊 《计算机科学技术学报》2011,26(3):491-503

We consider the energy saving problem for caches on a multi-core processor. Previous research on low power processors has produced various methods to reduce power dissipation, and tag reduction is one of them. This paper extends the tag reduction technique from a single-core processor to a multi-core processor and investigates the potential for energy saving on multi-core processors. We formulate our approach as an equivalent problem: find an assignment of all instruction pages in physical memory to a set of cores such that the tag-reduction conflicts for each core are avoided or reduced as far as possible. We then propose three algorithms using different heuristics for this assignment problem. We provide experimental results based on data collected from a real operating system, rather than the traditional approach of using a processor simulator that cannot simulate operating system functions and the full memory hierarchy. Experimental results show that our proposed algorithms can save up to 83.93% of total energy on an 8-core processor and 76.16% on a 4-core processor on average, compared with the case in which tag reduction is not used. They also significantly outperform the tag reduction based algorithm on a single-core processor.

8.

Perturbation analysis for the normalized Laplacian matrices in the multiway spectral clustering method

SuMuYa Borjigin ChongHui Guo 《中国科学:信息科学(英文版)》2014,57(11):1-17

In this paper, we present a perturbation analysis for the matrices in the multiway normalized cut spectral clustering method based on the matrix perturbation theory. The analytical results show that the eigenvalues and the eigenspaces of the normalized Laplacian matrices are continuous. Therefore, clustering algorithms can be designed according to the special properties of the normalized Laplacian matrices in the ideal case and the method can be extended to the general case based on the continuity of the eigenvalues and the eigenspaces of the normalized Laplacian matrices. The numerical results are consistent with the theoretical results.
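
For orientation, the continuity claim can be related to a standard result. The block below gives one common definition of the symmetric normalized Laplacian (assumed here, since the abstract does not state which normalizations it analyzes) and Weyl's inequality for a symmetric perturbation E. This is a textbook bound and not necessarily the one derived in the paper; W is a symmetric affinity matrix and the eigenvalues on both sides are sorted in the same order.

```latex
L_{\mathrm{sym}} = I - D^{-1/2} W D^{-1/2},
\qquad D = \operatorname{diag}(d_1, \dots, d_n),
\quad d_i = \sum_{j} w_{ij}

\left| \lambda_i(L_{\mathrm{sym}} + E) - \lambda_i(L_{\mathrm{sym}}) \right|
\le \| E \|_2,
\qquad i = 1, \dots, n
```

The inequality says that a small symmetric perturbation moves each eigenvalue by at most the spectral norm of the perturbation, which is the sense in which eigenvalues vary continuously between the ideal and the general case.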

9.

Efficient aggregation algorithms on very large compressed data warehouses (cited by: 1 total; 0 self-citations; 1 by others)

LI Jianzhong LI Yingshu Jaideep Srivastava 《计算机科学技术学报》2000,15(3):213-229

Multidimensional aggregation is a dominant operation on data warehouses for on-line analytical processing (OLAP). Many efficient algorithms to compute multidimensional aggregation on relational-database-based data warehouses have been developed. However, to our knowledge, there is nothing to date in the literature about aggregation algorithms on multidimensional data warehouses that store datasets in multidimensional arrays rather than in tables. This paper presents a set of multidimensional aggregation algorithms on very large, compressed multidimensional data warehouses. These algorithms operate directly on compressed datasets in multidimensional data warehouses without the need to first decompress them. They are applicable to a variety of data compression methods. The algorithms have different performance behavior as a function of dataset parameters, sizes of outputs, and main memory availability. The algorithms are described and analyzed with respect to their I/O and CPU costs. A decision procedure to select the most efficient algorithm, given an aggregation request, is also proposed. The analytical and experimental results show that the algorithms are more efficient than the traditional aggregation algorithms.

10.

Context-Based 2D-VLC Entropy Coder in AVS Video Coding Standard

Qiang Wang De-Bin Zhao and Wen Gao 《计算机科学技术学报》2006,21(3):315-322

In this paper, a Context-based 2D Variable Length Coding (C2DVLC) method for coding the transformed residuals in the AVS video coding standard is presented. One feature of C2DVLC is the use of multiple 2D-VLC tables, and another is the use of simple Exponential-Golomb codes. C2DVLC employs context-based adaptive multiple table coding to exploit the statistical correlation between the DCT coefficients of one block for higher coding efficiency. Exp-Golomb codes are applied to code the pairs formed by the run-length of zero coefficients and the nonzero coefficient, for a lower storage requirement. C2DVLC is a low complexity coder in terms of both computational time and memory requirement. The experimental results show that C2DVLC gains 0.34 dB on average for the tested videos when compared with a traditional 2D-VLC coding method like that used in MPEG-2. Compared with CAVLC in H.264/AVC, C2DVLC shows similar coding efficiency.
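
The two building blocks named in the abstract, (run, level) pairs and Exponential-Golomb codes, can be sketched as follows. This is a minimal illustration, not the C2DVLC coder: it shows only the order-0 Exp-Golomb code and a plain run-length pairing, and it omits the context-based table switching and the mapping from pairs to code numbers, which the abstract does not detail. The function names are hypothetical.

```python
from __future__ import annotations


def exp_golomb_encode(n: int) -> str:
    """Order-0 Exp-Golomb codeword for a non-negative integer, as a bit string."""
    if n < 0:
        raise ValueError("n must be non-negative")
    binary = bin(n + 1)[2:]                   # binary form of n + 1
    return "0" * (len(binary) - 1) + binary   # prefix of (length - 1) zeros


def exp_golomb_decode(bits: str, pos: int = 0) -> tuple[int, int]:
    """Decode one codeword starting at pos; return (value, next position)."""
    zeros = 0
    while bits[pos + zeros] == "0":
        zeros += 1
    end = pos + 2 * zeros + 1
    return int(bits[pos + zeros:end], 2) - 1, end


def run_level_pairs(coefficients):
    """Turn a scanned coefficient list into (zero run, nonzero level) pairs."""
    pairs, run = [], 0
    for c in coefficients:
        if c == 0:
            run += 1
        else:
            pairs.append((run, c))
            run = 0
    return pairs


print([exp_golomb_encode(n) for n in range(5)])
print(run_level_pairs([5, 0, 0, -2, 0, 1]))
```

Because the codewords are generated by a rule rather than stored, an Exp-Golomb code needs no lookup table, which is consistent with the lower storage requirement mentioned in the abstract.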

11.

Register allocation for write activity minimization on non-volatile main memory for embedded systems

Yazhi Huang Tiantian Liu Chun Jason Xue 《Journal of Systems Architecture》2012,58(1):13-23

Non-volatile memories are good candidates to replace DRAM as main memory in embedded systems, and they have many desirable characteristics. Nevertheless, non-volatile memory has disadvantages alongside its advantages. First, the lifetime of some non-volatile memories is limited by the number of erase operations. Second, read and write operations have asymmetric speed or power consumption in non-volatile memory. This paper focuses on embedded systems that use non-volatile memory as main memory. We propose a register allocation technique with re-computation to reduce the number of store instructions. When non-volatile memory is used as the main memory, reducing store instructions reduces write activity on the non-volatile memory. To re-compute the spills effectively during register allocation, a novel potential spill selection strategy is proposed. During this process, live range splitting is used to split certain long live ranges so that they are more likely to be assigned to registers. In addition, techniques for reducing re-computation overhead are proposed for systems with multiple functional units. With the proposed approach, the lifetime of non-volatile memory is extended accordingly. The experimental results demonstrate that the proposed technique reduces the number of store instructions on systems with non-volatile memory by 33% on average.

12.

On reducing load/store latencies of cache accesses

Yuan-Shin Hwang Jia-Jhe Li 《Journal of Systems Architecture》2010,56(1):1-15

Effective address calculations for load and store instructions need to compete for ALU with other instructions and hence extra latencies might be incurred to data cache accesses. Fast address generation is an approach proposed to reduce cache access latencies. This paper presents a fast address generator that can eliminate most of the effective address computations by storing computed effective addresses of previous load/store instructions in a dummy register file. Experimental results show that this fast address generator can reduce effective address computations of load and store instructions by about 74% on average for SPECint2000 benchmarks and cut the execution times by 8.5%. Furthermore, when multiple dummy register files are deployed, this fast address generator eliminates over 90% of effective address computations of load and store instructions and improves the average execution times by 9.3%.

13.

Linked instruction caches for enhancing power efficiency of embedded systems

Chang-Jung Ku Ching-Wen Chen An Hsia Chun-Lin Chen 《Microprocessors and Microsystems》2014

The power consumed by memory systems accounts for 45% of the total power consumed by an embedded system, and the power consumed during a memory access is 10 times higher than during a cache access. Thus, increasing the cache hit rate can effectively reduce the power consumption of the memory system and improve system performance. In this study, we increased the cache hit rate and reduced the cache-access power consumption by developing a new cache architecture known as a single linked cache (SLC) that stores frequently executed instructions. By adding a new link field, SLC combines the low power consumption and low access delay of a direct-mapped cache with a high hit rate similar to that of a two-way set-associative cache. In addition, we developed another design known as multiple linked caches (MLC) to further reduce the power consumption during each cache access and avoid unnecessary cache accesses when the requested data is absent from the cache. In MLC, the linked cache is split into several small linked caches that store frequently executed instructions to reduce the power consumption during each access. To avoid unnecessary cache accesses when a requested instruction is not in the linked caches, the addresses of the frequently executed blocks are recorded in the branch target buffer (BTB). By consulting the BTB, a processor can access the memory to obtain the requested instruction directly if the instruction is not in the cache. In the simulation results, our method performed better than selective compression, traditional cache, and filter cache in terms of the cache hit rate, power consumption, and execution time.

14.

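A Cache Pollution Control Method for Data Preloaded During Speculative Execution in Microprocessors

张骏 《小型微型计算机系统》2012,33(5):987-994

The "memory wall" problem has become a major obstacle to improving processor performance, and the cache pollution caused by memory data that is preloaded when a processor core speculatively executes memory access instructions on predicted paths can severely degrade processor performance. This paper proposes CSDA, a cache pollution control method for data preloaded during speculative execution. First, confidence estimation is used to separate out, from all predicted paths, those with a high probability of being wrong. Then, using a history table that identifies low-confidence polluting memory access instructions, the memory access instructions on low-confidence predicted paths are classified as either prefetching or polluting; a low-priority load/store queue is set up for the polluting instructions, and a pollution data cache is used to hold the polluting data. Simulation results show that, in dual-core mode and relative to the baseline architecture, CSDA reduces the L1 D-cache miss rate by 9% to 23% (17% on average) and the L2 cache miss rate by 1.02% to 14.39% (5.67% on average), and improves IPC by 0.19% to 5.59% (2.21% on average).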

15.

Node discovery scheme of DDS for combat management system

《Computer Standards & Interfaces》2015

In this paper, a novel discovery scheme using modified counting Bloom filters is proposed for the data distribution service (DDS) in real-time distributed systems. In a large scale network for a combat management system (CMS), a lot of memory is required to store all the information. In addition, high network traffic can become problematic. In many cases, most of the stored information is not needed by the DDS endpoints but still occupies memory. To reduce the size of the information sent and stored, a discovery process combined with counting Bloom filters is proposed. This paper presents the delay time for filter construction and the total discovery time needed in the DDS discovery process. Simulation results show that the proposed method gives a low delay time and zero false positive probability.
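
A plain counting Bloom filter, the data structure that the scheme above modifies, can be sketched as follows. This is the textbook structure, not the authors' modified variant, and it can still produce false positives. The class name, the parameters, the hashing scheme, and the topic strings are assumptions made for illustration.

```python
import hashlib


class CountingBloomFilter:
    """Bloom filter with counters instead of bits, so that items can be removed."""

    def __init__(self, size: int = 1024, num_hashes: int = 3):
        self.size = size
        self.num_hashes = num_hashes
        self.counters = [0] * size

    def _indexes(self, item: str):
        # Derive num_hashes counter positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item: str) -> None:
        for idx in self._indexes(item):
            self.counters[idx] += 1

    def remove(self, item: str) -> None:
        if item not in self:
            raise KeyError(item)
        for idx in self._indexes(item):
            self.counters[idx] -= 1

    def __contains__(self, item: str) -> bool:
        return all(self.counters[idx] > 0 for idx in self._indexes(item))


# Hypothetical topic names, for illustration only.
topics = CountingBloomFilter()
topics.add("topic/radar")
print("topic/radar" in topics)
topics.remove("topic/radar")
print("topic/radar" in topics)
```

An endpoint can advertise such a filter in place of its full list of topics, which is the kind of saving in transmitted and stored information that the abstract targets. Counters rather than single bits are what allow an entry to be withdrawn later.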

16.

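Design and Implementation of a Loop Instruction Buffer for VLIW Processors

李勇 胡慧俐 杨焕荣 《计算机应用》2014,34(4):1005-1009

Loops account for a large share of execution time in digital signal processing software, so holding loop code in an instruction buffer can reduce the number of program memory accesses and improve processor performance. A buffer that supports loop instructions is added to the instruction pipeline of a VLIW processor; it caches the instructions of a loop and dispatches them to the functional units in software-pipelined form. Loop code is thus fetched from memory once but executed many times, which greatly reduces the number of memory accesses. While loop instructions are running, the buffer signals the program memory to enter a sleep state, which lowers processor power consumption. Tests on typical applications show that, with the loop buffer, the fetch pipeline is idle more than 90% of the time and overall processor performance improves by about 10%, while the hardware area overhead of the loop buffer is about 9% of that of the fetch pipeline.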

17.

Memory Renaming: Fast, Early and Accurate Processing of Memory Communication

Gary S. Tyson Todd M. Austin 《International journal of parallel programming》1999,27(5):357-380

As processors continue to exploit more instruction level parallelism, greater demands are placed on the performance of the memory system. In this paper, we introduce a novel modification of the processor pipeline called memory renaming. Memory renaming applies register access techniques to load and store instructions to speed the processing of memory traffic. The approach works by accurately predicting memory communication early in the pipeline and then remapping the communication to fast physical registers. This work extends previous studies of data value and dependence speculation. When memory renaming is added to the processor pipeline, renaming can be applied to 30-50% of all memory references, translating to an overall improvement in execution time of up to 14% for current pipeline configurations. As store forward delay times grow larger, renaming support can lead to performance improvements of as much as 42%. Furthermore, this improvement is seen across all memory segments, including the heap segment, which has often been difficult to manage efficiently.

18.

Scalable Load and Store Processing in Latency-Tolerant Processors

Gandhi A. Akkary H. Rajwar R. Srinivasan S.T. Lai K. 《Micro, IEEE》2006,26(1):30-39

New load and store processing algorithms let memory-latency-tolerant architectures sustain thousands of in-flight instructions without scaling cycle-critical fully-associative load and store queues. These algorithms rely on redoing some stores after fetching cache miss data from memory (to fix memory dependences). Doing so provides better power and area characteristics than constantly enforcing memory dependences among several loads and stores, many of which have unknown addresses.

19.

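Design of a Shock Wave Overpressure Test System Based on Time Extraction

张晋文 王文廉 黄晓敏 赵晨阳 张志杰 《测控技术》2015,34(4):47-50

For shock wave overpressure measurement in explosion test environments, a multi-trigger stored-measurement system for shock wave overpressure based on FPGA control and Flash storage is proposed. The large-capacity Flash memory is partitioned into multiple data storage regions so that the system can be triggered, and can acquire and store data, multiple times. Compared with a conventional single-trigger stored-measurement system, this prevents test failures caused by false triggering and improves test reliability. Absolute time is used to extract the data segments that contain valid signals, which improves data recovery efficiency and suits multi-point distributed testing. The system was calibrated for its dynamic characteristics and obtained valid data in explosion tests. The test results show that the system is highly reliable.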

20.

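Handling Misalignment Problems in Dynamic Binary Translation

崔进鲜 庞建民 岳峰 张一弛 张刚 《计算机工程与科学》2010,32(9):95-97

Address misalignment problems often arise in dynamic binary translation from complex instruction set computer (CISC) architectures to reduced instruction set computer (RISC) architectures. Taking dynamic binary translation from I386 to the Alpha platform as an example, this paper studies misalignment in memory mapping and in data access, and proposes an improved memory mapping method together with a scheme that handles misaligned memory accesses at the intermediate representation level, which effectively solves these problems. Experiments verify that the method is correct and reasonably efficient.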
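
The data access side of the misalignment problem can be illustrated with a generic technique: emulating one unaligned load with two aligned loads plus shifts and masks. The sketch below is an assumption-laden illustration and not the scheme proposed in the paper, whose details the abstract does not give. It models memory as a Python `bytes` object, assumes little-endian 32-bit words, and uses hypothetical function names.

```python
def aligned_load32(memory: bytes, addr: int) -> int:
    """Load a little-endian 32-bit word; addr must be 4-byte aligned."""
    if addr % 4 != 0:
        raise ValueError("unaligned address")
    return int.from_bytes(memory[addr:addr + 4], "little")


def unaligned_load32(memory: bytes, addr: int) -> int:
    """Emulate a 32-bit load at any address using at most two aligned loads."""
    base = addr & ~3            # round down to the enclosing aligned word
    shift = (addr & 3) * 8      # bit offset of the wanted data inside that word
    low = aligned_load32(memory, base)
    if shift == 0:
        return low
    high = aligned_load32(memory, base + 4)
    # Take the upper bytes of the low word and the lower bytes of the high word.
    return ((low >> shift) | (high << (32 - shift))) & 0xFFFFFFFF


memory = bytes(range(16))
print(hex(unaligned_load32(memory, 1)))
```

A translator that cannot prove an address is aligned can emit a sequence of this kind in place of a single load. The extra instructions are the overhead that makes the efficiency of such handling worth measuring.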