1.

琚小明 张皆浩 张逸中 《计算机应用》 2013, 33(5): 1459-1462

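Highly reliable systems all require real-time error detection. For built-in error detection, three online-model methods for real-time self-detection are proposed. The error detection model uses two pipelines in the field-programmable gate array (FPGA): by checking whether the current configuration information matches the original information held in configuration memory outside the FPGA, it detects errors in real time, and by comparing the configuration data it can locate the logic blocks affected by single-event upset (SEU) errors. Simulation results show that the proposed methods perform better than online BIST.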
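The comparison step described in this abstract can be sketched in software. This is a minimal illustration, not the authors' design: it assumes the configuration is available as a list of integer frames, and the names `find_upset_frames`, `golden`, and `readback` are hypothetical.

```python
def find_upset_frames(golden, readback):
    """Compare read-back configuration frames against the golden copy.

    Both arguments are lists of non-negative integers, one per frame
    (an assumed representation). Returns (frame_index, flipped_bits)
    for every frame that differs from its reference.
    """
    if len(golden) != len(readback):
        raise ValueError("golden and readback must have the same number of frames")
    upsets = []
    for index, (ref, cur) in enumerate(zip(golden, readback)):
        diff = ref ^ cur  # set bits mark positions that changed
        if diff:
            bits = [b for b in range(diff.bit_length()) if (diff >> b) & 1]
            upsets.append((index, bits))
    return upsets


golden = [0b1010, 0b1111, 0b0000]
readback = [0b1010, 0b1101, 0b0000]  # bit 1 of frame 1 has flipped
print(find_upset_frames(golden, readback))  # [(1, [1])]
```

Mapping a frame index back to a physical logic block depends on the device and is outside this sketch.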

2.

Hardware architecture design of a convolutional neural network based on FPGA dynamic reconfiguration

《微型机与应用》2019,(3):77-81

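To address resource constraints in the hardware implementation of convolutional neural networks, a convolutional neural network accelerator design based on FPGA dynamic reconfiguration is proposed. First, the overall parallel strategy and VLSI architecture of the accelerator are designed, and the functional modules of the network are pipelined. Second, the accelerator is designed for dynamic reconfiguration: the dynamically reconfigurable region is established and its module functions are partitioned; the configuration files are stored in BPI Flash and read through the internal configuration port to configure the reconfigurable region dynamically. Experimental results show that, for the LeNet-5 handwriting recognition network, the accelerator based on dynamic reconfiguration uses 44%, 27.8%, and 71% fewer Slice LUTs, Slice Registers, and DSP resources, respectively, than the corresponding static design. Compared with a software implementation, system execution time drops substantially. However, because of the bandwidth limit of the internal configuration port, reconfiguration time lengthens the overall execution time of the network.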

3.

RETRACTED: Efficient soft error resiliency by multi-match packet classification using scalable TCAM implementation in FPGA

《Microprocessors and Microsystems》2020

4.

FPGA-based NoC hardware system design (total citations: 1, self-citations: 0, citations by others: 1)

许川佩 唐海 胡聪 《电子技术应用》 2012, 38(2): 117-119, 123

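An FPGA-based hardware platform for a network-on-chip system is designed. The system consists of a large-capacity FPGA, memory, high-speed A/D and D/A converters, communication interfaces, and an extended ARM9 system. A multi-core system design integrating high-speed digital signal processing, video encoding and decoding, and network transmission is completed. The application of a typical 3×3 2D Mesh NoC system is discussed, the key technologies in the NoC system design process are described, and SigXplorer software is used to run PCB reflection and crosstalk simulations of the system's signal-integrity solution.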

5.

FPGA based hardware acceleration for elliptic curve public key cryptosystems

M. Ernst B. Henhapl S. Klupsch S. Huss 《Journal of Systems and Software》2004,70(3):299-313

This paper addresses public key cryptosystems based on elliptic curves, which are aimed at high-performance digital signature schemes. Elliptic curve algorithms are characterized by the fact that one can work with considerably shorter keys compared to the RSA approach at the same level of security. A general and highly efficient method for mapping the most time-critical operations to a configurable co-processor is proposed. By means of real-time measurements the resulting performance values are compared to previously published state of the art hardware implementations.

A generator based approach is advocated for that purpose which supports application specific co-processor configurations in a flexible and straightforward way. Such a configurable CryptoProcessor has been integrated into a Java-based digital signature environment resulting in a considerable increase of its performance. The outlined approach combines in a unique way the advantages of mapping functionality to either hardware or software and it results in high-speed cryptosystems which are both portable and easy to update according to future security requirements.

6.

A dynamic FPGA configuration method based on the TMS320C6A8168

《电子技术应用》2016,(9)

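To address the speed and flexibility problems of conventional power-on FPGA configuration in baseband processing systems, a scheme is proposed that uses the TMS320C6A8168 to load FPGA configuration files dynamically from an SD card or a network port. The scheme uses an embedded system comprising four FPGAs and one C6A8168 ARM processor as its platform. By modifying the U-boot code, the baseband system can selectively load FPGA configuration files from a PC when U-boot runs at power-on, so that the FPGAs carry out the corresponding physical-layer algorithms and hardware acceleration. The scheme configures the FPGAs effectively, improves the flexibility of FPGA system configuration, and has good application prospects in baseband processing systems.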

7.

Efficient hardware trojan diagnosis in SRAM based on FPGA processors using inject detect masking algorithm for multimedia signal Processors

《Microprocessors and Microsystems》2020

The multimedia processor is among the most powerful and challenging real-time applications, and Hardware Trojans (HTs) are a significant threat to most electronic devices that use an Integrated Circuit (IC) as a crucial component. Since ICs are often manufactured by untrusted designers, malicious attacks can be inserted at any stage of fabrication. An HT is typically added by an adversary to the storage cell, which makes detection a complex task and affects the function of the device. To mitigate these issues, an IDM (Inject Detect Masking) algorithm is proposed and implemented in a Look-Up Table (LUT) design that uses a Stability Enhancing Static Random Access Memory (SESRAM) cell to store the data bits. The HT is injected at the output of the SESRAM cell, and masking is then applied to mitigate it. The proposed IDM algorithm is designed and simulated in Tanner EDA with 125 nm technology. It is applied to multimedia signal processors in real-time applications to achieve a better response in processor end solutions. It increases the detection rate by 8.88%, 8.88%, 5.37%, 4.25% and correction coverage by 5.26%, 28.20%, 21.95%, 13.63%, 11.11%, 7.52% when compared with Online Checking Technique, Simultaneous Orthogonal Matching Pursuit (SOMP) algorithm, Multiple Excitation of Rare Switching (MERS), LMDet and Clustering Ensemble-based Detection respectively.

8.

Research on error calibration techniques for redundant configurations of strapdown inertial sensors

华冰 刘建业 熊智 李荣冰 《传感器与微系统》 2005, 24(5): 31-33

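Calibration techniques for redundantly configured strapdown inertial sensor systems are studied. The measurement principles and calculation formulas for each parameter of redundant inertial sensors are analyzed in detail. For a redundant inertial measurement unit (IMU) with a typical non-orthogonal configuration (six sensors on a regular dodecahedron), a simple static calibration method with relatively high accuracy is proposed for the error model parameters, and the mathematical derivation and analytical expressions for computing those parameters are given. Simulation results show that the method is fairly accurate, can effectively estimate the error model parameters of the redundant IMU, and improves inertial navigation accuracy.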

9.

Selective dynamic serialization for reducing energy consumption in hardware transactional memory systems

Epifanio Gaona J. Rubén Titos-Gil Juan Fernández Manuel E. Acacio 《The Journal of supercomputing》2014,68(2):914-934

In the search for new paradigms to simplify multithreaded programming, Transactional Memory (TM) is currently being advocated as a promising alternative to deadlock-prone lock-based synchronization. In this way, future many-core CMP architectures may need to provide hardware support for TM. On the other hand, power dissipation constitutes a first class consideration in multicore processor designs. In this work, we propose Selective Dynamic Serialization (SDS) as a new technique to improve energy consumption without degrading performance in applications with conflicting transactions by avoiding wasted work due to aborted transactions. Our proposal, which is implemented on top of a hardware transactional memory (HTM) system with an eager conflict management policy, detects and serializes conflicting transactions dynamically (at run-time). In its simplest form, in case of conflict, one transaction is allowed to continue whilst the rest are completely stalled. Once the executing transaction has finished, it wakes up several of the stalled transactions. More elaborate implementations of SDS try to delay this behavior until serialization of transactions is profitable, achieving the best trade-off between performance, energy savings and network traffic. SDS implementations differ from each other in the condition that triggers the serialization mode. We have evaluated several SDS schemes using GEMS, a full-system simulator implementing the LogTM-SE Eager–Eager HTM system, and several benchmarks from the STAMP suite. Results for a 16-core CMP show that SDS obtains reductions of 6 % on average in energy consumption (more than 20 % in high contention scenarios) in a wide range of benchmarks without affecting, on average, execution time. At the same time, network traffic level is also reduced by 22 %.

10.

Research on intrusion detection based on an FPGA hardware strategy

孙海军 张波 《计算机工程与设计》 2009, 30(22)

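Based on an analysis of the principles of intrusion detection, an IDS prototype based on an FPGA hardware strategy is proposed. Data distribution is handled by a data preprocessing module. Packet analysis and detection are accelerated by parallel processing on multiple soft cores, and a coprocessor speeds up pattern matching. System control and management are handled by a main control module, and hardware and custom instructions can be added as needed to improve system performance. Experimental results show that a hardware strategy in an intrusion detection system performs better than a software implementation.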

11.

Efficient execution of speculative threads and transactions with hardware transactional memory

《Future Generation Computer Systems》2014

Thread-level speculation (TLS) was researched to automatically parallelize portions of serial programs for execution, and transactional memory (TM) was studied as a promising alternative to locks for parallel programming due to its simplicity. Both TLS and TM require similar underlying support. In the paper, we present SeTM (sequential transactional memory), a hardware enhanced TM system which supports TLS at minor extra cost. A signature is an effective way to buffer speculative states in TM and TLS, but it cripples TM and TLS performance because of false positives in conflict detection, especially for conflict-intensive TLS. SeTM adopts R/W bits and signatures concurrently to ameliorate this bad influence. Additionally, SeTM introduces a fast rollback mechanism, which provides fast abort recovery for eager log-based HTM and TLS. The most important contribution of SeTM is the conflict-tolerant mechanism, which tolerates some ambiguous data conflicts in TLS. Finally, in order to achieve efficient execution of these unordered transactions, we add an extra ordering mechanism to SeTM. With this ordering mechanism, the transactions in TM can also gain a performance improvement with the support of the conflict-tolerant mechanism. Our evaluation focuses on TM and TLS separately. For the TLS applications, six representative benchmarks have been adopted to evaluate the above model. Our experimental results show that our scheme improves the execution performance of most tested codes at a modest hardware cost. For a set of important scientific loops, we report the highest speedup of 6.5 with 15 cores. Besides, experimental results also show good scalability of the SeTM system. For the TM applications, with respect to LogTM-SE, the benchmarks from STAMP also gain a notable performance improvement.

12.

Design and implementation of FPGA-based hardware decoding of FLAC audio

《电子技术应用》2016,(2):21-24

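To address the low efficiency and heavy system-resource usage of software decoding in high-fidelity FLAC audio playback systems, a design for FPGA-based hardware decoding of FLAC audio is proposed. The basic principles of FLAC encoding and decoding are analyzed, and the design and implementation of each module of a FLAC decoder built on a field-programmable gate array (FPGA) are described in detail. Design entry and simulation verification are carried out in Verilog in the Quartus II development environment. Test results show that the FLAC decoder is flexible in design, stable and reliable in operation, and efficient in decoding, and that it can be used as an IP core in lossless audio playback systems on different SoCs.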

13.

Efficient hardware architecture based on generalized Hebbian algorithm for texture classification

Shiow-Jyu Lin Yi-Tsan Hung Wen-Jyi Hwang 《Neurocomputing》2011,74(17):3248-3256

The objective of this paper is to present an efficient hardware architecture for generalized Hebbian algorithm (GHA). In the architecture, the principal component computation and weight vector updating of the GHA are operated in parallel, so that the throughput of the circuit can be significantly enhanced. In addition, the weight vector updating process is separated into a number of stages for lowering area costs and increasing computational speed. To show the effectiveness of the circuit, a texture classification system based on the proposed architecture is designed. It is embedded in a system-on-programmable-chip (SOPC) platform for physical performance measurement. Experimental results show that the proposed architecture is an efficient design for attaining both high speed performance and low area costs.
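For readers unfamiliar with the GHA, the two operations the abstract names (principal component computation and weight vector updating) can be written as a sequential software reference. The sketch below uses the standard GHA update, Sanger's rule, and says nothing about the paper's parallel, staged circuit. The function name `gha_step` and the learning-rate parameter `lr` are assumptions made for illustration.

```python
def gha_step(weights, x, lr):
    """Apply one GHA (Sanger's rule) update and return the new weights.

    weights: list of m weight vectors, each a list with len(x) entries.
    x:       one input sample.
    lr:      learning rate (a hypothetical parameter name).
    """
    # Principal component computation: y_i = w_i . x
    y = [sum(w_j * x_j for w_j, x_j in zip(w, x)) for w in weights]

    updated = []
    recon = [0.0] * len(x)  # running sum of y_k * w_k for k <= i
    for w, y_i in zip(weights, y):
        recon = [r + y_i * w_j for r, w_j in zip(recon, w)]
        # Weight update: w_i += lr * y_i * (x - sum_{k<=i} y_k * w_k)
        updated.append(
            [w_j + lr * y_i * (x_j - r) for w_j, x_j, r in zip(w, x, recon)]
        )
    return updated
```

Calling `gha_step` repeatedly over zero-mean samples drives the weight vectors toward the leading principal directions of the data, in order.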

14.

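FPGA implementation of a dynamic error correction method for pressure sensors (total citations: 2, self-citations: 0, citations by others: 2)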

杨文杰 张志杰 王代华 陈青青 《传感技术学报》 2017, 30(3)

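To correct in real time the dynamic error caused by the dynamic characteristics of pressure sensors, an FPGA implementation scheme based on an IIR digital compensation filter is proposed. The scheme first applies an improved least-squares algorithm to the input and output data from dynamic calibration of the pressure sensor to build a mathematical model that fully describes the sensor system. It then uses zero-pole placement to reconfigure the model's zeros and poles, yielding the optimal IIR compensator model and its parameters. Next, MATLAB tools are used to quantize the compensator parameters while keeping the compensator's performance undistorted or only slightly distorted. Finally, on a data acquisition and storage system with an FPGA as its control core, an IIR compensator soft core is designed from the quantized parameters, achieving real-time correction of the sensor's dynamic error. Experimental results show that the scheme corrects sensor dynamic error effectively and in real time.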
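The two generic steps behind such a compensator, coefficient quantization and the IIR difference equation, can be sketched as follows. The abstract gives no coefficients, so the sketch takes `b` and `a` as arguments, and the names `quantize` and `iir_filter` are hypothetical. The paper's least-squares modelling and zero-pole placement are not reproduced here.

```python
def quantize(value, frac_bits):
    """Round a coefficient to a fixed-point grid with frac_bits fractional bits."""
    scale = 1 << frac_bits
    return round(value * scale) / scale


def iir_filter(b, a, x):
    """Direct-form I IIR filter.

    Implements a[0]*y[n] = sum_k b[k]*x[n-k] - sum_{k>=1} a[k]*y[n-k],
    with samples before n = 0 taken as zero.
    """
    if not a or a[0] == 0:
        raise ValueError("a[0] must be non-zero")
    y = []
    for n in range(len(x)):
        acc = sum(b[k] * x[n - k] for k in range(len(b)) if n - k >= 0)
        acc -= sum(a[k] * y[n - k] for k in range(1, len(a)) if n - k >= 0)
        y.append(acc / a[0])
    return y
```

Passing the coefficients through `quantize` before filtering gives a rough view of how much a given word length distorts the response, which is the check the abstract describes performing in MATLAB.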

15.

Efficient techniques for performing an irregular computation on distributed memory machines

HEMAL V. SHAH JOSÉ A. B. FORTES 《International journal of systems science》2013,44(11):1101-1113

Two techniques to perform an irregularly structured Gröbner basis computation (a basic method used in symbolic polynomial manipulation) on distributed memory machines are developed. The first technique, based on relaxation of dependencies present in the sequential computation, exploits coarse-grain parallelism. In this so-called relaxation approach, at every step, each processor reduces a local pair if available, communicates the result and status information from other processors, and updates the local set of pairs and basis. The basis is replicated on each processor while the set of pairs is distributed across the processors. The computation terminates when no pairs are left to be reduced on each processor. A relaxation algorithm based on this approach, along with its experimental results, is provided. The other technique, named quasi-barrier, is developed to enhance the performance of the relaxation algorithm. Using this technique, load balance and performance can be improved by synchronizing p processors when a fraction of the active tasks are completed. The performance enhancement is significant for large numbers of processors when the distribution of pair reduction times is close to exponential. The experimental results obtained on Intel Paragon and IBM SP2 demonstrate the effectiveness of these techniques.

16.

Hardware circuit design for FPGA-based full-search motion estimation

童桢 王祖强 杨恒 《电子技术应用》 2014, (7): 44-47

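A hierarchical two-dimensional-array hardware circuit for full-search motion estimation is designed. Compared with conventional two-dimensional-array full-search motion estimation circuits, it improves the parallel structure of the processing elements (PEs) and the memory design, saving hardware resources and encoding time. The parallel pipeline structure is arranged according to the timing relationships among the modules, and one column of pixels is processed in parallel, achieving real-time motion estimation encoding.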
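As background, full-search motion estimation evaluates every candidate displacement in a search window and keeps the one with the lowest matching cost. The sequential sketch below assumes the sum of absolute differences (SAD) as the cost, which the abstract does not state, and uses hypothetical names (`sad`, `full_search`). It does not model the paper's PE array or pipeline.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between an n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    return sum(
        abs(cur[by + r][bx + c] - ref[by + dy + r][bx + dx + c])
        for r in range(n)
        for c in range(n)
    )


def full_search(cur, ref, bx, by, n, search):
    """Exhaustively test every displacement within +/- search pixels.

    Frames are lists of rows of integers. Returns (cost, dx, dy) for the
    best in-bounds candidate, or None if no candidate fits in the frame.
    """
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            x0, y0 = bx + dx, by + dy
            if x0 < 0 or y0 < 0 or x0 + n > width or y0 + n > height:
                continue  # candidate block falls outside the reference frame
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best
```

The inner SAD loops are independent across candidates and across pixels, which is what makes the algorithm a natural fit for a parallel PE array.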

17.

Research on and hardware design of FPGA-based interrupt management

李岩 贾小梨 迟欢欢 《电子技术应用》 2011, 37(9): 49-52

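To meet the real-time requirements of embedded operating systems, an FPGA-based interrupt management method is proposed. A structural model of the interrupt management module is given, and the module is implemented in hardware using the VHDL hardware description language. According to the different characteristics of interrupt request and response modes, interrupt management is divided into system interrupt management and user interrupt management, and logic circuits are designed mainly for managing interrupt sources, interrupt nesting, and clock-tick interrupts. Simulation experiments show that, in this structural model, the interrupt...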

18.

Artificial neural networks based dynamic priority arbitration for asynchronous flow control

Naqvi Syed Rameez Akram Tallha Haider Sajjad Ali Kamran Muhammad 《Neural computing & applications》2018,29(7):627-637

Accesses to physical links in Networks-on-Chip need to be appropriately arbitrated to avoid collisions. In the case of asynchronous routers, this arbitration between various clients, carrying messages with different service levels, is managed by dedicated circuits called arbiters. The latter usually allocate the shared resource to each client in a round-robin fashion; however, they may be tuned to favor certain messages more frequently by means of various digital design techniques. In this work, we make use of artificial neural networks to propose a mechanism to dynamically compute priority for each message by defining a few constraints. Based on these constraints, we first build a mathematical model for the objective function, and propose two algorithms for vector selection and resource allocation to train the artificial neural networks. We carry out a detailed comparison between seven different learning algorithms, and observe their effectiveness in terms of prediction efficiency for the application of dynamic priority arbitration. The decision is based on input parameters: available tokens, service levels, and an active request from each client. The performance of the learning algorithms has been analyzed in terms of mean squared error, true acceptance rate, number of epochs and execution time, so as to ensure mutual exclusion.

19.

A three-level, low-overhead technique for mitigating multi-bit upsets in FPGAs

《电子技术应用》2018,(4)

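Commercial off-the-shelf FPGAs are regarded as the only way to meet the growing demand for processing capability in space applications. Because of their sensitivity to multi-bit upsets, dedicated design-hardening techniques are needed against single-event effects in space applications. A fault-tolerance framework with three levels is proposed: the user logic level, the configuration memory level, and the control level. At the user logic level, a new low-overhead FTR strategy is proposed for error detection in user logic. At the configuration memory level, a module- and frame-based dynamic partial reconfiguration strategy is proposed to handle configuration memory errors. At the control level, targeting the Xilinx ZYNQ system-on-chip FPGA, the embedded hard processor is used to save and restore circuit state through a checkpoint and rollback scheme. The whole framework was validated by fault injection on the open-source LEON3 processor with a 7-stage pipeline. The results show that 99.997% of injected faults were corrected in time, at the cost of 85% more LUT resources and 125% more flip-flop resources.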

20.

Design and implementation of an FPGA-based hardware platform for a wireless access point*

方林波 侯义斌 黄樟钦 彭冬鸣 张勇 《计算机应用研究》 2007, 24(11): 257-259

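Motivated by research needs in authentication, security, and QoS, an FPGA-based HostAP design is proposed and a hardware platform for a wireless access point is built, on which the hardware is trimmed and integrated. The hardware circuits of the wireless access point are designed and implemented, and its high-speed PCB placement and routing are completed. During PCB placement and routing, the focus was on solving transmission-line effects in high-speed printed circuit board design. Finally, the actual system was tested.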