
1.

马绪健刘姝高铭泽董秀则《计算机应用研究》2023,40(6):1825-1828+1844

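The GIFT algorithm, an improved version of PRESENT, has a simpler and more efficient structure, yet its performance on FPGAs still leaves room for improvement. This paper proposes a new implementation scheme that reduces the algorithm's 40 rounds of iterative computation to 20 iterations and executes encryption/decryption in parallel with round-key generation. After synthesis on the xc6slx16 FPGA platform, the design reaches a frequency of 194 MHz and a throughput of 1.2 Gbps while consuming 21 clock cycles. The results show that the proposed method performs better and uses fewer clock cycles than existing work, and that high-speed operation on an FPGA is practical.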
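
As a rough consistency check on the figures in this abstract, throughput can be estimated as block size times clock frequency divided by cycles per block. The sketch below assumes a 128-bit block (the abstract does not state the block size; GIFT-128 is the 40-round variant), and the function name is illustrative only.

```python
def throughput_bps(block_bits: int, clock_hz: float, cycles_per_block: int) -> float:
    """Bits per second for a core that completes one block every `cycles_per_block` cycles."""
    return block_bits * clock_hz / cycles_per_block

# Assumption: 128-bit block, not stated in the abstract.
# 194 MHz and 21 cycles are the figures reported above.
estimate = throughput_bps(128, 194e6, 21)
print(f"{estimate / 1e9:.2f} Gbps")  # prints "1.18 Gbps"
```

Under that assumption the estimate is about 1.18 Gbps, which is in line with the reported 1.2 Gbps.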

2.

A parallel FPGA-based implementation of the SOM neural network algorithm (一种基于FPGA的SOM神经网络算法的并行实现)

孔超李占才王沁李昂钱艺《计算机工程》2007,33(19):236-237

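Two main issues that must be considered when implementing the SOM neural network algorithm on an FPGA are analyzed: parallelism and finite word-length effects. Based on this analysis, a highly parallel architecture for the algorithm is proposed, together with concrete circuits for its key modules. Computer simulation and FPGA implementation results show that the architecture preserves neural network performance while allowing the circuit to achieve a high processing speed.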

3.

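Low-complexity FIR filter implementation based on the subexpression space technique (基于子项空间技术的低复杂度FIR滤波器实现). Total citations: 2; self-citations: 2; citations by others: 0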

徐红叶丰黄朝耿《电子技术应用》2014,(6)

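Using subexpression-space sharing and a hardware description language, an FIR digital filter is implemented on an FPGA. The design implements the constant-coefficient multiplication modules with additions and shifts, and uses subexpression sharing to effectively reduce the number of adders. Synthesis results show that the proposed method saves hardware resources and lowers implementation cost, making it suitable for low-power digital system design.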
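
The idea of replacing constant multipliers with shifts and adds, and sharing a common sub-term between coefficients, can be illustrated in software. The sketch below is not the paper's design: the coefficients 7 and 105 and the function name are hypothetical, chosen only because 105 = 15 × 7 lets both products reuse the term 7x.

```python
def mul_shared(x: int):
    """Return (7*x, 105*x) using only shifts, additions and subtractions."""
    t7 = (x << 3) - x        # 7x = 8x - x, computed once
    y7 = t7
    y105 = (t7 << 4) - t7    # 105x = 16*(7x) - 7x, reuses the shared term
    return y7, y105

print(mul_shared(3))  # prints "(21, 315)"
```

Both products together cost two adder/subtractor operations here, whereas expanding 105 = 64 + 32 + 8 + 1 on its own would already take three additions.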

4.

FPGA-based circuit design for a micro inertial measurement unit (基于FPGA的微惯性测量组合电路设计)

徐苛杰何鹏举张冰《传感技术学报》2006,19(6):2536-2539,2543

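To meet tactical and technical requirements, micro inertial measurement devices must be small, lightweight, fast and highly real-time capable, and no satisfactory way of achieving this currently exists. This paper proposes an FPGA-based hardware architecture in which the signal acquisition, processing and output circuit platform of a micro inertial measurement unit is built inside the FPGA using a hardware description language; it has performed well in application. The circuit design is highly general: when different sensing elements are selected, the on-chip system can be quickly reconfigured through in-system programming to form a new micro inertial measurement unit.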

5.

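A surface-EMG-based motion recognition method applicable to patients at different Brunnstrom stages (适用不同Brunnstrom等级患者基于表面肌电信号的动作识别方法)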

王丰焱张道辉李自由赵新刚《机器人》2020,42(6):661-671,685

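To address the low recognition rate of motion intention in stroke patients with different degrees of impairment, a motion recognition method based on surface electromyography (sEMG) signals is proposed for patients at different Brunnstrom stages. First, sEMG data from patients at all stages are pooled and features are extracted with the tsfresh library; features are then selected using a random forest (RF) model, and the selected features are used to train a motion classification model. Further, by studying the relationship between motions and rehabilitation stage, rehabilitation assessment motions are determined and an automatic rehabilitation-stage assessment algorithm is designed. To validate the method, it was tested on sEMG data from 24 patients; the results show that it raises the average recognition accuracy for 9 motions and 6 rehabilitation stages to 89.81% and 94%, respectively. A hand rehabilitation robot system built on the proposed method can assess rehabilitation stage automatically.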

6.

Operating state assessment of industrial processes based on an improved random forest algorithm (基于改进随机森林算法的工业过程运行状态评价)

常玉清孙雪婷钟林生王福利刘英娇《自动化学报》2021,47(9):2214-2225

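Operating state assessment refers to further judging how good or poor the operating state of a production process is, given that the process is running normally. For complex industrial processes in which quantitative and qualitative information coexist, this paper proposes a random-forest-based method for assessing the operating state. To address the redundancy of information among decision trees in a random forest, the trees of a conventional random forest are grouped by mutual information, and the best tree in each group is selected to form a new random forest. In addition, to strengthen the influence of decision trees with high assessment accuracy on the final result and weaken that of trees with low accuracy, a weighted voting mechanism replaces the conventional majority vote, yielding a mutual information weighted random forest (MIWRF) algorithm. For online assessment, the probability that online data belong to each grade is computed and combined with the proposed online assessment strategy to determine the operating-state grade of the current sample. To validate the algorithm, the method is applied to a hydrometallurgical leaching process; the experimental results show that, compared with the conventional random forest algorithm, MIWRF reduces model complexity while improving assessment accuracy.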
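
The difference between majority voting and weighted voting can be shown with a small sketch. It is only an illustration of the general mechanism, not the MIWRF algorithm itself: the abstract does not give the weighting formula, so the per-tree weights below are hypothetical accuracy scores, and the labels and function names are made up for the example.

```python
from collections import Counter, defaultdict

def majority_vote(predictions):
    """Conventional vote: the most frequent label wins."""
    return Counter(predictions).most_common(1)[0][0]

def weighted_vote(predictions, weights):
    """Each tree adds its weight to the label it predicts; the largest total wins."""
    scores = defaultdict(float)
    for label, weight in zip(predictions, weights):
        scores[label] += weight
    return max(scores, key=scores.get)

tree_predictions = ["good", "poor", "poor"]
tree_weights = [0.9, 0.3, 0.4]  # hypothetical per-tree accuracies

print(majority_vote(tree_predictions))                # prints "poor"
print(weighted_vote(tree_predictions, tree_weights))  # prints "good"
```

With these numbers a single accurate tree outweighs two less accurate ones, which is the effect the weighted scheme is meant to produce.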

7.

Simulation verification of an AS5643 protocol-processing FPGA (AS5643协议处理FPGA的仿真验证)

马宁田泽史嘉涛赵志强《计算机技术与发展》2014,(5):153-156

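In modern FPGA design, simulation verification is the process of demonstrating that a design correctly implements its function, and is one of the effective means of ensuring design quality. Based on an analysis of the AS5643 protocol, this paper builds an effective and reliable virtual verification platform, focuses on how such a platform is constructed, and develops the corresponding functional models and test cases. These functional models are attached to the external interfaces of the FPGA, and initialization information is written into the relevant registers and the configuration DPRAM, so that the FPGA's operation is emulated and the various tests can be carried out. The platform is suited to dedicated AS5643 protocol-processing FPGAs; the verification method improves verification efficiency and shortens the overall design verification cycle.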

8.

Design of a new wireless sensor network node (一种新型无线传感器网络节点设计)

《微型机与应用》2014,(12):48-50

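A wireless sensor network node design based on FPGA dynamic partial reconfiguration is proposed. The FPGA's efficient computing capability raises the node's processing capability, while dynamic partial reconfiguration is used for power control. A hardware platform was designed according to the scheme, and the reconfiguration flow and implementation method were verified on it. Experimental results show that the scheme achieves partial reconfigurability of the node, reducing power consumption while retaining strong computing capability.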

9.

Design and implementation of a fused processing architecture for graphics algorithms (图形算法融合处理体系结构设计与实现)

孙富明李笑盈王沁《计算机辅助设计与图形学学报》2010,22(9)

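To build a unified implementation platform for different graphics processing algorithms, a hardware-oriented architecture for fused processing of multiple algorithms is proposed. The architecture combines path control, parameter control, reuse control and state detection with a mathematical operation library, and uses a serial structure to implement multiple graphics processing algorithms in a unified way. On this basis, the texture mapping algorithm and the depth-image 3D warping algorithm are fused in an FPGA-oriented design. Finally, verification and resource analysis on an FPGA platform gave the expected good results.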

10.

A random forest algorithm based on space transformation (基于空间变换的随机森林算法)

关晓蔷王文剑庞继芳孟银凤《计算机研究与发展》2021,58(11):2485-2499

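Random forest is a commonly used classification algorithm in machine learning, with advantages such as wide applicability and resistance to overfitting. To improve its ability to handle multi-class problems, a space transformation based random forest algorithm (ST-RF) is proposed. First, a priority class based linear discriminant analysis (PCLDA) method is given, which applies a projection matrix targeted at the priority class to transform the samples, so as to better separate priority-class samples from those of other classes. Next, PCLDA is introduced into the construction of the random forest: a priority class is chosen at random for each decision tree to preserve the diversity of the forest, and PCLDA is used to build decision trees that focus on different priority classes, improving the accuracy of individual trees and thereby the overall classification performance of the ensemble. Finally, ST-RF is compared with 7 typical random forest algorithms on 10 standard datasets to validate it, and the PCLDA-based space transformation strategy is also applied to the comparison algorithms so that their performance before and after the modification can be compared. The experimental results show that ST-RF has a clear advantage on multi-class problems, and that the proposed space transformation strategy is broadly applicable and can significantly improve the classification performance of the original algorithms.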

11.

A fully pipelined FPGA accelerator for scale invariant feature transform keypoint descriptor matching

《Microprocessors and Microsystems》2020


12.

Random number generation based on Galois linear feedback shift registers (基于Galois线性反馈移位寄存器的随机数产生)

谷晓忱张民选《计算机工程与科学》2011,33(5):44

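As the computing capability of FPGAs keeps increasing, more and more research uses FPGAs for computational acceleration. Many of the accelerated applications need a random number generator. Using the Leap Forward method, this paper proposes a hardware structure that generates random numbers with a Galois-type linear feedback shift register. The characteristics of the transition matrix in this structure are analyzed in detail, and methods for raising the operating speed and reducing hardware area are given. Using this structure, a random number generator with 16-bit output was designed and implemented on a Xilinx Virtex-6 FPGA. Experimental results show that the generator uses only 6 slices, runs at up to 951 MHz, and reaches a random number throughput of 15.2 Gbps. The K-S method is used to test the quality of the generated random numbers, and the CDF curve of the 105 generated random numbers is compared with the theoretical CDF.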
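
A Galois LFSR and the leap-forward idea can be sketched in software as follows. This is a minimal illustration under stated assumptions, not the paper's circuit: the abstract does not give the feedback polynomial or the transition matrix, so the tap mask 0xB400 (a widely used maximal-length 16-bit choice) and all function names are assumptions. In hardware, leap-forward evaluates the k-step state transition in a single clock cycle; here it is simply emulated by k single steps.

```python
TAPS = 0xB400  # assumed polynomial x^16 + x^14 + x^13 + x^11 + 1

def step(state: int) -> int:
    """Advance a 16-bit Galois LFSR by one bit."""
    lsb = state & 1
    state >>= 1
    if lsb:
        state ^= TAPS  # feedback is XORed into the tap positions
    return state

def leap(state: int, k: int = 16) -> int:
    """Advance the register by k bits, emulating one leap-forward cycle."""
    for _ in range(k):
        state = step(state)
    return state

def words(seed: int, count: int):
    """Return `count` successive 16-bit words, one per leap."""
    out, state = [], seed
    for _ in range(count):
        state = leap(state)
        out.append(state)
    return out

sample = words(0xACE1, 4)  # seed must be nonzero
```

Producing 16 bits per clock at 951 MHz gives 16 × 951 MHz ≈ 15.2 Gbps, which matches the throughput reported in the abstract.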

13.

A novel architecture design for VLSI implementation of integer DCT in HEVC standard

Loukil Hassen Masmoudi Nouri 《Multimedia Tools and Applications》2020,79(33-34):23977-23993

This paper presents novel hardware for a unified architecture that computes the 4×4, 8×8, 16×16 and 32×32 two-dimensional (2-D) integer DCT for the HEVC standard using a single 1-D DCT block, with reduced complexity and hardware resources. Because large HEVC transforms require a huge number of computations, especially multiplications, the paper proposes a modified algorithm that reduces the computational complexity. The goal is to maximize circuit reuse during the computation while preserving the quality of the encoded video. The hardware architecture is described in VHDL and synthesized on an Altera FPGA. Its throughput reaches up to 52 million pixels per second at a 90 MHz clock frequency. An IP core is presented that uses an embedded video system on a programmable chip (SoPC) to implement and validate the proposed design. Finally, the proposed architecture offers significant advantages in hardware cost and performance over related work in the literature. It can be used in real-time ultra-high-definition (UHD) TV coding applications.

14.

Architecture design of the high-throughput compensator and interpolator for the H.265/HEVC encoder

Grzegorz Pastuszak Maciej Trochimiuk 《Journal of Real-Time Image Processing》2016,11(4):663-673

This paper presents the architecture of the high-throughput compensator and the interpolator used in the motion estimation of the H.265/HEVC encoder. The architecture can process 8×8 blocks in each clock cycle. The design allows the random order of checked coding blocks and motion vectors. This feature makes the architecture suitable for different search algorithms. The interpolator embeds 64 multiplierless reconfigurable filter cores to support computations for different fractional-pel positions. Synthesis results show that the design can operate at 200 and 400 MHz when implemented in FPGA Arria II and TSMC 90 nm, respectively. The computational scalability enables the proposed architecture to trade the throughput for the compression efficiency. If 2160p@30fps video is encoded, the design clocked at 400 MHz can check about 100 motion vectors for 8×8 blocks.

15.

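Soft sensing of dioxin emissions in municipal solid waste incineration based on broad hybrid forest regression (基于宽度混合森林回归的城市固废焚烧过程二噁英排放软测量)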

夏恒汤健崔璨麟乔俊飞《自动化学报》2023,49(2):343-365

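Dioxins are trace organic pollutants emitted by municipal solid waste incineration (MSWI). Because of the complexity and high cost of the relevant techniques, the long time delay in measuring dioxin emission concentration has become one of the key factors limiting optimal control of the MSWI process. Data-driven soft-sensor models, which are low-cost, fast and accurate, can address this problem, but a dioxin modeling method must suit the small-sample, high-dimensional nature of the data. To this end, a broad hybrid forest regression soft-sensing method is proposed, consisting of a feature mapping layer, a latent feature extraction layer, a feature enhancement layer and an incremental learning layer. First, a hybrid forest group composed of random forests and completely random forests is built for high-dimensional feature mapping. Second, latent features are extracted from the fully connected hybrid matrix according to contribution rate, and an information measurement criterion is used to maximize the transfer of potentially valuable information and minimize redundancy, reducing model complexity and computational cost. Then, the feature enhancement layer is trained on the extracted latent information to strengthen feature representation. Finally, the incremental learning layer is built with an incremental learning strategy, after which the weight matrix is obtained with the Moore-Penrose pseudoinverse. Experimental results on benchmark datasets and an MSWI dioxin dataset demonstrate the effectiveness and superiority of the method.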

16.

SIMD architecture on FPGA for scientific computing aboard a space instrument

《Journal of Systems Architecture》2016

In this paper we propose a SIMD multiprocessor architecture that achieves high floating-point performance on FPGA devices. This architecture is used in an instrument that carries out scientific analysis aboard ESA's Solar Orbiter mission. We present a programming language and a compiler that automate the SIMD configuration process starting from sequential code. The proposed architecture squeezes the FPGA resources in order to meet the time constraints. The resulting FPGA system outperforms the ground-based system built on commercial CPUs in both time and power consumption.

17.

TLP-LDPC: Three-Level Parallel FPGA Architecture for Fast Prototyping of LDPC Decoder Using High-Level Synthesis


Yi-Fan Zhang Lei Sun Qiang Cao 《计算机科学技术学报》2022,37(6):1290-1306

Low-Density Parity-Check (LDPC) codes, with their excellent error-correction capabilities, have been widely used in both data communication and storage to construct reliable cyber-physical systems that are resilient to real-world noise. Fast prototyping of a field-programmable gate array (FPGA)-based decoder is essential to achieve high decoding performance while accelerating the development process. This paper proposes a three-level parallel architecture, TLP-LDPC, to achieve high throughput by fully exploiting the characteristics of both LDPC and the underlying hardware while scaling effectively to large FPGA platforms. The three-level parallel architecture contains a low-level decoding unit, a mid-level multi-unit decoding core, and a high-level multi-core decoder. The low-level decoding unit is a basic LDPC computation component that combines the features of the LDPC algorithm with the specific structures of the FPGA (e.g., Look-Up Tables, LUTs) and eliminates potential data conflicts. The mid-level decoding core integrates the input/output and multiple decoding units in a well-balanced pipelined fashion. The top-level multi-core architecture makes full use of board-level resources to improve the overall throughput. We develop LDPC C++ code with dedicated pragmas and leverage HLS tools to implement the TLP-LDPC architecture. Experimental results show that TLP-LDPC achieves 9.63 Gbps end-to-end decoding throughput on a Xilinx Alveo U50 platform, 3.9x higher than existing HLS-based FPGA implementations.

18.

FPGA-implementation of atan(Y/X) based on logarithmic transformation and LUT-based techniques

R. Gutierrez V. Torres J. Valls 《Journal of Systems Architecture》2010,56(11):588-596

This paper presents an architecture for computing the atan(Y/X) operation, suitable for broadband communication systems that require a throughput between 20 and 40 MHz. The proposed architecture implements the division of the two inputs by means of a logarithmic transformation, so that the division can be performed as a subtraction. A combination of non-uniform segmentation and the multipartite LUT technique is proposed for the approximation of the arctangent of the logarithm. The architecture was implemented in a Xilinx FPGA device, achieving higher throughput than the approach based on the CORDIC algorithm and lower area than previous LUT-based approaches.
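
The logarithmic trick described in this abstract can be sketched numerically: since log2(Y/X) = log2(Y) - log2(X), the division becomes a subtraction, and the result is then mapped through atan(2^d). The sketch below uses floating-point library calls in place of the paper's fixed-point segmentation and LUTs, handles only positive inputs, and uses an illustrative function name.

```python
import math

def atan_via_log(y: float, x: float) -> float:
    """atan(y/x) for x, y > 0, with the division done as a log-domain subtraction."""
    d = math.log2(y) - math.log2(x)  # equals log2(y/x); no divider needed
    return math.atan(2.0 ** d)       # a hardware design would approximate this mapping with LUTs

print(abs(atan_via_log(3.0, 4.0) - math.atan2(3.0, 4.0)) < 1e-9)  # prints "True"
```

Handling signs and quadrants, and choosing word lengths for the log and arctangent tables, are left out of this sketch.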

19.

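Reliability study of a JBits-based reconfigurable data processing system (基于JBits的一种可重构数据处理系统可靠性研究). Total citations: 1; self-citations: 0; citations by others: 1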

任小西李仁发金声震张克环吴强《计算机研究与发展》2007,44(4):722-728

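The Space Solar Telescope (SST) is a scientific satellite for observing the Sun; it uses FPGA chips to preprocess the large volume of data collected each day. Given the high construction cost and harsh operating environment, ensuring the high reliability of SST data is a demanding task. The conventional TMR structure is improved, and a configuration-data-based method for detecting and repairing faults in reconfigurable hardware is proposed, with the JBits tool used to simplify operations on configuration data. The structure and method detect faults promptly and remove them through hardware reconfiguration, improving system reliability. System reliability is analyzed using Markov process theory, and the results show that it can be improved significantly.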

20.

Chaos blended cellular automata on fractals: the effective way of reconfigurable hardware assisted medical image privacy

Sivaraman R. Vijaykumar Ajay Savarinathan Prem Jayapalan Avila 《Multimedia Tools and Applications》2022,81(23):33087-33106

In recent times, data continues to be generated at an unprecedented scale as a result of the pervasive nature of modern-day digitisation. It is therefore critical that this data be accessed only by the trusted parties concerned, in order to maintain the privacy of individuals. One type of data that could severely compromise the identity and privacy of an individual is medical data. With a focus on medical images, this work proposes a novel 'fractalized' chaos-cellular automata encryption scheme, implemented on a Cyclone IV EP2C35F672C6 FPGA, resulting in a hardware-based concurrent security solution. The scheme entails three stages of diffusion, which arise from different mechanisms. In tandem with the diffusion process is the "on the fly" confusion process governed by a linear feedback shift register (LFSR), all of which is implemented by applying the nature of fractals. The security architecture occupies 16,351 logic elements (LEs) with 230 registers on the target FPGA, with a power dissipation of 133.39 mW. Further, the encryption achieves near-zero correlation with an average entropy of 15.17156, which ensures the statistical properties. In addition, the security framework requires 12.13 ms to encrypt a 256×256×16 DICOM image, which results in a throughput of 86.44 Mbps. The proposed encryption resists brute-force and chosen-plaintext attacks by achieving a very large keyspace.