1.

Low-complexity FFT/IFFT IP hardware macrocells for OFDM and MIMO–OFDM CMOS transceivers

Sergio Saponara Nicola E. L’Insalata Luca Fanucci 《Microprocessors and Microsystems》2009,33(3):191-200

The paper presents an automated environment for fast design space exploration and automatic generation of FFT/IFFT macrocells with minimum circuit and memory complexity within the numerical accuracy budget of the target application. The effectiveness of the tool is demonstrated through FPGA and CMOS implementations (90 nm, 65 nm and 45 nm technologies) of the baseband processing in embedded OFDM transceivers. Compared with state-of-the-art FFT/IFFT IP cores, the proposed work provides macrocells with lower circuit complexity while keeping the same system performance (throughput, transform size and accuracy), and it is the first to address the requirements of all OFDM standards, including MIMO systems: 802.11 WLAN, 802.16 WMAN, Digital Audio and Video Broadcasting in terrestrial, handheld and hybrid satellite scenarios, Ultra Wide Band, Broadband on Power Lines, and xDSL.

2.

Hardware Chip Performance of CORDIC Based OFDM Transceiver for Wireless Communication

Amit Kumar Adesh Kumar Geetam Singh Tomar 《计算机系统科学与工程》2022,40(2):645-460

3.

Design and implementation of an OFDM radar signal source

张卫薄超顾红苏卫民《数据采集与处理》2014,29(4):584-589

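To verify the feasibility of orthogonal frequency division multiplexing (OFDM) radar signals, this paper studies the signal generation principle and the waveform design method, and proposes a real-time radar signal generation scheme based on field-programmable gate array (FPGA) technology and the fast Fourier transform (FFT) algorithm. The scheme uses digital programmable techniques to implement orthogonal multi-carrier modulation; the signal parameters and modulation mode are reconfigurable, which meets the requirements of a variety of applications. The paper discusses the key steps of OFDM signal generation and presents test results. Field trials verified the feasibility of the scheme.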

4.

Implementation of a Viterbi decoder for MB-OFDM UWB communication systems

徐卓王雪静叶凡任俊彦《计算机工程》2008,34(18):117-119

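A design is proposed for a Viterbi decoder for multiband orthogonal frequency division multiplexing (MB-OFDM) ultra-wideband communication systems, and the performance of the convolutional/punctured codes used by MB-OFDM and of the corresponding Viterbi decoding algorithm is analyzed. To reach the highest data rate required by the system while keeping the hardware cost economical, the decoder's hardware structure combines the sliding-window and folding methods. In low-speed operating modes, some processing units are disabled to save power. The design was verified on a Xilinx Virtex-4 FPGA, with a maximum decoding rate of 432 Mb/s.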

5.

Enhancing IEEE 802.11a/n with dynamic single-user OFDM adaptation

James Gross Marc Emmelmann Oscar Puñal Adam Wolisz 《Performance Evaluation》2009,66(3-5):240-257

Earlier papers have demonstrated that the achievable throughput of OFDM systems can benefit significantly from individual modulation/transmit power selection on a per-sub-carrier basis according to the actual gain of individual sub-carriers (the so-called dynamic OFDM scheme). Using such an approach, however, requires support for additional functionality such as acquisition of the sub-carrier gains and signaling of the modulation types used between sender and receiver. Therefore dynamic OFDM is actively pursued for future radio interfaces, rather than considered as an extension of existing OFDM-based standards. In this paper we introduce a proposal on how the widely accepted IEEE 802.11a/g systems as well as the emerging IEEE 802.11n system might be extended to support dynamic OFDM in a single-user (point-to-point) setting. The presented approach guarantees backward compatibility with legacy devices. We address these issues by presenting (a) a set of protocol modifications required to incorporate dynamic OFDM in 802.11a/g/n; and (b) a performance evaluation of the suggested extension (referred to further on as single-user 802.11 DYN mode). Although 802.11n already includes advanced MAC and PHY features, i.e., frame aggregation and MIMO transmissions, our performance evaluation demonstrates that a further improvement is achievable by incorporating dynamic OFDM.

6.

Architectural design and FPGA implementation of radix-4 CORDIC processor

Kaushik Bhattacharyya Rakesh Biswas Anindya Sundar Dhar Swapna Banerjee 《Microprocessors and Microsystems》2010,34(2-4):96-101

A new scaled radix-4 CORDIC architecture that incorporates pipelining and parallelism is presented. The latency of the architecture is n/2 clock cycles and the throughput rate is one valid result per n/2 clock cycles for n-bit precision. A 16-bit radix-4 CORDIC architecture is implemented on the available FPGA platform. The corresponding latency is eight clock cycles and the throughput rate is one valid result per eight clock cycles. The entire scaled architecture operates at a clock rate of 56.96 MHz with a power consumption of 380 mW. The speed can be increased by using an upgraded FPGA device. The architecture yields a speed-area optimized processor that is suitable for real-time applications.
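
For orientation, the CORDIC idea behind this entry can be sketched in a few lines. The sketch below is the conventional radix-2 rotation-mode algorithm in floating point, not the paper's radix-4 fixed-point architecture; the function names `cordic_rotate` and `radix4_latency` are hypothetical, and the only figure taken from the abstract is the n/2-cycle latency.

```python
import math


def cordic_rotate(angle, iterations=16):
    """Radix-2 CORDIC in rotation mode: returns (cos(angle), sin(angle)).

    Converges for |angle| up to roughly 1.74 rad (the sum of the arctan table).
    Floating point stands in for the fixed-point datapath of real hardware.
    """
    # Elementary rotation angles atan(2^-i); hardware would keep these in a ROM.
    atan_table = [math.atan(2.0 ** -i) for i in range(iterations)]
    # Each micro-rotation stretches the vector by sqrt(1 + 2^-2i);
    # pre-scaling the start vector compensates for the accumulated gain.
    gain = 1.0
    for i in range(iterations):
        gain /= math.sqrt(1.0 + 2.0 ** (-2 * i))
    x, y, z = gain, 0.0, angle
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0  # rotate so the residual angle z moves toward 0
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * atan_table[i]
    return x, y


def radix4_latency(n_bits):
    """Latency in clock cycles as quoted in the abstract: n/2 for n-bit precision."""
    return n_bits // 2


if __name__ == "__main__":
    print(cordic_rotate(0.5), (math.cos(0.5), math.sin(0.5)))
    print(radix4_latency(16))
```

A radix-2 design needs about one iteration per bit of precision, which is why halving the iteration count with radix-4 steps gives the n/2 latency the abstract reports.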

7.

Design of a high-speed real-time FFT processor based on FPGA

付宜利王光国靳保《微计算机信息》2007,23(5):194-195

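To meet the real-time signal processing requirements of robot sensitive skin, the system uses an FPGA to implement the fast Fourier transform (FFT) algorithm. Building on an analysis of the radix-2 FFT algorithm, the paper adopts a synchronous pipeline structure and uses a field-programmable gate array (FPGA) to perform a 256-point FFT on 16-bit complex samples. Experimental results show that the FPGA implementation of the FFT has good real-time performance and meets the real-time signal processing requirements of robot sensitive skin.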
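
As a software reference for the radix-2 algorithm this entry builds on, here is a minimal iterative decimation-in-time FFT. It is a sketch in double-precision floating point, not the paper's 16-bit pipelined hardware; the function name `fft_radix2` is an assumption made for illustration.

```python
import cmath


def fft_radix2(x):
    """Iterative radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    a = list(x)
    # Reorder the input into bit-reversed index order.
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j |= bit
        if i < j:
            a[i], a[j] = a[j], a[i]
    # log2(n) stages, each made of n/2 butterflies.
    size = 2
    while size <= n:
        half = size // 2
        w_step = cmath.exp(-2j * cmath.pi / size)
        for start in range(0, n, size):
            w = 1.0 + 0j
            for k in range(half):
                u = a[start + k]
                t = w * a[start + k + half]
                a[start + k] = u + t          # upper butterfly output
                a[start + k + half] = u - t   # lower butterfly output
                w *= w_step
        size *= 2
    return a


if __name__ == "__main__":
    n = 256
    tone = [cmath.exp(2j * cmath.pi * 5 * t / n) for t in range(n)]
    spectrum = fft_radix2(tone)
    print(abs(spectrum[5]))  # energy concentrates in bin 5
```

For 256 points the loop runs 8 stages of 128 butterflies each; a pipelined hardware design typically maps those stages onto dedicated butterfly units.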

8.

Massively parallel acceleration of a document-similarity classifier to detect web attacks

Craig Ulmer Maya Gokhale Brian Gallagher Philip Top Tina Eliassi-Rad 《Journal of Parallel and Distributed Computing》2011,71(2):225-235

This paper describes our approach to adapting a text document similarity classifier based on the Term Frequency Inverse Document Frequency (TFIDF) metric to two massively multi-core hardware platforms. The TFIDF classifier is used to detect web attacks in HTTP data. In our parallel hardware approaches, we design streaming, real-time classifiers by simplifying the sequential algorithm and manipulating the classifier’s model to allow decision information to be represented compactly. Parallel implementations on the Tilera 64-core System on Chip and the Xilinx Virtex 5-LX FPGA are presented. For the Tilera, we employ a reduced state machine to recognize dictionary terms without requiring explicit tokenization, and achieve a throughput of 37 MB/s at a slightly reduced accuracy. For the FPGA, we have developed a set of software tools to help automate the process of converting training data to synthesizable hardware and to provide a means of trading off between accuracy and resource utilization. The Xilinx Virtex 5-LX implementation requires 0.2% of the memory used by the original algorithm. At 166 MB/s (80X the software), the hardware implementation is able to achieve Gigabit network throughput at the same accuracy as the original algorithm.

9.

A new design and implementation method for OFDM modulation in CMMB systems

郝禄国陈蕉容刘立程《计算机应用研究》2012,29(1):174-176

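The OFDM modulation technique in Part 1 of the CMMB standard (the STiMi subsystem) is studied, and a new FPGA-based method for designing and implementing OFDM modulation is proposed. The pipeline structure of the IFFT, the core algorithm of OFDM, is improved, which greatly reduces the number of FPGA multipliers used and saves hardware resources. FPGA implementation and verification show that the method implements OFDM modulation for CMMB systems correctly and efficiently.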

10.

Research on FPGA implementation of a high-throughput floating-point FFT processor

牟胜梅杨晓东《计算机工程与科学》2008,30(7):98-99

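Limited by the long pipeline latency of floating-point operations and by the number of ports on FPGA on-chip RAM, the throughput of conventional FFT processors usually reaches only one complex result per cycle. This paper designs and implements on an FPGA a high-throughput single-precision floating-point FFT processor conforming to the IEEE 754 standard. By improving the structure of the butterfly unit and reorganizing accesses to the FPGA on-chip RAM, the processor outputs about two complex results per cycle on average, roughly twice the throughput of a conventional FFT processor. A 1024-point FFT completes in (512 + 10) × 10 = 5220 cycles.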
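
The cycle count quoted for the 1024-point transform can be reproduced with a short calculation. The reading used here is an assumption, not something the abstract states: 512 is taken as the number of butterflies per stage (N/2), the multiplier 10 as the number of radix-2 stages (log2 N), and the added 10 as a fixed per-stage overhead. The names `fft_cycles` and `results_per_cycle` are hypothetical.

```python
def fft_cycles(n_points, stage_overhead=10):
    """Cycle estimate of the form (N/2 + overhead) * log2(N).

    Assumed model: one butterfly per cycle, so each of the log2(N) stages
    takes N/2 cycles plus a fixed overhead (10 in the quoted figure).
    """
    if n_points < 2 or n_points & (n_points - 1):
        raise ValueError("n_points must be a power of two >= 2")
    stages = n_points.bit_length() - 1  # exact log2 for powers of two
    return (n_points // 2 + stage_overhead) * stages


def results_per_cycle(n_points, stage_overhead=10):
    """Average complex results written per cycle across all stages."""
    stages = n_points.bit_length() - 1
    return n_points * stages / fft_cycles(n_points, stage_overhead)


if __name__ == "__main__":
    print(fft_cycles(1024))         # (512 + 10) * 10
    print(results_per_cycle(1024))  # a little under 2
```

Under this model each butterfly produces two complex outputs per cycle, which is consistent with the "about two complex results per cycle" figure in the abstract.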

11.

General memory efficient packet matching FPGA architecture for future high-speed networks

《Microprocessors and Microsystems》2020

Packet classification (matching) is one of the critical operations in networking, widely used in many different devices and tasks ranging from switching or routing to a variety of monitoring and security applications like firewall or IDS. To satisfy the ever-growing performance demands of current and future high-speed networks, specially designed hardware accelerated architectures implementing packet classification are necessary. These demands are now growing to such an extent that, in order to keep up with the rising throughputs of network links, the FPGA accelerated architectures are required to perform matching of multiple packets in every single clock cycle. To meet this requirement a simple replication approach can be utilized: instantiate multiple copies of a processing pipeline matching incoming packets in parallel. However, simple replication of pipelines inseparably brings a significant increase in utilization of FPGA resources of all types, which is especially costly for rather scarce on-chip memories used in matching tables. We propose and examine a unique parallel hardware architecture for hash-based exact match classification of multiple packets in each clock cycle that offers a reduction of memory replication requirements. The core idea of the proposed architecture is to exploit the basic memory organization structure present in all modern FPGAs, where hundreds of individual block or distributed memory tiles are available and can be accessed (addressed) independently. This way, we are able to maintain a rather high throughput of matching multiple packets per clock cycle even without fully replicated memory resources in matching tables. Our results show that the designed approach can use on-chip memory resources very efficiently and even scales exceptionally well with increased capacities of match tables. For example, the proposed architecture is able to achieve a throughput of more than 2 Tbps (over 3 000 Mpps) with an effective capacity of more than 40 000 IPv4 flow records at the cost of only a few hundred block memory tiles (366 BlockRAM for Xilinx or 672 M20K for Intel FPGAs) utilizing only a small fraction of available logic resources (around 68 000 LUTs for Xilinx or 95 000 ALMs for Intel).

12.

Peak-to-average power ratio characteristics of MB-OFDM-based multimedia sensor networks

刘明珠于淇修德斌杨莘元《传感器与微系统》2008,27(5):52-55

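To address the large information volume and strict power limits in data processing and wireless transmission for multimedia wireless sensor networks, a broadband data transmission technology, multiband orthogonal frequency division multiplexing (MB-OFDM) modulation, is proposed to achieve high-speed, reliable, real-time processing of large amounts of multimedia sensing information. By analyzing the characteristics of OFDM modulation and studying its peak-to-average power ratio (PAPR), a scheme that inserts phase identification symbol sequences is proposed to effectively improve the PAPR. Simulation results show that the scheme can improve the reliability of data transmission in multimedia sensor networks, reduce the frequency band resources occupied, and lower the system's transmit power consumption.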
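
To make the PAPR metric in this entry concrete, the sketch below computes it for one OFDM symbol. It only illustrates the definition (peak instantaneous power over mean power, in dB); it does not implement the paper's phase identification sequence scheme. The helpers `ofdm_symbol` and `papr_db` are hypothetical names, and a direct inverse DFT stands in for the IFFT.

```python
import cmath
import math


def ofdm_symbol(subcarriers):
    """Time-domain OFDM symbol: direct inverse DFT of the subcarrier values."""
    n = len(subcarriers)
    return [
        sum(value * cmath.exp(2j * cmath.pi * k * t / n)
            for k, value in enumerate(subcarriers)) / n
        for t in range(n)
    ]


def papr_db(samples):
    """Peak-to-average power ratio of a sample sequence, in decibels."""
    powers = [abs(s) ** 2 for s in samples]
    return 10.0 * math.log10(max(powers) / (sum(powers) / len(powers)))


if __name__ == "__main__":
    n = 64
    # Worst case: all subcarriers in phase add up to a single time-domain peak.
    aligned = ofdm_symbol([1.0] * n)
    # Best case: a single active subcarrier has a constant envelope.
    single = ofdm_symbol([1.0] + [0.0] * (n - 1))
    print(papr_db(aligned), papr_db(single))
```

With all N subcarriers carrying the same phase the ratio reaches its maximum of N, which is why PAPR reduction schemes generally work by altering the subcarrier phases.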

13.

Evolutionary circuit design for fast FPGA-based classification of network application protocols

《Applied Soft Computing》2016

Evolutionary design can produce fast and efficient implementations of digital circuits. This paper shows how evolved circuits, optimized for latency and area, can increase the throughput of a manually designed classifier of application protocols. The classifier is intended for high-speed networks operating at 100 Gbps. Because very low latency is the main design constraint, the classifier is constructed as a combinational circuit in a field programmable gate array (FPGA). The classification is performed using the first packet carrying the application payload. The improvements in latency (and area) obtained by Cartesian genetic programming are validated using a professional FPGA design tool. The quality of classification is evaluated by means of real network data. All results are compared with commonly used classifiers based on regular expressions describing application protocols.

14.

Design and performance study of an OFDM-based MIMO-TDCS

莫建云任清华《电子技术应用》2012,38(11):122-125

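To address the low spectrum utilization of multiple-input multiple-output transform domain communication systems (MIMO-TDCS), a system that loads multiple data symbols onto a TDCS symbol in the frequency domain is proposed, based on the FFT/IFFT modulation and demodulation approach used in OFDM systems. In theory, the system can transmit multiple data symbols in one TDCS symbol, which effectively improves bandwidth utilization, and using the FFT in demodulation reduces the complexity of the demodulator structure. Simulation analysis verifies that, for a fixed symbol transmit power and a guaranteed bit error rate, the OFDM-based MIMO-TDCS effectively improves spectrum utilization.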

15.

A high-density multi-point LAPS set-up using a VCSEL array and FPGA control

Torsten Wagner Carl Frederik B. Werner Ko-Ichiro Miyamoto Michael J. Schöning Tatsuo Yoshinobu 《Sensors and actuators. B, Chemical》2011,154(2):124-128

A new LAPS (light-addressable potentiometric sensor) set-up is introduced, in which the light sources are miniaturised by using a VCSEL (vertical-cavity surface-emitting laser) array to increase the measurement spot density. An FPGA (field-programmable gate array) is used to generate modulation signals for individual illumination of each measurement spot of the LAPS. The new set-up can operate a large number of measurement spots simultaneously by reading out the sum photocurrent and separating the signals of the individual measurement spots by an FFT analysis. The frequency, amplitude and offset of the modulation signal can be configured for each measurement spot by software. The new system can be combined with a positioning stage, allowing the parallel read-out of a single line of measurement spots and a scan perpendicular to that line, in a manner similar to an optical scanner set-up. First measurements demonstrate the functionality of the new LAPS set-up as a chemical imaging system.
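
The read-out principle described here, summing the photocurrents and separating the spots by frequency, can be illustrated with a toy model. Everything in the sketch is an assumption made for illustration: each spot is modelled as a pure sine at its own integer frequency bin, the amplitudes are made up, and a direct DFT stands in for the FFT analysis. `sum_photocurrent` and `amplitude_at_bin` are hypothetical names.

```python
import cmath
import math


def sum_photocurrent(amplitude_by_bin, n_samples):
    """Sum signal of several spots, each modulated at its own frequency bin."""
    return [
        sum(a * math.sin(2.0 * math.pi * k * t / n_samples)
            for k, a in amplitude_by_bin.items())
        for t in range(n_samples)
    ]


def amplitude_at_bin(signal, k):
    """Amplitude of the sinusoid at bin k, from a single DFT coefficient."""
    n = len(signal)
    coeff = sum(s * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, s in enumerate(signal))
    return 2.0 * abs(coeff) / n  # valid for 0 < k < n/2


if __name__ == "__main__":
    spots = {3: 1.0, 7: 0.25, 12: 0.6}  # bin -> illustrative photocurrent amplitude
    total = sum_photocurrent(spots, 64)
    for k in spots:
        print(k, round(amplitude_at_bin(total, k), 6))
```

Because sinusoids at distinct integer bins are orthogonal over the sampling window, each spot's amplitude is recovered independently of the others from the single summed signal.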

16.

A novel concurrent error detection scheme for FFT networks

Tao D.L. Hartmann C.R.P. 《Parallel and Distributed Systems, IEEE Transactions on》1993,4(2):198-221

Algorithm-based fault tolerance techniques have been proposed to obtain reliable results at very low hardware overhead. Even though 100% fault coverage can be theoretically obtained by using these techniques, the system performance, i.e., fault coverage and throughput, can be drastically reduced due to many practical problems, e.g., round-off errors. A novel algorithm-based fault tolerance scheme is proposed for fast Fourier transform (FFT) networks. It is shown that the proposed scheme achieves 100% fault coverage theoretically. An accurate measure of the fault coverage for FFT networks is provided by taking the round-off error into account. The proposed scheme is shown to provide concurrent error detection capability to FFT networks with low hardware overhead, high throughput, and high fault coverage.

17.

A new systolic multiprocessor architecture for real-time soft tomography algorithms

《Parallel Computing》2016

In this paper, a new systolic multiprocessor architecture for soft tomography algorithms is presented that exploits the intrinsic parallelism and hardware resources available in recent Field Programmable Gate Array architectures. Soft tomography algorithms such as Electrical Capacitance Tomography (ECT), Magnetic Inductance Tomography (MIT), and Electrical Impedance Tomography (EIT) use different sensors and data acquisition modules, but they share common computation requirements consisting of intensive matrix multiplications and fast, frequent memory accesses. Using the variable bit-width, fixed-point multiplier arrays available in the DSP blocks, which cooperatively perform the partial matrix product with associated Arithmetic and Logic Units (ALU), and the distributed memory available in the Stratix V FPGA, a dedicated scalable architecture is suggested to host the Landweber algorithm. The experimental results indicate that 16,949 frames of 32 × 32 pixels can be reconstructed in one second if each matrix element is represented with 18 bits and a clock frequency of 400 MHz is used. This is more than enough for most process imaging applications. In addition, the accuracy of the image reconstruction using 18 bits/operand is found to be acceptable since it exceeds 86%. Accuracy of up to 99% can be achieved if 36 bits/operand are used, which leads to an image reconstruction throughput of 1272 frames/s (for image size 32 × 32).
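
The Landweber algorithm named in this entry is a simple fixed-step iteration, x ← x + α·Aᵀ(b − A·x), dominated by matrix-vector products. The sketch below is a plain floating-point version on a tiny made-up system; it says nothing about the paper's systolic fixed-point design, and the function name `landweber`, the matrix, and the step size are illustrative assumptions.

```python
def landweber(a, b, step, iterations):
    """Landweber iteration x <- x + step * A^T (b - A x), starting from x = 0.

    Converges when 0 < step < 2 / sigma_max^2, where sigma_max is the
    largest singular value of A.
    """
    rows, cols = len(a), len(a[0])
    x = [0.0] * cols
    for _ in range(iterations):
        # Residual of the forward model: r = b - A x.
        residual = [b[i] - sum(a[i][j] * x[j] for j in range(cols))
                    for i in range(rows)]
        # Back-projection of the residual: x += step * A^T r.
        x = [x[j] + step * sum(a[i][j] * residual[i] for i in range(rows))
             for j in range(cols)]
    return x


if __name__ == "__main__":
    a = [[2.0, 0.0], [0.0, 1.0]]  # illustrative sensitivity matrix
    b = [2.0, 3.0]                # illustrative measurements
    print(landweber(a, b, step=0.4, iterations=100))  # approaches [1.0, 3.0]
```

Each iteration costs two matrix-vector products, which matches the abstract's point that the workload is intensive matrix multiplication with frequent memory access.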

18.

Design and implementation of a high-speed embedded processor based on storage technology

张钦韩承德《计算机学报》2007,30(5):831-837

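SoPC (System on a Programmable Chip) is widely used in embedded systems and is usually implemented with an FPGA (Field Programmable Gate Array). One class of embedded processors, such as wavelet transform processors, compression and decompression processors, and FFT processors, can be designed with a memory-based approach. Because FPGA on-chip memory resources are relatively scarce, how to use them effectively to build high-speed embedded processors is a problem that needs study. Taking an FFT processor as an example, the paper demonstrates the effectiveness of this approach: by adopting one address-mapping scheduling strategy and two conflict-free operand address mapping schemes, it reduces the FPGA on-chip memory used and increases processing speed. The FFT processor played a key role in a real system.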

19.

FPGA-implementation of atan(Y/X) based on logarithmic transformation and LUT-based techniques

R. Gutierrez V. Torres J. Valls 《Journal of Systems Architecture》2010,56(11):588-596

This paper presents an architecture for the computation of the atan(Y/X) operation suitable for broadband communications systems where a throughput between 20 and 40 MHz is required. The proposed architecture implements the division of the two inputs by means of a logarithmic transformation, in which the division can be performed with a subtraction. A combination of non-uniform segmentation and the multipartite LUT technique is proposed for approximating the arctangent of the logarithm. The architecture was implemented in a Xilinx FPGA device, achieving higher throughput than the approach based on the CORDIC algorithm and lower area than previous LUT-based approaches.
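
The logarithmic transformation rests on the identity atan(Y/X) = atan(2^(log2 Y − log2 X)) for positive X and Y, so the divider becomes a subtractor. The sketch below checks that identity numerically; library calls stand in for the paper's segmented and multipartite look-up tables, and the function name `atan_via_log` and the first-quadrant restriction are assumptions of this example.

```python
import math


def atan_via_log(y, x):
    """atan(y/x) for x > 0 and y > 0, with the division done as a log-domain subtraction."""
    if x <= 0 or y <= 0:
        raise ValueError("this sketch only covers the first quadrant (x > 0, y > 0)")
    # log2(y / x) = log2(y) - log2(x): the subtraction replaces the divider.
    log_ratio = math.log2(y) - math.log2(x)
    # In hardware, atan(2^d) would be read from a table indexed by d.
    return math.atan(2.0 ** log_ratio)


if __name__ == "__main__":
    print(atan_via_log(3.0, 4.0), math.atan2(3.0, 4.0))
```

Other quadrants can be handled by working on |X| and |Y| and restoring the signs afterwards, which this sketch leaves out.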

20.

An FPGA-based queue management system for high speed networking devices

《Microprocessors and Microsystems》2004,28(5-6):223-236

One of the main bottlenecks when designing a network system is very often its memory subsystem. This is mainly due to the extremely high speed of state-of-the-art network links and to the fact that, in order to support advanced Quality of Service (QoS), a large number of independent queues is desirable. In this paper we describe the architecture and performance of a memory manager, the QMS, that is tailored to FPGA technology and can provide up to 6.2 Gbps of aggregate throughput while handling 32 K independent queues. The presented system supports a complete instruction set and thus we believe it can be used as a hardware component in any suitable networking system. It also supports a large number of different interfaces and is designed in a very scalable way. The QMS uses a double data rate DRAM for data storage and a typical SRAM for keeping data structures (pointers), therefore minimizing the system's cost. In order to deal with the problems of refreshing and bank conflicts in the DRAM, several optimization techniques have been employed. In this paper we also present the architecture of a network-processing device that fully utilizes the advanced features of the QMS. The QMS occupies 8500 slices in a Xilinx FPGA and works at 125 MHz.