1.

Implementation of LTE system on an SDR platform using CUDA and UHD

Saehee Bang Chiyoung Ahn Yong Jin Seungwon Choi John Glossner Sungsoo Ahn 《Analog Integrated Circuits and Signal Processing》2014,78(3):599-610

In this paper, we present an implementation of a long term evolution (LTE) system on a software defined radio (SDR) platform using a conventional personal computer that adopts a graphics processing unit (GPU) and a Universal Software Radio Peripheral 2 (USRP2) with a USRP hardware driver (UHD) to implement an SDR software modem and a radio frequency transceiver, respectively. The central processing unit executes C++ control code that can access the USRP2 via the UHD. We have adopted the Ettus Research UHD due to its high degree of flexibility in the design of the transceiver chain. By taking advantage of this benefit, a simple cognitive radio engine has been implemented using libraries provided by the UHD. We have implemented the software modem on a GPU that is suitable for parallel computing due to its powerful arithmetic and logic units. A parallel programming method is proposed that exploits the single instruction multiple data architecture of the GPU. We focus on the implementation of the Turbo decoder due to its high computational requirements and the difficulty of parallelizing the algorithm. The implemented system is analyzed primarily in terms of computation time using the compute unified device architecture (CUDA) profiler. From our experimental tests using the implemented system, we have measured the total processing time for a single frame of both transmit and receive LTE data. We find that it takes 5.00 and 8.58 ms for transmit and receive, respectively. This confirms that the implemented system is capable of real-time processing of all the baseband signal processing algorithms required for LTE systems.

2.

Implementation of an SDR platform using GPU and its application to a 2 × 2 MIMO WiMAX system

Chiyoung Ahn June Kim Jaehyuk Ju Jinho Choi Byungcho Choi Seungwon Choi 《Analog Integrated Circuits and Signal Processing》2011,69(2-3):107-117

Conventional communication systems have been implemented using digital signal processors (DSPs) and/or field programmable gate arrays (FPGAs), especially for software defined radio (SDR) functionality. We propose a scheme that uses a graphics processing unit (GPU) in place of the conventional DSPs or FPGAs for the implementation of an SDR-based communication system. The GPU, a high-speed parallel processor with multiple arithmetic logic units, is adopted for the signal processing of the physical layer required for the parallel processing in an SDR system. The compute unified device architecture (CUDA) based on the C language provides a software development kit (SDK) for the modem application of the GPU. Therefore we utilize the CUDA SDK to implement the real-time modem function. This paper presents an implementation of a 2 × 2 multiple-input multiple-output (MIMO) WiMAX system employing a GPU as the real-time modem. By installing a radio frequency module on top of the GPU modem, we implement a real-time transmission system for video data. The performance of the proposed GPU-based system is demonstrated by comparing its operation time against that of the conventional DSP-based system.

3.

Implementation of parallel lattice reduction-aided MIMO detector using graphics processing unit

Hyunwook Yang Taehyun Kim Chiyoung Ahn June Kim Seungwon Choi John Glossner 《Analog Integrated Circuits and Signal Processing》2012,73(2):559-567

Since H. Yao proposed the lattice reduction (LR)-aided detection algorithm for the MIMO detector, one can exploit the diversity gain provided by the LR method to achieve performance comparable to the maximum likelihood (ML) algorithm but with complexity close to that of simple linear detection algorithms such as zero forcing (ZF), minimum mean squared error, and successive interference cancellation. In this paper, in order to reduce the processing time of the LR-aided detector, a graphics processing unit (GPU) is proposed as the main modem processor so that the detections can be performed in parallel using multiple GPU threads. A 2 × 2 multiple input multiple output (MIMO) WiMAX system has been implemented using a GPU to verify that various MIMO detection algorithms such as ZF, ML, and LR-aided methods can be processed in real time. From the experimental results, we show that GPUs can realize a 2 × 2 WiMAX MIMO system adopting an LR-aided detector in real time. We achieve a processing time of 2.75 ms, which meets the downlink duration specification of 3 ms. The BER performance in experimental tests also indicates that the LR-aided MIMO detector can exploit diversity gain as fully as the ML detector.
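As a rough illustration of the simplest detector named in the abstract above, the following Python sketch applies zero forcing (ZF) to a 2 × 2 channel. It is not the paper's LR-aided detector or its GPU code: the function names, the QPSK slicer, and the assumption that the channel matrix `H` is already known are all hypothetical choices made for this example.

```python
def zf_detect_2x2(H, y):
    """Zero-forcing estimate x_hat = H^{-1} y for a 2x2 complex channel."""
    (a, b), (c, d) = H
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("channel matrix is (near-)singular")
    # Closed-form inverse of a 2x2 matrix applied to the received vector.
    x0 = (d * y[0] - b * y[1]) / det
    x1 = (-c * y[0] + a * y[1]) / det
    return [x0, x1]


def slice_qpsk(s):
    """Map a soft estimate to the nearest QPSK point (+-1 +-1j); QPSK is assumed."""
    return complex(1 if s.real >= 0 else -1, 1 if s.imag >= 0 else -1)


def detect_block(H, received):
    """Detect a list of received vectors, one at a time.

    Each vector is processed independently of the others, which is the
    property that lets a GPU assign one thread per received vector.
    """
    return [[slice_qpsk(s) for s in zf_detect_2x2(H, y)] for y in received]
```

Because no iteration of the loop in `detect_block` depends on another, the same per-vector routine could in principle be launched across many threads, which is the kind of parallelism the abstract describes.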

4.

Integration of Dataflow-Based Heterogeneous Multiprocessor Scheduling Techniques in GNU Radio

George F. Zaki William Plishker Shuvra S. Bhattacharyya Charles Clancy John Kuykendall 《Journal of Signal Processing Systems》2013,70(2):177-191

5.

Multi-radio coexistence and collaboration on an SDR platform

Tommi Zetterman Antti Piipponen Kalle Raiskila Sverre Slotte 《Analog Integrated Circuits and Signal Processing》2011,69(2-3):329-339

In order to support the simultaneous use of both legacy and new radios in a multi-radio handset, a Software Defined Radio (SDR) platform needs to offer coexistence mechanisms and services for radios. This paper proposes an SDR control framework to provide the coexistence services and common interfaces for them. The multi-radio control in the proposed platform is divided into two parts: light-weight dynamic scheduling with tight real-time constraints, which solves the temporal interoperability issues between radios, and semi-dynamic admission control, which performs resource allocation when a radio changes its behavioral pattern. The control framework was implemented on an SDR technology demonstrator to show how multiple simultaneously active radios are controlled, and how the coexistence mechanism can be used to provide tangible benefits to the SDR modem user, such as the ability to utilize fine-grained spectral holes.

6.

A software-defined communications baseband design (total citations: 1; self-citations: 0; citations by others: 1)

Glossner J. Iancu D. Jin Lu Hokenek E. Moudgill M. 《Communications Magazine, IEEE》2003,41(1):120-128

Software-defined radios offer a programmable and dynamically reconfigurable method of reusing hardware to implement the physical layer processing of multiple communications systems. An SDR can dynamically change protocols and update communications systems over the air as a service provider allows. In this article we discuss a baseband solution for an SDR system and describe a 2 Mb/s WCDMA design with GSM/GPRS and 802.11b capability that executes all physical layer processing completely in software. We describe the WCDMA communications protocols with a focus on latency reduction and unique implementation techniques. We also describe the underlying technology that enables software execution. Our solution is programmed in C and executed on a multithreaded processor in real time.

7.

A Reconfigurable ASIP for Convolutional and Turbo Decoding in an SDR Environment

《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2008,16(10):1309-1320

Future mobile and wireless communication networks require flexible modem architectures to support seamless services between different network standards. Hence, a common hardware platform that can support multiple protocols implemented or controlled by software, generally referred to as software defined radio (SDR), is essential. This paper presents a family of dynamically reconfigurable application-specific instruction-set processors (ASIPs) for channel coding in wireless communication systems. As a weakly programmable intellectual property (IP) core, it can implement trellis-based channel decoding in an SDR environment. It features binary convolutional decoding, and turbo decoding for binary as well as duobinary turbo codes for all current and upcoming standards. The ASIP consists of a specialized pipeline with 15 stages and a dedicated communication and memory infrastructure. Logic synthesis revealed a maximum clock frequency of 400 MHz and an area of 0.11 mm² for the processor's logic using a low power 65-nm technology. Memories require another 0.31 mm². Simulation results for Viterbi and turbo decoding demonstrate maximum throughput of 196 and 34 Mb/s, respectively. The ASIP hence outperforms state-of-the-art decoder architectures targeting software defined radio by at least a factor of three while consuming only 60% or less of the logic area.

8.

Design and implementation of a real-time imaging system based on a hybrid CPU+GPU architecture

张彦彬 丁晟 高雁 李松键 朱金中 孙友礼 张陨石 《太赫兹科学与电子信息学报》2019,17(1):146-151

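Radar imaging requires wider bandwidth to achieve higher range resolution and more pulse accumulation to achieve higher azimuth resolution, so the computational load of radar imaging is enormous. Achieving real-time imaging for future ultra-wideband radar is a formidable challenge. With its outstanding floating-point performance and memory bandwidth, the graphics processing unit (GPU) is a strong candidate platform for parallel acceleration. This paper designs a real-time imaging system for synthetic aperture radar/inverse synthetic aperture radar (SAR/ISAR) based on a CPU+GPU platform and implements it in hardware. Experiments show that the system achieves real-time SAR/ISAR imaging. The system can also be used in electronic countermeasures, where it can play an important role in research on jamming methods and their effects.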

9.

Handoff with DSP Support: Enabling Seamless Voice Communications across Heterogeneous Telephony Systems on Dual-Mode Mobile Devices

Hsieh Hung-Yun Li Chung-Wei Lin Hsiao-Pu 《Mobile Computing, IEEE Transactions on》2009,8(1):93-108

In this paper we investigate the problem of voice communications across heterogeneous telephony systems on dual-mode (WiFi and GSM) mobile devices. Since GSM is a circuit-switched telephony system, existing solutions that are based on packet-switched network protocols cannot be used. We show in this paper that an enabling technology for seamless voice communications across circuit-switched and packet-switched telephony systems is the support of digital signal processing (DSP) techniques during handoffs. To substantiate our argument, we start with a framework based on the Session Initiation Protocol (SIP) for vertical handoffs on dual-mode mobile devices. We then identify the key obstacle in achieving seamless handoffs across circuit-switched and packet-switched systems, and explain why DSP support is necessary in this context. We propose a solution that incorporates time alignment and time scaling algorithms during handoffs for supporting seamless voice communications across heterogeneous telephony systems. We conduct testbed experiments using a GSM-WiFi dual-mode notebook and evaluate the quality of speech when the call is migrated from WiFi to GSM networks. Evaluation results show that such a cross-disciplinary solution involving signal processing and networking can effectively support seamless voice communications across heterogeneous telephony systems.

10.

Design of a digital FM demodulator based on a 2nd-order all-digital phase-locked loop

Juan Pablo Martinez Brito Sergio Bampi 《Analog Integrated Circuits and Signal Processing》2008,57(1-2):97-105

Software-defined radio (SDR) is a revolution in radio design due to the ability to create radios that can self-adapt on the fly. In SDR devices, all of the signal processing is implemented in the digital domain, mainly on DSP blocks or by DSP software. By simply downloading a new program, an SDR device is able to interoperate with different wireless protocols, incorporate new services, and upgrade to new standards. Therefore, massively parallel signal processing at higher frequencies is needed to implement a realistic SDR. Thus, FPGAs have been used extensively for implementing essential functions in SDR architectures at lower frequencies. In this paper, we explore the design of a digital FM receiver using the approach of an All-Digital Phase-Locked Loop (ADPLL). The circuit is designed in VHDL, then synthesized and simulated using LeonardoSpectrum Level 3 and ModelSim SE 6, respectively. It operates at a frequency of up to 150 MHz and occupies an area of roughly 15 K logic gates.

11.

GSWO: A programming model for GPU-enabled parallelization of sliding window operations in image processing

《Signal Processing: Image Communication》2016

Sliding Window Operations (SWOs) are widely used in image processing applications. They often have to be performed repeatedly across the target image, which can demand significant computing resources when processing large images with large windows. In applications in which real-time performance is essential, running these filters on a CPU often fails to deliver results within an acceptable timeframe. The emergence of sophisticated graphics processing units (GPUs) presents an opportunity to address this challenge. However, GPU programming has a steep learning curve and is error-prone for novices, so a tool that can produce a GPU implementation automatically from the original CPU source code offers an attractive means of harnessing GPU power effectively. This paper presents a GPU-enabled programming model, called GSWO, which can assist GPU novices by converting their SWO-based image processing applications from the original C/C++ source code to CUDA code in a highly automated manner. This model includes a new set of simple SWO pragmas to generate GPU kernels and to support effective GPU memory management. We have implemented this programming model based on a CPU-to-GPU translator (C2GPU). Evaluations have been performed on a number of typical SWO image filters and applications. The experimental results show that the GSWO model is capable of efficiently accelerating these applications, with improved applicability and a speed-up of performance compared to several leading CPU-to-GPU source-to-source translators.
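For readers unfamiliar with the term, a sliding window operation can be sketched in a few lines of Python. This is only a CPU-side illustration of the general pattern, not the GSWO pragmas or the C2GPU translator; the function names and the clamp-to-edge border policy are assumptions made for the example.

```python
def sliding_window(image, radius, op):
    """Apply op to the (2*radius+1)^2 neighbourhood of every pixel.

    Borders are handled by clamping coordinates to the image edge
    (an assumed policy; other border treatments are equally valid).
    """
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Gather the window around (y, x), clamping at the borders.
            window = [
                image[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in range(-radius, radius + 1)
                for dx in range(-radius, radius + 1)
            ]
            # Each output pixel depends only on the input image, never on
            # other output pixels, so every (y, x) can be computed in parallel.
            out[y][x] = op(window)
    return out


def mean(window):
    """Box-filter operator: arithmetic mean of the window."""
    return sum(window) / len(window)
```

Passing `mean` gives a box blur and passing the built-in `max` gives a dilation-style filter. The independence of the output pixels is what makes this family of filters a natural fit for one-thread-per-pixel GPU kernels.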

12.

Obstacle-avoiding rectilinear Steiner tree construction in sequential and parallel approach

Wing-Kai Chow Liang Li Evangeline F.Y. Young Chiu-Wing Sham 《Integration, the VLSI Journal》2014

The Rectilinear Steiner Minimum Tree (RSMT) problem is a fundamental one in VLSI physical design. In this paper, we present a maze-routing-based heuristic to solve the obstacle-avoiding RSMT (OARSMT) problem. Our approach can handle multi-pin nets with good quality and reasonable running time. We also present a parallel implementation of the heuristic with the aid of graphics processing units (GPUs). The parallel algorithm is implemented using CUDA and has been tested on an NVIDIA graphics card. Our experimental results show that our parallel algorithm has promising speedups over our sequential approach. This work demonstrates that a parallel algorithm can be applied to solve the OARSMT problem with the aid of a GPU.

13.

On-Site Volume Rendering with GPU-Enabled Devices

Muhammad Mobeen Movania Wei Ming Chiew Feng Lin 《Wireless Personal Communications》2014,76(4):795-812

Now that high-performance computing systems can rely more on a cloud-based infrastructure, it becomes much more important to have ubiquitous data processing and visualization capability. This will allow data sharing among numerous clients using shared data repositories through a secure web server. Thanks to the wide availability of GPU support in today’s mobile devices such as smartphones and tablets, as well as the recently published WebGL standard, pervasive computing for high-quality and real-time volume rendering may be realized on such high-performance platforms. We have invented two high-performance volume renderers, namely, a single-pass GPU ray caster and a fast 3D texture slicer, for both mobile and desktop platforms. Rigorous experiments and performance assessments reveal that the proposed mobile 3D image rendering system outperforms the existing approaches in the literature.

14.

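Implementation of a digital coded squelch system in a dual-mode walkie-talkie (total citations: 1; self-citations: 0; citations by others: 1)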

许科 黄磊 崔慧娟 唐昆 《信息技术》2011,(10):90-93

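Because digital walkie-talkies have far higher spectrum efficiency than analog ones, and because they offer a variety and flexibility of data processing that analog walkie-talkies cannot match, the digitization of analog walkie-talkies has become unstoppable. During this transition period, dual-mode walkie-talkies that are compatible with both digital and analog operation need to be developed to meet market demand and achieve a smooth transition. The digital baseband signal processing system of the dual-mode walkie-talkie has already been implemented on a digital signal processor (DSP); what was previously implemented with dedicated chips...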

15.

GPU Computing (total citations: 9; self-citations: 0; citations by others: 9)

Owens J.D. Houston M. Luebke D. Green S. Stone J.E. Phillips J.C. 《Proceedings of the IEEE. Institute of Electrical and Electronics Engineers》2008,96(5):879-899

The graphics processing unit (GPU) has become an integral part of today's mainstream computing systems. Over the past six years, there has been a marked increase in the performance and capabilities of GPUs. The modern GPU is not only a powerful graphics engine but also a highly parallel programmable processor featuring peak arithmetic and memory bandwidth that substantially outpaces its CPU counterpart. The GPU's rapid increase in both programmability and capability has spawned a research community that has successfully mapped a broad range of computationally demanding, complex problems to the GPU. This effort in general-purpose computing on the GPU, also known as GPU computing, has positioned the GPU as a compelling alternative to traditional microprocessors in high-performance computer systems of the future. We describe the background, hardware, and programming model for GPU computing, summarize the state of the art in tools and techniques, and present four GPU computing successes in game physics and computational biophysics that deliver order-of-magnitude performance gains over optimized CPU applications.

16.

A RAKE Receiver With an ICI/ISI Equalizer for a CCK Modem

Yusung Lee Hyuncheol Park 《Vehicular Technology, IEEE Transactions on》2009,58(1):198-206

In this paper, we first derive the theoretical performance of a complementary code keying (CCK) code on an additive white Gaussian noise (AWGN) channel and over a multipath channel. To derive the error performance, we use the weight and cross-correlation distributions of the CCK code for optimal and suboptimal decoding, respectively, based on the union bound. In addition, we propose a RAKE receiver for a CCK modem, which is suitable for a multipath environment with a large delay spread. The RAKE receiver principle is acceptable for modest multipath because it can coherently combine multipath components to provide signal-to-noise ratio (SNR) enhancement. However, as the delay spread grows and the system data rate rises, the intersymbol interference (ISI) caused by multipath increases. To handle the increasing ISI, the CCK modem needs an equalization technique to remove the ISI, together with RAKE processing. Thus, our proposed system is based on a channel matched filter (CMF) with a decision feedback equalizer (DFE). The CMF is applied for RAKE processing, whereas the DFE structure is used for ISI cancellation. In our system, ISI is calculated and removed by using a decoded CCK codeword.

17.

Development of wireless brain computer interface with embedded multitask scheduling and its application on real-time driver's drowsiness detection and warning (total citations: 1; self-citations: 0; citations by others: 1)

Lin CT Chen YC Huang TY Chiu TT Ko LW Liang SF Hsieh HY Hsu SH Duann JR 《IEEE transactions on bio-medical engineering》2008,55(5):1582-1591

Biomedical signal monitoring systems have advanced rapidly alongside electronic and information technologies in recent years. However, most existing physiological signal monitoring systems can only record the signals and lack the capability of automatic analysis. In this paper, we propose a novel brain-computer interface (BCI) system that can acquire and analyze electroencephalogram (EEG) signals in real time to monitor human physiological as well as cognitive states and, in turn, provide warning signals to the users when needed. The BCI system consists of a four-channel biosignal acquisition/amplification module, a wireless transmission module, a dual-core signal processing unit, and a host system for display and storage. The embedded dual-core processing system with multitask scheduling capability is proposed to acquire and process the input EEG signals in real time. In addition, the wireless transmission module, which eliminates the inconvenience of wiring, can be switched between radio frequency (RF) and Bluetooth according to the transmission distance. Finally, real-time EEG-based drowsiness monitoring and warning algorithms were implemented and integrated into the system to close the loop of the BCI system. Practical online testing demonstrates the feasibility of using the proposed system, with its real-time processing, automatic analysis, and online warning feedback, in real-world operation and living environments.

18.

A fair comparison of SAC-OCDMA system configurations based on two dimensional cyclic shift code and spectral direct detection

Alayedi Mohanad Cherifi Abdelhamid Ferhat Hamida Abdelhak Mrabet Hichem 《Telecommunication Systems》2022,79(2):193-212

This paper investigates shortcomings that limit the performance of optical code division multiple access (OCDMA) systems, including low cardinality and data rate as well as high power at reception. The main drawback of such systems, multiple access interference accompanied by phase induced intensity noise, is also investigated in order to propose a novel two dimensional cyclic shift (2D-CS) code to be implemented in non-coherent OCDMA systems. The developed code is based on a one dimensional cyclic shift (1D-CS) code previously provided by research works on spectral amplitude coding for optical code division multiple access (SAC-OCDMA) systems. Numerical results obtained by this study are compared to previous studies employing different codes, namely two dimensional extended double weight (2D-EDW), two dimensional flexible cross correlation/modified double weight (2D-FCC/MDW), two dimensional perfect difference (2D-PD), two dimensional diluted perfect difference (2D-DPD), two dimensional multi service (2D-MS) and two dimensional zero cross correlation/multi diagonal (2D-ZCC/MD) codes. It is demonstrated that the proposed 2D-CS code outperforms all of these codes in terms of system capacity, where the smallest percentage increase is about 40%, relative to 2D-ZCC/MD and 2D-MS. Systems using the 2D-CS code can support up to 203 simultaneous users with a total code length equal to 171. The system performance investigation yields a BER close to 1.0E-12 and 1.0E-27, and a Q-factor close to 6.6 dB and 10.6 dB, at a single mode fiber length of 20 km using a white light source and a laser, respectively. Furthermore, such a code can be easily adopted by OCDMA systems for long distances of up to approximately 55 and 100 km.

19.

GPU-accelerated three-dimensional surface shape measurement

赵亚龙 刘守起 张启灿 《红外与激光工程》2018,47(3):317003-0317003(7)

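With the growing demand for general-purpose computing and graphics display, the graphics processing unit (GPU) has been widely used in medicine, scientific computing, image processing, and other fields, but its application to three-dimensional measurement is only beginning. Based on Fourier transform profilometry (FTP) and the three-frequency heterodyne method, this paper designs two three-dimensional measurement systems and uses the Compute Unified Device Architecture (CUDA) to accelerate the three-dimensional reconstruction of static or dynamic objects. In the three-frequency heterodyne system, a high-speed digital projection module and a camera are triggered synchronously to capture 12 deformed fringe patterns of a small-field-of-view surface, and the image data are then processed. Experimental results show that, for phase unwrapping on 12 images of 1360 × 1024 pixels, the GPU method is 2089 times more efficient than the CPU method. In the FTP-based system, the camera needs to record only a single deformed fringe pattern, which is copied to GPU memory and processed by algorithms written in CUDA to reconstruct the three-dimensional surface shape of the object. For a single image of 1024 × 1280 pixels, the GPU-based FTP method reduces the computation time by a factor of 27 compared with the CPU method.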

20.

Real-time implementation of a reconfigurable IMT-2000 base station channel modem

Murotake D. Oafes L. Fuchs A. 《Communications Magazine, IEEE》2000,38(2):148-152

Research suggests that joint methods combining smart antennas, RAKE reception, multi-user detection or other adaptive methods may be practically implemented for IMT-2000 channel modems using computationally simplified algorithms. Using software-defined radio methods, these modems can be employed in a new generation of adaptive multimode base stations which permit software reconfiguration from second- to third-generation air interfaces. Practical implementation is made possible by corresponding advances in hardware technology, including new processors and high-bandwidth I/O fabrics which replace traditional computer buses with their inherent limitations in bandwidth and scalability. In this article adaptive processing research is reviewed, implementation requirements for second- and third-generation base stations are considered, and the capabilities of selected new monolithic silicon devices are examined. A possible implementation approach for a reconfigurable multimode base station channel modem using software defined radio (SDR) design methods is proposed.