1.

A new, cellular automaton-based, nearest neighbor pattern classifier and its VLSI implementation

Tzionas P.G. Tsalides P.G. Thanailakis A. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》1994,2(3):343-353

A new, parallel, nearest-neighbor (NN) pattern classifier, based on a 2D Cellular Automaton (CA) architecture, is presented in this paper. The proposed classifier is both time and space efficient, when compared with already existing NN classifiers, since it does not require complex distance calculations and ordering of distances, and storage requirements are kept minimal since each cell stores information only about its nearest neighborhood. The proposed classifier produces piece-wise linear discriminant curves between clusters of points of complex shape (nonlinearly separable) using the computational geometry concept known as the Voronoi diagram, which is established through CA evolution. These curves are established during an “off-line” operation and, thus, the subsequent classification of unknown patterns is achieved very fast. The VLSI design and implementation of a nearest neighborhood processor of the proposed 2D CA architecture is also presented in this paper.
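The idea of letting a cellular automaton grow class regions from labeled points can be sketched in a few lines. The following is a minimal illustration, not the paper's cell design: the function names, the von Neumann (4-cell) neighborhood, and the tie-breaking rule (smallest label wins) are all assumptions made for the example. With this neighborhood the regions approximate a Voronoi diagram under Manhattan distance.

```python
def evolve_labels(width, height, seeds):
    """Grow class regions on a grid by synchronous CA updates.

    seeds maps (x, y) -> class label. Every unlabeled cell copies a label
    from its already-labeled 4-neighbors; ties go to the smallest label
    (an assumed rule, chosen only to keep the sketch deterministic).
    """
    grid = [[None] * width for _ in range(height)]
    for (x, y), label in seeds.items():
        grid[y][x] = label

    changed = True
    while changed:
        changed = False
        nxt = [row[:] for row in grid]          # synchronous update buffer
        for y in range(height):
            for x in range(width):
                if grid[y][x] is not None:
                    continue                    # labeled cells never change
                found = []
                for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nx, ny = x + dx, y + dy
                    if 0 <= nx < width and 0 <= ny < height:
                        if grid[ny][nx] is not None:
                            found.append(grid[ny][nx])
                if found:
                    nxt[y][x] = min(found)
                    changed = True
        grid = nxt
    return grid


def classify(grid, point):
    """Classify an unknown pattern by a single table lookup."""
    x, y = point
    return grid[y][x]
```

Once the grid has been evolved off-line, classifying a new point costs one lookup and needs no distance calculation or sorting, which is the property the abstract emphasizes.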

2.

Highly-Parallel Decoding Architectures for Convolutional Turbo Codes

《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2006,14(10):1147-1151

This work studies highly parallel decoders for convolutional turbo codes and proposes two parallel decoding architectures and a design approach for parallel interleavers. To solve the memory conflict problem of extrinsic information in a parallel decoder, a block-like approach in which data is written row-by-row and read diagonal-wise is proposed for designing collision-free parallel interleavers. Furthermore, a warm-up-free parallel sliding window architecture is proposed for long turbo codes to maximize the decoding speeds of parallel decoders. The proposed architecture increases decoding speed by 6%-34% at a cost of a storage increase of 1% for an eight-parallel decoder. For short turbo codes (e.g., length of 512 bits), a warm-up-free parallel window architecture is proposed to double the speed at the cost of a hardware increase of 12%.
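The row-by-row write and diagonal read described above can be illustrated with a toy block interleaver plus a collision check. This is a sketch under stated assumptions, not the paper's construction: the wrapped-diagonal read order, the choice of one memory bank per row, and the rule that decoder p handles the p-th contiguous segment of the interleaved sequence are all hypothetical choices made for the example.

```python
def diagonal_read_order(rows, cols):
    """Addresses of a rows x cols block, written row by row, in diagonal order.

    Address a sits at row a // cols, column a % cols. The j-th wrapped
    diagonal visits column (r + j) % cols in row r.
    """
    order = []
    for j in range(cols):
        for r in range(rows):
            order.append(r * cols + (r + j) % cols)
    return order


def is_collision_free(order, rows, cols):
    """Check that `rows` decoders reading in lockstep never share a bank.

    Assumed setup: bank = row index (address // cols), and decoder p reads
    positions p*cols .. p*cols + cols - 1 of the interleaved sequence.
    """
    for t in range(cols):
        banks = {order[p * cols + t] // cols for p in range(rows)}
        if len(banks) < rows:
            return False
    return True
```

Under these assumptions a 4 x 5 block is collision-free while a 4 x 4 block is not, which shows why the block dimensions have to be chosen together with the read pattern rather than independently.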

3.

An advanced system for the automatic classification of multitemporal SAR images (cited by: 2; self-citations: 0; citations by others: 2)

Bruzzone L. Marconcini M. Wegmuller U. Wiesmann A. 《Geoscience and Remote Sensing, IEEE Transactions on》2004,42(6):1321-1334

A novel system for the classification of multitemporal synthetic aperture radar (SAR) images is presented. It has been developed by integrating an analysis of the multitemporal SAR signal physics with a pattern recognition approach. The system is made up of a feature-extraction module and a neural-network classifier, as well as a set of standard preprocessing procedures. The feature-extraction module derives a set of features from a series of multitemporal SAR images. These features are based on the concepts of long-term coherence and backscattering temporal variability and have been defined according to an analysis of the multitemporal SAR signal behavior in the presence of different land-cover classes. The neural-network classifier (which is based on a radial basis function neural architecture) properly exploits the multitemporal features for producing accurate land-cover maps. Thanks to the effectiveness of the extracted features, the number of measures that can be provided as input to the classifier is significantly smaller than the number of available multitemporal images. This reduces the complexity of the neural architecture (and consequently increases the generalization capabilities of the classifier) and relaxes the requirements relating to the number of training patterns to be used for classifier learning. Experimental results (obtained on a multitemporal series of European Remote Sensing 1 satellite SAR images) confirm the effectiveness of the proposed system, which exhibits both high classification accuracy and good stability versus parameter settings. These results also point out that properly integrating a pattern recognition procedure (based on machine learning) with an accurate feature extraction phase (based on the SAR sensor physics understanding) represents an effective approach to SAR data analysis.

4.

Classification of multisensor remote-sensing images by structured neural networks

Serpico S.B. Roli F. 《Geoscience and Remote Sensing, IEEE Transactions on》1995,33(3):562-578

Proposes the application of structured neural networks to classification of multisensor remote-sensing images. The purpose of the approach is to allow the interpretation of the “network behavior”, as it can be utilized by photointerpreters for the validation of the neural classifier. In addition, this approach gives a criterion for defining the network architecture, so avoiding the classical trial-and-error process. First of all, the architecture of structured multilayer feedforward networks is tailored to a multisensor classification problem. Then, such networks are trained to solve the problem by the error backpropagation algorithm. Finally, they are transformed into equivalent networks to obtain a simplified representation. The resulting equivalent networks may be interpreted as a hierarchical arrangement of “committees” that accomplish the classification task by checking on a set of explicit constraints on input data. Experimental results on a multisensor (optical and SAR) data set are described in terms of both classification accuracy and network interpretation. Comparisons with fully connected neural networks and with the k-nearest neighbor classifier are also made.

5.

Parallel Computation of Adaptive Filtering Algorithms on Multi-Core Systems

Dong-hwan Lee Jaewoo Ahn Wonyong Sung 《Journal of Signal Processing Systems》2012,69(3):253-265

The performance of recent CPUs has been rapidly increasing with the help of parallel architectural supports, such as SIMD (Single Instruction Multiple Data) extensions and multi-core architecture. However, efficient use of such parallel supports for adaptive filtering is difficult due to feedback loops that induce the data dependency problem. In this paper, efficient parallel computation of adaptive filters is studied for multi-core architecture with SIMD arithmetic support. Control- and data-level parallel computation methods are considered, where the former finds parallelism in the evaluation of one output sample, while the latter processes multiple output samples at a time to increase the degree of parallelism. The control-level parallel approach frequently utilizes the pipelining technique to uncover the parallelism, whereas the data-level approach employs a parallel computation method for linear recurrence equations to resolve the dependency. Not only adaptive transversal LMS (Least Mean Square) but also gradient adaptive lattice (GAL) and QR-decomposition based least-square lattice (QRD-LSL) filters are implemented on a PC that employs both SIMD and multi-core architecture.
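The feedback loop that makes adaptive filters hard to parallelize is easy to see in a plain serial LMS filter. The sketch below is a textbook transversal LMS written for illustration only; it is not the parallel method studied in the paper, and the function names, step size, and two-tap test system are assumptions.

```python
import random


def lms_filter(x, d, num_taps, mu):
    """Serial transversal LMS: returns final weights and the error sequence.

    The weights used for sample n depend on the error of sample n - 1.
    That loop-carried dependency is what blocks naive parallelization.
    """
    w = [0.0] * num_taps
    buf = [0.0] * num_taps                      # most recent input first
    errors = []
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))          # filter output
        e = dn - y                                          # estimation error
        w = [wi + mu * e * bi for wi, bi in zip(w, buf)]    # weight update
        errors.append(e)
    return w, errors


def demo():
    """Identify a hypothetical two-tap system h = [0.5, -0.3] without noise."""
    rng = random.Random(0)
    h = [0.5, -0.3]
    x = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
    d = [h[0] * x[n] + (h[1] * x[n - 1] if n > 0 else 0.0)
         for n in range(len(x))]
    return lms_filter(x, d, num_taps=2, mu=0.1)
```

The inner sum over taps is the part that maps naturally onto SIMD lanes, while the sample-to-sample weight update is the recurrence that a data-level scheme has to restructure.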

6.

High performance, high throughput turbo/SOVA decoder design

Zhongfeng Wang Parhi K.K. 《Communications, IEEE Transactions on》2003,51(4):570-579

Two efficient approaches are proposed to improve the performance of soft-output Viterbi algorithm (SOVA)-based turbo decoders. In the first approach, an easily obtainable variable and a simple mapping function are used to compute a target scaling factor to normalize the extrinsic information output from turbo decoders. An extra coding gain of 0.5 dB can be obtained with additive white Gaussian noise channels. This approach does not introduce extra latency and the hardware overhead is negligible. In the second approach, an adaptive upper bound based on the channel reliability is set for computing the metric difference between competing paths. By combining the two approaches, we show that the new SOVA-based turbo decoders can approach maximum a posteriori probability (MAP)-based turbo decoders within 0.1 dB when the target bit-error rate (BER) is moderately low (e.g., BER < 10^-4 for rate-1/2 codes). Following this, practical implementation issues are discussed and finite precision simulation results are provided. An area-efficient parallel decoding architecture is presented in this paper as an effective approach to design high-throughput turbo/SOVA decoders. With the efficient parallel architecture, several times the throughput of a conventional serial decoder can be obtained by increasing the overall hardware by a small percentage. To resolve the problem of multiple memory accesses per cycle for the efficient parallel architecture, a novel two-level hierarchical interleaver architecture is proposed. Simulation results show that the proposed interleaver architecture performs as well as random interleavers, while requiring much less storage of random patterns.

7.

Low Frequency Architecture for Multi-Lamp CCFL Systems With Capacitive Ignition

Doshi M. Zane R. Azcondo F.J. 《Display Technology, Journal of》2009,5(5):152-161

A low frequency architecture is proposed for driving parallel cold cathode fluorescent lamps (CCFLs) in large screen liquid crystal display (LCD) TV backlighting applications. Key to the architecture is a proposed capacitive coupling approach for aiding lamp ignition. A dc voltage is applied to the lamp electrodes while an ac voltage is applied to an external plate for capacitive coupling. The result is reliable, simultaneous ignition of parallel lamps with a required applied dc voltage near the lamp steady-state operating voltage. The complete system architecture includes a single high voltage converter, a pulse lamp ignition circuit, current control circuits and a single backlight controller. The topology is capable of driving a large number of parallel lamps with independent lamp current regulation, while avoiding ac coupling losses in steady-state operation and achieving significant reduction in reactive components when compared to typical high frequency ac ballast designs. Experimental results are presented for a system of four parallel 250 mm length lamps, demonstrating simultaneous parallel lamp ignition and dc current regulation.

8.

GANGLION-a fast field-programmable gate array implementation of a connectionist classifier

Cox C.E. Blanz W.E. 《Solid-State Circuits, IEEE Journal of》1992,27(3):288-299

The architecture, implementation, and application of GANGLION, a totally digital connectionist classifier, are described. This fully interconnected feedforward net with one hidden layer is capable of generating 4.48 billion interconnections/s. The architecture is realized on a single 9U VME card and is built entirely from off-the-shelf components. The very high throughput of 20 million decisions/s is achieved by making efficient use of field-programmable gate arrays. Specifically, the authors take advantage of the reprogrammability of the devices to automatically generate new custom hardware for each application of the classifier.

9.

Exploiting Thread‐Level Parallelism in Lockstep Execution by Partially Duplicating a Single Pipeline

Jaegeun Oh Seok Joong Hwang Huong Giang Nguyen Areum Kim Seon Wook Kim Chulwoo Kim Jong‐Kook Kim 《ETRI Journal》2008,30(4):576-586

In most parallel loops of embedded applications, every iteration executes the exact same sequence of instructions while manipulating different data. This fact motivates a new compiler-hardware orchestrated execution framework in which all parallel threads share one fetch unit and one decode unit but have their own execution, memory, and write-back units. This resource sharing enables parallel threads to execute in lockstep with minimal hardware extension and compiler support. Our proposed architecture, called multithreaded lockstep execution processor (MLEP), is a compromise between the single-instruction multiple-data (SIMD) and symmetric multithreading/chip multiprocessor (SMT/CMP) solutions. The proposed approach is more favorable than a typical SIMD execution in terms of degree of parallelism, range of applicability, and code generation, and can save more power and chip area than the SMT/CMP approach without significant performance degradation. For the architecture verification, we extend a commercial 32-bit embedded core AE32000C and synthesize it on Xilinx FPGA. Compared to the original architecture, our approach is 13.5% faster with a 2-way MLEP and 33.7% faster with a 4-way MLEP in EEMBC benchmarks which are automatically parallelized by the Intel compiler.

10.

Block-LDPC: a practical LDPC coding system design approach

Hao Zhong Tong Zhang 《IEEE transactions on circuits and systems. I, Regular papers》2005,52(4):766-775

This paper presents a joint low-density parity-check (LDPC) code-encoder-decoder design approach, called Block-LDPC, for practical LDPC coding system implementations. The key idea is to construct LDPC codes subject to certain hardware-oriented constraints that ensure effective encoder and decoder hardware implementations. We develop a set of hardware-oriented constraints, subject to which a semi-random approach is used to construct Block-LDPC codes with good error-correcting performance. Correspondingly, we develop an efficient encoding strategy, a pipelined partially parallel Block-LDPC encoder architecture, and a partially parallel Block-LDPC decoder architecture. We present estimates of key implementation metrics of the Block-LDPC coding system, including the throughput and hardware complexity of both encoder and decoder. The good error-correcting performance of Block-LDPC codes has been demonstrated through computer simulations. With the effective encoder/decoder design and good error-correcting performance, Block-LDPC provides a promising vehicle for real-life LDPC coding system implementations.

11.

Efficient pipelined flow classification for intelligent data processing in IoT

Seyed Navid Mousavi Fengping Chen Mahdi Abbasi Mohammad R. Khosravi Milad Rafiee 《Digital Communications & Networks》2022,8(4):561-575

Packet classification is a fundamental process in provisioning security and quality of service for many intelligent network-embedded systems running in the Internet of Things (IoT). In recent years, researchers have tried to develop hardware-based solutions for the classification of Internet packets. Due to higher throughput and shorter delays, these solutions are considered a major key to improving the quality of services. Most of these efforts have attempted to implement a software algorithm on the FPGA to reduce the processing time and enhance the throughput. The proposed architectures, however, cannot reach a compromise among power consumption, memory usage, and throughput rate. In view of this, the architecture proposed in this paper contains a pipeline-based micro-core that is used in network processors to classify packets. To this end, three architectures have been implemented using the proposed micro-core. The first architecture performs parallel classification based on header fields. The second one classifies packets in a serial manner. The last architecture is the pipeline-based classifier, which can increase performance by nine times. The proposed architectures have been implemented on an FPGA chip. The results are indicative of a reduction in memory usage as well as an increase in speedup and throughput. The architecture has a power consumption of 1.294 W, and its throughput at a frequency of 233 MHz exceeds 147 Gbps.

12.

An analog VLSI implementation of a feature extractor for real time optical character recognition

Bo G.M. Caviglia D.D. Valle M. 《Solid-State Circuits, IEEE Journal of》1998,33(4):556-564

The architecture, the design, and the analog very large scale integration (VLSI) implementation of a feature extractor chip for optical character recognition (OCR) systems are described. The chip extracts a set of 112 feature values coded by current signals from a 32×24 digital pixel matrix, representing the input character. Such features are applied to a classifier (for example, a neural classifier) performing the recognition task. The measurements performed on that chip confirm its functionality. The chip can be used with a segmented and nonsegmented string of characters. A throughput of about 140 kChar/s is achieved for the segmented case, while a throughput of about 450 kChar/s is achieved for the nonsegmented case. The OCR architecture has been functionally validated. A set of numerical handwritten characters has been processed by the chip, and the measured output features (after a normalization operation) have been used as input for a neural network classifier, implemented by a software simulator, which performs the recognition task. The resulting classification error rate (4.3%) has been successfully compared with those obtained by a high level model of this chip, and the results validate the entire architecture.

13.

Mixtures of boosted classifiers for frontal face detection

Julien Meynet Vlad Popovici Jean-Philippe Thiran 《Signal, Image and Video Processing》2007,1(1):29-38

14.

Design of a real-time face detection architecture for heterogeneous systems-on-chips

《Integration, the VLSI Journal》2020

Object detection represents one of the most important and challenging tasks in computer vision applications. Boosting-based approaches deal with computationally intensive operations and involve several sequential tasks that make it very difficult to develop hardware implementations with a high level of parallelism. This work presents a new hardware architecture able to perform object detection based on a cascade classifier in real-time and resource-constrained systems. As a case study, the proposed architecture has been tailored to accomplish the face detection task and integrated within a complete heterogeneous embedded system based on a Xilinx Zynq-7000 FPGA-based System-on-Chip. Experimental results show that, thanks to the proposed parallel processing scheme and the runtime adaptable strategy to slide sub-windows across the input image, the novel design achieves a frame rate up to 125 fps for the QVGA resolution, thus significantly outperforming previous works. Such a performance is obtained by using less than 10% of on-chip available logic resources with a power consumption of 377 mW at the 100 MHz clock frequency.

15.

Classification and recognition of communication signals based on artificial neural networks

冯涛《无线电工程》2006,36(6):24-26

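The classification and recognition of communication signals is a typical statistical pattern recognition problem. The principles and methods of feature selection, feature extraction, and classification for communication signals are discussed systematically. An artificial neural network classifier is designed, covering the choice of neural network model, the input and output representation of the classifier, the network topology, and the training algorithm, and a neural network classifier with a hierarchical structure is proposed.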

16.

Design and analysis of very high-speed network architectures

Chlamtac I. Ganz A. 《Communications, IEEE Transactions on》1988,36(3):252-262

Communication architectures for very-high-speed networks are dealt with. The use of high communication speed increases the ratio between the end-to-end propagation delay and the packet transmission time. This increase restricts the utilization of the high system bandwidth in broadcast channel-based systems, causing a rapid performance deterioration. A communication system architecture characterized by the use of several parallel channels and design of the nodes' channel interface is presented. The channel-division approach is introduced, showing that for a given system bandwidth the total system capacity will be increased by bandwidth division and parallel communication. An analytic model of this system is developed, from which the proposed system's performance is obtained and performance bounds determined for multichannel slotted finite systems. The results show that the architecture has a potential to improve significantly the system performance compared to conventional single-channel-based systems. Furthermore, for a given network configuration an optimal architecture can be found which simultaneously maximizes the system throughput and minimizes the average packet delay.

17.

A parallel ensemble classification algorithm based on MapReduce

琚春华 邹江波 张芮 魏建良 《电信科学》2012,28(7):40-47

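Because computer memory resources are limited, the effectiveness of classifier combinations and the selection of an optimal combination are major research topics in machine learning. Classical ensemble classification algorithms achieve high classification accuracy on small data sets, but on large volumes of data the multiple base classifiers share the resources of a single computer for both learning and classification, which leads to low computational efficiency and is clearly unsuitable for today's massive data. To address the shortcoming that existing ensemble classification algorithms are suitable only for small-scale data sets, the characteristics of ensemble classifiers are analyzed, and a parallel ensemble classification algorithm (EMapReduce) is designed using an aggregation-based ensemble classifier and the MapReduce technique of cloud computing, so that large-scale data can be processed in parallel. Simulation experiments on an Amazon computing cluster show that the algorithm is reasonably efficient and feasible.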
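The general pattern of training base classifiers per data partition and then aggregating their outputs can be sketched without any cluster framework. The code below is a single-process illustration, not the EMapReduce algorithm itself: the nearest-centroid base classifier, the one-dimensional features, the majority vote, and all function names are assumptions chosen to keep the example small.

```python
from collections import Counter, defaultdict


def train_centroid(partition):
    """Map step: fit a nearest-centroid model on one partition.

    partition is a list of (feature, label) pairs with scalar features.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for x, label in partition:
        sums[label][0] += x
        sums[label][1] += 1
    return {label: total / count for label, (total, count) in sums.items()}


def predict_one(model, x):
    """Predict with a single base classifier: closest class centroid wins."""
    return min(model, key=lambda label: abs(x - model[label]))


def ensemble_predict(models, x):
    """Reduce step: aggregate the base classifiers by majority vote."""
    votes = Counter(predict_one(model, x) for model in models)
    return votes.most_common(1)[0][0]


def train_ensemble(partitions):
    """Train one base classifier per partition.

    On a real cluster each call to train_centroid would run on a different
    worker; here the built-in map() stands in for that parallel map phase.
    """
    return list(map(train_centroid, partitions))
```

Because each base classifier sees only its own partition, no single machine has to hold the whole data set, which is the motivation given in the abstract.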

18.

Sequential encoding of Reed-Solomon codes using discrete-time delay lines

Tong P. Ruetz P. 《Communications, IEEE Transactions on》1994,42(1):2-5

Presents an architecture for the efficient encoding of Reed-Solomon codes, with or without interleaving. This architecture utilizes a clock whose rate is r times the symbol rate, where r is the redundancy of the code. The finite field operations are performed in a sequential manner, requiring only one finite field multiplier and one finite field adder. All memory elements (except one symbol register) are consolidated into a discrete-time delay line, which can be easily implemented with a random access memory. This approach alleviates the clock skew problem and leads to significant hardware savings over the usual parallel approach, when the redundancy and/or interleaving depth are large. The architecture can be easily reconfigured for changes in the generator polynomial of the code, the amount of redundancy, and the interleaving depth.

19.

Systolic array design for a two-dimensional convolution algorithm with large low-rank templates

杨绿溪 王保云 何振亚 《电子与信息学报》1997,19(1):6-10

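Based on the characteristics of two-dimensional convolution with large low-rank templates, this paper presents a fast algorithm for it and designs a systolic array implementation using a three-step mapping method based on the dependence graph. The structure has high parallel efficiency and can achieve linear speedup.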
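Why a low-rank template allows a fast algorithm can be shown with the standard separable decomposition. The sketch below is a generic illustration and may differ from the paper's fast algorithm: it assumes the template is given as a sum of outer products of column vectors u and row vectors v, and all function names are hypothetical. A rank-r template of size M x N then costs about r(M + N) multiplications per output instead of MN.

```python
def conv2d_valid(img, k):
    """Direct 2D convolution (kernel flipped), 'valid' region only."""
    H, W, M, N = len(img), len(img[0]), len(k), len(k[0])
    return [[sum(k[a][b] * img[i + M - 1 - a][j + N - 1 - b]
                 for a in range(M) for b in range(N))
             for j in range(W - N + 1)]
            for i in range(H - M + 1)]


def conv_rows(img, v):
    """1D convolution of every row with v."""
    N = len(v)
    return [[sum(v[b] * row[j + N - 1 - b] for b in range(N))
             for j in range(len(row) - N + 1)]
            for row in img]


def conv_cols(img, u):
    """1D convolution of every column with u."""
    M = len(u)
    return [[sum(u[a] * img[i + M - 1 - a][j] for a in range(M))
             for j in range(len(img[0]))]
            for i in range(len(img) - M + 1)]


def conv2d_low_rank(img, terms):
    """Convolve with a kernel given as sum of outer(u, v) over `terms`."""
    out = None
    for u, v in terms:
        part = conv_cols(conv_rows(img, v), u)   # one separable pass per term
        if out is None:
            out = part
        else:
            out = [[p + q for p, q in zip(r1, r2)] for r1, r2 in zip(out, part)]
    return out


def outer_sum(terms):
    """Build the full M x N kernel from its rank-one terms."""
    M, N = len(terms[0][0]), len(terms[0][1])
    return [[sum(u[a] * v[b] for u, v in terms) for b in range(N)]
            for a in range(M)]
```

Each separable pass is a chain of one-dimensional convolutions, which is the kind of regular, locally connected computation that maps well onto a systolic array.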

20.

Design and implementation of an FPGA-based video decoder for digital high-definition television

周萍 俞斯乐 《电子与信息学报》1998,20(6):799-805

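This paper describes the design and implementation of a video decoder that decodes MPEG-2-based high-definition television (HDTV) coded streams in real time. The design uses a large number of FPGAs together with parallel processing and pipelining to achieve high-speed processing, and it studies solutions to special problems caused by parallel processing, such as out-of-bounds motion compensation. The paper explains the overall structure of the decoder, the composition of the main circuits, and the concrete implementation of the whole decoding process.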