17 results found (search time: 15 ms)
1.
With gate counts of ten million, field-programmable gate arrays (FPGAs) are becoming suitable for floating-point computations. Addition is the most complex operation in a floating-point unit and can cause major delay while requiring a significant area. Over the years, the VLSI community has developed many floating-point adder algorithms aimed primarily at reducing the overall latency. An efficient design of the floating-point adder offers major area and performance improvements for FPGAs. Given recent advances in FPGA architecture and area density, latency has become the main focus in attempts to improve performance. This paper studies the implementation of the standard, leading-one predictor (LOP), and far and close datapath (2-path) floating-point addition algorithms on FPGAs. Each algorithm has complex sub-operations that contribute significantly to the overall latency of the design. Different implementations of each sub-operation are examined and then synthesized onto a Xilinx Virtex-II Pro FPGA device. The standard and LOP algorithms are also pipelined into five stages and compared with the Xilinx IP. According to the results, the standard algorithm is the best implementation with respect to area, occupying 541 slices, but has a large overall latency of 27.059 ns. The LOP algorithm reduces latency by 6.5% at the cost of a 38% increase in area compared to the standard algorithm. The 2-path implementation shows a 19% reduction in latency at an added expense of 88% in area compared to the standard algorithm. The five-stage standard pipeline implementation shows a 6.4% improvement in clock speed compared to the Xilinx IP with a 23% smaller area requirement. The five-stage pipelined LOP implementation shows a 22% improvement in clock speed compared to the Xilinx IP at a cost of 15% more area.
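To make the relative figures in this abstract easier to compare, the short Python sketch below converts the quoted percentages into approximate absolute values, starting from the reported 27.059 ns and 541 slices of the standard algorithm. The constant and function names are illustrative only, and because the abstract rounds its percentages, the derived values are estimates and not results reported by the authors.

```python
# Figures quoted in the abstract (unpipelined designs on a Xilinx Virtex-II Pro).
STANDARD_LATENCY_NS = 27.059
STANDARD_AREA_SLICES = 541

# (latency reduction, area increase) relative to the standard algorithm,
# as rounded percentages taken from the abstract.
RELATIVE_CHANGES = {
    "LOP": (0.065, 0.38),
    "2-path": (0.19, 0.88),
}


def implied_figures(latency_reduction, area_increase):
    """Return (latency in ns, area in slices) implied by the percentages."""
    latency = STANDARD_LATENCY_NS * (1.0 - latency_reduction)
    area = STANDARD_AREA_SLICES * (1.0 + area_increase)
    return latency, area


if __name__ == "__main__":
    for name, (lat_red, area_inc) in RELATIVE_CHANGES.items():
        latency, area = implied_figures(lat_red, area_inc)
        print(f"{name}: about {latency:.1f} ns, about {area:.0f} slices")
```

Read this way, the 2-path design buys roughly 5 ns of latency at the price of nearly doubling the slice count.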
2.
The architecture of a field-programmable gate-array (FPGA) implementation of a low-density parity-check (LDPC) decoder for the Digital Video Broadcasting – Second Generation via Satellite (DVB-S2) standard is presented. Algorithms are devised to systematically apply the values given in DVB-S2 to implement a memory mapping scheme, which allows 360 functional units (FUs) to be used in decoding and supports both normal and short frames. A design of a parity-check module (PCM) is presented that verifies the parity-check equations of the LDPC codes. Furthermore, a special characteristic of five of the codes defined in DVB-S2 and its influence on the decoder design are discussed. Two versions of the LDPC decoder are synthesized for two families of FPGAs. The results show that the presented decoder uses fewer hardware resources than an FPGA-based DVB-S2 LDPC decoder found in the current literature, while improving the maximum frequency of the decoder.
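Verifying parity-check equations, as the PCM in this abstract does, amounts to checking that every row of the parity-check matrix H gives an even sum over the codeword bits. The Python sketch below shows that check on a toy (7,4) Hamming-style matrix; `H_TOY` and the function names are assumptions made for illustration and are unrelated to the DVB-S2 matrices or to the authors' hardware design.

```python
def syndrome(H, codeword):
    """Compute H * c^T over GF(2): one bit per parity-check equation."""
    return [sum(h * c for h, c in zip(row, codeword)) % 2 for row in H]


def satisfies_parity_checks(H, codeword):
    """A word is a valid codeword when every syndrome bit is zero."""
    return all(bit == 0 for bit in syndrome(H, codeword))


# Toy parity-check matrix (assumed for illustration, not a DVB-S2 matrix).
# Columns: d1 d2 d3 d4 p1 p2 p3
H_TOY = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]

if __name__ == "__main__":
    valid = [1, 0, 1, 1, 0, 1, 0]
    corrupted = [0, 0, 1, 1, 0, 1, 0]  # first bit flipped
    print(satisfies_parity_checks(H_TOY, valid))
    print(syndrome(H_TOY, corrupted))
```

A decoder typically uses this test as its stopping criterion: iteration ends as soon as all checks are satisfied.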
3.
Given the popularity of decimal arithmetic, hardware implementation of decimal operations has been a hot topic of research in recent decades. Besides the four basic operations, the square root can be implemented as an instruction directly in hardware, which improves the performance of the decimal floating-point unit in processors. Hardware implementation of decimal square rooters is usually done using either functional or digit-recurrence algorithms. Functional algorithms, which entail a multiplication per iteration, seem ill suited to decimal square roots, given the high cost of decimal multipliers. On the other hand, digit-recurrence square root algorithms, particularly SRT algorithms (named after their creators, Sweeney, Robertson, and Tocher), are simple and well suited to decimal arithmetic. With the intention of reducing the latency of the decimal square root operation while maintaining a reasonable cost, this paper proposes an SRT algorithm and the corresponding hardware architecture to compute the decimal square root. The proposed fixed-point square root design requires n+3 cycles to compute an n-digit root; the synthesis results show an area cost of about 31K NAND2 and a cycle time of 40 FO4. These results reveal a 14% speed advantage of the proposed decimal square root architecture over the fastest previous work (which uses a functional algorithm), with about a quarter of the area.
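The idea of a digit recurrence, producing one decimal root digit per step from a running remainder, can be illustrated with the classic schoolbook method. The Python sketch below is that plain non-redundant recurrence on integers, not the SRT scheme proposed in the paper (which uses a redundant digit set and digit selection from truncated estimates); the function name is hypothetical.

```python
def decimal_sqrt_digit_recurrence(n):
    """Return floor(sqrt(n)), computed one decimal digit per iteration."""
    if n < 0:
        raise ValueError("n must be non-negative")
    digits = str(n)
    if len(digits) % 2:
        digits = "0" + digits  # process the radicand two digits at a time

    root = 0       # root digits produced so far
    remainder = 0  # running partial remainder
    for i in range(0, len(digits), 2):
        remainder = remainder * 100 + int(digits[i:i + 2])
        # Select the largest digit d with (20*root + d) * d <= remainder.
        d = 9
        while (20 * root + d) * d > remainder:
            d -= 1
        remainder -= (20 * root + d) * d
        root = root * 10 + d
    return root


if __name__ == "__main__":
    print(decimal_sqrt_digit_recurrence(152399025))
```

Each loop iteration corresponds to one root digit, which is why the cycle count of a digit-recurrence design grows linearly with the number of digits, as in the n+3 cycles quoted above.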
4.
This paper proposes a novel method to improve the utilization efficiency and performance of field-programmable gate arrays (FPGAs). The proposed method, ExorBDD, uses a stage of exclusive-sum-of-products (ESOP) minimization, followed by a stage of decomposition using binary decision diagrams (BDDs). For exclusive-OR (XOR)-intensive circuits, experiments were conducted on 19 MCNC benchmark parity circuits (ranging from 5 to 25 inputs), as they are the most representative case of XOR-intensive circuits. The results using the proposed approach show significant improvements over Exorcism4, BDS, and commercial tools. On average, the new approach uses only 30.3% as many look-up tables as are used by Xilinx tools (and only 16.4% in comparison to Altera). On average, the new approach has a maximum combinational path delay of 89.2% of the delay obtained with Xilinx tools (80.3% compared to Altera). Experiments were also conducted on non-XOR-intensive circuits. These results show that ExorBDD also performs well for arbitrary circuits.
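Parity circuits are a natural stress test for XOR-aware synthesis because their sum-of-products form grows exponentially while their ESOP form stays linear. The Python sketch below counts both for small input sizes; it only illustrates that gap and does not reproduce ExorBDD or any of the tools named in the abstract, and the function names are hypothetical.

```python
from itertools import product


def parity(bits):
    """Odd parity of a tuple of 0/1 values."""
    return sum(bits) % 2


def sop_minterm_count(n):
    """Number of minterms in the sum-of-products form of n-input parity.

    No two parity minterms are adjacent, so none of them can be merged.
    """
    return sum(parity(bits) for bits in product((0, 1), repeat=n))


def esop_term_count(n):
    """Parity as an ESOP is x1 XOR x2 XOR ... XOR xn: one literal per term."""
    return n


if __name__ == "__main__":
    for n in (5, 8, 12):
        print(n, sop_minterm_count(n), esop_term_count(n))
```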
5.
In this paper, we propose an AND/XOR-based technology mapping method for the efficient realization of parity prediction functions in field-programmable gate arrays (FPGAs). Because the programmable blocks in an FPGA have a fixed size, decomposing a circuit into sub-circuits with an appropriate number of inputs can achieve excellent implementation efficiency. Specifically, the proposed technology mapping method is based on the Davio expansion theorem. The AND/XOR nature of the proposed method allows it to operate efficiently on XOR-intensive circuits, such as parity prediction functions. We conduct experiments using the parity prediction functions of MCNC benchmark circuits. With the proposed approach, the number of configurable logic blocks (CLBs) is reduced by 67.6% (compared to speed-optimized results) and 57.7% (compared to area-optimized results). The total equivalent gate count is reduced by 65.5%, the maximum combinational path delay by 56.7%, and the maximum net delay by 80.5% compared to conventional methods.
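The positive Davio expansion underlying this kind of AND/XOR decomposition states that f = f0 XOR (x AND (f0 XOR f1)), where f0 and f1 are the cofactors of f with x set to 0 and 1. The Python sketch below rebuilds a function from its cofactors in this way so the identity can be checked exhaustively on small functions; the helper names are assumptions, and the sketch does not represent the paper's mapping algorithm.

```python
from itertools import product


def positive_davio(f, var):
    """Rebuild f from its cofactors on input `var` via positive Davio."""
    def g(*bits):
        low = list(bits)
        low[var] = 0
        high = list(bits)
        high[var] = 1
        f0 = f(*low)              # cofactor with x = 0
        f2 = f0 ^ f(*high)        # Boolean difference f0 XOR f1
        return f0 ^ (bits[var] & f2)
    return g


def parity3(a, b, c):
    return a ^ b ^ c


def majority3(a, b, c):
    return (a & b) | (a & c) | (b & c)


if __name__ == "__main__":
    g = positive_davio(majority3, 0)
    print(all(g(*bits) == majority3(*bits)
              for bits in product((0, 1), repeat=3)))
```

Applying the expansion repeatedly splits a wide function into smaller sub-functions joined by AND and XOR gates, which is what lets a mapper fit the pieces into fixed-size logic blocks.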
6.
This paper presents a decimal logarithmic converter based on the decimal first-order polynomial (linear) approximation algorithm. The proposed approach is mainly based on a look-up table, followed by a decimal linear approximation step. Compared with a binary-based decimal linear approximation algorithm (Algorithm 1), the proposed algorithm (Algorithm 2) is error-free in the conversion between the decimal and the binary formats. The proposed architecture is implemented in combinational logic using binary-coded decimal (BCD) encoding on a Virtex5 XC5VLX110T FPGA. The comparison shows that Algorithm 2 runs 2.15 times faster than Algorithm 1, at the expense of 1.14 times more area.
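The general shape of a look-up table followed by a linear approximation step can be sketched in a few lines of Python. The version below approximates log10 on [1, 10) using an assumed 90 equal segments and a secant slope per segment; it runs on ordinary binary floats, so it illustrates only the table-plus-linear structure and not the BCD datapath or the error-free decimal conversion claimed for Algorithm 2. The segment count and all names are assumptions.

```python
import math

SEGMENTS = 90            # assumed table size: 90 equal segments over [1, 10)
STEP = 9.0 / SEGMENTS    # segment width (0.1)
# Table of log10 at each segment boundary (one extra entry for the last slope).
TABLE = [math.log10(1.0 + i * STEP) for i in range(SEGMENTS + 1)]


def log10_lut_linear(x):
    """Approximate log10(x) for 1 <= x < 10 by table look-up plus a line."""
    if not 1.0 <= x < 10.0:
        raise ValueError("x must lie in [1, 10)")
    idx = min(int((x - 1.0) / STEP), SEGMENTS - 1)   # table index
    x0 = 1.0 + idx * STEP                            # segment start
    slope = (TABLE[idx + 1] - TABLE[idx]) / STEP     # secant slope
    return TABLE[idx] + slope * (x - x0)


if __name__ == "__main__":
    for x in (1.05, 2.5, 9.99):
        print(x, log10_lut_linear(x), math.log10(x))
```

With this segmentation the approximation error is largest near x = 1, where the logarithm curves most; a hardware design would trade table size against that error.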
7.
In this paper, a generic asynchronous First In First Out (FIFO) based, WISHBONE-compatible, plug-and-play Network Interface (NI) for Network on Chip (NoC) is designed and verified. Four differently encoded asynchronous FIFOs, namely binary, Gray, one-hot, and Johnson, are designed and analyzed. The Gray-code asynchronous FIFO is found to be the best at handling the asynchronous clock domain issues in the NI. The control signals of the WISHBONE bus wrappers from/to the asynchronous FIFOs and the packing/unpacking modules are asserted concurrently at the same rising edge of the respective router and IP clocks to reduce latency. The same NI has been used to transfer data between synchronous as well as asynchronous clock domains, irrespective of clock frequency and phase differences. The proposed NI ensures seamless data transfer between the routers and IP cores with minimal latency, and achieves higher throughput, higher speed, and smaller area than the existing design.
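Gray coding is the usual choice for FIFO pointers that cross clock domains because consecutive values differ in exactly one bit, so a pointer sampled mid-transition is off by at most one position. The Python sketch below shows the standard binary/Gray conversions and checks that property for a 4-bit pointer; the pointer width and function names are assumptions for illustration, not details taken from the paper.

```python
def binary_to_gray(value):
    """Convert a non-negative integer to its reflected Gray code."""
    return value ^ (value >> 1)


def gray_to_binary(gray):
    """Invert binary_to_gray by XOR-ing all right shifts of the code."""
    value = 0
    while gray:
        value ^= gray
        gray >>= 1
    return value


def bits_changed(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    width = 4  # assumed pointer width
    depth = 1 << width
    for ptr in range(depth):
        nxt = (ptr + 1) % depth  # includes the wrap-around step
        print(f"{binary_to_gray(ptr):0{width}b}",
              bits_changed(binary_to_gray(ptr), binary_to_gray(nxt)))
```

A plain binary pointer, by contrast, can flip several bits at once (for example 0111 to 1000), which is what makes it unsafe to sample from another clock domain.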
8.
Applied Intelligence - The performance of vision systems can be affected when used in severe weather conditions such as heavy rain or snow. Rain streak removal is an ill-posed problem as they can...
9.
Multidimensional Systems and Signal Processing - Computed tomography (CT) is widely used to locate pulmonary nodules for preliminary diagnosis of lung cancer. However, due to high visual...
10.
Analog Integrated Circuits and Signal Processing - This paper presents a low-nonlinearity, four-channel Gated Ring Oscillator (GRO) based Time-to-Digital Converter (TDC) in Xilinx 28 nm...