Similar articles
20 similar articles found (search time: 31 ms)
1.
2.
In this paper, we present the design of a single-chip video and audio encoder/decoder for portable multimedia applications. The chip, called the video audio signal processor (VASP), consists of a video signal processing block and an audio signal processing block. It has a mixed hardware/software architecture that combines performance and flexibility. We designed the chip by partitioning it into video and audio blocks. The video signal processing block is a hardwired implementation of pixel input/output, full-pixel motion estimation, half-pixel motion estimation, discrete cosine transform, quantization, run-length coding, a host interface, and a 16-bit RISC-type internal controller. The audio signal processing block is implemented in software on a 16-bit fixed-point DSP. The chip contains 142,300 gates, 22 kbits of FIFO, 107 kbits of SRAM, and 556 kbits of ROM; it measures 9.02 mm × 9.06 mm and is fabricated in a 0.5-micron, 3-layer-metal CMOS technology.

3.
A novel multisampling time-domain architecture for CMOS imagers with synchronous readout and wide dynamic range is proposed. The proposed multisampling architecture requires only a single bit of memory per pixel instead of the 8 bits typical of time-domain active pixel architectures. The goal is to obtain a time-domain imager with high dynamic range that requires fewer transistors per pixel in order to achieve a higher fill factor. The maximum frame rate is analyzed as a function of the number of bits and the array size. The analysis shows that it is possible to achieve high frame rates and operate in video mode with 10-bit pixel data resolution. We also analyze the impact of comparator offset voltage on fixed pattern noise. The architecture was implemented in an imager prototype with a 32 × 32 pixel array fabricated in AMS CMOS 0.35 μm and was characterized for sensitivity, noise and color response. The pixel size is 30 μm × 26 μm, and the pixel is composed of an n+/psub photodiode, a comparator and a D flip-flop, with a 16% fill factor.

4.
Many VLSI architectures for computing the discrete wavelet transform (DWT) have been presented, but parallel input data sequences and the programmability of the 2-D DWT have rarely been addressed. In this paper, we present a parallel-processing VLSI architecture that computes a programmable 2-D DWT, supporting various wavelet filter lengths and various wavelet transform levels. The proposed architecture is very regular and easy to extend. To eliminate high-frequency components, the pixel values outside the boundary of the image are mirror-extended as in the symmetric wavelet transform (SWT), and the mirror extension is realized via the routing network. Owing to the parallel processing, we adopt the row-based recursive pyramid algorithm (RPA), similar to the 1-D RPA, for data scheduling. This design has been implemented and fabricated in a 0.35 μm 1P4M CMOS technology, and the working frequency is 50 MHz. The chip size is about 5200 μm × 2500 μm. For a 256 × 256 image, the chip can process 30 frames per second with the filter length varying from 2 to 20 and with various levels. The proposed architecture is suitable for real-time applications such as JPEG 2000.

5.
This paper presents a new edge‐protection algorithm and its very large scale integration (VLSI) architecture for block artifact reduction. Unlike previous approaches using block classification, our algorithm uses pixel classification to assign each pixel to one of two classes, smooth region or edge region, which are described by edge‐protection maps. Based on these maps, a two‐step adaptive filter comprising offset filtering and edge‐preserving filtering is used to remove block artifacts. A pipelined VLSI architecture of the proposed deblocking algorithm for HD video processing is also presented. A memory‐reduced architecture for the block buffer is used to optimize memory usage. The architecture of the proposed deblocking filter is verified on a Cyclone II FPGA and implemented using the ANAM 0.25 µm CMOS cell library. Our experimental results show that the proposed algorithm effectively reduces block artifacts while preserving details. The PSNR performance of our algorithm using pixel classification is better than that of previous algorithms using block classification.

6.
A novel concept for global shutter CMOS image sensors with wide dynamic range (WDR) implementation is presented. The proposed imager is based on the multisampling WDR approach, and it allows an efficient global shutter pixel implementation that achieves a small pixel size and a high fill factor. The proposed imager provides wide dynamic range by applying an adaptive exposure time to each pixel, according to the local illumination intensity level. Two pixel configurations, employing different kinds of 1-bit in-pixel memory, were implemented. An imager including the two different pixels was designed and simulated in 0.18-μm CMOS technology. System architecture and operation are discussed and simulation results are presented.

7.
JPEG 2000 is one of the most popular image compression standards, offering significant performance advantages over previous image standards. The high computational complexity of the JPEG 2000 algorithms makes it necessary to employ methods that overcome the bottlenecks of the system, so an efficient solution is imperative. One such crucial algorithm in JPEG 2000 is arithmetic coding, which is based entirely on bit-level operations. In this paper, an efficient hardware implementation of arithmetic coding is proposed that uses efficient pipelining and parallel processing for intermediate blocks. The idea is to provide a two-symbol coding engine that is efficient in terms of performance, memory and hardware. The architecture is implemented in the Verilog hardware description language and synthesized for an Altera field programmable gate array. The only memory unit used in this design is a 256-bit FIFO (first in, first out) that stores the CX-D pairs at the input, which is negligible compared to existing arithmetic coding hardware designs. The simulation and synthesis results show that the operating frequency of the proposed architecture is greater than 100 MHz and that it achieves a throughput of 212 Msymbols/sec, which is double the throughput of a conventional one-symbol implementation and at least a 50% throughput increase over existing two-symbol architectures.

8.
An integrated 1024×1024 CMOS image sensor with programmable region-of-interest (ROI) readout and multiexposure technique has been developed and successfully tested. Size and position of the ROI are programmed based on multiples of a minimum readout kernel of 32×32 pixels. Since the dynamic range of the irradiance normally exceeds the electrical dynamic range of the imager that can be covered using a single integration time, a multiexposure technique has been implemented in the imager. Subsequent sensor images are acquired using different integration times and recomputed to form a single composite image. A newly developed algorithm performing the recomputation is presented. The chip has been realized in a 0.5-μm n-well standard CMOS process. The pixel pitch is 10 μm² and the total chip area is 164 mm².

9.
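We propose a design method for a parallel array VLSI architecture that implements the two-dimensional discrete wavelet transform (DWT) of the JPEG2000 coding system using the lifting scheme. The architecture consists of one row processor and one column processor. The row and column processors filter simultaneously through time-division multiplexing, optimized shift-and-add operations replace multiplications, and an embedded data-extension algorithm handles boundary extension. The whole architecture is pipelined, which reduces the amount of computation and improves the utilization of hardware resources. The architecture can be used in JPEG2000 image coding chips.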

10.
The high efficiency video coding (HEVC) transform algorithm for residual coding uses 2-dimensional (2D) 4×4 transforms with higher precision than H.264's 4×4 transforms, resulting in increased hardware complexity. In this paper, we present a shared architecture that can compute the 4×4 forward discrete cosine transform (DCT) and inverse discrete cosine transform (IDCT) of HEVC using a new mapping scheme in the video processor array structure. The architecture is implemented with only adders and shifters to achieve an area-efficient design. The proposed architecture is synthesized using ISE14.7 and implemented on the BEE4 platform with the Virtex-6 FF1759 LX550T field programmable gate array (FPGA). The results show that the video processor array structure achieves a maximum operating frequency of 165.2 MHz. The architecture and its implementation are presented in this paper to demonstrate its programmability and high performance.

11.
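Based on a cooled mid-wave 640 × 512 pixel (15 μm) staring focal plane array detector, a high-definition continuous-zoom thermal imager with a large zoom ratio was designed for long-range detection and recognition. The imager has a zoom ratio of 35 and an instantaneous field of view (IFOV) of 0.027 mrad/pixel at the long-focus end. In a standard atmosphere, with an observation field of view of 35° × 28°, it can detect a 4 m × 3 m × 2.3 m vehicle at 55 km and recognize it at 15 km (recognition probability of 50%), which meets the long-range combat requirements of modern electro-optical weapon systems. The imager uses an optical system designed with a smooth zoom-compensation curve, a single-rail/double-slider zoom mechanism, adaptive servo control, and infrared image enhancement, so that the image stays sharp, with no discontinuities, throughout continuous zooming over the whole zoom range. The minimum resolvable temperature difference (MRTD) was measured at the Nyquist frequency (18 cyc/mrad). The results show that the imager performs well and demonstrate that its key technologies have been carried from theoretical design through to an engineered complete system.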

12.
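An efficient parallel VLSI architecture design method for the two-dimensional discrete 5/3 wavelet transform (DWT) based on the lifting scheme is proposed. The method lets the row and column filters operate simultaneously and uses a pipelined design. At the same precision, it greatly reduces the amount of computation, increases the transform speed, and saves hardware resources. The method has passed Verilog HDL behavioral-level simulation and can be used as a standalone IP core in JPEG2000 image encoding and decoding chips. The architecture can be extended to the 9/7 wavelet lifting structure.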
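
As a software illustration of the lifting idea behind this entry, here is a minimal 1-D sketch of the reversible integer 5/3 lifting steps (predict, then update) with mirror extension at the boundaries. It is not the paper's VLSI architecture: the function names are hypothetical, an even-length input is assumed, and a 2-D transform would apply the same steps along rows and then columns.

```python
def lifting_53_forward(x):
    """One level of the reversible 5/3 lifting transform on an even-length list of ints.

    Returns (s, d): low-pass (approximation) and high-pass (detail) coefficients.
    Only additions and shifts are used; no multiplications are needed.
    """
    n = len(x)
    assert n >= 2 and n % 2 == 0, "this sketch assumes an even-length input"
    half = n // 2
    even, odd = x[0::2], x[1::2]

    # Predict step: d[i] = odd[i] - floor((even[i] + even[i+1]) / 2),
    # with the right neighbour mirrored at the end of the signal.
    d = []
    for i in range(half):
        right = even[i + 1] if i + 1 < half else even[i]
        d.append(odd[i] - ((even[i] + right) >> 1))

    # Update step: s[i] = even[i] + floor((d[i-1] + d[i] + 2) / 4),
    # with the left neighbour mirrored at the start of the signal.
    s = []
    for i in range(half):
        left = d[i - 1] if i > 0 else d[0]
        s.append(even[i] + ((left + d[i] + 2) >> 2))
    return s, d


def lifting_53_inverse(s, d):
    """Undo lifting_53_forward by running the lifting steps in reverse order."""
    half = len(s)
    even = []
    for i in range(half):
        left = d[i - 1] if i > 0 else d[0]
        even.append(s[i] - ((left + d[i] + 2) >> 2))
    x = []
    for i in range(half):
        right = even[i + 1] if i + 1 < half else even[i]
        x.append(even[i])
        x.append(d[i] + ((even[i] + right) >> 1))
    return x
```

Because each lifting step is undone exactly by subtracting what was added, the integer transform is lossless, which is why the inverse reproduces the input sample for sample.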

13.
Four image reorganization ICs that enable real-time difference encoding for hierarchical lossless image compression are reported. Two image reorganization processors are realized on the focal-plane and two are designed for hybridization to a separate imager IC. The two focal-plane ICs represent the first integration of a 256×256 buried-channel frame-transfer CCD image sensor with additional charge-domain circuitry to enable image reformatting at video rates (28 frames/s). The four ICs generate pyramidal pixel output in 3×3 blocks with the center pixel first. Pixel data reorganization is performed through simultaneous readout of three rows of data, followed by pixel resequencing and sampling to provide differential output. A novel architecture provides simultaneous readout of multiple imager rows on the focal-plane ICs. The ICs have achieved a charge-transfer efficiency (CTE) of 0.99996 in the conventional horizontal and vertical CCD registers, and a CTE of 0.99994 in the SP3 registers.

14.
Parallel image processing with the block data parallel architecture (total citations: 2; self-citations: 0; citations by others: 2)
Many digital signal and image processing algorithms can be sped up by executing them in parallel on multiple processors. The speed of parallel execution is limited by the need for communication and synchronization between processors. In this paper, we present a paradigm for parallel processing that we call the block data flow paradigm (BDFP). The goal of this paradigm is to reduce interprocessor communication and relax the synchronization requirements for such applications. We present the block data parallel architecture, which implements this paradigm, and we present methods for mapping algorithms onto this architecture. We illustrate this methodology for several applications, including two-dimensional (2-D) digital filters, the 2-D discrete cosine transform, QR decomposition of a matrix and Cholesky factorization of a matrix. We analyze the resulting system performance for these applications with regard to speedup and efficiency as the number of processors increases. Our results demonstrate that the block data parallel architecture is a flexible, high-performance solution for numerous digital signal and image processing algorithms.

15.
Image compression algorithms employ computationally expensive spatial convolutional transforms. The CMOS image sensor performs spatially compressing image quantization on the focal plane, yielding digital output at a rate proportional to the mere information rate of the video. A bank of column-parallel first-order incremental ΔΣ-modulated analog-to-digital converters (ADCs) performs column-wise distributed focal-plane oversampling of up to eight adjacent pixels and concurrent weighted average quantization. The number of samples per pixel and the switched-capacitor sampling sequence order set the amplitude and sign of the pixel coefficient, respectively. A simple digital delay and adder loop performs spatial accumulation over up to eight adjacent ADC outputs during readout. This amounts to computing a two-dimensional block matrix transform with an up to 8×8-pixel programmable kernel in parallel for all columns. Noise shaping reduces power dissipation below that of a conventional digital imager, while the need for a peripheral DSP is eliminated. A 128×128 active pixel array integrated with a bank of 128 ΔΣ-modulated ADCs was fabricated in a 0.35-μm CMOS technology. The 3.1 mm × 1.9 mm prototype captures 8-bit digital video at 30 frames/s and yields 4 GMACS projected computational throughput when scaled to HDTV 1080i resolution in discrete cosine transform (DCT) compression.

16.
A novel wide dynamic range (WDR) snapshot active pixel sensor for ultra-low power applications is presented. The proposed imager allows capturing of fast moving objects in the field of view and provides WDR by applying adaptive exposure time to each pixel, according to the local illumination intensity level. Driven by low-power dissipation requirements, the proposed pixel is operated by dual low voltage supplies (1.2 and 1.8 V) and utilizes an advanced low-power sensor design methodology. A test chip of a 32×32 array has been implemented in a standard 0.35-μm CMOS technology. A single pixel occupies an 18 μm × 32 μm area and is expected to dissipate 18.5 nW at video rate. System architecture and operation are discussed and simulation results are presented.

17.
At present, almost all digital images are stored and transferred in their compressed format in which discrete cosine transform (DCT)-based compression remains one of the most important data compression techniques due to the efforts from JPEG. In order to save the computation and memory cost, it is desirable to have image processing operations such as feature extraction, image indexing, and pattern classifications implemented directly in the DCT domain. To this end, we present in this paper a generalized analysis of spatial relationships between the DCTs of any block and its sub-blocks. The results reveal that DCT coefficients of any block can be directly obtained from the DCT coefficients of its sub-blocks and that the interblock relationship remains linear. It is useful in extracting global features in the compressed domain for general image processing tasks such as those widely used in pyramid algorithms and image indexing. In addition, due to the fact that the corresponding coefficient matrix of the linear combination is sparse, the computational complexity of the proposed algorithms is significantly lower than that of the existing methods.
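
The linear block/sub-block relationship described in this entry can be checked numerically. The sketch below is a 1-D illustration only, assuming the orthonormal DCT-II. It builds a matrix `A` such that the DCT of a length-2n block equals `A` applied to the concatenated DCTs of its two length-n halves. The helper names (`dct_matrix`, `matvec`, `merge_matrix`) are hypothetical and do not come from the paper, whose analysis covers 2-D blocks.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II matrix (n x n) as a list of rows."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def merge_matrix(n):
    """Matrix A with DCT(block of 2n) = A @ [DCT(first half); DCT(second half)].

    The orthonormal DCT is inverted by its transpose, so each half is recovered
    as small^T @ X_half and then transformed by the 2n-point DCT.
    """
    big, small = dct_matrix(2 * n), dct_matrix(n)
    a = []
    for k in range(2 * n):
        first = [sum(big[k][i] * small[j][i] for i in range(n)) for j in range(n)]
        second = [sum(big[k][n + i] * small[j][i] for i in range(n)) for j in range(n)]
        a.append(first + second)
    return a


# Usage: the DCT of an 8-sample block obtained from the DCTs of its two halves.
block = [3, 1, 4, 1, 5, 9, 2, 6]
sub_dcts = matvec(dct_matrix(4), block[:4]) + matvec(dct_matrix(4), block[4:])
merged = matvec(merge_matrix(4), sub_dcts)
direct = matvec(dct_matrix(8), block)
```

Here `merged` and `direct` agree up to floating-point rounding, and `A` depends only on the block size, so it can be precomputed once and reused for every block.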

18.
An imager with an integrated, fully programmable bit-serial column-parallel processor is proposed to meet the demand for a compact and versatile system-on-imager chip for consumer applications. The on-imager processor targets computationally intensive low-level image processing tasks. The processor is physically arranged as a densely packed 2-D processing element (PE) array at the imager column level. The digital processor has a multiple-instruction-multiple-data (MIMD) architecture built from multiple column-parallel single-instruction-multiple-data (SIMD) processors. A prototype imager chip with 128 × 128 pixels and a 4 × 128 PE array, designed in 0.6-μm technology, was fabricated, and its functionality was tested. An estimate of the performance of the proposed processor architecture in an advanced technology such as a 0.09-μm process shows that the proposed imager chip architecture has the potential for processing performance in the class of giga sum operations per second per square millimeter.

19.
In this paper, a computational digital pixel sensor (DPS) equipped with an on-chip image-processing capability has been developed. In order to resolve the interconnection bottleneck between the sensor array and on-chip processing units, a new block-readout architecture has been proposed and implemented on the chip. The data from the sensor array are read out in the form of a pixel block compatible with kernel image processing, and they are processed in parallel by on-chip processing units. Such an architecture has enabled us to carry out efficient kernel processing using a linear array of single-instruction–multiple-data processing units. In order to demonstrate the advantage of such an architecture, a rank-order filtering circuit has been implemented on the chip as a case study of on-chip image processing. A binary-search rank-order filtering algorithm has been implemented in simple circuitry. A proof-of-concept chip having an array of 64×48 pixels was designed and fabricated using a 0.35-μm CMOS technology, and the concept has been verified by the measurement of fabricated chips.
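
The abstract does not spell out its binary-search rank-order algorithm, so the following is only a generic sketch of one common bit-serial formulation, not the chip's circuit. The result is built from the most significant bit down, and each trial bit is kept if at least `rank` window values are still greater than or equal to the trial threshold. The function name, the 8-bit default, and the convention that rank 1 means the largest value are assumptions made for illustration.

```python
def rank_order_binary_search(window, rank, bits=8):
    """Return the rank-th largest value in `window` (rank=1 is the maximum).

    Each iteration decides one output bit using only comparisons against a
    threshold and a count, which maps naturally onto simple parallel hardware.
    Values are assumed to be unsigned integers below 2**bits.
    """
    assert 1 <= rank <= len(window)
    result = 0
    for b in reversed(range(bits)):
        trial = result | (1 << b)
        # Count how many pixels are at or above the trial threshold.
        count = sum(1 for v in window if v >= trial)
        if count >= rank:
            result = trial  # keep this bit: enough pixels are still >= trial
    return result


# Usage: median of a 3x3 window is the 5th largest of 9 values.
window_3x3 = [12, 200, 7, 7, 90, 33, 255, 0, 64]
median = rank_order_binary_search(window_3x3, rank=5)
```

The loop runs a fixed number of iterations equal to the bit depth, regardless of the window size, which is one reason this style of search is attractive for on-chip filtering.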

20.
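To quickly convert the raw data downlinked from the annular spectrometer into brightness data ready for inversion, a scientific data processing system for the ultraviolet annular spectrometer was built. A document-view data-flow architecture keeps the data streams from interfering with one another during processing. Detector noise is corrected using the mean of the dark pixels. A ratio method converts each pixel response to the response under standard integration time and gain. A spectral integration method, which does not require the channel spectral response function to be Gaussian, converts pixel response to brightness, and the relative standard deviation over a specified region is provided as a quick check of the correctness of the image conversion. Experimental results show that, in the test environment, the system smoothly processed in 352 s the 2.19 GB of image data that the annular spectrometer produced in 987 s, generating 4.96 GB of brightness data with a precision of 10⁻⁷ lm after brightness conversion. The system provides a fast and accurate data interface link between the satellite operations and control center and users, ensuring the accuracy and effectiveness of data inversion.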
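
The first two processing steps named in this entry, dark-pixel mean correction and ratio scaling to standard integration time and gain, can be sketched as follows. This is a minimal illustration that assumes the detector response is linear in both integration time and gain. The function names, parameter names, and default standard conditions are hypothetical and are not taken from the paper.

```python
def dark_corrected(signal_pixels, dark_pixels):
    """Subtract the mean of the dark (masked) pixels from every signal pixel."""
    dark_mean = sum(dark_pixels) / len(dark_pixels)
    return [p - dark_mean for p in signal_pixels]


def to_standard_conditions(values, t_int, gain, t_std=1.0, gain_std=1.0):
    """Scale responses to a standard integration time and gain by simple ratios.

    Assumes the response is proportional to integration time and to gain.
    """
    scale = (t_std / t_int) * (gain_std / gain)
    return [v * scale for v in values]


# Usage: two signal pixels, three dark pixels, frame taken at 2.0 s and gain 4.
corrected = dark_corrected([110, 130], [9, 10, 11])
normalized = to_standard_conditions(corrected, t_int=2.0, gain=4.0)
```

Converting the normalized responses to brightness would additionally need the instrument's calibration data, which the abstract does not give, so that step is left out here.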
