Similar articles
 Found 20 similar articles (search time: 46 ms)
1.
In this paper, we derive a technique for analyzing local distortions that affect data in real-world applications. We focus on image data, specifically handwritten characters. Given a reference image and a distorted copy of it, the method efficiently determines the rotations, translations, scaling, and any other distortions that have been applied. Because the method is robust, it can also estimate distortions between two unrelated images, thus determining the distortions that would be required to make the two images resemble each other. The approach is based on a polynomial series expansion using matrix powers of linear transformation matrices. The technique has applications in pattern recognition in the presence of distortions.

2.
In this paper, we present a faster-than-real-time implementation of a class of dense stereo vision algorithms on a low-power, massively parallel SIMD architecture, the CSX700. With two cores, each with 96 Processing Elements, this SIMD architecture provides a peak computation power of 96 GFLOPS while consuming only 9 Watts, making it an excellent candidate for embedded computing applications. Exploiting the full features of this architecture, we have developed schemes for an efficient parallel implementation with minimal overhead. For the sum of squared differences (SSD) algorithm and for VGA (640 × 480) images with disparity ranges of 16 and 32, we achieve a performance of 179 and 94 frames per second (fps), respectively. For HDTV (1,280 × 720) images with disparity ranges of 16 and 32, we achieve a performance of 67 and 35 fps, respectively. We have also implemented more accurate, and hence more computationally expensive, variants of the SSD, and for most cases, particularly for VGA images, we have achieved faster-than-real-time performance. Our results clearly demonstrate that, with careful parallelization schemes, the CSX architecture can provide excellent performance and flexibility for various embedded vision applications.
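
For readers unfamiliar with SSD matching, the following is a minimal sequential sketch of the basic idea, not the parallel CSX700 implementation described in the abstract. It assumes a rectified grayscale pair stored as lists of rows; the function name `ssd_disparity` and the parameters `max_disp` and `half_win` are illustrative choices, not taken from the paper.

```python
def ssd_disparity(left, right, max_disp, half_win=1):
    """Brute-force SSD block matching on a rectified grayscale pair.

    left, right: lists of rows of integer intensities with equal shape.
    Returns a disparity map; border pixels are left at 0.
    """
    h, w = len(left), len(left[0])
    disp = [[0] * w for _ in range(h)]
    for y in range(half_win, h - half_win):
        for x in range(half_win, w - half_win):
            best_cost, best_d = None, 0
            # Only test disparities that keep the window inside the right image.
            for d in range(0, min(max_disp, x - half_win) + 1):
                cost = 0
                for dy in range(-half_win, half_win + 1):
                    for dx in range(-half_win, half_win + 1):
                        diff = left[y + dy][x + dx] - right[y + dy][x + dx - d]
                        cost += diff * diff
                # Keep the disparity with the smallest sum of squared differences.
                if best_cost is None or cost < best_cost:
                    best_cost, best_d = cost, d
            disp[y][x] = best_d
    return disp
```

Every pixel's search is independent of every other pixel's, which is why this kind of algorithm maps naturally onto SIMD hardware.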

3.
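A survey of multi-source data person re-identification   Cited by: 4 (self-citations: 3, citations by others: 1)
Ye Yu, Wang Zheng, Liang Chao, Han Zhen, Chen Jun, Hu Ruimin. Acta Automatica Sinica (自动化学报), 2020, 46(9): 1869-1884
Person re-identification has been a hot topic in computer vision in recent years, and after years of development, general person re-identification based on visible-light images is approaching maturity. However, current research mostly rests on a relatively ideal assumption, namely that pedestrian images are high-resolution images captured under sufficient lighting. Consequently, although most studies achieve satisfactory results, they are not applicable in real environments. Multi-source data person re-identification is the problem of matching pedestrians using multiple kinds of pedestrian information. Besides the problems faced by general person re-identification, it must also handle the discrepancies that arise when different types of pedestrian information, such as low-resolution images, infrared images, depth images, text descriptions, and sketches, are matched against ordinary pedestrian images. Compared with general person re-identification, multi-source data person re-identification is therefore more practical and also more challenging. This paper first introduces the state of development of general person re-identification and the problems it faces, then compares multi-source data person re-identification with general person re-identification and, according to the data types involved, summarizes five categories of multi-source data person re-identification problems, reviewing and analyzing existing work in terms of both methods and datasets. Compared with general person re-identification, the advantage of multi-source data person re-identification is that it can make full use of various kinds of data to learn feature transformations across modalities and types. Finally, the paper discusses future directions for multi-source data person re-identification.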

4.
In this paper, we propose an efficient no-reference image quality assessment (NR-IQA) method dubbed Center-Surround based Blind Image Quality Assessment (CS-BIQA). Our proposed method employs the Difference of Gaussian (DoG) model to decompose images into several frequency bands, considering the center-surround effect and the multi-channel attribute of the human visual system (HVS). Integrated natural scene statistics (NSS) features are then derived from all DoG bands. After that, regression models between the integrated features and the associated subjective assessment scores are learned on the training dataset. Subsequently, the learned models are used to predict the quality scores of test images. The main contribution of this paper is twofold. First, the empirical distributions of the DoG bands of images are proven to be Gaussian-like, so the NSS features can be employed to represent the perceptual quality of images. Second, different types of distortions are observed to affect different frequency components of images, so the integrated features extracted from multiple frequency bands are employed in CS-BIQA to achieve stronger discriminative capability for image quality. Extensive experiments indicate that our proposed CS-BIQA metric can represent the perceptual characteristics of the HVS. The results on popular IQA databases demonstrate that the CS-BIQA metric is competitive with the state-of-the-art relevant IQA metrics. Furthermore, our proposed method has very low computational complexity, making it more suitable for real-time applications.
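
The DoG decomposition step can be sketched as follows. This is an illustration under stated assumptions, not the CS-BIQA implementation: the sigma values, the replicated-border handling, and the helper names `blur`, `dog_bands`, and `band_stats` are hypothetical, and the mean and variance returned by `band_stats` are only stand-ins for the paper's NSS features, which the abstract does not specify.

```python
import math

def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def clamp(value, upper):
    """Clamp an index into [0, upper] (replicated border)."""
    return min(max(value, 0), upper)

def blur(img, sigma):
    """Separable Gaussian blur of a 2-D list of numbers."""
    k = gaussian_kernel(sigma)
    r = len(k) // 2
    h, w = len(img), len(img[0])
    rows = [[sum(k[j + r] * img[y][clamp(x + j, w - 1)] for j in range(-r, r + 1))
             for x in range(w)] for y in range(h)]
    return [[sum(k[j + r] * rows[clamp(y + j, h - 1)][x] for j in range(-r, r + 1))
             for x in range(w)] for y in range(h)]

def dog_bands(img, sigmas):
    """Return (bands, blurred): each band is the difference of adjacent blurs."""
    blurred = [blur(img, s) for s in sigmas]
    bands = [[[f - c for f, c in zip(fine_row, coarse_row)]
              for fine_row, coarse_row in zip(fine, coarse)]
             for fine, coarse in zip(blurred, blurred[1:])]
    return bands, blurred

def band_stats(band):
    """Mean and variance of one band (placeholder for real NSS features)."""
    values = [v for row in band for v in row]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, var
```

Because adjacent blurs are subtracted, the bands sum to the difference between the finest and the coarsest blur, so each band isolates one range of spatial frequencies.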

5.
《Real》1999,5(6):385-395
In this paper, we present a way to improve the computational speed of image contrast enhancement using low-cost FPGA-based hardware, primarily targeted at X-ray images. In particular, we consider an enhancement method that consists of filtering followed by histogram modification. Filtering is done via the high boost filter (HBF), which is based on unsharp masking, and histogram modification is based on global histogram equalization (GHE). An image enhancement co-processor concept, IMECO, is proposed that enables efficient hardware implementation of enhancement procedures and hardware/software co-design to achieve high-performance, low-cost solutions. The co-processor runs on an FPGA prototyping ISA-bus board. At this stage it consists of two hardware functional units that implement HBF and GHE and can be downloaded onto the board sequentially or reside on the board at the same time. These units represent an embryo of virtual hardware units that form a library of image enhancement algorithms. These algorithms can be easily integrated into software templates. In our trials with chest X-ray images, the performance improvement over software-only implementations is more than two orders of magnitude, thus providing real-time or near-real-time image enhancement as required in target applications.
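
The two-stage pipeline (high boost filtering, then global histogram equalization) can be sketched in software as below. This is a plain reference sketch, not the IMECO hardware design: it assumes 8-bit grayscale images stored as lists of rows, uses a 3 × 3 box mean as the low-pass filter, and the function names and the default gain `k` are illustrative assumptions.

```python
def box_blur3(img):
    """3 x 3 mean filter with replicated borders."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    acc += img[yy][xx]
            out[y][x] = acc / 9.0
    return out

def high_boost(img, k=1.5):
    """Unsharp-masking form: f + k * (f - blur(f)); k > 1 boosts detail further."""
    blurred = box_blur3(img)
    return [[min(255, max(0, int(round(f + k * (f - b)))))
             for f, b in zip(img_row, blur_row)]
            for img_row, blur_row in zip(img, blurred)]

def equalize(img, levels=256):
    """Global histogram equalization through the cumulative distribution."""
    hist = [0] * levels
    for row in img:
        for v in row:
            hist[v] += 1
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    total = running
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        # A single gray level: nothing to spread out.
        return [row[:] for row in img]
    lut = [int(round(max(c - cdf_min, 0) * (levels - 1) / (total - cdf_min)))
           for c in cdf]
    return [[lut[v] for v in row] for row in img]

def enhance(img, k=1.5):
    """Filtering followed by histogram modification, as in the abstract."""
    return equalize(high_boost(img, k))
```

Both stages reduce to small neighborhood sums and a lookup table, which is part of what makes them attractive candidates for a hardware co-processor.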

6.
Histograms are used to analyze and index images. They have been found experimentally to have low sensitivity to certain types of image morphisms, for example, viewpoint changes and object deformations. The precise effect of these image morphisms on the histogram, however, has not been studied. In this work we derive the complete class of local transformations that preserve or scale the magnitude of the histogram of all images. We also derive a more general class of local transformations that preserve the histogram relative to a particular image. To achieve this, the transformations are represented as solutions to families of vector fields acting on the image. The local effect of fixed points of the fields on the histograms is also analyzed. The analytical results are verified with several examples. We also discuss several applications and the significance of these transformations for histogram indexing.
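
A discrete toy example may help fix the two cases named in the abstract, transformations that preserve the histogram and transformations that scale its magnitude. The sketch below is only a discrete analogue of the paper's continuous analysis: a mirror flip rearranges pixels without changing any counts, while pixel replication by a factor `factor` multiplies every count by `factor` squared. The helper names are hypothetical.

```python
from collections import Counter

def histogram(img):
    """Count how often each gray level occurs in a 2-D list."""
    return Counter(v for row in img for v in row)

def flip_horizontal(img):
    """Mirror the image; pixels move but no gray level is created or lost."""
    return [row[::-1] for row in img]

def upscale(img, factor):
    """Nearest-neighbor upscaling by pixel replication."""
    return [[v for v in row for _ in range(factor)]
            for row in img for _ in range(factor)]

image = [[0, 0, 1],
         [2, 1, 1]]
preserved = histogram(flip_horizontal(image)) == histogram(image)
scaled = histogram(upscale(image, 2)) == Counter(
    {level: 4 * count for level, count in histogram(image).items()})
```

Here both `preserved` and `scaled` evaluate to `True`: the flip leaves the histogram unchanged, and the 2× upscaling multiplies each bin by 4.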

7.
Texture classification is one of the most important tasks in the computer vision field, and it has been extensively investigated in the last several decades. Previous texture classification methods mainly used template matching based methods such as Support Vector Machine and k-Nearest-Neighbour for classification. Given enough training images, the state-of-the-art texture classification methods can achieve very high classification accuracies on some benchmark databases. However, when the number of training images is limited, which usually happens in real-world applications because of the high cost of obtaining labelled data, the classification accuracies of those state-of-the-art methods deteriorate due to overfitting. In this paper we aim to develop a novel framework that can correctly classify textural images with only a small number of training images. By taking into account the repetition and sparsity properties of textures, we propose a sparse representation based multi-manifold analysis framework for texture classification from few training images. A set of new training samples is generated from each training image by a scale and spatial pyramid, and then the training samples belonging to each class are modelled by a manifold based on sparse representation. We learn a dictionary of sparse representation and a projection matrix for each class and classify the test images based on the projected reconstruction errors. The framework provides a more compact model than the template matching based texture classification methods, and mitigates overfitting. Experimental results show that the proposed method achieves reasonably high generalization capability even with as few as 3 training images, and significantly outperforms the state-of-the-art texture classification approaches on three benchmark datasets.

8.
This paper proposes a data-hiding technique for binary images in the morphological transform domain for authentication purposes. To achieve blind watermark extraction, it is difficult to use the detail coefficients directly as a location map to determine the data-hiding locations. Hence, we view flipping an edge pixel in binary images as shifting the edge location one pixel horizontally and vertically. Based on this observation, we propose an interlaced morphological binary wavelet transform to track the shifted edges, which facilitates blind watermark extraction and the incorporation of a cryptographic signature. Unlike existing block-based approaches, in which the block size is constrained to 3 × 3 pixels or larger, we process an image in 2 × 2 pixel blocks. This allows flexibility in tracking the edges and also achieves low computational complexity. The two processing cases in which flipping the candidates of one does not affect the flippability conditions of the other are employed for orthogonal embedding, so that more suitable candidates can be identified and a larger capacity can be achieved. A novel and effective Backward-Forward Minimization method is proposed, which considers both, backwardly, the neighboring processed embeddable candidates and, forwardly, the unprocessed flippable candidates that may be affected by flipping the current pixel. In this way, the total visual distortion can be minimized. Experimental results demonstrate the validity of our arguments.

9.
The information in e-commerce images varies, and different users may focus on different contents of the same image for different purposes. Research on computer-generated recommendation is therefore becoming more and more important, but retrieval based only on keywords obviously falls short for massive numbers of resource images. In this paper, we focus on a recommendation system for goods images based on image content. Goods images have a relatively homogeneous background and a wide range of applications. The recommendation consists of three stages. First, the image is pre-processed by removing the background. Second, a weighted representation model is proposed to represent the image: the separate features are extracted and normalized, and the weight of each feature is computed based on the samples browsed by the users. Third, a feature indexing scheme is put forward based on the proposed representation; a binary tree is used for the indexing, and a binary-tree updating algorithm is also given. Finally, the recommended images are produced by a feature-combination searching scheme. Experimental results on a real goods image database show that our algorithm achieves high accuracy in recommending similar goods images at high speed.

10.
In some orthopaedic applications, such as the design of custom-made hip prostheses, reconstruction of the bone morphology is a fundamental step. Different methods are available to extract the geometry of the femoral medullary canal from computed tomography (CT) images. In this research, an automatic procedure (border-tracing method) for the extraction of bone contours was implemented and validated. A composite replica of the human femur was scanned and the CT images were processed using three different methods: a manual procedure, the border-tracing algorithm, and a threshold-based method. The resulting contours were used to estimate the accuracy of the implemented procedure. The two software techniques were more accurate than the manual procedure. These two procedures were then applied to an in vivo CT data set in order to determine the most critical region for repeatability. Only for the images located in this region, the repeatability measurement was carried out on six in vivo CT data sets to evaluate the inter-femur repeatability. The border-tracing method was found to achieve the highest repeatability.

11.
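Intelligent inspection of product surface defects faces several problems: defect samples are hard to collect, samples are imbalanced, and targets are small and hard to localize. Addressing these problems in magnetic core surface defect detection, this work proposes a deep-learning-based image augmentation and detection method. First, a deep convolutional generative adversarial network combined with a Gaussian mixture model is used to generate magnetic core defect images; next, Poisson blending is applied to produce an augmented dataset; finally, intelligent detection of magnetic core surface defects is implemented on the YOLO-v3 network. Experiments show that the method generates images of higher quality with more pronounced defects, and that detection accuracy improves by 5.6%.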

12.
In this paper, we propose two novel methods for face recognition under arbitrary unknown lighting by using the spherical harmonics illumination representation, which require only one training image per subject and no 3D shape information. Our methods are based on the result which demonstrated that the set of images of a convex Lambertian object obtained under a wide variety of lighting conditions can be approximated accurately by a low-dimensional linear subspace. We provide two methods to estimate the spherical harmonic basis images spanning this space from just one image. Our first method builds the statistical model based on a collection of 2D basis images. We demonstrate that, by using the learned statistics, we can estimate the spherical harmonic basis images from just one image taken under arbitrary illumination conditions if there is no pose variation. In contrast to the first method, the second method builds the statistical models directly in 3D space by combining the spherical harmonic illumination representation and a 3D morphable model of human faces to recover basis images from images across both poses and illuminations. After estimating the basis images, we use the same recognition scheme for both methods: we recognize the face for which there exists a weighted combination of basis images that is the closest to the test face image. We provide a series of experiments that achieve high recognition rates under a wide range of illumination conditions, including multiple sources of illumination. Our methods achieve accuracy comparable to that of methods that have much more onerous training data requirements. A comparison of the two methods is also provided.

13.
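Image-based virtual try-on synthesizes a target garment image onto an image of a person; the task has attracted much attention in recent years owing to its wide use in e-commerce and fashion image editing. In view of the characteristics of the task and the shortcomings of existing methods, a two-stage adjustable perceptual distillation method (TS-APD) is proposed. The method comprises three steps: (1) two semantic segmentation networks are pre-trained on garment images and person images, respectively, to produce more accurate garment foreground segmentation and upper-garment segmentation; (2) using these two semantic...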

14.
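Feature extraction for wood board surface defects based on image fusion   Cited by: 1 (self-citations: 0, citations by others: 1)
During production, defects such as cracks and dents sometimes appear on the surfaces of wood and solid-wood furniture, and varying texture backgrounds and paint reflections make defect recognition very difficult. To recognize wood board surface defects, the same board surface is illuminated by a light source from four different angles and the four corresponding images are captured to form an image sequence, which provides richer detail. An image sequence fusion method based on principal component analysis is proposed; it fuses the complementary information of the four images in a sequence, and the fused result makes the defect features more apparent. The method introduces a concept relating principal component subspaces, which allows the main information to be extracted while preserving the characteristics of the original data. Experimental results show that image sequence fusion based on principal component analysis extracts wood board surface defect features more effectively. The resulting feature images can be used in the next step for automatic defect recognition and classification.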
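
One common way to fuse an image sequence with principal component analysis is to weight each image by its component in the dominant eigenvector of the inter-image covariance matrix. The sketch below shows that generic scheme only; it is an assumption that it resembles the method in this abstract, whose subspace-based formulation is not described in enough detail to reproduce. The function names are hypothetical, the eigenvector is found by power iteration, and the images are assumed to be positively correlated so that the weights are positive.

```python
def pca_fusion_weights(images, iterations=200):
    """Fusion weights from the dominant eigenvector of the covariance matrix."""
    vectors = [[float(v) for row in img for v in row] for img in images]
    n = len(vectors[0])
    k = len(vectors)
    means = [sum(vec) / n for vec in vectors]
    centered = [[x - m for x in vec] for vec, m in zip(vectors, means)]
    cov = [[sum(a * b for a, b in zip(centered[i], centered[j])) / n
            for j in range(k)] for i in range(k)]
    vec = [1.0] * k
    for _ in range(iterations):
        # Power iteration: repeatedly apply the covariance matrix and renormalize.
        nxt = [sum(cov[i][j] * vec[j] for j in range(k)) for i in range(k)]
        norm = sum(x * x for x in nxt) ** 0.5
        if norm == 0.0:
            return [1.0 / k] * k  # no variance at all: fall back to averaging
        vec = [x / norm for x in nxt]
    total = sum(vec)
    if abs(total) < 1e-12:
        return [1.0 / k] * k  # degenerate mixed-sign case: fall back to averaging
    return [x / total for x in vec]

def fuse(images):
    """Weighted sum of same-sized images using the PCA-derived weights."""
    weights = pca_fusion_weights(images)
    h, w = len(images[0]), len(images[0][0])
    return [[sum(wt * img[y][x] for wt, img in zip(weights, images))
             for x in range(w)] for y in range(h)]
```

With four images of the same board under different lighting, `fuse([img_a, img_b, img_c, img_d])` would give more weight to the images that carry more of the shared variance.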

15.
In factories, it has recently become very important to detect defects such as cracks in products automatically. To achieve this, automatic crack detection systems using photo images from digital cameras have been proposed. However, in conventional methods that use edge lines detected and extracted by an operator such as a Sobel filter, it is difficult to distinguish between the original lines on the product surface and those of cracks, especially in the case of noisy images. To overcome these difficulties, we have proposed a new method using rotational morphology. Rotational morphology is a kind of mathematical morphology with rotated structuring elements. Finally, some simulations are carried out to confirm the effectiveness of our proposed method.

16.
Multispectral pedestrian detection is an important functionality in various computer vision applications such as robot sensing, security surveillance, and autonomous driving. In this paper, our motivation is to automatically adapt a generic pedestrian detector trained in a visible source domain to a new multispectral target domain without any manual annotation effort. For this purpose, we present an auto-annotation framework that iteratively labels pedestrian instances in visible and thermal channels by leveraging the complementary information of multispectral data. A distinct target is temporally tracked through image sequences to generate more confident labels. The predicted pedestrians in the two individual channels are merged through a label fusion scheme to generate multispectral pedestrian annotations. The obtained annotations are then fed to a two-stream region proposal network (TS-RPN) to learn the multispectral features on both visible and thermal images for robust pedestrian detection. Experimental results on the KAIST multispectral dataset show that our proposed unsupervised approach using auto-annotated training data can achieve performance comparable to state-of-the-art deep neural network (DNN) based pedestrian detectors trained using manual labels.

17.
In central catadioptric systems, 3D lines are projected into conics. In this paper we present a new approach to extract the conics in the raw catadioptric image that correspond to projected straight lines in the scene. Using the internal calibration and two image points, we are able to compute analytically these conics, which we name hypercatadioptric line images. We obtain the error propagation from the image points to the 3D line projection as a function of the calibration parameters. We also perform an exhaustive analysis of the elements that can affect the conic extraction accuracy. In addition, we exploit the presence of parallel lines in man-made environments to compute the dominant vanishing points (VPs) in the omnidirectional image. To obtain the intersection of two of these conics, we analyze the self-polar triangle common to the pair. With the information contained in the vanishing points we are able to obtain the 3D orientation of the catadioptric system. This method can be used either in a vertical stabilization system required by autonomous navigation or to rectify images in applications where the vertical orientation of the catadioptric system is assumed. We use synthetic and real images to test the proposed method. We evaluate the 3D orientation accuracy against a ground truth given by a goniometer and by an inertial measurement unit (IMU). We also test our approach by performing vertical and full rectifications on sequences of real images.

18.
Recently, as the Web and various databases have come to contain a large number of images, content-based image retrieval (CBIR) applications are greatly needed. This paper proposes a new image retrieval system using color-spatial information from those applications. First, this paper suggests two kinds of indexing keys to prune away images irrelevant to a given query image: the major colors' set (MCS) signature, related to color information, and the distribution block signature (DBS), related to spatial information. After successively applying these filters to a large database, we are left with only a small number of high-potential candidates that are somewhat similar to the query image. Then we use the quad modeling (QM) method to set the initial weights of the two-dimensional cells in the query image according to each major color. Finally, we retrieve more similar images from the database by comparing the query image with the candidate images through a similarity measuring function associated with the weights. In that procedure, we use a new relevance feedback mechanism. This feedback enhances the retrieval effectiveness by dynamically modulating the weights of the color-spatial information. Experiments show that the proposed system is not only efficient but also effective.
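
The pruning idea behind a color signature can be illustrated with a bitmask filter. The sketch below is a guess at the general mechanism, not the paper's MCS definition: it assumes colors are quantized to 2 bits per RGB channel, a color counts as "major" when it covers at least `min_share` of the pixels, and a candidate survives when it shares at least `min_common` major colors with the query. All names and thresholds are hypothetical.

```python
def quantize(rgb, bits=2):
    """Map an (r, g, b) triple with 8-bit channels to a coarse color bin index."""
    r, g, b = rgb
    shift = 8 - bits
    return ((r >> shift) << (2 * bits)) | ((g >> shift) << bits) | (b >> shift)

def mcs_signature(pixels, min_share=0.1, bits=2):
    """Bitmask with one bit set per color bin that covers enough of the image."""
    counts = {}
    for pixel in pixels:
        bin_id = quantize(pixel, bits)
        counts[bin_id] = counts.get(bin_id, 0) + 1
    signature = 0
    for bin_id, count in counts.items():
        if count / len(pixels) >= min_share:
            signature |= 1 << bin_id
    return signature

def prune(query_signature, database, min_common=1):
    """Keep only images whose signature shares enough major colors with the query."""
    return [name for name, signature in database.items()
            if bin(query_signature & signature).count("1") >= min_common]
```

Because the comparison is a single bitwise AND per image, a filter of this kind can discard most of a large database before any expensive similarity measure is evaluated.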

19.
Context-aware applications, a typical type of self-adaptive software system, are receiving increasing attention. These applications continually adapt to environmental changes in an autonomic way. However, their adaptation may contain defects when the complexity of modeling all environmental changes is beyond a developer's ability. Such defects can cause the adaptation to fail and result in application crashes or freezes. Relating these failures back to the responsible defects is challenging. In this paper we propose a novel approach, called Adam, to assist in identifying defects in context-aware adaptation. Adam monitors runtime errors in an application, logs relevant error information, and relates the errors to the responsible defects in the application. To make the Adam approach feasible, we investigate the error types that are commonly exhibited by various failures reported in context-aware applications. Adam detects these errors in order to identify the responsible defects. To detect them, Adam formally models the adaptation semantics of context-aware applications and integrates into them a set of assertion checkers for these error types. We experimentally evaluated Adam on three context-aware applications. The experiments reported promising results: Adam can effectively detect errors, identify their responsible defects in applications, and give useful hints on how these defects can be fixed.

20.
Illumination invariant face recognition using near-infrared images   Cited by: 4 (self-citations: 0, citations by others: 4)
Most current face recognition systems are designed for indoor, cooperative-user applications. However, even in thus-constrained applications, most existing systems, academic and commercial, are compromised in accuracy by changes in environmental illumination. In this paper, we present a novel solution for illumination invariant face recognition for indoor, cooperative-user applications. First, we present an active near infrared (NIR) imaging system that is able to produce face images of good condition regardless of visible lights in the environment. Second, we show that the resulting face images encode intrinsic information of the face, subject only to a monotonic transform in the gray tone; based on this, we use local binary pattern (LBP) features to compensate for the monotonic transform, thus deriving an illumination invariant face representation. Then, we present methods for face recognition using NIR images; statistical learning algorithms are used to extract the most discriminative features from a large pool of invariant LBP features and construct a highly accurate face matching engine. Finally, we present a system that is able to achieve accurate and fast face recognition in practice, in which a method is provided to deal with specular reflections of active NIR lights on eyeglasses, a critical issue in active NIR image-based face recognition. Extensive, comparative results are provided to evaluate the imaging hardware, the face and eye detection algorithms, and the face recognition algorithms and systems, with respect to various factors, including illumination, eyeglasses, time lapse, and ethnic groups.
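
The reason LBP features compensate for a monotonic gray-tone transform is that each LBP code depends only on the order relation between a pixel and its neighbors, and a strictly increasing transform preserves that order. The sketch below demonstrates this with the basic 8-neighbor LBP operator; the function name `lbp_codes` and the toy transform are illustrative and are not taken from the paper's system.

```python
def lbp_codes(img):
    """Basic 3 x 3 LBP code for every interior pixel of a 2-D list."""
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    h, w = len(img), len(img[0])
    codes = []
    for y in range(1, h - 1):
        row = []
        for x in range(1, w - 1):
            center = img[y][x]
            code = 0
            for bit, (dy, dx) in enumerate(offsets):
                # Set the bit when the neighbor is at least as bright as the center.
                if img[y + dy][x + dx] >= center:
                    code |= 1 << bit
            row.append(code)
        codes.append(row)
    return codes

face_patch = [[12, 40, 7, 33],
              [25, 18, 60, 2],
              [9, 51, 30, 44],
              [70, 5, 16, 28]]
# A strictly increasing gray-tone transform, standing in for a lighting change.
relit_patch = [[v * v + 3 for v in row] for row in face_patch]
invariant = lbp_codes(face_patch) == lbp_codes(relit_patch)
```

Here `invariant` is `True`: the two patches have very different intensities but identical LBP codes. A transform that is not strictly increasing could merge distinct gray levels and change some codes.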
