Similar Documents
20 similar documents found.
1.
The human visual system has the impressive ability to quickly extract simple, global, curvilinear structure from input that may locally not even contain small fragments of this structure. Curves are easy to see globally even when they are locally broken, blurred, or jagged. Because the character of curve input can change with the scale at which it is considered, a hierarchical “pyramid” data structure is suggested. This paper describes a simple curve extraction process involving only local isotropic parallel operations. The noise-cleaned input image is smoothed and subsampled into a pyramid of lower-resolution versions by recursive computation of Gaussian-weighted sums. Curves are localized to thin strings of ridges and peaks at each scale. The method is compared with more abstract, essentially one-dimensional contour summarization processes.
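As a rough illustration of the pyramid construction described above, the following Python sketch recursively smooths with Gaussian-weighted sums and subsamples by a factor of two per level; noise cleaning and the ridge/peak localization steps are omitted, and the kernel width `sigma` is an assumed parameter.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_pyramid(image, levels=4, sigma=1.0):
    """Recursively smooth via Gaussian-weighted sums, then subsample by 2."""
    pyramid = [image.astype(float)]
    for _ in range(levels - 1):
        smoothed = gaussian_filter(pyramid[-1], sigma=sigma)
        pyramid.append(smoothed[::2, ::2])  # keep every other row and column
    return pyramid

# Example: each level is a lower-resolution version of a noisy input
for level, p in enumerate(gaussian_pyramid(np.random.rand(256, 256))):
    print(f"level {level}: shape {p.shape}")
```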

2.
Light field imaging can capture both spatial and angular information of a 3D scene and is considered a prospective acquisition and display solution for more natural, fatigue-free 3D visualization. However, a major challenge in handling light field data is its sheer volume, so efficient coding schemes for this particular type of image are needed. In this paper, we propose a hybrid light field image codec architecture, combining linear weighted prediction and intra block copy, built on the High Efficiency Video Coding screen content coding extensions (HEVC SCC) standard to compress light field image data effectively. To improve prediction accuracy, a linear weighted prediction method is integrated into the HEVC SCC standard, where a local-correction weighting method derives the weight coefficient vector. For non-homogeneous texture areas, however, the best match found by linear weighted prediction does not necessarily yield a good prediction of the coding block. To alleviate this shortcoming, the proposed hybrid codec architecture uses the intra block copy scheme to find the best prediction of the coding block based on rate-distortion optimization. Because the “try all, then select the best” intra mode decision method is time-consuming, we further propose a fast mode decision scheme for the hybrid codec architecture to reduce computational complexity. Experimental results demonstrate the advantage of the proposed hybrid codec architecture, in terms of different quality metrics as well as the visual quality of views rendered from decompressed light field content, over the HEVC intra-prediction method and several other prediction methods in this field.
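A hedged sketch of the linear weighted prediction idea: a weight coefficient vector is fitted over the templates (neighbouring reconstructed samples) of k candidate reference blocks and then applied to the blocks themselves. This is a generic least-squares reconstruction for illustration, not the standard's or the paper's exact derivation; all names and shapes are assumptions.

```python
import numpy as np

def linear_weighted_prediction(template, ref_templates, ref_blocks):
    """template: (t,) samples around the coding block;
    ref_templates: (k, t) templates of k reference candidates;
    ref_blocks: (k, H, W) the candidate reference blocks themselves."""
    # fit w so that the weighted reference templates reproduce the template
    w, *_ = np.linalg.lstsq(ref_templates.T, template, rcond=None)
    return np.tensordot(w, ref_blocks, axes=1)  # weighted sum of references
```

In the hybrid architecture, such a prediction would compete with an intra block copy candidate under rate-distortion optimization.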

3.
4.
Automatic image annotation aims at predicting a set of semantic labels for an image. Because of the large annotation vocabulary, the number of images per label varies widely (“class-imbalance”). Additionally, due to the limitations of human annotation, several images are not annotated with all the relevant labels (“incomplete-labelling”). These two issues affect the performance of most existing image annotation models. In this work, we propose the 2-pass k-nearest neighbour (2PKNN) algorithm, a two-step variant of the classical k-nearest neighbour algorithm that addresses these issues in the image annotation task. The first step of 2PKNN uses “image-to-label” similarities, while the second step uses “image-to-image” similarities, thus combining the benefits of both. We also propose a metric learning framework over 2PKNN, formulated in a large-margin set-up by generalizing a well-known (single-label) classification metric learning algorithm to multi-label data. In addition to the features provided by Guillaumin et al. (2009), which are used by almost all recent image annotation methods, we benchmark with new features, including features extracted from a generic convolutional neural network model and features computed using modern encoding techniques. We also learn linear and kernelized cross-modal embeddings over different feature combinations to reduce the semantic gap between visual features and textual labels. Extensive evaluations on four image annotation datasets (Corel-5K, ESP-Game, IAPR-TC12 and MIRFlickr-25K) demonstrate that our method achieves promising results and establishes a new state-of-the-art on the prevailing image annotation datasets.
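A minimal sketch of the two-pass idea, assuming a feature matrix `X`, a binary label matrix `Y`, and Euclidean distances; the authors' actual similarity measures and weighting are richer than this.

```python
import numpy as np

def two_pass_knn(query, X, Y, k1=3, k2=5):
    """X: (n, d) image features; Y: (n, L) binary labels; returns label scores."""
    d = np.linalg.norm(X - query, axis=1)
    # Pass 1 ("image-to-label"): keep the k1 closest images per label,
    # which counters class-imbalance by giving every label some neighbours.
    keep = set()
    for l in range(Y.shape[1]):
        pos = np.flatnonzero(Y[:, l])
        keep.update(pos[np.argsort(d[pos])[:k1]])
    keep = np.fromiter(sorted(keep), dtype=int)
    # Pass 2 ("image-to-image"): distance-weighted vote among the survivors.
    nn = keep[np.argsort(d[keep])[:k2]]
    w = np.exp(-d[nn])
    return (w[:, None] * Y[nn]).sum(axis=0)  # per-label relevance scores
```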

5.
Based on the CLM (codebook-less model), this paper proposes a codebook-free method for representing and modelling indoor functional areas. First, SURF (speeded-up robust features) descriptors are extracted from grayscale images. Then, the spatial pyramid method divides each image into regular regions; a Gaussian manifold is introduced in the vector space, each region is represented by a single Gaussian model, and these are combined into a Gaussian mixture model representing the whole image. Finally, the image's Gaussian model is used together with an improved SVM (support vector machine) classifier to classify indoor functional areas. Experimental results on the Scene 15 dataset show that, compared with the traditional codebook-building approach, the method improves classification accuracy by about 20%, is robust to orientation changes and uneven illumination, and effectively improves a service robot's ability to recognize indoor functional areas.
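A sketch of the region-wise Gaussian modelling: descriptors falling in each spatial-pyramid cell are summarized by a single Gaussian (mean and covariance), and the cells jointly form the image's mixture model. SURF extraction and the SVM stage are assumed to happen elsewhere.

```python
import numpy as np

def cell_gaussians(keypoints_xy, descriptors, img_shape, grid=(2, 2)):
    """Fit one Gaussian per spatial-pyramid cell from the descriptors in it."""
    h, w = img_shape
    cx = np.clip((keypoints_xy[:, 0] / w * grid[1]).astype(int), 0, grid[1] - 1)
    cy = np.clip((keypoints_xy[:, 1] / h * grid[0]).astype(int), 0, grid[0] - 1)
    models = {}
    for gy in range(grid[0]):
        for gx in range(grid[1]):
            D = descriptors[(cx == gx) & (cy == gy)]
            if len(D) > 1:  # need at least two descriptors for a covariance
                models[(gy, gx)] = (D.mean(axis=0), np.cov(D, rowvar=False))
    return models  # jointly, these single Gaussians form the image's mixture
```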

6.
We study the recognition problem for composite objects based on a probabilistic model of a piecewise-regular object with thousands of alternative classes. Using the model's asymptotic properties, we develop a new maximum-likelihood enumeration method that is optimal (in the sense of choosing the most likely reference to test at every step) in the class of “greedy” algorithms for approximate nearest neighbor search. We show experimental results for the face recognition problem on the FERET dataset and demonstrate that the proposed approach reduces decision-making time severalfold, not only compared to exhaustive search but also compared to known approximate nearest neighbor techniques.
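The following is a loose reconstruction of the greedy principle only: at each step, the next reference to test is the one whose precomputed distances to the already-tested references agree best with the query's measured distances. The paper's probabilistic model and likelihood are not reproduced here; everything in this sketch is an illustrative assumption.

```python
import numpy as np

def greedy_enumeration(dist_to_query, R, budget=50):
    """dist_to_query(i): distance from query to reference i (computed lazily);
    R: (n, n) precomputed reference-to-reference distances."""
    n = R.shape[0]
    measured, candidate, best = {}, 0, (np.inf, -1)
    for _ in range(min(budget, n)):
        d = dist_to_query(candidate)
        measured[candidate] = d
        if d < best[0]:
            best = (d, candidate)
        # pick the untested reference most consistent with measurements so far
        scores = np.full(n, np.inf)
        for j in range(n):
            if j not in measured:
                scores[j] = sum((R[j, i] - di) ** 2 for i, di in measured.items())
        candidate = int(np.argmin(scores))
    return best  # (distance, index) of the best reference found within budget
```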

7.
An efficient image classification method based on HOG-PCA   Cited by: 1 (self-citations: 0, other citations: 1)
李林, 吴跃, 叶茂. 《计算机应用研究》 (Application Research of Computers), 2013, 30(11): 3476-3479
To improve image classification performance and accuracy, an efficient image classification method based on HOG-PCA is proposed. Histogram of oriented gradients (HOG) features are first extracted and whitened, then randomly subsampled to a uniform scale; principal component analysis (PCA) is then applied for feature projection, and finally nearest-neighbour classification is performed using the minimum L2-norm criterion. In the experiments, the method was implemented in C++ on top of OpenCV and Darwin and tested on the Pascal 2012 dataset, comparing its accuracy and runtime performance with a BOW-SVM method. The experiments show that the proposed method achieves higher accuracy and better runtime performance.
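A compressed sketch of the pipeline under stated assumptions (scikit-image for HOG, scikit-learn for PCA and nearest neighbour); the whitening is folded into PCA here, and the random subsampling step is omitted.

```python
import numpy as np
from skimage.feature import hog
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier

def hog_pca_classify(train_imgs, train_labels, test_imgs, n_components=64):
    F_train = np.array([hog(im) for im in train_imgs])  # HOG features
    F_test = np.array([hog(im) for im in test_imgs])
    pca = PCA(n_components=n_components, whiten=True).fit(F_train)
    knn = KNeighborsClassifier(n_neighbors=1)  # minimum L2-norm neighbour
    knn.fit(pca.transform(F_train), train_labels)
    return knn.predict(pca.transform(F_test))
```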

8.
9.
To obtain more robust face recognition results, this paper proposes an image preprocessing method based on the local approximation gradient (LAG). The traditional gradient is calculated only along 0° and 90°; however, many other directional gradients exist in an image block. To consider more directional gradients, we introduce a novel LAG operator, computed by integrating multiple directional gradients. Because it considers more directional gradients, LAG captures more edge information for each pixel of an image and finally generates an LAG image, which yields a more robust dissimilarity measure between images. An LAG image is normalized into an augmented feature vector using the “z-score” method, and the dimensionality of the augmented feature vector is reduced by linear discriminant analysis to yield a low-dimensional feature vector. Experimental results show that the proposed method achieves more robust results than state-of-the-art methods on the AR, Extended Yale B and CMU PIE face databases.
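A hedged sketch of the multi-directional idea only: gradients are taken along several orientations, not just 0° and 90°, integrated into one response per pixel, and z-score normalized. The exact LAG weighting is not reproduced, and the orientation set is an assumption.

```python
import numpy as np
from scipy.ndimage import rotate, sobel

def multi_directional_gradient(image, angles=(0, 30, 60, 90, 120, 150)):
    acc = np.zeros_like(image, dtype=float)
    for a in angles:
        r = rotate(image, a, reshape=False, mode='nearest')
        g = sobel(r, axis=1)  # gradient along the rotated x-axis
        acc += np.abs(rotate(g, -a, reshape=False, mode='nearest'))
    lag = acc / len(angles)  # integrate the directional gradients
    return (lag - lag.mean()) / (lag.std() + 1e-8)  # "z-score" normalization
```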

10.
Automatic orientation detection for abstract paintings is harder than for natural images because of their implicit content. To improve detection accuracy, each painting is rotated counterclockwise to four angles (0°, 90°, 180°, 270°), non-rotation-invariant uniform local binary pattern (nri-uniform-LBP) descriptors are extracted from the four images as features, and orientation is detected automatically with the AdaBoost algorithm, classifying paintings as “upright” or “not upright”. Experimental results show that the method effectively improves the accuracy of automatic orientation detection for abstract paintings and offers a new perspective for research on abstract painting images.
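A minimal sketch of the feature extraction and classifier described above, assuming scikit-image and scikit-learn; for P=8 neighbours the non-rotation-invariant uniform LBP has 59 distinct codes.

```python
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.ensemble import AdaBoostClassifier

def lbp_histogram(gray, P=8, R=1):
    lbp = local_binary_pattern(gray, P, R, method='nri_uniform')
    hist, _ = np.histogram(lbp, bins=59, range=(0, 59), density=True)
    return hist

def four_rotation_features(gray):
    # concatenate LBP histograms of the 0°/90°/180°/270° rotations
    return np.concatenate([lbp_histogram(np.rot90(gray, k)) for k in range(4)])

clf = AdaBoostClassifier(n_estimators=100)  # fit on labelled paintings, e.g.:
# clf.fit(np.array([four_rotation_features(g) for g in train_imgs]), labels)
```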

11.
A kernel that exploits spatial information in the image feature space, the hierarchical log-polar matching kernel, is proposed for classifying buildings in remote-sensing images. Features are extracted from the image and mapped onto a pre-clustered “codebook”, quantizing them into a finite set of classes. The image is then partitioned, from coarse to fine, into multi-level “subregions (cells)” under log-polar coordinates. For features falling into the same level and the same subregion (cell), per-class histogram intersections are compared to build a weighted multi-scale histogram; the multi-scale histograms of the features are merged to obtain the final kernel, and “one-vs-all” support vector machines (SVM) complete the building classification. Experiments on the standard Caltech-256 database and a self-built remote-sensing image dataset demonstrate the kernel's effectiveness.
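A sketch of the kernel's core computation under stated assumptions: quantized visual words are binned into log-polar cells at several resolutions, per-cell histograms are compared by histogram intersection, and finer levels receive larger weights. The paper's exact weighting scheme is not reproduced.

```python
import numpy as np

def log_polar_cells(xy, center, n_rings, n_wedges, r_max):
    d = xy - center
    r = np.log1p(np.hypot(d[:, 0], d[:, 1]))  # log-radius
    ring = np.minimum((r / np.log1p(r_max) * n_rings).astype(int), n_rings - 1)
    wedge = ((np.arctan2(d[:, 1], d[:, 0]) + np.pi) /
             (2 * np.pi) * n_wedges).astype(int) % n_wedges
    return ring * n_wedges + wedge

def match_kernel(w1, xy1, w2, xy2, center, levels=(1, 2, 4), vocab=256, r_max=256.0):
    score = 0.0
    for lvl, L in enumerate(levels):  # coarse-to-fine subdivision
        h1, h2 = np.zeros((L * L, vocab)), np.zeros((L * L, vocab))
        for h, words, xy in ((h1, w1, xy1), (h2, w2, xy2)):
            for c, word in zip(log_polar_cells(xy, center, L, L, r_max), words):
                h[c, word] += 1
        score += (2 ** lvl) * np.minimum(h1, h2).sum()  # histogram intersection
    return score
```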

12.
The paper compares a number of different methods for accelerating and damping the modified Newton-Raphson method. For the purposes of the paper, “acceleration” is defined as a process whereby information currently available as part of the standard iterative process (although not necessarily normally stored) is used to modify the standard iterative vector. On the other hand, “damping” is defined as a process whereby, as a consequence of the violation of some tolerance check, extra computations of the out-of-balance force vector are required in order to make similar adjustments. Such “damping” is introduced via the method of “line searches”, which is much used in “unconstrained optimisation”. All the accelerations described in the paper involve single scalars that scale the standard vector given by the modified Newton-Raphson method. In some cases, the resulting iterative vector is supplemented by the addition of scaled versions of previous iterative vectors. The objective of the work is to assess and derive methods that are more effective than the basic modified Newton-Raphson procedure but less complex than the “BFGS” quasi-Newton method, which has recently become very popular. Numerical experiments are presented, involving a non-linear finite element computer program; both material and geometric non-linearities are considered.
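A minimal numerical sketch of the damped modified Newton-Raphson loop: the tangent stiffness is factorized once and reused, and each iterative vector is scaled by the single scalar step that most reduces the out-of-balance force among a few trial values (a crude stand-in for the line searches discussed in the paper).

```python
import numpy as np

def modified_nr(residual, K0, x, tol=1e-8, max_iter=50, steps=(1.0, 0.5, 0.25)):
    """residual(x): out-of-balance force vector; K0: initial tangent stiffness."""
    K0_inv = np.linalg.inv(K0)  # factorized once, as in the modified method
    for _ in range(max_iter):
        g = residual(x)
        if np.linalg.norm(g) < tol:
            break
        dx = -K0_inv @ g  # standard iterative vector
        # "damping": scale dx by the trial scalar that best reduces the residual
        s = min(steps, key=lambda t: np.linalg.norm(residual(x + t * dx)))
        x = x + s * dx
    return x
```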

13.
14.
When images are described with visual words based on vector quantization of low-level color, texture, and edge-related visual features of image regions, the result is usually referred to as a “bag-of-visual-words (BoVW)” representation. Although it has proved effective for image representation, analogous to document representation in text retrieval, the hard encoding approach based on a one-to-one mapping of regions to visual words is not expressive enough to characterize image contents with higher-level semantics and is prone to quantization error. Each word is considered independent of all other words in this model; however, words are in fact related, and their similarity of occurrence in documents can reflect underlying semantic relations between them. To account for this, a soft image representation scheme is proposed that spreads each region's membership values, through a local fuzzy membership function over a neighborhood, to all the words in a codebook generated by a self-organizing map (SOM). The topology-preserving property of the SOM map is exploited to generate the local membership function. A systematic evaluation of retrieval results of the proposed soft representation on two different image collections (natural photographic and medical) shows significant improvement in precision at different recall levels when compared to various low-level and “BoVW”-based features that consider only the probability of occurrence (or presence/absence) of a word.
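A sketch of the soft encoding, assuming the SOM has already been trained (prototype vectors plus their 2D map positions): each descriptor's membership spreads over all codebook words through a Gaussian of grid distance from the best-matching unit.

```python
import numpy as np

def soft_bovw(descriptors, codebook, grid_xy, sigma=1.0):
    """codebook: (K, d) SOM prototypes; grid_xy: (K, 2) their map positions."""
    hist = np.zeros(len(codebook))
    for desc in descriptors:
        bmu = np.argmin(np.linalg.norm(codebook - desc, axis=1))
        g = np.exp(-np.sum((grid_xy - grid_xy[bmu]) ** 2, axis=1)
                   / (2 * sigma ** 2))
        hist += g / g.sum()  # fuzzy membership instead of a hard 1-of-K vote
    return hist / max(len(descriptors), 1)
```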

15.
Simple classifiers have the advantage of better generalization capability, with the side effect of less discriminative power. It would be desirable to build a classifier that is as simple as possible while still able to classify complex patterns. In this paper, a hybrid classifier called the “constrained classifier” is presented that classifies most input patterns using a simple (for example, linear) classifier. It performs classification in four steps. In the “Dividing” step, the input patterns are divided into linearly separable and nonlinearly separable groups. Patterns in the first group are classified using a simple classifier, while the second group's patterns (named “constraints”) are modeled in the “Modeling” step. The results of the previous steps are merged in the “Combining” step, and the “Evaluation” step tests and fine-tunes the membership of patterns in the two groups. Experimental comparisons of the new classifier with well-known classifiers such as the “support vector machine”, k-NN, and “Classification and Regression Trees” are very encouraging.
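A much-simplified sketch of the four-step routing idea, with scikit-learn stand-ins: logistic regression plays the simple classifier, patterns it misclassifies become “constraints” modelled by 1-NN, and a heuristic radius routes queries between the two models (the paper's Evaluation step is not reproduced).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

class ConstrainedClassifier:
    def fit(self, X, y):
        self.linear = LogisticRegression(max_iter=1000).fit(X, y)  # Dividing
        wrong = self.linear.predict(X) != y
        self.constraints = None
        if wrong.any() and (~wrong).any():                         # Modeling
            self.constraints = KNeighborsClassifier(1).fit(X[wrong], y[wrong])
            d, _ = self.constraints.kneighbors(X[~wrong])
            self.radius = 0.5 * d[:, 0].mean()  # heuristic routing radius
        return self

    def predict(self, X):                                          # Combining
        y = self.linear.predict(X)
        if self.constraints is not None:
            d, _ = self.constraints.kneighbors(X)
            near = d[:, 0] < self.radius  # query resembles a constraint
            if near.any():
                y[near] = self.constraints.predict(X[near])
        return y
```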

16.
We have witnessed an abundance of 3D shape models in many application fields, including 3D CAD/CAM, augmented/mixed reality (AR/MR), and entertainment. Creating 3D shape models from scratch is still very expensive, so efficient and accurate shape retrieval methods are essential if 3D shape models are to be reused. To retrieve similar 3D shape models, one provides an arbitrary 3D shape as a query. Most research on 3D shape retrieval has used a “whole” shape as the query (whole-to-whole shape retrieval), whereas a “part” shape query (part-to-whole shape retrieval) is more often needed in practice, especially in mechanical engineering with 3D CAD/CAM applications; a “part” shape arises naturally when a 3D range scanner is used as the input device. In this paper, we focus on an efficient method for part-to-whole shape retrieval where the “part” shape is assumed to be given by a 3D range scanner. Specifically, we propose a Super-Vector coding feature built on SURF local features extracted from the View-Normal-Angle image (an image synthesized by taking account of the angle between the view vector and the surface normal vector), together with the depth-buffered image, for part-to-whole shape retrieval. In addition, we propose a weighted whole-to-whole re-ranking method that exploits global information based on the result of part-to-whole shape retrieval. Through experiments we demonstrate that our proposed method outperforms previous methods, with or without re-ranking.
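A sketch of the re-ranking step only: initial part-to-whole scores are blended with whole-to-whole scores computed for the top candidates. The blend weight `alpha` and candidate count are illustrative assumptions, not the paper's tuned values.

```python
import numpy as np

def rerank(part_scores, whole_score_fn, top=20, alpha=0.5):
    """part_scores: (n,) part-to-whole similarities;
    whole_score_fn(i): whole-to-whole similarity for model i."""
    order = np.argsort(-part_scores)
    final = part_scores.astype(float).copy()
    for i in order[:top]:  # refine only the top part-to-whole candidates
        final[i] = (1 - alpha) * part_scores[i] + alpha * whole_score_fn(i)
    return np.argsort(-final)  # re-ranked retrieval list
```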

17.
《Information Fusion》2008,9(2):186-199
A natural color mapping method has been previously proposed that matches the statistical properties (mean and standard deviation) of night-vision (NV) imagery to those of a daylight color image (manually selected as the “target” color distribution), so that the rendered NV image resembles the natural target image in color appearance. However, in this prior method (termed “global-coloring”), the colored NV image may appear unnatural if the target image's “global” color statistics differ from those of the night-vision scene (e.g., it would appear too green if the target image contained much more vegetation). Consequently, a new “local-coloring” method is presented that renders the NV image segment by segment by taking advantage of image segmentation, pattern recognition, histogram matching and image fusion. Specifically, a false-color image (the source image) is formed by assigning multi-band NV images to the three RGB (red, green and blue) channels. A nonlinear diffusion filter is then applied to the false-colored image to reduce the number of colors, and the final grayscale segments are obtained using clustering and merging techniques. With a supervised nearest-neighbor paradigm, a segment can be automatically associated with a known “color scheme”. The statistic-matching procedure is merged with the histogram-matching procedure to enhance the color mapping effect. Instead of extracting the color set from a single target image, the mean, standard deviation and histogram distribution of the color planes from a set of natural scene images are used as the target color properties for each color scheme; the target color schemes are grouped by their scene contents and colors, such as plants, mountains, roads, sky and water. In our experiments, five pairs of night-vision images were analyzed, and the images colored (segment by segment) by the proposed “local-coloring” method are shown to possess much more natural and realistic coloration than those produced by the previous “global-coloring” method.
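A minimal sketch of the statistic-matching ingredient: each channel of the false-coloured source is shifted and scaled so its mean and standard deviation match the target colour scheme. Segmentation, histogram matching, and fusion from the full method are omitted.

```python
import numpy as np

def statistic_match(source, target_mean, target_std):
    """source: (H, W, 3) float image in [0, 1]; targets: per-channel stats."""
    out = np.empty_like(source, dtype=float)
    for c in range(3):
        ch = source[..., c]
        out[..., c] = (ch - ch.mean()) / (ch.std() + 1e-8) \
            * target_std[c] + target_mean[c]
    return np.clip(out, 0.0, 1.0)
```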

18.
19.
We propose a recovery approach for highly subsampled dynamic parallel MRI images without auto-calibration signals (ACSs) or prior knowledge of coil sensitivity maps. By exploiting the between-frame redundancy of dynamic parallel MRI data, we first introduce a new low-rank matrix-recovery-based model, termed calibration using a spatial-temporal matrix (CUSTOM), for ACS recovery. The ACSs recovered from the data are used to estimate coil sensitivity maps and, further, to reconstruct the dynamic images. The non-convex and non-smooth minimization in the CUSTOM step is solved by a proximal alternating linearized minimization method, and we provide a convergence result for this specific minimization problem. Numerical experiments on several highly subsampled test datasets demonstrate that the proposed overall approach outperforms other state-of-the-art methods for calibrationless dynamic parallel MRI reconstruction.
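For the low-rank ingredient only, here is a hedged sketch of matrix completion by iterative singular-value thresholding on a space-time matrix whose observed entries are given by a sampling mask; the paper's actual CUSTOM model and its proximal alternating linearized minimization solver are more involved.

```python
import numpy as np

def svt_complete(M, mask, tau=1.0, n_iter=100):
    """M: observed matrix (zeros where unsampled); mask: boolean sampling set."""
    X = np.zeros_like(M)
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(X + mask * (M - X), full_matrices=False)
        s = np.maximum(s - tau, 0.0)  # shrink singular values toward low rank
        X = (U * s) @ Vt
    return X
```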

20.
We evaluated the use of the “DeNitrification DeComposition” (DNDC) model for estimating variation in N2O emissions from agriculture over large geographic areas, using maize production in the United States as an example. We address practical and methodological issues, including how much model spin-up time is necessary; whether to run the model in “site” or “regional” mode; the importance of considering crop rotations; and the contribution of “background emissions” (emissions that occur when no crop is grown). We conclude that: 1) a spin-up time of 5–10 years may be sufficient to achieve steady-state N2O fluxes in many cases; 2) results between “site” and “regional” mode can differ greatly; 3) “site” mode is preferable to “regional” mode because its use is more transparent and flexible; 4) crop rotations have a modest but observable effect on modeled emissions from US maize; and 5) background emissions, while generally low, can be very high in certain locations and should therefore be considered.
