Similar literature
 Found 20 similar articles; search took 62 ms
1.
Maximum confidence hidden Markov modeling for face recognition   Total citations: 1 (self-citations: 0, citations by others: 1)
This paper presents a hybrid framework of feature extraction and hidden Markov modeling (HMM) for two-dimensional pattern recognition. We explore a new discriminative training criterion to ensure model compactness and discriminability. This criterion is derived from hypothesis test theory by maximizing the confidence of accepting the hypothesis that observations come from target HMM states rather than competing HMM states. Accordingly, we develop maximum confidence hidden Markov modeling (MC-HMM) for face recognition. Under this framework, we incorporate a transformation matrix to extract discriminative facial features. Closed-form solutions for the continuous-density HMM parameters are formulated. The hybrid MC-HMM parameters are all estimated under the same criterion and converge through the expectation-maximization procedure. In experiments on the FERET and GTFD facial databases, the proposed method obtains robust segmentation in the presence of different facial expressions, orientations, etc. Compared with maximum likelihood and minimum classification error HMMs, the proposed MC-HMM achieves higher recognition accuracy with lower feature dimensions.

2.
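To address the difficulty of effectively learning the discriminative objects in an image in fine-grained image classification, this paper proposes a weakly supervised fine-grained image classification algorithm based on an attention mechanism. The algorithm can effectively locate and recognize semantically sensitive features in fine-grained images. First, on top of a classic convolutional neural network, features are linearly fused to obtain a representation of the object's overall information. Then a visual attention mechanism further extracts the discriminative details in the features, yielding a more complete fine-grained feature representation. The proposed algorithm combines linear fusion with the attention mechanism and can be viewed as a network model in which multiple network branches are trained cooperatively and optimized jointly, so that the model represents both global and local information better. The method was validated on three publicly available fine-grained recognition datasets. The experimental results show that it outperforms the baseline methods and reaches the current state-of-the-art classification level.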

3.
4.
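To address the low segmentation accuracy that real-time semantic segmentation methods suffer when they ignore the essence of the task, a multi-level context-guided lightweight network is proposed. First, depthwise separable convolution and asymmetric convolution are combined to design a context-guided model based on parallel asymmetric convolution (CGPA), which learns a joint feature composed of local features and their surrounding context. Second, this model is stacked in the network to optimize features at multiple levels. Finally, a channel attention model (CAM) selects shallow features that are semantically consistent with higher stages, which improves the segmentation result. Experimental results show that the proposed network achieves a mean intersection over union (mIoU) of 72.4% at a frame rate of 94.7 on the Cityscapes dataset, and a significant performance improvement on the CamVid dataset. Compared with other current real-time semantic segmentation methods, the network performs better.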
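The mIoU figure quoted above is normally computed from a per-class confusion matrix. The following is a minimal sketch of that metric only, not of the network itself; the function name `mean_iou` and the toy matrix are illustrative assumptions.

```python
def mean_iou(confusion):
    """Mean intersection over union from a square confusion matrix.

    confusion[i][j] counts pixels whose true class is i and whose
    predicted class is j. (Layout is an assumption of this sketch.)
    """
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                        # true c, predicted other
        fp = sum(confusion[r][c] for r in range(n)) - tp   # predicted c, true other
        union = tp + fp + fn
        if union > 0:  # skip classes absent from both truth and prediction
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy two-class example (hypothetical pixel counts).
toy = [[50, 10],
       [5, 35]]
print(mean_iou(toy))
```

Classes that never occur in either the ground truth or the prediction are skipped here; other conventions exist, so reported numbers can differ slightly between toolkits.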

5.
6.
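To address the weakness of phase congruency features in detecting vessel centers, a fundus retinal vessel segmentation algorithm based on a fused phase feature is proposed. First, the original retinal image is preprocessed. Then a 4D feature vector is constructed for each pixel, comprising the Hessian matrix, the Gabor transform, the bar-selective combination of shifted filter responses (B-COSFIRE) filter, and the phase feature. Finally, a support vector machine (SVM) classifies the pixels to segment the fundus retinal vessels. The phase feature is a new fused phase feature obtained by wavelet fusion of the separately extracted phase congruency feature and Hessian matrix feature. It retains the good vessel-edge information of the phase congruency feature while overcoming that feature's weakness in detecting vessel centers. On the Digital Retinal Images for Vessel Extraction (DRIVE) database, the algorithm achieves an average accuracy (Acc) of 0.9574 and an average area under the receiver operating characteristic curve (AUC) of 0.9702. In experiments that extract vessels by pixel classification with a single feature, using the fused phase feature instead of the phase congruency feature raises Acc from 0.9191 to 0.9478 and AUC from 0.9359 to 0.9578. The results show that the fused phase feature is better suited than the phase congruency feature to pixel-classification-based fundus retinal vessel segmentation.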
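The two reported metrics, Acc and AUC, can be illustrated on per-pixel classifier scores. This is a minimal sketch of the metrics under assumed toy data; the function names, the 0.5 threshold and the scores are hypothetical and not taken from the paper.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of pixels whose thresholded score matches the label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def roc_auc(labels, scores):
    """AUC as the probability that a random vessel pixel outscores a
    random background pixel (ties count one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = vessel) and classifier scores for six pixels.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.8, 0.2]
print(accuracy(labels, scores), roc_auc(labels, scores))
```

The pairwise AUC loop is quadratic in the number of pixels, which is fine for illustration; a rank-based computation would be preferable on full images.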

7.
8.
9.
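When convolutional networks are used for feature extraction in image semantic segmentation, the repeated combination of max pooling and downsampling reduces feature resolution, which causes loss of context information and makes the segmentation result insensitive to object location. Although encoder-decoder networks gradually refine the output while recovering resolution through skip connections, their simple summation of adjacent features ignores the differences between features and easily leads to problems such as local misidentification of objects. This paper therefore proposes an image semantic segmentation method based on deep feature fusion. The method uses a network structure of several fully convolutional VGG16 models combined in parallel, together with dilated convolution, to process the multi-scale images of a pyramid efficiently in parallel. It extracts context features at multiple levels and fuses them layer by layer in a top-down manner to capture as much context information as possible. In addition, supported by a layer-wise label supervision strategy obtained by improving the loss function, and combined with a fully connected conditional random field for back-end pixel modeling, the method improves to some degree both ease of training and prediction accuracy. Experiments show that fusing, layer by layer, the deep features that represent context at different scales improves both the classification of target objects and the localization of spatial detail. On the PASCAL VOC 2012 and PASCAL CONTEXT datasets, the proposed method achieves mIoU of 80.5% and 45.93%, respectively. The results indicate that deep feature extraction in the parallel framework, layer-wise feature fusion, and layer-wise label supervision jointly optimize the architecture. Feature comparisons show that the model captures rich context information and produces finer semantic image features, giving it a clear advantage over comparable methods.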
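Dilated convolution, mentioned above, widens the receptive field without pooling by spacing the kernel taps apart. The sketch below shows the idea in one dimension only; `dilated_conv1d` is a hypothetical helper, uses no padding, and is not the paper's network.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid (unpadded) 1D cross-correlation with dilated kernel taps.

    With dilation d, tap j reads signal[i + j * d], so a kernel of
    length k covers (k - 1) * d + 1 input samples.
    """
    span = (len(kernel) - 1) * dilation
    return [
        sum(k * signal[i + j * dilation] for j, k in enumerate(kernel))
        for i in range(len(signal) - span)
    ]


signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 0, -1]
print(dilated_conv1d(signal, kernel, dilation=1))  # each tap pair is 2 samples apart
print(dilated_conv1d(signal, kernel, dilation=2))  # each tap pair is 4 samples apart
```

The same three weights cover three input samples at dilation 1 and five at dilation 2, which is why dilation can capture wider context at no extra parameter cost.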

10.
Face attribute classification (FAC) is a high-profile problem in biometric verification and face retrieval. Although recent research has been devoted to extracting more delicate image attribute features and exploiting inter-attribute correlations, significant challenges remain. The wavelet scattering transform (WST) is a promising non-learned feature extractor. It has been shown to yield more discriminative representations and to outperform learned representations in certain tasks. Applied to image classification, WST can enhance subtle image texture information and provide stability to local deformations. This paper designs a scattering-based hybrid block, termed the WS-SE block, that incorporates frequency-domain (WST) and image-domain features through channel attention (Squeeze-and-Excitation, SE). Compared with a CNN, WS-SE achieves more efficient FAC performance and compensates for the model's sensitivity to small-scale affine transforms. In addition, to further exploit the relationships among the attribute labels, we propose a learning strategy from a causal view. The cause attributes, defined using causality-related information, can be used to infer the effect attributes with a high confidence level. Ablation experiments demonstrate the effectiveness of our model, and our hybrid model obtains state-of-the-art results on two public datasets.
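The channel attention that the WS-SE block relies on follows the Squeeze-and-Excitation pattern: pool each channel to a scalar, pass the scalars through a small bottleneck, and rescale the channels by the resulting gates. The following is a generic SE-style sketch in plain Python; it omits biases and the WST branch, and the weight shapes, names and values are assumptions for illustration.

```python
import math


def se_gate(feature_maps, w1, w2):
    """Squeeze-and-Excitation style channel gating (generic sketch).

    feature_maps: list of C channels, each an H x W nested list.
    w1: bottleneck weights, shape (C // r) x C (hypothetical values).
    w2: expansion weights, shape C x (C // r) (hypothetical values).
    """
    # Squeeze: global average pooling gives one scalar per channel.
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
         for ch in feature_maps]
    # Excitation: linear -> ReLU -> linear -> sigmoid.
    hidden = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w2]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [[[g * v for v in row] for row in ch]
              for g, ch in zip(gates, feature_maps)]
    return gates, scaled


# Two constant 2x2 channels and a one-unit bottleneck.
maps = [[[1.0, 1.0], [1.0, 1.0]],
        [[2.0, 2.0], [2.0, 2.0]]]
gates, scaled = se_gate(maps, w1=[[1.0, 1.0]], w2=[[0.0], [1.0]])
print(gates)
```

In a trained model `w1` and `w2` are learned; here they are fixed so the gating arithmetic can be followed by hand.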

11.
Using high-spatial-resolution multispectral imagery alone is insufficient for achieving highly accurate and reliable thematic mapping of urban areas. Integrating lidar-derived elevation information into image classification can considerably improve classification results. Additionally, traditional pixel-based classifiers have limitations for certain landscape and data types. In this study, we take advantage of current advances in object-based image analysis and machine learning algorithms to reduce manual image interpretation and automate feature selection in the classification process. A sequence of image segmentation, feature selection, and object classification is developed and tested on data sets from two study areas (Mannheim, Germany and Niagara Falls, Canada). First, to improve the quality of segmentation, a range image of lidar data is incorporated into the image segmentation process. The random forest, a robust ensemble classifier, is then used to identify the best features among those derived from lidar data and aerial imagery by iterative feature elimination. Provided that the number of samples is at least two or three times the number of features, the segmentation scale factor has no particular effect on the selected features or classification accuracies. The results for the two study areas demonstrate that the presented object-based classification method, compared with pixel-based classification, improves the kappa statistic by 0.02 and 0.05 and overall accuracy by 3.9% and 4.5%, respectively.
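Both quantities compared above, overall accuracy and the kappa statistic, come from the same confusion matrix. A minimal sketch follows; the function name and the toy counts are illustrative assumptions, and the formula is the standard Cohen's kappa.

```python
def overall_accuracy_and_kappa(confusion):
    """Overall accuracy and Cohen's kappa from a square confusion matrix.

    confusion[i][j] counts samples of reference class i assigned to
    class j. (Layout is an assumption of this sketch.)
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(n)) / total
    # Chance agreement: product of row and column marginals per class.
    expected = sum(
        sum(confusion[i]) * sum(confusion[r][i] for r in range(n))
        for i in range(n)
    ) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Toy two-class map accuracy assessment (hypothetical counts).
toy = [[45, 5],
       [10, 40]]
print(overall_accuracy_and_kappa(toy))
```

Kappa discounts the agreement expected by chance, so it is always at most the overall accuracy and is the more conservative of the two figures.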

12.
13.
14.
This paper shows (i) improvements over state-of-the-art local feature recognition systems, (ii) how to formulate principled models for automatic local feature selection in object class recognition when there is little supervised data, and (iii) how to formulate sensible spatial image context models using a conditional random field for integrating local features and segmentation cues (superpixels). By adopting sparse kernel methods, Bayesian learning techniques and data association with constraints, the proposed model identifies the most relevant sets of local features for recognizing object classes, achieves performance comparable to the fully supervised setting, and obtains excellent results for image classification.

15.
16.
17.
18.
Image classification usually requires complicated segmentation to separate foreground objects from the background scene. However, the statistical content of a background scene can actually provide very useful information for classification. In this paper, we propose a new hybrid pyramid kernel which incorporates local features extracted from both dense regular grids and interest points for image classification, without requiring segmentation. Features extracted from dense regular grids can better capture information about the background scene, while interest points detected at corners and edges can better capture information about the salient objects. In our algorithm, these two local features are combined in both the spatial and the feature-space domains, and are organized into pyramid representations. To obtain better classification accuracy, we fine-tune the parameters involved in the similarity measure, and we determine discriminative regions by means of relevance feedback. From the experimental results, we observe that our algorithm achieves a 6.37% increase in performance compared with other pyramid-representation-based methods. To evaluate the applicability of the proposed hybrid kernel to large-scale databases, we have performed a cross-dataset experiment and investigated the effect of foreground/background features on each of the kernels. In particular, the proposed hybrid kernel has been proven to satisfy Mercer's condition and is efficient in measuring the similarity between image features. For instance, the computational complexity of the proposed hybrid kernel is proportional to the number of features.
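The abstract does not specify the hybrid kernel itself, so the sketch below shows only the family it belongs to: a standard spatial pyramid match built on histogram intersection, with coarse levels weighted less than fine ones. The feature format `(x, y, word)`, the function names and the toy data are assumptions for illustration.

```python
from collections import Counter


def level_histogram(features, level):
    """Count features per (grid cell, visual word) on a 2^level x 2^level grid.

    Each feature is (x, y, word) with x, y in [0, 1]. (Assumed format.)
    """
    cells = 2 ** level
    hist = Counter()
    for x, y, word in features:
        cx = min(int(x * cells), cells - 1)
        cy = min(int(y * cells), cells - 1)
        hist[(cx, cy, word)] += 1
    return hist


def intersection(h1, h2):
    """Histogram intersection: sum of bin-wise minima."""
    return sum(min(count, h2[key]) for key, count in h1.items())


def pyramid_match(feats_a, feats_b, levels=2):
    """Weighted sum of intersections; finer levels get larger weights."""
    total = 0.0
    for level in range(levels + 1):
        if level == 0:
            weight = 1 / 2 ** levels
        else:
            weight = 1 / 2 ** (levels - level + 1)
        total += weight * intersection(level_histogram(feats_a, level),
                                       level_histogram(feats_b, level))
    return total


# Two toy images with two quantized local features each (hypothetical).
a = [(0.1, 0.1, 0), (0.9, 0.9, 1)]
b = [(0.2, 0.2, 0), (0.6, 0.6, 1)]
print(pyramid_match(a, b))
```

Each histogram is built in one pass over the features, so the cost per level grows linearly with the number of features, in line with the complexity claim in the abstract.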

19.
This paper presents the results of handwritten digit recognition on well-known image databases using state-of-the-art feature extraction and classification techniques. The tested databases are CENPARMI, CEDAR, and MNIST. On the test data set of each database, 80 recognition accuracies are given by combining eight classifiers with ten feature vectors. The features include the chaincode feature, gradient feature, profile structure feature, and peripheral direction contributivity. The gradient feature is extracted from either the binary image or the gray-scale image. The classifiers include the k-nearest neighbor classifier, three neural classifiers, a learning vector quantization classifier, a discriminative learning quadratic discriminant function (DLQDF) classifier, and two support vector classifiers (SVCs). All the classifiers and feature vectors give high recognition accuracies. In relative terms, the chaincode feature and the gradient feature show an advantage over the other features, and the profile structure feature is effective as a complementary feature. The SVC with RBF kernel (SVC-rbf) gives the highest accuracy in most cases but is extremely expensive in storage and computation. Among the non-SV classifiers, the polynomial classifier and DLQDF give the highest accuracies. The results of the non-SV classifiers are competitive with the best ones previously reported on the same databases.
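Of the eight classifiers listed, the k-nearest neighbor classifier is the simplest to show in full. The following is a minimal sketch on made-up 2D feature vectors; the function name, the choice of Euclidean distance, `k=3` and the data are assumptions, and real digit features would have far more dimensions.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Label a query by majority vote among its k nearest training samples."""
    nearest = sorted(
        zip(train_x, train_y),
        key=lambda pair: math.dist(pair[0], query),  # Euclidean distance
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical 2D feature vectors for two digit classes.
train_x = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = [0, 0, 0, 1, 1, 1]
print(knn_predict(train_x, train_y, (0.2, 0.1)))
print(knn_predict(train_x, train_y, (5.5, 5.5)))
```

Because every training sample is kept and compared at prediction time, storage and query cost grow with the size of the training set.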

20.
Fine-grained image classification aims at subdividing large coarse-grained categories into finer-grained subcategories. Most existing fine-grained methods use a single attention mechanism or multiple sub-networks to zoom in and find distinguishable local feature regions. These models seldom explore the intrinsic connections between cross-layer features with similar semantics, and as a result they tend to perform erratically on images with complex backgrounds. To this end, we first propose a feature-semantic fusion module to enhance the diversity of global feature information. Second, we employ cross-layer spatial attention and channel attention modules, which can accurately locate local key regions of images. Finally, we propose a cross-gate attention module that can find rich discriminative features from key object regions of images to guide the final classification. Experiments show that the proposed model performs well on three datasets: CUB-200-2011, Stanford Cars, and FGVC-Aircraft.
