Similar Documents (20 results)
1.
Face image analysis provides several important cues in computer vision. Over the last five decades, face analysis methods have received immense attention due to large-scale applications across many face analysis tasks, and face parsing strongly benefits such tasks, including face pose estimation. In this paper we propose a 3D head pose estimation framework built on a prior end-to-end deep face parsing model. We develop an end-to-end face-part segmentation framework using deep convolutional neural networks (DCNNs): to train the face parsing model, we label face images with seven classes, namely eyes, brows, nose, hair, mouth, skin, and background. Features are extracted from grayscale images using DCNNs and used to train a classifier, and a probabilistic classification method then produces grayscale probability maps for each dense semantic class. A second stage of DCNNs extracts features from the probability maps generated during the segmentation phase. We assess the proposed model on four standard head pose datasets, Pointing'04, Annotated Facial Landmarks in the Wild (AFLW), Boston University (BU), and ICT-3DHP, obtaining superior results compared with previous work.
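The per-class probability maps described above can be illustrated with a toy per-pixel softmax over class scores. This is a minimal sketch, assuming hypothetical raw scores rather than the paper's trained DCNN outputs:

```python
import math

CLASSES = ["eyes", "brows", "nose", "hair", "mouth", "skin", "background"]

def softmax(scores):
    """Convert raw per-class scores into probabilities summing to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def probability_maps(score_maps):
    """score_maps: H x W x C raw scores -> C grayscale maps of per-pixel probabilities."""
    h, w = len(score_maps), len(score_maps[0])
    maps = [[[0.0] * w for _ in range(h)] for _ in CLASSES]
    for i in range(h):
        for j in range(w):
            probs = softmax(score_maps[i][j])
            for c, p in enumerate(probs):
                maps[c][i][j] = p
    return maps

# A 1x1 "image" with 7 class scores; the "eyes" class has the largest score.
maps = probability_maps([[[2.0, 0.1, 0.1, 0.1, 0.1, 1.0, 0.1]]])
```

Each of the seven maps can then be saved as a grayscale image and fed to the second-stage DCNN.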

2.
We propose an effective facial feature localization method that combines skin-color segmentation with a symmetry transform on the grayscale image. To reduce the influence of background and face pose, Blob analysis and ellipse fitting are applied on top of conventional skin-color segmentation to identify and discard invalid regions, and the candidate face region is coarsely rectified to an upright orientation. The lips are then segmented and detected directly from chrominance information within the candidate face region. Next, a symmetry transform on the grayscale image detects candidate eye positions. Based on the positions of the lips and the candidate eyes, a binocular matching cost function grounded in facial geometric-model knowledge is proposed, and combinatorial optimization is used to detect the true eye pair. Finally, the remaining facial organs are precisely localized. Experimental results show that the method is robust to hair occlusion, large tilt angles, and interference from glasses.
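The eye-pair selection step can be sketched as a small combinatorial optimization: score every candidate pair against a geometric cost and keep the minimum. The cost terms and weights below are illustrative assumptions, not the paper's actual cost function:

```python
from itertools import combinations

def pair_cost(eye_a, eye_b, mouth, expected_eye_dist=60.0):
    """Lower cost = more face-like geometry (hypothetical terms and weights)."""
    (xa, ya), (xb, yb), (xm, ym) = eye_a, eye_b, mouth
    mid_x = (xa + xb) / 2.0
    dist = ((xa - xb) ** 2 + (ya - yb) ** 2) ** 0.5
    symmetry = abs(mid_x - xm)              # eyes should be centered over the mouth
    level = abs(ya - yb)                    # eyes should be roughly level
    scale = abs(dist - expected_eye_dist)   # plausible inter-eye distance
    above = max(0.0, (ya + yb) / 2.0 - ym)  # eyes should lie above the mouth (y grows downward)
    return symmetry + level + scale + 10.0 * above

def best_eye_pair(candidates, mouth):
    """Exhaustively test all candidate pairs and return the geometrically best one."""
    return min(combinations(candidates, 2),
               key=lambda p: pair_cost(p[0], p[1], mouth))

best = best_eye_pair([(70, 100), (130, 100), (100, 160)], mouth=(100, 200))
```

With few candidate eye positions, exhaustive pairwise search is cheap and avoids missing the global optimum.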

3.
Dietary assessment has become an emerging approach for evaluating a person's food intake. In this paper, multiple-hypothesis image segmentation and a feed-forward neural network classifier are proposed to enhance dietary assessment performance. Initially, segmentation is applied to the input image to determine the regions where a particular food item is located, using salient region detection, multi-scale segmentation, and fast rejection. Then, the significant features of the food items are extracted using global and local feature extraction methods. Once the features are obtained, classification is performed for each segmented region using a feed-forward neural network model. Finally, the calorie value is computed with the aid of (i) food area and volume and (ii) a calorie and nutrition measure based on mass. The proposed method attains an accuracy of 96%, demonstrating better classification performance.
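The final calorie computation (area and volume to mass to calories) can be sketched as below. The density and calorie-per-gram values are hypothetical lookup entries for illustration, not the paper's tables:

```python
# Hypothetical per-gram calorie and density tables (illustrative values only).
CAL_PER_GRAM = {"rice": 1.3, "apple": 0.52}
DENSITY_G_PER_CM3 = {"rice": 0.9, "apple": 0.8}

def calorie_estimate(item, area_cm2, depth_cm):
    """Calories = (area x depth) volume -> mass via density -> calories per gram."""
    volume_cm3 = area_cm2 * depth_cm
    mass_g = volume_cm3 * DENSITY_G_PER_CM3[item]
    return mass_g * CAL_PER_GRAM[item]

# 10 cm^2 of rice at 2 cm depth -> 20 cm^3 -> 18 g -> 23.4 kcal
kcal = calorie_estimate("rice", 10.0, 2.0)
```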

4.
Journal of the Chinese Institute of Engineers, 2012, 35(5): 529-534
Faces are highly deformable objects that can easily change their appearance over time, and not all face areas are subject to the same variability. Decoupling the information from independent areas of the face is therefore of paramount importance for improving the robustness of any face recognition technique. This article presents a robust face recognition technique based on the extraction and matching of probabilistic graphs drawn on scale invariant feature transform (SIFT) features from independent face areas. The face matching strategy matches individual salient facial graphs characterized by SIFT features connected to facial landmarks such as the eyes and the mouth. To reduce face matching errors, Dempster–Shafer decision theory is applied to fuse the individual matching scores obtained from each pair of salient facial features. The proposed algorithm is evaluated on the Olivetti Research Lab (ORL) and the Indian Institute of Technology Kanpur (IITK) face databases. The experimental results demonstrate the effectiveness and potential of the proposed technique, even for partially occluded faces.
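The score-fusion step uses Dempster's rule of combination. A minimal sketch of the rule for two mass functions over frozenset focal elements (the mass values below are made up for illustration):

```python
def dempster_combine(m1, m2):
    """Dempster's rule: combine two mass functions, renormalizing away conflict.

    m1, m2: dicts mapping frozenset focal elements to mass values summing to 1.
    """
    combined = {}
    conflict = 0.0
    for a, ma in m1.items():
        for b, mb in m2.items():
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + ma * mb
            else:
                conflict += ma * mb  # mass assigned to the empty set
    if conflict >= 1.0:
        raise ValueError("total conflict: sources are incompatible")
    k = 1.0 - conflict
    return {s: v / k for s, v in combined.items()}

MATCH = frozenset({"match"})
UNKNOWN = frozenset({"match", "nonmatch"})  # ignorance: either outcome
# Two facial-feature matchers, each giving partial belief in a match.
fused = dempster_combine({MATCH: 0.7, UNKNOWN: 0.3},
                         {MATCH: 0.6, UNKNOWN: 0.4})
```

Fusing the two sources raises the belief in "match" above either source alone, which is the behavior the paper relies on when combining per-feature matching scores.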

5.
Tian Zhuo, She Qingshan, Gan Haitao, Meng Ming. Acta Metrologica Sinica, 2019, 40(4): 576-582
To improve the recognition of facial information against complex backgrounds, a deep convolutional neural network (DCNN) method is proposed that jointly addresses facial landmark localization and pose estimation. First, face information is detected in the video frames. Next, a deep convolutional network model is designed that optimizes the landmark localization and pose estimation tasks jointly, simultaneously regressing facial landmark coordinates and pose angles, which are then fused to generate the corresponding human-computer interaction information. Finally, the method is tested on public datasets and real-world data and compared against existing methods. Experimental results show good performance on both facial landmark localization and pose estimation, with good accuracy and robustness in human-computer interaction applications under illumination changes, expression changes, and partial occlusion, at an average processing speed of about 16 frames/s, indicating practical applicability.
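The joint optimization of the two tasks amounts to minimizing a combined objective. A minimal sketch of such a multi-task loss, assuming a simple weighted sum of per-task mean squared errors (the paper's exact loss and weighting are not specified here):

```python
def multitask_loss(pred_landmarks, true_landmarks,
                   pred_pose, true_pose, pose_weight=0.5):
    """Joint objective: landmark-regression MSE plus weighted pose (yaw/pitch/roll) MSE.

    pose_weight is a hypothetical task-balancing coefficient.
    """
    lm = sum((p - t) ** 2 for p, t in zip(pred_landmarks, true_landmarks)) \
        / len(true_landmarks)
    pose = sum((p - t) ** 2 for p, t in zip(pred_pose, true_pose)) / 3.0
    return lm + pose_weight * pose
```

Sharing a backbone while minimizing this combined loss is what lets each task regularize the other.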

6.
Facial expression recognition has been a hot topic for decades, but high intraclass variation makes it challenging. To overcome intraclass variation in visual recognition, we introduce a novel fusion methodology in which the proposed model first extracts features and then fuses them. Specifically, ResNet-50, VGG-19, and Inception-V3 are used for feature learning, followed by feature fusion; the outputs of the three feature extractors are then combined using ensemble learning techniques for the final expression classification. The representation learnt by the proposed methodology is robust to occlusions and pose variations and offers promising accuracy. To evaluate the efficiency of the proposed model, we use two in-the-wild benchmark datasets for facial expression recognition, the Real-world Affective Faces Database (RAF-DB) and AffectNet. The proposed model classifies emotions into seven categories: happiness, anger, fear, disgust, sadness, surprise, and neutral. Furthermore, the performance of the proposed model is compared with other algorithms in terms of computational cost, convergence, and accuracy on a standard classification problem.
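One common ensemble technique for combining several backbone classifiers is soft voting: average the per-class probabilities and take the argmax. A minimal sketch, assuming soft voting (the paper does not specify which ensemble scheme it uses) with made-up model outputs:

```python
EMOTIONS = ["happiness", "anger", "fear", "disgust", "sadness", "surprise", "neutral"]

def soft_vote(prob_lists):
    """Average class-probability vectors from several models; return (argmax, average)."""
    n = len(prob_lists)
    n_classes = len(prob_lists[0])
    avg = [sum(p[i] for p in prob_lists) / n for i in range(n_classes)]
    return max(range(n_classes), key=avg.__getitem__), avg

# Three hypothetical backbone outputs (e.g. ResNet-50, VGG-19, Inception-V3).
idx, avg = soft_vote([
    [0.60, 0.10, 0.10, 0.05, 0.05, 0.05, 0.05],
    [0.50, 0.20, 0.10, 0.05, 0.05, 0.05, 0.05],
    [0.40, 0.10, 0.10, 0.10, 0.10, 0.10, 0.10],
])
```

Averaging smooths out disagreements between the backbones, which is what gives the ensemble its robustness to individual-model errors.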

7.
A 3-D anthropometric-muscle-based active appearance model
This paper describes a novel method for modeling the shape and appearance of human faces in three dimensions using a constrained three-dimensional (3-D) active appearance model (AAM). Our algorithm extends the classical two-dimensional (2-D) AAM. The method uses a generic 3-D wireframe model of the face based on two sets of controls: anatomically motivated muscle actuators to model facial expressions, and statistically based anthropometric controls to model different facial types. This 3-D anthropometric-muscle-based model (AMBM) of the face represents a facial image in terms of a controlled model-parameter set, providing a natural and constrained basis for face segmentation and analysis. The generated face models are consequently simpler and less memory-intensive than classical appearance-based models. The proposed method produces accurate fitting results by constraining solutions to be valid instances of a face model. Extensive image-segmentation experiments have demonstrated the accuracy of the proposed algorithm against the classical AAM.
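The statistical control set can be pictured as a generic linear shape model: a mean shape deformed by weighted modes. This is an AAM-style sketch under that assumption, not the paper's exact muscle/anthropometric parameterization:

```python
def synthesize_shape(mean_shape, modes, params):
    """Linear statistical shape model: shape = mean + sum_i(b_i * mode_i).

    mean_shape: flat list of vertex coordinates; modes: list of same-length
    deformation vectors; params: one weight b_i per mode.
    """
    shape = list(mean_shape)
    for b, mode in zip(params, modes):
        shape = [s + b * m for s, m in zip(shape, mode)]
    return shape

# Toy 1-vertex (x, y) model with two axis-aligned deformation modes.
shape = synthesize_shape([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], [2.0, 3.0])
```

Constraining the weights to a valid range is what keeps every synthesized instance a plausible face, which is the constraint the fitting step exploits.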

8.
This paper proposes multiple-hypothesis image segmentation and a feed-forward neural network classifier for food recognition to improve performance. Initially, a food or meal image is given as input. Segmentation is then applied to identify the regions where a particular food item is located, using salient region detection, multi-scale segmentation, and fast rejection. The features of every food item are then extracted using global and local feature extraction. Once the features are obtained, classification is performed for each segmented region using a feed-forward neural network model. Finally, the calorie value is computed with the aid of (i) food volume and (ii) a calorie and nutrition measure based on mass. The experimental results are validated: the proposed method attains a Macro Average Accuracy (MAA) of 0.947 and a Standard Accuracy (SA) of 0.959, providing better classification performance.

9.
To improve face recognition accuracy, we present a simple near-infrared (NIR) and visible light (VL) image fusion algorithm based on two-dimensional linear discriminant analysis (2DLDA). We first use two 2DLDA schemes to extract two classes of face discriminant features from the NIR and VL images separately. The two classes of features for each kind of image are then fused using the matching-score fusion method. Finally, a simple NIR and VL image fusion approach combines the scores of the NIR and VL images to obtain the classification result. The experimental results show that the proposed NIR and VL image fusion approach can effectively improve face recognition accuracy.
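Matching-score fusion at its simplest is a weighted sum of per-identity scores from the two modalities, followed by picking the best identity. A minimal sketch assuming equal weights (the paper's fusion weights are not specified here):

```python
def fuse_scores(nir_scores, vl_scores, w=0.5):
    """Weighted-sum score-level fusion per gallery identity (hypothetical weight w)."""
    return {k: w * nir_scores[k] + (1 - w) * vl_scores[k] for k in nir_scores}

def classify(nir_scores, vl_scores, w=0.5):
    """Return the identity with the highest fused matching score."""
    fused = fuse_scores(nir_scores, vl_scores, w)
    return max(fused, key=fused.get)

# Hypothetical matching scores against two gallery identities "a" and "b".
result = classify({"a": 0.9, "b": 0.4}, {"a": 0.4, "b": 0.8})
```

Score-level fusion requires only that both matchers output comparable (e.g. normalized) scores, which makes it a lightweight way to combine modalities.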

10.
Histopathological whole-slide image (WSI) analysis is still one of the most important ways to identify regions of cancer risk. For cancers in which early diagnosis is vital, pathologists are at the center of the decision-making process, and thanks to the widespread use of digital pathology and the development of artificial intelligence methods, automatic histopathological image analysis now supports them in that process. Here, semantic segmentation is more useful than producing labels for whole-slide image patches, because it facilitates the pathologists' interpretation. In this study, automatic semantic segmentation based on cell type is proposed for the first time in the literature, using a novel deep convolutional neural network (DCNN) structure. We present semantic information for four classes: white areas in the whole-slide image, tissue without cells, tissue with normal cells, and tissue with cancerous cells. This visual information gives the pathologist an easy-to-understand picture of the status of the cells and their implications for the spread of cancerous cells. A new DCNN architecture is created, inspired by the residual network and deconvolution network architectures. Our network is trained in an end-to-end manner on histopathological image patches so that cell structures become more discriminative. The proposed method not only produces better results than other state-of-the-art semantic segmentation algorithms, with a 9.2% training error and an 88.89% test F-score, but also has the key advantage of generating automatic information about the cancer that pathologists can quickly interpret.

11.
Motion segmentation is a crucial step in video analysis and has many applications. This paper proposes a motion segmentation method based on the construction of a statistical background model: the variance and covariance of pixels are computed to model the scene background. We perform average frame differencing against this model to extract the objects of interest from the video frames, and morphological operations are used to smooth the segmentation results. The proposed technique adapts to a dynamically changing background caused by changes in lighting conditions or in the scene itself, and can relearn the background to accommodate these variations. An immediate advantage of the proposed method is its high processing speed of 30 frames per second on large (high-resolution) videos. We compared the proposed method with five other popular object segmentation methods to demonstrate its effectiveness. Experimental results confirm its advantages across various performance parameters: the method can segment the video stream in real time when the background changes, when lighting conditions vary, and even in the presence of clutter and occlusion.
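The adaptive statistical background model can be sketched as a per-pixel running mean and variance: a pixel is foreground when it deviates too far from the model, and the model is relearned only where no object is present. The learning rate and threshold below are illustrative assumptions, not the paper's parameters:

```python
class BackgroundModel:
    """Per-pixel running mean/variance background model (simplified sketch).

    alpha is a hypothetical learning rate; k scales the deviation threshold.
    Frames are flat lists of grayscale pixel values.
    """

    def __init__(self, alpha=0.05, k=2.5):
        self.alpha, self.k = alpha, k
        self.mean = None
        self.var = None

    def update(self, frame):
        """Return a 0/1 foreground mask and adapt the background statistics."""
        if self.mean is None:
            self.mean = [float(p) for p in frame]
            self.var = [25.0] * len(frame)  # initial variance guess
            return [0] * len(frame)
        mask = []
        for i, p in enumerate(frame):
            diff = p - self.mean[i]
            fg = 1 if diff * diff > (self.k ** 2) * self.var[i] else 0
            mask.append(fg)
            if not fg:  # relearn the background only where nothing moved
                self.mean[i] += self.alpha * diff
                self.var[i] = (1 - self.alpha) * self.var[i] + self.alpha * diff * diff
        return mask

bm = BackgroundModel()
bm.update([100, 100, 100])          # first frame initializes the model
bm.update([100, 100, 100])          # static scene: all background
mask = bm.update([100, 200, 100])   # middle pixel jumps: flagged as foreground
```

Because updates are gated on the foreground mask, transient objects do not corrupt the model, while slow lighting drift is absorbed into the running mean.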

12.
Recent convolutional neural network (CNN)-based deep learning has significantly advanced fire detection. Existing fire detection methods can efficiently recognize and locate fire, but they struggle to obtain accurate flame boundary and shape information, which makes automated fire-region analysis, prediction, and early warning difficult. To this end, we propose a fire semantic segmentation method based on Global Position Guidance (GPG) and Multi-path explicit Edge information Interaction (MEI). Specifically, to address local segmentation errors in the low-level feature space, a top-down global position guidance module restrains the offset of low-level features. In addition, an MEI module is proposed to explicitly extract and exploit edge information to refine the coarse fire segmentation results. We compare the proposed method with existing advanced semantic segmentation and salient object detection methods. Experimental results demonstrate that the proposed method achieves 94.1%, 93.6%, 94.6%, 95.3%, and 95.9% Intersection over Union (IoU) on five test sets respectively, outperforming the second-best method by a large margin. In terms of accuracy, our approach also achieves the best score.
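The Intersection over Union (IoU) metric reported above is the ratio of pixels two binary masks share to the pixels either mask covers. A minimal implementation over flat 0/1 masks:

```python
def iou(pred_mask, true_mask):
    """Intersection over Union for binary segmentation masks (flat lists of 0/1)."""
    inter = sum(1 for p, t in zip(pred_mask, true_mask) if p and t)
    union = sum(1 for p, t in zip(pred_mask, true_mask) if p or t)
    # Both masks empty: define IoU as a perfect 1.0 to avoid division by zero.
    return inter / union if union else 1.0

score = iou([1, 1, 0, 0], [1, 0, 1, 0])  # 1 shared pixel, 3 covered -> 1/3
```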

13.
A computer software system is designed for the segmentation and classification of benign and malignant tumor slices in brain computed tomography images. In this paper, we present a texture analysis method to find and select the texture features of the tumor region of each slice to be segmented by a support vector machine (SVM). The images considered in this study belong to 208 benign and malignant tumor slices. The features are extracted and selected using Student's t-test. The reduced optimal feature set is used to model and train a probabilistic neural network (PNN) classifier, and the classification accuracy is evaluated using k-fold cross-validation. The segmentation results are also compared against ground truth from an experienced radiologist; quantitative agreement between the ground truth and the segmented tumor is reported in terms of segmentation accuracy and the Jaccard overlap similarity index. The proposed system shows that several newly identified texture features contribute substantially to segmenting and classifying benign and malignant tumor slices efficiently and accurately. The experimental results show that the proposed hybrid texture feature analysis method with the PNN classifier achieves high segmentation and classification accuracy as measured by the Jaccard index, sensitivity, and specificity.
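Feature selection with Student's t-test ranks each feature by how strongly its values differ between the benign and malignant groups. A minimal sketch using the pooled-variance two-sample t statistic, with a hypothetical cutoff on |t|:

```python
import math

def t_statistic(x, y):
    """Two-sample t statistic with pooled variance, used here for feature ranking."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    vx = sum((v - mx) ** 2 for v in x) / (nx - 1)
    vy = sum((v - my) ** 2 for v in y) / (ny - 1)
    sp2 = ((nx - 1) * vx + (ny - 1) * vy) / (nx + ny - 2)  # pooled variance
    return (mx - my) / math.sqrt(sp2 * (1 / nx + 1 / ny))

def select_features(benign, malignant, threshold=2.0):
    """Keep feature indices whose |t| exceeds the threshold (hypothetical cutoff).

    benign[i] and malignant[i] are the per-group sample lists of feature i.
    """
    return [i for i, (b, m) in enumerate(zip(benign, malignant))
            if abs(t_statistic(b, m)) > threshold]

# Feature 0 separates the groups; feature 1 does not.
benign = [[1.0, 1.1, 0.9, 1.0], [5.0, 6.0, 4.0, 5.0]]
malignant = [[3.0, 3.1, 2.9, 3.0], [5.0, 4.0, 6.0, 5.0]]
kept = select_features(benign, malignant)
```

In practice the cutoff would come from a significance level and the t distribution's degrees of freedom rather than a fixed constant.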

14.
Background: In medical image analysis, the diagnosis of skin lesions remains a challenging task. Skin lesions are a common form of skin cancer worldwide, and dermoscopy is one of the latest technologies used for diagnosing skin cancer. Challenges: Many computerized methods have been introduced in the literature to classify skin cancers, but challenges remain, such as imbalanced datasets, low-contrast lesions, and the extraction of irrelevant or redundant features. Proposed Work: In this study, a new technique is proposed based on a combined conventional and deep learning framework. The proposed framework consists of two major tasks: lesion segmentation and classification. In the segmentation task, contrast is first improved by fusing two filtering techniques, and a color transformation is then applied to improve color discrimination of the lesion area. Subsequently, the best channel is selected and the lesion map is computed, which is further converted into binary form using a thresholding function. In the classification task, two pre-trained CNN models were modified and trained using transfer learning. Deep features were extracted from both models and fused using canonical correlation analysis; because the fusion process also introduced some redundant features that lowered classification accuracy, a new technique called maximum entropy score-based selection (MESbS) is proposed as a solution. The selected features are fed into a cubic support vector machine (C-SVM) for the final classification. Results: Experiments were conducted on two datasets: ISIC 2017 for the lesion segmentation task and HAM10000 for the classification task. The achieved accuracies were 95.6% and 96.7%, respectively, higher than existing techniques.

15.
A robust smile recognition system could be widely used in many real-world applications. Classifying a facial smile in an unconstrained setting is difficult due to the inevitable, wide variability of face images. In this paper, an adaptive model for smile expression classification is proposed that integrates a fast feature extraction algorithm with cascade classifiers. Our model exploits the intrinsic association between face detection, the smile, and other face features to alleviate over-fitting on a limited training set and to improve classification results. Features are extracted so as to exclude unnecessary coefficients from the feature vector, enhancing the discriminatory capacity of the extracted features and reducing computational cost. Still, the main sources of error in learning are noise, bias, and variance, and ensembles help to minimize these factors: combining multiple classifiers decreases variance, especially for unstable classifiers, and may produce a more reliable classification than a single classifier. However, a shortcoming of bagging, otherwise among the best ensemble classifiers, is its random selection, where classification performance relies on chance to pick an appropriate subset of training items. To deal with this challenge, the proposed model employs a modified form of bagging when creating training sets (error-based bootstrapping). Experimental results for smile classification on the JAFFE, CK+, and CK+48 benchmark datasets show the feasibility of our proposed model.
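One way to read "error-based bootstrapping" is a bootstrap sample that over-weights previously misclassified items so they are more likely to be redrawn. The sketch below is that interpretation only; the replication factor and sampling scheme are assumptions, not the paper's procedure:

```python
import random

def error_based_bootstrap(data, misclassified, boost=3, seed=0):
    """Bootstrap sample of len(data) items that over-weights misclassified ones.

    Items in `misclassified` are replicated `boost` times in the sampling pool,
    raising their draw probability (both values are illustrative assumptions).
    """
    rng = random.Random(seed)
    pool = []
    for item in data:
        pool.extend([item] * (boost if item in misclassified else 1))
    return [rng.choice(pool) for _ in range(len(data))]

sample = error_based_bootstrap(list(range(8)), misclassified={0}, seed=1)
```

Compared with plain bagging's uniform resampling, biasing the pool toward hard examples steers each base classifier toward the decision regions where the previous round failed.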

16.
Digital surveillance systems are ubiquitous and continuously generate massive amounts of data, and manual monitoring is required to recognise human activities in public areas. Intelligent surveillance systems that can automatically identify normal and abnormal activities are highly desirable, as they would allow efficient monitoring by selecting only those camera feeds in which abnormal activities are occurring. This paper proposes an energy-efficient camera prioritisation framework that intelligently adjusts the priority of cameras in a vast surveillance network using feedback from the activity recognition system. The proposed system addresses the limitations of existing manually monitored surveillance systems with a three-step framework. In the first step, salient frames are selected from the online video stream using a frame differencing method. In the second step, a lightweight 3D convolutional neural network (3DCNN) architecture extracts spatio-temporal features from the salient frames. Finally, the probabilities predicted by the 3DCNN and the metadata of the cameras are processed by a linear threshold gate sigmoid mechanism to control each camera's priority. The proposed system compares well against state-of-the-art violent activity recognition methods in terms of efficient camera prioritisation in large-scale surveillance networks. Comprehensive experiments and an evaluation of activity recognition and camera prioritisation showed that our approach achieved an accuracy of 98% with an F1-score of 0.97 on the Hockey Fight dataset, and an accuracy of 99% with an F1-score of 0.98 on the Violent Crowd dataset.
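The final prioritisation step can be pictured as a linear gate squashed by a sigmoid: the predicted abnormality probability and camera metadata feed a linear score, and the sigmoid maps it to a priority in (0, 1). The weights below are illustrative assumptions, not the paper's learned parameters:

```python
import math

def camera_priority(violence_prob, base_priority, w=6.0, b=-3.0):
    """Linear threshold gate + sigmoid: map a predicted abnormality probability
    and a camera's base (metadata-derived) priority to a priority in (0, 1).

    w and b are hypothetical gate parameters.
    """
    z = w * violence_prob + b + base_priority
    return 1.0 / (1.0 + math.exp(-z))

low = camera_priority(0.1, 0.0)   # quiet feed: low priority
high = camera_priority(0.9, 0.0)  # likely violent activity: high priority
```

Sorting feeds by this score is what lets an operator (or an encoder budget) attend only to the cameras most likely to show abnormal activity.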

17.
Zhang Liguo, Cheng Yao, Jin Mei, Wang Na. Acta Metrologica Sinica, 2021, 42(4): 515-520
Semantic segmentation of indoor scenes has long been an important direction in deep-learning-based semantic segmentation. Indoor semantic segmentation faces several difficulties: a large number of semantic categories, frequent mutual occlusion between object classes, and high similarity between certain classes. To address these problems, a method for indoor scene semantic segmentation is proposed. Building on the BiSeNet (bilateral segmentation network) architecture, it introduces an atrous spatial pyramid pooling layer and a multi-scale feature fusion module, fusing the shallow detail features from the context path with the deep abstract features obtained through atrous pyramid pooling to produce enhanced content features and improve the model's performance on indoor scene segmentation. On the indoor-scene subset of ADE20K, the method's MIoU is 23.5% higher than SegNet and 3.5% higher than the unmodified baseline.

18.
Tumor detection has been an active research topic in recent years due to the high mortality rate of brain tumors. Computer vision (CV) and image processing techniques have recently become popular for detecting tumors in MRI images: the automated detection process is simpler and takes less time than manual processing. In addition, the variation in the expanding shape of brain tumor tissue complicates tumor detection for clinicians. In this paper we propose a new framework for tumor detection and for classifying tumors into relevant categories. For tumor segmentation, the framework employs the Particle Swarm Optimization (PSO) algorithm, and for classification, convolutional neural networks (CNNs). Popular preprocessing techniques such as noise removal, image sharpening, and skull stripping are applied at the start of the segmentation process, after which PSO-based segmentation is performed. In the classification step, two pre-trained CNN models, AlexNet and Inception-V3, are fine-tuned using transfer learning; features are extracted from both trained models, fused using a serial approach, and passed to a variety of machine learning classifiers for the final classification. Average Dice values on the BRATS-2018 and BRATS-2017 datasets are 98.11% and 98.25%, respectively, and average Jaccard values are 96.30% and 96.57% (segmentation results). Extending the experiments on the same datasets to classification achieved 99.0% accuracy, with sensitivity, specificity, and precision each at 0.99. Finally, the proposed method is compared with state-of-the-art existing methods and outperforms them.

19.
Eye detection and localization in face images
Zhang Min, Tao Liang. Opto-Electronic Engineering, 2006, 33(8): 32-36, 93
Using facial geometric features and image segmentation principles, a new algorithm is proposed for automatically detecting and localizing human eyes in grayscale and color face images with background. First, an eye-position decision criterion is established from prior knowledge of the geometric distribution of facial organs. The segmentation threshold range for the eyes is then coarsely estimated, and a threshold-increment segmentation scheme, combined with the eye-position criterion, determines whether the two dark eye blobs appear in the segmented image. Finally, the two-dimensional correlation coefficient is used as a symmetry-similarity measure to verify that the detected eyes are genuine. To avoid interference from the image background, skin-color segmentation is also applied to narrow the search region for the eyes, further improving the accuracy of eye localization. Experiments show that the proposed eye detection and localization method performs well in both speed and accuracy.
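The final verification step compares the two eye regions with a 2-D correlation coefficient as the symmetry-similarity measure. A minimal sketch over small gray patches, where one eye patch would be mirrored before comparison (the mirroring convention is an assumption):

```python
import math

def corr2d(a, b):
    """2-D correlation coefficient between two equal-sized grayscale patches."""
    xs = [v for row in a for v in row]
    ys = [v for row in b for v in row]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = math.sqrt(sum((x - mx) ** 2 for x in xs)
                    * sum((y - my) ** 2 for y in ys))
    return num / den if den else 0.0

def mirror(patch):
    """Left-right flip, so the right eye can be compared against the mirrored left eye."""
    return [list(reversed(row)) for row in patch]

left = [[1.0, 2.0], [3.0, 4.0]]
similarity = corr2d(left, mirror(mirror(left)))  # identical patches -> 1.0
```

A high coefficient between one eye and the mirrored other eye supports a genuine, symmetric eye pair; a low or negative value flags a false detection.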

20.
Diabetes is a metabolic disorder that can result in a retinal complication called diabetic retinopathy (DR), one of the four main causes of blindness worldwide. DR usually has no clear symptoms before onset, making disease identification a challenging task, and the healthcare industry may face unfavorable consequences if the gap in identifying DR is not filled with effective automation. Our objective is therefore to develop an automatic and cost-effective method for classifying DR samples. In this work, we present a custom Faster-RCNN technique for the recognition and classification of DR lesions from retinal images. After pre-processing, we generate the dataset annotations required for model training, then introduce DenseNet-65 at the feature extraction level of Faster-RCNN to compute a representative set of key points. Finally, the Faster-RCNN localizes the input sample and classifies it into five classes. Rigorous experiments performed on a Kaggle dataset comprising 88,704 images show that the introduced methodology achieves an accuracy of 97.2%. We compare our technique with state-of-the-art approaches to show its robustness in terms of DR localization and classification, and additionally perform cross-dataset validation on the Kaggle and APTOS datasets, achieving remarkable results in both the training and testing phases.
