Found 20 similar documents (search time: 15 ms)
1.
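Objective: Facial age estimation, an emerging biometric recognition technology, has become one of the important research directions in computer vision. With the rapid development of deep learning, facial age estimation based on deep convolutional neural networks has become a research hotspot. Methods: This paper takes deep-learning-based methods for estimating real age and apparent age as its subject. Through a literature survey, it analyzes the basic ideas and characteristics of deep-learning-based facial age estimation methods, describes the state of research, summarizes key techniques and their limitations, compares the performance of common facial age estimation methods, and outlines future directions. Results: Although deep-learning-based facial age estimation has made great progress, age estimation under unconstrained conditions still falls short of practical requirements, mainly because of the following difficulties: 1) insufficient prior knowledge incorporated into facial age estimation; 2) the lack of feature representations that capture both global and local detail; 3) the limitations of existing facial age estimation datasets; 4) the multi-scale facial age estimation problem in real application environments. Conclusion: Deep-learning-based facial age estimation has made significant progress, but complex real-world application scenarios easily lead to poor estimation results. This paper provides a comprehensive review of current deep-learning-based facial age estimation techniques to help researchers address the open problems.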
2.
3.
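This paper proposes a new deep age estimation method assisted by facial organs (eyes, nose, mouth, and so on). It fuses the traditional approach of extracting features from facial-organ regions and designing a classifier with end-to-end classification based on a deep convolutional neural network (CNN) to solve the age estimation problem, which strengthens the model's generalization ability. The method takes locally aligned face image patches generated from facial landmarks as the CNN input and estimates age directly from image pixels. A multi-scale analysis network structure greatly improves performance, while the traditional algorithm reinforces the information from the facial-organ regions. Finally, experiments on MORPH AlbumⅡ show that the proposed method outperforms other comparable methods.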
4.
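Label distribution learning (LDL) is a novel learning paradigm for handling label ambiguity. Most existing LDL methods are designed on the assumption that complete data are available. However, because of high annotation cost and the limited expertise of annotators, fully annotated data are hard to obtain, and this degrades the performance of traditional LDL algorithms. To address this, the paper proposes a new weakly supervised label distribution learning algorithm that incorporates local ordinal label relations: it preserves the relative relations among the labels that are not missing and uses label correlations to recover the missing labels, improving performance when annotation is incomplete. Extensive experiments on 14 datasets verify the effectiveness of the algorithm.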
5.
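Objective: To improve the accuracy of age estimation from face images, an end-to-end trainable deep neural network model is proposed. Methods: The network is built by stacking multiple convolutional neural networks (CNNs) and one deep belief network (DBN), and is called the deep fusion network (DFN). First, multiple parallel CNNs extract appearance features from multiple regions of the face image; the resulting features are concatenated and fed into a DBN for nonlinear fusion. To enable end-to-end training of the whole DFN, a network-by-network iterative training (INWT) mechanism is proposed. To reduce overfitting, the CNNs that correspond to local face regions are trained for the age estimation task through multiple rounds of iterative transfer learning. After all CNNs and the DBN in the DFN have been pre-trained, the whole network is fine-tuned end to end. Results: The method was tested on two facial age image databases, MORPHⅡ and FG-NET. The DFN-based method achieves a mean absolute error (MAE) of 3.42 and 4.14 on the two databases, respectively. Compared with current mainstream age estimation algorithms, such as the shallow-learning CA-SVR method (MAE of 5.88 and 4.75 on the two databases), the deep-learning DeepRank+ method (MAE of 3.49 on MORPHⅡ), and the Deep-CS-LBMFL method (MAE of 4.22 on FG-NET), estimation accuracy is clearly improved. Conclusion: The proposed facial age estimation method based on the deep fusion network has a clear advantage over most current mainstream algorithms based on deep neural networks.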
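The results in this abstract are reported as mean absolute error (MAE), the average gap in years between predicted and true ages. The following is a minimal sketch of how that metric is computed; the function name and the sample ages are hypothetical and are not taken from the paper.

```python
def mean_absolute_error(true_ages, predicted_ages):
    """Average absolute gap, in years, between true and predicted ages."""
    if len(true_ages) != len(predicted_ages) or not true_ages:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(t - p) for t, p in zip(true_ages, predicted_ages))
    return total / len(true_ages)


# Hypothetical ages for illustration only (not data from the paper).
true_ages = [23, 35, 41, 60]
predicted_ages = [25, 31, 44, 58]
mae = mean_absolute_error(true_ages, predicted_ages)
print(mae)
```

A lower MAE means the predictions are closer to the true ages on average, which is why the abstract treats 3.42 as an improvement over 3.49 on the same database.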
6.
7.
In this paper, we tackle the problem of segmenting out a sequence of actions from videos. The videos contain background and actions, which are usually composed of ordered sub-actions. We refer to the sub-actions and the background as semantic units. Considering the possible overlap between two adjacent semantic units, we propose a bidirectional sliding window method to generate the label distributions for various segments in the video. The label distribution covers a certain number of semantic unit labels, representing the degree to which each label describes the video segment. The mapping from a video segment to its label distribution is then learned by a Label Distribution Learning (LDL) algorithm. Based on the LDL model, a soft video parsing method with segmental regular grammars is proposed to construct a tree structure for the video. Each leaf of the tree stands for a video clip of background or sub-action. The proposed method shows promising results on the THUMOS’14, MSR-II and UCF101 datasets, and its computational complexity is much lower than that of the compared state-of-the-art video parsing method.
8.
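To address the loss of clarity in remote sensing images caused by haze, an image dehazing algorithm based on deep learning is proposed. First, the original atmospheric scattering model is transformed into an end-to-end dehazing model, and the multiple unknown parameters are unified into a single parameter. A multi-scale convolutional neural network then estimates this unknown parameter, and finally the estimate is substituted into the dehazing model to obtain the haze-free image. For datasets without reference images, the network is first trained on existing datasets and then trained a second time with a self-built dataset added. Experimental results show that, compared with related dehazing algorithms, the algorithm improves both visual quality and objective metrics to varying degrees, effectively improving the clarity of remote sensing images under hazy conditions.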
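The abstract does not give the transformed model. One well-known way to fold the unknowns of the atmospheric scattering model I = J·t + A·(1 − t) into a single parameter is the AOD-Net-style form J = K·I − K + b; whether the paper uses exactly this form is an assumption. The sketch below checks the algebra on a single pixel value. All function names and numbers are hypothetical, and in a real system K would be predicted by the network rather than computed from known t and A.

```python
def hazy_pixel(j, t, a):
    """Atmospheric scattering model: scene radiance j, transmission t, airlight a."""
    return j * t + a * (1.0 - t)


def unified_k(i, t, a, b=1.0):
    """Fold t and a into one parameter K (assumed AOD-Net-style form; needs i != 1)."""
    return ((i - a) / t + (a - b)) / (i - 1.0)


def dehaze_pixel(i, k, b=1.0):
    """Recover the haze-free value from the hazy value i and the unified parameter K."""
    return k * i - k + b


# Hypothetical values in [0, 1], for illustration only.
clean, transmission, airlight = 0.4, 0.6, 0.9
hazy = hazy_pixel(clean, transmission, airlight)
k = unified_k(hazy, transmission, airlight)
recovered = dehaze_pixel(hazy, k)
print(hazy, k, recovered)
```

Estimating one quantity K instead of t and A separately means the network's output can be plugged straight into the recovery formula, which is what makes the pipeline trainable end to end.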
9.
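Label distribution learning can effectively solve multi-label learning tasks, but building the classifier presupposes large-scale annotations that carry stronger supervisory information, which is hard to satisfy in many applications. An alternative is label enhancement, which mines the implicit numerical importance of labels from traditional logical annotations. Most existing label enhancement methods assume that the enhanced labels must preserve the correlations of the original logical labels across all instances, and so cannot effectively preserve local label correlations. Based on granular computing theory, a granular label enhancement method suited to label distribution learning is proposed. The method uses k-means clustering to construct information granules with locally correlated semantics and, at the abstraction level of granules, converts the labels of the instances within each granule on a graph according to the properties of the logical labels and the topology of the feature space. Finally, the resulting label distributions are fused at the instance level to obtain numerical labels that describe label importance over the whole dataset. Extensive comparative studies show that the proposed model can significantly improve multi-label learning performance.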
10.
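Action recognition (AR) is a research hotspot in computer vision with broad application prospects in security surveillance, autonomous driving, production safety, and other fields. First, the meaning and scope of action recognition are analyzed and the technical challenges are identified. Second, the working principles of action recognition are analyzed and compared from three perspectives: temporal feature extraction, efficiency optimization, and long-term feature capture. The performance of 43 benchmark AR methods from the past decade is compared on the UCF101, HMDB51, Something-Something, and Kinetics400 datasets, which helps in choosing a suitable AR model for different application scenarios. Finally, future directions for action recognition are pointed out. The findings can provide a theoretical reference and technical support for video feature extraction and visual content understanding.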
11.
12.
Yih An Ding Filipe Mutz Klaus F. Côco Luiz A. Pinto Karin S. Komati 《Expert Systems》2020,37(6):e12584
Bone age estimation has been used in medicine to verify whether the degree of bone structure development of a person corresponds to their chronological age. Such an estimate is useful for prognosis about the development of children and adolescents, as well as for the diagnosis of endocrinological diseases. This work proposes a fully automated methodology for bone age estimation from carpal radiography images. The methodology comprises two steps: preprocessing of the image and classification using a convolutional neural network. The system accuracy for different types of preprocessing is evaluated. We compare the accuracy achieved using the full radiography image as input for the neural network and using only parts of the image corresponding to the Phalangeal region, the Epiphyseal region, and the concatenation of these parts with a crop around the wrist. Digital image processing techniques are employed to segment these regions. Experiments are performed using radiography images from the California University Database. The impact of using different pre-trained neural networks for transfer learning is evaluated.
13.
The amount of digital data in the universe is growing at an exponential rate, doubling every 2 years, and changing how we live in the world. Information storage capacity and data requirements have crossed the zettabyte mark. With this level of bombardment of data on machine learning techniques, it becomes very difficult to carry out parallel computations. Deep learning is broadening its scope and gaining popularity in natural language processing, feature extraction and visualization, and in almost every machine learning trend. The purpose of this study is to provide a brief review of deep learning architectures and how they work. Research papers and conference proceedings from various authentic resources (Institute of Electrical and Electronics Engineers, Wiley, Nature, and Elsevier) are studied and analyzed. Different architectures and their effectiveness in solving domain-specific problems are evaluated. Various limitations and open problems of current architectures are discussed to provide better insights and to help researchers and students resume their research on these issues. One hundred one articles were reviewed for this meta‐analysis of deep learning. From this analysis, it is concluded that advanced deep learning architectures are combinations of a few conventional architectures. For example, the deep belief network and the convolutional neural network are used to build the convolutional deep belief network, which has higher capabilities than the parent architectures. These combined architectures are more robust in exploring the problem space and thus can be the answer to building a general‐purpose architecture.
14.
Xinyue DONG Tingjin LUO Ruidong FAN Wenzhang ZHUGE Chenping HOU 《Frontiers of Computer Science》2023,17(4):174327
Label distribution learning (LDL) is a new learning paradigm for dealing with label ambiguity, and many studies have achieved prominent performance. Compared with traditional supervised learning scenarios, annotation with label distributions is more expensive. Direct use of existing active learning (AL) approaches, which aim to reduce the annotation cost in traditional learning, may degrade their performance. To deal with the high annotation cost in LDL, we propose the active label distribution learning via kernel maximum mean discrepancy (ALDL-kMMD) method to tackle this crucial but rarely studied problem. ALDL-kMMD captures the structural information of both data and labels, and extracts the most representative instances from the unlabeled ones by incorporating a nonlinear model and marginal probability distribution matching. It is also able to markedly decrease the number of queried unlabeled instances. Meanwhile, an effective solution is proposed for the original optimization problem of ALDL-kMMD by constructing auxiliary variables. The effectiveness of our method is validated with experiments on real-world datasets.
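The quantity behind the distribution matching mentioned in this abstract is the kernel maximum mean discrepancy (MMD), which measures how far apart two samples are in a kernel feature space. The sketch below computes the standard biased estimate of squared MMD with an RBF kernel. It illustrates only that quantity, not the ALDL-kMMD selection procedure or its optimization; the kernel choice, `gamma` value, and sample points are assumptions for illustration.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """RBF (Gaussian) kernel between two equal-length feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def mmd_squared(xs, ys, gamma=1.0):
    """Biased estimate of squared MMD between samples xs and ys."""
    def mean_kernel(p, q):
        total = sum(rbf_kernel(u, v, gamma) for u in p for v in q)
        return total / (len(p) * len(q))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


# Hypothetical 2-D points, for illustration only.
pool = [(0.0, 0.0), (1.0, 1.0)]
near = [(0.1, 0.0), (0.9, 1.1)]
far = [(5.0, 5.0), (6.0, 6.0)]
print(mmd_squared(pool, near), mmd_squared(pool, far))
```

A candidate subset whose MMD to the unlabeled pool is small represents the pool well, which is the intuition behind querying "the most representative instances".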
15.
Mohammad Haider Syed Kamal Upreti Mohammad Shahnawaz Nasir Mohammad Shabbir Alam Arvind Kumar Sharma 《Computational Intelligence》2023,39(4):577-591
Digital images are increasingly important in numerous contemporary applications, and the need for images in technical fields is also growing rapidly. They are used to recognize signatures and faces in many industries and are applicable for intelligent departments. Images are usually corrupted by noise, which may arise from instrument imperfections, problems while collecting data during acquisition, and other natural phenomena. Poisson noise, also known as photon noise, arises in images from the statistical nature of electromagnetic waves such as X-rays, visible light, and gamma rays. Improving the convolution model for images is challenging because of factors such as optical aberrations, noise level, and optical setup. The imaging configuration is modeled using the point spread function (PSF), which is the system's impulse response. A high-quality image is recovered by denoising and super-resolution (SR) methods, which simultaneously remove noise from the images. Non-blind iterative algorithms of the Richardson–Lucy and alternating direction method of multipliers type, which rely on the PSF, are comparatively analyzed. A deep learning approach, convolutional neural networks (CNNs), is also employed to learn the nonlinear mapping between the observed data and the ground truth. The performance of the various network approaches is compared in this article. The results show that the deep learning CNNs achieved higher accuracy in producing denoised images. The goal of the proposed system model is to remove interference noise from images. High-resolution images are obtained by implementing an SR-based CNN model.
16.
Earthwork operations are crucial parts of most construction projects. Heavy construction equipment and workers are often required to work in limited workspaces simultaneously. Struck-by accidents resulting from poor worker and equipment interactions account for a large proportion of accidents and fatalities on construction sites. The emerging technologies based on computer vision and artificial intelligence offer an opportunity to enhance construction safety through advanced monitoring utilizing site cameras. A crucial pre-requisite to the development of safety monitoring applications is the ability to identify accurately and localize the position of the equipment and its critical components in 3D space. This study proposes a workflow for excavator 3D pose estimation based on deep learning using RGB images. In the proposed workflow, an articulated 3D digital twin of an excavator is used to generate the necessary data for training a 3D pose estimation model. In addition, a method for generating hybrid datasets (simulation and laboratory) for adapting the 3D pose estimation model for various scenarios with different camera parameters is proposed. Evaluations prove the capability of the workflow in estimating the 3D pose of excavators. The study concludes by discussing the limitations and future research opportunities.
17.
18.
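Objective: Image inpainting is an important topic in computer vision. Its goal is to automatically restore missing content from the known content of an image, and it has wide application value in image editing, film and television special effects, virtual reality, and digital cultural heritage preservation. In recent years, as deep learning has been widely studied in academia and industry, its advantages in image semantic extraction, feature representation, and image generation have become increasingly prominent, making deep-learning-based image inpainting a research hotspot at home and abroad that attracts growing attention. To encourage more researchers to explore the theory and development of deep-learning-based image inpainting, this paper reviews the state of research in the field. Methods: The theoretical basis on which deep-learning-based inpainting methods were proposed is analyzed first, and the key techniques involved are then examined. The main deep-learning-based inpainting methods of recent years are summarized and classified by the structure of the inpainting network into methods based on convolutional autoencoder networks, methods based on generative adversarial networks, and methods based on recurrent neural networks. Results: In deep-learning-based image inpainting, the design of the network and the choice of loss function during training are central. Each class of methods has its own strengths, weaknesses, and scope of application. Improving the semantic plausibility of the inpainted result and the correctness of its structure and details has been a constant goal of researchers. To that end, this paper uses experimental analysis to summarize, for each class of methods, the main characteristics, existing problems, requirements on training samples, main application areas, and reference code. Conclusion: Research on deep-learning-based image inpainting has made notable progress, but the application of deep learning to inpainting is still at an early stage, and current work mainly uses only the image content of the image to be inpainted. Deep-learning-based image inpainting therefore remains a highly challenging topic. How to design general-purpose inpainting networks and improve the accuracy of the results requires further in-depth study.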
19.
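Deep learning is a broader class of machine learning methods based on data representations. Its emergence has not only advanced machine learning but also spurred innovation in artificial intelligence. Several typical deep learning models are studied and compared. First, unsupervised learning models such as the restricted Boltzmann machine, the deep belief network, and the autoencoder are introduced, and their structures, principles, strengths, and weaknesses are discussed in detail. Supervised learning models such as the convolutional neural network, the recurrent neural network, and the deep stacking network are then discussed and evaluated in terms of model architecture and working principle. The typical deep learning models are compared, and the deep belief network and the convolutional neural network are applied to a handwritten digit recognition task; the results confirm that deep learning achieves better recognition performance than traditional neural networks. Finally, the future development and challenges of deep learning are discussed.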
20.
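Objective: Color constancy usually refers to the adaptive ability of humans to perceive object colors correctly under arbitrary illuminants, and is an important prerequisite for high-level tasks such as recognition, segmentation, and 3D vision. Estimating the illuminant color of an image is one of the main ways to achieve computational color constancy. Existing illuminant color estimation methods often have large errors because of ambiguous colors in local scenes. To address this, an illuminant color estimation method based on deep residual learning is proposed. Methods: The input image is divided uniformly into patches, and the global illuminant color of the whole image is estimated from the illuminant colors of the local patches. The algorithm comprises two residual networks, one for illuminant color estimation and one for patch selection. The illuminant color estimation network improves accuracy through greater depth and a residual structure. The patch selection network classifies patches by their illuminant color estimation error and, based on the classification, removes patches with large errors, further improving the accuracy of the global estimate. In addition, log-chrominance preprocessing of the input image reduces the influence of image brightness on illuminant color estimation and improves computational efficiency. Results: Experiments on the NUS-8 and reprocessed ColorChecker datasets show that the method has good estimation accuracy and robustness. Moreover, under the same conditions, log-chrominance images give estimation errors 10% to 15% lower than the original images, and the patch selection network further reduces the error of the illuminant color estimation network by about 5%. Conclusion: Experiments on two single-illuminant datasets show that the overall design is sound and effective and that the algorithm is accurate and robust; it can be applied in image processing and computer vision tasks that require color correction.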
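The abstract does not define its log-chrominance transform. A common definition maps an RGB pixel to the pair (log(R/G), log(B/G)); assuming that form, the sketch below shows why such preprocessing reduces the influence of brightness: scaling all three channels by the same factor leaves the result unchanged. The function name, the `eps` guard, and the pixel values are hypothetical.

```python
import math


def log_chrominance(r, g, b, eps=1e-6):
    """Map an RGB pixel to (log(R/G), log(B/G)); eps guards against zero channels."""
    r, g, b = (max(c, eps) for c in (r, g, b))
    return math.log(r / g), math.log(b / g)


# Hypothetical pixel and the same pixel at twice the brightness.
dim = log_chrominance(0.2, 0.4, 0.1)
bright = log_chrominance(0.4, 0.8, 0.2)
print(dim, bright)
```

Because the two outputs coincide, a network fed log-chrominance values sees the same input for a surface regardless of overall intensity, and it works with two channels instead of three.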