The availability of huge volumes of structured and unstructured data, advanced high-density memory, and high-performance computing machines has provided a strong push for development in the artificial intelligence (AI) and machine learning (ML) domains. AI and ML have rekindled the hope of efficiently solving complex problems that could not be solved in the recent past. The generation and availability of big data is a strong driving force for the development of AI/ML applications; however, several challenges need to be addressed, such as processing speed, memory requirements, high bandwidth, low-latency memory access, and highly conductive and flexible connections between processing units and memory blocks. Conventional computing platforms are unable to address these issues for ML and AI. Deep neural networks (DNNs) are widely employed for ML and AI applications such as speech recognition, computer vision, and robotics, where they perform efficiently and accurately. However, this accuracy comes at the cost of high computational complexity, which sacrifices performance measures such as energy efficiency and throughput and leads to high latency. To address the problems of latency, energy efficiency, complexity, and power consumption, many state-of-the-art DNN accelerators have been designed and implemented as application-specific integrated circuits (ASICs) and field-programmable gate arrays (FPGAs). This work surveys the state of the art of these recently developed DNN accelerators. Various DNN architectures, their computing units, and the emerging technologies used to improve the performance of DNN accelerators are discussed. Finally, we explore the scope for further improvement in these accelerator designs and the opportunities and challenges for future research.

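First, the history of computer vision is reviewed, and the basics of artificial neural networks, such as neurons, multilayer perceptrons, and backpropagation, are introduced, together with the history of convolutional neural networks (CNNs) and their basic operations such as convolution and pooling. Classic CNN architectures such as AlexNet, VGGNet, GoogLeNet, and ResNet are then discussed, with a focus on CapsNet. Research progress of CNNs in image classification, semantic segmentation, object detection, and image generation is summarized. Finally, the challenges facing CNN research and the outlook for future research on CapsNet are presented.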

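Medical image diagnosis is the basis of many clinical decisions, and intelligent analysis of medical images is an important part of medical artificial intelligence. Meanwhile, with the rise and spread of more and more 3D spatial sensors, 3D computer vision is becoming increasingly important. This paper focuses on the intersection of medical image analysis and 3D computer vision, namely medical 3D computer vision, or medical 3D vision. It divides medical 3D computer vision systems into three levels (tasks, data, and representations) and presents research progress at each level with reference to the latest literature. At the task level, it introduces classification, segmentation, detection, registration, and imaging reconstruction in medical 3D computer vision, along with the role and characteristics of these tasks in clinical diagnosis and medical image analysis. At the data level, it briefly introduces the most important data modalities in medical 3D data, including computed tomography (CT), magnetic resonance imaging (MRI), and positron emission tomography (PET), as well as other data formats proposed in emerging research. On this basis, it compiles important research datasets in medical 3D computer vision and annotates their data modalities and main vision tasks. At the representation level, it introduces and discusses the advantages and disadvantages of 2D networks, 3D networks, and hybrid networks for representation learning on medical 3D data. In addition, given the small-data problem that is common in medical imaging, it focuses on pretraining in representation learning for medical 3D data. Finally, it summarizes the current state of research in medical 3D computer vision and points out open research challenges, problems, and directions.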

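To address the problems of building network models and determining model parameters in deep learning, the key factors and optimization strategies in deep neural network design are studied, based on an analysis of the basic structure of neural networks and the limitations of linear models. Using handwritten digit recognition as the task, experiments measure recognition accuracy under different optimization strategies, dynamically decaying learning rates, numbers of hidden-layer nodes, and numbers of hidden layers. The results show that different neural network models have a qualitative effect on final accuracy, and that the same optimization strategy with different parameter values has a large effect on final accuracy. Methods for selecting specific optimization strategies and parameters are further explored.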
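
The abstract does not say which decay schedule was used for the dynamically decaying learning rate. As a minimal sketch, assuming the common exponential form, the rate could be computed as below. The function name and the default values of `decay_steps` and `decay_rate` are illustrative assumptions, not taken from the paper.

```python
def decayed_learning_rate(initial_lr: float, step: int,
                          decay_steps: int = 100,
                          decay_rate: float = 0.96) -> float:
    """Exponential decay: initial_lr * decay_rate ** (step / decay_steps).

    The schedule and its default parameters are assumptions for illustration.
    """
    if decay_steps <= 0:
        raise ValueError("decay_steps must be positive")
    return initial_lr * decay_rate ** (step / decay_steps)


if __name__ == "__main__":
    # The rate starts at initial_lr and shrinks smoothly as training proceeds.
    for step in (0, 100, 1000):
        print(step, decayed_learning_rate(0.1, step))
```

A large early rate speeds up initial progress, while the smaller later rate reduces oscillation near a minimum, which is one reason the same optimizer can behave very differently under different parameter values.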

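Ocean mesoscale eddies are an important mesoscale ocean phenomenon. They play an important role in ocean circulation and in the transport of matter and energy, and they also significantly affect ship navigation safety and underwater acoustic communication. Detecting and identifying mesoscale eddies efficiently and accurately has important research value both for physical oceanography and for ocean development and utilization. Traditional eddy detection and identification methods rely on a single threshold designed from expert experience and are therefore markedly subjective. With the rise of deep learning, machine learning methods have shown certain advantages in the accuracy and degree of automation of eddy detection and identification. By summarizing and comparing existing machine-learning-based detection and identification methods, this paper provides a systematic understanding and a reference for advancing research on ocean mesoscale eddy detection and identification.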

The amount of digital data in the universe is growing at an exponential rate, doubling every 2 years, and changing how we live in the world. Information storage capacity and data requirements have crossed the zettabyte scale. With this volume of data bombarding machine learning techniques, it becomes very difficult to carry out parallel computations. Deep learning is broadening its scope and gaining popularity in natural language processing, feature extraction and visualization, and almost every machine learning trend. The purpose of this study is to provide a brief review of deep learning architectures and how they work. Research papers and conference proceedings from various authentic sources (Institute of Electrical and Electronics Engineers, Wiley, Nature, and Elsevier) are studied and analyzed. Different architectures and their effectiveness in solving domain-specific problems are evaluated. Various limitations and open problems of current architectures are discussed to provide better insights and to help researchers and students resume their research on these issues. One hundred one articles were reviewed for this meta-analysis of deep learning. From this analysis, it is concluded that advanced deep learning architectures are combinations of a few conventional architectures. For example, the deep belief network and the convolutional neural network are used to build the convolutional deep belief network, which has higher capabilities than the parent architectures. These combined architectures are more robust in exploring the problem space and thus may be the answer to building a general-purpose architecture.

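Traditional human-UAV interaction requires dedicated equipment and professional training, so convenient and novel interaction methods tend to be preferred. Using an ordinary camera, this work studies a UAV gesture control system based on computer vision and deep learning. First, the system uses a fast tracking algorithm to extract the region containing the operator from the video sequence, which greatly reduces the load on subsequent video processing and removes the influence of complex backgrounds and camera drift. Second, according to the temporal information of the action, optical-flow features are encoded in different colors and superimposed on a single image, converting the video into a color texture map that contains both temporal and spatial features. Finally, a convolutional neural network learns and classifies the color texture maps, and commands for controlling the UAV are generated from the classification results. Every 0.4 s the system makes a decision on the action within a 1.6 s window, achieving real-time human-machine interaction through CNN image classification. The system's recognition accuracy is above 93% within a range of 60 m, and in both indoor and outdoor environments the operator can conveniently control the UAV by imitating the command gestures.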
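
The timing described above (one decision every 0.4 s over a 1.6 s window) amounts to a sliding window over the frame stream. The sketch below shows only that windowing step under an assumed frame rate. The function name and the `fps` parameter are hypothetical, and the tracking, optical-flow encoding, and CNN stages are not reproduced.

```python
from typing import Iterator, List, Sequence, TypeVar

Frame = TypeVar("Frame")


def sliding_windows(frames: Sequence[Frame], fps: float,
                    window_s: float = 1.6,
                    stride_s: float = 0.4) -> Iterator[List[Frame]]:
    """Yield overlapping frame windows: window_s seconds long, every stride_s seconds."""
    window = round(fps * window_s)   # frames per decision window
    stride = round(fps * stride_s)   # frames between consecutive decisions
    if window <= 0 or stride <= 0:
        raise ValueError("fps is too low for the requested window or stride")
    # Only full windows are produced; a trailing partial window is skipped.
    for start in range(0, len(frames) - window + 1, stride):
        yield list(frames[start:start + window])


if __name__ == "__main__":
    # 3.2 s of video at an assumed 10 frames per second, frames stood in by integers.
    for w in sliding_windows(list(range(32)), fps=10):
        print(w[0], w[-1])
```

With a 0.4 s stride and a 1.6 s window, consecutive windows share three quarters of their frames, so each gesture is seen by several overlapping decisions.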

The maintainability of source code is a key characteristic of software quality. Many approaches have been proposed to quantitatively measure code maintainability. Such approaches rely heavily on code metrics, e.g., the number of Lines of Code and McCabe’s Cyclomatic Complexity. The employed code metrics are essentially statistics regarding code elements, e.g., the numbers of tokens, lines, references, and branch statements. However, natural language in source code, especially identifiers, is rarely exploited by such approaches. As a result, replacing meaningful identifiers with nonsense tokens would not significantly influence their outputs, although the replacement should significantly reduce code maintainability. To this end, in this paper, we propose a novel approach (called DeepM) to measure code maintainability by exploiting the lexical semantics of text in source code. DeepM leverages deep learning techniques (e.g., LSTM and the attention mechanism) to exploit these lexical semantics in measuring code maintainability. Another key rationale of DeepM is that measuring code maintainability is complex and often far beyond the capabilities of statistics or simple heuristics. Consequently, DeepM leverages deep learning techniques to automatically select useful features from complex and lengthy inputs and to construct a complex mapping (rather than simple heuristics) from the input to the output (code maintainability index). DeepM is evaluated on a manually assessed dataset. The evaluation results suggest that DeepM is accurate: it generates the same rankings of code maintainability as experienced programmers on 87.5% of manually ranked pairs of Java classes.

Laser Metal Deposition (LMD) is an additive manufacturing technology that attracts great interest from industry, thanks to its potential to produce parts with complex geometries in one piece, and to repair damaged ones, while maintaining good mechanical properties. Nevertheless, the complexity of this process has limited its widespread adoption, since different part geometries, strategies, and boundary conditions can yield very different results in terms of external shapes and inner flaws. Moreover, monitoring part quality during process execution is very challenging, as direct measurements of both structural and geometrical properties are mostly impracticable. This work proposes an on-line monitoring and prediction approach for LMD that exploits coaxial melt pool images, together with process input data, to estimate the size of a track deposited by LMD. In particular, a novel deep learning architecture combines the output of a convolutional neural network (which takes melt pool images as inputs) with scalar variables (process and trajectory data). Various network architectures are evaluated, suggesting the use of at least three convolutional layers. Furthermore, the results imply a certain degree of invariance to the number and size of dense layers. The effectiveness of the proposed method is demonstrated through experiments performed on single tracks deposited by LMD using powders of Inconel 718, a relevant material for the aerospace and automotive sectors.

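Blind musicians face manual conversion and low efficiency when sharing the musical works they compose, and the rapid development of information science and technology offers many solutions to such problems. Although many recognition schemes for Braille music exist, they suffer from low recognition efficiency and insufficient compatibility. To avoid the heavy reliance on manual experience in feature extraction from Braille music images in traditional schemes, a recognition model based on a convolutional neural network is proposed and designed. After the sample Braille music images are preprocessed, the model learns the features of the musical symbols in the images through repeated training iterations. Experimental results show that the model recognizes effectively and generalizes well, providing a new solution for recognizing Braille music works.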

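Deep neural networks have made great progress in AI tasks such as image recognition, language recognition, and machine translation, largely owing to excellent neural network architecture design. Most neural networks are designed by hand, which requires specialized machine learning knowledge and a great deal of trial and error. For this reason, automated neural architecture search has become a research hotspot. Neural architecture search (NAS) consists mainly of three parts: the search space, the search strategy, and the performance evaluation method. In search space design, because of computational cost, the whole network architecture is usually not searched; instead, the network is first divided into several blocks and the structure within a block is searched. Depending on the situation, the structure can be shared across blocks, or a different structure can be searched separately for each block. For the search strategy, mainstream optimization methods include reinforcement learning, evolutionary algorithms, Bayesian optimization, and gradient-based optimization. For performance evaluation, to save computation time, each network is usually not trained fully to convergence; instead, methods such as weight sharing and early stopping are used to reduce the training time of a single network as much as possible. Compared with hand-designed networks, deep neural networks obtained by NAS perform better. On the ImageNet classification task, MobileNetV3, obtained by NAS, reduces computation by nearly 30% and improves top-1 classification accuracy by 3.2% compared with the hand-designed MobileNetV2. On the Cityscapes semantic segmentation task, Auto-DeepLab-L, obtained by NAS, achieves a higher mean intersection over union (mIOU) than the hand-designed DeepLabv3+ without ImageNet pretraining, while cutting computation by more than half. Deep neural networks obtained by NAS usually outperform hand-designed ones, and NAS is the future trend in neural network design.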

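Machine reading comprehension has attracted much attention in recent years; it gives computers the ability to acquire knowledge from text and answer questions. Getting machines to understand natural language is one of the long-standing challenges in AI. In recent years, the release of large-scale, high-quality datasets and the use of deep learning have driven rapid progress in machine reading comprehension. With end-to-end neural network model structures and the application of pretrained language models and reasoning techniques, performance on large-scale evaluation datasets has improved greatly, but a large gap remains before true language understanding. This paper surveys the research status and development trends of machine reading comprehension, covering task categorization and the analysis of machine reading comprehension models and related techniques, in particular knowledge-reasoning-based techniques, and summarizes and discusses development trends in the field.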

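Objective: The emergence of generative adversarial networks (GANs) has provided new techniques and tools for computer vision applications. With their distinctive ideas of a zero-sum game and adversarial training, GANs generate high-quality samples and have stronger feature learning and feature representation capabilities than traditional machine learning algorithms. GANs have achieved notable success in machine vision, especially in sample generation, and are one of the current research hotspots. Method: Taking the different GAN models and their applications in computer vision as the object of study, and building on a broad literature survey, especially of the latest developments in GANs, combined with comparative experiments on different models, this paper analyzes the basic idea, characteristics, and usage scenarios of each method and summarizes the strengths and weaknesses of GANs. It describes the current state of GAN research and the range of applications in computer vision, and surveys the research status and development trends of GANs in several computer vision areas, including high-quality image generation, style transfer and image translation, mutual generation between text and images, and image restoration and inpainting. For each application it summarizes the theoretical improvements, advantages, limitations, and usage scenarios, and it looks ahead to possible future directions. Results: Different GAN models each have strengths and weaknesses in the quality of generated samples and in performance. Current GAN models have achieved considerable success in image processing and can generate samples that pass for real, but they also suffer from non-convergence, a tendency toward model collapse, and being too unconstrained to control. Conclusion: As a new kind of generative model, GANs have high research and application value, but some theoretical constraints remain to be overcome; on the application side, generating high-quality samples and realistic scenes are directions worth studying.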

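The mathematical nature of feedforward neural networks whose hidden-layer activation functions satisfy the Mercer condition is analyzed theoretically, and guidance for network learning is given. Three online learning algorithms are proposed; they improve online prediction performance by dynamically adjusting the network structure and weights. The algorithms fully conform to the structural risk minimization principle of statistical learning theory and offer fast learning convergence and good noise resistance. Finally, numerical experiments verify the feasibility and superiority of the algorithms.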

Extreme learning machine (ELM), proposed by Huang et al., has been shown to be a promising learning algorithm for single-hidden-layer feedforward neural networks (SLFNs). Nevertheless, because the input weights and biases are chosen randomly, the ELM algorithm sometimes produces a hidden-layer output matrix H of the SLFN that is not of full column rank, which lowers the effectiveness of ELM. This paper discusses the effectiveness of ELM and proposes an improved algorithm called EELM that makes a proper selection of the input weights and biases before calculating the output weights, which ensures the full column rank of H in theory. This improves to some extent the learning rate (testing accuracy, prediction accuracy, learning time) and the robustness of the networks. Experimental results on both benchmark function approximation and real-world problems, including classification and regression applications, show the good performance of EELM.
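
To make the role of the hidden-layer output matrix H concrete, here is a minimal sketch of a basic ELM in plain Python: input weights and biases are drawn at random, and only the output weights are solved for by least squares. It is not the EELM selection procedure, which the abstract does not detail. The small `ridge` term, the sampling ranges, and all function names are assumptions of this sketch; the ridge term is a standard workaround that keeps the linear system solvable when H is close to rank-deficient.

```python
import math
import random
from typing import List, Tuple

Matrix = List[List[float]]


def hidden_output(X: Matrix, W: Matrix, b: List[float]) -> Matrix:
    """Hidden-layer output matrix H: one row per sample, one column per hidden node."""
    return [[1.0 / (1.0 + math.exp(-(sum(w * x for w, x in zip(w_k, x_n)) + b_k)))
             for w_k, b_k in zip(W, b)] for x_n in X]


def solve(A: Matrix, y: List[float]) -> List[float]:
    """Solve A z = y by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [y_i] for row, y_i in zip(A, y)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (M[i][n] - sum(M[i][j] * z[j] for j in range(i + 1, n))) / M[i][i]
    return z


def train_elm(X: Matrix, t: List[float], n_hidden: int, seed: int = 0,
              ridge: float = 1e-6) -> Tuple[Matrix, List[float], List[float]]:
    """Random hidden layer, then output weights from (H^T H + ridge * I) beta = H^T t."""
    rng = random.Random(seed)
    W = [[rng.uniform(-4.0, 4.0) for _ in X[0]] for _ in range(n_hidden)]  # assumed range
    b = [rng.uniform(-2.0, 2.0) for _ in range(n_hidden)]                  # assumed range
    H = hidden_output(X, W, b)
    HtH = [[sum(h[i] * h[j] for h in H) + (ridge if i == j else 0.0)
            for j in range(n_hidden)] for i in range(n_hidden)]
    Htt = [sum(h[i] * t_n for h, t_n in zip(H, t)) for i in range(n_hidden)]
    return W, b, solve(HtH, Htt)


def predict(X: Matrix, W: Matrix, b: List[float], beta: List[float]) -> List[float]:
    """Network output: hidden-layer activations weighted by the output weights beta."""
    return [sum(h_k * beta_k for h_k, beta_k in zip(h, beta))
            for h in hidden_output(X, W, b)]


if __name__ == "__main__":
    xs = [[-1.0 + 2.0 * i / 39] for i in range(40)]
    ts = [math.sin(3.0 * x[0]) for x in xs]
    W, b, beta = train_elm(xs, ts, n_hidden=12)
    mse = sum((p - t) ** 2 for p, t in zip(predict(xs, W, b, beta), ts)) / len(ts)
    print("training MSE:", mse)
```

Without the ridge term, the normal equations have a unique solution only when H has full column rank, which is the property EELM is said to guarantee through its choice of input weights and biases.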

In this paper, we propose a methodology for training a new model of artificial neural network called the generalized radial basis function (GRBF) neural network. This model is based on the generalized Gaussian distribution, which parametrizes the Gaussian distribution by adding a new parameter τ. The generalized radial basis function allows different radial basis functions to be represented by updating the new parameter τ. For example, when GRBF takes a value of τ=2, it represents the standard Gaussian radial basis function. The model parameters are optimized through a modified version of the extreme learning machine (ELM) algorithm. In the proposed methodology (MELM-GRBF), the centers of each GRBF were taken randomly from the patterns of the training set, and the radius and τ values were determined analytically, taking into account that the model must fulfil two constraints: locality and coverage. A thorough experimental study is presented to test its overall performance. Fifteen datasets were considered, including binary and multi-class problems, all of them taken from the UCI repository. The MELM-GRBF was compared to ELM with sigmoidal, hard-limit, triangular basis and radial basis functions in the hidden layer and to the ELM-RBF methodology proposed by Huang et al. (2004) [1]. The MELM-GRBF obtained better accuracy than the corresponding sigmoidal, hard-limit, triangular basis and radial basis functions for almost all datasets, producing the highest mean accuracy rank when compared with these other basis functions for all datasets.
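
The abstract does not give the GRBF formula. A plausible form, borrowed from the shape of the generalized Gaussian density and stated here as an assumption, is φ(x) = exp(-(‖x - c‖ / r)^τ) with center c and radius r. The sketch below evaluates that form and shows that τ = 2 recovers a Gaussian RBF; the function and parameter names are illustrative.

```python
import math
from typing import Sequence


def grbf(x: Sequence[float], center: Sequence[float],
         radius: float, tau: float) -> float:
    """Assumed GRBF form: exp(-(||x - center|| / radius) ** tau)."""
    if radius <= 0 or tau <= 0:
        raise ValueError("radius and tau must be positive")
    dist = math.sqrt(sum((xi - ci) ** 2 for xi, ci in zip(x, center)))
    return math.exp(-((dist / radius) ** tau))


if __name__ == "__main__":
    x, c, r = [1.0, 2.0], [0.0, 0.0], 1.5
    # tau = 2 matches the Gaussian RBF exp(-||x - c||^2 / r^2).
    print(grbf(x, c, r, tau=2.0), math.exp(-5.0 / 2.25))
    # Other values of tau change how quickly the response falls off with distance.
    print(grbf(x, c, r, tau=1.0), grbf(x, c, r, tau=4.0))
```

Under this form, τ controls how sharply the response drops around the radius, which is how a single extra parameter lets one basis function family cover several shapes.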

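The 4-CBA content in the industrial PTA production process is an important basis for evaluating product quality. Combining a deep belief network with existing shallow algorithms, a 4-CBA soft-sensor model based on a deep belief network is proposed. The deep belief network is a typical deep learning algorithm with notable advantages in feature learning. Experimental results show that the soft-sensor model based on the deep belief network estimates 4-CBA content well and achieves higher prediction accuracy than a plain BP neural network model.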

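To address UAV obstacle avoidance, a monocular-vision obstacle avoidance method for quadrotor UAVs based on deep learning is proposed. First, object detection produces a bounding box locating the target in the image, and the distance between the top and bottom edges of the box is computed to estimate the distance from the obstacle to the UAV. A companion computer then decides whether to perform an avoidance maneuver. Finally, experiments are carried out on a flight test platform built on Pixhawk. The results show that the method can be used for obstacle avoidance during low-speed UAV flight. The only sensor the method uses is a single monocular camera, and compared with traditional active-sensor obstacle avoidance methods, it takes up far less space on the UAV. The method is fairly robust: it accurately recognizes people in different poses and avoids them.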
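
One common way to turn a bounding-box height into a range estimate is the pinhole camera relation, distance = focal length × real height / pixel height. The abstract does not state that this is the formula used, so the sketch below is an illustration under that assumption. The assumed person height of 1.7 m, the 5 m safety distance, and the function names are all hypothetical.

```python
def estimate_distance_m(box_top_px: float, box_bottom_px: float,
                        focal_length_px: float,
                        real_height_m: float = 1.7) -> float:
    """Pinhole-model range estimate from the height of a detection box.

    real_height_m is an assumed typical height of a standing person.
    """
    pixel_height = abs(box_bottom_px - box_top_px)
    if pixel_height == 0:
        raise ValueError("bounding box has zero height")
    return focal_length_px * real_height_m / pixel_height


def should_avoid(distance_m: float, safety_distance_m: float = 5.0) -> bool:
    """Decide on avoidance when the obstacle is closer than an assumed safety distance."""
    return distance_m < safety_distance_m


if __name__ == "__main__":
    d = estimate_distance_m(box_top_px=100, box_bottom_px=270, focal_length_px=800.0)
    print(d, should_avoid(d))
```

The estimate degrades when the person is partly out of frame or crouching, since the box height then no longer corresponds to the assumed real height.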

This editorial summarizes and analyzes 17 articles selected for a special issue on machine learning advances for Industry 4.0 applications. The diverse articles cover fault detection, deep learning optimisation, IoT networking, vehicle control, recommendation systems and domain knowledge integration. Key methods represented include neural networks, deep learning, reinforcement learning and explainable AI. Real-world industrial case studies showcase machine learning's versatility in enabling intelligent automation, control, and decision-making across manufacturing, healthcare, transportation and other sectors. While highlighting theoretical innovations, the contributions also demonstrate machine learning's transformative potential for intelligent, connected, self-optimising next-generation production systems. This editorial concisely overviews the latest trends represented in this special issue.

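For UAV target recognition, localization, and tracking, this paper proposes a monocular-vision target recognition and tracking method for multirotor UAVs based on deep learning, which addresses the high cost of traditional binocular-camera approaches and their low recognition accuracy in complex environments. The method builds on a deep learning object detection algorithm based on convolutional neural networks: a model is trained on the target with this algorithm, and the trained model is loaded onto an onboard computer running ROS. The onboard computer is connected to a monocular camera. Once the camera captures the target, the target's position in the image is detected automatically and refined with an optimization algorithm based on coordinate differences. The target position is then converted into the desired velocity and altitude for controlling the UAV's flight and sent to the flight control board, which receives the tracking command from the onboard computer and tracks the target object. Test results verify that the method recognizes targets well and achieves target tracking.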
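
The abstract does not describe the coordinate-difference algorithm in detail. As a rough illustration only, the sketch below uses a simple proportional rule: the pixel offset between the detected box center and the image center is scaled into a clamped velocity command. The gain, the speed limit, the axis conventions, and the function names are assumptions of this sketch and are not the paper's method.

```python
from typing import Tuple


def clamp(value: float, limit: float) -> float:
    """Restrict value to the interval [-limit, limit]."""
    return max(-limit, min(limit, value))


def tracking_command(box_center: Tuple[float, float],
                     image_size: Tuple[int, int],
                     gain: float = 0.005,
                     max_speed: float = 1.0) -> Tuple[float, float]:
    """Map the pixel offset from the image center to (lateral, vertical) velocities."""
    dx = box_center[0] - image_size[0] / 2.0
    dy = box_center[1] - image_size[1] / 2.0
    # Image y grows downward, so a target below the center (dy > 0)
    # produces a negative (descending) vertical velocity.
    return clamp(gain * dx, max_speed), clamp(-gain * dy, max_speed)


if __name__ == "__main__":
    # Target 100 px to the right of the center of a 640x480 image.
    print(tracking_command((420.0, 240.0), (640, 480)))
```

Clamping keeps a detection near the image border from producing an abrupt command, which matters when the command is forwarded directly to a flight controller.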

