Similar Literature
 20 similar documents found (search time: 203 ms)
1.
宁楠楠  刘侠  邓可欣  吴萍  王坤  田捷 《自动化学报》2014,40(8):1793-1803
In bioluminescence tomography (BLT), dual-modality fusion (an optical modality combined with a structural modality) can exploit the high-precision 3D geometry provided by the structural modality to reconstruct the surface bioluminescent flux distribution and thereby localize internal light sources in small animals. Compared with a purely optical modality, however, dual-modality fusion suffers from a complex acquisition system, high cost, cumbersome data processing, and ionizing radiation (e.g., from CT). Studying bioluminescent source localization based on purely optical 3D geometry is therefore of great significance for BLT. Building on an all-optical bioluminescence tomography system (AOBTS), this paper proposes a 3D bioluminescent source localization method based on surface reconstruction from multi-angle optical projections. The method consists of four parts: 3D surface reconstruction from multi-angle optical projections, seamless fusion of multi-angle fluorescence, quantitative correction of the fluorescent flux, and reconstruction of the internal bioluminescent source. Experiments with a light source implanted in a real mouse show that, compared with conventional all-optical methods, the proposed method not only improves 3D surface reconstruction but also adds seamless multi-angle fluorescence fusion, achieving 3D bioluminescent source localization in a real mouse; preliminary experiments indicate its potential for preclinical small-animal studies.

2.
Medical Image Segmentation and Its Current State of Development   (cited 2 times: 0 self-citations, 2 by others)
Medical image segmentation underpins a wide range of medical imaging applications; in today's computer-aided diagnosis, image-guided surgery, and radiotherapy, it shows increasingly important clinical value. Medical images come in many varieties: routine modalities include magnetic resonance (MR) imaging, computed tomography (CT), positron emission tomography (PET), and ultrasound (US), and MR imaging alone can produce image modalities with many different timing-parameter sequences. Medical image segmentation has therefore grown into a distinct applied discipline oriented toward specific imaging modalities, clinical goals, and anatomical regions. Drawing on existing work at home and abroad, this article introduces, classifies, and systematically compares image segmentation methods, and concludes with a statistical analysis of six well-known international medical imaging journals and conferences to describe research trends in medical image segmentation.

3.
《软件》2019,(5):115-127
With the development of modern computer-aided technology and medical imaging systems, clinicians are placing higher demands on medical image processing, 3D visualization, and the introduction of emerging technologies. Medical image processing software, as the main means of meeting these clinical needs, has become a research hotspot in recent years. This paper introduces an independently developed, cloud-computing-based platform for medical image processing and 3D printing. The platform not only provides cloud data management, model reconstruction, 3D visualization, surgical planning, 3D printing, and engineering services, but also reserves development interfaces for next-generation applications such as surgical navigation and mixed reality.

4.
Medical image segmentation is a fundamental and critical task in computer-aided diagnosis, aiming to accurately identify target organs, tissues, or lesion regions at the pixel level. Unlike natural images, medical images often have complex textures and, limited by imaging techniques and devices, are noisy with blurred boundaries that are hard to delineate. Moreover, annotating medical images relies heavily on the knowledge and experience of medical experts, so the labeled data available for training is scarce and contains annotation errors. Because of these characteristics (blurred boundaries, limited training data, and sizable annotation noise), auxiliary diagnosis systems built on traditional image segmentation algorithms struggle to meet clinical requirements. In recent years, with the wide application of convolutional neural networks (CNN) in computer vision and natural language processing, deep-learning-based medical image segmentation has achieved great success. This article first reviews recent progress in deep-learning-based medical image segmentation, covering the basic architectures, objective functions, and optimization methods of these algorithms. It then surveys and analyzes the main lines of work on semi-supervised medical image segmentation, addressing the problem of limited annotated data, and introduces work on uncertainty analysis for annotation errors. Finally, it summarizes the characteristics of deep-learning-based medical image segmentation and looks ahead to future research directions.
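As an illustration of the segmentation objective functions this survey covers, the following is a minimal PyTorch sketch of a soft Dice loss, one of the most common objectives in medical image segmentation; it is a generic example, not code from the reviewed papers.

```python
import torch

def soft_dice_loss(pred_logits: torch.Tensor, target: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Soft Dice loss for binary segmentation (generic sketch).

    pred_logits: raw network outputs, shape (N, 1, H, W)
    target:      binary ground-truth masks, shape (N, 1, H, W)
    """
    probs = torch.sigmoid(pred_logits)                       # map logits to [0, 1]
    dims = (1, 2, 3)
    intersection = (probs * target).sum(dim=dims)
    cardinality = probs.sum(dim=dims) + target.sum(dim=dims)
    dice = (2.0 * intersection + eps) / (cardinality + eps)  # per-sample Dice coefficient
    return 1.0 - dice.mean()                                 # minimise (1 - Dice)
```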

5.
With the rapid development of deep learning in medicine, medical visual question answering (Med-VQA) has attracted wide attention from researchers. Most existing Med-VQA methods extract features from multi-modal medical images with a single feature extraction network whose weights are shared, which to some extent ignores the distinctive characteristics of different imaging modalities; when features of one modality are extracted, noise from the other modalities is introduced, making it hard for the model to focus on the key features of each modality. To address this problem, this paper proposes a Med-VQA method based on multi-modal feature extraction. First, the modality of a medical image is identified, and the modality label routes the input to feature extraction networks whose parameters are not shared, so as to obtain modality-specific features. Then a convolutional denoising module tailored to Med-VQA is designed to suppress noise in the features of different modalities. Finally, spatial and channel attention modules further strengthen attention to the distinctive features of each modality. Experimental results on the public Med-VQA dataset Slake show that the proposed method effectively improves Med-VQA accuracy.
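To illustrate the idea of routing images to encoders whose parameters are not shared across modalities, here is a minimal PyTorch sketch. The modality names, backbone, and feature dimension are placeholders, not the paper's actual configuration.

```python
import torch
import torch.nn as nn

class ModalitySpecificEncoder(nn.Module):
    """Route each image to a dedicated encoder whose weights are NOT shared across modalities."""
    def __init__(self, modalities=("xray", "ct", "mri"), feat_dim: int = 256):
        super().__init__()
        def backbone():
            return nn.Sequential(
                nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, feat_dim))
        self.encoders = nn.ModuleDict({m: backbone() for m in modalities})

    def forward(self, image: torch.Tensor, modality: str) -> torch.Tensor:
        # the modality label selects the dedicated (unshared) encoder
        return self.encoders[modality](image)
```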

6.
With the rapid development of Internet technology, data of various types such as text and images is growing explosively on the Web, and extracting valuable information from these multi-source, heterogeneous, yet semantically related multi-modal data is especially important. Cross-modal retrieval breaks through the limits of a single modality and retrieves information across data of different modalities, satisfying users' need to obtain information about an event. In recent years it has become a hot topic in both academia and industry. This paper focuses on image-text cross-modal retrieval. It first defines the task and analyzes its current challenges. It then summarizes existing approaches in three categories: (1) traditional methods; (2) deep-learning-based methods; (3) hashing-based methods. Next, it introduces the datasets commonly used for image-text cross-modal retrieval and analyzes and compares existing algorithms on them in detail. Finally, it discusses future research directions for the task.
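The hashing-based methods mentioned above share one core idea: map image and text embeddings into a common binary code space and rank by Hamming distance. The sketch below (NumPy, random data standing in for learned embeddings) illustrates that retrieval step only, not any specific published method.

```python
import numpy as np

def to_hash_codes(embeddings: np.ndarray) -> np.ndarray:
    """Binarise real-valued embeddings by sign: positive -> 1, otherwise 0."""
    return (embeddings > 0).astype(np.uint8)

def hamming_rank(query_code: np.ndarray, db_codes: np.ndarray) -> np.ndarray:
    """Return database indices sorted by Hamming distance to the query code."""
    distances = np.count_nonzero(db_codes != query_code, axis=1)
    return np.argsort(distances)

# toy example: a text query against an image database in a shared 64-bit code space
rng = np.random.default_rng(0)
image_codes = to_hash_codes(rng.standard_normal((1000, 64)))   # pre-computed image codes
text_query = to_hash_codes(rng.standard_normal(64))            # code of the text query
top10 = hamming_rank(text_query, image_codes)[:10]             # indices of the 10 nearest images
```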

7.
Image captioning is a fundamental task in multi-modal learning whose main goal is to convert the image modality into text. As captioning accuracy improves, there is growing interest in applying captioning models in medicine to help physicians interpret medical images. For better medical image captioning, this paper proposes a model based on SE-ResNet and an extended long short-term memory network. Experimental results show that the model outperforms traditional models on all evaluation metrics on the lung CT dataset IUX-ray and the medical imaging dataset PubCaption, and generates more accurate descriptions of medical images.
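For readers unfamiliar with SE-ResNet, the following PyTorch sketch shows the squeeze-and-excitation (SE) block that distinguishes it from a plain ResNet; the captioning decoder is not shown, and this is a generic SE block rather than the exact module used in the paper.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-Excitation block: re-weights channels using globally pooled statistics."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, c, _, _ = x.shape
        weights = self.fc(x.mean(dim=(2, 3))).view(n, c, 1, 1)  # squeeze (global pool) then excite
        return x * weights                                      # channel-wise re-scaling
```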

8.
3D visual grounding (3D VG) aims to accurately localize a target object in a 3D scene by understanding a referring expression. Existing 3D VG studies optimize the text and visual encoders by introducing text and visual classification tasks; because the text and visual features may be semantically misaligned, this can make it hard for the model to locate in the scene the visual object described by the text. In addition, the limited size of 3D VG datasets and complex model structures often lead to overfitting. To address these problems, this paper proposes the MP3DVG model, which learns a unified multi-modal feature representation to perform both the single-modality classification tasks and the 3D VG task while reducing overfitting. Based on cross-modal feature interaction, TGV and VGT modules are proposed to pre-fuse text and visual features before the single-modality tasks, reducing the adverse effect of semantic misalignment between modalities. Exploiting the fact that a linear classifier can assess the diversity of sample features, a periodically re-initialized auxiliary classifier is proposed, and a dynamic loss-adjustment term adaptively re-weights the per-sample loss to weaken overfitting. Extensive experiments demonstrate the superiority of the proposed method: compared with the MVT model, MP3DVG improves performance by 1.1% on Nr3D and 1.8% on Sr3D, and overfitting is markedly reduced.

9.
Deep visual generation is a popular direction in computer vision that aims to enable computers to automatically generate the expected visual content from input data. It empowers related industries with artificial intelligence, driving their automation and intelligent transformation. Generative adversarial networks (GANs) are an effective tool for deep visual generation; they have received enormous attention in recent years and become a fast-growing research direction. GANs can take input data of multiple modalities, including noise, images, text, and video, and generate images and videos through an adversarial game; they have been successfully applied to many visual generation tasks. Achieving realistic, diverse, and controllable visual generation with GANs is of great research significance. This paper surveys recent work on deep adversarial visual generation. It first introduces the background of deep visual generation and typical generative models, then reviews algorithms organized by the mainstream tasks of deep adversarial visual generation, summarizes the current pain points, and on that basis analyzes future development trends.
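The adversarial game mentioned above reduces to two coupled objectives. The PyTorch sketch below shows the non-saturating formulation given discriminator logits; it is a minimal illustration of the standard GAN losses, not code from any surveyed system.

```python
import torch
import torch.nn as nn

def gan_losses(d_real: torch.Tensor, d_fake: torch.Tensor):
    """Non-saturating GAN losses given discriminator logits on real and generated samples."""
    bce = nn.functional.binary_cross_entropy_with_logits
    ones, zeros = torch.ones_like(d_real), torch.zeros_like(d_fake)
    d_loss = bce(d_real, ones) + bce(d_fake, zeros)   # discriminator: real -> 1, fake -> 0
    g_loss = bce(d_fake, ones)                        # generator: try to fool the discriminator
    return d_loss, g_loss
```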

10.
Medical image segmentation is an important application of computer vision in medical image processing. Its goal is to segment target regions from medical images and thereby provide effective support for subsequent diagnosis and treatment. In recent years deep learning has made great progress in image processing, and deep-learning-based medical image segmentation algorithms have become a focus of research in this field. This article describes the task of medical image segmentation under computer vision and its difficulties, reviews deep-learning-based segmentation algorithms with an emphasis on classifying and summarizing current representative methods, introduces the evaluation metrics and datasets commonly used by these algorithms, and concludes with a summary and outlook on the development of the technology.

11.
In recent years deep learning has achieved excellent performance in single-modality domains such as computer vision (CV) and natural language processing (NLP). As the technology develops, the importance and necessity of multi-modal learning have gradually emerged. Vision-language learning, an important part of multi-modal learning, has received wide attention from researchers at home and abroad. Thanks to the development of the Transformer framework, more and more pre-trained models have been applied to vision-language multi-modal learning, and the performance of related tasks has improved by leaps and bounds. This paper systematically reviews current work on vision-language pre-trained models. It first introduces background knowledge of pre-trained models, then analyzes and compares model structures from two different perspectives, discusses commonly used vision-language pre-training techniques, describes five categories of downstream pre-training tasks in detail, and finally introduces the datasets commonly used for image and video pre-training and compares and analyzes the performance of common pre-trained models on different datasets and tasks.
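One pre-training technique widely used in this line of work is an image-text contrastive objective. The PyTorch sketch below shows a CLIP-style symmetric contrastive loss as a generic illustration; it is not drawn from the surveyed paper, and the temperature value is an arbitrary placeholder.

```python
import torch
import torch.nn.functional as F

def image_text_contrastive_loss(img_emb: torch.Tensor, txt_emb: torch.Tensor,
                                temperature: float = 0.07) -> torch.Tensor:
    """Symmetric image-text contrastive objective for N matched (image, text) pairs.

    img_emb, txt_emb: (N, D) embeddings; matched pairs share the same row index.
    """
    img_emb = F.normalize(img_emb, dim=-1)
    txt_emb = F.normalize(txt_emb, dim=-1)
    logits = img_emb @ txt_emb.t() / temperature                    # pairwise cosine similarities
    targets = torch.arange(len(img_emb), device=logits.device)      # matches lie on the diagonal
    # other pairs in the same batch serve as negatives
    return (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets)) / 2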

12.
13.
Accurate visual hand pose estimation at the joint level has several applications in human-robot interaction, natural user interfaces and virtual/augmented reality. However, it is still an open problem being addressed by the computer vision community. Recent deep learning techniques may help circumvent the limitations of standard approaches, but they require large amounts of accurately annotated data. The hand pose datasets released so far present issues such as a limited number of samples, inaccurate data or only high-level annotations. Moreover, most of them focus on depth-based approaches, providing only depth information (missing RGB data). In this work, we present a novel multiview hand pose dataset in which we provide hand color images and several kinds of annotations for each sample, i.e. the bounding box and the 2D and 3D locations of the hand joints. Furthermore, we introduce a simple yet accurate deep learning architecture for real-time, robust 2D hand pose estimation. We then conduct experiments showing that using the proposed dataset in the training stage produces accurate results for 2D hand pose estimation from a single color camera.
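A common design for this kind of real-time 2D pose network is heatmap regression: predict one heatmap per joint and take the argmax as the joint location. The PyTorch sketch below illustrates that general pattern only; it is not the architecture proposed in the cited paper, and the layer sizes are arbitrary.

```python
import torch
import torch.nn as nn

class TinyHeatmapNet(nn.Module):
    """Predict one heatmap per hand joint from an RGB crop (generic heatmap-regression sketch)."""
    def __init__(self, num_joints: int = 21):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU())
        self.head = nn.Conv2d(64, num_joints, 1)        # one heatmap channel per joint

    def forward(self, rgb: torch.Tensor) -> torch.Tensor:
        return self.head(self.backbone(rgb))            # (N, num_joints, H/2, W/2)

def heatmaps_to_coords(heatmaps: torch.Tensor) -> torch.Tensor:
    """Convert (N, J, H, W) heatmaps into integer (x, y) joint coordinates."""
    n, j, h, w = heatmaps.shape
    flat_idx = heatmaps.view(n, j, -1).argmax(dim=-1)   # index of the peak in each map
    return torch.stack([flat_idx % w, flat_idx // w], dim=-1)
```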

14.
Following an obvious growth of available collections of medical images in recent years, both in number and in size, machine learning has nowadays become an important tool for solving various image-analysis-related problems, such as organ segmentation or injury/pathology detection. The potential of learning algorithms to produce models having good generalisation properties is highly dependent on model complexity and the amount of available data. Bearing in mind that complex concepts require the use of complex models, it is of paramount importance to mitigate representation complexity, where possible, therefore enabling the utilisation of simpler models for performing the same task. When dealing with image collections of quasi-symmetric organs, or imaging observations of organs taken from different quasi-symmetric perspectives, one way of reducing representation complexity would be aligning all the images in a collection for left-right or front-rear orientation. That way, a learning algorithm would not be dealing with learning redundant symmetric representations. In this paper, we study in detail the influence of such within-class variation on model complexity, and present a possible solution, that can be applied to medical-imaging computer-aided diagnosis systems. The proposed method involves compacting the data, extracting features and then learning to separate the mirror-image representation classes from one another. Two efficient approaches are considered for performing such orientation separation: a fully automated unsupervised approach and a semi-automated supervised approach. Both solutions are directly applicable to imaging data. Method performance is illustrated on two 2D and one 3D real-world publicly-available medical datasets, concerning different parts of human anatomy, and observed using different imaging techniques: colour fundus photography, mammography CT scans and volumetric knee-joint MR scans. Experimental results suggest that efficient organ-mirroring orientation-classifier models, having expected classification accuracy greater than 99%, can be estimated using either the unsupervised or the supervised approach. In the presence of noise, however, an equally good performance can be achieved only by using the supervised approach, learning from a small subset of labelled data.
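To make the supervised orientation-alignment idea concrete, here is a minimal Python sketch: a small labelled subset trains a classifier to recognise mirror-image orientation, and every image predicted as flipped is mirrored back. This is a simplified illustration using raw pixels and scikit-learn logistic regression; the paper's data compaction and feature extraction steps are omitted.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def align_orientation(images: np.ndarray, labelled: np.ndarray, flipped: np.ndarray) -> np.ndarray:
    """Supervised orientation alignment (sketch).

    images:   (N, H, W) collection to align
    labelled: (M, H, W) small labelled subset
    flipped:  (M,) binary labels, 1 = mirror-image orientation
    """
    clf = LogisticRegression(max_iter=1000)
    clf.fit(labelled.reshape(len(labelled), -1), flipped)      # raw pixels as features (illustrative)
    pred = clf.predict(images.reshape(len(images), -1))
    aligned = images.copy()
    aligned[pred == 1] = aligned[pred == 1][:, :, ::-1]        # undo the predicted left-right flip
    return aligned
```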

15.

Deep reinforcement learning augments the reinforcement learning framework with the powerful representations of deep neural networks. Recent works have demonstrated remarkable successes of deep reinforcement learning in various domains including finance, medicine, healthcare, video games, robotics, and computer vision. In this work, we provide a detailed review of recent and state-of-the-art research advances of deep reinforcement learning in computer vision. We start by reviewing the theories of deep learning, reinforcement learning, and deep reinforcement learning. We then propose a categorization of deep reinforcement learning methodologies and discuss their advantages and limitations. In particular, we divide deep reinforcement learning into seven main categories according to their applications in computer vision, i.e. (i) landmark localization; (ii) object detection; (iii) object tracking; (iv) registration of both 2D images and 3D volumetric data; (v) image segmentation; (vi) video analysis; and (vii) other applications. Each of these categories is further analyzed in terms of reinforcement learning techniques, network design, and performance. Moreover, we provide a comprehensive analysis of the existing publicly available datasets and examine source code availability. Finally, we present some open issues and discuss future research directions on deep reinforcement learning in computer vision.
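As a reference point for the methods surveyed here, the PyTorch sketch below shows the single building block most of them share: a one-step temporal-difference (DQN-style) update. It is a generic illustration under the assumption of discrete actions, not code from the review.

```python
import torch
import torch.nn as nn

def dqn_td_loss(q_net: nn.Module, target_net: nn.Module,
                state, action, reward, next_state, done, gamma: float = 0.99) -> torch.Tensor:
    """One-step TD loss used in DQN-style deep reinforcement learning.

    action: long tensor of chosen action indices; done: float tensor of 0/1 terminal flags.
    """
    q_sa = q_net(state).gather(1, action.unsqueeze(1)).squeeze(1)   # Q(s, a) for the taken actions
    with torch.no_grad():
        q_next = target_net(next_state).max(dim=1).values           # max_a' Q_target(s', a')
        target = reward + gamma * q_next * (1.0 - done)              # bootstrap unless terminal
    return nn.functional.smooth_l1_loss(q_sa, target)
```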


16.

Deep learning has proved its efficiency in many fields of computer science such as computer vision, image classification, object detection, image segmentation, and more. Deep learning models primarily depend on the availability of huge datasets; without many images in a dataset, deep learning models cannot learn and will not produce accurate results. Unfortunately, several fields do not have access to large amounts of data, such as medical image processing. For example, the world is suffering from a lack of COVID-19 image datasets, and there has been no benchmark dataset since the beginning of 2020. This pandemic was the main motivation for this survey, which delivers and discusses the current image data augmentation techniques that can be used to increase the number of images. In this paper, a survey of data augmentation for digital images in deep learning is presented. The study begins with an introduction reflecting the importance of data augmentation in general. The classical image data augmentation taxonomy and photometric transformations are presented in the second section. The third section illustrates deep learning image data augmentation. Finally, the fourth section surveys the state of the art in using image data augmentation techniques in different deep learning research and applications.
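For a concrete sense of the classical geometric and photometric augmentations discussed above, here is a minimal sketch using torchvision transforms; the specific operations and parameter values are illustrative choices, not a pipeline from the survey.

```python
from torchvision import transforms

# classical geometric + photometric augmentations, applied on the fly at training time
train_augment = transforms.Compose([
    transforms.RandomResizedCrop(224, scale=(0.8, 1.0)),   # random crop and rescale (geometric)
    transforms.RandomHorizontalFlip(p=0.5),                # geometric transformation
    transforms.RandomRotation(degrees=10),
    transforms.ColorJitter(brightness=0.2, contrast=0.2),  # photometric transformation
    transforms.ToTensor(),
])
# each epoch sees a differently transformed copy of every training image,
# effectively enlarging a small (e.g. medical) dataset without new acquisitions
```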


17.
A Survey of Deep-Learning-Based Methods for 3D Data Analysis and Understanding   (cited 1 time: 0 self-citations, 1 by others)
Deep-learning-based analysis and understanding of 3D data is a research hotspot in digital geometry. Unlike deep-learning-based image analysis, the first problem it must solve is the diversity of data representations: compared with regular 2D images, 3D data can be represented discretely or continuously, most current deep-learning work relies on discrete representations, and different representations and different digital geometry processing tasks place different requirements on the network. This paper first summarizes commonly used 3D datasets and task-specific evaluation metrics, and analyzes 3D shape feature descriptors. Starting from specific tasks, it then reviews existing deep learning networks for 3D data analysis and understanding across the different 3D representations, compares the various methods, and further organizes existing work from the perspective of 3D data representation. Finally, based on the state of research at home and abroad, it discusses the challenging open problems and looks ahead to future trends.
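As a small example of the discrete 3D representations discussed above, the NumPy sketch below converts a point cloud into a binary voxel occupancy grid; it is a generic illustration, and the grid resolution is an arbitrary placeholder.

```python
import numpy as np

def voxelize(points: np.ndarray, resolution: int = 32) -> np.ndarray:
    """Convert an (N, 3) point cloud into a binary voxel occupancy grid (discrete 3D representation)."""
    mins, maxs = points.min(axis=0), points.max(axis=0)
    scale = (resolution - 1) / np.maximum(maxs - mins, 1e-9)      # fit the cloud into the grid
    idx = np.floor((points - mins) * scale).astype(int)
    grid = np.zeros((resolution, resolution, resolution), dtype=np.uint8)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = 1                      # mark occupied voxels
    return grid
```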

18.
Image-based 3D reconstruction aims to accurately recover the geometry of a real scene from a set of 2D multi-view images. It is a fundamental and active research topic in computer vision and photogrammetry, of great theoretical and practical value, with wide applications in smart cities, virtual tourism, digital heritage preservation, digital maps, and navigation. With the spread of image acquisition devices (smartphones, consumer digital cameras, civilian drones, etc.) and the rapid development of the Internet, large numbers of Internet images of a given outdoor scene can be obtained through search engines. Using these images for efficient, robust, and accurate 3D reconstruction to give users realistic and immersive experiences has become a research hotspot, drawing wide attention from academia and industry and producing a variety of methods. The emergence of deep learning offers new opportunities for 3D reconstruction from large-scale outdoor images. This paper first describes the basic sequential pipeline of large-scale outdoor image-based 3D reconstruction, including image retrieval, feature matching, structure from motion, and multi-view stereo. It then systematically reviews, from the perspectives of traditional methods and deep-learning-based methods, the development and application of 3D reconstruction techniques in each stage of the pipeline, and summarizes the datasets and evaluation metrics suited to large-scale outdoor scenes for each stage. Finally, it introduces current mainstream open-source and commercial 3D reconstruction systems and the state of the related industry in China.
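To ground one stage of the pipeline described above, here is a minimal OpenCV sketch of classical feature matching (SIFT with Lowe's ratio test), the front end that feeds structure from motion; retrieval, SfM, and MVS are omitted, and the file paths and ratio threshold are placeholders.

```python
import cv2

def match_image_pair(path1: str, path2: str, ratio: float = 0.75):
    """SIFT matching with Lowe's ratio test (sketch of the feature-matching stage only)."""
    img1 = cv2.imread(path1, cv2.IMREAD_GRAYSCALE)
    img2 = cv2.imread(path2, cv2.IMREAD_GRAYSCALE)
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(img1, None)
    kp2, des2 = sift.detectAndCompute(img2, None)
    matcher = cv2.BFMatcher()
    # keep a match only when it is clearly better than the second-best candidate
    good = [m for m, n in matcher.knnMatch(des1, des2, k=2) if m.distance < ratio * n.distance]
    return kp1, kp2, good    # putative correspondences passed on to two-view geometry / SfM
```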

19.
Stratified 3D reconstruction, i.e. a layer-by-layer 3D reconstruction upgraded from projective to affine and then to the final metric reconstruction, is a well-known 3D reconstruction method in computer vision. It is also a key supporting technology for well-known applications such as streetview, smart3D, and oblique photogrammetry. Generally speaking, existing computer vision methods in the literature can be roughly classified into either geometry-based approaches for spatial vision or learning-based approaches for object vision. Although deep learning has demonstrated tremendous success in object vision in recent years, learning 3D scene reconstruction from multiple images is still rare, if not non-existent, apart from work on learning depth from single images. This study explores the feasibility of learning stratified 3D reconstruction from putative point correspondences across images, and assesses whether it can be as robust to matching outliers as traditional geometry-based methods are. A special parsimonious neural network is designed for the learning. Our results show that it is indeed possible to learn a stratified 3D reconstruction from noisy image point correspondences, and the learnt reconstructions appear satisfactory, although they are still not on a par with the state of the art in the structure-from-motion community, due largely to the lack of an explicit robust outlier detector such as random sample consensus (RANSAC). To the best of our knowledge, this study is the first attempt in the literature to learn 3D scene reconstruction from multiple images. Our results also show that how to integrate an outlier detector, implicitly or explicitly, into learning methods is a key problem to solve in order to learn 3D scene structures comparable to those produced by the current geometry-based state of the art; otherwise any significant advancement in learning 3D structures from multiple images seems difficult, if not impossible. Besides, we even speculate that deep learning might be, by nature, unsuitable for learning 3D structure from multiple images or, more generally, for solving spatial vision problems.
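For context, the geometry-based baseline referred to above handles matching outliers with RANSAC while estimating two-view geometry; the OpenCV sketch below shows that step (fundamental matrix estimation from putative correspondences), as a generic illustration rather than the paper's pipeline, with the threshold and confidence values chosen arbitrarily.

```python
import cv2
import numpy as np

def robust_fundamental(pts1: np.ndarray, pts2: np.ndarray):
    """Estimate the fundamental matrix from putative correspondences with RANSAC.

    pts1, pts2: (N, 2) arrays of putative point correspondences across two views.
    Returns F (the starting point of projective reconstruction) and the inlier mask.
    """
    F, inlier_mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC, 1.0, 0.999)
    return F, inlier_mask.ravel().astype(bool)   # outliers are rejected by the RANSAC mask
```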

20.
Semantic image parsing, which refers to the process of decomposing images into semantic regions and constructing a structured representation of the input, has recently aroused widespread interest in the field of computer vision. The recent application of deep representation learning has driven this field into a new stage of development. In this paper, we summarize three aspects of the progress of research on semantic image parsing, i.e., category-level semantic segmentation, instance-level semantic segmentation, and beyond segmentation. Specifically, we first review the general frameworks for each task and introduce the relevant variants. The advantages and limitations of each method are also discussed. Moreover, we present a comprehensive comparison of different benchmark datasets and evaluation metrics. Finally, we explore the future trends and challenges of semantic image parsing.

