Similar articles
 20 similar articles found; search time: 15 ms
1.
    
As colon cancer is among the top causes of death, there is growing interest in developing improved techniques for the early detection of colon polyps. Given the close relation between colon polyps and colon cancer, detecting polyps helps avoid cancer cases. The increased availability of colorectal screening tests and the growing number of colonoscopies have increased the burden on medical personnel. In this article, the application of deep learning techniques for the detection and segmentation of colon polyps in colonoscopies is presented. Four techniques were implemented and evaluated: Mask R-CNN, PANet, Cascade R-CNN, and Hybrid Task Cascade (HTC). These were trained and tested using the CVC-Colon database, the ETIS-LARIB Polyp database, and a proprietary dataset. Three experiments were conducted to assess the techniques' performance: 1) training and testing using each database independently, 2) merging the databases and testing on each database independently using a merged test set, and 3) training on each dataset and testing on the merged test set. In our experiments, the PANet architecture had the best performance in polyp detection, and HTC was the most accurate at segmenting polyps. This approach allows us to employ deep learning techniques to assist healthcare professionals in the medical diagnosis of colon cancer. It is anticipated that this approach can be part of a framework for semi-automated polyp detection in colonoscopies.

2.
    
Zanthoxylum bungeanum Maxim, generally called prickly ash, is widely grown in China. Zanthoxylum rust is the main disease affecting the growth and quality of Zanthoxylum. Traditional methods for recognizing the degree of Zanthoxylum rust infection rely mainly on manual experience. Because the colors and shapes of rust areas are complex, manual recognition has low accuracy and is hard to quantify. In recent years, artificial intelligence technology has been applied increasingly in agriculture. In this paper, building on the DeepLabV2 model, we propose a Zanthoxylum rust image segmentation model based on the FASPP module and enhanced features of rust areas. We also constructed a fine-grained Zanthoxylum rust image dataset, in which each image was segmented and labeled into leaves, spore piles, and brown lesions. The experimental results showed that the proposed segmentation method was effective. The segmentation accuracy for leaves, spore piles, and brown lesions reached 99.66%, 85.16%, and 82.47%, respectively. MPA reached 91.80%, and MIoU reached 84.99%. The proposed model was also efficient, processing 22 images per minute. This article provides an intelligent method for efficiently and accurately recognizing the degree of Zanthoxylum rust infection.
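The abstract above reports per-class accuracy, MPA (mean pixel accuracy), and MIoU (mean intersection over union). As a minimal sketch of how these metrics are commonly derived from a confusion matrix, the function below uses only the Python standard library. The function name, the three-class matrix, and its counts are hypothetical illustrations, not data or code from the paper.

```python
def segmentation_metrics(confusion):
    """Compute per-class accuracy, per-class IoU, MPA, and MIoU.

    confusion[i][j] is the number of pixels whose true class is i
    and whose predicted class is j (hypothetical counts).
    """
    n = len(confusion)
    per_class_acc = []
    per_class_iou = []
    for c in range(n):
        tp = confusion[c][c]
        true_total = sum(confusion[c])                       # pixels that truly belong to c
        pred_total = sum(confusion[r][c] for r in range(n))  # pixels predicted as c
        union = true_total + pred_total - tp
        per_class_acc.append(tp / true_total if true_total else 0.0)
        per_class_iou.append(tp / union if union else 0.0)
    return {
        "per_class_acc": per_class_acc,
        "per_class_iou": per_class_iou,
        "mpa": sum(per_class_acc) / n,
        "miou": sum(per_class_iou) / n,
    }


# Hypothetical 3-class example (rows: true class, columns: predicted class).
example = [
    [90, 5, 5],
    [10, 80, 10],
    [0, 20, 80],
]
metrics = segmentation_metrics(example)
```

MIoU is typically lower than MPA because the IoU denominator also counts false positives, which per-class accuracy ignores.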

3.
    
A brain tumor is a mass or growth of abnormal cells in the brain. In children and adults, brain tumors are considered one of the leading causes of death. There are several types of brain tumors, including benign (non-cancerous) and malignant (cancerous) tumors. Diagnosing brain tumors as early as possible is essential, as this can improve the chances of successful treatment and survival. To address this problem, we present a hybrid intelligent deep learning technique that uses several pre-trained models (ResNet50, VGG16, VGG19, U-Net) and their integration for computer-aided detection and localization of brain tumors. These pre-trained and integrated deep learning models were applied to the publicly available dataset from The Cancer Genome Atlas, which consists of 120 patients. The pre-trained models were used to classify images as tumor or no tumor, while the integrated models were applied to segment the tumor region. We evaluated their performance in terms of loss, accuracy, intersection over union, Jaccard distance, Dice coefficient, and Dice coefficient loss. Among the pre-trained models, U-Net achieves the highest performance, with 95% accuracy. Among the integrated models, U-Net with ResNet-50 outperforms all others and correctly classifies and segments the tumor region.

4.
    
With rapid developments in convolutional neural networks for image processing, deep learning methods based on pixel classification have been extensively applied in medical image segmentation. One popular strategy for such tasks is the encoder-decoder-based U-Net architecture and its variants. Most segmentation methods based on fully convolutional networks lose spatial and contextual information because continuous pooling operations or strided convolutions decrease image resolution, and they make limited use of contextual and global information under different receptive fields. To overcome this shortcoming, this paper proposes a novel structure called RAAU-Net. RAAU-Net is a modified U-shaped architecture that aims to capture high-level information while preserving spatial information and focusing on the regions of interest. It comprises three main components: a feature encoder module that uses a pre-trained ResNet-18 model as a fixed feature extractor, a multi-receptive field extraction module that we developed, and a feature decoder module. We tested our method on several 2D medical image segmentation tasks, including retinal nerve, breast tumor, skin lesion, lung, gland, and polyp segmentation. The model achieved its best results on the skin lesion dataset, where Accuracy, Precision, IoU, Recall, and Dice Score were 3.26%, 5.42%, 9.92%, 6.52%, and 5.95% higher than those of UNet, respectively.

5.
    
Accurate segmentation of retinal vessels is crucial for the early diagnosis and treatment of eye diseases such as diabetic retinopathy, glaucoma, and macular degeneration. Because retinal vessels have an intricate structure, their features must be extracted precisely for the semantic segmentation of medical images. In this study, an improved deep learning neural network was developed with a focus on feature extraction based on the U-Net structure. The enhanced U-Net combines the architecture of convolutional neural networks (CNNs) with squeeze-and-excitation (SE) blocks to adaptively extract image features after each U-Net encoder's convolution. This approach helps suppress nonvascular regions and highlight features for specific segmentation tasks. The proposed method was trained and tested on the DRIVE, CHASE_DB1, and STARE datasets. The proposed model achieved an accuracy, sensitivity, specificity, Dice coefficient (Dc), and Matthews correlation coefficient (MCC) of 95.62/0.9853/0.9652, 0.7751/0.7976/0.7773, 0.9832/0.8567/0.9865, 82.53/87.23/83.42, and 0.7823/0.7987/0.8345, respectively, outperforming previous methods, including UNet++, attention U-Net, and ResUNet. The experimental results demonstrated that the proposed method improved retinal vessel segmentation performance.
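To make the squeeze-and-excitation idea concrete, here is a minimal sketch of a generic SE block in plain Python. It is not the authors' implementation: the function name, the weight shapes, the omission of bias terms, and the toy inputs are all assumptions chosen for illustration.

```python
import math


def se_block(feature_maps, w1, w2):
    """Generic squeeze-and-excitation over a list of C channels.

    feature_maps: list of C channels, each an H x W list of lists.
    w1: (C // r) x C weights of the reduction layer (hypothetical values).
    w2: C x (C // r) weights of the expansion layer (hypothetical values).
    Biases are omitted to keep the sketch short.
    """
    # Squeeze: global average pooling gives one scalar per channel.
    squeezed = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in feature_maps
    ]
    # Excitation: fully connected -> ReLU -> fully connected -> sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: each channel is multiplied by its gate in (0, 1).
    scaled = [
        [[v * g for v in row] for row in ch]
        for ch, g in zip(feature_maps, gates)
    ]
    return scaled, gates


# Toy input: two 2x2 channels and a reduction to a single hidden unit.
channels = [[[1.0, 1.0], [1.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]]]
scaled, gates = se_block(channels, w1=[[0.5, 0.5]], w2=[[1.0], [-1.0]])
```

Channels whose gate is close to 0 are suppressed and channels whose gate is close to 1 pass through, which is the mechanism the abstract relies on to de-emphasize nonvascular regions.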

6.
Zhang Yan, Ma Chunming, Liu Shudong, Sun Yemei. Opto-Electronic Engineering, 2024, 51(12): 240237-1-240237-15

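Existing Transformer-based semantic segmentation networks make insufficient use of multi-scale semantic information and incur high computational cost because they generate long sequences when processing images. To address these problems, this paper proposes MFE-Former, an efficient semantic segmentation backbone based on multi-scale feature enhancement. The network consists mainly of a multi-scale pooling self-attention (MPSA) module and a cross-spatial feed-forward network (CS-FFN) module. MPSA downsamples the feature-map sequence with multi-scale pooling operations, which reduces computational cost while efficiently extracting multi-scale contextual information from the sequence and strengthens the Transformer's ability to model multi-scale information. CS-FFN replaces the traditional fully connected layer with a simplified depthwise convolution layer, which reduces the number of parameters in the initial linear transformation layer of the feed-forward network, and introduces cross-spatial attention (CSA) into the feed-forward network so that the model captures interactions across different spaces more effectively, further improving its representational power. MFE-Former achieves a mean intersection over union of 44.1%, 80.6%, and 38.0% on the ADE20K, Cityscapes, and COCO-Stuff datasets, respectively. Compared with mainstream segmentation algorithms, MFE-Former obtains competitive segmentation accuracy at lower computational cost, effectively mitigating the insufficient use of multi-scale information and the high computational cost of existing methods.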
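The cost saving from pooling the key/value sequence can be illustrated with simple arithmetic. The sketch below is not MFE-Former's actual configuration: the 64 x 64 feature map, the pooling ratios (2, 4, 8), and both function names are hypothetical, and the count of query-key pairs is only a rough proxy for attention cost.

```python
import math


def pooled_token_count(height, width, pool_ratios):
    """Number of key/value tokens after pooling the H x W map at each ratio
    and concatenating the pooled maps (ratios are hypothetical)."""
    return sum(math.ceil(height / r) * math.ceil(width / r) for r in pool_ratios)


def attention_pairs(num_queries, num_keys):
    """Query-key pairs that attention must score, a rough cost proxy."""
    return num_queries * num_keys


h, w = 64, 64
queries = h * w                                   # 4096 query tokens
full_cost = attention_pairs(queries, queries)     # standard self-attention
pooled_keys = pooled_token_count(h, w, (2, 4, 8))
pooled_cost = attention_pairs(queries, pooled_keys)
print(full_cost, pooled_cost, pooled_cost / full_cost)
```

Under these assumed settings the pooled variant scores roughly a third of the query-key pairs, while the different pooling ratios still expose context at several scales.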


7.
    
Accurately and rapidly segmenting the prostate in transrectal ultrasound (TRUS) images remains challenging due to the complex semantic information in ultrasound images. This paper presents a SegFormer attention U-Net with cross-layer connections for efficient TRUS image segmentation. The SegFormer framework is enhanced by reducing model parameters and complexity without sacrificing accuracy. We introduce layer-skipping connections for precise positioning and combine local context with global dependency for superior feature recognition. The decoder is improved with a Multi-layer Perceptual Convolutional Block Attention Module (MCBAM) for better upsampling and reduced information loss, leading to increased accuracy. The experimental results show that, compared with classic or popular deep learning methods, this method has better segmentation performance, with a Dice similarity coefficient (DSC) of 97.55% and an intersection over union (IoU) of 95.23%. This approach balances encoder efficiency, multi-layer information flow, and parameter reduction.
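DSC and IoU recur throughout these abstracts. As a minimal sketch, both can be computed for a pair of binary masks as follows. The function name and the toy masks are hypothetical, and returning 1.0 when both masks are empty is a convention assumed here, not something stated in the abstract.

```python
def dice_and_iou(pred, target):
    """Dice similarity coefficient and IoU for two flat binary masks (0/1)."""
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    pred_sum = sum(pred)
    target_sum = sum(target)
    union = pred_sum + target_sum - intersection
    if union == 0:
        return 1.0, 1.0  # both masks empty: treated as perfect agreement
    dice = 2.0 * intersection / (pred_sum + target_sum)
    iou = intersection / union
    return dice, iou


# Toy masks with 3 foreground pixels each, 2 of them shared.
dice, iou = dice_and_iou([1, 1, 1, 0, 0, 0], [1, 1, 0, 1, 0, 0])
```

For a single mask pair the two metrics are tied by IoU = DSC / (2 - DSC). Plugging in DSC = 0.9755 gives about 0.952, close to the reported 95.23%, although averages taken over many images need not satisfy the identity exactly.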

8.
    
Optical Coherence Tomography (OCT) is very important in medicine and provides useful diagnostic information. Measuring retinal layer thicknesses plays a vital role in assessing the pathophysiologic factors of many ocular conditions. Among the existing retinal layer segmentation approaches, learning-based and deep learning-based methods represent the state of the art. However, most of these techniques rely on manually marked layers, and their performance is limited by image quality. To overcome this limitation, we build a framework based on gray value curve matching, which uses deep learning to match the curves for semi-automatic segmentation of retinal layers from OCT. The deep convolutional network learns the column correspondence in the OCT image in an unsupervised manner. The whole OCT image is passed through the deep convolutional neural network, which compares the gray values of each column and matches the gray value sequence of the transformation column with that of the next column. Using this algorithm, once a boundary point is manually specified, we can accurately segment the boundary between retinal layers. Our experimental results, obtained from a 54-subject database of both normal healthy eyes and affected eyes, demonstrate the superior performance of our approach.

9.
    
Tuberculosis (TB) is a highly infectious disease and one of the major health problems worldwide. Accurate detection of TB is a major challenge for most existing methods. This work addresses these issues and develops an effective mechanism for detecting TB using deep learning. First, a color space transformation converts the red, green, and blue (RGB) image to LUV space, where L stands for luminance and U and V represent the chromaticity values of color images. Then, adaptive thresholding is carried out for image segmentation, and various features, such as coverage, density, color histogram, area, length, and texture features, are extracted to enable effective classification. After feature extraction, the size of the feature set is reduced using principal component analysis. The extracted features are passed to a fractional crow search-based deep convolutional neural network (FC-SVNN) for classification. Then, image-level features, such as bacilli count, bacilli area, scattering coefficients, and skeleton features, are used to perform severity detection with the proposed adaptive fractional crow (AFC)-deep CNN. Finally, the infection level is determined using entropy, density, and detection percentage. The proposed AFC-Deep CNN algorithm is designed by modifying the FC algorithm with a self-adaptive concept. The proposed AFC-Deep CNN shows better performance, with a maximum accuracy of 0.935.

10.
11.
12.
The Imaging Science Journal, 2013, 61(6): 491-502
Abstract

Image segmentation is an important step in finger-vein identification. However, it is difficult to extract precise details from the image because of irregular noise and shadows around the finger-vein. The repeated line tracking algorithm achieves good segmentation performance on low-quality finger-vein images, but it has drawbacks such as low robustness and efficiency. In this paper, a modified repeated line tracking algorithm is proposed for finger-vein image segmentation. Firstly, we propose a segmentation method called threshold image to perform rough segmentation and obtain binary and skeleton images of the finger-vein. Secondly, the width of the finger-vein is estimated from the binary and skeleton images, and the parameters are revised according to the width. Then, the modified repeated line tracking algorithm is executed with the revised parameters to determine the locus space of the finger-vein. Finally, the results are obtained by applying the Otsu algorithm, which performs exact segmentation on the locus space. Experiments show that the proposed algorithm is more robust and efficient than the traditional repeated line tracking algorithm.
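The final step above relies on the Otsu algorithm. Below is a minimal sketch of standard Otsu thresholding on a flat list of 8-bit gray values, written with the Python standard library only. The function name and the toy pixel data are hypothetical, and the sketch covers only the thresholding step, not the modified repeated line tracking algorithm itself.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes between-class variance.

    Pixels with value <= t form one class, pixels with value > t the other.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    weight0 = 0          # number of pixels in the lower class
    sum0 = 0.0           # sum of gray values in the lower class
    best_t, best_var = 0, -1.0
    for t in range(levels):
        weight0 += hist[t]
        sum0 += t * hist[t]
        if weight0 == 0:
            continue     # lower class still empty
        weight1 = total - weight0
        if weight1 == 0:
            break        # upper class empty from here on
        mean0 = sum0 / weight0
        mean1 = (sum_all - sum0) / weight1
        between_var = weight0 * weight1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


# Toy bimodal data: a dark background and bright vein-like pixels.
pixels = [10] * 50 + [200] * 50
t = otsu_threshold(pixels)
binary = [1 if p > t else 0 for p in pixels]
```

Because the threshold is derived from the histogram alone, the method needs no manually tuned cutoff, which is presumably why it suits the final exact-segmentation step.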

13.
    
Liver segmentation is a crucial step in medical image analysis and is essential for diagnosing and treating liver diseases. However, manual segmentation is time-consuming and subject to variability among observers. To address these challenges, a novel liver segmentation approach, SwinUNet with transformer skip-fusion, is proposed. This method harnesses the Swin Transformer's capacity to model long-range dependencies efficiently, the U-Net's ability to preserve fine spatial details, and the transformer skip-fusion's effectiveness in enabling the decoder to learn intricate features from encoder feature maps. In experiments using the 3DIRCADb and CHAOS datasets, this technique outperformed traditional CNN-based methods, achieving a mean Dice coefficient of 0.988 and a mean Jaccard coefficient of 0.973 when aggregating the results obtained from each dataset, which indicates close agreement with the ground truth. This accuracy in liver segmentation holds significant promise for improving liver disease diagnosis and enhancing healthcare outcomes for patients with liver conditions.

14.
    
The diagnosis, treatment planning, follow-up, and prognostication of gliomas are significantly enhanced by magnetic resonance imaging. In the present research, a deep learning-based variant of the convolutional neural network methodology is proposed for glioma segmentation, in which a pretrained autoencoder acts as the backbone of a 3D-Unet that performs both the segmentation task and image restoration. The Unet accepts as input the combination of three non-native MR images (T2, T1CE, and FLAIR) to extract maximum and superior features for segmenting tumor regions. A weighted dice loss is employed that focuses on separating the tumor into three regions of interest, namely whole tumor with oedema (WT), enhancing tumor (ET), and tumor core (TC). The optimizer used in the proposed methodology is Adam, and the learning rate is initially set to 1e4 and progressively reduced by a cosine decay after 50 epochs. The number of learnable parameters is substantially reduced (to 9.8 M, compared with 27 M). The experimental results show that the proposed model achieved Dice similarity coefficients of 0.77, 0.92, and 0.84; sensitivity of 0.90, 0.95, and 0.89; specificity of 0.97, 0.99, and 0.99; and Hausdorff95 of 5.74, 4.89, and 6.00 for the ET, WT, and TC regions, respectively. The proposed glioma segmentation method is efficient for the segregation of tumors.
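The learning-rate schedule described above can be sketched as a constant phase followed by cosine decay. Several values here are assumptions rather than facts from the abstract: the printed initial rate "1e4" is read as 1e-4 (a typical Adam setting), the total of 300 epochs is hypothetical, and "after 50 epochs" is interpreted as holding the rate constant for the first 50 epochs.

```python
import math


def learning_rate(epoch, base_lr=1e-4, hold_epochs=50, total_epochs=300, min_lr=0.0):
    """Constant learning rate for hold_epochs, then cosine decay to min_lr.

    base_lr=1e-4 and total_epochs=300 are assumed values for illustration.
    """
    if epoch < hold_epochs:
        return base_lr
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = min(1.0, (epoch - hold_epochs) / (total_epochs - hold_epochs))
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Sample the schedule at a few epochs.
schedule = {epoch: learning_rate(epoch) for epoch in (0, 49, 50, 175, 300)}
```

The cosine shape lowers the rate slowly at first and near the end, with the fastest decrease in the middle of the decay phase.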

15.
    
Glioma segmentation is a critical and challenging task in surgery and treatment, and it is also the basis for the subsequent evaluation of gliomas. Magnetic resonance imaging is extensively employed in diagnosing brain and nervous system abnormalities. However, brain tumor segmentation remains challenging because differentiating brain tumors from normal tissues is difficult, tumor boundaries are often ambiguous, and the shape, location, and extent of tumors vary greatly across patients. Effective image segmentation architectures are therefore needed. In the past few decades, many algorithms for automatic segmentation of brain tumors have been proposed, and methods based on deep learning have achieved favorable performance. In this article, we propose a Multi-Scale 3D U-Nets architecture, which uses several U-net blocks to capture long-distance spatial information at different resolutions. We upsample feature maps at different resolutions to extract and utilize sufficient features, and we hypothesize that semantically similar features are easier to learn and process. To reduce the computational cost, we use 3D depthwise separable convolution instead of some standard 3D convolutions. On the BraTS 2015 testing set, we obtained dice scores of 0.85, 0.72, and 0.61 for the whole tumor, tumor core, and enhancing tumor, respectively. Our segmentation performance was competitive with other state-of-the-art methods.
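The saving from 3D depthwise separable convolution can be checked with a parameter count. The sketch below compares a standard 3D convolution with a depthwise convolution followed by a 1 x 1 x 1 pointwise convolution. The kernel size and channel counts are hypothetical examples, biases are ignored, and the function names are illustrative only.

```python
def standard_conv3d_params(kernel, in_channels, out_channels):
    """Weights in a standard 3D convolution (bias ignored)."""
    return kernel ** 3 * in_channels * out_channels


def separable_conv3d_params(kernel, in_channels, out_channels):
    """Weights in a depthwise 3D convolution plus a 1x1x1 pointwise convolution."""
    depthwise = kernel ** 3 * in_channels     # one k x k x k filter per input channel
    pointwise = in_channels * out_channels    # mixes channels at each voxel
    return depthwise + pointwise


# Hypothetical layer: 3 x 3 x 3 kernel, 32 input channels, 64 output channels.
standard = standard_conv3d_params(3, 32, 64)
separable = separable_conv3d_params(3, 32, 64)
ratio = separable / standard
```

The ratio equals 1 / out_channels + 1 / kernel^3, so for this assumed layer the separable version needs about 5% of the weights of the standard one.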

16.
    
A wide range of camera apps and online video conferencing services support changing the background in real time for aesthetic, privacy, and security reasons. Numerous studies show that deep learning (DL) is a suitable option for human segmentation and that an ensemble of multiple DL-based segmentation models can improve the segmentation result. However, these approaches are not as effective when applied directly to image segmentation in a video. This paper proposes an Adaptive N-Frames Ensemble (AFE) approach for high-movement human segmentation in a video using an ensemble of multiple DL models. In contrast to a conventional ensemble, which executes multiple DL models simultaneously for every video frame, the proposed AFE approach executes only a single DL model on the current video frame. It combines the segmentation outputs of previous frames into the final segmentation output when the frame difference is less than a particular threshold. Our method builds on the N-Frames Ensemble (NFE) method, which ensembles the image segmentation of the current video frame and previous video frames. However, NFE is not suitable for segmenting fast-moving objects in a video or for videos with low frame rates. The proposed AFE approach addresses these limitations. Our experiment uses three human segmentation models, namely Fully Convolutional Network (FCN), DeepLabv3, and Mediapipe. We evaluated our approach using 1711 videos of the TikTok50f dataset with a single-person view. The TikTok50f dataset is a reconstructed version of the publicly available TikTok dataset, obtained by cropping, resizing, and dividing it into videos of 50 frames each. This paper compares the proposed AFE with single models, the Two-Models Ensemble, and the NFE models. The experimental results show that the proposed AFE is suitable for both low-movement and high-movement human segmentation in a video.
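The thresholding idea behind AFE can be illustrated with a small sketch. This is not the authors' algorithm: the mean-absolute-difference measure, the rule of discarding history on large movement, the parameter values, and the `segment` callable (a stand-in for one DL model) are all assumptions. Rotating among several models across frames is not modeled here.

```python
def frame_difference(a, b):
    """Mean absolute pixel difference between two equally sized grayscale frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def adaptive_ensemble(frames, segment, n_frames=3, diff_threshold=10.0):
    """Average the current soft mask with recent ones only while motion is small.

    frames: list of flat grayscale frames.
    segment: callable returning a soft mask (values in [0, 1]) for one frame;
             a hypothetical stand-in for a single segmentation model.
    """
    history = []   # soft masks of recent frames considered still valid
    outputs = []
    previous = None
    for frame in frames:
        mask = segment(frame)
        if previous is not None and frame_difference(frame, previous) >= diff_threshold:
            history = []  # large movement: older masks are stale, drop them
        history.append(mask)
        history = history[-n_frames:]
        averaged = [sum(values) / len(history) for values in zip(*history)]
        outputs.append([1 if v >= 0.5 else 0 for v in averaged])
        previous = frame
    return outputs


# Toy "model": bright pixels are foreground. The third frame moves abruptly.
toy_segment = lambda frame: [1.0 if v > 128 else 0.0 for v in frame]
result = adaptive_ensemble([[200, 0], [200, 0], [0, 200]], toy_segment)
```

In the toy run the third frame differs sharply from the second, so its mask is used alone. A fixed N-frame average would instead have kept the stale foreground position, which is the failure mode on fast movement that the abstract attributes to NFE.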

17.
The production workshop for automobile body-in-white has a harsh environment and serious light pollution, so locating solder joints is inaccurate and inefficient when a vision system is combined with other equipment to inspect solder joint quality. To address this problem, an improved U-Net image segmentation algorithm is proposed. The convolution structure is improved to better fuse the semantic information of the feature maps and to lighten the network structure. The loss function is improved and an attention mechanism is integrated to better mine the foreground when positive and negative samples are imbalanced, to obtain spatial features from feature maps at different scales, and to establish long-range channel relationships. Compared with the original U-Net network, the Dice coefficient of the proposed RPSA-U-Net network is increased by 8.76% to 0.9836, the MIOU is increased by 11.5% to 0.96781, and the number of network parameters is reduced by 7%. Combined with an image processing method for finding the center of the solder joint, the approach offers higher efficiency and precision and has practical application value.

18.
    
Automatic cervical cancer segmentation in multimodal magnetic resonance imaging (MRI) is essential because tumor location and delineation can support patients' diagnosis and treatment planning. To meet this clinical demand, we present an encoder–decoder deep learning architecture that employs an EfficientNet encoder in the UNet++ architecture (E-UNet++). EfficientNet helps encode multiscale image features effectively. The nested decoders with skip connections aggregate multiscale features from low level to high level, which helps detect fine-grained details. A cohort of 228 cervical cancer patients with multimodal MRI sequences, including T2-weighted imaging, diffusion-weighted imaging, apparent diffusion coefficient imaging, contrast enhancement T1-weighted imaging, and dynamic contrast-enhanced imaging (DCE), was studied. Evaluations are performed on either single or multimodal MRI with standard quantitative segmentation metrics: dice similarity coefficient (DSC), intersection over union (IOU), and 95% Hausdorff distance (HD). Our results show that the E-UNet++ model can achieve DSC values of 0.681–0.786, IOU values of 0.558–0.678, and 95% HD values of 3.779–7.411 pixels on different single sequences. Meanwhile, it provides DSC values of 0.644 and 0.687 on three DCE subsequences and on all MRI sequences together, respectively. Our model is superior to the other models compared, which shows its potential as an artificial intelligence tool for cervical cancer segmentation in multimodal MRI.

19.
    
Recently, medical data classification has become a hot research topic among healthcare professionals and research communities, as it assists in disease diagnosis and decision making. The latest developments in artificial intelligence (AI) approaches pave the way for the design of effective medical data classification models. At the same time, the large number of features in medical datasets poses a curse-of-dimensionality problem. To resolve these issues, this article introduces a novel feature subset selection with artificial intelligence based classification model for biomedical data (FSS-AICBD) technique. The FSS-AICBD technique aims to derive a useful set of features and thereby improve the classifier results. First, the FSS-AICBD technique applies min-max normalization to prevent data complexity. In addition, the information gain (IG) approach is applied for the optimal selection of feature subsets. A group search optimizer (GSO) with deep belief network (DBN) model is then used for biomedical data classification, where the hyperparameters of the DBN model are optimally tuned by the GSO algorithm. The choice of the IG and GSO approaches leads to promising medical data classification results. The experimental analysis of the FSS-AICBD technique was carried out using different benchmark healthcare datasets. The simulation results showed improved outcomes for the FSS-AICBD technique in terms of several measures.
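The first two stages of the pipeline above, min-max normalization and information-gain feature ranking, can be sketched with the standard library. This is a generic illustration, not the FSS-AICBD implementation: the function names, the single fixed split threshold of 0.5 on normalized values, and the toy data are assumptions.

```python
import math
from collections import Counter


def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant feature carries no information
    return [(v - lo) / (hi - lo) for v in values]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(feature, labels, threshold=0.5):
    """Entropy reduction from splitting a normalized feature at threshold
    (the fixed threshold is an assumption made for this sketch)."""
    left = [y for x, y in zip(feature, labels) if x <= threshold]
    right = [y for x, y in zip(feature, labels) if x > threshold]
    conditional = sum(
        len(part) / len(labels) * entropy(part) for part in (left, right) if part
    )
    return entropy(labels) - conditional


# Toy data: feature_a separates the classes, feature_b does not.
labels = [0, 0, 1, 1]
feature_a = min_max_normalize([2, 4, 6, 8])
feature_b = min_max_normalize([7, 3, 7, 3])
gains = {
    "feature_a": information_gain(feature_a, labels),
    "feature_b": information_gain(feature_b, labels),
}
```

Ranking features by such a score and keeping the top ones gives a feature subset. The abstract does not say how many features are kept, so that choice is left open here.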

20.
    
Wearable sound detectors require strain sensors that are stretchable, sensitive, and capable of adhering conformably to the skin, and 2D materials hold great promise toward this end. However, the vibration of vocal cords and muscle contraction are complex and changeable, which can compromise the sensing performance of devices. By combining deep learning and 2D MXenes, an MXene‐based sound detector is successfully prepared with improved recognition and a sensitive response to pressure and vibration, which facilitates the production of a high‐recognition, high‐resolution sound detector. By training and testing the deep learning network model with large amounts of data obtained by the MXene‐based sound detector, the long vowels and short vowels of human pronunciation are successfully recognized. The proposed scheme accelerates the application of artificial throat devices in biomedical fields and opens up practical applications in voice control, motion monitoring, and many other fields.
