Similar literature
20 similar documents found.
1.
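Online multi-object tracking is an important prerequisite for real-time video sequence analysis. To address the low reliability of object detection, frequent track loss, and unsmooth trajectories in online multi-object tracking, an online multi-object tracking model with multi-candidate association based on the R-FCN framework is proposed. First, an R-FCN-based network selects more reliable candidate boxes from the Kalman filter (KF) predictions and the detection results. Next, a Siamese network measures appearance-based similarity to associate candidates with trajectories. Finally, the RANSAC algorithm refines the tracked trajectories. In complex scenes with dense crowds and partially occluded targets, the proposed algorithm shows strong target recognition and tracking ability, greatly reduces missed and false detections, and yields more continuous and smoother trajectories. Experimental results show that, under the same conditions, the proposed method improves considerably on existing methods in several metrics, including multi-object tracking accuracy (MOTA), the number of mostly lost trajectories (ML), and the number of false negatives (FN).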

2.
Manual detection of small uncalcified pulmonary nodules (diameter <4 mm) in thoracic computed tomography (CT) scans is a tedious and error-prone task. Automatic detection of disperse micronodules is thus highly desirable for improved characterization of fatal and incurable occupational pulmonary diseases. Here, we present a novel computer-assisted detection (CAD) scheme dedicated specifically to detecting micronodules. The proposed scheme consists of a candidate-screening module and a false positive (FP) reduction module. The candidate-screening module begins with a lung segmentation algorithm, followed by a combination of thresholding parameters based on 2D/3D features to identify plausible micronodules. The FP reduction module employs a 3D convolutional neural network (CNN) to classify each identified candidate. It automatically encodes discriminative representations by exploiting the volumetric information of each candidate. A set of 872 micronodules in 598 CT scans, each marked by at least two radiologists, is extracted from the Lung Image Database Consortium and Image Database Resource Initiative to test our CAD scheme. The CAD scheme achieves a detection sensitivity of 86.7% (756/872) with only 8 FPs/scan and an AUC of 0.98. Our proposed CAD scheme efficiently identifies micronodules in thoracic scans with only a small number of FPs. Our experimental results provide evidence that the features generated automatically by the 3D CNN are highly discriminative, making it a well-suited FP reduction module for a CAD scheme.

3.
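3D-CNNs extract spatio-temporal features from video well but demand heavy computation and memory. To address this, this paper designs an efficient 3D convolution block to replace the computationally expensive 3×3×3 convolution layer, and proposes a dense residual network incorporating these blocks (3D-EDRNs) for human action recognition. The efficient 3D convolution block combines a 1×3×3 convolution layer that captures the spatial features of the video with a 3×1×1 convolution layer that captures its temporal features. Placing efficient 3D convolution blocks at multiple positions in the dense residual network exploits the ease of optimization of residual blocks and the feature reuse of densely connected networks, while also shortening training time and improving the efficiency and performance of spatio-temporal feature extraction. Experiments on the classic datasets UCF101 and HMDB51 and on the dynamic multi-view complex 3D human action database (DMV action3D) verify that 3D-EDRNs with 3D convolution blocks significantly reduce model complexity and effectively improve classification performance, while requiring fewer computing resources, fewer parameters, and less training time.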
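The saving from replacing a 3×3×3 convolution with a 1×3×3 layer followed by a 3×1×1 layer can be illustrated by counting weights. The sketch below is only an illustration: the channel widths (64 in, 64 intermediate, 64 out) and the helper names are assumptions, not values taken from the paper, and biases are ignored.

```python
def conv3d_params(c_in, c_out, kt, kh, kw):
    """Weight count of a 3D convolution with a kt x kh x kw kernel (no bias)."""
    return c_in * c_out * kt * kh * kw


def factorized_params(c_in, c_mid, c_out):
    """Weights of a 1x3x3 (spatial) conv followed by a 3x1x1 (temporal) conv."""
    spatial = conv3d_params(c_in, c_mid, 1, 3, 3)
    temporal = conv3d_params(c_mid, c_out, 3, 1, 1)
    return spatial + temporal


# Hypothetical channel widths, chosen only for illustration.
full = conv3d_params(64, 64, 3, 3, 3)
factorized = factorized_params(64, 64, 64)
ratio = factorized / full  # with equal widths this is (9 + 3) / 27
```

With equal channel widths the factorized block keeps 12/27 of the weights of the full 3×3×3 layer, which is consistent with the reduced parameter count the abstract reports in general terms.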

4.
To generate realistic three-dimensional animation of a virtual character, capturing real facial expressions is the primary task. Because of diverse facial expressions and complex backgrounds, the facial landmarks recognized by existing strategies suffer from deviations and low accuracy. This paper therefore proposes a facial expression capture method based on a two-stage neural network that combines improved multi-task cascaded convolutional networks (MTCNN) with a high-resolution network. First, the convolution operation of the traditional MTCNN is improved. Face information in the input image is quickly filtered by feature fusion in the first stage, and Octave Convolution replaces the original convolutions in the second stage to enhance the feature extraction ability of the network, which further rejects a large number of false candidates. The model locates the faces and outputs more accurate facial candidate windows for better landmark recognition. The images cropped after face detection are then input into the high-resolution network. Multi-scale feature fusion is realized by connecting multi-resolution streams in parallel, and rich high-resolution heatmaps of facial landmarks are obtained. Finally, changes in the recognized facial landmarks are tracked in real time. The expression parameters are extracted and transmitted to the Unity3D engine to drive the virtual character's face, realizing synchronous facial expression animation. Extensive experimental results on the WFLW database demonstrate the superiority of the proposed method in accuracy and robustness, especially for diverse expressions and complex backgrounds. The method can accurately capture facial expressions and generate three-dimensional animation effects, making online entertainment and social interaction more immersive in shared virtual spaces.

5.
Computer Assisted Diagnosis (CAD) is an effective method for detecting lung cancer from computed tomography (CT) scans. The development of artificial neural networks has made CAD more accurate in detecting pathological changes. Because of the complexity of the lung environment, existing neural network training still requires large datasets, excessive time, and a lot of memory. To meet this challenge, we analyze 3D volumes as serialized 2D slices and present a new neural network structure, a lightweight convolutional neural network (CNN)-long short-term memory (LSTM) network, for lung nodule classification. Our network contains two main components: (a) optimized lightweight CNN layers with a tiny parameter space for extracting visual features from serialized 2D images, and (b) an LSTM network for learning relevant information among the 2D images. Across all experiments, we compared the training results of several models, and our model achieved an accuracy of 91.78% for lung nodule classification with an AUC of 93%. We used fewer samples and less memory to train the model, and we achieved faster convergence. Finally, we analyzed and discussed the feasibility of migrating this framework to mobile devices. The framework can also be applied to cope with small amounts of training data and to the future development of mobile health devices.

6.
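Objective: To improve the accuracy of ship detection in optical remote sensing images and achieve fine-grained classification of ship types without complex image preprocessing or hand-crafted feature extraction. Methods: For each input image, selective search is used to generate candidate ship regions. A convolutional neural network is trained in a supervised manner on labeled training samples to obtain the network parameters. The trained network then extracts abstract features and classifies the candidate regions, and the position and type of each ship are determined simultaneously from the classification probabilities of the candidate regions. Results: In a comparison with two existing detection methods, the experimental results show that the convolutional neural network effectively improves ship detection accuracy, with an average detection accuracy of 93.3%. Conclusion: The method requires no complex preprocessing, detects and classifies ships simultaneously, and effectively improves ship detection accuracy.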

7.
Multiple ocular region segmentation plays an important role in applications such as biometrics, liveness detection, healthcare, and gaze estimation. Typically, segmentation techniques focus on a single region of the eye at a time. Despite a number of obvious advantages, very limited research has focused on multiple regions of the eye. Accurate segmentation of multiple eye regions is also necessary in challenging scenarios involving blur, ghost effects, low resolution, off-angles, and unusual glints. Currently available segmentation methods cannot address these constraints. In this paper, to achieve accurate segmentation of multiple eye regions in unconstrained scenarios, a lightweight outer residual encoder-decoder network suitable for images from various sensors is proposed. The proposed method can determine the true boundaries of the eye regions in inferior-quality images using the high-frequency information flow from the outer residual encoder-decoder deep convolutional neural network (called ORED-Net). Moreover, the proposed ORED-Net model does not rely on increased complexity, number of parameters, or network depth to improve performance. The proposed network is considerably lighter than previous state-of-the-art models. Comprehensive experiments were performed, and optimal performance was achieved on the SBVPI and UBIRIS.v2 datasets, which contain images of the eye region. The proposed ORED-Net achieved mean intersection over union (mIoU) scores of 89.25 and 85.12 on the challenging SBVPI and UBIRIS.v2 datasets, respectively.
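For readers unfamiliar with the metric, the mean intersection over union (mIoU) quoted above can be sketched as follows. This is a generic illustration on a tiny hand-made label map, not the authors' evaluation code. The function name, the convention of skipping classes absent from both maps, and the example labels are assumptions.

```python
def mean_iou(pred, target, num_classes):
    """Average per-class IoU over two equally sized 2D label maps.

    Classes that appear in neither map are skipped (an assumed convention).
    """
    ious = []
    for c in range(num_classes):
        inter = 0
        union = 0
        for pred_row, target_row in zip(pred, target):
            for p, t in zip(pred_row, target_row):
                in_pred = p == c
                in_target = t == c
                inter += int(in_pred and in_target)
                union += int(in_pred or in_target)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy 2x3 label maps with three hypothetical classes (0, 1, 2).
pred = [[0, 0, 1],
        [1, 1, 2]]
target = [[0, 1, 1],
          [1, 1, 2]]
score = mean_iou(pred, target, num_classes=3)  # (0.5 + 0.75 + 1.0) / 3
```

Scores such as 89.25 correspond to this quantity expressed as a percentage.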

8.
The Imaging Science Journal, 2013, 61(7): 556-567
Abstract

Region growing is an important image segmentation technique in medical research for tumour detection. In this paper, we propose an effective modified region growing technique for the detection of brain tumours. It consists of four steps: (1) pre-processing; (2) modified region growing, which adds an orientation constraint to the usual intensity constraint; (3) feature extraction from the region; and (4) final classification using a neural network. The performance of the proposed technique is systematically evaluated using magnetic resonance imaging (MRI) brain images obtained from public sources. To validate the effectiveness of the modified region growing, we consider the quantity rate parameter. To evaluate the proposed tumour detection technique, we use sensitivity, specificity, and accuracy, computed from the counts of false positives, false negatives, true positives, and true negatives. Comparative analyses of normal and modified region growing were made using both a Feed Forward Neural Network (FFNN) and a Radial Basis Function (RBF) neural network. The results show that the proposed technique achieved an accuracy of 80% on the testing dataset, which demonstrates the effectiveness of the modified region growing compared with the normal technique.
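The three evaluation measures named above follow directly from the four confusion-matrix counts. A minimal sketch, using standard definitions and made-up counts that are not results from the paper:

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # share of actual tumours detected
        "specificity": tn / (tn + fp),  # share of normal cases correctly rejected
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical counts for illustration only.
metrics = diagnostic_metrics(tp=18, fp=3, tn=27, fn=2)
```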

9.
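To improve the accuracy of object detection, a deep-learning-based method that performs object detection through weighted fusion of feature maps is proposed. First, the idea of resampling shallow feature maps in a convolutional neural network and fusing them with the deepest feature map by weighting is put forward. Second, based on this idea and the specific structure of the convolutional neural network, a corresponding weighted fusion scheme is formulated, which yields a new feature map. Then, an improved RPN is proposed, and the new feature map is fed into it to obtain region proposals. Finally, the new feature map and the region proposals are fed into the subsequent network layers to complete object detection. Experimental results show that the proposed method achieves higher detection accuracy and better detection results.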

10.
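Zhao Peng, Tang Yingjie, Yang Mu, An Jing. Packaging Engineering, 2020, 41(5): 192-196
Objective: To address the heavy reliance on manual labor and the low efficiency of traditional non-woven fabric defect classification and inspection, a convolutional neural network classification and inspection method that meets factory requirements is proposed. Methods: First, a library of 70,000 non-woven fabric image samples is built, covering five categories: stains, wrinkles, breaks, missing yarn, and no defect. Second, a neural network is constructed with convolutional and pooling layers that have different numbers of neurons. The weights are then updated layer by layer with the backpropagation algorithm, minimizing the loss function by gradient descent. Finally, a Softmax classifier performs the defect classification. Results: A 12-layer convolutional neural network was built and tested on 20,000 samples. The accuracy on defect-free samples reached 100%, the classification accuracy for every defect class was above 95%, and the detection time was within 35 ms. Conclusion: The method meets the requirements for real-time classification and inspection of non-woven fabric defects on industrial production lines.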
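The final Softmax classification step can be sketched as below. This is a generic illustration only: the five class names follow the categories listed in the abstract, while the logit values and the function names are assumptions, and the 12-layer network that would produce the logits is not reproduced.

```python
import math

# Class order is an assumption; the names follow the five categories above.
CLASSES = ["stain", "wrinkle", "break", "missing yarn", "no defect"]


def softmax(logits):
    """Numerically stable softmax: subtract the max before exponentiating."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits):
    """Return the most probable class name and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASSES[best], probs[best]


# Hypothetical output of the last fully connected layer for one image.
label, confidence = classify([0.1, 2.0, -1.0, 0.3, 0.5])
```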

11.
Distributed Denial-of-Service (DDoS) attacks have caused great damage to networks in the big data environment. Existing detection methods are characterized by low computational efficiency, a high false alarm rate, and a high missing alarm rate. In this paper, we propose a DDoS attack detection method based on a network flow grayscale matrix feature and a multiscale convolutional neural network (CNN). According to the different characteristics of attack flows and normal flows in the IP protocol, a seven-tuple is defined to describe the network flow characteristics and is converted into a grayscale feature through binary encoding. Based on the network flow grayscale matrix feature (GMF), convolution kernels of different spatial scales are used to improve the accuracy of feature segmentation, and global and local features of the network flow are extracted. A DDoS attack classifier based on the multiscale convolutional neural network is then constructed. Experiments show that, compared with related methods, this method improves the robustness of the classifier and reduces the false alarm rate and the missing alarm rate.
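The abstract does not list the seven fields or the exact encoding, so the following is only one plausible reading of "converted into a grayscale feature through binary encoding": each integer-valued field becomes one row of pixels, with bit 1 mapped to white (255) and bit 0 to black (0). The field values, the bit width, and the function name are all hypothetical.

```python
def to_grayscale_matrix(features, bits=16):
    """Encode each integer feature as one row of 0/255 pixels, MSB first.

    This is an assumed encoding for illustration, not the paper's definition.
    """
    matrix = []
    for value in features:
        if not 0 <= value < (1 << bits):
            raise ValueError(f"feature {value} does not fit in {bits} bits")
        row = [255 if (value >> shift) & 1 else 0
               for shift in range(bits - 1, -1, -1)]
        matrix.append(row)
    return matrix


# Seven hypothetical flow statistics sampled over one time window.
flow_tuple = [443, 51234, 6, 1500, 12, 3, 870]
gmf = to_grayscale_matrix(flow_tuple, bits=16)  # 7 rows x 16 columns
```

A matrix of this kind can be fed to a CNN like any single-channel image.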

12.
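Zhang Zhisheng, Zhang Leihong. Packaging Engineering, 2020, 41(19): 259-266
Objective: Existing can defect detection systems suffer from high false detection and missed detection rates and relatively low detection accuracy on high-speed production lines. To improve the accuracy of can defect recognition and make can production lines more automated and intelligent, an online detection algorithm suited to can manufacturing is proposed, based on deep learning and transfer learning. Methods: A deep convolutional network extracts can defect features, and the convolution kernels are optimized to shorten detection time. Because existing datasets in China and abroad lack defect images from food packaging manufacturing, a can defect dataset is constructed and combined with a pre-trained network, and VGG16 is adjusted to raise the recognition accuracy for can defects. Results: Defect detection performance on the can dataset was compared among a convolutional neural network, transfer learning, and the adjusted pre-trained network. The deep-learning-based can defect detection technique achieves good recognition with a learning rate of 0.0005 after 10 training iterations. The final two-class defect recognition rate is 99.7%, and the algorithm takes 119 ms. Conclusion: Compared with existing can inspection algorithms, the proposed deep-learning-based algorithm offers better recognition performance and a higher degree of intelligence. The study also helps can manufacturers use AI technologies such as deep learning to promote intelligent production and reduce labor costs, is in line with the national strategy of upgrading the manufacturing industry, and has practical significance.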

13.
Road potholes can cause serious social issues, such as unexpected damage to vehicles and traffic accidents. Efficient road management requires technologies that find potholes quickly, and research on such technologies has been active. The three-dimensional (3D) reconstruction method has relatively high accuracy and can be used in practice, but its application is limited by long data processing times and high sensor maintenance costs. The two-dimensional (2D) vision method has the advantage that its sensors are inexpensive and easy to apply. Recently, the 2D vision method using a convolutional neural network (CNN) has shown improved pothole detection performance and adaptability, but a large amount of data is required to train the CNN sufficiently. Therefore, we propose a method to improve the learning performance of a CNN-based object detection model by artificially generating synthetic data similar to potholes and augmenting the training data. Additionally, to make defective areas stand out more clearly, a transformed disparity map (TDM) was calculated using stereo-vision cameras, and the detection performance of the model was further improved through late fusion with RGB (Red, Green, Blue) images. Consequently, by combining multimodal You Only Look Once (YOLO) frameworks trained on RGB images and TDMs respectively, the detection performance was enhanced by 10.7% compared with using RGB alone. Further, the superiority of the proposed method was confirmed by showing that its data processing was two times faster than the existing 3D reconstruction method.

14.
The extent of the peril associated with cancer can be seen in the lack of treatment, ineffective early diagnosis techniques, and, most importantly, its fatality rate. Globally, cancer is the second leading cause of death, and among over a hundred types of cancer, lung cancer is the second most common type as well as the leading cause of cancer-related deaths. An accurate and timely lung cancer diagnosis can raise the likelihood of survival by a noticeable margin, and medical imaging is a prevalent means of cancer diagnosis since it is easily accessible to people around the globe. Nonetheless, it is not highly effective, because human inspection of medical images can yield a high false positive rate. Ineffective and inefficient diagnosis is a crucial reason for the high mortality rate of this disease. However, conspicuous advances in deep learning and artificial intelligence have stimulated the development of highly precise diagnosis systems. The development and performance of these systems rely heavily on the data used to train them. A standard problem in publicly available medical image datasets is severe imbalance between classes. This imbalance can make a deep learning model biased towards the dominant class and unable to generalize. This study presents an end-to-end convolutional neural network that can accurately differentiate lung nodules from non-nodules and reduce the false positive rate to a bare minimum. To tackle the problem of data imbalance, we oversampled the data by transforming available images in the minority class. The average false positive rate of the proposed method is a mere 1.5 percent. However, the average false negative rate is 31.76 percent. The proposed neural network has 68.66 percent sensitivity and 98.42 percent specificity.
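Oversampling a minority class by transforming its images can be sketched as follows. The abstract does not say which transformations were used, so the flips and the 90-degree rotation here are assumptions, as are the function names. Images are represented as plain nested lists to keep the example self-contained.

```python
def hflip(img):
    """Mirror each row (left-right flip)."""
    return [row[::-1] for row in img]


def vflip(img):
    """Reverse the row order (up-down flip)."""
    return [row[:] for row in img[::-1]]


def rot90(img):
    """Rotate 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]


def oversample(minority, target_size):
    """Append transformed copies of minority images until target_size is reached."""
    if not minority:
        raise ValueError("minority class is empty")
    transforms = [hflip, vflip, rot90]
    out = list(minority)
    i = 0
    while len(out) < target_size:
        src = minority[i % len(minority)]
        transform = transforms[(i // len(minority)) % len(transforms)]
        out.append(transform(src))
        i += 1
    return out


# One hypothetical 2x2 nodule patch expanded to four samples.
patch = [[1, 2],
         [3, 4]]
augmented = oversample([patch], target_size=4)
```

Once every transform has been applied to every image, this sketch starts producing duplicates, so a real pipeline would need a richer set of transformations for large imbalance ratios.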

15.
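To address noise suppression in seismic exploration, a convolutional neural network model suited to classifying and recognizing seismic wavelets is constructed. First, the activation function, convolution kernel size, and normalization layer of the model are designed. The network is then used to extract features from time-frequency spectrograms of seismic signals. Finally, different types of noisy seismic signals are classified and recognized. Experimental results show that the model achieves high classification and recognition rates and good robustness to interference, ...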

16.
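Shu Zhong, Zheng Bo'er. Packaging Engineering, 2024, 45(7): 222-233
Objective: To resolve problems in super-resolution image reconstruction models, including poor correlation between functional units, incomplete extraction of image chrominance features, and weak control of super-resolution reconstruction distortion and of residuals in the sampling process. Methods: Dual activation functions are introduced into the convolutional neural network model to improve the compatibility of connections between its functional units. A densely connected convolutional neural network is used to build a super-resolution distortion control unit that performs convolution compensation on each of the four chrominance components. A residual interpolation function is applied in the upsampling unit, and super-resolution chrominance feature interpolation is carried out following the rules of the deep back-projection network. Results: The designed model cascades multiple internal convolution kernels, achieves super-resolution chrominance distortion compensation, and uses unified processing weights, ensuring that the units within the model are well integrated. Conclusion: The experimental results verify that the proposed image reconstruction model is reliable, stable, and efficient.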

17.
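Wang Sheng, Lü Lintao, Yang Hongcai. Packaging Engineering, 2019, 40(11): 203-211
Objective: To remedy the high false rejection rate of traditional machine inspection of printed product defects. Methods: A printed product defect detection system with a convolutional neural network as its control core is proposed. A convolutional neural network applicable to practical inspection is designed, along with the hardware structure of an online print quality inspection system. Results: Defect detection performance was compared among convolutional neural networks with the same structure but different numbers of training iterations and learning rates. The network achieves good recognition when the learning rate is below 0.01 and does not converge easily when the learning rate is above 0.05. More training iterations give higher accuracy but also longer training time. Subject to speed and accuracy requirements, the defect inspection network for a particular printed product was set to 50 training iterations and a learning rate of 0.005, giving a recognition rate of 90%. Conclusion: Experiments show that the inspection system has good defect recognition ability and high accuracy in classifying defect types, and has practical value.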

18.
Theiler J. Applied Optics, 2008, 47(28): F12-F26
Simulations applied to hyperspectral imagery from the AVIRIS sensor are employed to quantitatively evaluate the performance of anomalous change detection algorithms. The evaluation methodology reflects the aim of these algorithms, which is to distinguish actual anomalous changes in a pair of images from the incidental differences that pervade the entire scene. By simulating both the anomalous changes and the pervasive differences, accurate and plentiful ground truth is made available, and statistical estimates of detection and false alarm rates can be made. Comparing the receiver operating characteristic (ROC) curves that encapsulate these rates provides a way to identify which algorithms work best under which conditions.
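When ground truth is available, as in the simulations described above, an ROC curve can be traced by sweeping a threshold over the detector scores. The sketch below is a generic illustration with made-up scores and labels. It does not reproduce any of the anomalous change detection algorithms evaluated in the paper, and the function names are assumptions.

```python
def roc_points(scores, labels):
    """Return (false alarm rate, detection rate) pairs for a descending threshold.

    labels: 1 for a true anomalous change, 0 for a pervasive difference.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative sample")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Process ties together so each threshold yields one point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical anomaly scores for four pixels.
curve = roc_points([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
area = auc(curve)
```

Plotting such curves for several detectors on the same simulated data gives the kind of comparison the abstract describes.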

19.
Lung cancer is the main cause of cancer-related death owing to its destructive nature and its late detection at advanced stages. Early recognition of lung cancer is essential to increase the survival rate, and it remains a crucial problem in the healthcare sector. Computer aided diagnosis (CAD) models can be designed to effectively identify and classify lung cancer using medical images. Recently developed deep learning (DL) models enable accurate lung nodule classification. This article therefore presents a deer hunting optimization with deep convolutional neural network for lung cancer detection and classification (DHODCNN-LCC) model. The proposed DHODCNN-LCC technique first pre-processes the images in two stages, namely contrast enhancement and noise removal. Feature extraction on the pre-processed images is then performed using the RefineDet model with the Nadam optimizer. In addition, a denoising stacked autoencoder (DSAE) model is employed for lung nodule classification. Finally, the deer hunting optimization algorithm (DHOA) is used for optimal hyperparameter tuning of the DSAE model, which results in improved classification performance. The DHODCNN-LCC technique was validated experimentally on a benchmark dataset, and the outcomes were assessed under various aspects. The experiments showed that the DHODCNN-LCC technique outperforms recent approaches on several measures.

20.
Object recognition and localization has always been one of the research hotspots in machine vision. It is of great value and significance to the development and application of service robots, industrial automation, unmanned driving, and other fields. To realize real-time recognition and localization of objects in indoor scenes, this article proposes an improved YOLOv3 neural network model, which combines densely connected networks and residual networks to construct a new YOLOv3 backbone network for the detection and recognition of objects in indoor scenes. A RealSense D415 RGB-D camera is used to obtain the RGB map and the depth map, and the actual distance is calculated after each pixel in the scene image is mapped to the real scene. Experimental results show that the detection and recognition accuracy and the real-time performance of the new network are clearly improved compared with the previous YOLOv3 neural network model in the same scene. The improved network detects more objects, including ones that the original YOLOv3 network could not detect. The running time of object detection and recognition is reduced to less than half of the original. The improved network has reference value for practical engineering applications.
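Mapping a pixel with a known depth back to the real scene is commonly done with the pinhole camera model. The sketch below uses that standard model under stated assumptions: the intrinsic parameters (`fx`, `fy`, `cx`, `cy`) are placeholder values rather than the D415's calibration, lens distortion is ignored, and the functions are hand-written rather than taken from the RealSense SDK.

```python
import math


def deproject(u, v, depth_m, fx, fy, cx, cy):
    """Back-project pixel (u, v) with depth in metres to camera coordinates."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


def distance(point):
    """Euclidean distance from the camera origin to a 3D point."""
    return math.sqrt(sum(c * c for c in point))


# Hypothetical intrinsics for a 640x480 image.
FX, FY, CX, CY = 600.0, 600.0, 320.0, 240.0

# Centre pixel of a detected bounding box, 2 m away along the optical axis.
centre = deproject(320, 240, 2.0, FX, FY, CX, CY)
# A pixel far to the right of the image centre at the same depth.
offset = deproject(920, 240, 2.0, FX, FY, CX, CY)
```

Applying this to the centre pixel of each detected bounding box gives a 3D position and a range for the corresponding object.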
