Similar Literature
20 similar records found
1.
Human activity recognition is commonly used in many Internet of Things (IoT) applications to recognize different contexts and respond to them. Deep learning has gained momentum for identifying activities through sensors, smartphones, or even surveillance cameras. However, it is often difficult to train deep learning models on constrained IoT devices. This paper proposes an alternative: a Deep Learning-based Human Activity Recognition framework for edge computing, which we call DL-HAR. The framework exploits the capabilities of cloud computing to train a deep learning model and then deploys it on less-powerful edge devices for recognition: training is conducted in the cloud and the resulting model is distributed to the edge nodes. We demonstrate how DL-HAR can perform human activity recognition at the edge while improving efficiency and accuracy. To evaluate the proposed framework, we conducted a comprehensive set of experiments to validate its applicability. Experimental results on the benchmark dataset show a significant increase in performance compared with state-of-the-art models.
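The abstract does not say how a cloud-trained model is shrunk for a constrained edge device; one common technique is weight quantization. The sketch below is a minimal, hypothetical illustration of symmetric int8 quantization (the function names and the 4x size reduction are generic, not taken from the paper):

```python
import numpy as np

def quantize_int8(weights):
    """Symmetric int8 quantization: returns quantized weights and a scale."""
    scale = np.max(np.abs(weights)) / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an approximate float32 copy on the edge device."""
    return q.astype(np.float32) * scale

# A cloud-trained weight matrix, shrunk before distribution to an edge node.
w = np.random.randn(64, 32).astype(np.float32)
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
print(q.nbytes, w.nbytes)   # the int8 copy is 4x smaller
```

The per-element rounding error is bounded by half the scale, which is why such models usually lose little accuracy at the edge.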

2.
Human gait recognition (HGR) has received a lot of attention in the last decade as an alternative biometric technique. The main challenges in gait recognition are changes in view angle and covariant factors, chiefly walking while carrying a bag and walking while wearing a coat. Many deep-learning-based techniques for HGR have been presented in the literature, but an efficient framework is still required for accurate and fast recognition. In this work, we propose a fully automated framework for HGR from video sequences based on deep learning and improved ant colony optimization (IACO). The framework consists of four primary steps. First, the database is normalized into video frames. Second, two pre-trained models, ResNet101 and InceptionV3, are selected and modified according to the nature of the dataset. Both modified models are then trained using transfer learning, and features are extracted. The IACO algorithm selects the best of the extracted features, which are passed to a cubic SVM (using a multiclass method) for final classification. Experiments were carried out on three angles (0, 18, and 180) of the CASIA B dataset, yielding accuracies of 95.2%, 93.9%, and 98.2%, respectively. A comparison with existing techniques shows that the proposed method outperforms them in terms of accuracy and computational time.
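The abstract does not detail the IACO algorithm, but the general shape of ant-colony feature selection can be sketched: ants sample feature subsets with probability proportional to pheromone, the best subset reinforces the pheromone, and the rest evaporates. Everything below (the toy correlation fitness, the parameter values) is a hypothetical simplification, not the paper's method:

```python
import numpy as np

rng = np.random.default_rng(0)

def aco_feature_select(X, y, k=5, n_ants=20, n_iter=30, rho=0.1):
    """Toy ant-colony feature selection: pheromone-guided sampling of
    k-feature subsets, scored by mean absolute feature-label correlation."""
    n_feat = X.shape[1]
    tau = np.ones(n_feat)                      # pheromone per feature
    best_subset, best_score = None, -np.inf
    for _ in range(n_iter):
        for _ in range(n_ants):
            p = tau / tau.sum()
            subset = rng.choice(n_feat, size=k, replace=False, p=p)
            score = np.mean([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in subset])
            if score > best_score:
                best_subset, best_score = subset, score
        tau *= (1 - rho)                       # evaporation
        tau[best_subset] += best_score         # reinforce the best subset found
    return np.sort(best_subset), best_score

# Synthetic data: features 0 and 1 carry the signal, the rest are noise.
y = rng.integers(0, 2, size=200).astype(float)
X = rng.normal(size=(200, 10))
X[:, 0] += 2 * y
X[:, 1] -= 2 * y
subset, score = aco_feature_select(X, y, k=2)
print(subset)
```

On this synthetic data the informative features dominate the correlation fitness, so the pheromone quickly concentrates on them.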

3.
Lip-reading technology is progressing rapidly following the breakthrough of deep learning, and it plays a vital role in many applications, such as human-machine communication and security. In this paper, we develop an effective lip-reading recognition model for Arabic visual speech recognition by implementing deep learning algorithms. The collected Arabic visual datasets contain 2400 records of Arabic digits and 960 records of Arabic phrases from 24 native speakers. The primary purpose is to provide a high-performance model by enhancing the preprocessing phase. First, we extract keyframes from our dataset. Second, we produce Concatenated Frame Images (CFIs) that represent the utterance sequence in one single image. Finally, VGG-19 is employed for visual feature extraction in our proposed model. We examined different numbers of keyframes (10, 15, and 20) while comparing two variants of the proposed model: (1) the VGG-19 base model and (2) the VGG-19 base model with batch normalization. The results show that the second variant achieves greater accuracy: 94% for digit recognition, 97% for phrase recognition, and 93% for combined digit and phrase recognition on the test dataset. Therefore, our proposed model is superior among models based on CFI input.
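The keyframe-plus-CFI preprocessing can be sketched directly: pick a fixed number of frames and tile them into one image that a 2D CNN such as VGG-19 can consume. The even-spacing keyframe rule below is an assumption for illustration; the paper does not specify its selection criterion:

```python
import numpy as np

def select_keyframes(frames, k):
    """Pick k frames evenly spaced across the clip (one simple keyframing rule)."""
    idx = np.linspace(0, len(frames) - 1, k).astype(int)
    return [frames[i] for i in idx]

def concatenated_frame_image(frames):
    """Tile the keyframes side by side into one image (a CFI)."""
    return np.concatenate(frames, axis=1)

# A fake 40-frame clip of 32x32 grayscale images.
clip = [np.full((32, 32), i, dtype=np.uint8) for i in range(40)]
keys = select_keyframes(clip, 10)
cfi = concatenated_frame_image(keys)
print(cfi.shape)   # (32, 320)
```

The CFI turns a variable-length utterance into a single fixed-size image, which is what makes a plain image classifier applicable.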

4.
Breast cancer is currently a major cause of death in women worldwide, as the World Health Organization (WHO) has confirmed. The severity of the disease can be minimized to a large extent if it is diagnosed properly at an early stage, so that treatment can proceed in the best possible way. Moreover, deep neural networks have recently delivered remarkable performance in detecting cancer in histopathological images of breast tissue. To address these issues, this paper presents a hybrid model that uses transfer learning to study histopathological images, helping to detect the disease at low cost. Extensive experiments were carried out to validate the suggested hybrid model. The results show that the proposed model outperformed the baseline methods, with F-scores of 0.81 for the DenseNet + Logistic Regression hybrid, 0.73 for the Visual Geometry Group (VGG) + Logistic Regression hybrid, 0.74 for VGG + Random Forest, 0.79 for DenseNet + Random Forest, and 0.79 for the VGG + DenseNet + Logistic Regression hybrid on the histopathological image dataset.
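The hybrid pattern here is a frozen deep backbone producing features, with a classical classifier fitted on top. A minimal sketch of that second stage, a logistic-regression head trained by gradient descent on stand-in "CNN features" (the synthetic data and hyperparameters are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)

def train_logreg(F, y, lr=0.5, epochs=300):
    """Logistic-regression head fitted on (frozen) deep features F."""
    w = np.zeros(F.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(F @ w + b)))   # sigmoid
        w -= lr * (F.T @ (p - y)) / len(y)
        b -= lr * np.mean(p - y)
    return w, b

# Stand-in for DenseNet features of benign/malignant patches.
F = rng.normal(size=(400, 16))
y = (F[:, 0] + 0.5 * F[:, 1] > 0).astype(float)
w, b = train_logreg(F, y)
acc = np.mean(((F @ w + b) > 0) == (y == 1))
print(acc)
```

Swapping the head for a random forest, as in the other hybrids, changes only this second stage; the backbone features stay fixed.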

5.
王金甲, 周雅倩, 郝智. 《计量学报》 (Acta Metrologica Sinica), 2019, 40(6): 958-969
Deep recurrent neural networks are well suited to processing time-series data, but their feature-extraction capability is weak and they mine temporal dependencies insufficiently. To address this, three models combining attention mechanisms with long short-term memory (LSTM) networks are proposed for human activity recognition, and the effect of the three attention mechanisms on model accuracy is studied, individually and in combination, on different datasets. On the UCI_HAR dataset, the three attention-LSTM models achieve accuracies of 94.13%, 95.15%, and 94.81%, higher than the 93.2% of the plain LSTM model. In addition, given the labeling characteristics of sensor time-series data for human activity recognition, the time-window classification task is reformulated as a segmentation task, and two attention-based gated recurrent unit (GRU) models for segmentation are designed; the Bahdanau-attention GRU model achieves accuracies of 84.61% and 89.54% on the Skoda and Opportunity (Oppor) datasets, higher than the baseline UNet model's 70.40% and 88.51%.
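The core of an attention layer over LSTM hidden states is a softmax-weighted summary of the sequence. A minimal sketch (the scoring vector here is a single learned parameter vector; the paper's three attention variants differ in how the scores are computed):

```python
import numpy as np

rng = np.random.default_rng(2)

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attention_pool(H, v):
    """Score each time step's hidden state with vector v, then return
    the attention-weighted summary of the whole sequence."""
    scores = H @ v                 # one score per time step
    alpha = softmax(scores)        # attention weights, sum to 1
    return alpha @ H, alpha

H = rng.normal(size=(50, 32))      # 50 time steps of LSTM hidden states
v = rng.normal(size=32)
context, alpha = attention_pool(H, v)
print(context.shape, alpha.sum())
```

The weights alpha tell the classifier which time steps mattered, which is why attention improves over taking only the last hidden state.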

6.
Coronavirus disease (COVID-19) is an extremely infectious disease that can cause acute respiratory distress or, in severe cases, death. There has already been some research on detecting coronavirus using machine learning algorithms, but few studies have presented a truly comprehensive view. In this research, we show how a convolutional neural network (CNN) can be used to detect COVID-19 from chest X-ray images. We leverage CNN-based pre-trained models as feature extractors to substantiate transfer learning and add our own classifier for detecting COVID-19. We evaluate the performance of five different pre-trained models, fine-tuning the weights of some of the top layers. We also develop an ensemble model in which the predictions from all chosen pre-trained models are combined to generate a single output. The models are evaluated through 5-fold cross-validation using two publicly available data repositories containing healthy and infected (both COVID-19 and other pneumonia) chest X-ray images. We also use two different visualization techniques to observe how efficiently the models extract features relevant to the detection of COVID-19 patients. The models show a high degree of accuracy, precision, and sensitivity. We believe they will aid medical professionals with improved and faster patient screening and pave the way for further COVID-19 research.
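The abstract says predictions from the chosen models are "combined to generate a single output" but not how; probability averaging is the simplest such rule. A hedged sketch, with fake softmax outputs standing in for the five backbones:

```python
import numpy as np

rng = np.random.default_rng(3)

def ensemble_predict(prob_list):
    """Average the class-probability outputs of several models
    and take the argmax as the ensemble decision."""
    avg = np.mean(prob_list, axis=0)
    return np.argmax(avg, axis=1), avg

# Fake softmax outputs from three fine-tuned backbones on 4 X-ray images.
def fake_probs():
    p = rng.random((4, 3))
    return p / p.sum(axis=1, keepdims=True)

models = [fake_probs() for _ in range(3)]
labels, avg = ensemble_predict(models)
print(labels.shape)
```

Averaging row-normalized probabilities keeps each row a valid distribution, so the argmax is a well-defined class decision.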

7.
Background—Human Gait Recognition (HGR) is a biometric-based approach widely used for surveillance, and it has been studied by researchers for the past several decades. Several factors affect system performance, such as walking variations due to clothes, a person carrying luggage, and variations in view angle. Proposed—In this work, a new method is introduced to overcome these problems: a hybrid method for efficient HGR using deep learning and selection of the best features. Four major steps are involved: preprocessing of the video frames, modification of the pre-trained CNN model VGG-16 for feature computation, removal of redundant features extracted from the CNN model, and classification. For the reduction of irrelevant features, a Principal Score and Kurtosis based approach, named PSbK, is proposed. The PSbK-selected features are then fused into one matrix. Finally, this fused vector is fed to a One-against-All Multiclass Support Vector Machine (OAMSVM) classifier for the final results. Results—The system is evaluated on the CASIA B database over six angles (0°, 18°, 36°, 54°, 72°, and 90°), attaining accuracies of 95.80%, 96.0%, 95.90%, 96.20%, 95.60%, and 95.50%, respectively. Conclusion—Comparison with recent methods shows that the proposed method works better.
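The abstract names kurtosis as one of PSbK's two criteria. As a hypothetical stand-in for the full method, the sketch below ranks features by excess kurtosis alone and keeps the top k (the principal-score half of PSbK is omitted, and the synthetic data is illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)

def kurtosis(x):
    """Excess kurtosis of one feature column (0 for a Gaussian)."""
    z = (x - x.mean()) / x.std()
    return np.mean(z ** 4) - 3.0

def select_by_kurtosis(F, k):
    """Keep the k features with the highest excess kurtosis,
    a simplified stand-in for the paper's PSbK selection."""
    scores = np.array([kurtosis(F[:, j]) for j in range(F.shape[1])])
    keep = np.argsort(scores)[::-1][:k]
    return np.sort(keep), scores

# Feature 0 is heavy-tailed (Laplace), the rest are Gaussian noise.
F = rng.normal(size=(1000, 6))
F[:, 0] = rng.laplace(size=1000)
keep, scores = select_by_kurtosis(F, 3)
print(0 in keep)
```

Heavy-tailed (peaky) feature distributions score high, so the filter favors features whose responses are concentrated rather than diffuse.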

8.
Violence recognition is crucial because of its applications in security and law enforcement. Existing semi-automated systems require tedious manual surveillance, which causes human error and makes them less effective. Several approaches have been proposed using trajectory-based, non-object-centric, and deep-learning-based methods. Previous studies have shown that deep learning techniques attain higher accuracy and lower error rates than other methods, but their performance must still be improved. This study explores state-of-the-art deep learning architectures, convolutional neural networks (CNNs) and Inception V4, to detect and recognize violence using video data. In the proposed framework, a keyframe extraction technique eliminates duplicate consecutive frames; this keyframing phase reduces the training data size and hence decreases the computational cost. For the feature selection and classification tasks, the sequential CNN uses one kernel size, whereas the Inception V4 CNN uses multiple kernels in different layers of the architecture. For empirical analysis, four widely used standard datasets with diverse activities are used. The results confirm that the proposed approach attains 98% accuracy, reduces the computational cost, and outperforms existing violence detection and recognition techniques.
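Eliminating duplicate consecutive frames can be done with a frame-difference threshold: keep a frame only when it differs enough from the last kept one. The threshold value and the mean-absolute-difference measure below are illustrative assumptions:

```python
import numpy as np

def drop_duplicate_frames(frames, thresh=1.0):
    """Keep a frame only if its mean absolute difference from the last
    kept frame exceeds thresh; removing near-duplicates shrinks the
    training set and the compute cost."""
    kept = [frames[0]]
    for f in frames[1:]:
        if np.mean(np.abs(f.astype(float) - kept[-1].astype(float))) > thresh:
            kept.append(f)
    return kept

# Ten frames: five identical, then five that keep changing.
static = [np.zeros((8, 8), dtype=np.uint8)] * 5
moving = [np.full((8, 8), 30 * i, dtype=np.uint8) for i in range(1, 6)]
kept = drop_duplicate_frames(static + moving)
print(len(kept))   # 6: one static frame survives, all five moving ones do
```

The four redundant static frames are discarded before training, which is exactly the cost saving the keyframing phase targets.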

9.
张典范, 杨镇豪, 程淑红. 《计量学报》 (Acta Metrologica Sinica), 2022, 43(11): 1412-1417
To address misidentification in manual wheel-hub sorting, a neural network model based on ResNet50 and transfer learning is used to recognize automobile wheel hubs. Pre-trained parameters are transferred into a ResNet50 convolutional neural network, the output layer of the original network is replaced, and a ResNet50-based transfer learning model is built; the model parameters are fine-tuned by further training on a wheel-hub dataset to extract fine-grained hub features. Comparing the training efficiency and accuracy of AlexNet, VGG11, VGG16, and ResNet50 with no fine-tuning, with fine-tuning, and with different numbers of frozen convolutional layers shows that the ResNet50 transfer learning model, with the parameters of the first seven Bottleneck residual blocks frozen, not only shortens training time but also achieves higher accuracy within the same number of epochs. The TL-ResNet50 model trained under this freezing strategy was used to predict eight types of wheel hubs, achieving an average accuracy above 99% for each type.

10.
Diabetic retinopathy (DR) is a retinal disease that causes irreversible blindness. DR occurs due to high blood sugar levels and is difficult to detect at an early stage because no symptoms appear at the initial level. To prevent blindness, early detection and regular treatment are needed. Automated detection based on machine intelligence may assist ophthalmologists in examining patients' condition more accurately and efficiently. The purpose of this study is to produce an automated screening system for recognition and grading of diabetic retinopathy using machine learning through deep transfer and representational learning. The artificial intelligence technique used is transfer learning on the deep neural network Inception-v4. Two configuration variants of transfer learning are applied to Inception-v4: fine-tune mode and fixed-feature-extractor mode. Both configurations achieve decent accuracy, but fine-tuning outperforms the fixed-feature-extractor configuration, gaining 96.6% accuracy in early detection of DR and 97.7% accuracy in grading the disease, and outperforming the state-of-the-art methods in the relevant literature.

11.
Intelligent recognition of underwater acoustic targets is an important part of the intelligentization of underwater acoustic equipment, and deep learning is one of the key techniques for achieving it. Such recognition often faces insufficient training samples due to small datasets. To address the poor generalization caused by overfitting on small datasets, as well as the inconsistent formats of the two-dimensional spectrograms of input underwater acoustic signals, this paper proposes an underwater acoustic target recognition method based on the VGGish neural network. The method uses the VGGish network as a feature extractor, adds a signal preprocessing module in front of it, and designs a joint classifier based on traditional machine learning algorithms, thereby solving the overfitting and inconsistent-spectrogram problems. Experimental results show that the method achieves a recognition accuracy of 94.397% on the ShipsEar dataset, higher than the best 90.977% obtained by the traditional pretrain-finetune approach, while model training under the same conditions takes only about 0.5% of the time of the traditional pretrain-finetune method, effectively improving both recognition accuracy and training speed.

12.
Epilepsy is a central nervous system disorder in which brain activity becomes abnormal. Electroencephalogram (EEG) signals, as recordings of brain activity, have been widely used for epilepsy recognition. To study epileptic EEG signals and develop artificial intelligence (AI)-assisted recognition, a multi-view transfer learning algorithm based on least squares regression (MVTL-LSR) is proposed in this study. Compared with most existing multi-view transfer learning algorithms, MVTL-LSR has two merits: (1) since traditional transfer learning algorithms leverage knowledge from different sources, they pose a significant risk to data privacy; we therefore develop a knowledge transfer mechanism that protects the security of source-domain data while guaranteeing performance. (2) When utilizing multi-view data, we embed view weighting and manifold regularization into the transfer framework to measure the strengths and weaknesses of the views and improve generalization ability. In the experimental studies, 12 different simulated multi-view and transfer scenarios are constructed from epileptic EEG signals licensed and provided by the University of Bonn, Germany. Extensive experimental results show that MVTL-LSR outperforms the baselines. The source code will be available on .
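The least-squares-regression backbone of such methods has a closed form. The sketch below shows plain ridge-regularized LSR used as a classifier on toy "EEG features" (the multi-view weighting, manifold regularization, and privacy mechanism of MVTL-LSR are not represented; the data is synthetic):

```python
import numpy as np

rng = np.random.default_rng(5)

def lsr_fit(X, Y, lam=1e-2):
    """Ridge-regularized least squares regression:
    W = (X^T X + lam I)^-1 X^T Y, solved without an explicit inverse."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

# Two-class toy features; Y is one-hot so argmax of X @ W gives the class.
X = rng.normal(size=(300, 8))
y = (X[:, 0] - X[:, 2] > 0).astype(int)
Y = np.eye(2)[y]
W = lsr_fit(X, Y)
pred = np.argmax(X @ W, axis=1)
print(np.mean(pred == y))
```

Regressing onto one-hot targets and taking the argmax is the standard way LSR doubles as a classifier, and it is what the transfer and view-weighting terms of the full method are added around.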

13.
Human action recognition in complex environments is a challenging task. Recently, sparse representation has achieved excellent results in human action recognition under different conditions. The main idea of sparse representation classification is to construct a general classification scheme in which the training samples of each class serve as a dictionary to express the query sample, and the minimal reconstruction error indicates its class. However, learning a discriminative dictionary is still difficult. In this work, we make two contributions. First, we build a new and robust human action recognition framework by combining a modified sparse classification model with deep convolutional neural network (CNN) features. Second, we construct a novel classification model consisting of a representation-constrained term and a coefficient-incoherence term. Experimental results on benchmark datasets show that our modified model obtains competitive results in comparison with other state-of-the-art models.
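The minimal-reconstruction-error idea can be sketched without a full sparse solver: reconstruct the query from each class's dictionary and pick the class with the smallest residual. Least squares is used here in place of sparse coding, so this is a hedged simplification of the scheme, on synthetic subspace data:

```python
import numpy as np

rng = np.random.default_rng(6)

def residual_classify(query, dictionaries):
    """Classify by reconstruction residual: project the query onto each
    class's dictionary (least squares standing in for sparse coding)
    and pick the class with the smallest reconstruction error."""
    residuals = []
    for D in dictionaries:
        coef, *_ = np.linalg.lstsq(D, query, rcond=None)
        residuals.append(np.linalg.norm(query - D @ coef))
    return int(np.argmin(residuals)), residuals

# Two classes living in different random 5-dim subspaces of R^20.
D0 = rng.normal(size=(20, 5))
D1 = rng.normal(size=(20, 5))
query = D1 @ rng.normal(size=5) + 0.01 * rng.normal(size=20)  # near class 1
label, res = residual_classify(query, [D0, D1])
print(label)
```

A query generated from class 1's subspace reconstructs almost perfectly from D1 but poorly from D0, which is the whole decision rule.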

14.
(Aim) To build a more accurate and precise COVID-19 diagnosis system, this study proposed a novel deep rank-based average pooling network (DRAPNet) for COVID-19 recognition. (Methods) 521 subjects yielded 1164 slice images via a slice-level selection method, covering four categories: COVID-19 positive, community-acquired pneumonia, secondary pulmonary tuberculosis, and healthy control. Our method first introduced an improved multiple-way data augmentation. Second, an n-conv rank-based average pooling module (NRAPM) was proposed, in which rank-based pooling, particularly rank-based average pooling (RAP), was employed to avoid overfitting. Third, the novel DRAPNet was built from NRAPM modules, inspired by the VGG network. Grad-CAM was used to generate heatmaps and give our AI model an explainable analysis. (Results) DRAPNet achieved a micro-averaged F1 score of 95.49% over 10 runs on the test set. The sensitivities of the four classes were 95.44%, 96.07%, 94.41%, and 96.07%; the precisions were 96.45%, 95.22%, 95.05%, and 95.28%; and the F1 scores were 95.94%, 95.64%, 94.73%, and 95.67%, respectively. The confusion matrix is also given. (Conclusions) DRAPNet is effective in diagnosing COVID-19 and other chest infectious diseases. RAP gives better results than four other methods: strided convolution, l2-norm pooling, average pooling, and max pooling.
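Rank-based average pooling sits between max pooling and average pooling: sort the window and average only the top-ranked activations. A minimal sketch of one pooling window (the choice k=2 is illustrative; the paper's window sizes are not given in the abstract):

```python
import numpy as np

def rank_avg_pool(window, k=2):
    """Rank-based average pooling: average the k largest values in the
    pooling window instead of taking only the max or the full mean."""
    flat = np.sort(window.ravel())[::-1]
    return flat[:k].mean()

win = np.array([[1.0, 4.0],
                [2.0, 8.0]])
print(rank_avg_pool(win, k=2))   # (8 + 4) / 2 = 6.0
```

With k=1 this degenerates to max pooling (8.0) and with k equal to the window size it becomes average pooling (3.75), which is why RAP can trade off the two and resist overfitting to single extreme activations.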

15.
Face recognition based on the Karhunen-Loève transform
Automatic face recognition is studied by drawing on the neural information processing mechanisms of biological nervous systems. Using a principal component analysis technique, the Karhunen-Loève transform, an experimental system is constructed that maps faces from image space into a face space and performs recognition there. The structure and characteristics of the system are analyzed, and experimental test results are given.
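The Karhunen-Loève transform of a face set is the classic eigenfaces construction: mean-center the images and take the top principal directions. A minimal sketch via SVD (the fake 16x16 "faces" and the choice of 8 components are illustrative):

```python
import numpy as np

rng = np.random.default_rng(7)

def kl_basis(faces, n_components):
    """Karhunen-Loeve (PCA) basis: top principal directions of the
    mean-centered face vectors, computed via SVD."""
    mean = faces.mean(axis=0)
    U, S, Vt = np.linalg.svd(faces - mean, full_matrices=False)
    return mean, Vt[:n_components]          # rows are the "eigenfaces"

def project(face, mean, basis):
    """Coordinates of one face in the low-dimensional face space."""
    return basis @ (face - mean)

# 30 fake 16x16 face images flattened to 256-dim vectors.
faces = rng.normal(size=(30, 256))
mean, basis = kl_basis(faces, 8)
coords = project(faces[0], mean, basis)
print(coords.shape)
```

Recognition then reduces to nearest-neighbor matching of these low-dimensional coordinates instead of raw pixels.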

16.
Diabetic retinopathy (DR) diagnosis through digital fundus images requires clinical experts to recognize the presence and importance of many intricate features, a task that is difficult and time-consuming for ophthalmologists. Therefore, many computer-aided diagnosis (CAD) systems have been developed to automate this screening process. In this paper, a CAD-DR system is proposed based on preprocessing and a pre-trained transfer-learning-based convolutional neural network (PCNN) to recognize the five stages of DR from retinal fundus images. To develop this system, a preprocessing step is performed in a perceptual-oriented color space to enhance DR-related lesions, and then a standard pre-trained PCNN model is improved to obtain high classification results. The architecture of the PCNN model comprises four main phases. First, the training process of the proposed PCNN uses the expected gradient length (EGL) to decrease image-labeling effort during CNN training. Second, the most informative patches and images are automatically selected using a few labeled training samples. Third, the PCNN generates useful masks for prognostication and identifies regions of interest. Fourth, the DR-related lesions involved in the classification task, such as microaneurysms, hemorrhages, and exudates, are detected and used for recognition of DR. The PCNN model is pre-trained using a high-end graphics processing unit (GPU) on the publicly available Kaggle benchmark. The obtained results demonstrate that the CAD-DR system outperforms other state-of-the-art systems in terms of sensitivity (SE), specificity (SP), and accuracy (ACC). On a test set of 30,000 images, the CAD-DR system achieved an average SE of 93.20%, SP of 96.10%, and ACC of 98%, indicating that the proposed CAD-DR system is appropriate for screening the severity level of DR.

17.
淡卫波, 朱勇建, 黄毅. 《包装工程》 (Packaging Engineering), 2023, 44(1): 133-140
Objective: To extract cigarette-pack image data for training a deep-learning object detection model, improving the efficiency and accuracy of pack picking on cigarette packaging lines. Methods: A deep-learning-based cigarette-pack recognition and classification model is built by improving the original YOLOv3 model: a designed multi-spatial pyramid pooling structure (M-SPP) is added to the network, the 64×64-scale feature map is downsampled and concatenated with the 32×32-scale feature map, and the 16×16-scale prediction layer is removed, improving detection accuracy and speed; the anchor-box parameters are optimized with the K-means++ algorithm. Results: Experiments show that the model reaches a mean average precision of 99.68% and a detection speed of 70.82 frames per second. Conclusion: The deep-learning-based image recognition and classification model is accurate and fast, effectively meeting the needs of automated real-time detection on cigarette packaging lines.

18.
A wide range of camera apps and online video-conferencing services support changing the background in real time for aesthetic, privacy, and security reasons. Numerous studies show that deep learning (DL) is a suitable option for human segmentation and that an ensemble of multiple DL-based segmentation models can improve the result. However, these approaches are not as effective when applied directly to image segmentation in a video. This paper proposes an Adaptive N-Frames Ensemble (AFE) approach for high-movement human segmentation in a video using an ensemble of multiple DL models. In contrast to an ensemble that executes multiple DL models simultaneously for every single video frame, the proposed AFE approach executes only a single DL model on the current frame and combines the segmentation outputs of previous frames into the final output when the frame difference is below a particular threshold. Our method builds on the N-Frames Ensemble (NFE) method, which ensembles the segmentation of the current frame with those of previous frames; however, NFE is unsuitable for segmenting fast-moving objects or for videos with low frame rates, limitations that the proposed AFE approach addresses. Our experiment uses three human segmentation models: Fully Convolutional Network (FCN), DeepLabv3, and Mediapipe. We evaluated our approach using 1711 single-person videos of the TikTok50f dataset, a reconstructed version of the publicly available TikTok dataset made by cropping, resizing, and dividing it into videos of 50 frames each. This paper compares the proposed AFE with single models, the Two-Models Ensemble, and the NFE models. The experimental results show that the proposed AFE is suitable for both low-movement and high-movement human segmentation in a video.
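The adaptive rule can be sketched as a small control loop: run one model per frame, and only blend in the cached mask when the frame difference falls below the threshold. The dummy thresholding "model", the blending weights, and the difference measure below are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def afe_segment(frames, run_model, thresh=5.0):
    """Adaptive frame ensemble (simplified): run a single model on each
    frame, but when the frame barely differs from the previous one,
    average the new mask with the cached mask instead of trusting a
    single-frame prediction."""
    masks, prev_frame, prev_mask = [], None, None
    for f in frames:
        mask = run_model(f)
        if prev_frame is not None and np.mean(np.abs(f - prev_frame)) < thresh:
            mask = (mask + prev_mask) / 2.0    # ensemble with history
        masks.append(mask)
        prev_frame, prev_mask = f, mask
    return masks

# Dummy "segmentation model": thresholds the frame itself.
run = lambda f: (f > 0.5).astype(float)
frames = [np.full((4, 4), v) for v in (0.0, 0.1, 0.9)]
masks = afe_segment(frames, run, thresh=0.5)
print([m.mean() for m in masks])   # [0.0, 0.0, 1.0]
```

High-movement frames (the third one here) bypass the history blend entirely, which is how AFE avoids the motion-smearing failure mode of NFE.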

19.
In this era, deep learning methods offer a broad spectrum of efficient and original algorithms to recognize or predict an output from a sequence of inputs. Current deep learning methods based on long short-term memory (LSTM) algorithms provide strong performance but remain limited when detecting sequences of complex human activity. In this work, we adapted the LSTM algorithm into a synchronous algorithm (sync-LSTM), enabling the model to take multiple parallel input sequences and produce multiple parallel synchronized output sequences. The proposed method is implemented for simultaneous human activity recognition (HAR) using heterogeneous sensor data in a smart home, where HAR lets artificial intelligence provide services to users according to their preferences. The sync-LSTM algorithm improves learning performance and has the potential for real-world application to complex HAR, such as concurrent activity recognition, with higher accuracy and satisfactory computational complexity. The adapted algorithm is also applicable in ambient assisted living, healthcare, robotics, pervasive computing, and astronomy. Extensive experimental evaluation with publicly available datasets demonstrates the competitive recognition capabilities of our approach: for concurrent activity recognition, the proposed method shows an accuracy of more than 97%.

20.
汪荣贵, 姚旭晨, 杨娟, 薛丽霞. 《光电工程》 (Opto-Electronic Engineering), 2019, 46(6): 180416-1-180416-10
Existing fine-grained classification models use not only image category labels but also a large amount of manually annotated extra information. To address this, this paper proposes a deep transfer learning model that effectively transfers image features learned on a large-scale labeled fine-grained dataset to a tiny fine-grained dataset. First, the relatedness of the tasks across domains is quantified through a bridging domain. Then, transfer features suitable for the target domain are selected according to this relatedness. Finally, view-class labels of the fine-grained dataset are used for auxiliary learning, and all attributes are learned jointly to obtain richer feature representations. Experiments show that the method not only achieves high accuracy but also effectively reduces model training time, and they confirm that cross-domain feature transfer can accelerate network learning and optimization.


Copyright © Beijing Qinyun Technology Development Co., Ltd. (北京勤云科技发展有限公司), 京ICP备09084417号