Text classification has been extensively investigated in various languages, especially English, and text classification models are vital in several Natural Language Processing (NLP) applications. Arabic is a highly significant language: for instance, it is the fourth most-used language on the internet and the sixth official language of the United Nations. However, few studies have addressed text classification in Arabic. In general, researchers face two challenges in Arabic text classification: low accuracy and high dimensionality of the features. In this study, an Automated Arabic Text Classification using Hyperparameter Tuned Hybrid Deep Learning (AATC-HTHDL) model is proposed. The major goal of the proposed AATC-HTHDL method is to identify different class labels for Arabic text. The first step in the proposed model is to pre-process the input data to transform it into a useful format. The Term Frequency-Inverse Document Frequency (TF-IDF) model is then applied to extract the feature vectors. Next, the Convolutional Neural Network with Recurrent Neural Network (CRNN) model is utilized to classify the Arabic text. In the final stage, the Crow Search Algorithm (CSA) is applied to fine-tune the CRNN model’s hyperparameters, which constitutes the novelty of the work. The proposed AATC-HTHDL model was experimentally validated under different parameters, and the outcomes established its superiority over other approaches.
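
The TF-IDF feature extraction step mentioned above can be sketched in a few lines. This is only an illustration under stated assumptions, not the AATC-HTHDL implementation: the weighting variant (length-normalized term frequency times log(N/df)), the function name `tfidf_vectors`, and the toy corpus of English placeholder tokens standing in for pre-processed Arabic terms are all hypothetical.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return (vocabulary, vectors) for a list of tokenized documents.

    Assumed variant: tf = count / document length, idf = log(N / df).
    Smoothed or sublinear variants are equally common in practice.
    """
    n_docs = len(docs)
    vocab = sorted({tok for doc in docs for tok in doc})
    # Document frequency: number of documents that contain each term.
    df = Counter(tok for doc in docs for tok in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # guard against empty documents
        vectors.append(
            [(counts[t] / total) * math.log(n_docs / df[t]) for t in vocab]
        )
    return vocab, vectors

# Hypothetical toy corpus: placeholder tokens for pre-processed Arabic terms.
docs = [
    ["economy", "grows", "fast"],
    ["sport", "grows", "fast"],
    ["economy", "sport"],
]
vocab, vectors = tfidf_vectors(docs)
```

Each document becomes a vector with one weight per vocabulary term; a term absent from a document gets weight zero, which is why such vectors are high-dimensional and sparse on realistic corpora.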


Lip reading is typically regarded as visually interpreting the speaker’s lip movements during speech, that is, decoding text from the speaker’s mouth movements. This paper proposes a lip-reading model that helps deaf people and persons with hearing problems to understand a speaker by capturing a video of the speaker and inputting it into the proposed model to obtain the corresponding subtitles. Using deep learning technologies makes it easier to extract a large number of different features, which can then be converted to probabilities of letters to obtain accurate results. Recently proposed methods for lip reading are based on sequence-to-sequence architectures that are designed for natural machine translation and audio speech recognition. In this paper, by contrast, a deep convolutional neural network model called the hybrid lip-reading (HLR-Net) model is developed for lip reading from a video. The proposed model includes three stages, namely, pre-processing, encoder, and decoder stages, which produce the output subtitle. The inception, gradient, and bidirectional GRU layers are used to build the encoder, and the attention, fully-connected, and activation function layers are used to build the decoder, which performs the connectionist temporal classification (CTC). In comparison with three recent models, namely, the LipNet model, the lip-reading model with cascaded attention (LCANet), and the attention-CTC (A-ACA) model, on the GRID corpus dataset, the proposed HLR-Net model achieves significant improvements, with a CER of 4.9%, WER of 9.7%, and BLEU score of 92% in the case of unseen speakers, and a CER of 1.4%, WER of 3.3%, and BLEU score of 99% in the case of overlapped speakers.
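
The character error rate (CER) and word error rate (WER) reported above are conventionally computed as the edit distance between the predicted and the reference transcript, divided by the reference length. The following is a minimal sketch of that convention, not the evaluation code of the paper; the function names and the example sentence pair are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (row-by-row DP)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def cer(reference, hypothesis):
    """Character error rate: character edits / reference characters."""
    return edit_distance(reference, hypothesis) / len(reference)

def wer(reference, hypothesis):
    """Word error rate: word edits / reference words."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)

# Hypothetical short command sentence with one misread letter.
reference = "bin blue at f two now"
hypothesis = "bin blue at s two now"
cer_value = cer(reference, hypothesis)  # 1 edit over 21 characters
wer_value = wer(reference, hypothesis)  # 1 edit over 6 words
```

Because a single wrong character makes the whole word wrong, WER is normally higher than CER on the same output, which matches the ordering of the figures quoted in the abstract.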


Handwritten character recognition systems are now used in many areas of life, including shopping malls, banks, and educational institutes. Urdu is the national language of Pakistan, and it is the fourth most-spoken language in the world. However, it is still challenging to recognize Urdu handwritten characters owing to their cursive nature. Our paper presents a Convolutional Neural Network (CNN) model for Urdu handwritten alphabet recognition (UHAR) of offline and online characters. Our research contributes an Urdu handwritten dataset (aka UHDS) to empower future work in this field. For offline systems, optical readers are used for extracting the alphabets, while diagonal-based extraction methods are implemented in online systems. Moreover, our research tackles the lack of comprehensive and standard Urdu alphabet datasets to support research in the area of Urdu text recognition. To this end, we collected 1000 handwritten samples for each alphabet, for a total of 38000 samples from the 12 to 25 age groups, to train our CNN model using online and offline mediums. Subsequently, we carried out detailed character recognition experiments, as detailed in the results. The proposed CNN model outperformed previously published approaches.

This research proposes an improved transfer-learning bird classification framework to achieve a more precise classification of Protected Indonesia Birds (PIB), which have been identified as endangered bird species. The framework uses the proposed sequence of Batch Normalization Dropout Fully-Connected (BNDFC) layers to enhance the baseline transfer-learning model. The main contribution of this work is the proposed sequence of BNDFC layers, which can be applied to any Convolutional Neural Network (CNN) based model to improve classification accuracy, especially for image-based species classification problems. The experimental results show that the proposed sequence of BNDFC layers outperforms other combinations of BNDFC. The addition of BNDFC improves the model’s performance across ten different CNN-based models. On average, BNDFC yields improvements of approximately 19.88% in Accuracy, 24.43% in F-measure, 17.93% in G-mean, 23.41% in Sensitivity, and 18.76% in Precision. Moreover, applying fine-tuning (FT) enhances the accuracy by 0.85%, with an 18.33% improvement in validation loss. In addition, MobileNetV2 was observed to be the best baseline model, with the lightest size of 35.9 MB and the highest accuracy of 88.07% on the validation set.


The increasing capabilities of Artificial Intelligence (AI) have led researchers and visionaries to think in the direction of machines outperforming humans by gaining intelligence equal to or greater than that of humans, which may not always have a positive impact on society. AI gone rogue and the Technological Singularity are major concerns in academia as well as industry. It is necessary to identify the limitations of machines and analyze their incompetence, which could draw a line between human and machine intelligence. Internet memes are an amalgam of pictures, videos, underlying messages, ideas, sentiments, humor, and experiences; hence, the way an internet meme is perceived by a human may not be entirely how a machine comprehends it. In this paper, we present experimental evidence on how comprehending Internet memes is a challenge for AI. We use a combination of Optical Character Recognition techniques such as Tesseract, PixelLink, and the EAST detector to extract text from the memes, and machine learning algorithms such as Convolutional Neural Networks (CNN), Region-based Convolutional Neural Networks (RCNN), and transfer learning with a pre-trained DenseNet to assess the textual and facial emotions combined. We evaluate the performance using Sensitivity and Specificity. Our results show that comprehending memes is indeed a challenging task, and hence a major limitation of AI. This research should be of great interest to researchers working in the areas of Artificial General Intelligence and Technological Singularity.
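
Sensitivity and Specificity, the two evaluation measures named above, follow directly from the confusion counts of one class against the rest. The sketch below shows the standard definitions only; the function name and the label vectors are hypothetical and are not data from the study. It assumes that both the positive and the negative class occur in `y_true`.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for one class against the rest.

    sensitivity = TP / (TP + FN): share of actual positives that are found.
    specificity = TN / (TN + FP): share of actual negatives that are rejected.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical labels: 1 marks the emotion class of interest, 0 any other.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)  # 3/4 and 3/4
```

For a multi-class emotion problem, the same function can be called once per class by passing that class label as `positive`.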


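This paper introduces the concept of deep learning, a subfield of big data and machine learning, and explains the important role that deep learning plays in extracting valuable information from big data. It describes deep learning frameworks that use graphics processing units (GPUs) for parallel computation on big data, with particular emphasis on large-scale convolutional neural networks (CNNs), large-scale deep belief networks (DBNs), and large-scale recurrent neural networks (RNNs). It analyzes the volume, variety, and velocity characteristics of big data and introduces deep learning methods for large-scale, diverse, and high-velocity data. Finally, it looks ahead to the prospects for deep learning in the big data setting and points out that, in the near future, technologies that fuse big data with deep learning will achieve breakthroughs in fields such as computer vision and machine intelligence.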

This study is designed to develop an Artificial Intelligence (AI) based analysis tool that can accurately detect COVID-19 lung infections from portable chest x-rays (CXRs). Frontline physicians and radiologists face major challenges during the COVID-19 pandemic due to the suboptimal image quality and the large volume of CXRs. In this study, AI-based analysis tools were developed that can precisely classify COVID-19 lung infection. Publicly available datasets of COVID-19 (N = 1525), non-COVID-19 normal (N = 1525), viral pneumonia (N = 1342), and bacterial pneumonia (N = 2521) were taken from the Italian Society of Medical and Interventional Radiology (SIRM), Radiopaedia, The Cancer Imaging Archive (TCIA), and Kaggle repositories. A multi-approach utilizing the deep learning model ResNet101 with and without hyperparameter optimization was employed. Additionally, the features extracted from the average pooling layer of ResNet101 were used as input to machine learning (ML) algorithms, which amounted to training the learning algorithms twice. ResNet101 with optimized parameters yielded improved performance compared with the default parameters. Feeding the features extracted from ResNet101 to the k-nearest neighbor (KNN) and support vector machine (SVM) classifiers yielded the highest 3-class classification performance of 99.86% and 99.46%, respectively. The results indicate that the proposed approach can be used to improve the accuracy and diagnostic efficiency of CXRs. The proposed deep learning model has the potential to further improve the efficiency of healthcare systems for proper diagnosis and prognosis of COVID-19 lung infection.

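The vehicle identification number (VIN) is of great importance for annual vehicle inspection. Because character-level annotations are lacking, single-character style verification of the VIN cannot be performed. To address this problem, a single-character detection and recognition framework is designed, and a weakly supervised learning method that requires no character-level annotation is proposed for this framework. First, feature information from each level of VGG16-BN is fused to obtain a fused feature map that carries both the positional and the semantic information of single characters; second, a character detection branch is designed...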

储有亮  李梁 《声学技术》2021,40(6):815-821
To address the severe distortion of speech signals transmitted through the air in strong-noise environments, this paper proposes, based on a Deep Bidirectional Long and Short Term Memory-Deep Convolutional Neural Network (DBLSTM-DCNN), a bone-conducted speech...

In the modern world, one of the most severe eye diseases brought on by diabetes is diabetic retinopathy (DR), which results in retinal damage and can thus lead to blindness. DR can be treated effectively when diagnosed early. Retinal fundus images are used to screen for lesions in the retina. However, detecting DR in the early stages is challenging due to the minimal symptoms. Furthermore, the occurrence of diseases linked to vascular anomalies brought on by DR aids in diagnosing the condition. Nevertheless, manually identifying the lesions requires considerable resources. Similarly, training Convolutional Neural Networks (CNN) is time-consuming. This research aims to improve diabetic retinopathy diagnosis by developing an enhanced deep learning model (EDLM) for timely DR identification that is potentially more accurate than existing CNN-based models. The proposed model detects various lesions from retinal images in the early stages. First, features are extracted from the retinal fundus image and fed into the EDLM for classification. The EDLM is also used for dimensionality reduction. Additionally, the classification and feature extraction processes are optimized using the stochastic gradient descent (SGD) optimizer. The EDLM’s effectiveness is assessed on the Kaggle dataset with 3459 retinal images, and the results are compared against VGG16, VGG19, ResNet18, ResNet34, and ResNet50. Experimental results show that the EDLM achieves an average sensitivity that is higher by 8.28% than VGG16, 7.03% than VGG19, 5.58% than ResNet18, 4.26% than ResNet34, and 2.04% than ResNet50.

仝钰  庞新宇  魏子涵 《振动与冲击》2021,(5):247-253,260
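To address the problem that one-dimensional signals used as convolutional neural network input cannot make full use of the correlation information within the data, a GADF-CNN bearing fault diagnosis model is proposed. The collected vibration signals are encoded using the Gramian angular difference field (GADF), whose angular perspective makes it easy to identify temporal correlations over different time intervals and to generate the corresponding feature maps; these are then fed into a convolutional neural network (CNN), which adaptively performs feature extraction and classification of rolling bearing faults...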
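
The Gramian angular difference field (GADF) encoding referred to above turns a one-dimensional signal into a square image that a CNN can consume. A minimal sketch of the commonly used definition follows: rescale the series to [-1, 1], map each value to an angle with arccos, and fill the matrix with sin(phi_i - phi_j). This is not the code of the paper; the function name and sample values are hypothetical, and the sketch assumes a non-constant input series.

```python
import math

def gadf(series):
    """Gramian angular difference field of a non-constant 1-D series."""
    lo, hi = min(series), max(series)
    # Min-max rescale every sample into the interval [-1, 1].
    scaled = [((x - hi) + (x - lo)) / (hi - lo) for x in series]
    # Polar encoding; clip to guard against floating-point overshoot.
    phi = [math.acos(max(-1.0, min(1.0, s))) for s in scaled]
    # Entry (i, j) encodes the angular difference between samples i and j.
    return [[math.sin(pi - pj) for pj in phi] for pi in phi]

# Hypothetical three-sample vibration segment.
image = gadf([0.0, 0.5, 1.0])
```

The resulting matrix has a zero diagonal and is antisymmetric, so each off-diagonal entry records the relation between two time steps; a segment of n samples yields an n by n image.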

Internet of Things (IoT) devices handle a large amount of data in several fields, including medicine, business, and engineering. User authentication is paramount in the IoT era to assure the security of connected devices. However, traditional authentication methods and conventional biometrics-based authentication approaches such as face recognition, fingerprints, and passwords are vulnerable to various attacks, including smudge attacks, heat attacks, and shoulder surfing attacks. Behavioral biometrics is enabled by the powerful sensing capabilities of IoT devices such as smart wearables and smartphones, making continuous authentication possible. Artificial Intelligence (AI)-based approaches are promising for refining large amounts of homogeneous biometric data to provide innovative user authentication solutions. This paper presents a new continuous passive authentication approach capable of learning the signatures of IoT users, utilizing smartphone sensors such as the gyroscope, magnetometer, and accelerometer to recognize users by their physical activities. This approach integrates convolutional neural network (CNN) and recurrent neural network (RNN) models to learn signatures of human activities from different users. A series of experiments is conducted using the MotionSense dataset to validate the effectiveness of the proposed method. Our technique offers a competitive verification accuracy of 98.4%. We compared the proposed method with several conventional machine learning and CNN models and found that it achieves higher identification accuracy than recently developed verification systems. The high accuracy achieved by the proposed method demonstrates its effectiveness in recognizing IoT users passively through their physical activity patterns.

This paper introduces the principle of recognizing engine work-wave signals with a neural network. A diagnosis method for recognizing engine trouble from its work wave is proposed. The design process is illustrated by diagnosing a voltage fault of the fuel injector of an electronic control (EC) engine.

This paper presents an intelligent system for gastrointestinal polyp detection in endoscopic video. Video endoscopy is a popular diagnostic modality for assessing gastrointestinal polyps, but the accuracy of diagnosis depends mostly on the doctor's experience, which is crucial for detecting polyps in many cases. Computer-aided polyp detection promises to reduce the missed-detection rate of polyps and thus improve the accuracy of diagnosis. The proposed method is an automatic system based on a new color feature extraction scheme that supports gastrointestinal polyp detection. The scheme combines color empirical mode decomposition features with convolutional neural network features extracted from video frames. The features are fed into a linear support vector machine to train the classifier. Experiments on standard public databases show that the proposed scheme outperforms previous conventional methods, achieving an accuracy of 99.53%, sensitivity of 99.91%, and specificity of 99.15%.

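Traditional data-driven methods for mechanical fault pattern recognition require manually constructed feature extraction algorithms, and constructing them is tedious. To address this, and drawing on the wide use of convolutional neural networks (CNNs) for automatic image feature extraction and image classification, a bearing fault pattern recognition method based on CNN image classification is proposed. First, the ensemble empirical mode decomposition (EEMD) method is used to adaptively decompose the bearing vibration signal, and the resulting intrinsic mode function (IMF) components are screened using correlation coefficients. Second, pseudo Wigner-Ville distribution (PWVD) time-frequency analysis is applied to the selected IMF components to obtain the time-frequency distribution map of the signal, and the time-frequency maps are pre-processed. Finally, the pre-processed time-frequency maps for 15 different bearing operating conditions are passed to a CNN for feature extraction and classification. Compared with similar methods, the proposed method improves classification accuracy by 4.26%.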
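
The correlation-coefficient screening of decomposed components described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure of the paper: the Pearson coefficient against the original signal, the 0.3 threshold, the function names, and the two synthetic sine components standing in for EEMD output are all hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def select_components(signal, components, threshold=0.3):
    """Indices of components whose |correlation| with the signal is high."""
    return [i for i, comp in enumerate(components)
            if abs(pearson(signal, comp)) >= threshold]

# Synthetic stand-ins for decomposed components: one dominant oscillation
# and one weak oscillation that contributes little to the signal.
n = 200
strong = [math.sin(2 * math.pi * 5 * k / n) for k in range(n)]
weak = [0.05 * math.sin(2 * math.pi * 40 * k / n) for k in range(n)]
signal = [a + b for a, b in zip(strong, weak)]
kept = select_components(signal, [strong, weak])  # keeps only index 0
```

Components that barely correlate with the measured signal are dropped before time-frequency analysis, so the later images are built only from the parts of the decomposition that carry most of the signal.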

