Enhancing multimodal disaster tweet classification using state-of-the-art deep learning networks期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

Enhancing multimodal disaster tweet classification using state-of-the-art deep learning networks

Authors:	Adwaith Divakaran Abishake Ashok Kumar Raghul Siva Venkatesh Sivasankar Elango

Affiliation:	1.Department of Computer Science, National Institute of Technology, Tiruchirappalli, India ;2.Tanjore Main Road, National Highway 67, Near BHEL Trichy, Tiruchirappalli, Tamil Nadu, 620015, India ;

Abstract:	During disasters, multimedia content on social media sites offers vital information. Reports of injured or deceased people, infrastructure destruction, and missing or found people are among the types of information exchanged. While several studies have demonstrated the importance of both text and picture content for disaster response, previous research has primarily concentrated on the text modality and not so much success with multi-modality. Latest research in multi-modal classification in disaster related tweets uses comparatively primitive models such as KIMCNN and VGG16. In this research work we have taken this further and utilized state-of-the-art models in both text and image classification to try and improve multi-modal classification of disaster related tweets. The research was conducted on two different classification tasks, first to detect if a tweet is informative or not, second to understand the response needed. The process of multimodal analysis is broken down by incorporating different methods of feature extraction from the textual data corpus and pre-processing the corresponding image corpus, then we use several classification models to train and predict the output and compare their performances while tweaking the parameters to improve the results. Models such as XLNet, BERT and RoBERTa in text classification and ResNet, ResNeXt and DenseNet in image classification were trained and analyzed. Results show that the proposed multimodal architecture outperforms models trained using a single modality (text or image alone). Also, it proves that the newer state-of-the-art models outperform the baseline models by a reasonable margin for both the classification tasks.

Keywords:
本文献已被 SpringerLink 等数据库收录！

设为首页 | 免责声明 | 关于勤云 | 加入收藏