Disaster tweets classification method based on pretrained BERT model
Citation: LIN Jia-rui, CHENG Zhi-gang, HAN Yu, YIN Yun-peng. Disaster tweets classification method based on pretrained BERT model[J]. Journal of Graphics, 2022, 43(3): 530-536. DOI: 10.11996/JG.j.2095-302X.2022030530
Authors:LIN Jia-rui  CHENG Zhi-gang  HAN Yu  YIN Yun-peng
Affiliation:Department of Civil Engineering, Tsinghua University, Beijing 100084, China
Foundation items:National Natural Science Foundation of China (72091512, 51908323)
Abstract:Social media has become an important medium for releasing and disseminating information about sudden-onset disasters, and effectively identifying and using the genuine information it carries is of great significance to disaster emergency management. To address the shortcomings of traditional text classification models, a disaster tweet classification method was proposed based on the pre-trained bidirectional encoder representations from transformers (BERT) model. After data cleaning, preprocessing, and comparative analysis of candidate algorithms, a text classification model combining long short-term memory and convolutional neural networks (LSTM-CNN) was built on top of the pre-trained BERT encoder. Experiments on the tweet dataset from the Kaggle competition platform showed that the proposed model outperformed the traditional Naive Bayes classifier and common fine-tuning models, achieving a recognition rate of up to 85% and coping better with the small-sample classification problem. This work is of significance for accurately identifying genuine disaster information and improving the efficiency of disaster emergency response and communication.
Keywords:text classification  deep learning  BERT  pre-trained model  fine-tuning  disaster  emergency management  
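The abstract describes the architecture only at a high level: a pre-trained BERT encoder whose token representations feed an LSTM-CNN classification head, fine-tuned for a binary disaster/non-disaster label. The PyTorch sketch below is an illustrative reconstruction under those assumptions, not the authors' implementation; the checkpoint name (bert-base-uncased), layer sizes, and pooling choice are placeholders.

# Minimal sketch of a BERT + LSTM-CNN tweet classifier (assumed architecture,
# not the paper's code). Requires: torch, transformers.
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizer

class BertLstmCnnClassifier(nn.Module):
    def __init__(self, bert_name="bert-base-uncased",
                 lstm_hidden=128, num_filters=64, kernel_size=3, num_classes=2):
        super().__init__()
        self.bert = BertModel.from_pretrained(bert_name)
        # Bidirectional LSTM over the BERT token embeddings.
        self.lstm = nn.LSTM(self.bert.config.hidden_size, lstm_hidden,
                            batch_first=True, bidirectional=True)
        # 1-D convolution over the LSTM output sequence, then global max pooling.
        self.conv = nn.Conv1d(2 * lstm_hidden, num_filters, kernel_size)
        self.dropout = nn.Dropout(0.1)
        self.fc = nn.Linear(num_filters, num_classes)

    def forward(self, input_ids, attention_mask):
        # (batch, seq_len, hidden) token-level representations from BERT.
        hidden = self.bert(input_ids=input_ids,
                           attention_mask=attention_mask).last_hidden_state
        seq, _ = self.lstm(hidden)                           # (batch, seq_len, 2*lstm_hidden)
        feats = torch.relu(self.conv(seq.transpose(1, 2)))   # (batch, num_filters, L)
        pooled = torch.max(feats, dim=2).values              # global max pooling
        return self.fc(self.dropout(pooled))                 # (batch, num_classes) logits

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertLstmCnnClassifier()
batch = tokenizer(["Forest fire near La Ronge Sask. Canada"],
                  padding=True, truncation=True, return_tensors="pt")
logits = model(batch["input_ids"], batch["attention_mask"])

In this layout the bidirectional LSTM captures word order across the whole tweet while the convolution and max pooling extract the most salient local n-gram features, which is one common way to combine the two; the paper's exact layer configuration and training hyperparameters are not given in the abstract.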