Disaster tweets classification method based on pretrained BERT model
Citation: LIN Jia-rui, CHENG Zhi-gang, HAN Yu, YIN Yun-peng. Disaster tweets classification method based on pretrained BERT model[J]. Journal of Graphics, 2022, 43(3): 530-536. DOI: 10.11996/JG.j.2095-302X.2022030530
Authors:LIN Jia-rui  CHENG Zhi-gang  HAN Yu  YIN Yun-peng
Affiliation:Department of Civil Engineering, Tsinghua University, Beijing 100084, China
Foundation items:National Natural Science Foundation of China (72091512, 51908323)
Abstract:Social media has become an important medium for releasing and disseminating information about sudden-onset disasters, and effectively identifying and using the genuine information it carries is of great significance to disaster emergency management. To address the shortcomings of traditional text classification models, a disaster tweet classification method was proposed based on the pre-trained bidirectional encoder representations from transformers (BERT) model. After data cleaning, preprocessing, and comparative analysis of candidate algorithms, a text classification model combining long short-term memory and convolutional neural networks (LSTM-CNN) was built on top of the pre-trained BERT encoder. Experiments on the tweet dataset from the Kaggle competition platform showed that the proposed model outperformed the traditional Naive Bayes classifier and common fine-tuning models, achieving a recognition rate of up to 85% and coping better with the small-sample classification problem. This work is of significance for accurately identifying genuine disaster information and improving the efficiency of disaster emergency response and communication.
Keywords:text classification  deep learning  BERT  pre-trained model  fine-tuning  disaster  emergency management  
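The abstract describes the architecture only at a high level: a pre-trained BERT encoder whose token representations feed an LSTM-CNN classification head, fine-tuned for a binary disaster/non-disaster label. The PyTorch sketch below is an illustrative reconstruction under those assumptions, not the authors' implementation; the checkpoint name (bert-base-uncased), layer sizes, and pooling choice are placeholders.

# Minimal sketch of a BERT + LSTM-CNN tweet classifier (assumed architecture,
# not the paper's code). Requires: torch, transformers.
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizer

class BertLstmCnnClassifier(nn.Module):
    def __init__(self, bert_name="bert-base-uncased",
                 lstm_hidden=128, num_filters=64, kernel_size=3, num_classes=2):
        super().__init__()
        self.bert = BertModel.from_pretrained(bert_name)
        # Bidirectional LSTM over the BERT token embeddings.
        self.lstm = nn.LSTM(self.bert.config.hidden_size, lstm_hidden,
                            batch_first=True, bidirectional=True)
        # 1-D convolution over the LSTM output sequence, then global max pooling.
        self.conv = nn.Conv1d(2 * lstm_hidden, num_filters, kernel_size)
        self.dropout = nn.Dropout(0.1)
        self.fc = nn.Linear(num_filters, num_classes)

    def forward(self, input_ids, attention_mask):
        # (batch, seq_len, hidden) token-level representations from BERT.
        hidden = self.bert(input_ids=input_ids,
                           attention_mask=attention_mask).last_hidden_state
        seq, _ = self.lstm(hidden)                           # (batch, seq_len, 2*lstm_hidden)
        feats = torch.relu(self.conv(seq.transpose(1, 2)))   # (batch, num_filters, L)
        pooled = torch.max(feats, dim=2).values              # global max pooling
        return self.fc(self.dropout(pooled))                 # (batch, num_classes) logits

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertLstmCnnClassifier()
batch = tokenizer(["Forest fire near La Ronge Sask. Canada"],
                  padding=True, truncation=True, return_tensors="pt")
logits = model(batch["input_ids"], batch["attention_mask"])

In this layout the bidirectional LSTM captures word order across the whole tweet while the convolution and max pooling extract the most salient local n-gram features, which is one common way to combine the two; the paper's exact layer configuration and training hyperparameters are not given in the abstract.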