Personal event detection method based on text mining in social media
Cite this article: Rui XIAO, Mingyi LIU, Zhiying TU, Zhongjie WANG. Personal event detection method based on text mining in social media[J]. Journal of Computer Applications, 2022, 42(11): 3513-3519.
Authors: Rui XIAO  Mingyi LIU  Zhiying TU  Zhongjie WANG
Affiliation: Faculty of Computing, Harbin Institute of Technology, Harbin Heilongjiang 150001, China
Funding: National Natural Science Foundation of China (61772155)
Abstract: Users' social media posts record their past personal experiences and reveal latent life patterns, and studying these patterns is of great value for predicting users' future behavior and for personalized recommendation. Weibo data were collected, 11 types of events were defined, and a three-stage pipeline system built on the BERT (Bidirectional Encoder Representations from Transformers) pre-trained model was proposed for personal event detection; the three stages use BERT+BiLSTM+Attention, BERT+FullConnect and BERT+BiLSTM+CRF, respectively. From each Weibo text, the system extracts whether the text contains a defined event, which event types it contains, and the elements of each event: Subject (subject of the event), Object (event element), Time (time at which the event occurred), Place (place where the event occurred) and Tense (tense of the event). The way events change along a user's personal timeline can then be explored to predict personal events. Experiments were conducted on a collected dataset of real users' Weibo posts, with comparisons against classification algorithms such as logistic regression, naive Bayes, random forest and decision tree. The results show that the BERT+BiLSTM+Attention, BERT+FullConnect and BERT+BiLSTM+CRF methods achieve the highest F1-score in their respective stages, verifying the effectiveness of the proposed method. Finally, users' personal event timelines were constructed and visualized from the extracted events and their time information.
Keywords: social media; personal event; event detection; BERT (Bidirectional Encoder Representations from Transformers) model; personal event timeline
Received: 2022-01-27
Revised: 2022-03-20
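
The control flow of the three-stage pipeline described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the three stages are passed in as plain callables, and the `toy_*` functions are hypothetical keyword and regex stand-ins for the BERT-based models. The event type names (`travel`, `dining`) are invented for the example, since the abstract does not list the 11 defined types, and the assignment of the three outputs to the three stages in order is inferred from the abstract's wording.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class PersonalEvent:
    """One detected event with the five elements named in the abstract."""
    event_type: str
    subject: Optional[str] = None
    obj: Optional[str] = None      # the "Object" element
    time: Optional[str] = None
    place: Optional[str] = None
    tense: Optional[str] = None


def detect_events(
    text: str,
    contains_event: Callable[[str], bool],
    classify_types: Callable[[str], List[str]],
    extract_elements: Callable[[str, str], Dict[str, str]],
) -> List[PersonalEvent]:
    """Run the three stages in order on a single post."""
    # Stage 1: does the text contain any defined event?
    if not contains_event(text):
        return []
    events = []
    # Stage 2: which event types does the text contain?
    for event_type in classify_types(text):
        # Stage 3: extract the elements of each detected event.
        elements = extract_elements(text, event_type)
        events.append(PersonalEvent(event_type=event_type, **elements))
    return events


def build_timeline(events: List[PersonalEvent]) -> List[PersonalEvent]:
    """Order events that carry a time element (ISO dates sort as strings)."""
    return sorted((e for e in events if e.time is not None), key=lambda e: e.time)


# Hypothetical stand-ins for the trained models (assumptions, not from the paper).
TOY_KEYWORDS = {"travel": ["flew to", "trip"], "dining": ["dinner", "lunch"]}


def toy_classify_types(text: str) -> List[str]:
    lowered = text.lower()
    return [t for t, kws in TOY_KEYWORDS.items() if any(k in lowered for k in kws)]


def toy_contains_event(text: str) -> bool:
    return bool(toy_classify_types(text))


def toy_extract_elements(text: str, event_type: str) -> Dict[str, str]:
    # Only the Time element is extracted here, from an ISO date in the text.
    match = re.search(r"\d{4}-\d{2}-\d{2}", text)
    return {"time": match.group(0)} if match else {}


posts = [
    "On 2022-03-02 I had dinner with friends.",
    "Nice weather today.",
    "On 2022-01-05 I flew to Harbin.",
]
all_events = [
    event
    for post in posts
    for event in detect_events(
        post, toy_contains_event, toy_classify_types, toy_extract_elements
    )
]
timeline = build_timeline(all_events)
for event in timeline:
    print(event.time, event.event_type)
```

Passing the stages in as callables keeps the pipeline logic separate from the models, so each toy function could in principle be replaced by a trained classifier or sequence labeler without changing `detect_events`.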