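Hierarchical attention-based neural network model for spam review detection
Cite this article: LIU Yuxin, WANG Li, ZHANG Hao. Hierarchical attention-based neural network model for spam review detection[J]. Journal of Computer Applications, 2018, 38(11): 3063-3068. DOI: 10.11772/j.issn.1001-9081.2018041356
Authors: LIU Yuxin  WANG Li  ZHANG Hao
Affiliations: 1. College of Information and Computer, Taiyuan University of Technology, Jinzhong, Shanxi 030600, China; 2. College of Data Science, Taiyuan University of Technology, Jinzhong, Shanxi 030600, China
Funding: National High Technology Research and Development Program (863 Program) of China (2014AA015204); National Natural Science Foundation of China (61702356); Natural Science Foundation of Shanxi Province (201703D421013); Project of the Key Laboratory of Network Data Science and Technology, Institute of Computing Technology, Chinese Academy of Sciences (CASNDST20140X).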
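Abstract: To address the problem that existing spam review identification methods can hardly reveal the latent semantic information of user reviews, a Hierarchical Attention-based Neural Network (HANN) detection model is proposed. The model consists mainly of two parts: the Word2Sent layer, which uses a Convolutional Neural Network (CNN) to generate continuous sentence representations from word embeddings; and the Sent2Doc layer, which uses an attention-pooling neural network to generate document representations from the sentence representations produced by the previous layer. The generated document representation is used directly as the final feature of a review and is classified with a softmax classifier. By fully preserving the position and intensity features of a review and extracting important and comprehensive information from them (the history, future, and local context at any position in the document), the model mines the latent semantic information of user reviews and thereby improves spam review detection accuracy. Experimental results show that, compared with methods based only on neural networks, the model improves accuracy by 5% on average and significantly improves classification performance.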

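Keywords: spam review  representation learning  attention mechanism  convolutional neural network  bidirectional long short-term memory
Received: 2018-04-30
Revised: 2018-06-26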

Hierarchical attention-based neural network model for spam review detection
LIU Yuxin,WANG Li,ZHANG Hao. Hierarchical attention-based neural network model for spam review detection[J]. Journal of Computer Applications, 2018, 38(11): 3063-3068. DOI: 10.11772/j.issn.1001-9081.2018041356
Authors: LIU Yuxin  WANG Li  ZHANG Hao
Affiliation: 1. College of Information and Computer, Taiyuan University of Technology, Jinzhong, Shanxi 030600, China; 2. College of Data Science, Taiyuan University of Technology, Jinzhong, Shanxi 030600, China
Abstract: Existing methods for detecting spam reviews mainly focus on designing features from linguistic and psychological clues, which can hardly reveal the latent semantic information of the reviews. A Hierarchical Attention-based Neural Network (HANN) model was proposed to mine this latent semantic information. The model mainly consisted of two layers: the Word2Sent layer, which used a Convolutional Neural Network (CNN) to produce continuous sentence representations on the basis of word embeddings, and the Sent2Doc layer, which utilized an attention pooling-based neural network to generate document representations on the basis of the sentence representations. The generated document representations were directly employed as features to identify spam reviews. The proposed hierarchical attention mechanism enables the model to preserve position and intensity information completely, so that comprehensive information (the history, future, and local context of any position in a document) can be extracted. The experimental results show that, compared with methods based only on neural networks, the proposed model increases accuracy by 5% on average and significantly improves classification performance.
Keywords: spam review  representation learning  attention mechanism  Convolutional Neural Network (CNN)  Bidirectional Long Short-Term Memory (BLSTM)
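
The two-layer data flow described in the abstract (words to sentences, sentences to document, document to class probabilities) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `word2sent`, `sent2doc`, and `classify`, the dot-product attention scoring, the tanh activation, max-over-time pooling, and all sizes are assumptions chosen for illustration, the parameters are random rather than trained, and the recurrent (BLSTM) component suggested by the keywords is omitted.

```python
import math
import random

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def word2sent(word_vecs, filters, width=2):
    """CNN-style sentence encoder (assumed form): each filter slides over
    windows of `width` consecutive word vectors, and max-over-time pooling
    keeps the strongest response per filter.
    Requires at least `width` words in the sentence."""
    windows = [
        [x for vec in word_vecs[i:i + width] for x in vec]  # concatenate window
        for i in range(len(word_vecs) - width + 1)
    ]
    return [max(math.tanh(dot(f, w)) for w in windows) for f in filters]

def sent2doc(sent_vecs, query):
    """Attention pooling (assumed dot-product scoring): score each sentence
    against a query vector, normalise the scores with softmax, and return
    the weighted sum of sentence vectors together with the weights."""
    weights = softmax([dot(query, s) for s in sent_vecs])
    dim = len(sent_vecs[0])
    doc = [sum(w * s[k] for w, s in zip(weights, sent_vecs)) for k in range(dim)]
    return doc, weights

def classify(doc_vec, class_weights):
    """Softmax classifier over the document representation."""
    return softmax([dot(w, doc_vec) for w in class_weights])

def rand_vec(rng, n):
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]

rng = random.Random(0)
EMB, WIDTH, N_FILTERS, N_CLASSES = 4, 2, 3, 2  # illustrative sizes only

# Untrained random parameters; a real model would learn these from data.
filters = [rand_vec(rng, EMB * WIDTH) for _ in range(N_FILTERS)]
query = rand_vec(rng, N_FILTERS)
class_weights = [rand_vec(rng, N_FILTERS) for _ in range(N_CLASSES)]

# A toy "review": 3 sentences with 5, 3, and 4 random word vectors.
review = [[rand_vec(rng, EMB) for _ in range(n)] for n in (5, 3, 4)]

sent_vecs = [word2sent(s, filters, WIDTH) for s in review]  # Word2Sent layer
doc_vec, attn = sent2doc(sent_vecs, query)                  # Sent2Doc layer
probs = classify(doc_vec, class_weights)                    # spam vs. non-spam
```

Here `attn` holds one weight per sentence and sums to 1, so it can be read as how much each sentence contributes to the document representation; `probs` holds one probability per class. With random parameters the outputs carry no meaning beyond showing the shapes and the flow of information.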