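E-commerce Event Detection with Chinese Character Glyph Based ELMo (基于中文字形的ELMo在电商事件识别上的应用)
Cite this article: WANG Mingtao (王铭涛), FANG Yewei (方晔玮), CHEN Wenliang (陈文亮). 基于中文字形的ELMo在电商事件识别上的应用[J]. Journal of Chinese Information Processing (中文信息学报), 2021, 35(12): 94-102
Authors: WANG Mingtao (王铭涛)  FANG Yewei (方晔玮)  CHEN Wenliang (陈文亮)
Affiliation: School of Computer Science and Technology, Soochow University (苏州大学), Suzhou, Jiangsu 215006, China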
Funding: National Natural Science Foundation of China (61525205, 61876115)
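Abstract (translated from the Chinese, truncated in the source): Mining e-commerce events from e-commerce review text is of great help in analyzing user shopping behavior and classifying product scenarios. This paper gives a definition of e-commerce events, casts e-commerce event detection as a sequence labeling problem, and builds an annotated e-commerce event dataset from e-commerce review text. The paper first extends a character-based BiLSTM-CRF neural network model by adding language-model embeddings (Embeddings from Language Mod...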

Keywords: e-commerce event  sequence labeling  glyph features  ELMo
Received: 2020-03-02

E-commerce Event Detection with Chinese Character Glyph Based ELMo
WANG Mingtao, FANG Yewei, CHEN Wenliang. E-commerce Event Detection with Chinese Character Glyph Based ELMo[J]. Journal of Chinese Information Processing, 2021, 35(12): 94-102
Authors: WANG Mingtao  FANG Yewei  CHEN Wenliang
Affiliation: School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006, China
Abstract: Mining events in e-commerce reviews is of great help in analyzing customer shopping behavior and classifying commodity scenes. This paper defines the e-commerce event, treats event detection as a sequence labeling problem, and constructs an event detection corpus from e-commerce reviews. First, it extends the character-based BiLSTM-CRF model with Embeddings from Language Models (ELMo) to improve performance. Then, it considers the glyph characteristics of Chinese characters, including the five-stroke (Wubi) encoding and common strokes, and proposes two novel models that add glyph features to ELMo using the glyph information of events. Experimental results show that the proposed models improve performance on the newly built dataset. Finally, the paper trains language models on two large text corpora, one from the news domain and one from the e-commerce domain; the results show that the e-commerce corpus is more helpful to the system.
Keywords: e-commerce event  sequence labeling  glyph features  ELMo
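
The abstract states that event detection is treated as a sequence labeling problem over characters. As a minimal sketch of what that framing can look like, the snippet below converts character-level event spans into per-character tags and back. The BIO tagging scheme, the `EVENT` label, the function names, and the sample review are all assumptions made for illustration; the abstract does not specify the paper's actual tag set or data.

```python
from typing import List, Optional, Tuple

Span = Tuple[int, int]  # [start, end) character offsets into the review text


def spans_to_bio(text: str, spans: List[Span], label: str = "EVENT") -> List[str]:
    """Turn character spans into one BIO tag per character (assumed scheme)."""
    tags = ["O"] * len(text)
    for start, end in spans:
        if not (0 <= start < end <= len(text)):
            raise ValueError(f"span {(start, end)} is outside the text")
        tags[start] = f"B-{label}"          # first character of the event
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"          # remaining characters of the event
    return tags


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Recover [start, end) spans from a BIO tag sequence."""
    spans: List[Span] = []
    start: Optional[int] = None
    for i, tag in enumerate(tags):
        if tag.startswith("B-"):
            if start is not None:           # a new event begins right after another
                spans.append((start, i))
            start = i
        elif tag.startswith("I-") and start is not None:
            continue                        # still inside the current event
        else:
            if start is not None:           # "O" (or an orphan "I-") closes the event
                spans.append((start, i))
            start = None
    if start is not None:                   # event runs to the end of the text
        spans.append((start, len(tags)))
    return spans


# Hypothetical review sentence with one invented event span ("过生日").
review = "买来送给妈妈过生日"
tags = spans_to_bio(review, [(6, 9)])
recovered = bio_to_spans(tags)
```

A character-based tagger such as the BiLSTM-CRF named in the abstract would be trained to predict a tag sequence like `tags` from the characters of `review`; `bio_to_spans` shows how predicted tags map back to event mentions.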