
Extracting Product Features and Sentiment Words Based on Semantically Constrained LDA
Cite this article: PENG Yun, WAN Chang-Xuan, JIANG Teng-Jiao, LIU De-Xi, LIU Xi-Ping, LIAO Guo-Qiong. Extracting Product Features and Sentiment Words Based on Semantically Constrained LDA[J]. Journal of Software, 2017, 28(3): 676-693.
Authors: PENG Yun  WAN Chang-Xuan  JIANG Teng-Jiao  LIU De-Xi  LIU Xi-Ping  LIAO Guo-Qiong
Affiliation (all six authors): School of Information and Technology, Jiangxi University of Finance and Economics, Nanchang, Jiangxi 330013; Jiangxi Key Laboratory of Data and Knowledge Engineering (Jiangxi University of Finance and Economics), Nanchang, Jiangxi 330013
Funding: National Natural Science Foundation of China (61562032, 61662032, 61662027, 61173146, 61363039, 61363010, 61462037, 61562031); Major Project of the Natural Science Foundation of Jiangxi Province (20152ACB20003); Science and Technology Landing Program of Jiangxi Higher Education Institutions (KJLD12022, KJLD14035)
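Abstract: With the growth of online shopping, a large volume of product review text has accumulated on the Web, and it contains rich evaluative knowledge. Effectively extracting product features and sentiment words from this mass of review text, and then obtaining feature-level sentiment orientation, is the key to fine-grained sentiment analysis of product reviews. Based on the characteristics of Chinese product review text, this paper obtains semantic relations between words from several angles, including syntactic analysis, word-sense understanding, and context relevance, and embeds them as constraint knowledge into a topic model. The resulting semantic relation constrained topic model, SRC-LDA (semantic relation constrained LDA), performs semantically guided extraction of fine-grained topic words with LDA. Because SRC-LDA improves the ability of standard LDA to semantically understand and recognize topic words, it raises the relatedness of topic words assigned to the same topic and the separation of topic words assigned to different topics, and it can discover more fine-grained feature words, sentiment words, and the semantic associations between them. Experiments show that SRC-LDA performs well at discovering and extracting fine-grained features and sentiment words.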

Keywords: LDA model  semantic constraint  product features  sentiment words
Received: 7/3/2016
Revised: 2016/9/14

Extracting Product Aspects and User Opinions Based on Semantic Constrained LDA Model
PENG Yun, WAN Chang-Xuan, JIANG Teng-Jiao, LIU De-Xi, LIU Xi-Ping and LIAO Guo-Qiong. Extracting Product Aspects and User Opinions Based on Semantic Constrained LDA Model[J]. Journal of Software, 2017, 28(3): 676-693.
Authors: PENG Yun  WAN Chang-Xuan  JIANG Teng-Jiao  LIU De-Xi  LIU Xi-Ping and LIAO Guo-Qiong
Affiliation (all six authors): School of Information and Technology, Jiangxi University of Finance and Economics, Nanchang 330013, China; Jiangxi Key Laboratory of Data and Knowledge Engineering (Jiangxi University of Finance and Economics), Nanchang 330013, China
Abstract: With the growth of online shopping, the Web has accumulated a large quantity of product reviews, which contain abundant evaluative knowledge about products. Extracting aspects and opinion words from these reviews, and then determining sentiment polarity at the aspect level, is the key problem in fine-grained sentiment analysis of product reviews. First, considering the characteristics of Chinese product reviews, methods are designed to obtain the semantic relations among words through syntactic analysis, word-meaning understanding, and context relevance, and these relations are embedded into the topic model as constraint knowledge. Second, a semantic relation constrained topic model called SRC-LDA is proposed, which guides LDA in extracting fine-grained topical words. By improving the ability of standard LDA to semantically understand and recognize topical words, the proposed model increases the correlation among words under the same topic and the discrimination between words under different topics, and so finds more fine-grained aspect words, opinion words, and the semantic associations between them. The experimental results show that SRC-LDA is an effective approach for extracting fine-grained aspects and opinion words.
Keywords:Latent Dirichlet Allocation model  semantic constraint  product aspects  opinion words
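
The abstract states that semantic relations between words are embedded into LDA as constraint knowledge, but this page does not give the model's equations. The sketch below is therefore not SRC-LDA itself. It is a generic illustration of one way word-pair constraints can bias a collapsed Gibbs sampler for LDA, under loudly stated assumptions: the function name `constrained_lda`, the `must_link` / `cannot_link` pair lists, the `strength` parameter, and the exponential bias term are all hypothetical choices made for this example.

```python
import math
import random
from collections import defaultdict


def constrained_lda(docs, num_topics, must_link=(), cannot_link=(),
                    alpha=0.1, beta=0.01, strength=2.0, iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA with soft word-pair constraints.

    docs: list of token lists; must_link / cannot_link: iterables of word pairs.
    Returns (topic_word_counts, doc_topic_counts).
    Illustrative only: the bias term is an assumption, not the SRC-LDA formula.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})

    # Symmetric lookup tables for the two kinds of constraints.
    must, cannot = defaultdict(set), defaultdict(set)
    for a, b in must_link:
        must[a].add(b)
        must[b].add(a)
    for a, b in cannot_link:
        cannot[a].add(b)
        cannot[b].add(a)

    n_kw = [defaultdict(int) for _ in range(num_topics)]  # topic -> word -> count
    n_k = [0] * num_topics                                # tokens per topic
    n_dk = [[0] * num_topics for _ in docs]               # doc -> topic -> count
    n_w = defaultdict(int)                                # corpus count per word
    z = []                                                # topic of every token

    # Random initial topic assignment.
    for d, doc in enumerate(docs):
        assignments = []
        for w in doc:
            k = rng.randrange(num_topics)
            assignments.append(k)
            n_kw[k][w] += 1
            n_k[k] += 1
            n_dk[d][k] += 1
            n_w[w] += 1
        z.append(assignments)

    def share(u, k):
        # Fraction of word u's tokens currently assigned to topic k.
        return n_kw[k][u] / n_w[u] if n_w[u] else 0.0

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k_old = z[d][i]
                n_kw[k_old][w] -= 1
                n_k[k_old] -= 1
                n_dk[d][k_old] -= 1

                weights = []
                for k in range(num_topics):
                    # Standard collapsed Gibbs weight for LDA.
                    base = ((n_dk[d][k] + alpha) * (n_kw[k][w] + beta)
                            / (n_k[k] + vocab_size * beta))
                    # Assumed soft constraint: pull toward topics holding
                    # must-link partners, push away from cannot-link partners.
                    bias = (sum(share(u, k) for u in must[w])
                            - sum(share(u, k) for u in cannot[w]))
                    weights.append(base * math.exp(strength * bias))

                k_new = rng.choices(range(num_topics), weights=weights)[0]
                z[d][i] = k_new
                n_kw[k_new][w] += 1
                n_k[k_new] += 1
                n_dk[d][k_new] += 1

    return n_kw, n_dk


if __name__ == "__main__":
    # Toy reviews, already tokenized; the word pairs are made-up constraints.
    reviews = [["screen", "clear", "display"],
               ["battery", "long", "battery"],
               ["display", "clear"]]
    topic_words, _ = constrained_lda(
        reviews, num_topics=2,
        must_link=[("screen", "display")],
        cannot_link=[("screen", "battery")])
    for k, counts in enumerate(topic_words):
        print(k, {w: c for w, c in counts.items() if c > 0})
```

Setting `strength=0.0` removes the bias and reduces the sampler to plain collapsed Gibbs LDA, which gives a simple baseline for seeing what the constraints change on a given corpus.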