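Feature-Based Automatic Summarization of User Reviews
Cite this article: Zhang Yanxing, Zhang Ming, Deng Zhihong. Feature-Based Automatic Summarization of User Reviews [J]. Journal of Computer Research and Development, 2009, 46(Z2).
Authors: Zhang Yanxing  Zhang Ming  Deng Zhihong
Affiliation: School of Information Science and Technology, Peking University, Beijing 100871
Funding: Special research fund project "Rapid Sharing of Scientific Papers in the Internet Era" of the Science and Technology Development Center, Ministry of Education; National High Technology Research and Development Program of China (863 Program)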
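Abstract: E-commerce websites allow users to post reviews of products. These reviews usually convey users' subjective experience with a product. Potential customers often consult them when comparing products and deciding what to buy, and manufacturers can use them as a data source for market feedback. However, with the growth of e-commerce, popular products often have hundreds or even thousands of reviews, so reading all of them is very time-consuming. This paper proposes a feature-based method for automatically summarizing user reviews that generates concise and comprehensive summaries. It first automatically identifies the product features that users comment on, classifies review sentences by feature, and then generates the summary by sentence extraction. Experiments show that the feature identification and feature filtering algorithm achieves an average precision of 81% and a recall of 52%. Compared with the frequent itemset mining algorithm used by Hu and Liu, recall is 6% lower, precision is 24% higher, and F1 is 6% higher. The algorithm places more emphasis on the precision of feature identification, and the overall summarization quality is fairly good.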

Keywords: automatic summarization  feature  user review  frequent itemset  classification  sentence extraction
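
The precision, recall, and F1 figures quoted in the abstract can be checked against one another with a few lines of arithmetic. The sketch below assumes that the reported differences from Hu and Liu's algorithm are absolute percentage points. The abstract does not say so explicitly, so the baseline values in the code are derived under that assumption and are not quoted from the paper.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Figures reported in the abstract for the proposed method.
ours_p, ours_r = 0.81, 0.52

# ASSUMPTION: "24% higher precision" and "6% lower recall" are read as
# absolute percentage points, so the baseline below is derived, not quoted.
base_p, base_r = ours_p - 0.24, ours_r + 0.06

ours_f1 = f1(ours_p, ours_r)
base_f1 = f1(base_p, base_r)
gain = ours_f1 - base_f1

print(f"ours F1={ours_f1:.3f}, baseline F1={base_f1:.3f}, gain={gain:.3f}")
```

Under this reading the F1 gain comes out at roughly six percentage points, which agrees with the figure stated in the abstract.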

Feature-Driven Summarization of Customer Reviews
Zhang Yanxing, Zhang Ming, Deng Zhihong. Feature-Driven Summarization of Customer Reviews [J]. Journal of Computer Research and Development, 2009, 46(Z2).
Authors: Zhang Yanxing  Zhang Ming  Deng Zhihong
Abstract: E-commerce websites allow customers to post reviews of products freely. These reviews contain a wealth of information and are often consulted by other customers when they compare products and decide which one to buy. Manufacturers can also use them as market feedback on their products. However, with the boom in e-commerce, a single product often has hundreds of reviews, so reading all of them is too time-consuming. This paper puts forward a feature-driven approach to summarizing customer reviews automatically. It first mines product features from the reviews, then classifies the reviews at sentence granularity, and finally generates a summary by sentence extraction. Experimental results show that the feature mining and pruning algorithm achieves a precision of 81% and a recall of 52%. Compared with Hu & Liu's work, on our datasets the algorithm is 24% higher in precision and 6% lower in recall, and its F1 is 6% higher. Because the algorithm favors precision of feature identification over recall, the overall summarization quality is satisfactory.
Keywords: summarization  feature-driven  customer review  frequent itemset mining  classification  sentence extraction
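
The three steps named in the abstract (mine features, classify sentences by feature, extract sentences) can be illustrated with a toy sketch. This is not the authors' algorithm: feature mining is replaced here by a plain sentence-frequency threshold, the stop list stands in for proper part-of-speech filtering, and sentence extraction simply picks the shortest sentence per feature. All function names, the stop list, and the sample reviews are hypothetical.

```python
import re
from collections import Counter, defaultdict

# Hypothetical stop list; a real system would filter by part of speech instead.
STOPWORDS = {"the", "a", "an", "is", "are", "was", "and", "but", "it", "i",
             "this", "that", "very", "too", "of", "to", "in", "for", "my",
             "so", "not", "with", "on"}

def split_sentences(review):
    """Split a review into sentences on ., ! and ?."""
    return [s.strip() for s in re.split(r"[.!?]+", review) if s.strip()]

def tokens(sentence):
    """Lowercase alphabetic words with stop words removed."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower())
            if w not in STOPWORDS]

def mine_features(sentences, min_support):
    """Step 1 (simplified): keep words found in at least min_support sentences."""
    counts = Counter()
    for s in sentences:
        counts.update(set(tokens(s)))
    return {w for w, c in counts.items() if c >= min_support}

def classify(sentences, features):
    """Step 2: group sentences under every feature they mention."""
    groups = defaultdict(list)
    for s in sentences:
        for f in features & set(tokens(s)):
            groups[f].append(s)
    return groups

def summarize(reviews, min_support=2, per_feature=1):
    """Step 3: extract the shortest sentences for each feature."""
    sentences = [s for r in reviews for s in split_sentences(r)]
    features = mine_features(sentences, min_support)
    groups = classify(sentences, features)
    return {f: sorted(groups[f], key=len)[:per_feature] for f in sorted(groups)}

reviews = [
    "The battery lasts two days. The screen is sharp.",
    "Great screen but the battery drains fast.",
    "Battery is fine. Shipping was slow.",
]
summary = summarize(reviews)
for feature, picked in summary.items():
    print(feature, "->", picked)
```

On these sample reviews only "battery" and "screen" pass the support threshold, so the summary contains one extracted sentence for each of them. Raising `min_support` trades recall for precision, which is the same trade-off the abstract reports.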
This article is indexed in Wanfang Data and other databases.