WSAM: An Internet Text UGC Subjective Attitude Mining System
Cite this article: Fei Zhongchao, Zhu Kunpeng, Wei Fang. WSAM: An Internet Text UGC Subjective Attitude Mining System[J]. Computer Applications and Software, 2012, 29(5): 90-94.
Authors: Fei Zhongchao (费仲超), Zhu Kunpeng (朱鲲鹏), Wei Fang (魏芳)
Affiliation: Alcatel-Lucent Shanghai Bell Co., Ltd., Shanghai 201206, China; School of Computer Science, Fudan University, Shanghai 200433, China
Abstract: The subjective opinions that users express in Internet user generated content (UGC) are valuable for analysing user behaviour and user needs. This paper presents WSAM, a system based on natural language understanding that analyses subjective opinions in Internet UGC text and extracts both the object an opinion is about and its subjective component. The paper analyses the UGC phenomenon and the reasons behind it, and identifies four main types of subjective opinion in text UGC. Opinion mining is reduced to two tasks: recognising the object of the opinion within a sentence, and determining the subjective component. The algorithm combines lexical and structural features and uses a maximum entropy classifier to mine the opinions. Experiments indicate that the algorithm used by WSAM performs well, and that the system can be extended flexibly to related applications such as sentiment analysis (opinion mining), where it also achieves good results.

Keywords: User generated content; UGC; Natural language processing; Opinion mining (sentiment analysis)
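
The abstract names the technique (a maximum entropy classifier over lexical and structural features) but gives no feature list, label set, or training details. The following is therefore only a minimal sketch of that general idea, not the WSAM implementation. The labels `TARGET`, `SUBJ`, and `OTHER`, the toy English sentences, the `SUBJECTIVE_WORDS` lexicon, and the features in `features` are all assumptions made for illustration.

```python
import math
from collections import defaultdict

# Hypothetical label set: opinion target, subjective component, anything else.
LABELS = ["TARGET", "SUBJ", "OTHER"]

# Hypothetical toy lexicon of subjective words; not taken from the paper.
SUBJECTIVE_WORDS = {"great", "terrible", "love", "slow"}

# Toy training data (assumed for illustration only).
TRAIN_SENTENCES = [
    ("the screen is great".split(), ["OTHER", "TARGET", "OTHER", "SUBJ"]),
    ("the battery is terrible".split(), ["OTHER", "TARGET", "OTHER", "SUBJ"]),
    ("i love the camera".split(), ["OTHER", "SUBJ", "OTHER", "TARGET"]),
    ("the network is slow".split(), ["OTHER", "TARGET", "OTHER", "SUBJ"]),
]


def features(tokens, i):
    """Return illustrative lexical and structural features for token i."""
    word = tokens[i]
    prev_word = tokens[i - 1] if i > 0 else "<s>"
    return [
        "word=" + word,                                  # lexical
        "in_lexicon=" + str(word in SUBJECTIVE_WORDS),   # lexical
        "prev=" + prev_word,                             # structural (context)
        "bias",
    ]


def predict_proba(weights, labels, feats):
    """Softmax over summed feature weights: the maximum entropy model form."""
    scores = [sum(weights[(f, y)] for f in feats) for y in labels]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def train(examples, labels, epochs=500, lr=0.5):
    """Fit weights by batch gradient ascent on the conditional log-likelihood."""
    weights = defaultdict(float)
    for _ in range(epochs):
        grad = defaultdict(float)
        for feats, gold in examples:
            probs = predict_proba(weights, labels, feats)
            for y, p in zip(labels, probs):
                # Observed feature count minus expected feature count.
                delta = (1.0 if y == gold else 0.0) - p
                for f in feats:
                    grad[(f, y)] += delta
        for key, g in grad.items():
            weights[key] += lr * g / len(examples)
    return weights


def classify(weights, labels, tokens, i):
    """Return the most probable label for token i."""
    probs = predict_proba(weights, labels, features(tokens, i))
    return labels[probs.index(max(probs))]


EXAMPLES = [
    (features(tokens, i), tag)
    for tokens, tags in TRAIN_SENTENCES
    for i, tag in enumerate(tags)
]
weights = train(EXAMPLES, LABELS)

sentence = "i love the screen".split()
predicted = [classify(weights, LABELS, sentence, i) for i in range(len(sentence))]
print(list(zip(sentence, predicted)))
```

Treating both tasks as one per-token classification is a simplification chosen to keep the sketch short; the abstract describes target recognition and subjective component determination as two steps, and a fuller version would likely train them separately on annotated UGC sentences.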
This article is indexed in CNKI, Wanfang Data, and other databases.