
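Quality Evaluation and Prediction for Question and Answer in Chinese Community Question Answering
Cite this article: LI Chen, CHAO Wen-han, CHEN Xiao-ming, LI Zhou-jun. Quality Evaluation and Prediction for Question and Answer in Chinese Community Question Answering[J]. Computer Science, 2011, 38(6): 230-236.
Authors: LI Chen  CHAO Wen-han  CHEN Xiao-ming  LI Zhou-jun
Affiliation: School of Computer Science, Beihang University, Beijing 100191, China
Funding: This work was supported by the National Natural Science Foundation of China (90718017) and the Specialized Research Fund for the Doctoral Program of Higher Education of the Ministry of Education (20070006055).
Abstract: Knowledge-sharing websites offer new research opportunities for automatic question answering systems. However, the questions and answers contributed by users vary widely in quality: alongside useful information, they may contain irrelevant or even malicious content. Identifying and filtering such content, and selecting high-quality question-answer pairs, helps a community-based automatic question answering system reuse the answers to related questions and thereby improve its quality of service. We first crawled a large number of questions and answers from a Chinese community question answering site and used social network methods to compile statistics on, and analyze, the interaction patterns and characteristics of askers and answerers. Then, following a given set of criteria for judging question-answer quality, we manually annotated more than 3,000 questions and their answers. By extracting two feature sets, textual and non-textual, we used machine learning algorithms to design and implement a feature-based question-answer quality classifier. Experimental results show that both its precision and its recall exceed 70%. Finally, we analyze the main factors that affect question-answer quality in community networks.

Keywords: community question answering, social networks, machine learning, question and answer quality evaluation and prediction, human annotation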

Quality Evaluation and Prediction for Question and Answer in Chinese Community Question Answering
LI Chen, CHAO Wen-han, CHEN Xiao-ming, LI Zhou-jun. Quality Evaluation and Prediction for Question and Answer in Chinese Community Question Answering[J]. Computer Science, 2011, 38(6): 230-236.
Authors: LI Chen  CHAO Wen-han  CHEN Xiao-ming  LI Zhou-jun
Affiliation: (School of Computer Science, Beihang University, Beijing 100191, China)
Abstract: The rise of knowledge-sharing platforms on the Internet in China provides a new approach to automatic question answering. However, the quality of user-generated content in such social networks varies significantly, from useless information to malicious spam. Identifying and filtering such content is particularly important for improving users' experience and the performance of question answering systems. We first extracted a set of question-and-answer content from a Chinese community question answering site, investigated a series of statistical characteristics of the interaction among participants, and then manually annotated the quality of a subset of these questions and answers. By combining text features with non-text features provided by the community, both extracted from those questions and answers, we established a content quality classification model for evaluation and prediction. We find that this model is able to distinguish high-quality content from the rest with considerable accuracy.
Keywords:Community question answering  Social networks  Machine learning  Question and answer quality evaluation and prediction  Human annotation
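
The pipeline summarized in the abstract (extract textual and non-textual features from each question-answer pair, classify quality, then report precision and recall) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the `QAPair` fields, the four feature names, and the threshold rule in `toy_predict` are hypothetical and are not taken from the paper, whose abstract names neither its exact features nor its learning algorithm. The rule only stands in for a trained classifier so that the evaluation step has something to score.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class QAPair:
    """One question-answer pair; all fields are hypothetical placeholders."""
    question: str
    answer: str
    answerer_best_answers: int  # assumed non-text feature: answerer's past best answers
    votes: int                  # assumed non-text feature: community votes on the answer


def extract_features(pair: QAPair) -> Dict[str, float]:
    """Build a small feature dict mixing textual and non-textual signals."""
    question_chars = set(pair.question)
    # Character overlap is a crude relevance proxy that needs no word segmentation.
    if question_chars:
        overlap = len(question_chars & set(pair.answer)) / len(question_chars)
    else:
        overlap = 0.0
    return {
        "answer_length": float(len(pair.answer)),
        "char_overlap": overlap,
        "answerer_best_answers": float(pair.answerer_best_answers),
        "votes": float(pair.votes),
    }


def toy_predict(features: Dict[str, float]) -> int:
    """Stand-in for a learned classifier: 1 = high quality, 0 = otherwise."""
    return int(features["answer_length"] >= 10 and features["votes"] > 0)


def precision_recall(y_true: List[int], y_pred: List[int]) -> Tuple[float, float]:
    """Precision and recall for the positive (high-quality) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


if __name__ == "__main__":
    # Tiny made-up sample with made-up human labels, purely to exercise the code.
    pairs = [
        QAPair("How do I reset my router?", "Hold the reset button for ten seconds.", 5, 3),
        QAPair("How do I reset my router?", "dunno", 0, 0),
    ]
    labels = [1, 0]
    predictions = [toy_predict(extract_features(p)) for p in pairs]
    print(precision_recall(labels, predictions))
```

In a real setting the threshold rule would be replaced by a model trained on the manually annotated pairs, and precision and recall would be measured on held-out data rather than on the examples used to tune it.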
This article is indexed in Wanfang Data and other databases.