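Short question classification based on semantic extensions
Cite this article: YE Zhonglin, YANG Yan, JIA Zhen, YIN Hongfeng. Short question classification based on semantic extensions[J]. Journal of Computer Applications, 2015, 35(3): 792-796.
Authors: YE Zhonglin, YANG Yan, JIA Zhen, YIN Hongfeng
Affiliations: 1. School of Information Science and Technology, Southwest Jiaotong University, Chengdu 610031, China; 2. DOCOMO Innovations, Inc., Palo Alto, CA 94304, USA
Funding: National Natural Science Foundation of China (61170111, 61262058)
Abstract: Question classification is one of the tasks of a question answering system. In voice interaction in particular, user questions are short and colloquial, so traditional text classification methods classify them poorly. To address this, a short question classification method based on semantic extension is proposed: it first uses a search engine to expand the question with additional knowledge, then uses a topic model to select feature words, and finally obtains the question's category by computing word similarity. Experimental results show that the proposed method reaches an average F-measure of 0.713 on a set of 1365 real questions, higher than the support vector machine (SVM), K-nearest neighbor (KNN), and maximum entropy methods. The method can therefore help a question answering system improve its question classification accuracy.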
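
The three steps named in the abstract (search-engine expansion, feature-word selection, similarity-based category assignment) can be sketched roughly as follows. This is only an illustration of the data flow, not the authors' implementation: `FAKE_SEARCH_RESULTS`, `CATEGORY_SEEDS`, `STOPWORDS`, and all function names are hypothetical, the search engine is simulated by a dictionary lookup, plain term frequency stands in for the topic model, and exact word matching stands in for a real word-similarity measure.

```python
from collections import Counter

# Hypothetical stand-in for search-engine output: question -> returned snippets.
FAKE_SEARCH_RESULTS = {
    "weather tomorrow": [
        "weather forecast for tomorrow rain and temperature",
        "tomorrow temperature forecast sunny with light wind",
    ],
    "play jay chou": [
        "listen to jay chou song album online music",
        "jay chou music playlist song lyrics",
    ],
}

# Hypothetical category seed words; the abstract does not list the real categories.
CATEGORY_SEEDS = {
    "weather": {"forecast", "temperature", "rain", "wind"},
    "music": {"song", "album", "music", "lyrics"},
}

STOPWORDS = {"for", "and", "with", "to", "the", "a"}


def expand(question):
    """Step 1: expand the short question with search snippets (simulated lookup)."""
    snippets = FAKE_SEARCH_RESULTS.get(question, [])
    return " ".join([question] + snippets)


def select_features(text, top_k=5):
    """Step 2: pick feature words. Term frequency is a stand-in for the topic model."""
    tokens = [t for t in text.lower().split() if t not in STOPWORDS]
    return [word for word, _ in Counter(tokens).most_common(top_k)]


def word_similarity(a, b):
    """Step 3 helper: exact match, a placeholder for a real word-similarity measure."""
    return 1.0 if a == b else 0.0


def classify(question):
    """Score each category by summing every feature word's best similarity to its seeds."""
    features = select_features(expand(question))
    scores = {
        category: sum(max(word_similarity(f, s) for s in seeds) for f in features)
        for category, seeds in CATEGORY_SEEDS.items()
    }
    return max(scores, key=scores.get)


print(classify("weather tomorrow"))
print(classify("play jay chou"))
```

The point of the sketch is that a two- or three-word question gains enough vocabulary from the expansion step for the later steps to have something to work with; each stand-in would be replaced by the real component (a search API, a trained topic model, a lexical similarity resource) in practice.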

Keywords: topic model; question classification; search engine; question answering system
Received: 2014-10-16
Revised: 2014-11-18

Short question classification based on semantic extensions
YE Zhonglin, YANG Yan, JIA Zhen, YIN Hongfeng. Short question classification based on semantic extensions[J]. Journal of Computer Applications, 2015, 35(3): 792-796.
Authors: YE Zhonglin, YANG Yan, JIA Zhen, YIN Hongfeng
Affiliation: 1. School of Information Science and Technology, Southwest Jiaotong University, Chengdu, Sichuan 610031, China;
2. DOCOMO Innovations, Inc., Palo Alto, CA 94304, USA
Abstract: Question classification is one of the tasks of a question answering system. Because questions often contain few words and colloquial expressions, especially in voice interaction, traditional text classification methods perform poorly on short questions. A short question classification method based on semantic extension is therefore proposed. The method uses a search engine to expand each short question with additional knowledge, selects feature words with a topic model, and then determines the question's category by computing word similarity. Experimental results show that the proposed method achieves an average F-measure of 0.713 on a set of 1365 real questions, which is higher than that of the Support Vector Machine (SVM), K-Nearest Neighbor (KNN), and maximum entropy methods. The method can therefore improve the accuracy of question classification in question answering systems.
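
For readers unfamiliar with the reported metric, the following minimal sketch shows one common way an average F-measure is computed: the per-category F-measure F = 2PR / (P + R), with precision P and recall R, averaged over categories with equal weight. The abstract does not say which averaging scheme the authors used, so the macro average here is an assumption, and the function names and toy labels are hypothetical.

```python
def f_measure(gold, predicted, label):
    """F-measure for one category: harmonic mean of precision and recall."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def average_f_measure(gold, predicted):
    """Macro average (assumed): unweighted mean of per-category F-measures."""
    labels = sorted(set(gold))
    return sum(f_measure(gold, predicted, label) for label in labels) / len(labels)


# Toy data, not taken from the paper.
gold = ["a", "a", "b", "b"]
predicted = ["a", "b", "b", "b"]
print(round(average_f_measure(gold, predicted), 3))
```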
Keywords:topic model  question classification  search engine  question answering system
This article is indexed in CNKI, Wanfang Data, and other databases.