
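Research on a Similar Question Identification Method Based on Semantic Space Distance
Cite this article: SU Yulan, CHEN Xin, HONG Yu, ZHU Mengmeng, ZHANG Min. Research on a Similar Question Identification Method Based on Semantic Space Distance[J]. Journal of Chinese Information Processing, 2021, 35(12): 36-46.
Authors: SU Yulan  CHEN Xin  HONG Yu  ZHU Mengmeng  ZHANG Min
Affiliation: School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006, China
Funding: National Natural Science Foundation of China (61672368); National Key Research and Development Program of China (2017YFB1002104); Postgraduate Research & Practice Innovation Program of Jiangsu Province (SJCX19_0802)
Abstract: State-of-the-art research has cast similar question identification as binary question matching and made considerable progress. In practical question answering systems, however, the volume of data is large, and these methods are constrained by the binary matching paradigm, which makes them slow. To address this problem, and inspired by research on face recognition, this paper proposes a similar question identification method based on semantic space distance measurement (Semantic Space Distance Method, SSDM). The method trains similar question identification as a multi-class classification problem and obtains a semantic encoding model by using the Margin Softmax loss function from face recognition. The semantic encoding model clusters similar questions together in the semantic space and keeps dissimilar questions far apart. SSDM converts similar question identification into a vector distance calculation in the semantic space, which moves beyond binary question matching, retains reasonably high efficiency, and still identifies similar questions at a deep semantic level. Evaluated on the ASQD dataset from Biendata, the method outperforms the baseline methods, which confirms the effectiveness of SSDM.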

Keywords: similar question identification  semantic encoding  multi-class classification
Received: 2019-03-22
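
The abstract names a Margin Softmax loss but does not say which variant is used. As a minimal sketch only, the block below shows one common member of that family, the additive-margin form, for a single training example. The function names, the `scale` and `margin` values, and the toy vectors are all assumptions made for illustration and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def margin_softmax_loss(embedding, class_weights, target, scale=30.0, margin=0.35):
    """Additive-margin softmax loss for one example (illustrative variant).

    embedding:     sentence vector produced by some encoder (hypothetical here)
    class_weights: one weight vector per question class
    target:        index of the class the question belongs to
    scale, margin: assumed hyperparameters, not values from the paper
    """
    logits = []
    for k, w in enumerate(class_weights):
        cos = cosine(embedding, w)
        if k == target:
            # Penalise the true class so the encoder must beat the
            # other classes by at least `margin` in cosine terms.
            cos -= margin
        logits.append(scale * cos)
    # Numerically stable log-sum-exp, then cross-entropy on the target.
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[target]


# Toy example: an embedding equally close to both class centres.
weights = [[1.0, 0.0], [0.0, 1.0]]
loss_plain = margin_softmax_loss([1.0, 1.0], weights, target=0, margin=0.0)
loss_margin = margin_softmax_loss([1.0, 1.0], weights, target=0, margin=0.35)
```

With `margin=0.0` this reduces to an ordinary softmax over scaled cosines. A positive margin raises the loss for the same embedding, which is what pushes questions of the same class closer together in the vector space during training.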

A Semantic Distance Measure for Similar Question Identification
SU Yulan, CHEN Xin, HONG Yu, ZHU Mengmeng, ZHANG Min. A Semantic Distance Measure for Similar Question Identification[J]. Journal of Chinese Information Processing, 2021, 35(12): 36-46.
Authors:SU Yulan  CHEN Xin  HONG Yu  ZHU Mengmeng  ZHANG Min
Affiliation:School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006, China
Abstract:To address the low efficiency imposed by binary classification in question answering systems, this paper proposes a similar question identification method based on semantic space distance measure (SSDM), inspired by research on face recognition. The method trains a semantic encoder by treating similar question identification as a multi-class classification problem, using the Margin Softmax loss introduced by the face recognition community. The semantic encoder draws similar questions together in the semantic space and pushes dissimilar questions far apart. SSDM thus turns similar question identification into a vector distance calculation in the semantic space, which moves beyond binary question matching and retains reasonably high efficiency. We evaluate SSDM on the ASQD dataset from Biendata, and the experimental results show that it outperforms the baseline method.
Keywords:similar question identification  semantic encoder  multi-classification  
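
The identification step described in the abstract, comparing vectors in the semantic space instead of running a pairwise matching model, can be sketched as a nearest-neighbour lookup. This is an illustration under stated assumptions: the vectors below are hand-written stand-ins for encoder output, and `most_similar`, `INDEX`, and the `threshold` value are hypothetical names and settings, not the paper's.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(query_vec, indexed, threshold=0.8):
    """Return (question, similarity) for the closest indexed question,
    or None when nothing reaches the assumed similarity threshold."""
    best_question, best_vec = max(
        indexed, key=lambda item: cosine_similarity(query_vec, item[1])
    )
    score = cosine_similarity(query_vec, best_vec)
    return (best_question, score) if score >= threshold else None


# Question vectors would be computed once, offline, by the trained encoder.
# These toy 3-dimensional vectors are placeholders for that output.
INDEX = [
    ("How do I reset my password?", [0.9, 0.1, 0.0]),
    ("What are the shipping fees?", [0.0, 0.2, 0.9]),
]

match = most_similar([0.8, 0.2, 0.1], INDEX)
no_match = most_similar([0.0, 1.0, 0.0], INDEX)
```

Because stored questions are encoded ahead of time, a new query needs one encoder pass plus cheap vector comparisons, rather than one model inference per stored question as in pairwise matching. For large collections an approximate nearest-neighbour index would likely replace the linear scan used here.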