Method of computing Chinese sentence similarity based on part-of-speech feature (融合词性特征的中文句子相似度计算方法)


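Cite this article:	Wu Hao, Aishan Wumaier, Kahaerjiang Abiderexiti, Wang Lulu, Tuergen Yibulayin. Method of computing Chinese sentence similarity based on part-of-speech feature [J]. Computer Engineering and Design (计算机工程与设计), 2020, 41(1): 150-155.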

Authors:	Wu Hao (吴浩); Aishan Wumaier (艾山·吾买尔); Kahaerjiang Abiderexiti (卡哈尔江·阿比的热西提); Wang Lulu (王路路); Tuergen Yibulayin (吐尔根·依布拉音)

Affiliations:	College of Information Science and Engineering, Xinjiang University, Urumqi 830046, Xinjiang, China; Xinjiang Laboratory of Multi-Language Information Technology, Xinjiang University, Urumqi 830046, Xinjiang, China

Funding:	National Natural Science Foundation of China; Key Laboratory Open Fund

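Abstract:	Recent sentence similarity methods that use syntactic information such as dependency parsing suffer from the high cost of manual annotation and from the low accuracy of automatic annotation, which degrades performance. To address these problems, two methods are proposed that build on existing sentence similarity algorithms and incorporate part-of-speech (POS) features. On the basis of high-precision automatic POS tagging, the first method uses POS information to adjust how strongly words with different parts of speech influence sentence similarity, and the second uses POS information to select the more important words in a sentence for the computation. In comparative experiments, the first method achieved the highest accuracy on the experimental tasks, while the second offered relatively good accuracy together with faster computation. The experimental results demonstrate the effectiveness of both methods.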
Keywords:	sentence similarity; part of speech; weight; word vector; semantics
Method of computing Chinese sentence similarity based on part-of-speech feature

Affiliation:	College of Information Science and Engineering, Xinjiang University, Urumqi 830046, China; Xinjiang Laboratory of Multi-Language Information Technology, Xinjiang University, Urumqi 830046, China

Abstract:	To address the high cost of manual tagging and the low accuracy of automatic tagging in recent sentence similarity methods that rely on syntactic information such as dependency parsing, two methods are proposed that compute sentence similarity using part-of-speech (POS) features. On the basis of high-precision automatic POS tagging, the first method uses POS information to adjust the influence of words with different parts of speech on sentence similarity, and the second uses POS information to select the key words in a sentence for the calculation. Comparative experiments show that the first method achieves the highest accuracy on the experimental tasks, while the second combines acceptable accuracy with high calculation speed.

Keywords:	sentence similarity; POS; word2vec; word weight; semantic
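
The two ideas in the abstract can be illustrated with a minimal Python sketch. It assumes a simple baseline (a sum of word vectors compared by cosine similarity), which the abstract does not specify, so it should not be read as the authors' algorithm. The toy vectors, the POS tags (`n`, `v`, `u`), the weight table `POS_WEIGHTS`, and the set `KEY_POS` are all hypothetical values chosen for illustration.

```python
import math

# Hypothetical 3-dimensional word vectors; a real system would load
# trained embeddings (for example word2vec) instead.
VECTORS = {
    "猫": [0.9, 0.1, 0.0],
    "喜欢": [0.1, 0.9, 0.2],
    "鱼": [0.7, 0.0, 0.4],
    "的": [0.3, 0.3, 0.3],
}

# Hypothetical weights per POS tag (n = noun, v = verb, u = particle).
# These numbers are illustrative and are not taken from the paper.
POS_WEIGHTS = {"n": 1.0, "v": 0.8, "u": 0.1}
DEFAULT_WEIGHT = 0.5

# Hypothetical set of POS tags treated as "key" words in the second idea.
KEY_POS = {"n", "v"}


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def sentence_vector(tagged_words, weight_of):
    """Weighted sum of word vectors for a list of (word, pos) pairs.

    Words without a vector are skipped. Dividing by the total weight is
    unnecessary here because cosine similarity ignores vector length.
    """
    dim = len(next(iter(VECTORS.values())))
    total = [0.0] * dim
    for word, pos in tagged_words:
        vec = VECTORS.get(word)
        if vec is None:
            continue
        weight = weight_of(pos)
        for i, value in enumerate(vec):
            total[i] += weight * value
    return total


def similarity_weighted(s1, s2):
    """First idea: every word contributes, scaled by its POS weight."""
    def weight_of(pos):
        return POS_WEIGHTS.get(pos, DEFAULT_WEIGHT)
    return cosine(sentence_vector(s1, weight_of), sentence_vector(s2, weight_of))


def similarity_selected(s1, s2):
    """Second idea: only words whose POS is in KEY_POS are used."""
    def weight_of(pos):
        return 1.0 if pos in KEY_POS else 0.0
    return cosine(sentence_vector(s1, weight_of), sentence_vector(s2, weight_of))


# Pre-tagged toy sentences; in practice the tags would come from an
# automatic POS tagger.
sent_a = [("猫", "n"), ("喜欢", "v"), ("鱼", "n")]
sent_b = sent_a + [("的", "u")]

score_weighted = similarity_weighted(sent_a, sent_b)
score_selected = similarity_selected(sent_a, sent_b)
```

In this sketch the selection variant ignores the particle entirely, so the two toy sentences receive the same vector, while the weighting variant still lets the particle shift the result slightly. Skipping words also means fewer vector operations, which is one plausible reason a selection-based approach could run faster.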
This article is indexed in databases including VIP (维普) and Wanfang Data (万方数据).
