Cross Language Query Post-Translation Expansion Based on Matrix-Weighted Association Rules
Cite this article: HUANG Mingxuan, JIANG Caoqing, HE Donglei. Cross Language Query Post-Translation Expansion Based on Matrix-Weighted Association Rules[J]. Pattern Recognition and Artificial Intelligence, 2018, 31(10): 887-898.
Authors: HUANG Mingxuan  JIANG Caoqing  HE Donglei
Affiliation: 1. Guangxi Key Laboratory Cultivation Base of Cross-Border E-commerce Intelligent Information Processing, Guangxi University of Finance and Economics, Nanning 530003
2. School of Information and Statistics, Guangxi University of Finance and Economics, Nanning 530003
Funding: Supported by the National Natural Science Foundation of China (No. 61762006, 61662003, 61262028)
Received: 2018-05-08
Abstract: A method for computing matrix-weighted itemset support is proposed, together with a matrix-weighted association pattern mining algorithm for cross-language query expansion. Building on these, a cross-language query post-translation expansion algorithm based on matrix-weighted association rule mining is presented. The query is first machine-translated and an initial cross-language retrieval is run to obtain the top initially retrieved documents (TIRDs); user relevance judgment on the TIRDs then yields the relevance feedback documents (RFDs). Matrix-weighted frequent itemsets containing original query terms are mined from the RFDs by computing support, and association rules containing original query terms are extracted from the frequent itemsets under a confidence-interest evaluation framework. The consequents or antecedents of the rules are taken as expansion terms, and the confidence and interest of each rule measure the importance of its expansion terms, which completes the cross-language query post-translation expansion. Experiments on the NTCIR-5 CLIR standard test set show that the proposed algorithm effectively improves cross-language query expansion performance and benefits cross-language retrieval of long queries, and that post-translation consequent expansion outperforms antecedent expansion.

Keywords: Matrix-Weighted Association Pattern  Association Rule  Query Expansion  Cross-Language Information Retrieval
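
The abstract does not give the paper's formulas, so the following is only a minimal sketch of how consequent-based expansion under a support and confidence-interest filter might look. Everything in it is an assumption made for illustration: the toy documents and their term weights, the min-based weighted support (a stand-in, not the paper's matrix-weighted support), the use of lift as the interest measure, the threshold values, and the names `DOCS`, `support`, and `consequent_expansion`.

```python
# Hypothetical relevance feedback documents: term -> weight (e.g. tf-idf).
# Terms and weights are invented for illustration only.
DOCS = [
    {"query": 0.5, "expansion": 0.4, "search": 0.1},
    {"query": 0.6, "expansion": 0.3},
    {"query": 0.2, "search": 0.7},
    {"retrieval": 0.9, "search": 0.1},
]


def support(itemset, docs):
    """Stand-in weighted support (NOT the paper's definition).

    For each document, take the smallest weight among the itemset's terms
    (0 if any term is absent), then average over all documents.
    """
    return sum(min(d.get(t, 0.0) for t in itemset) for d in docs) / len(docs)


def consequent_expansion(query_terms, docs, min_sup=0.05, min_conf=0.5, min_int=1.0):
    """Score candidate terms t through rules q -> t, where q is a query term.

    A rule is kept when its support, confidence, and interest all pass the
    (assumed) thresholds; the expansion term is scored by the best confidence.
    """
    query = set(query_terms)
    vocab = {t for d in docs for t in d}
    scores = {}
    for q in query:
        sup_q = support([q], docs)
        if sup_q <= 0:
            continue  # query term never occurs in the feedback documents
        for t in vocab - query:
            sup_qt = support([q, t], docs)
            if sup_qt <= 0 or sup_qt < min_sup:
                continue  # {q, t} is not frequent
            conf = sup_qt / sup_q
            # Lift is used here as an assumed interest measure.
            interest = conf / support([t], docs)
            if conf >= min_conf and interest > min_int:
                scores[t] = max(scores.get(t, 0.0), conf)
    return scores
```

With the toy data above, `consequent_expansion(["query"], DOCS)` keeps only `expansion` (confidence about 0.54): `search` co-occurs with the query term but fails the confidence threshold, and `retrieval` never co-occurs with it. Antecedent-based expansion would swap the roles of `q` and `t` in the rule.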