Similar Chinese Sentence Retrieval Based on Improved Edit-distance
Cite this article: Che Wanxiang, Liu Ting, Qin Bing, Li Sheng. Similar Chinese Sentence Retrieval Based on Improved Edit-distance [J]. High Technology Letters (高技术通讯), 2004, 14(7): 15-19.
Authors: Che Wanxiang, Liu Ting, Qin Bing, Li Sheng
Affiliation: Information Retrieval Laboratory, School of Computer Science, Harbin Institute of Technology, Harbin 150001
Funding: Supported by the 863 Program (2002AA14702011) and the National Natural Science Foundation of China (60203020)
Abstract: Similar Chinese sentence retrieval is widely applicable in Chinese information processing, for example in example-based machine translation (EBMT). This paper proposes a retrieval method for similar Chinese sentences based on an improved edit distance. The method uses information retrieval techniques to keep retrieval efficient and, taking the ordinary edit distance algorithm as its basis, adds the semantic information of words so that the measure better fits the requirements of Chinese sentence similarity computation. Compared with methods that compute sentence similarity purely from a semantic dictionary, the improved edit distance is easier to extend and more accurate. Applied to Chinese sentence retrieval in an English writing assistance system built on large-scale bilingual sentence-pair retrieval, the algorithm achieved 81.33% precision and 95.31% recall.

Keywords: improved edit distance; Chinese; similar sentence; retrieval; English; writing assistance; machine translation
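
The abstract does not give the cost function, so the following is only a minimal sketch of the general idea: a word-level edit distance in which replacing a word with a semantically related word costs less than replacing it with an unrelated one. The synonym table `SYNONYM_GROUPS`, the cost values 0.5 and 1.0, and the length normalization in `similarity` are all assumptions made for illustration, not the authors' settings. Word segmentation and the information-retrieval step that preselects candidate sentences are left out.

```python
from typing import Callable, Sequence

# Hypothetical toy synonym groups. A real system would consult a semantic
# lexicon; the abstract does not say which resource or weights are used.
SYNONYM_GROUPS = [
    {"喜欢", "喜爱"},
    {"电脑", "计算机"},
]


def substitution_cost(a: str, b: str) -> float:
    """Assumed costs: 0 for identical words, 0.5 for synonyms, 1 otherwise."""
    if a == b:
        return 0.0
    for group in SYNONYM_GROUPS:
        if a in group and b in group:
            return 0.5
    return 1.0


def weighted_edit_distance(
    src: Sequence[str],
    tgt: Sequence[str],
    sub_cost: Callable[[str, str], float] = substitution_cost,
    indel_cost: float = 1.0,
) -> float:
    """Word-level edit distance with a semantics-aware substitution cost."""
    rows, cols = len(src) + 1, len(tgt) + 1
    dist = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dist[i][0] = i * indel_cost  # delete every word of src
    for j in range(1, cols):
        dist[0][j] = j * indel_cost  # insert every word of tgt
    for i in range(1, rows):
        for j in range(1, cols):
            dist[i][j] = min(
                dist[i - 1][j] + indel_cost,  # deletion
                dist[i][j - 1] + indel_cost,  # insertion
                dist[i - 1][j - 1] + sub_cost(src[i - 1], tgt[j - 1]),
            )
    return dist[-1][-1]


def similarity(src: Sequence[str], tgt: Sequence[str]) -> float:
    """Map the distance to [0, 1] by dividing by the longer sentence length
    (an assumed normalization)."""
    longest = max(len(src), len(tgt))
    if longest == 0:
        return 1.0
    return 1.0 - weighted_edit_distance(src, tgt) / longest
```

With these assumed costs, two pre-segmented sentences that differ only by synonyms score as closer than they would under plain edit distance, where every substitution costs 1. Because the semantic knowledge sits entirely in `substitution_cost`, a different lexicon can be swapped in without touching the dynamic program.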
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.