
A Word Similarity Calculation Method Based on the Corpus Database

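Cite this article:	ZHANG Zhi-ling, YU Li-qun, CHEN Yi-qiu, LUO Hai-fei, SHAO Xiao-min. A word similarity calculation method based on the Corpus database[J]. Journal of Computer Applications, 2006, 26(3): 638-640.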

Authors:	ZHANG Zhi-ling, YU Li-qun, CHEN Yi-qiu, LUO Hai-fei, SHAO Xiao-min

Affiliation:	School of Software, Shanghai Jiao Tong University, Shanghai 200030, China

Funding:	"Advanced information retrieval technology using the knowledge base" project, Digital Home Appliance Laboratory, Shanghai Jiao Tong University

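Abstract:	A semantic association database, called the Corpus database, was built. It uses a word space and a relation space to store, in structured form, statistical information about words and their contexts, and it trains this data by reading large amounts of corpus text. The training method of the Corpus database is described in detail, and a pruning scheme is proposed for the large number of relations that arise during training. On this basis, a word similarity algorithm is proposed that constructs context relation vectors for words. Experiments show that this is an effective method for computing word similarity.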
Keywords:	word similarity; information retrieval
Article ID:	1001-9081(2006)03-0638-03
Received:	2005-09-30
Revised:	2005-12-21
Measurement of word similarity based on Corpus

ZHANG Zhi-ling, YU Li-qun, CHEN Yi-qiu, LUO Hai-fei, SHAO Xiao-min. Measurement of word similarity based on Corpus[J]. Journal of Computer Applications, 2006, 26(3): 638-640.

Authors:	ZHANG Zhi-ling, YU Li-qun, CHEN Yi-qiu, LUO Hai-fei, SHAO Xiao-min

Affiliation:	Software College, Shanghai Jiao Tong University, Shanghai 200030, China

Abstract:	A semantic relevance database named Corpus was built to store the information required for word similarity measurement. Corpus obtains this information by training on large-scale text and, after analysis and pruning, stores it in a word space and a relation space. A word similarity measurement algorithm that constructs context relation vectors from Corpus is presented, and experiments show it to be a feasible method.

Keywords:	Corpus
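
The abstract outlines a pipeline: collect word-context statistics from text, prune the resulting relations, then compare words through their context relation vectors. The following is a minimal sketch of that idea, not the authors' implementation. The abstract does not give the window size, the pruning rule, or the similarity measure, so the fixed co-occurrence window, the count-threshold pruning, and the use of cosine similarity are all assumptions, as are the function names `train_context_counts`, `prune`, and `cosine_similarity`.

```python
from collections import Counter, defaultdict
from math import sqrt


def train_context_counts(sentences, window=2):
    """Count co-occurrences of each word with neighbours within `window` tokens.

    Assumption: a symmetric fixed-size window stands in for the paper's
    (unspecified) notion of context.
    """
    relations = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    relations[word][tokens[j]] += 1
    return relations


def prune(relations, min_count=2):
    """Drop rare relations. Assumption: a plain count threshold, not the paper's scheme."""
    return {
        word: Counter({ctx: n for ctx, n in ctxs.items() if n >= min_count})
        for word, ctxs in relations.items()
    }


def cosine_similarity(vec_a, vec_b):
    """Cosine of the angle between two sparse context vectors (0.0 if either is empty)."""
    dot = sum(n * vec_b[ctx] for ctx, n in vec_a.items() if ctx in vec_b)
    norm_a = sqrt(sum(n * n for n in vec_a.values()))
    norm_b = sqrt(sum(n * n for n in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy usage on three hand-made, pre-tokenized sentences (illustrative data only).
sentences = [
    ["the", "cat", "drinks", "milk"],
    ["the", "dog", "drinks", "water"],
    ["the", "cat", "chases", "the", "dog"],
]
relations = train_context_counts(sentences, window=1)
similarity = cosine_similarity(relations["cat"], relations["dog"])
pruned = prune(relations, min_count=2)
```

In this toy data "cat" and "dog" share the contexts "the" and "drinks", so their similarity is high. With real Chinese text a word segmenter would be needed before counting, which is outside the scope of this sketch.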
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.