Calculation of Chinese Words Semantic Similarity Using Network Search Engines


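Cite this article:	GAO Guo-qiang, HUANG Lü-wei, CHEN Feng-yu. Calculation of Chinese Words Semantic Similarity Using Network Search Engines[J]. Computer Technology and Development, 2014, 0(7): 84-87

Authors:	GAO Guo-qiang, HUANG Lü-wei, CHEN Feng-yu

Affiliation:	School of Media and Communication, Wuhan Textile University, Wuhan, Hubei 430073, China

Funding:	Natural Science Foundation of Hubei Province (2013CFB310); Hubei Education Research Project (B2013205); 2013 Provincial Innovation and Entrepreneurship Training Program for College Students, Hubei Higher Education Institutions (2013CXZD027); 2013 Wuhan Textile University Innovation and Entrepreneurship Training Program for College Students (2013CXXL008, 2013CXXL009)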

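Abstract:	Computing the semantic similarity of Chinese words is a key problem in Chinese information processing. This paper uses information provided by web search engines to compute the semantic similarity of Chinese word pairs. First, a program queries a search engine to obtain the number of search results for Chinese words, and the similarity model WebPMI is implemented on that basis. Next, the paper describes CODC, a model that analyzes semantic relatedness from the text snippets returned by queries. Finally, the two models are combined and pseudocode for the proposed algorithm is given. Experimental results show that the algorithm makes good use of information from the Internet and provides a relatively new method for computing the semantic similarity of Chinese words, with results close to those of traditional algorithms that compute similarity from dictionary information.
Keywords:	similarity; search engines; lexicon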
Calculation of Chinese Words Semantic Similarity Using Network Search Engines

GAO Guo-qiang, HUANG Lü-wei, CHEN Feng-yu. Calculation of Chinese Words Semantic Similarity Using Network Search Engines[J]. Computer Technology and Development, 2014, 0(7): 84-87

Authors:	GAO Guo-qiang, HUANG Lü-wei, CHEN Feng-yu

Affiliation:	School of Media and Communication, Wuhan Textile University, Wuhan 430073, China

Abstract:	Computing the similarity of Chinese words is a key problem in Chinese information processing. This paper measures the semantic similarity between Chinese words using the information returned by web search engines. First, a model named WebPMI is implemented, which computes similarity from page counts. Then another model, named CODC, is described, which analyzes semantic similarity using text snippets. Finally, an algorithm based on the two models is presented. Experimental results show that this algorithm outperforms all the existing web-based semantic similarity measures for Chinese and comes close to the traditional lexicon-based semantic similarity measures.

Keywords:	similarity; search engines; lexicon
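
The abstract names WebPMI as a page-count-based measure but does not give its formula on this page. The following is a minimal sketch assuming the commonly cited pointwise-mutual-information form over search hit counts; it may differ from the variant the authors implemented. The function name `web_pmi`, the index size `n_pages`, and the noise cutoff `threshold` are all hypothetical, and the hit counts are passed in by the caller rather than fetched from any real search engine API.

```python
import math


def web_pmi(hits_p, hits_q, hits_pq, n_pages=1e10, threshold=5):
    """PMI-style similarity of two words from search hit counts.

    hits_p, hits_q: result counts for each word queried alone.
    hits_pq: result count for the two words queried together.
    n_pages: assumed number of pages indexed by the engine (a guess).
    threshold: assumed cutoff; rarer co-occurrences are treated as noise.
    """
    if hits_p <= 0 or hits_q <= 0 or hits_pq <= threshold:
        return 0.0
    p_joint = hits_pq / n_pages
    p_independent = (hits_p / n_pages) * (hits_q / n_pages)
    # A positive score means the words co-occur more often than chance.
    return math.log2(p_joint / p_independent)


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    print(web_pmi(1024, 2048, 512, n_pages=2 ** 20))
```

Because the score depends on the assumed `n_pages`, values are comparable only across word pairs scored against the same engine with the same setting.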
This article is indexed in VIP (维普) and other databases.
