
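Research on Web Page Ranking Based on PageRank and Anchor Text
Cite this article: LIU Jing-jing, LIN Hong-fei, ZHAO Jing. Research on Web Page Ranking Based on PageRank and Anchor Text[J]. Computer Engineering and Applications, 2007, 43(10): 170-173
Authors: LIU Jing-jing  LIN Hong-fei  ZHAO Jing
Affiliation: Department of Computer Science and Engineering, Dalian University of Technology, Dalian 116024 (all three authors)
Funding: National Natural Science Foundation of China; National High Technology Research and Development Program of China (863 Program)
Abstract: The structural differences between Web pages and plain text mean that traditional IR ranking techniques cannot keep up with the growth of the Web. To rank retrieval results sensibly, a link analysis method based on the principles of citation analysis is introduced. The method gives a higher rating to pages that are linked to by many other pages, and it also takes into account the similarity between anchor text and the query terms. Source pages vary in quality, and the anchor texts pointing to the same page also differ in quality, but anchor text from a high-quality source page is not necessarily more accurate than anchor text from a low-quality one. Anchor texts with high similarity are therefore corrected: the similarity between the query terms and each anchor text is computed, anchor text that has high similarity but comes from a source page with a low PageRank value is compensated, and the query results are re-ranked.

Keywords: link analysis  anchor text  PageRank  Web page ranking
Article ID: 1002-8331(2007)10-0170-04
Revised: 2006-11-01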

Study on ranking Web pages based on PageRank and anchor text
LIU Jing-jing, LIN Hong-fei, ZHAO Jing. Study on ranking Web pages based on PageRank and anchor text[J]. Computer Engineering and Applications, 2007, 43(10): 170-173
Authors: LIU Jing-jing  LIN Hong-fei  ZHAO Jing
Affiliation: Department of Computer Science and Engineering, Dalian University of Technology, Dalian 116024, China
Abstract: Structural differences between Web pages and plain text mean that traditional IR ranking methods cannot keep pace with the development of the Web. To rank results properly, link analysis, which is based on academic citation analysis, is introduced into result ranking. It gives a high evaluation to pages that have many back links. At the same time, the similarity between anchor text and the query is taken into account. The quality of source pages varies widely, and anchor texts from different source pages also differ in quality, but anchor text from a low-quality source page may be more accurate than anchor text from a high-quality one. This paper adjusts the weight of anchor text that has high similarity: it computes the similarity between each anchor text and the query, compensates high-similarity anchor text that comes from source pages with low PageRank, and then uses the new values to re-rank the Web pages.
Keywords: link analysis  anchor text  PageRank  Web page ranking
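
The abstract describes the approach only in outline and does not give the scoring formula, so the following Python sketch is an illustration under stated assumptions, not the authors' implementation. It computes plain PageRank by power iteration, measures query/anchor-text similarity with a bag-of-words cosine, and raises the weight of high-similarity anchor text whose source page has below-average PageRank. The function names, the similarity measure, and the `sim_threshold` and `boost` parameters are all hypothetical choices made for the example.

```python
import math
from collections import Counter


def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict {page: [pages it links to]}."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                for t in pages:
                    new_rank[t] += damping * rank[page] / n
        rank = new_rank
    return rank


def cosine(a, b):
    """Cosine similarity of two whitespace-tokenized strings (assumed measure)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def score_pages(query, links, anchors, sim_threshold=0.5, boost=2.0):
    """Score pages from PageRank plus anchor-text evidence.

    anchors is a list of (source_page, target_page, anchor_text) tuples.
    sim_threshold and boost are hypothetical tuning parameters.
    """
    pr = pagerank(links)
    mean_pr = sum(pr.values()) / len(pr)
    scores = dict(pr)  # start from the link-based score
    for source, target, text in anchors:
        sim = cosine(query, text)
        weight = pr[source]
        if sim >= sim_threshold and weight < mean_pr:
            # Compensation step: lift a low-PageRank source, capped at the mean.
            weight = min(mean_pr, weight * boost)
        scores[target] += sim * weight
    return scores
```

To obtain a ranking, sort the pages by score, for example `sorted(scores, key=scores.get, reverse=True)`. Setting `boost=1.0` disables the compensation, which makes it easy to compare the two orderings on the same link graph.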
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.