
Effectively Retrieving HTML Documents

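Cite this article:	LIU Fang, LU Zheng-ding. Effectively retrieving HTML documents [J]. Mini-Micro Systems (小型微型计算机系统), 2000, 21(9): 986-988.

Authors:	LIU Fang, LU Zheng-ding

Affiliation:	Department of Applications, School of Computer Science, Huazhong University of Science and Technology, Wuhan 430074

Funding:	Supported by the National Defense Pre-Research Fund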

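Abstract:	Most resources on the WWW are stored as HTML documents. Unlike ordinary documents, HTML documents have a certain structure because of their tags. We adopt a retrieval method that extends traditional retrieval and uses the structure of HTML documents to improve retrieval efficiency in the WWW environment. This paper introduces the structure of HTML and traditional information retrieval based on the vector space model, and proposes using a clustering method to group tags. Finally, it discusses in detail how to use document structure to extend the weighting scheme so that index terms describe a document more accurately, thereby improving retrieval precision.
Keywords:	information retrieval; vector space model; clustering; HTML documents; WWW
Revision received:	1999-09-15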
EFFECTIVELY RETRIEVE HTML DOCUMENTS

LIU Fang,LU Zheng-ding.EFFECTIVELY RETRIEVE HTML DOCUMENTS[J].Mini-micro Systems,2000,21(9):986-988.

Authors:	LIU Fang, LU Zheng-ding

Abstract:	Information resources on the WWW are mostly stored as HTML. Unlike ordinary documents, HTML documents are structured. In this paper, we propose a method that uses this structure to retrieve HTML documents effectively. The method is derived from traditional information retrieval. First, we describe the structure of HTML and traditional IR based on the vector space model. Then we propose our extended weighting scheme and tag classes. Finally, we give our conclusions and future work.

Keywords:	WWW; HTML; information retrieval; vector space model; clustering
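
The abstract describes extending vector space weighting with HTML tag classes but gives no formulas or values. The following is a minimal sketch of that general idea, not the authors' scheme: each term occurrence is scaled by a weight attached to its enclosing tag, and documents are compared by cosine similarity. The tag classes, the weights in `TAG_CLASS_WEIGHTS`, and all function names are assumptions made for illustration.

```python
import math
from collections import Counter
from html.parser import HTMLParser

# Hypothetical tag classes and weights; the abstract does not state the real ones.
TAG_CLASS_WEIGHTS = {
    "title": 4.0,
    "h1": 3.0, "h2": 3.0, "h3": 3.0,
    "b": 2.0, "strong": 2.0, "em": 2.0,
}
DEFAULT_WEIGHT = 1.0


class WeightedTermParser(HTMLParser):
    """Accumulate term weights, scaling each occurrence by its enclosing tags."""

    def __init__(self):
        super().__init__()
        self.open_tags = []            # stack of currently open tags
        self.term_weights = Counter()  # term -> accumulated weight

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the most recent matching open tag, tolerating sloppy HTML.
        if tag in self.open_tags:
            while self.open_tags.pop() != tag:
                pass

    def handle_data(self, data):
        # Use the heaviest weight among all enclosing tags.
        weight = max(
            (TAG_CLASS_WEIGHTS.get(t, DEFAULT_WEIGHT) for t in self.open_tags),
            default=DEFAULT_WEIGHT,
        )
        for term in data.lower().split():
            self.term_weights[term] += weight


def document_vector(html):
    """Return a sparse term -> weight vector for one HTML document."""
    parser = WeightedTermParser()
    parser.feed(html)
    parser.close()
    return dict(parser.term_weights)


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)
```

Under these assumed weights, a query term that appears in a document's `<title>` contributes more to that document's vector than the same term in body text, so such a document scores higher for that query than one containing the term only in a paragraph.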
This article is indexed in databases including CNKI, VIP, and Wanfang Data.
