
Web Page Classification Method based on Effective Information Extraction

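Cite this article:	Wang Lijian, Yin Siqing. Web Page Classification Method based on Effective Information Extraction[J]. Computer Development & Applications (电脑开发与应用), 2010, 23(6): 71-73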

Authors:	Wang Lijian (王立建), Yin Siqing (尹四清)

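Affiliations:	School of Electronics and Computer Science and Technology, North University of China, Taiyuan 030051; School of Software, North University of China, Taiyuan 030051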

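Abstract:	With the rapid development of the Internet, the number of Web pages is growing exponentially, and retrieving and discovering valuable information on the Web has become an important task. Noise often reduces the efficiency of the various algorithms that operate on pages. How to remove noise from a page and extract its main content is therefore an important problem in Web mining. This paper gives a concrete implementation for extracting the text in a Web page that is effective for classification.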
Keywords:	Web; effective information; information extraction; Web page classification
Web Page Classification Method based on Effective Information Extraction

Abstract:	With the rapid development of the Internet, the number of Web pages is growing exponentially, and searching for and discovering valuable information on the Web has become an important task. "Noise" often reduces the efficiency of the various algorithms based on page processing. Therefore, how to remove noise from pages and extract their main content is an important problem in Web mining. This paper presents a concrete implementation for extracting the text in Web pages that is effective for classification.

Keywords:	valid information Web information extraction Web page classification
This article is indexed in VIP, Wanfang Data, and other databases.
