
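Web text search and classification for the chemical engineering domain
Cite this article:YU HaiYing, PAN YunDong, LI Liang, WAN Le. Web text search and classification for the chemical engineering domain[J]. Computers and Applied Chemistry, 2006, 23(3):279-281.
Authors:YU HaiYing  PAN YunDong  LI Liang  WAN Le
Affiliations:1. College of Chemistry and Chemical Engineering, Shaanxi University of Science and Technology, Xianyang, Shaanxi, 712081
2. Department of Computer Science and Engineering, Beijing Institute of Technology, Beijing, 100081
Abstract:With the rapid growth of online information resources, searching and classifying topic-specific Web text has become an increasingly important problem in information processing. This paper presents a Web text search and classification system for the chemical engineering domain. After the crawler subsystem collects Web documents, the system uses a support vector machine (SVM) to perform two-class classification of the pages and identify Chinese-language chemical engineering pages. It then uses the vector space model (VSM) to assign the identified domain pages to multiple subclasses. Compared with general-purpose search engines, the system is faster, returns more accurate results, and is able to learn.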

Keywords:chemical engineering  text search and classification  crawler  support vector machine  vector space model
Article ID:1001-4160(2006)03-279-281
Received:2004-11-30
Revised:2005-04-01

Chemical industry-oriented Web page searching and categorization
YU HaiYing,PAN YunDong,LI Liang,WAN Le.Chemical industry-oriented Web page searching and categorization[J].Computers and Applied Chemistry,2006,23(3):279-281.
Authors:YU HaiYing  PAN YunDong  LI Liang  WAN Le
Affiliation:1. Shaanxi University of Science and Technology, Xianyang, 712081, Shaanxi, China; 2. Department of Computer Science and Engineering, Beijing Institute of Technology, Beijing, 100081, China
Abstract:With the growth of information on the Internet, Web page searching and categorization is becoming increasingly important in the field of information processing. This paper presents a chemical industry-oriented system for Web page searching and categorization. After the crawler subsystem collects Web pages, the system uses an SVM (support vector machine) to perform two-class classification and find Chinese pages about the chemical industry. A VSM (vector space model) is then used to classify these pages into several subclasses. The system is faster and more accurate than general-purpose search engines, and it can also learn from Web pages.
Keywords:chemical industry  Web page searching and categorization  SVM  VSM
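
The two-stage classification described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the features, kernel, or training data, so everything below is an assumption made for illustration. Pages are tokenized on whitespace (real Chinese pages would need word segmentation), stage 1 uses a simplified linear SVM trained by subgradient descent on the hinge loss, stage 2 compares term-frequency vectors to subclass centroids by cosine similarity, and the toy documents, the subclass names `polymers` and `separation`, and all function names are hypothetical.

```python
import math
from collections import Counter


def tf_vector(text):
    """Sparse term-frequency vector (the VSM representation of a page)."""
    return dict(Counter(text.lower().split()))


def dot(u, v):
    """Dot product of two sparse vectors stored as dicts."""
    if len(u) > len(v):
        u, v = v, u
    return sum(w * v.get(t, 0.0) for t, w in u.items())


def cosine(u, v):
    """Cosine similarity; 0.0 if either vector is empty."""
    nu, nv = math.sqrt(dot(u, u)), math.sqrt(dot(v, v))
    return dot(u, v) / (nu * nv) if nu and nv else 0.0


def train_linear_svm(samples, labels, epochs=200, lam=0.01, lr=0.1):
    """Stage 1 (assumed form): linear SVM, L2-regularized hinge loss,
    trained by per-sample subgradient steps. Labels are +1 or -1."""
    w, b = {}, 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (dot(w, x) + b)
            for t in w:                      # shrink weights (L2 penalty)
                w[t] *= 1.0 - lr * lam
            if margin < 1.0:                 # inside the margin: hinge step
                for t, v in x.items():
                    w[t] = w.get(t, 0.0) + lr * y * v
                b += lr * y
    return w, b


def build_centroids(docs_by_class):
    """Stage 2: one summed term-frequency vector per subclass."""
    centroids = {}
    for name, docs in docs_by_class.items():
        total = Counter()
        for d in docs:
            total.update(tf_vector(d))
        centroids[name] = dict(total)
    return centroids


def classify_page(text, svm, centroids):
    """Return a subclass name for chemical pages, or None otherwise."""
    w, b = svm
    x = tf_vector(text)
    if dot(w, x) + b <= 0:                   # stage 1 rejects the page
        return None
    return max(centroids, key=lambda c: cosine(x, centroids[c]))


# Toy training data (invented for this sketch).
chem = ["ethylene polymer reactor catalyst", "distillation column catalyst yield"]
other = ["football match score league", "movie star film review"]
svm = train_linear_svm([tf_vector(d) for d in chem + other], [1, 1, -1, -1])
centroids = build_centroids({
    "polymers": ["ethylene polymer reactor catalyst", "polymer resin extrusion"],
    "separation": ["distillation column catalyst yield", "membrane separation column"],
})

print(classify_page("polymer catalyst reactor", svm, centroids))
print(classify_page("football match league", svm, centroids))
```

Running the binary filter first means the subclass comparison only ever sees pages already judged to be on topic, which matches the order of steps given in the abstract.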
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.