An Intelligent Information System for Organizing Online Text Documents
Authors: Han-joon Kim, Sang-goo Lee
Affiliation: (1) Department of Electrical and Computer Engineering, The University of Seoul, 90 Jeonnong-dong, Dongdaemun-gu, 130-743 Seoul, Korea; (2) School of Computer Science and Engineering, Seoul National University, Seoul, Korea
Abstract: This paper describes an intelligent information system for effectively managing huge amounts of online text documents (such as Web documents) in a hierarchical manner. The system's organizational capabilities evolve semi-automatically with minimal human input. The system starts with an initial taxonomy into which documents are automatically categorized, and then evolves so as to provide a good indexing service as the document collection grows or its usage changes. To this end, we propose a series of algorithms that utilize text-mining technologies such as document clustering, document categorization, and hierarchy reorganization. In particular, clustering and categorization algorithms have been intensively studied in order to provide evolving facilities for hierarchical structures and categorization criteria. Through experiments using the Reuters-21578 document collection, we evaluate the performance of the proposed clustering and categorization methods by comparing it with that of well-known conventional methods.
Keywords: Document categorization; Document clustering; Fuzzy relations; Hierarchical agglomerative clustering; Information systems; Naïve Bayes; Topic hierarchy
This article is indexed in SpringerLink and other databases.