IGTree: Using Trees for Compression and Classification in Lazy Learning Algorithms
Authors: Walter Daelemans, Antal van den Bosch, Ton Weijters
Affiliations: (1) Computational Linguistics, Tilburg University, The Netherlands; (2) MATRIKS, Maastricht University, The Netherlands
Abstract: We describe the IGTree learning algorithm, which compresses an instance base into a tree structure, using information gain as the heuristic function guiding the compression. Compared to other lazy learning approaches, IGTree produces trees that reduce both storage requirements and the time needed to compute classifications. Furthermore, on two complex linguistic tasks, letter–phoneme transliteration and part-of-speech tagging, IGTree obtained generalization accuracy similar to or better than alternative lazy learning and decision tree approaches (viz. IB1, information-gain-weighted IB1, and C4.5). A third experiment, on word hyphenation, demonstrates that when the mutual differences in the information gain of features are too small, both IGTree and information-gain-weighted IB1 perform worse than IB1. These results indicate that IGTree is a useful algorithm for problems characterized by a large number of training instances described by symbolic features with sufficiently differing information gain values.
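To make the tree-building idea concrete, here is a minimal Python sketch of an IGTree-style build/classify pair, reconstructed only from the description in the abstract; it is not the authors' implementation, and all names (Node, build_igtree, classify) are hypothetical. Features are ordered by decreasing information gain, each node stores the most frequent (default) class of the instances it covers, and leaves that merely repeat the parent's default are pruned away, which is where the compression comes from.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(instances, labels, f):
    """IG of feature index f: H(C) minus the weighted entropy after splitting on f."""
    by_value = defaultdict(list)
    for x, y in zip(instances, labels):
        by_value[x[f]].append(y)
    n = len(labels)
    return entropy(labels) - sum(len(ys) / n * entropy(ys) for ys in by_value.values())

class Node:
    def __init__(self, default):
        self.default = default   # most frequent class among instances at this node
        self.children = {}       # feature value -> child Node

def build_igtree(instances, labels, order):
    """Build an IGTree-style trie: split on features in decreasing IG order,
    stopping when a node is class-homogeneous or no features remain."""
    node = Node(Counter(labels).most_common(1)[0][0])
    if not order or len(set(labels)) == 1:
        return node
    f, rest = order[0], order[1:]
    by_value = defaultdict(lambda: ([], []))
    for x, y in zip(instances, labels):
        by_value[x[f]][0].append(x)
        by_value[x[f]][1].append(y)
    for v, (xs, ys) in by_value.items():
        child = build_igtree(xs, ys, rest)
        # compression step: a leaf that only repeats the parent's default is redundant
        if not child.children and child.default == node.default:
            continue
        node.children[v] = child
    return node

def classify(root, x, order):
    """Follow matching feature values; back off to the last default on a mismatch."""
    node = root
    for f in order:
        child = node.children.get(x[f])
        if child is None:
            break
        node = child
    return node.default

# toy usage with hypothetical data: instances are tuples of symbolic feature values
X = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y")]
y = ["P", "P", "Q", "P"]
order = sorted(range(2), key=lambda f: -information_gain(X, y, f))
tree = build_igtree(X, y, order)
print(classify(tree, ("b", "x"), order))  # matches the stored path -> "Q"
print(classify(tree, ("c", "x"), order))  # unseen value -> root default "P"
```

Note that the node defaults do double duty: they implement the pruning that compresses the instance base, and they serve as back-off predictions whenever a test instance takes a path the tree never stored.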
Keywords: lazy learning, eager learning, decision trees, information gain, data compression, instance base indexing
This article is indexed in SpringerLink and other databases.