ESTIMATING DBLEARN'S POTENTIAL FOR KNOWLEDGE DISCOVERY IN DATABASES
Authors: Howard J. Hamilton, David R. Fudger
Affiliation: Department of Computer Science, University of Regina, Regina, Saskatchewan, Canada S4S 0A2
Abstract: We propose a procedure for estimating DBLEARN's potential for knowledge discovery, given a relational database and a set of concept hierarchies. The procedure is most useful for comparing alternative concept hierarchies for the same database. The DBLEARN knowledge discovery program uses an attribute-oriented inductive-inference method to discover potentially significant high-level relationships in a database. A concept forest, containing at most one concept hierarchy per attribute, defines the generalizations that DBLEARN can make for a database. The potential for discovery in a database is estimated by examining the complexity of the corresponding concept forest. Two heuristic measures are defined based on the number, depth, and height of the forest's interior nodes. Higher values of these measures indicate more complex concept forests and, arguably, more potential for discovery. Experimental results using a variety of concept forests and four commercial databases show that, in practice, both measures permit quite reliable decisions to be made; thus, the simpler of the two may be the most appropriate.
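The abstract names the raw quantities behind the two measures (number, depth, and height of interior nodes) but does not give the formulas themselves. The following is a minimal Python sketch that only collects those quantities for a concept forest; it does not reproduce either measure. The `Concept` class, the function names, and the sample hierarchy are all hypothetical, introduced here for illustration, and are not taken from DBLEARN.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Concept:
    """One node of a concept hierarchy (hypothetical type); leaves are raw attribute values."""
    name: str
    children: list[Concept] = field(default_factory=list)


def height(node: Concept) -> int:
    """Longest downward path from node to a leaf; a leaf has height 0."""
    if not node.children:
        return 0
    return 1 + max(height(child) for child in node.children)


def interior_stats(node: Concept, depth: int = 0) -> list[tuple[str, int, int]]:
    """Return (name, depth, height) for every interior node below and including node."""
    if not node.children:
        return []  # leaves are not interior nodes
    stats = [(node.name, depth, height(node))]
    for child in node.children:
        stats.extend(interior_stats(child, depth + 1))
    return stats


def forest_stats(forest: dict[str, Concept]) -> dict[str, list[tuple[str, int, int]]]:
    """Map each attribute name to the interior-node statistics of its hierarchy."""
    return {attr: interior_stats(root) for attr, root in forest.items()}


# Hypothetical one-attribute concept forest, for illustration only.
province = Concept("ANY", [
    Concept("Western", [Concept("Saskatchewan"), Concept("Alberta")]),
    Concept("Central", [Concept("Ontario")]),
])
stats = forest_stats({"province": province})
# stats["province"] == [("ANY", 0, 2), ("Western", 1, 1), ("Central", 1, 1)]
```

Any heuristic built from these tuples (for example, a count or a weighted sum) would be an assumption on top of this sketch; the paper's actual definitions should be taken from its full text.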
Keywords: knowledge discovery, concept hierarchies, discovery potential, databases, machine learning