首页 | 本学科首页   官方微博 | 高级检索  
     


Toward multidatabase mining: identifying relevant databases
Authors:Huan Liu Hongjun Lu Jun Yao
Affiliation:Dept. of Comput. Sci., Arizona State Univ., Tempe, AZ;
Abstract:Various tools and systems for knowledge discovery and data mining have been developed and are available for applications. However, when there are many databases, an immediate question is where one should start mining. It is not true that data mining is better the more databases there are. It is only true when the databases involved are relevant to the task at hand. By breaking away from the conventional data mining assumption that many databases should be joined into one, we argue that the first step for multidatabase mining is to identify databases that are most relevant to an application; without doing so, the mining process can be lengthy, aimless, and ineffective. A measure of relevance is thus proposed for mining tasks with the objective of finding patterns or regularities of certain attributes. An efficient algorithm for identifying relevant databases is described. Experiments are conducted to verify the measure's performance and to exemplify its application
Keywords:
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号