SELECTING EFFECTIVE FEATURES AND RELATIONS FOR EFFICIENT MULTI‐RELATIONAL CLASSIFICATION
Authors: Jun He, Hongyan Liu, Bo Hu, Xiaoyong Du, Puwei Wang
Affiliation: 1. Key Laboratory of Data Engineering and Knowledge Engineering, MOE, China; 2. School of Information, Renmin University of China; 3. School of Economics and Management, Tsinghua University
Abstract: Feature selection is an essential data processing step that removes irrelevant and redundant attributes to achieve shorter learning time, better accuracy, and better comprehensibility. Many feature selection algorithms have been proposed in both the data mining and machine learning fields. These algorithms are usually applied in a single‐table environment, where data are stored in one relational table or one flat file. They are not suitable for a multi‐relational environment, where data are stored in multiple tables joined to one another by semantic relationships. To address this problem, we propose a novel approach called FARS that conducts both Feature And Relation Selection for efficient multi‐relational classification. With this approach, we not only extend the traditional feature selection method to select relevant features from multiple relations, but also develop a new method that reconstructs the multi‐relational database schema and eliminates irrelevant tables to further improve classification performance. Experiments on both real and synthetic databases show that FARS can effectively choose a small set of relevant features, thereby significantly enhancing classification efficiency and prediction accuracy.
Keywords: feature selection; classification; multi‐relational classification