Conditional entropy for incomplete decision systems and its application in data mining |
| |
Authors: | Jianhua Dai, Qing Xu, Wentao Wang, Haowei Tian |
| |
Affiliation: | 1. College of Computer Science, Zhejiang University, Hangzhou, 310027, China (jhdai@126.com); 3. College of Computer Science, Zhejiang University, Hangzhou, 310027, China |
| |
Abstract: | Rough set theory is a useful mathematical tool for dealing with vague and uncertain information. Shannon's entropy and its variants have been applied to measure uncertainty in rough set theory from the viewpoint of information theory. However, few studies have addressed information-theoretic measures of attribute importance in incomplete decision systems (IDSs) that take into account the relation between the decision attribute and the condition attributes. In this paper, we introduce the concept of conditional entropy, together with entropy and joint entropy, in IDSs. Using the new conditional entropy, we propose a measure of attribute importance. Based on this measure, a heuristic attribute reduction algorithm is presented. Experiments on real-life data sets show the effectiveness of the algorithm. The attribute importance measure and the attribute reduction algorithm can be used in data mining or machine learning for handling incomplete data. |
| |
Keywords: | rough set theory; incomplete decision systems; conditional entropy; Shannon entropy; attribute reduction |
|
|