

Integrated Fisher linear discriminants: An empirical study
Affiliation:1. RECOD Lab, Institute of Computing (IC), University of Campinas (Unicamp) – Av. Albert Einstein, 1251, Campinas 13083-852, SP, Brazil;2. Department of Computer Engineering and Industrial Automation (DCA), School of Electrical and Computer Engineering (FEEC), University of Campinas (Unicamp) – Av. Albert Einstein, 400, Campinas 13083-852, SP, Brazil;3. Paris-Est University, IGN/SR, MATIS Lab, 73 avenue de Paris, 94160 Saint-Mandé, France;4. CNAM, CEDRIC Lab, 292 rue Saint-Martin, 75141 Paris Cedex 03, France;1. State Key Laboratory of Software Development Environment, Beihang University, Beijing 100191, China;2. Department of Electrical Engineering, Columbia University, New York, NY 10027, USA;3. Facebook, 1601 Willow Rd, Menlo Park, CA 94025, USA;1. Universidad Autónoma de Aguascalientes, Department of Computer Science, Av. Universidad 940, Col. Ciudad Universitaria, Aguascalientes 20131, Aguascalientes, México
Abstract: This paper studies Fisher linear discriminants (FLDs) in terms of their classification accuracy on imbalanced datasets. An optimal threshold is selected from a series of empirical formulas, which depend not only on sample sizes but also on distribution regions. A mixed binary–decimal coding system is suggested to make very dense datasets sparse and to enlarge the class margins, on the condition that the neighborhood relationships among samples are nearly preserved. Within-class scatter matrices that are singular or nearly singular should be moderately reduced in dimensionality rather than augmented with tiny perturbations. The weight vectors can be further updated by an epoch-limited iterative learning strategy (three epochs at most), provided that the current training error rate decreases accordingly. Combining these ideas, the paper proposes a type of integrated FLD. Extensive experiments on real-world datasets show that the integrated FLDs have clear advantages over conventional FLDs in both learning and generalization performance on imbalanced datasets.
Keywords: Fisher linear discriminants; Imbalanced datasets; Empirical thresholds; Neighborhood-preserving transformations; Iterative learning
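To make the abstract's point about singular within-class scatter matrices concrete, the following is a minimal sketch of a conventional two-class FLD that reduces dimensionality instead of adding a perturbation to the scatter matrix. It is not the paper's integrated method. The function names `fit_fld` and `predict_fld`, the tolerance `tol`, and the choice to reduce dimensionality by dropping near-zero eigen-directions of the scatter matrix are all assumptions made for illustration. The abstract does not give the empirical threshold formulas, so the midpoint of the projected class means stands in as a placeholder threshold.

```python
import numpy as np


def fit_fld(X, y, tol=1e-10):
    """Fit a two-class Fisher linear discriminant (labels 0 and 1).

    Hypothetical helper: near-singular directions of the within-class
    scatter matrix are dropped (dimensionality reduction) rather than
    regularized with a small perturbation.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)

    # Within-class scatter matrix S_w = sum of per-class scatter matrices.
    Sw = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)

    # Eigen-decompose S_w and keep only directions with non-negligible
    # eigenvalues (assumed reduction scheme, not taken from the paper).
    evals, evecs = np.linalg.eigh(Sw)
    keep = evals > tol * evals.max()
    P = evecs[:, keep]  # columns span the retained subspace

    # In the retained subspace S_w is diagonal, so solving
    # S_w w = (m1 - m0) is an element-wise division.
    w = P @ ((P.T @ (m1 - m0)) / evals[keep])

    # Placeholder threshold: midpoint of the projected class means.
    # The paper's empirical, imbalance-aware formulas are not reproduced.
    threshold = 0.5 * (w @ m0 + w @ m1)
    return w, threshold


def predict_fld(X, w, threshold):
    """Assign label 1 to samples whose projection exceeds the threshold."""
    return (np.asarray(X, dtype=float) @ w > threshold).astype(int)
```

On an imbalanced toy set whose third feature duplicates the first (so the scatter matrix is exactly singular), `fit_fld` still returns a finite weight vector, because the redundant direction is discarded before the solve.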
This article is indexed in ScienceDirect and other databases.