Mixing numerical and categorical data in a Self-Organizing Map by means of frequency neurons
Affiliation: 1. Institut de Physique Théorique de Saclay, CNRS/UMR3681, CEA Saclay, F-91191 Gif-sur-Yvette, France; 2. Institute for Nuclear Theory, University of Washington, Seattle, WA 98195-1550, USA; 1. Program for Resource Efficient Communities, Institute of Food and Agricultural Sciences, 2295 Mowry Road, Building #0106, PO Box 110940, University of Florida, Gainesville, FL 32611, USA; 2. Department of Wildlife Ecology and Conservation, Institute of Food and Agricultural Sciences, 215 Newins-Ziegler Hall, PO Box 110430, University of Florida, Gainesville, FL 32611, USA; 3. Physics Department, 65-30 Kissena Boulevard, Science Building Room B322, Queens College, City University of New York, Flushing, NY 11367, USA; 1. Department of Information Technology, Faculty of Computers and Information, Menofiya University, Shebin El Kom, Menofiya, Egypt; 2. Department of Computer Systems, Faculty of Computers and Information, Ain Shams University, Cairo, Egypt; 3. Department of Information Systems, Faculty of Computers and Information, Menofiya University, Shebin El Kom, Menofiya, Egypt
Abstract: Although Self-Organizing Maps (SOMs) are a powerful and essential tool for pattern recognition and data mining, the common SOM algorithm is not suited to processing categorical data, which is present in many real datasets. For this reason, categorical values are commonly converted into a binary code, a solution that unfortunately distorts both the network training and the subsequent analysis. The present work proposes a SOM architecture that processes categorical values directly, without any prior transformation. This architecture is also capable of properly mixing numerical and categorical data, in such a manner that all features carry the same weight. The proposed implementation is scalable, and the corresponding learning algorithm is described in detail. Finally, we demonstrate the effectiveness of the presented algorithm by applying it to several well-known datasets.
Keywords: Self-Organizing Map; Categorical data; Mixed data; Big data
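
The abstract does not spell out how a frequency neuron works, so the following is only an illustrative sketch under stated assumptions, not the authors' algorithm. It assumes that a neuron stores one weight per numerical feature (pre-scaled to [0, 1]) and one table of relative frequencies per categorical feature, that a categorical mismatch is measured as one minus the stored frequency of the observed category, and that learning shifts the frequencies toward the observed category. The function names `dissimilarity` and `update` and the parameter `rate` are hypothetical.

```python
import math


def dissimilarity(sample_num, sample_cat, neuron_num, neuron_freq):
    """Distance between a sample and a neuron (assumed form, not from the paper).

    Each feature contributes a term in [0, 1], so numerical and categorical
    features carry the same weight.
    """
    # Numerical features: squared difference (inputs assumed scaled to [0, 1]).
    total = sum((x - w) ** 2 for x, w in zip(sample_num, neuron_num))
    # Categorical features: 1 minus the stored frequency of the observed category.
    total += sum(
        (1.0 - freq.get(cat, 0.0)) ** 2
        for cat, freq in zip(sample_cat, neuron_freq)
    )
    return math.sqrt(total)


def update(sample_num, sample_cat, neuron_num, neuron_freq, rate):
    """Move a neuron toward a sample in place (assumed update rule)."""
    # Numerical weights follow the usual SOM step toward the input.
    for i, x in enumerate(sample_num):
        neuron_num[i] += rate * (x - neuron_num[i])
    # Frequencies move toward 1 for the observed category and toward 0 for
    # the others; a table that sums to 1 still sums to 1 afterwards.
    for cat, freq in zip(sample_cat, neuron_freq):
        for key in set(freq) | {cat}:
            target = 1.0 if key == cat else 0.0
            old = freq.get(key, 0.0)
            freq[key] = old + rate * (target - old)
```

In this sketch no binary (one-hot) encoding of the input is needed: the category label indexes the frequency table directly, and each categorical feature contributes a single bounded term to the distance, just as each numerical feature does.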
This article is indexed in ScienceDirect and other databases.