Discernible visualization of high dimensional data using label information
Affiliation:1. College of Foreign Studies, Yanshan University, No. 438 Hebei Street, Qinhuangdao 066004, Hebei, PR China;2. Institute of Electrical Engineering, Yanshan University, No. 438 Hebei Street, Qinhuangdao 066004, Hebei, PR China;3. College of International Programs, Shanghai International Studies University, No. 410 Dong Ti Yu Hui Road, Shanghai 200083, PR China;1. Graduate School of Information, Production and Systems, Waseda University, Hibikino 2-7, Wakamatsu-ku, Kitakyushu, Fukuoka 808-0135, Japan;2. Information, Production and Systems Research Center, Waseda University, Hibikino 2-7, Wakamatsu-ku, Kitakyushu, Fukuoka 808-0135, Japan;1. Department of Industrial Engineering and Management, Yuan-Ze University, Taiwan;2. Innovation Center for Big Data and Digital Convergence, Yuan-Ze University, Taiwan
Abstract: Visualization methods can significantly improve the outcome of automated knowledge discovery systems by involving human judgment. Star Coordinate is a visualization technique that maps k-dimensional data onto a circle using a set of axes that share a common origin at the center of the circle. It lets users adjust this mapping by scaling and rotating the axes until no mapped point clouds (clusters) overlap one another. In this state, groups of similar data are easy to detect. However, an effective adjustment can be difficult or even impossible for the user in high dimensions, especially when the input space has about 50 dimensions or more. In this paper, we propose a novel method for automatic axis adjustment of high dimensional data in the Star Coordinate visualization method. Using label information, the method finds the best two-dimensional viewpoint, one that minimizes intra-cluster distances while keeping inter-cluster distances as large as possible. We call this viewpoint a discernible visualization, in which clusters are easily detectable by the human eye. The label information can be provided by the user or can be the result of running a conventional clustering method on the input data. The proposed approach optimizes the Star Coordinate representation by formulating the problem as the maximization of a Fisher discriminant. The problem therefore has a unique global solution and polynomial time complexity. We also prove that manipulating the scaling factor alone is sufficient to create any given visualization mapping. Moreover, it is shown that k-dimensional data visualization can be modeled as an eigenvalue problem. With this approach, an optimal axis adjustment in the Star Coordinate method for high dimensional data can be achieved without any user intervention.
The experimental results demonstrate the effectiveness of the proposed approach in terms of accuracy and performance.
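To make the mapping described in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes the common Star Coordinate layout in which axis i points at angle 2πi/k, and it scores a 2D view with a trace-ratio form of the Fisher criterion (between-cluster scatter divided by within-cluster scatter). The function names `star_coordinates` and `fisher_ratio`, the toy data, and the hand-picked scaling factors are all illustrative assumptions; the paper obtains the scaling by solving an eigenvalue problem, which this sketch does not attempt.

```python
import math


def star_coordinates(point, scales, angles=None):
    """Map one k-dimensional point to 2D.

    Assumption: axis i is the unit vector at angle 2*pi*i/k unless
    explicit angles are given; each axis is stretched by scales[i].
    """
    k = len(point)
    if angles is None:
        angles = [2 * math.pi * i / k for i in range(k)]
    x = sum(v * s * math.cos(a) for v, s, a in zip(point, scales, angles))
    y = sum(v * s * math.sin(a) for v, s, a in zip(point, scales, angles))
    return (x, y)


def fisher_ratio(points2d, labels):
    """Trace-ratio Fisher score: between-cluster / within-cluster scatter.

    Assumption: this is one common variant of the Fisher criterion,
    used here only to compare two candidate views.
    """
    n = len(points2d)
    mean_x = sum(p[0] for p in points2d) / n
    mean_y = sum(p[1] for p in points2d) / n
    groups = {}
    for p, label in zip(points2d, labels):
        groups.setdefault(label, []).append(p)
    between = 0.0
    within = 0.0
    for members in groups.values():
        cx = sum(p[0] for p in members) / len(members)
        cy = sum(p[1] for p in members) / len(members)
        between += len(members) * ((cx - mean_x) ** 2 + (cy - mean_y) ** 2)
        within += sum((p[0] - cx) ** 2 + (p[1] - cy) ** 2 for p in members)
    return math.inf if within == 0 else between / within


# Toy 4-dimensional data: dimension 0 separates the clusters,
# dimension 2 is noise that smears them along the same screen axis.
data = [
    [0.1, 0.5, 0.1, 0.5],
    [0.2, 0.5, 0.9, 0.5],
    [0.8, 0.5, 0.2, 0.5],
    [0.9, 0.5, 0.8, 0.5],
]
labels = ["a", "a", "b", "b"]

uniform_scales = [1.0, 1.0, 1.0, 1.0]
tuned_scales = [1.0, 1.0, 0.0, 1.0]  # hand-picked: suppress the noisy axis

uniform = fisher_ratio([star_coordinates(p, uniform_scales) for p in data], labels)
tuned = fisher_ratio([star_coordinates(p, tuned_scales) for p in data], labels)
print(uniform, tuned)
```

In this toy case, shrinking the scaling factor of the noisy axis raises the Fisher score, which is the kind of adjustment the abstract says becomes impractical to do by hand when k is large.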
Keywords: Visualization  Star Coordinate  High dimensionality reduction  Fisher's discriminant form
This article is indexed in ScienceDirect and other databases.