Information synthesis based on hierarchical maximum entropy discretization |
| |
Authors: | DAVID K. Y. CHIU, BENNY CHEUNG, ANDREW K. C. WONG |
| |
Affiliation: | 1. Department of Computing & Information Science, University of Guelph, Guelph, Ontario, N1G 2W1, Canada. E-mail: dchiu@snowhite.cis.uoguelph.ca; 2. Department of Systems Design Engineering, University of Waterloo, Waterloo, Ontario, N2L 3G1, Canada. E-mail: akcwong@watsup.bitnet |
| |
Abstract: | This paper outlines a new approach to the synthesis of information from data. Information is defined as a detected organization of data after a process of discretization (or partitioning) and event covering. The discretization is based on a hierarchical maximum entropy scheme that iteratively minimizes the loss of information in Shannon's sense. The event-covering process evaluates how far the observed frequency of an event deviates from the frequency expected under prior knowledge (defined by the null hypothesis and/or domain knowledge). The hierarchical maximum entropy discretization scheme provides a rigorous and efficient way of solving the non-uniform scaling problem in multivariate data analysis. Because the method refines the boundaries dynamically, depending on the detection of information, it directs the analysis toward the outcome subspace with high information content. In addition, it naturally produces a hierarchical view of information, so that data can be analyzed or synthesized with respect to an outcome context. The method has been tested on simulated and real-life data with very good results. |
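As a rough illustration of the maximum entropy idea for a single variable, the sketch below places interval boundaries so that each interval holds about the same number of observations, which maximizes the Shannon entropy of the discretized variable. It is not the authors' hierarchical scheme, which refines boundaries iteratively across variables; the function names, the sample data, and the equal-width comparison are all assumptions made for illustration.

```python
import math
from collections import Counter


def max_entropy_boundaries(values, k):
    """Return k-1 cut points giving roughly equal counts in each of k intervals.

    Assumption: equal-frequency partitioning stands in for the paper's
    maximum entropy discretization of one variable.
    """
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // k] for i in range(1, k)]


def discretize(value, boundaries):
    """Map a value to its interval index (0 .. k-1); boundaries must be sorted."""
    return sum(value >= b for b in boundaries)


def shannon_entropy(labels):
    """Shannon entropy in bits of the empirical distribution of the labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def equal_width_labels(values, k):
    """Baseline for comparison: k intervals of equal width over the data range."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [min(int((v - lo) / width), k - 1) for v in values]


# Hypothetical, deliberately skewed sample (non-uniform scale).
data = [0.1, 0.2, 0.25, 0.3, 1.0, 1.5, 2.0, 8.0, 9.0, 20.0, 35.0, 50.0]

cuts = max_entropy_boundaries(data, 4)
labels = [discretize(v, cuts) for v in data]
width_labels = equal_width_labels(data, 4)

print("cut points:", cuts)
print("entropy, equal-frequency:", shannon_entropy(labels))
print("entropy, equal-width:", shannon_entropy(width_labels))
```

On skewed data like this, equal-width intervals crowd most observations into one interval and lose information, whereas the equal-frequency partition keeps the entropy at its maximum of log2(k) bits. This is one way to read the abstract's remark about non-uniform scaling.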
| |
Keywords: | maximum entropy; discretization; probabilistic uncertainty; outcome context; data synthesis; event covering; relational information |
|
|