Likelihood-Based Data Squashing: A Modeling Approach to Instance Construction |
| |
Authors: | David Madigan, Nandini Raghavan, William Dumouchel, Martha Nason, Christian Posse, Greg Ridgeway |
| |
Affiliation: | (1) Rutgers University, USA; (2) AT&T Labs—Research, USA; (3) Talaria, Inc, USA; (4) University of Washington, USA |
| |
Abstract: | Squashing is a lossy data compression technique that preserves statistical information. Specifically, squashing compresses a massive dataset to a much smaller one so that outputs from statistical analyses carried out on the smaller (squashed) dataset reproduce outputs from the same statistical analyses carried out on the original dataset. Likelihood-based data squashing (LDS) differs from a previously published squashing algorithm insofar as it uses a statistical model to squash the data. The results show that LDS provides excellent squashing performance even when the target statistical analysis departs from the model used to squash the data. |
| |
Keywords: | instance construction, data compression |
This article is indexed in SpringerLink and other databases.