Good edit similarity learning by loss minimization
Authors: Aurélien Bellet, Amaury Habrard, Marc Sebban

Affiliation: Laboratoire Hubert Curien UMR 5516, 18 rue Benoit Lauras, 42000 Saint-Étienne, France
Abstract: Similarity functions are a fundamental component of many learning algorithms. When dealing with string or tree-structured data, measures based on the edit distance are widely used, and several methods exist for learning them from data. However, these methods offer no theoretical guarantee as to the generalization ability and discriminative power of the learned similarities. In this paper, we propose an approach to edit similarity learning based on loss minimization, called GESL. It is driven by the notion of (ε, γ, τ)-goodness, a theory that bridges the gap between the properties of a similarity function and its performance in classification. Using the notion of uniform stability, we derive generalization guarantees that hold for a large class of loss functions. We also provide experimental results on two real-world datasets which show that edit similarities learned with GESL induce more accurate and sparser classifiers than other (standard or learned) edit similarities.