The Role of Occam's Razor in Knowledge Discovery |
| |
Authors: | Pedro Domingos |
| |
Affiliation: | Department of Computer Science and Engineering, University of Washington, Seattle, WA, 98195 |
| |
Abstract: | Many KDD systems incorporate an implicit or explicit preference for simpler models, but this use of Occam's razor has been strongly criticized by several authors (e.g., Schaffer, 1993; Webb, 1996). This controversy arises partly because Occam's razor has been interpreted in two quite different ways. The first interpretation (simplicity is a goal in itself) is essentially correct, but is at heart a preference for more comprehensible models. The second interpretation (simplicity leads to greater accuracy) is much more problematic. A critical review of the theoretical arguments for and against it shows that it is unfounded as a universal principle, and demonstrably false. A review of empirical evidence shows that it also fails as a practical heuristic. This article argues that its continued use in KDD risks causing significant opportunities to be missed, and should therefore be restricted to the comparatively few applications where it is appropriate. The article proposes and reviews the use of domain constraints as an alternative for avoiding overfitting, and examines possible methods for handling the accuracy–comprehensibility trade-off. |
| |
Keywords: | model selection, overfitting, multiple comparisons, comprehensible models, domain knowledge |
This article is indexed in SpringerLink and other databases.
|