Coding and decoding strategies for multi-class learning problems
Affiliation:1. Novartis Institutes for BioMedical Research, 5300 Chiron Way, Emeryville, CA 94608-2916, USA;2. Target Discovery Institute and Structural Genomics Consortium, University of Oxford, NDM Research Building, Roosevelt Drive, Oxford OX3 7FZ, UK;3. The Norwegian Structural Biology Centre, Department of Chemistry, University of Tromsø, Tromsø, Norway;4. Novartis Institutes for BioMedical Research, Novartis Pharma AG, Postfach, CH-4002 Basel, Switzerland;5. Eli Lilly and Company, Lilly Research Laboratories, Lilly Corporate Center, DC 1931, Indianapolis, IN 46285, USA;6. University of Chicago, Dept. of Biochemistry & Molecular Biology, 929 E. 57th St., Chicago 60637, IL, USA;7. Department of Chemistry, Vanderbilt University, Nashville, TN 37232, USA;8. Astex Pharmaceuticals, 436 Cambridge Science Park, Milton Road, Cambridge CB4 0QA, United Kingdom;9. DiscoveRx Corporation, 42501 Albrae Street, Suite 100, Fremont, CA 94538, USA
Abstract: It is known that the error correcting output code (ECOC) technique, when applied to multi-class learning problems, can improve generalisation performance. One reason for the improvement is its ability to decompose the original problem into complementary two-class problems. Binary classifiers trained on the sub-problems are diverse and can benefit from being combined with a simple distance-based strategy. However, there is some debate about why ECOC performs as well as it does, particularly with respect to the significance of the coding/decoding strategy. In this paper we consider the binary (0,1) code matrix conditions necessary for reduction of error in the ECOC framework, and demonstrate the desirability of equidistant codes. It is shown that equidistant codes can be generated by using properties related to the number of 1's in each row and the number of 1's shared between any pair of rows. Experimental results on synthetic data and a few popular benchmark problems show how performance deteriorates as code length is reduced for six decoding strategies.
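As a rough illustration of the ideas in the abstract, the sketch below checks whether a small binary code matrix is equidistant and decodes a vector of binary classifier outputs by nearest code word. The 4-class, 6-column matrix `CODE`, the function names, and the choice of L1 distance for decoding are assumptions made for this example; they are not taken from the paper, and the paper's six decoding strategies are not reproduced here.

```python
from itertools import combinations

# Hypothetical code matrix: one row (code word) per class, one column per
# binary sub-problem. Every row has three 1's and every pair of rows shares
# exactly one 1, so each pairwise Hamming distance is 3 + 3 - 2*1 = 4.
CODE = [
    [1, 1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1, 0],
    [0, 1, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 1],
]


def hamming(a, b):
    """Number of positions at which two code words differ."""
    return sum(x != y for x, y in zip(a, b))


def pairwise_distances(code_matrix):
    """Hamming distance for every unordered pair of rows."""
    return [hamming(r1, r2) for r1, r2 in combinations(code_matrix, 2)]


def is_equidistant(code_matrix):
    """True if all pairs of code words are the same Hamming distance apart."""
    return len(set(pairwise_distances(code_matrix))) == 1


def decode(outputs, code_matrix):
    """Index of the class whose code word is closest (L1) to the outputs.

    `outputs` holds one value in [0, 1] per binary classifier.
    """
    dists = [
        sum(abs(o - c) for o, c in zip(outputs, row)) for row in code_matrix
    ]
    return min(range(len(dists)), key=dists.__getitem__)


if __name__ == "__main__":
    print(is_equidistant(CODE), pairwise_distances(CODE))
    # Code word of class 1 with its first bit flipped by a faulty classifier.
    print(decode([0, 0, 0, 1, 1, 0], CODE))
```

With a minimum distance of 4, this particular matrix tolerates one wrong binary decision per pattern under hard-decision Hamming decoding; shortening the code generally reduces that margin.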
This article is indexed in ScienceDirect and other databases.