Syllabification rules versus data-driven methods in a language with low syllabic complexity: The case of Italian
Authors: Connie R. Adsett, Yannick Marchand, Vlado Kešelj
Affiliation: (a) Institute for Biodiagnostics (Atlantic), National Research Council Canada, 1796 Summer Street, Suite 3900, Halifax, Nova Scotia, Canada B3H 3A7; (b) Faculty of Computer Science, Dalhousie University, Halifax, Nova Scotia, Canada B3H 1W5
Abstract: Linguistic rules have been assumed to be the best technique for determining the syllabification of unknown words. This assumption has recently been challenged for English, where data-driven algorithms have been shown to outperform rule-based methods. It is possible, however, that data-driven methods are better only for languages with complex syllable structures. In this study, three rule-based automatic syllabification systems and two data-driven automatic syllabification systems (Syllabification by Analogy and the Look-Up Procedure) are compared on Italian, a language with lower syllabic complexity. On a lexicon of 44,720 words, the best data-driven algorithm (Syllabification by Analogy) achieved 97.70% word accuracy, while the best rule set correctly syllabified 89.77% of words. These results show that data-driven methods can also outperform rule-based methods on the syllabification of Italian, a language of low syllabic complexity.
Keywords: Syllabification; Italian language; Rule-based systems; Data-driven methods; Analogy
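
The abstract reports results as word accuracy. A minimal sketch of that metric follows, assuming it means the fraction of words whose predicted syllabification matches the reference exactly. The function name `word_accuracy`, the hyphen-delimited representation, and the four toy words are illustrative assumptions, not taken from the paper or its lexicon.

```python
def word_accuracy(predicted, reference):
    """Return the fraction of words syllabified exactly as in the reference.

    Both arguments are equal-length lists of hyphen-delimited strings,
    e.g. "stra-da". A word counts as correct only if every syllable
    boundary matches (assumed definition of word accuracy).
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        raise ValueError("cannot compute accuracy on an empty word list")
    correct = sum(p == r for p, r in zip(predicted, reference))
    return correct / len(reference)


# Toy data for illustration only (not drawn from the study's lexicon).
reference = ["ca-sa", "li-bro", "stra-da", "pa-sta"]
predicted = ["ca-sa", "li-bro", "stra-da", "pas-ta"]  # one boundary misplaced

accuracy = word_accuracy(predicted, reference)
print(f"{accuracy:.2%}")  # prints 75.00%
```

Under this definition a single misplaced boundary makes the whole word wrong, so the measure is stricter than counting correct boundaries individually.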
This article is indexed in ScienceDirect and other databases.