DiSeg 1.0: The first system for Spanish discourse segmentation
Authors:Iria da Cunha, Eric San Juan, Juan Manuel Torres-Moreno, Marina Lloberes, Irene Castellón
Affiliation:a Institut Universitari de Lingüística Aplicada (Universitat Pompeu Fabra): C/Roc Boronat n° 138, 08018 Barcelona, Spain
b Laboratoire Informatique d’Avignon (Université d’Avignon et des Pays de Vaucluse): 339, chemin des Meinajaries, Agroparc, BP 91228, 84911 Avignon, Cedex 9, France
c Instituto de Ingeniería (Universidad Nacional Autónoma de México): Circuito Escolar sn, Ciudad Universitaria, CP 04510, Delegación Coyoacán, México D.F., Mexico
d École Polytechnique de Montréal: C.P. 6079, succ. Centre-ville, Montréal, Québec, Canada H3C 3A7
e Universitat de Barcelona: Gran Via de les Corts Catalanes n° 585, 08007 Barcelona, Spain
Abstract:Discourse parsing is currently a prominent research topic, yet no discourse parser exists for Spanish texts. The first stage in developing such a tool is discourse segmentation. In this work, we present DiSeg, the first discourse segmenter for Spanish, which uses the framework of Rhetorical Structure Theory and is based on lexical and syntactic rules. We describe the system and evaluate its performance against a gold standard corpus divided into a medical and a terminological subcorpus. We obtain promising results, which suggest that discourse segmentation is possible using shallow parsing.
Keywords:Discourse parsing   Discourse segmentation   Shallow parsing   Rhetorical Structure Theory
This article is indexed in ScienceDirect and other databases.