Annotating discourse markers in spontaneous speech corpora on an example for the Slovenian language
Authors: Darinka Verdonik (1), Matej Rojc (1), Marko Stabej (2)
Affiliation: (1) Faculty of Electrical Engineering and Computer Science, University of Maribor, Smetanova ul. 17, Maribor, 2000, Slovenia; (2) Faculty of Arts, University of Ljubljana, Ljubljana, Slovenia
Abstract: Speech-to-speech translation technology has difficulty processing elements of spontaneity in conversation. We propose a discourse marker attribute in speech corpora to help overcome some of these problems. There have already been some attempts to annotate discourse markers in speech corpora. However, because there is no consensus on which expressions count as discourse markers, we have to reconsider how to set up a framework for annotating them. To better understand what is gained by introducing a discourse marker category, we also have to analyse the characteristics and functions of discourse markers in discourse. This is especially important for languages such as Slovenian, where little or no research on discourse markers has been carried out. The aims of this paper are to present a scheme for annotating discourse markers, based on the analysis of a corpus of Slovenian telephone conversations in the tourism domain, and to give additional arguments, based on the characteristics and functions of discourse markers, that confirm their special status in conversation.
Keywords: Discourse markers; Speech corpora; Annotating; Conversation; Discourse analysis; Speech-to-speech translation; Spontaneous speech; Slovenian language
This article is indexed in SpringerLink and other databases.