Elementary Dependency Trees for Identifying Corpus-Specific Semantic Classes

Authors:	B. Habert, C. Fabre

Affiliation:	(1) UMR 9952 & LIMSI -- CNRS, Ecole Normale Supérieure de Fontenay-St Cloud, 31 av. Lombart, F-92260 Fontenay-aux-Roses; (2) ERSS -- CNRS, 5 allées Antonio Machado, F-31058 Toulouse cédex

Abstract:	Elementary dependency relationships between words within parse trees produced by robust analyzers on a corpus help automate the discovery of semantic classes relevant to the underlying domain. We introduce two methods for extracting elementary syntactic dependencies from normalized parse trees. The resulting groupings help identify coarse-grained semantic categories and isolate lexical idiosyncrasies belonging to a specific sublanguage. A comparison shows a satisfactory overlap with an existing nomenclature for medical language processing. This symbolic approach is efficient on medium-sized corpora, which resist statistical clustering methods, but it seems more appropriate for specialized texts. This revised version was published online in July 2006 with corrections to the Cover Date.

Keywords:	clustering semantic acquisition noun phrase extraction
This article is indexed in SpringerLink and other databases.