SYNTACTIC SIMPLIFICATION AND SEMANTIC ENRICHMENT—TRIMMING DEPENDENCY GRAPHS FOR EVENT EXTRACTION |
| |
Authors: | Ekaterina Buyko, Erik Faessler, Joachim Wermter, Udo Hahn |
| |
Affiliation: | Jena University Language and Information Engineering (JULIE) Lab, Friedrich‐Schiller‐Universität Jena, Jena, Germany |
| |
Abstract: | In our approach to event extraction, dependency graphs constitute the fundamental data structure for knowledge capture. Two types of trimming operations pave the way to more effective relation extraction. First, we simplify the syntactic representation structures resulting from parsing by pruning informationally irrelevant lexical material from dependency graphs. Second, we enrich informationally relevant lexical material in the simplified dependency graphs with additional semantic metadata at several layers of conceptual granularity. These two aggregation operations on linguistic representation structures are intended to avoid overfitting of the machine learning‐based classifiers that we use for event extraction (alongside manually curated dictionaries). Given this methodological framework, the corresponding JReX system, developed by the JULIE Lab team from Friedrich‐Schiller‐Universität Jena (Germany), ranked 2nd among 24 competing teams for Task 1 in the “BioNLP’09 Shared Task on Event Extraction,” with 45.8% recall, 47.5% precision, and 46.7% F1‐score on all 3,182 events. In more recent experiments, based on slight modifications of JReX and using the same data sets, we achieved 45.9% recall, 57.7% precision, and 51.1% F1‐score. |
| |
Keywords: | biomedical natural language processing; dependency parsing; event extraction; relation extraction |