Affiliation: | (a) Jozef Stefan Institute, Department of Intelligent Systems, Jamova 39, 1000 Ljubljana, Slovenia; (b) Institute AIFB, University of Karlsruhe, Karlsruhe, Germany; (c) Faculty of Organizational Sciences, University of Maribor, Kranj, Slovenia; (d) Research Center for Information Technologies (FZI), Karlsruhe, Germany |
Abstract: | The tremendous success of the World Wide Web is offset by the effort needed to search for and find relevant information. For tabular structures embedded in HTML documents, typical keyword-based or link-analysis-based search fails. The Semantic Web relies on annotating resources such as documents by means of ontologies and aims to overcome the bottleneck of finding relevant information. Turning the current Web into a Semantic Web requires automatic approaches to annotation, since manual approaches will not scale in general. Most efforts have been devoted to the automatic generation of ontologies from text, but with quite limited success. Tabular structures require additional effort, mainly because understanding table contents requires comprehension of the table's logical structure on the one hand and its semantic interpretation on the other. The focus of this paper is on the automatic transformation and generation of semantic (F-Logic) frames from table-like structures. The presented work consists of a methodology, an accompanying implementation (called TARTAR) and a thorough evaluation. It is based on a grounded cognitive table model, which the methodology instantiates step by step. A typical application scenario is the automatic population of ontologies to enable query answering over arbitrary tables (e.g. HTML tables). |