首页 | 本学科首页   官方微博 | 高级检索  
     


Multi-document summarization based on the Yago ontology
Authors:Elena Baralis  Luca Cagliero  Saima Jabeen  Alessandro Fiori  Sajid Shah
Affiliation:1. Dipartimento di Automatica e Informatica, Politecnico di Torino, Corso Duca degli Abruzzi 24, 10129 Torino, Italy;2. IRC@C: Institute for Cancer Research at Candiolo, Str. Prov. 142 Km. 3.95, 10060 Candiolo (TO), Italy;3. Dipartmento di Elettronica e Telecomunicazioni, Politecnico di Torino, Corso Duca degli Abruzzi 24, 10129 Torino, Italy
Abstract:Sentence-based multi-document summarization is the task of generating a succinct summary of a document collection, which consists of the most salient document sentences. In recent years, the increasing availability of semantics-based models (e.g., ontologies and taxonomies) has prompted researchers to investigate their usefulness for improving summarizer performance. However, semantics-based document analysis is often applied as a preprocessing step, rather than integrating the discovered knowledge into the summarization process.This paper proposes a novel summarizer, namely Yago-based Summarizer, that relies on an ontology-based evaluation and selection of the document sentences. To capture the actual meaning and context of the document sentences and generate sound document summaries, an established entity recognition and disambiguation step based on the Yago ontology is integrated into the summarization process.The experimental results, which were achieved on the DUC’04 benchmark collections, demonstrate the effectiveness of the proposed approach compared to a large number of competitors as well as the qualitative soundness of the generated summaries.
Keywords:Document summarization  Text mining  Entity recognition
本文献已被 ScienceDirect 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号