SyMSS: A syntax-based measure for short-text semantic similarity
Authors: Jesús Oliva, José Ignacio Serrano, María Dolores del Castillo, Ángel Iglesias
Affiliation: Bioengineering Group, CSIC, Carretera de Campo Real, km. 0,200, La Poveda, Arganda del Rey, CP: 28500, Madrid, Spain
Abstract: Sentence and short-text semantic similarity measures are becoming an important part of many natural language processing tasks, such as text summarization and conversational agents. This paper presents SyMSS, a new method for computing short-text and sentence semantic similarity. The method is based on the notion that the meaning of a sentence is made up of not only the meanings of its individual words, but also the structural way the words are combined. Thus, SyMSS captures and combines syntactic and semantic information to compute the semantic similarity of two sentences. Semantic information is obtained from a lexical database. Syntactic information is obtained through a deep parsing process that finds the phrases in each sentence. With this information, the proposed method measures the semantic similarity between concepts that play the same syntactic role. Psychological plausibility is added to the method by using previous findings about how humans weight different syntactic roles when computing semantic similarity. The results show that SyMSS outperforms state-of-the-art methods in terms of rank correlation with human intuition, thus proving the importance of syntactic information in sentence semantic similarity computation.
Keywords: Linguistic tools for IS modeling; Text DBs; Natural language processing (NLP); Semantic similarity; Sentence similarity
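
The idea summarized in the abstract, comparing the concepts that fill the same syntactic role and weighting each role, can be illustrated with a toy sketch. This is not the authors' implementation: the role names, the weights in ROLE_WEIGHTS, the hand-made similarity table TOY_WORD_SIM, and the weighted-average scoring rule are all assumptions made for illustration. A real system would take roles from a parser and word similarities from a lexical database.

```python
from typing import Dict

# Hypothetical toy lexicon: symmetric word-to-word similarity in [0, 1].
# Stands in for a lexical database; the values are made up.
TOY_WORD_SIM = {
    frozenset(["dog", "cat"]): 0.8,
    frozenset(["chases", "follows"]): 0.9,
    frozenset(["ball", "mouse"]): 0.2,
}

# Hypothetical role weights (made-up values, not taken from the paper).
ROLE_WEIGHTS = {"subject": 0.4, "verb": 0.4, "object": 0.2}


def word_similarity(a: str, b: str) -> float:
    """Return 1.0 for identical words, else a toy-table lookup (0.0 if unknown)."""
    if a == b:
        return 1.0
    return TOY_WORD_SIM.get(frozenset([a, b]), 0.0)


def sentence_similarity(
    s1: Dict[str, str],
    s2: Dict[str, str],
    weights: Dict[str, float] = ROLE_WEIGHTS,
) -> float:
    """Weighted average of word similarity over syntactic roles.

    Each sentence is a mapping from role name to head word, assumed to
    come from a parser. A role present in only one sentence contributes
    zero, which acts as a penalty for structural mismatch.
    """
    roles = set(s1) | set(s2)
    if not roles:
        return 0.0
    total_weight = 0.0
    score = 0.0
    for role in roles:
        w = weights.get(role, 1.0)  # unknown roles get a default weight
        total_weight += w
        if role in s1 and role in s2:
            score += w * word_similarity(s1[role], s2[role])
    return score / total_weight


if __name__ == "__main__":
    a = {"subject": "dog", "verb": "chases", "object": "ball"}
    b = {"subject": "cat", "verb": "follows", "object": "mouse"}
    print(round(sentence_similarity(a, b), 3))
```

Because words are compared only within the same role, two sentences that use the same words in swapped roles (for example, subject and object exchanged) would score lower than a bag-of-words comparison would suggest, which is the intuition the abstract describes.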
This article is indexed in ScienceDirect and other databases.