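Word Semantic Relevancy Computation and Categories Identification Based on Chinese Compound Sentences
Cite this article: YANG Jin-cai, CHEN Zhong-zhong, SHEN Xian-jun, HU Jin-zhu. Word Semantic Relevancy Computation and Categories Identification Based on Chinese Compound Sentences[J]. Computer Science (计算机科学), 2017, 44(5): 280-284.
Authors: YANG Jin-cai  CHEN Zhong-zhong  SHEN Xian-jun  HU Jin-zhu
Affiliation: School of Computer Science, Central China Normal University, Wuhan 430079 (all four authors)
Funding: Supported by the National Social Science Fund of China (14BYY093)
Abstract: Semantic relevancy computation is a key technique in Chinese information processing and plays an important role in information retrieval, word sense disambiguation, and text classification. Drawing on the syntactic theory of Chinese compound sentences and the theory of relation-marker collocation, and using compound sentences from a Chinese compound sentence corpus and from search engines as data, this paper proposes SRCCS, a semantic relevancy computation method based on Chinese compound sentences. The method not only computes the relevancy between words but also indicates the nature and category of that relevancy. Compared with methods that compute relevancy over short texts, it selects a narrower scope of objects to compute over, so its results are more accurate and its computational complexity is lower. A comparative analysis against a search-engine-based method on the same test set demonstrates the effectiveness and superiority of the compound-sentence-based method.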

Keywords: compound sentence  semantic relevancy  relation marker  relation category
Received: 2016/4/21
Revised: 2016/7/3

Word Semantic Relevancy Computation and Categories Identification Based on Chinese Compound Sentences
YANG Jin-cai, CHEN Zhong-zhong, SHEN Xian-jun and HU Jin-zhu. Word Semantic Relevancy Computation and Categories Identification Based on Chinese Compound Sentences[J]. Computer Science, 2017, 44(5): 280-284.
Authors: YANG Jin-cai  CHEN Zhong-zhong  SHEN Xian-jun  HU Jin-zhu
Affiliation: School of Computer Science, Central China Normal University, Wuhan 430079, China (all four authors)
Abstract: Word semantic relevancy computation is a critical technique in Chinese information processing and plays an important role in information retrieval, word sense disambiguation, and text classification. Drawing on the syntactic theory of Chinese compound sentences and the collocation theory of their relation markers, and using a corpus of Chinese compound sentences together with compound sentences retrieved from search engines as data, this paper proposes a semantic relevancy computation method based on Chinese compound sentences (SRCCS). The method not only computes word semantic relevancy but also indicates the property and category of that relevancy. Compared with methods that compute relevancy over short texts, it evaluates a smaller scope of objects, so its results are more accurate and its computational complexity is lower. A comparison with Google Distance on the same test set shows that the new measure is more reliable and effective.
Keywords: Compound sentences  Semantic relevancy  Relation markers  Relation categories
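
The abstract names Google Distance as the baseline but does not give its formula. As a rough illustration only, and assuming the baseline is the commonly used Normalized Google Distance computed from search-engine hit counts, it can be sketched as below. The function name, parameter names, and the example counts are hypothetical and are not taken from the paper; this is not the SRCCS method itself.

```python
import math

def normalized_google_distance(fx, fy, fxy, n):
    """Normalized Google Distance from hit counts (illustrative sketch).

    fx, fy : number of pages containing each word alone (assumed inputs)
    fxy    : number of pages containing both words
    n      : assumed total number of indexed pages, larger than fx and fy
    Smaller values mean the two words are more closely related.
    """
    if fx <= 0 or fy <= 0 or fxy <= 0:
        # No occurrences or no co-occurrence: treat as maximally distant.
        return math.inf
    lx, ly, lxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(lx, ly) - lxy) / (math.log(n) - min(lx, ly))

# Hypothetical counts, chosen only to exercise the function.
distance = normalized_google_distance(100, 100, 10, 10000)
```

A measure of this kind yields a single number per word pair. The abstract's claim is that SRCCS additionally indicates the property and category of the relation, which a hit-count distance alone does not provide.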