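A Concept-Based Text Representation Model
Cite this article: CHEN Long, FAN Rui-xia, GAO Qi. A concept-based text representation model [J]. Computer Engineering and Applications, 2008, 44(20): 162-164.
Authors: CHEN Long, FAN Rui-xia, GAO Qi
Affiliation: Institute of Pattern Recognition and Intelligent Systems, Beijing Institute of Technology, Beijing 100081
Abstract: Text information processing is moving toward semantics, yet today's mainstream text representation model, the vector space model (VSM), uses individual words as feature terms. This ignores the semantic relations between words in natural language, so the many synonyms and polysemous words that occur in text go unaccounted for, which seriously reduces the precision of text information processing. Applying techniques and results from natural language processing, this paper introduces concepts and concept distance into the vector space model and, working from the perspective of semantics and concepts, uses concepts as the feature terms of a text to build a concept-based text representation model. Experiments show that this method handles synonyms and polysemous words fairly well and improves the recall and precision of text classification.

Keywords: text representation model; concept; concept distance
Received: 2007-9-27
Revised: 2008-1-23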

Model of text representation based on concept
CHEN Long, FAN Rui-xia, GAO Qi. Model of text representation based on concept [J]. Computer Engineering and Applications, 2008, 44(20): 162-164.
Authors: CHEN Long, FAN Rui-xia, GAO Qi
Affiliation: Beijing Institute of Technology, Beijing 100081, China
Abstract: Text information processing is moving in a semantic direction, but the currently dominant text representation model, the Vector Space Model (VSM), uses single words as feature items. It neglects the lexical relations between words, which lowers the precision of text information processing because synonymy and polysemy are widespread in natural language. This paper applies techniques and results from natural language processing and introduces concepts and concept distance into the Vector Space Model. An improved text representation model is then built that uses concepts as the feature items of a text, from the perspective of semantics and concepts. Experiments show that this method handles synonymy and polysemy well and improves the precision and recall of text classification.
Keywords: text representation model; concept; distance of concept
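
The abstract's central idea, replacing word features with concept features in a vector space model, can be illustrated with a minimal sketch. This is not the authors' implementation: the `WORD_TO_CONCEPT` table, the concept labels, and the whitespace tokenizer are all hypothetical, and the sketch leaves out both concept distance and the context-based disambiguation that polysemous words would require.

```python
import math
from collections import Counter

# Hypothetical word-to-concept table (an assumption for illustration only).
# A real system would derive this from a lexical resource and would need
# context to pick the right concept for a polysemous word.
WORD_TO_CONCEPT = {
    "car": "VEHICLE",
    "automobile": "VEHICLE",
    "fast": "SPEED",
    "quick": "SPEED",
}


def word_vector(text):
    """Classic VSM: one dimension per surface word."""
    return Counter(text.lower().split())


def concept_vector(text):
    """Concept-based VSM: map each word to its concept when one is known,
    otherwise keep the word itself as the feature."""
    return Counter(WORD_TO_CONCEPT.get(w, w) for w in text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


doc1 = "the car is fast"
doc2 = "the automobile is quick"

word_sim = cosine(word_vector(doc1), word_vector(doc2))
concept_sim = cosine(concept_vector(doc1), concept_vector(doc2))
print(word_sim, concept_sim)
```

In this toy case the two sentences share only function words at the word level, while their concept vectors coincide, which is the synonym effect the abstract describes.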
This article is indexed in CNKI, Wanfang Data, and other databases.