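Research on a microblog hot-topic detection algorithm based on temporal characteristics
Cite this article: YAN Guang-hui, ZHAO Hong-yun, REN Ya-jin, CHEN Yong. Research on a microblog hot-topic detection algorithm based on temporal characteristics[J]. Application Research of Computers, 2014, 31(1): 43-46.
Authors: YAN Guang-hui  ZHAO Hong-yun  REN Ya-jin  CHEN Yong
Affiliation: 1. Harbin University of Science and Technology, a. School of Computer Science and Technology; b. School of Software, Harbin 150080; 2. College of Information and Communication Engineering, Harbin Engineering University, Harbin 150001
Funding: Science and Technology Research Project of the Heilongjiang Provincial Department of Education (12531106)
Abstract: To improve the quality of word sense disambiguation, the context of each ambiguous word is analyzed structurally, and a method that uses syntactic knowledge to guide the disambiguation process is proposed. Syntactic information and part-of-speech information are extracted from the parse tree of the ambiguous word's context as disambiguation features, and a naive Bayes model serves as the disambiguation classifier. The classifier's parameters are optimized on a sense-annotated corpus, and the classifier is then applied to the ambiguous words in the test data. Experimental results show that disambiguation accuracy improves, reaching 66.7%.

Keywords: word sense disambiguation  syntactic information  part of speech  disambiguation classifier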

Chinese word sense disambiguation based on parsing analysis
YAN Guang-hui,ZHAO Hong-yun,REN Ya-jin,CHEN Yong.Chinese word sense disambiguation based on parsing analysis[J].Application Research of Computers,2014,31(1):43-46.
Authors:YAN Guang-hui  ZHAO Hong-yun  REN Ya-jin  CHEN Yong
Affiliation:1. a. School of Computer Science & Technology, b. School of Software, Harbin University of Science & Technology, Harbin 150080, China; 2. College of Information & Communication Engineering, Harbin Engineering University, Harbin 150001, China
Abstract: To improve the quality of word sense disambiguation, this paper analyzes the structure of the context surrounding an ambiguous word and proposes a disambiguation method guided by parsing knowledge. Parsing information and part-of-speech tags are extracted from the parse tree of the context as disambiguation features. A naive Bayesian model serves as the disambiguation classifier, and its parameters are trained on a sense-annotated corpus. The classifier is then applied to the ambiguous words in the test data. Experimental results show that disambiguation accuracy improves, reaching 66.7%.
Keywords: word sense disambiguation (WSD)  parsing information  part of speech  disambiguation classifier
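
The abstract describes the classifier only at a high level, so the following is a minimal sketch of a naive Bayes sense classifier over categorical features, not the authors' implementation. The feature strings (`pos=...`, `parent=...`, `left_pos=...`), the sense labels, and the four training examples are hypothetical placeholders standing in for parse-tree and part-of-speech features; add-one smoothing is likewise an assumption, since the abstract does not say how parameters are estimated.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesWSD:
    """Naive Bayes over categorical features with add-one smoothing (assumed)."""

    def __init__(self):
        self.sense_counts = Counter()                # sense -> number of examples
        self.feature_counts = defaultdict(Counter)   # sense -> feature -> count
        self.vocabulary = set()                      # all features seen in training

    def train(self, examples):
        # examples: iterable of (features, sense); features is a list of strings
        for features, sense in examples:
            self.sense_counts[sense] += 1
            for feature in features:
                self.feature_counts[sense][feature] += 1
                self.vocabulary.add(feature)

    def predict(self, features):
        total = sum(self.sense_counts.values())
        vocab_size = len(self.vocabulary)
        best_sense, best_score = None, float("-inf")
        for sense, count in self.sense_counts.items():
            # log prior plus sum of smoothed log likelihoods
            score = math.log(count / total)
            denom = sum(self.feature_counts[sense].values()) + vocab_size
            for feature in features:
                score += math.log((self.feature_counts[sense][feature] + 1) / denom)
            if score > best_score:
                best_sense, best_score = sense, score
        return best_sense


# Hypothetical toy data: features one might read off a parse tree
# (POS of the target word, label of its parent node, POS of the left neighbor).
training = [
    (["pos=n", "parent=NP", "left_pos=v"], "sense_A"),
    (["pos=n", "parent=NP", "left_pos=adj"], "sense_A"),
    (["pos=v", "parent=VP", "left_pos=adv"], "sense_B"),
    (["pos=v", "parent=VP", "left_pos=n"], "sense_B"),
]

clf = NaiveBayesWSD()
clf.train(training)
print(clf.predict(["pos=v", "parent=VP", "left_pos=v"]))
```

On this toy data the verb-like context is assigned `sense_B`. In a real setting the feature lists would come from a syntactic parser and a sense-annotated corpus, neither of which is specified in the abstract.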