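Chinese Word Sense Induction Model by Integrating Distance Metric and Gaussian Mixture Model
Cite this article: ZHANG Yi-hao, LIU Zhi, ZHU Chang-peng. Chinese Word Sense Induction Model by Integrating Distance Metric and Gaussian Mixture Model[J]. Computer Science, 2017, 44(8): 265-269.
Authors: ZHANG Yi-hao, LIU Zhi, ZHU Chang-peng
Affiliation (all authors): College of Computer Science and Engineering, Chongqing University of Technology, Chongqing 400054
Funding: Supported by the Science and Technology Research Project of the Chongqing Municipal Education Commission (kj1500920, kj1500916) and the National Natural Science Foundation of China (61603065)
Abstract: Word sense induction is an important research topic in acquiring word sense knowledge, and applying clustering algorithms to induce word senses is currently the most widely used approach. By comparing the strengths of the K-Means and EM clustering algorithms in their respective word sense induction models, this paper proposes a new clustering algorithm that integrates a distance metric with a Gaussian mixture model. The aim is to use the respective strengths of the two algorithms, distance measurement and data distribution estimation, to exploit the geometric properties and the normal distribution information of the data in word sense clustering, and thereby improve the performance of the word sense induction model. Experimental results show that the proposed hybrid clustering algorithm is effective in improving the performance of the word sense induction model.

Keywords: word sense induction; distance metric; Gaussian mixture model; hybrid clustering
Received: 2016/11/10
Revised: 2017/2/17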

Chinese Word Sense Induction Model by Integrating Distance Metric and Gaussian Mixture Model
ZHANG Yi-hao, LIU Zhi and ZHU Chang-peng. Chinese Word Sense Induction Model by Integrating Distance Metric and Gaussian Mixture Model[J]. Computer Science, 2017, 44(8): 265-269.
Authors: ZHANG Yi-hao, LIU Zhi and ZHU Chang-peng
Affiliation (all authors): College of Computer Science and Engineering, Chongqing University of Technology, Chongqing 400054, China
Abstract: Word sense induction is an important research topic in acquiring word sense knowledge, and cluster analysis is currently the most widely used approach to it. By comparing the K-Means and EM clustering algorithms on word sense induction models, we propose a new hybrid clustering algorithm that integrates a distance metric with a Gaussian mixture model. It combines the respective strengths of the two algorithms, distance measurement and data distribution estimation, to exploit the geometric properties and the normal distribution information of the training data in cluster analysis, and thereby improve the performance of the word sense induction model. Experimental results show that the proposed hybrid clustering algorithm is effective in improving the performance of the word sense induction model.
Keywords:Word sense induction  Distance metric  Gaussian mixture model  Hybrid clustering
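
The abstract does not say how the distance metric and the Gaussian mixture model are fused, so the following is only a minimal sketch of one common combination, not the authors' algorithm: K-Means (Euclidean distance) supplies the initial partition, and EM then refines it under a diagonal-covariance Gaussian mixture. All function names (`kmeans`, `m_step`, `e_step`, `hybrid_cluster`) and the toy 2-D "context vectors" are hypothetical and chosen purely for illustration.

```python
import math
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain K-Means with Euclidean distance; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels

def log_gauss(p, mean, var):
    """Log density of a Gaussian with diagonal covariance."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
               for x, m, v in zip(p, mean, var))

def m_step(points, resp, k, floor=1e-6):
    """Re-estimate (weight, mean, variance) of each component from responsibilities."""
    dim = len(points[0])
    params = []
    for j in range(k):
        nj = sum(r[j] for r in resp) + 1e-12  # avoid division by zero
        mean = [sum(r[j] * p[d] for r, p in zip(resp, points)) / nj for d in range(dim)]
        var = [sum(r[j] * (p[d] - mean[d]) ** 2 for r, p in zip(resp, points)) / nj + floor
               for d in range(dim)]
        params.append((nj / len(points), mean, var))
    return params

def e_step(points, params):
    """Compute posterior responsibilities in log space for numerical stability."""
    resp = []
    for p in points:
        logs = [math.log(w) + log_gauss(p, m, v) for w, m, v in params]
        top = max(logs)
        exps = [math.exp(l - top) for l in logs]
        total = sum(exps)
        resp.append([e / total for e in exps])
    return resp

def hybrid_cluster(points, k, em_iters=30, seed=0):
    """Assumed fusion scheme: K-Means partition as the starting point for EM on a GMM."""
    labels = kmeans(points, k, seed=seed)
    resp = [[1.0 if l == j else 0.0 for j in range(k)] for l in labels]
    for _ in range(em_iters):
        params = m_step(points, resp, k)
        resp = e_step(points, params)
    return [max(range(k), key=lambda j: r[j]) for r in resp]

# Toy data: two well-separated groups standing in for contexts of two senses.
rng = random.Random(1)
contexts = ([(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(30)]
            + [(rng.gauss(5, 0.5), rng.gauss(5, 0.5)) for _ in range(30)])
senses = hybrid_cluster(contexts, k=2)
```

In this sketch each induced sense corresponds to one mixture component. In a real word sense induction setting, the context vectors would come from features of the sentences containing the target word, and the number of senses `k` would have to be chosen or estimated separately.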