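Automatic Recognition of Chinese Person Names Based on Support Vector Machines
Cite this article: 李丽双, 黄德根, 毛婷婷, 徐潇潇. 基于支持向量机的中国人名的自动识别[J]. 计算机工程, 2006, 32(19): 188-190
Authors: 李丽双 (LI Lishuang), 黄德根 (HUANG Degen), 毛婷婷 (MAO Tingting), 徐潇潇 (XU Xiaoxiao)
Affiliation: Department of Computer Science and Engineering, Dalian University of Technology, Dalian 116023
Abstract: A method for automatically recognizing person names in Chinese text based on support vector machines (SVMs) is proposed and implemented. The training text is automatically segmented, part-of-speech tagged, and class-labeled. Features are then extracted character by character and converted to a binary representation, from which the training set is built. Machine learning models for SVM-based person name recognition are then obtained by testing polynomial kernel functions. Experimental results show that the resulting SVM person name recognition models are effective.

Keywords: support vector machines; Chinese text; person name recognition; machine learning
Article ID: 1000-3428(2006)19-0188-03
Received: 2005-10-30
Revised: 2005-10-30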
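
The abstract describes extracting features character by character, turning them into a binary vector, and comparing samples with a polynomial kernel. The sketch below illustrates that idea only. The character inventory, POS tag set, surname set, window width, and all function names are hypothetical toy choices, not the authors' actual configuration, and the probability feature is left out for brevity.

```python
from typing import List, Sequence

# Hypothetical toy inventories (assumptions for illustration only); the
# real character set, POS tag set, and surname table are not given here.
CHARS = ["李", "丽", "双", "说", "<PAD>"]
POS_TAGS = ["n", "v", "<PAD>"]
SURNAMES = {"李", "黄", "毛", "徐"}


def one_hot(value: str, inventory: Sequence[str]) -> List[int]:
    """Return a binary indicator vector with a 1 at the position of `value`."""
    return [1 if item == value else 0 for item in inventory]


def encode_char(ch: str, pos: str) -> List[int]:
    """Binary features for one character: identity, POS tag, surname flag."""
    surname_flag = [1 if ch in SURNAMES else 0]
    return one_hot(ch, CHARS) + one_hot(pos, POS_TAGS) + surname_flag


def encode_window(chars: Sequence[str], tags: Sequence[str],
                  i: int, width: int = 1) -> List[int]:
    """Concatenate the features of characters i - width .. i + width.

    Positions outside the sentence are encoded as padding, so every
    sample has the same vector length.
    """
    vec: List[int] = []
    for j in range(i - width, i + width + 1):
        if 0 <= j < len(chars):
            vec += encode_char(chars[j], tags[j])
        else:
            vec += encode_char("<PAD>", "<PAD>")
    return vec


def poly_kernel(x: Sequence[int], y: Sequence[int],
                degree: int = 2, coef0: float = 1.0) -> float:
    """Polynomial kernel K(x, y) = (x . y + coef0) ** degree."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + coef0) ** degree


chars = ["李", "丽", "双", "说"]
tags = ["n", "n", "n", "v"]
x0 = encode_window(chars, tags, 0)  # sample centred on "李"
x1 = encode_window(chars, tags, 1)  # sample centred on "丽"
print(len(x0), poly_kernel(x0, x1))
```

With binary vectors, the dot product inside the kernel simply counts the features two samples share, so a degree-2 polynomial kernel implicitly weighs pairs of co-occurring features (for example, "is a surname" together with "next character is tagged as a noun").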

Auto Recognition of Person Names from Chinese Texts Based on Support Vector Machines
LI Lishuang,HUANG Degen,MAO Tingting,XU Xiaoxiao. Auto Recognition of Person Names from Chinese Texts Based on Support Vector Machines[J]. Computer Engineering, 2006, 32(19): 188-190
Authors: LI Lishuang, HUANG Degen, MAO Tingting, XU Xiaoxiao
Affiliation: Department of Computer Science and Engineering, Dalian University of Technology, Dalian 116023
Abstract: Based on the characteristics of person names in Chinese texts, a method for automatically recognizing Chinese person names using support vector machines (SVMs) is proposed. The features of each vector are the character itself, its character-based POS tag, whether the character appears in a surname table, the probability of the character occurring in person names, and context information. Each sample is represented by a long binary vector, and a training set is built from these samples. The machine learning models for person name identification are obtained by testing polynomial kernel functions. The results show that the models are effective at identifying person names in Chinese texts. In the open test, recall, precision and F-measure reach 92.14%, 96.43% and 94.24% respectively.
Keywords: Support vector machines (SVM); Chinese texts; Recognition of person names; Machine learning
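
As a quick consistency check on the reported open-test figures, the snippet below computes the F-measure from the stated recall and precision. It assumes the balanced form (beta = 1, the harmonic mean of precision and recall). The abstract does not say which weighting was used, and the function name is illustrative.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Open-test figures quoted in the abstract, in percent.
recall = 92.14
precision = 96.43

f1 = f_measure(precision, recall)
print(f"F-measure: {f1:.2f}%")
```

Under that assumption the result rounds to the reported 94.24%, which suggests the three figures are mutually consistent.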
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.