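A Chinese address recognition method based on address semantic understanding
Cite this article: Li Xiaolin, Zhang Yi, Li Lin. A Chinese address recognition method based on address semantic understanding [J]. Computer Engineering & Science, 2019, 41(3): 551-558.
Authors: Li Xiaolin, Zhang Yi, Li Lin
Affiliations: Hubei Key Laboratory of Intelligent Robot, Wuhan Institute of Technology, Wuhan, Hubei 430205; School of Resource and Environmental Science, Wuhan University, Wuhan, Hubei 430079
Funding: National Key R&D Program of China, 13th Five-Year Plan (2017YFB0503701); National 863 Program (2013AA12A202); Special Scientific Research Fund for Public Welfare in Surveying, Mapping and Geoinformation (201412014); Natural Science Foundation of Hubei Province (2013CFA125)
Abstract: Chinese address text on the Internet carries rich spatial location information. To extract the address location information in such text more effectively, this paper proposes a recognition method based on address semantic understanding. Word frequency statistics over a training corpus are used to define a set of address element feature characters and the transition probabilities between them, from which a feature character transition probability matrix is built. Combined with a string maximum joint probability algorithm, this yields an address recognition method that relies on neither a place-name dictionary nor part-of-speech tagging. Experiments show that, for Chinese addresses whose element feature characters are prominent and ambiguous, the method achieves an exact match rate of 76.85% and a recognition accuracy of 93.11%. Finally, comparative experiments against a mechanical matching algorithm and a method whose transition probability matrix is built from experience demonstrate the usability and effectiveness of the method.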

Keywords: address semantics; element feature characters; transition probability; dictionary-free
Received: 2017-12-25
Revised: 2019-03-25

A Chinese address recognition method based on address semantics
LI Xiaolin, ZHANG Yi, LI Lin. A Chinese address recognition method based on address semantics[J]. Computer Engineering & Science, 2019, 41(3): 551-558.
Authors: LI Xiaolin, ZHANG Yi, LI Lin
Affiliation:(1.Hubei Key Laboratory of Intelligent Robot,Wuhan Institute of Technology,Wuhan 430205; 2.School of Resource and Environmental Science,Wuhan University,Wuhan 430079,China)  
Abstract: Chinese address text on the Internet contains rich spatial location information. To obtain the address location information in such text more effectively, we propose a Chinese address recognition method based on address semantics. From word frequency statistics over a training corpus, we derive a set of address element feature characters and their transition probabilities, and construct a feature character transition probability matrix. Combining this matrix with a string maximum joint probability algorithm, we obtain an address recognition method that depends on neither an address dictionary nor part-of-speech tagging. Experimental results show that, for ambiguous Chinese addresses with prominent feature characters, the method achieves an exact match rate of 76.85% and a recognition accuracy of 93.11%. Comparisons with a mechanical matching algorithm and with a method that constructs the transition probability matrix from experience verify the feasibility and effectiveness of the proposed method.
Keywords: address semantics; element feature characters; transition probability; dictionary-free
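
The abstract gives no implementation details, so the following is only a minimal sketch of the general idea: estimate a transition matrix between feature characters from pre-segmented addresses, then pick the segmentation of a new address with the highest joint probability. Everything named here is an assumption for illustration and not taken from the paper: the feature character set `FEATURE_CHARS`, the toy corpus `TOY_CORPUS`, additive smoothing, the constant `p_skip` penalty for a feature character that falls inside a name, and the use of dynamic programming to find the maximum.

```python
import math
from collections import Counter, defaultdict

# Assumed feature characters; the paper derives its own set from corpus statistics.
FEATURE_CHARS = "省市区县镇乡村路街道巷号"
START = "<s>"  # hypothetical start-of-address marker


def train_transitions(segmented_corpus, smoothing=1e-3):
    """Estimate P(next feature char | previous feature char) with additive smoothing."""
    counts = defaultdict(Counter)
    for elements in segmented_corpus:
        # Each address element is represented by its final (feature) character.
        tags = [START] + [element[-1] for element in elements]
        for prev, nxt in zip(tags, tags[1:]):
            counts[prev][nxt] += 1
    matrix = {}
    for prev in [START, *FEATURE_CHARS]:
        total = sum(counts[prev].values()) + smoothing * len(FEATURE_CHARS)
        matrix[prev] = {
            nxt: (counts[prev][nxt] + smoothing) / total for nxt in FEATURE_CHARS
        }
    return matrix


def segment(address, matrix, p_skip=0.1):
    """Split an address at the feature characters that maximise the joint probability.

    p_skip is an assumed probability that a feature character occurs inside a
    name (for example the 市 in 都市路) instead of closing an element.
    """
    cuts = [i for i, ch in enumerate(address) if ch in FEATURE_CHARS]
    if not cuts or cuts[-1] != len(address) - 1:
        raise ValueError("this sketch assumes the address ends with a feature character")
    log_skip = math.log(p_skip)
    best = []  # best[k] = (log-probability, index into cuts of the previous boundary)
    for k, i in enumerate(cuts):
        # Option 1: cuts[k] is the first boundary, so k earlier candidates are skipped.
        candidates = [(math.log(matrix[START][address[i]]) + k * log_skip, None)]
        # Option 2: the previous boundary is cuts[m]; candidates in between are skipped.
        for m in range(k):
            step = math.log(matrix[address[cuts[m]]][address[i]])
            candidates.append((best[m][0] + step + (k - m - 1) * log_skip, m))
        best.append(max(candidates, key=lambda c: c[0]))
    # Backtrack from the final character, which is always a boundary here.
    boundaries, k = [], len(cuts) - 1
    while k is not None:
        boundaries.append(cuts[k])
        k = best[k][1]
    boundaries.reverse()
    elements, start = [], 0
    for b in boundaries:
        elements.append(address[start:b + 1])
        start = b + 1
    return elements


# Toy training data (hypothetical), already split into address elements.
TOY_CORPUS = [
    ["湖北省", "武汉市", "洪山区", "珞喻路", "129号"],
    ["上海市", "闵行区", "东川路", "800号"],
    ["北京市", "海淀区", "中关村大街", "27号"],
    ["广东省", "深圳市", "南山区", "科技路", "1号"],
]
matrix = train_transitions(TOY_CORPUS)
print(segment("上海市闵行区都市路100号", matrix))
```

On this toy corpus the second 市 in 上海市闵行区都市路100号 is kept inside 都市路, because a 区 to 市 transition was never observed and is therefore far less likely than skipping the character. No dictionary of place names is consulted, which mirrors the dictionary-free property the abstract claims, though the paper's own probability model may differ.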
This article is indexed in Wanfang Data and other databases.