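User-generated Text Punctuation Labelling Based on Linear Chain Conditional Random Field
Cite this article: SU Xiao-bao, LIU Chen, TANG Li. User-generated Text Punctuation Labelling Based on Linear Chain Conditional Random Field[J]. Software (软件), 2019(4): 145-149.
Authors: SU Xiao-bao  LIU Chen  TANG Li
Affiliation: 1. School of Management, University of Shanghai for Science and Technology (上海理工大学管理学院)
Abstract: Correct punctuation plays an important role in part-of-speech tagging, named entity recognition, and dependency parsing of user-generated text; correctly labelled punctuation makes the grammatical structure of user-generated text accurate and complete. A linear chain conditional random field model can accommodate arbitrary non-independent features. This paper takes the pair of part-of-speech tags to the left and right of each punctuation position as the model's observation sequence and uses a conditional random field to label the punctuation. The test corpus consists of online product reviews from JD.com (京东), and the results show that punctuation labelling of user-generated text based on a linear chain conditional random field is relatively efficient.

Keywords: Linear chain conditional random field  User-generated text  Part-of-speech  Feature template  Punctuation labelling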

User-generated Text Punctuation Labelling Based on Linear Chain Conditional Random Field
SU Xiao-bao, LIU Chen, TANG Li. User-generated Text Punctuation Labelling Based on Linear Chain Conditional Random Field[J]. Software, 2019(4): 145-149.
Authors:SU Xiao-bao  LIU Chen  TANG Li
Affiliation:(Business School, University of Shanghai for Science & Technology, Shanghai 200082, China)
Abstract: Correct punctuation plays an important role in part-of-speech tagging, named entity recognition, and dependency parsing of user-generated text, and proper punctuation labelling makes the grammatical structure of such text accurate and complete. The linear chain conditional random field model can accommodate arbitrary non-independent features. In this paper, we use the pair of part-of-speech tags to the left and right of each punctuation position as the observation sequence of the model, and use the conditional random field to label the punctuation. The test corpus consists of Jingdong (JD.com) online product reviews, and the results show that punctuation labelling of user-generated text based on the linear chain conditional random field is relatively efficient.
Keywords: Linear chain conditional random field  User-generated text  Part-of-speech  Feature template  Punctuation labelling
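
The abstract describes the observation sequence only in outline, so the following is a minimal sketch of one way it could be built, not the authors' implementation. It assumes a POS-tagged, punctuated sentence is given as (word, tag) pairs, treats each gap after a word as a position to label, and uses the (left tag, right tag) pair as that position's observation. The tag names, the label set, the "O" (no punctuation) and "EOS" markers, and the toy weights are all hypothetical; a real model would learn its weights from the corpus. A plain Viterbi decoder is included to show how a linear chain model would pick the best label sequence from such weights.

```python
from typing import Dict, List, Tuple

# Hypothetical label set: punctuation marks plus "O" for "no punctuation here".
PUNCT = {",", "。", "!", "?"}
NO_PUNCT = "O"
LABELS = [NO_PUNCT, ",", "。"]

Obs = Tuple[str, str]  # (POS tag left of the gap, POS tag right of the gap)


def gaps_from_tagged(tokens: List[Tuple[str, str]]) -> Tuple[List[Obs], List[str]]:
    """Turn a tagged, punctuated sentence into per-gap observations and labels."""
    words: List[Tuple[str, str]] = []
    labels: List[str] = []
    for word, pos in tokens:
        if word in PUNCT:
            if labels:              # punctuation labels the gap after the previous word
                labels[-1] = word
        else:
            words.append((word, pos))
            labels.append(NO_PUNCT)
    obs: List[Obs] = []
    for i, (_, pos) in enumerate(words):
        right = words[i + 1][1] if i + 1 < len(words) else "EOS"
        obs.append((pos, right))
    return obs, labels


def viterbi(obs: List[Obs], labels: List[str],
            emit: Dict[Tuple[Obs, str], float],
            trans: Dict[Tuple[str, str], float]) -> List[str]:
    """Best label sequence under additive scores; missing weights count as 0."""
    if not obs:
        return []
    score = {y: emit.get((obs[0], y), 0.0) for y in labels}
    back: List[Dict[str, str]] = []
    for x in obs[1:]:
        new_score, ptr = {}, {}
        for y in labels:
            prev = max(labels, key=lambda yp: score[yp] + trans.get((yp, y), 0.0))
            new_score[y] = score[prev] + trans.get((prev, y), 0.0) + emit.get((x, y), 0.0)
            ptr[y] = prev
        score = new_score
        back.append(ptr)
    y = max(labels, key=lambda l: score[l])
    path = [y]
    for ptr in reversed(back):      # follow back-pointers from the last gap
        y = ptr[y]
        path.append(y)
    return path[::-1]


# Toy review sentence with illustrative tags (n = noun, d = adverb, a = adjective, w = punctuation).
SAMPLE = [("手机", "n"), ("很", "d"), ("好", "a"), (",", "w"),
          ("物流", "n"), ("快", "a"), ("。", "w")]

# Hand-set weights standing in for learned ones (assumption, for illustration only).
TOY_EMIT = {(("a", "n"), ","): 2.0, (("a", "EOS"), "。"): 2.0}

if __name__ == "__main__":
    observations, gold = gaps_from_tagged(SAMPLE)
    print(observations)
    print(gold)
    print(viterbi(observations, LABELS, TOY_EMIT, {}))
```

With these toy weights the decoder places a comma after the adjective-noun gap and a full stop at the end of the sentence, matching the labels read off the sample. In practice the (left tag, right tag) pairs would be fed to a CRF toolkit through its feature template rather than scored by hand.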
This article is indexed in databases including VIP (维普).