
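A Segmentation and Tagging System for Processing Chinese Corpora
Cite this article: XU Jing, ZHANG Hui, LU Ruzhan. A Segmentation and Tagging System for Processing Chinese Corpora [J]. Computer Engineering, 2003, 29(9): 66-68, 165.
Authors: XU Jing  ZHANG Hui  LU Ruzhan
Affiliation: Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200030
Abstract: This paper introduces SegPos, a system for the coarse-grained segmentation and tagging of Chinese corpora. The system segments text with a prefix-code word segmentation algorithm and tags parts of speech with a bigram grammar model. It resolves the ambiguities that arise during segmentation and tagging by combining several methods: probabilistic statistics, rules, an ambiguity database, and partial syntactic analysis.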
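
The abstract does not spell out the prefix-code segmentation algorithm, so the following is only a rough sketch of one common dictionary-driven approach: forward longest-match over a prefix tree (trie). The lexicon, the function names `build_trie` and `segment`, and the single-character fallback for unknown text are all assumptions made for illustration and are not taken from SegPos.

```python
END = ""  # marker key for "a word ends here"; never equal to a single character


def build_trie(words):
    """Build a nested-dict prefix tree from an iterable of lexicon words."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = True
    return root


def segment(text, trie):
    """Greedy forward longest-match segmentation (illustrative only)."""
    tokens = []
    i = 0
    while i < len(text):
        node = trie
        longest = i + 1  # fall back to one character if no lexicon word matches
        j = i
        while j < len(text) and text[j] in node:
            node = node[text[j]]
            j += 1
            if END in node:
                longest = j  # remember the longest lexicon word seen so far
        tokens.append(text[i:longest])
        i = longest
    return tokens


# Hypothetical toy lexicon, for demonstration only.
lexicon = ["上海", "交通", "上海交通大学", "计算机", "科学"]
trie = build_trie(lexicon)
print(segment("上海交通大学计算机科学", trie))
```

Greedy longest-match cannot by itself resolve overlapping ambiguities, which is consistent with the abstract's point that additional statistical and rule-based disambiguation is needed.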

Keywords: segmentation; tagging; corpus; natural language understanding
Article ID: 1000-3428(2003)09-0066-03

A Chinese Corpus Segmentation and Part of Speech Tagging System
XU Jing,ZHANG Hui,LU Ruzhan.A Chinese Corpus Segmentation and Part of Speech Tagging System[J].Computer Engineering,2003,29(9):66-68,165.
Authors:XU Jing  ZHANG Hui  LU Ruzhan
Abstract: This paper presents the authors' recent work on developing SegPos, a system for segmenting and part-of-speech tagging Chinese corpora. SegPos segments text with a prefix-code method and tags it with a bigram grammar model. It combines several methods, namely statistics, rules, an ambiguity table, and partial syntactic analysis, to reduce the ambiguity that arises during segmentation and tagging.
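
To make the bigram tagging step concrete, here is a minimal sketch of bigram part-of-speech tagging with Viterbi decoding. The abstract states only that a bigram model is used; the decoding procedure, the tag set (`r`, `v`, `n`), the function name `viterbi_tag`, and every probability below are hypothetical toy values chosen for illustration, not figures from the paper.

```python
import math


def viterbi_tag(words, tags, trans, emit, start="<s>", floor=1e-6):
    """Return the tag sequence maximizing prod P(t_i | t_{i-1}) * P(w_i | t_i)."""
    if not words:
        return []

    def logp(table, key):
        # Unseen events get a small floor probability (a crude smoothing assumption).
        return math.log(table.get(key, floor))

    # best[t] = (log score, tag path) of the best sequence ending in tag t
    best = {t: (logp(trans, (start, t)) + logp(emit, (t, words[0])), [t])
            for t in tags}
    for word in words[1:]:
        nxt = {}
        for t in tags:
            score, path = max(
                ((s + logp(trans, (prev, t)), p) for prev, (s, p) in best.items()),
                key=lambda item: item[0],
            )
            nxt[t] = (score + logp(emit, (t, word)), path + [t])
        best = nxt
    return max(best.values(), key=lambda item: item[0])[1]


# Hypothetical toy model: r = pronoun, v = verb, n = noun.
tags = ["r", "v", "n"]
trans = {("<s>", "r"): 0.6, ("<s>", "n"): 0.3, ("<s>", "v"): 0.1,
         ("r", "v"): 0.7, ("r", "n"): 0.3,
         ("v", "n"): 0.6, ("v", "v"): 0.2,
         ("n", "v"): 0.5, ("n", "n"): 0.4}
emit = {("r", "他"): 0.9,
        ("v", "计划"): 0.4, ("n", "计划"): 0.6,
        ("v", "旅行"): 0.5, ("n", "旅行"): 0.5}

print(viterbi_tag(["他", "计划", "旅行"], tags, trans, emit))
```

In this toy model the word 计划 is more likely a noun in isolation, yet the tag-transition context makes the verb reading win, which is the kind of tagging ambiguity a bigram model is meant to reduce.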
Keywords:Segmentation  Part of speech tagging  Corpus  Natural language understanding
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.