Analysis of Word Segmentation and Part-of-Speech Tagging Errors in Web News Corpora (Web新闻语料分词和标注错误分析)

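Cite this article:	ZHANG Yong-kui, ZHANG Yan, AN Zeng-bo, LIU Rui. Web新闻语料分词和标注错误分析 [Analysis of word segmentation and part-of-speech tagging errors in Web news corpora][J]. 计算机工程与应用 (Computer Engineering and Applications), 2007, 43(15): 166-169.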

Authors:	ZHANG Yong-kui, ZHANG Yan, AN Zeng-bo, LIU Rui

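Affiliations:	1. School of Computer and Information Technology, Shanxi University, Taiyuan 030006; Key Laboratory of Computational Intelligence and Chinese Information Processing of the Ministry of Education (jointly established by the province and the ministry), Taiyuan 030006 2. Automation Workstation, Unit 91708 of the Chinese People's Liberation Army, Guangzhou 510320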

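Funding:	National Natural Science Foundation of China; Natural Science Foundation of Shanxi Province; Shanxi Province Research Start-up Fund for Returned Overseas Scholars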

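Abstract:	A statistical analysis of the processing of texts in a Web emergency-event corpus yields 11 types of error, and solutions are proposed for some of them. The results suggest improvements to word segmentation and tagging methods in the early stage of corpus processing, and also offer a reference for automatic proofreading of Chinese text.
Keywords:	Chinese information processing; word segmentation; part-of-speech tagging; error types; Web emergency-event news corpus
Article ID:	1002-8331(2007)15-0166-04
Revision date:	2006-12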
Analysis of inaccurate style in processing Web true news text about word segmentation and part of speech tagging

ZHANG Yong-kui, ZHANG Yan, AN Zeng-bo, LIU Rui. Analysis of inaccurate style in processing Web true news text about word segmentation and part of speech tagging[J]. Computer Engineering and Applications, 2007, 43(15): 166-169.

Authors:	ZHANG Yong-kui, ZHANG Yan, AN Zeng-bo, LIU Rui

Affiliation:	1. Department of Computer & Information Technology, Shanxi University, Taiyuan 030006, China 2. Key Laboratory of Ministry of Education for Computation Intelligence and Chinese Information Processing, Taiyuan 030006, China 3. Workstation Automation of 91708 PLA, Guangzhou 510320, China

Abstract:	Eleven types of error are identified by analyzing the processing of Web emergency news text, and solutions are proposed for some of them. The results not only suggest improvements to word segmentation and part-of-speech tagging methods in the early stages of corpus processing, but also provide a reference for automatic proofreading, another branch of Chinese information processing.

Keywords:	Chinese information processing; word segmentation; part-of-speech tagging; error types; Web emergency news corpus
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.