
Design for Chinese Text Classifier

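Cite this article:	Lu Jianjiang, Zhang Wenxian. Design for Chinese Text Classifier[J]. Computer Engineering and Applications, 2002, 38(15): 49-51.

Authors:	Lu Jianjiang, Zhang Wenxian

Affiliations:	1. Institute of Communications Engineering, PLA University of Science and Technology, Nanjing 210007; 2. Institute of Sciences, PLA University of Science and Technology, Nanjing 210007

Funding:	Key Program of the National Natural Science Foundation of China (No. 69931040)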

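Abstract:	Text categorization is the process of automatically determining the category of a text from its content under a given classification scheme. This paper applies the spherical k-means algorithm to determine the class label of each text and then builds a classifier with the Boosting algorithm. The resulting classifier has the following characteristics: it is designed for corpora whose class labels are unknown, which makes it practical; it can add new classes as the texts in the corpus change, which makes it extensible; and it is based on the Boosting algorithm, which gives it good classification accuracy.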
Keywords:	text categorization; Chinese text; machine learning; spherical k-means algorithm; Boosting algorithm
Article ID:	1002-8331-(2002)15-0049-03
Revised:	June 1, 2002
Design for Chinese Text Classifier

Lu Jianjiang, Zhang Wenxian. Design for Chinese Text Classifier[J]. Computer Engineering and Applications, 2002, 38(15): 49-51.

Authors:	Lu Jianjiang, Zhang Wenxian

Affiliation:	Lu Jianjiang 1 Zhang Wenxian 21

Abstract:	Text categorization is defined as the task of assigning a pre-defined category label to a new text. The spherical k-means algorithm is applied to obtain the category label of each text, and a classifier is built based on the Boosting algorithm. This classifier has the following characteristics: good practicability, good extensibility, and good classification precision.

Keywords:	text categorization; Chinese text; machine learning; spherical k-means algorithm; Boosting algorithm
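
The labeling step named in the abstract, spherical k-means, clusters unit-length document vectors by cosine similarity. The following is a minimal illustrative sketch only, not the authors' implementation: this record does not give the paper's feature representation, number of clusters, or initialization, so the toy term-count vectors, `k=2`, and the seeding of centroids with the first `k` documents are all assumptions. The cluster indices it returns would play the role of class labels for a subsequently trained boosted classifier, which is not shown here.

```python
import math

def normalize(v):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def dot(a, b):
    """Inner product; equals cosine similarity when both vectors have unit length."""
    return sum(x * y for x, y in zip(a, b))

def spherical_kmeans(docs, k, iters=20):
    """Cluster term-weight vectors by cosine similarity.

    docs: list of equal-length numeric vectors (for example, term counts).
    Returns (labels, centroids); labels[i] is the cluster index of docs[i].
    """
    unit_docs = [normalize(d) for d in docs]
    # Assumption: seed centroids with the first k documents so the run is deterministic.
    centroids = [list(d) for d in unit_docs[:k]]
    labels = [-1] * len(unit_docs)
    for _ in range(iters):
        # Assignment step: pick the centroid with the highest cosine similarity.
        new_labels = [
            max(range(k), key=lambda j: dot(d, centroids[j])) for d in unit_docs
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: sum the members of each cluster, then re-normalize the sum.
        for j in range(k):
            members = [d for d, lab in zip(unit_docs, labels) if lab == j]
            if members:  # keep the previous centroid if a cluster becomes empty
                centroids[j] = normalize([sum(col) for col in zip(*members)])
    return labels, centroids

# Toy term-count vectors over a hypothetical 4-word vocabulary.
docs = [
    [3, 1, 0, 0],
    [0, 0, 2, 4],
    [4, 2, 0, 1],
    [0, 1, 3, 3],
]
labels, centroids = spherical_kmeans(docs, k=2)
print(labels)
```

Re-normalizing each centroid after summing its members is what distinguishes the spherical variant from ordinary k-means: centroids stay on the unit sphere, so the assignment step compares directions rather than Euclidean distances.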
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.
