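Machine learning based Uyghur language text categorization

Cite this article:	Alimjan AYSA, Turgun IBRAHIM, Hasan OMAR, Marhaba ALI. Machine learning based Uyghur language text categorization[J]. Computer Engineering and Applications, 2012, 48(5): 110-112.

Authors:	Alimjan AYSA, Turgun IBRAHIM, Hasan OMAR, Marhaba ALI

Affiliation:	1. Modern Education Technology Center, Xinjiang University, Urumqi 830046; College of Information Science and Engineering, Xinjiang University, Urumqi 830046 2. College of Information Science and Engineering, Xinjiang University, Urumqi 830046

Funding:	National Natural Science Foundation of China (No. 61063026, 60963018)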

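Abstract:	With the rapid growth of Uyghur language information on the Internet, Uyghur text categorization has become a key technology for processing and organizing this large volume of text data. This paper studies techniques and methods for Uyghur text categorization. To address the high dimensionality of Uyghur text under the vector space model (VSM) representation, stemming is combined with information gain (IG) to reduce the dimensionality of the representation space. Categorization experiments are carried out on a Uyghur text corpus using machine learning based classification algorithms (kNN and Naïve Bayes), and the experimental results are analyzed.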
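
The dimensionality reduction step mentioned in the abstract can be illustrated with a minimal sketch of information gain (IG) scoring over binary term presence. The abstract does not give the exact formulation the authors used, so this is the standard textbook definition, IG(t) = H(C) - [P(t) H(C|t) + P(not t) H(C|not t)], and the function names `entropy`, `information_gain`, and `select_top_k` are hypothetical. Documents are assumed to arrive as lists of already stemmed tokens; the stemmer itself is not shown.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy in bits of a distribution given as raw counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(docs, labels):
    """Return {term: IG score} using binary term presence per document.

    docs   -- list of token lists (assumed to be stemmed already)
    labels -- list of class labels, one per document
    """
    n = len(docs)
    class_counts = Counter(labels)
    classes = list(class_counts)
    h_c = entropy(class_counts.values())

    # For each term, count how many documents of each class contain it.
    present = {}
    for tokens, label in zip(docs, labels):
        for term in set(tokens):
            present.setdefault(term, Counter())[label] += 1

    scores = {}
    for term, with_term in present.items():
        n_with = sum(with_term.values())
        n_without = n - n_with
        # Class counts among the documents that do NOT contain the term.
        without = [class_counts[c] - with_term[c] for c in classes]
        h_cond = (n_with / n) * entropy(with_term.values()) \
            + (n_without / n) * entropy(without)
        scores[term] = h_c - h_cond
    return scores


def select_top_k(scores, k):
    """Keep the k highest-scoring terms (ties broken alphabetically)."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]
```

Terms that occur in only one class score highest, while terms spread evenly across classes score zero and are the first to be dropped when the vocabulary is truncated to the top k.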
Keywords:	text categorization; Naïve Bayes; k-nearest neighbor (kNN); Uyghur language; feature selection
Machine learning based Uyghur language text categorization

Alimjan AYSA, Turgun IBRAHIM, Hasan OMAR, Marhaba ALI. Machine learning based Uyghur language text categorization[J]. Computer Engineering and Applications, 2012, 48(5): 110-112.

Authors:	Alimjan AYSA, Turgun IBRAHIM, Hasan OMAR, Marhaba ALI

Affiliation:	1. Modern Education Technology Center, Xinjiang University, Urumqi 830046, China 2. College of Information Science and Engineering, Xinjiang University, Urumqi 830046, China

Abstract:	With the rapid increase of Uyghur language text on the Internet, Uyghur language text categorization has become a key technique for processing and organizing these data. To address the high dimensionality of Uyghur language texts under the vector space model (VSM) representation, stemming is used together with information gain (IG) to reduce the dimensionality. Categorization experiments are performed on a Uyghur language text corpus using the machine learning based algorithms Naïve Bayes and kNN, and the experimental results are analyzed.
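
The two classifiers named in the abstract can be sketched in a few lines each. The abstract does not specify the similarity measure, the value of k, or the Naïve Bayes variant, so the choices here (cosine similarity over raw term counts, majority voting, and a multinomial model with add-one smoothing) are assumptions, as are the names `cosine`, `knn_predict`, and `NaiveBayes`. Documents are assumed to be lists of stemmed tokens restricted to the selected features.

```python
import math
from collections import Counter, defaultdict


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors (dicts)."""
    dot = sum(w * b.get(t, 0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_predict(train_docs, train_labels, tokens, k=3):
    """Label a document by majority vote among its k most similar neighbors."""
    query = Counter(tokens)
    sims = sorted(
        ((cosine(query, Counter(doc)), label)
         for doc, label in zip(train_docs, train_labels)),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter(label for _, label in sims[:k])
    return votes.most_common(1)[0][0]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.n_docs = len(docs)
        self.class_docs = Counter(labels)        # documents per class
        self.term_counts = defaultdict(Counter)  # term counts per class
        for tokens, label in zip(docs, labels):
            self.term_counts[label].update(tokens)
        self.vocab = {t for c in self.term_counts.values() for t in c}
        return self

    def predict(self, tokens):
        best_label, best_score = None, -math.inf
        for label, n_c in self.class_docs.items():
            total = sum(self.term_counts[label].values())
            score = math.log(n_c / self.n_docs)  # log prior
            for t in tokens:
                if t in self.vocab:  # terms unseen in training are skipped
                    count = self.term_counts[label][t]
                    score += math.log((count + 1) / (total + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

Both classifiers take the same tokenized input, so they can be compared on one held-out split: call `knn_predict(train_docs, train_labels, doc)` and `NaiveBayes().fit(train_docs, train_labels).predict(doc)` for each test document and tally the correct predictions.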

Keywords:	text categorization; Naïve Bayes; k-Nearest Neighbor (kNN); Uyghur language; feature selection
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.