Text Classification Based on Maximal Association Rule

Cite this article:	He Yu, Feng Jianlin, Wang Yuanzhen. Text Classification Based on Maximal Association Rule [J]. Computer Science, 2006, 33(11): 143-145.

Authors:	He Yu, Feng Jianlin, Wang Yuanzhen

Affiliation:	School of Computer Science, Huazhong University of Science and Technology, Wuhan 430074

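Abstract:	We propose a novel text classification method based on maximal association, called SAT-MOD. When mining frequent itemsets and association rules for text classification, previous methods usually treat an entire document as a single transaction, yet the basic semantic unit of a text is actually the sentence. A group of words that co-occur in one sentence is more strongly related semantically than a group of words that merely co-occur in the same document. Based on this consideration, SAT-MOD treats certain sentences of a document as a separate transaction. Extensive experiments on standard text collections demonstrate the effectiveness of SAT-MOD.
Keywords:	Text classification; association rules; maximal frequent itemsets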
Text Classification Based on Maximal Association Rule

HE Yu, FENG Jian-Lin, WANG Yuan-Zhen. Text Classification Based on Maximal Association Rule [J]. Computer Science, 2006, 33(11): 143-145.

Authors:	HE Yu, FENG Jian-Lin, WANG Yuan-Zhen

Affiliation:	Department of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074

Abstract:	We propose a novel association-based method for text classification called SAT-MOD. Previous methods mainly mined frequently co-occurring words (frequent itemsets) at the document level, but the basic semantic unit of a document is the sentence. Words within the same sentence are typically more closely related semantically than words that merely appear in the same document. SAT-MOD therefore views a sentence, rather than a document, as a transaction. Extensive experiments on popular benchmark text collections demonstrate the effectiveness of the proposed method.

Keywords:	Text classification; association rules; maximal frequent itemsets
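
The difference between sentence-level and document-level transactions described in the abstract can be illustrated with a minimal sketch. This is not the SAT-MOD implementation: the regex tokenizer, the helper names (`sentence_transactions`, `document_transaction`, `frequent_pairs`), the sample text, and the support threshold are all assumptions made for illustration, and the maximal frequent itemset mining and the classifier itself are not reproduced here.

```python
import re
from collections import Counter
from itertools import combinations


def sentence_transactions(document):
    """Hypothetical helper: one transaction (set of lowercase words) per sentence."""
    transactions = []
    for sentence in re.split(r"[.!?]+", document):
        words = set(re.findall(r"[a-z]+", sentence.lower()))
        if words:  # skip empty fragments left by the split
            transactions.append(words)
    return transactions


def document_transaction(document):
    """Hypothetical helper: the whole document as a single transaction."""
    return set(re.findall(r"[a-z]+", document.lower()))


def frequent_pairs(transactions, min_support):
    """Count word pairs per transaction; keep pairs seen in at least min_support transactions."""
    counts = Counter()
    for transaction in transactions:
        for pair in combinations(sorted(transaction), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Illustrative text only; not taken from the paper's benchmark collections.
doc = "Stocks fell sharply. Bond yields rose. Stocks fell again."

sent_tx = sentence_transactions(doc)
doc_tx = document_transaction(doc)

# Sentence level: only words that share a sentence can form a pair.
sentence_level = frequent_pairs(sent_tx, min_support=2)

# Document level: every pair of words in the document co-occurs once.
document_level = frequent_pairs([doc_tx], min_support=1)

print(sentence_level)
print(("bond", "stocks") in document_level)
```

In this sketch, "bond" and "stocks" form a pair at the document level even though they never share a sentence, whereas the sentence-level view keeps only pairs such as "fell" and "stocks" that repeatedly occur together in a sentence.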
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.