
Multi-label Text Classification Algorithm Based on Frequent Item Sets
Cite as: LV Xiao-yong, SHI Hong-bo. Multi-label Text Classification Algorithm Based on Frequent Item Sets[J]. Computer Engineering, 2010, 36(15): 83-85
Authors: LV Xiao-yong  SHI Hong-bo
Affiliation: Information Management Institute, Shanxi University of Finance & Economics, Taiyuan 030006
Funding: National Natural Science Foundation of China; Natural Science Foundation of Shanxi Province
Abstract: To address the problem of multi-label text classification, this paper proposes MLFI, a multi-label text classification algorithm based on frequent item sets. The algorithm uses FP-growth to mine frequent item sets among the class labels, and computes a prototype vector and a similarity threshold for each class. A text is assigned to a class if its similarity to that class's prototype vector is greater than the corresponding threshold. After classification, the association rules mined among the classes are used to verify the classification results. Experimental results show that the algorithm achieves relatively high classification performance.

Keywords: multi-label  similarity  frequent item sets  association rules
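
The pipeline summarized in the abstract (per-class prototype vectors with similarity thresholds, followed by a check against association rules between labels) can be illustrated with a minimal Python sketch. The abstract does not say how prototypes, thresholds, or the verification step are defined, so everything below is an assumption made for illustration: the prototype is taken to be the mean of a class's training vectors, the similarity is cosine similarity, the threshold is a fraction (`slack`) of the lowest member similarity, and verification adds any label implied by a rule. Brute-force pair counting stands in for FP-growth and only covers item sets of size two. All function and parameter names are hypothetical.

```python
import math
from collections import Counter
from itertools import permutations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def fit_prototypes(vectors, label_sets, slack=0.95):
    """Return {label: (prototype, threshold)}.

    Assumptions (not from the paper): prototype = mean of the class's
    training vectors; threshold = slack * lowest member similarity.
    """
    model = {}
    for label in sorted(set().union(*label_sets)):
        members = [v for v, ls in zip(vectors, label_sets) if label in ls]
        proto = [sum(col) / len(members) for col in zip(*members)]
        threshold = slack * min(cosine(v, proto) for v in members)
        model[label] = (proto, threshold)
    return model


def mine_pair_rules(label_sets, min_support=2, min_conf=0.8):
    """Mine rules a -> b between labels by brute-force pair counting.

    Stand-in for FP-growth, restricted to item sets of size two.
    """
    single = Counter(l for ls in label_sets for l in ls)
    pair = Counter(p for ls in label_sets for p in permutations(sorted(ls), 2))
    return {
        (a, b)
        for (a, b), n in pair.items()
        if n >= min_support and n / single[a] >= min_conf
    }


def classify(vector, model, rules):
    """Assign labels by threshold, then add labels implied by the rules."""
    labels = {
        label
        for label, (proto, threshold) in model.items()
        if cosine(vector, proto) > threshold
    }
    # Assumed verification step: apply rules until nothing changes.
    changed = True
    while changed:
        changed = False
        for a, b in rules:
            if a in labels and b not in labels:
                labels.add(b)
                changed = True
    return labels


# Toy training data: term-weight vectors and their label sets.
train_vectors = [[1, 0, 0], [0.9, 0.1, 0], [0, 1, 0], [0, 0.9, 0.1], [0, 0.2, 1]]
train_labels = [{"sports"}, {"sports"}, {"finance", "economy"},
                {"finance", "economy"}, {"economy"}]

model = fit_prototypes(train_vectors, train_labels)
rules = mine_pair_rules(train_labels)
predicted = classify([0, 1, 0.05], model, rules)
```

In this toy data every document labeled "finance" is also labeled "economy", so the only rule mined is finance -> economy. Whether the original algorithm uses rules to add labels, remove them, or only flag inconsistencies is not stated in the abstract.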
This article is indexed in VIP, Wanfang Data, and other databases.