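Linear Regression Text Classification Based on Class-wise Nearest Neighbor Dictionary (基于类邻域字典的线性回归文本分类)
Cite this article:WU Jiao, HONG Caifeng, GU Yongchun, GU Xingquan, JIN Shiju. Linear Regression Text Classification Based on Class-wise Nearest Neighbor Dictionary[J]. Computer Engineering (计算机工程), 2021, 47(8): 93-99,108. DOI: 10.19678/j.issn.1000-3428.0058692
Authors:WU Jiao (武娇)  HONG Caifeng (洪彩凤)  GU Yongchun (顾永春)  GU Xingquan (顾兴全)  JIN Shiju (金世举)
Affiliation:College of Sciences, China Jiliang University, Hangzhou 310018; College of Standardization, China Jiliang University, Hangzhou 310018
Funding:National Natural Science Foundation of China (61302190).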
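Abstract:The high dimensionality of text representations increases the computational complexity of text classification. To address this problem, a linear regression classification model based on class-wise neighborhood dictionaries is constructed. The K-nearest neighbor method is used to build a class-wise neighborhood dictionary for each class, and, according to the different representations of the test sample, linear regression classification algorithms based on the concatenated class-wise neighborhood dictionary and on the class-wise neighborhood dictionary are proposed. In addition, to mitigate the impact of noisy data on classification performance, noise-class data are clipped by measuring the correlation between the test sample and each class. Experimental results...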

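Keywords:sparse representation classification  K-nearest neighbor  dictionary learning  linear regression classification  text classification
Received:2020-06-22
Revised:2020-08-12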

Linear Regression Text Classification Based on Class-wise Nearest Neighbor Dictionary
WU Jiao,HONG Caifeng,GU Yongchun,GU Xingquan,JIN Shiju. Linear Regression Text Classification Based on Class-wise Nearest Neighbor Dictionary[J]. Computer Engineering, 2021, 47(8): 93-99,108. DOI: 10.19678/j.issn.1000-3428.0058692
Authors:WU Jiao  HONG Caifeng  GU Yongchun  GU Xingquan  JIN Shiju
Affiliation:1. College of Sciences, China Jiliang University, Hangzhou 310018, China;2. College of Standardization, China Jiliang University, Hangzhou 310018, China
Abstract:In text classification, the high dimensionality of text representations increases computational complexity. To address this problem, a Linear Regression Classification (LRC) model based on class-wise nearest neighbor dictionaries is constructed. The K-Nearest Neighbor (KNN) method is used to construct a class-wise nearest neighbor dictionary for each class, and two LRC algorithms are proposed according to how the test sample is represented: one based on the concatenated class-wise nearest neighbor dictionary and one based on the class-wise nearest neighbor dictionary. In addition, the correlation between the test sample and each class is measured to clip noise-class data, alleviating the impact of noisy data on classification performance. The experimental results show that the proposed model provides high classification accuracy and computational efficiency for both long and short texts. For text datasets with multiple classes, the noise-class clipping strategy also enables the model to achieve excellent classification performance.
Keywords:Sparse Representation Classification (SRC)  K-Nearest Neighbor (KNN)  dictionary learning  Linear Regression Classification (LRC)  text classification
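
The pipeline outlined in the abstract (a KNN-built dictionary per class, a least-squares fit of the test sample on each dictionary, and clipping of unlikely classes) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not specify the distance measure, the correlation measure, or the decision rule, so the Euclidean distance, the cosine similarity to the class mean, the minimum-residual rule, and all function and parameter names below (`class_knn_dictionary`, `lrc_predict`, `k`, `keep_classes`) are assumptions made for illustration. Only the per-class dictionary variant is shown, not the concatenated one.

```python
import numpy as np


def class_knn_dictionary(X_c, y, k):
    """Pick the k columns of X_c nearest to y (Euclidean distance is an assumption)."""
    dists = np.linalg.norm(X_c - y[:, None], axis=0)
    nearest = np.argsort(dists)[:k]
    return X_c[:, nearest]


def lrc_predict(class_data, y, k=2, keep_classes=None):
    """Classify y by the smallest least-squares residual over per-class dictionaries.

    class_data maps a label to a (d, n_c) array whose columns are training vectors.
    keep_classes, if given, is a hypothetical clipping step: only that many classes
    with the highest cosine similarity between their mean vector and y are kept.
    """
    labels = list(class_data)
    if keep_classes is not None:
        def similarity(label):
            mean = class_data[label].mean(axis=1)
            denom = np.linalg.norm(mean) * np.linalg.norm(y) + 1e-12
            return float(mean @ y) / denom
        labels = sorted(labels, key=similarity, reverse=True)[:keep_classes]

    residuals = {}
    for label in labels:
        D = class_knn_dictionary(class_data[label], y, k)
        coef, *_ = np.linalg.lstsq(D, y, rcond=None)  # regress y on the dictionary
        residuals[label] = float(np.linalg.norm(y - D @ coef))
    return min(residuals, key=residuals.get), residuals


# Toy data: two classes of 4-dimensional vectors, stored as columns.
class_data = {
    "A": np.array([[1.0, 0.9, 0.8], [0.0, 0.1, 0.2], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]),
    "B": np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [1.0, 0.9, 0.8], [0.0, 0.1, 0.2]]),
}
y = np.array([0.95, 0.05, 0.0, 0.0])

label, residuals = lrc_predict(class_data, y, k=2)
clipped_label, clipped_residuals = lrc_predict(class_data, y, k=2, keep_classes=1)
```

Restricting each dictionary to `k` columns keeps every least-squares problem small regardless of the training set size, which is one plausible reading of the efficiency claim in the abstract.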
This article is indexed in Wanfang Data (万方数据) and other databases.