A Sub-Gradient Based Solver for the L1-Regularization + Hinge-Loss Problem
Cite this article: Kong Kang, Tao Qing, Wang Qunshan, Chu Dejun. A Sub-Gradient Based Solver for the L1-Regularization + Hinge-Loss Problem[J]. Journal of Computer Research and Development, 2012, 49(7): 1494-1499.
Authors: Kong Kang  Tao Qing  Wang Qunshan  Chu Dejun
Affiliations: 1. Army Officer Academy of PLA, Hefei 230031
2. Army Officer Academy of PLA, Hefei 230031; Institute of Automation, Chinese Academy of Sciences, Beijing 100190
Abstract: The hinge loss function is key to the success of support vector machines (SVM), and L1 regularization plays a central role in the study of sparse learning. Since both are non-differentiable, higher-order gradient information cannot be exploited. This paper systematically studies solving the L1-regularized hinge-loss problem on large-scale data with stochastic sub-gradient methods. We first present the stochastic forms of the direct sub-gradient method and the projected sub-gradient method, and give a theoretical analysis of their convergence and convergence rates. Experiments on large-scale real-world datasets show that the projected sub-gradient method converges faster and produces better sparsity on large-scale sparse data. The experiments further illustrate how the projection threshold affects the sparsity of the algorithm.
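For concreteness, the optimization problem studied here can be written in a standard form. This is a sketch consistent with the abstract; the symbols λ (regularization weight) and n (sample size) are our notation, not taken from the paper:

```latex
% L1-regularized hinge-loss objective over training pairs (x_i, y_i), y_i \in \{-1,+1\}:
\[
  \min_{w \in \mathbb{R}^d} \; f(w) \;=\; \lambda \lVert w \rVert_1
  \;+\; \frac{1}{n} \sum_{i=1}^{n} \max\bigl(0,\; 1 - y_i \langle w, x_i \rangle\bigr)
\]
% Both terms are convex but non-differentiable; one valid sub-gradient of f at w is
\[
  g(w) \;=\; \lambda \operatorname{sign}(w)
  \;-\; \frac{1}{n} \sum_{i \,:\, y_i \langle w, x_i \rangle < 1} y_i x_i,
\]
% where sign acts elementwise and may be taken as any value in [-1, 1]
% at zero coordinates.
```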

Keywords: L1 regularization  Hinge loss  sparsity  large-scale data  machine learning

A Sub-Gradient Based Solver for the L1-Regularization + Hinge-Loss Problem
Kong Kang, Tao Qing, Wang Qunshan, Chu Dejun. A Sub-Gradient Based Solver for the L1-Regularization + Hinge-Loss Problem[J]. Journal of Computer Research and Development, 2012, 49(7): 1494-1499.
Authors: Kong Kang  Tao Qing  Wang Qunshan  Chu Dejun
Affiliation: 1(Army Officer Academy, Hefei 230031) 2(Institute of Automation, Chinese Academy of Sciences, Beijing 100190)
Abstract: Hinge loss is central to the success of support vector machines (SVM) in machine learning, and L1 regularization plays a crucial role in sparse learning, which is especially important for large-scale classification problems. However, both the hinge loss and the L1 regularizer are non-differentiable, so higher-order gradient information is unavailable. In this paper, the optimization problem in the form of L1 regularization plus hinge loss is systematically investigated using the sub-gradient method. We first describe the direct sub-gradient method and the projected sub-gradient method in a stochastic setting. To establish the algorithms' correctness, we analyze the convergence and the convergence rate of the stochastic projected sub-gradient method. Experimental results on large-scale text classification data demonstrate that, when processing large-scale sparse problems, the stochastic projected sub-gradient method converges faster and achieves higher sparsity, with many elements of the weight vector equal to zero. We further demonstrate how the projection threshold affects the algorithms' sparsity.
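To make the projected variant concrete, below is a minimal Python sketch of one plausible realization: a stochastic sub-gradient step on the objective above, followed by a truncation-style projection that zeroes coordinates smaller than a threshold theta (matching the "projection threshold" discussed in the abstract). The function name, the eta0/sqrt(t) step-size schedule, and all parameter defaults are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def stochastic_projected_subgradient(X, y, lam=1e-4, eta0=1.0, T=100_000,
                                     theta=1e-3, seed=0):
    """Minimize  lam*||w||_1 + (1/n) * sum_i max(0, 1 - y_i*<w, x_i>)
    by sampling one example per step and truncating small weights."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for t in range(1, T + 1):
        i = rng.integers(n)              # sample one training example
        eta = eta0 / np.sqrt(t)          # diminishing step size
        g = lam * np.sign(w)             # sub-gradient of the L1 term
        if y[i] * (X[i] @ w) < 1:        # hinge term is active for example i
            g -= y[i] * X[i]             # add its sub-gradient contribution
        w -= eta * g                     # direct sub-gradient step
        w[np.abs(w) < theta] = 0.0       # projection: truncate small entries
    return w
```

For the sparse text-classification data used in the experiments, X would normally be a scipy.sparse row matrix and each update would touch only the nonzero features of the sampled example; the dense version above simply keeps the sketch short.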
Keywords: L1-regularization  Hinge-loss  sparsity  large-scale data  machine learning