A Sub-Gradient Based Solver for the L1-Regularization + Hinge-Loss Problem
Cite this article: Kong Kang, Tao Qing, Wang Qunshan, Chu Dejun. A Sub-Gradient Based Solver for the L1-Regularization + Hinge-Loss Problem[J]. Journal of Computer Research and Development, 2012, 49(7): 1494-1499.
Authors: Kong Kang  Tao Qing  Wang Qunshan  Chu Dejun
Affiliations: 1. Army Officer Academy of PLA, Hefei 230031
2. Army Officer Academy of PLA, Hefei 230031; Institute of Automation, Chinese Academy of Sciences, Beijing 100190
Abstract: The hinge loss function is key to the success of support vector machines (SVM), and L1 regularization plays a central role in the study of sparse learning. Since both are non-differentiable, higher-order gradient information cannot be exploited. This paper systematically studies solving the L1-regularized hinge-loss problem on large-scale data with stochastic sub-gradient methods. We first present the stochastic forms of the direct sub-gradient method and the projected sub-gradient method, and give a theoretical analysis of their convergence and convergence rates. Experiments on large-scale real-world datasets show that the projected sub-gradient method converges faster and produces better sparsity on large-scale sparse data. The experiments further illustrate how the projection threshold affects the sparsity of the algorithm.
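For concreteness, the optimization problem studied here can be written in a standard form. This is a sketch consistent with the abstract; the symbols λ (regularization weight) and n (sample size) are our notation, not taken from the paper:

```latex
% L1-regularized hinge-loss objective over training pairs (x_i, y_i), y_i \in \{-1,+1\}:
\[
  \min_{w \in \mathbb{R}^d} \; f(w) \;=\; \lambda \lVert w \rVert_1
  \;+\; \frac{1}{n} \sum_{i=1}^{n} \max\bigl(0,\; 1 - y_i \langle w, x_i \rangle\bigr)
\]
% Both terms are convex but non-differentiable; one valid sub-gradient of f at w is
\[
  g(w) \;=\; \lambda \operatorname{sign}(w)
  \;-\; \frac{1}{n} \sum_{i \,:\, y_i \langle w, x_i \rangle < 1} y_i x_i,
\]
% where sign acts elementwise and may be taken as any value in [-1, 1]
% at zero coordinates.
```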

Keywords: L1 regularization  Hinge loss  sparsity  large-scale data  machine learning

A Sub-Gradient Based Solver for the L1-Regularization + Hinge-Loss Problem
Kong Kang, Tao Qing, Wang Qunshan, Chu Dejun. A Sub-Gradient Based Solver for the L1-Regularization + Hinge-Loss Problem[J]. Journal of Computer Research and Development, 2012, 49(7): 1494-1499.
Authors: Kong Kang  Tao Qing  Wang Qunshan  Chu Dejun
Affiliation: 1(Army Officer Academy, Hefei 230031) 2(Institute of Automation, Chinese Academy of Sciences, Beijing 100190)
Abstract: Hinge loss is central to the success of support vector machines (SVM) in machine learning, and L1 regularization plays a crucial role in sparse learning, which is especially important for large-scale classification problems. However, both the hinge loss and the L1 regularizer are non-differentiable, so higher-order gradient information is unavailable. In this paper, the optimization problem in the form of L1 regularization plus hinge loss is systematically investigated using the sub-gradient method. We first describe the direct sub-gradient method and the projected sub-gradient method in a stochastic setting. To establish the algorithms' correctness, we analyze the convergence and the convergence rate of the stochastic projected sub-gradient method. Experimental results on large-scale text classification data demonstrate that, when processing large-scale sparse problems, the stochastic projected sub-gradient method converges faster and achieves higher sparsity, with many elements of the weight vector equal to zero. We further demonstrate how the projection threshold affects the algorithms' sparsity.
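To make the projected variant concrete, below is a minimal Python sketch of one plausible realization: a stochastic sub-gradient step on the objective above, followed by a truncation-style projection that zeroes coordinates smaller than a threshold theta (matching the "projection threshold" discussed in the abstract). The function name, the eta0/sqrt(t) step-size schedule, and all parameter defaults are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def stochastic_projected_subgradient(X, y, lam=1e-4, eta0=1.0, T=100_000,
                                     theta=1e-3, seed=0):
    """Minimize  lam*||w||_1 + (1/n) * sum_i max(0, 1 - y_i*<w, x_i>)
    by sampling one example per step and truncating small weights."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for t in range(1, T + 1):
        i = rng.integers(n)              # sample one training example
        eta = eta0 / np.sqrt(t)          # diminishing step size
        g = lam * np.sign(w)             # sub-gradient of the L1 term
        if y[i] * (X[i] @ w) < 1:        # hinge term is active for example i
            g -= y[i] * X[i]             # add its sub-gradient contribution
        w -= eta * g                     # direct sub-gradient step
        w[np.abs(w) < theta] = 0.0       # projection: truncate small entries
    return w
```

For the sparse text-classification data used in the experiments, X would normally be a scipy.sparse row matrix and each update would touch only the nonzero features of the sampled example; the dense version above simply keeps the sketch short.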
Keywords: L1-regularization  Hinge-loss  sparsity  large-scale data  machine learning