A Sparse Stochastic Algorithm with O(1/T) Convergence Rate
Citation: Jiang Jiyuan, Xia Liang, Zhang Xian, Tao Qing. A Sparse Stochastic Algorithm with O(1/T) Convergence Rate[J]. Journal of Computer Research and Development, 2014, 51(9): 1901-1910. DOI: 10.7544/issn1000-1239.2014.20140161
Authors: Jiang Jiyuan, Xia Liang, Zhang Xian, Tao Qing
Affiliation: Army Officer Academy of PLA, Hefei 230031 (jyjianggle@gmail.com)
Funding: National Natural Science Foundation of China; Natural Science Foundation of Anhui Province

Abstract: Stochastic gradient descent (SGD) is a simple but efficient method for large-scale optimization problems. Recent research has shown that its convergence rate can be effectively improved by the so-called α-suffix averaging technique when solving strongly convex problems. However, SGD is a black-box method, so it is difficult to obtain the structural effect that regularized optimization problems are expected to deliver. On the other hand, composite objective mirror descent (COMID) in the stochastic setting is a scalable algorithm that effectively preserves the sparsity imposed by L1 regularization, but it only attains an O(log T/T) convergence rate for strongly convex optimization problems. In this paper, we focus on the generally convex optimization problem of the form "L1 + Hinge". To make full use of the α-suffix averaging technique, we first turn it into a strongly convex optimization problem by adding an L2 strongly convex term. We then present the L1MD-α algorithm, which combines the COMID algorithm with the α-suffix averaging technique. We prove that L1MD-α achieves an O(1/T) convergence rate, and as a result it obtains not only a faster convergence rate but also better sparsity than COMID. Extensive experiments on typical large-scale datasets verify the correctness of the theoretical analysis and the effectiveness of the proposed algorithm.
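The abstract describes L1MD-α only at a high level. The sketch below is a minimal NumPy illustration, under assumptions of ours, of the two ingredients it names: a COMID-style proximal (soft-thresholding) update that preserves the L1-induced sparsity, applied to the hinge loss made strongly convex by an added L2 term, followed by α-suffix averaging of the last α·T iterates. Function and parameter names (l1md_alpha, lam1, lam2, alpha) are illustrative; this is not the authors' implementation, and the paper's exact step-size and parameter choices are not given in the abstract.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1; keeps exact zeros, hence sparsity."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def l1md_alpha(X, y, lam1=1e-4, lam2=1e-2, T=10000, alpha=0.5, seed=0):
    """Sketch of the idea in the abstract: COMID steps on an
    'L2-strongly-convex hinge + L1' objective plus alpha-suffix averaging.
    X: (n, d) samples, y: labels in {-1, +1}. Names are illustrative."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    suffix_start = int((1.0 - alpha) * T)   # average only the last alpha*T iterates
    w_bar, count = np.zeros(d), 0

    for t in range(1, T + 1):
        i = rng.integers(n)                  # draw one random sample
        margin = y[i] * X[i].dot(w)
        g = lam2 * w                         # gradient of the added L2 strongly convex term
        if margin < 1.0:
            g -= y[i] * X[i]                 # hinge-loss subgradient
        eta = 1.0 / (lam2 * t)               # standard step size under strong convexity
        # COMID / proximal step: handles the L1 part exactly, preserving sparsity
        w = soft_threshold(w - eta * g, eta * lam1)
        if t > suffix_start:                 # alpha-suffix averaging
            w_bar += w
            count += 1
    return w_bar / max(count, 1)
```

Usage would be along the lines of w = l1md_alpha(X, y) with y in {-1, +1}; the average of the suffix iterates is returned as the final model.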
Keywords: machine learning; stochastic optimization; sparsity; L1 regularization; COMID (composite objective mirror descent)