Smoothed Variable Sample-size Accelerated Proximal Methods for Nonsmooth Stochastic Convex Programs

21 May 2020 · Jalilzadeh Afrooz, Shanbhag Uday V., Blanchet Jose H., Glynn Peter W.

We consider minimizing $f(x) = \mathbb{E}[f(x,\omega)]$ when $f(x,\omega)$ is possibly nonsmooth and either strongly convex or convex in $x$. (I) Strongly convex... When $f(x,\omega)$ is $\mu$-strongly convex in $x$, we propose a variable sample-size accelerated proximal scheme (VS-APM) and apply it to $f_{\eta}(x)$, the ($\eta$-)Moreau smoothed variant of $\mathbb{E}[f(x,\omega)]$; we term this scheme mVS-APM. We consider three settings. (a) Bounded domains. In this setting, VS-APM displays linear convergence in inexact gradient steps, each of which requires an inner stochastic subgradient (SSG) scheme. Specifically, mVS-APM achieves an optimal oracle complexity in SSG steps; (b) Unbounded domains. In this regime, under a weaker assumption of suitable state-dependent bounds on subgradients, an unaccelerated variant, mVS-PM, is linearly convergent; (c) Smooth ill-conditioned $f$. When $f$ is $L$-smooth and $\kappa = L/\mu \ggg 1$, we employ mVS-APM, where increasingly accurate gradients $\nabla_x f_{\eta}(x)$ are obtained by VS-APM. Notably, mVS-APM displays linear convergence and near-optimal complexity in inner proximal evaluations (up to a log factor) compared to VS-APM. However, unlike a direct application of VS-APM, this scheme is characterized by larger steplengths and better empirical behavior; (II) Convex. When $f(x,\omega)$ is merely convex but smoothable, by suitable choices of the smoothing, steplength, and batch-size sequences, smoothed VS-APM (or sVS-APM) produces sequences for which expected sub-optimality diminishes at the rate of $\mathcal{O}(1/k)$ with an optimal oracle complexity of $\mathcal{O}(1/\epsilon^2)$. Finally, sVS-APM and VS-APM produce sequences that converge almost surely to a solution of the original problem.
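To make the Moreau smoothing step concrete, the following is a minimal sketch for the deterministic scalar toy function $f(x) = |x|$, which is an assumption chosen for illustration and not the paper's expectation-valued setting. It uses the standard identities $f_{\eta}(x) = \min_u \{ f(u) + \tfrac{1}{2\eta}(x-u)^2 \}$ and $\nabla f_{\eta}(x) = (x - \mathrm{prox}_{\eta f}(x))/\eta$. The function names are hypothetical. In the stochastic setting of the abstract the proximal point is not available in closed form, which is why an inner scheme is needed to compute it inexactly.

```python
import math


def prox_abs(x: float, eta: float) -> float:
    """Proximal operator of eta * |.| (soft-thresholding)."""
    return math.copysign(max(abs(x) - eta, 0.0), x)


def moreau_env_abs(x: float, eta: float) -> float:
    """Moreau envelope f_eta(x) = min_u |u| + (x - u)^2 / (2 * eta)."""
    u = prox_abs(x, eta)
    return abs(u) + (x - u) ** 2 / (2.0 * eta)


def moreau_grad_abs(x: float, eta: float) -> float:
    """Gradient of the envelope: (x - prox(x)) / eta, which is (1/eta)-Lipschitz."""
    return (x - prox_abs(x, eta)) / eta


if __name__ == "__main__":
    eta = 0.5
    for x in (-2.0, 0.2, 2.0):
        print(x, moreau_env_abs(x, eta), moreau_grad_abs(x, eta))
```

For this toy function the envelope is the Huber function: quadratic for $|x| \le \eta$ and equal to $|x| - \eta/2$ otherwise, so a smaller $\eta$ gives a closer approximation of $|x|$ at the cost of a larger gradient Lipschitz constant $1/\eta$.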


Optimization and Control