Error Bounds and Applications for Stochastic Approximation with Non-Decaying Gain

15 Mar 2020 · Zhu Jingyi

This work analyzes the stochastic approximation algorithm with non-decaying gains as applied to time-varying problems. The setting is either to minimize a sequence of scalar-valued loss functions $f_k(\cdot)$ at sampling times $\tau_k$, or to locate the root of a sequence of vector-valued functions $g_k(\cdot)$ at $\tau_k$, with respect to a parameter $\theta\in\mathbb{R}^p$. The only available information is noise-corrupted observations of either $f_k(\cdot)$ or $g_k(\cdot)$, evaluated at one or two design points. In this time-varying setup, we apply stochastic approximation algorithms with non-decaying gains, so that the recursive estimate $\hat{\theta}_k$ can maintain its momentum in tracking the time-varying optimum $\theta_k^*$. Chapter 3 provides a bound for the root-mean-squared error $\sqrt{E(\|\hat{\theta}_k-\theta_k^*\|^2)}$. The bounds are applicable under a mild assumption on the time-varying drift and a modest restriction on the observation noise and the bias term. After establishing the tracking capability in Chapter 3, we discuss the concentration behavior of $\hat{\theta}_k$ in Chapter 4. The weak convergence limit of the continuous interpolation of $\hat{\theta}_k$ is shown to follow the trajectory of a non-autonomous ordinary differential equation. The results in both Chapter 3 and Chapter 4 are probabilistic and may not provide much guidance on gain-tuning strategies for a single experiment run. Therefore, Chapter 5 discusses a data-dependent gain-tuning strategy based on estimating the Hessian information and the noise level. Overall, this work answers the questions "what is the estimate for the dynamical system $\theta_k^*$?" and "how much can we trust $\hat{\theta}_k$ as an estimate for $\theta_k^*$?"
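The role of the non-decaying gain can be illustrated with a minimal sketch. It is not the author's algorithm or experimental setup: it assumes a hypothetical scalar root-finding problem $g_k(\theta)=\theta-\theta_k^*$ with a linearly drifting root and Gaussian observation noise, and the names `tracking_rmse`, `gain_schedule`, `drift`, and `noise_sd` are chosen here for illustration only. It compares the empirical root-mean-squared tracking error under a constant gain with that under a classical decaying gain.

```python
import math
import random


def tracking_rmse(gain_schedule, steps=2000, drift=0.01, noise_sd=0.5, seed=0):
    """Scalar stochastic approximation recursion on a drifting root.

    Toy assumption (not from the source): g_k(theta) = theta - theta_k^*,
    where theta_k^* moves by `drift` at every sampling time.
    `gain_schedule` maps the iteration index k to the gain a_k.
    Returns the empirical root-mean-squared tracking error over the run.
    """
    rng = random.Random(seed)
    theta_star = 0.0  # time-varying root theta_k^*
    theta_hat = 0.0   # recursive estimate
    sq_err_sum = 0.0
    for k in range(steps):
        # The target moves by a bounded amount between sampling times.
        theta_star += drift
        # Noise-corrupted observation of g_k at the current estimate.
        y = (theta_hat - theta_star) + rng.gauss(0.0, noise_sd)
        # Stochastic approximation update with gain a_k.
        theta_hat -= gain_schedule(k) * y
        sq_err_sum += (theta_hat - theta_star) ** 2
    return math.sqrt(sq_err_sum / steps)


# Constant (non-decaying) gain versus a decaying gain a_k = 1/(k+1).
rmse_constant = tracking_rmse(lambda k: 0.1)
rmse_decaying = tracking_rmse(lambda k: 1.0 / (k + 1))
print(f"constant gain RMSE: {rmse_constant:.3f}")
print(f"decaying gain RMSE: {rmse_decaying:.3f}")
```

In this toy setting the decaying gain effectively averages over the whole history and falls progressively further behind the moving root, while the constant gain keeps a bounded lag at the cost of a persistent noise-induced error. The gain value 0.1 is arbitrary and is not a tuning recommendation.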


Categories


Optimization and Control