Directions in ML: Taking Advantage of Randomness in Expensive Optimization Problems

Optimization is at the heart of machine learning, and gradient computation is central to many optimization techniques. Stochastic optimization, in particular, has taken center stage as the principal method of fitting many models, from deep neural networks to variational Bayesian posterior approximations. Generally, one uses data subsampling to efficiently construct unbiased gradient estimators for stochastic optimization, but this is only one possibility. In this talk, I discuss two alternative approaches to constructing unbiased gradient estimates in machine learning problems. The first approach uses randomized truncation of objective functions defined as loops or limits. Such objectives arise in settings ranging from hyperparameter selection, to fitting parameters of differential equations, to variational inference using lower bounds on the log-marginal likelihood. The second approach revisits the Jacobian accumulation problem at the heart of automatic differentiation, observing that it is possible to colla

1 view