Scaled Gradient Methods for Ill-conditioned Low-rank Matrix and Tensor Estimation
Many problems encountered in machine learning and signal processing can be formulated as estimating a low-rank object from incomplete, and possibly corrupted, linear measurements; prominent examples include matrix completion and tensor completion. Through the lens of matrix and tensor factorization, one of the most popular approaches is to employ simple iterative algorithms such as gradient descent to recover the low-rank factors directly, which allow for small memory and computation footprints. However, the convergence rate of gradient descent depends linearly, and sometimes even quadratically, on the condition number of the low-rank object, and therefore, slows down painstakingly when the problem is ill-conditioned. This thesis introduces a new algorithm: scaled gradient descent (ScaledGD), which provably converges linearly at a constant rate independent of the condition number of the low-rank object, while maintaining the low per-iteration cost of gradient descent. In addition, a nonsmooth variant of ScaledGD provides further robustness to corruptions by optimizing the least absolute deviation loss. In total, ScaledGD highlights the power of appropriate preconditioning in accelerating nonconvex statistical estimation, where the iterationvarying preconditioners promote desirable invariance properties of the trajectory with respect to the symmetry in low-rank factorization.
- Electrical and Computer Engineering
- Doctor of Philosophy (PhD)