Randomly Pivoted Cholesky: Faster Matrix Approximation for Scientific Machine Learning

Ethan Epperly, California Institute of Technology


Kernel methods like spectral clustering and kernel ridge regression are widely used by scientists to interpret and learn from scientific data. However, traditional implementations of these methods require storing and processing a large, dense kernel matrix, severely limiting their scalability. This talk introduces randomly pivoted Cholesky, a fast and accurate algorithm for low-rank compression of kernel matrices. The power of randomly pivoted Cholesky is that it is guaranteed to produce near-optimal approximations to the kernel matrix while reading only a small fraction of its entries. Randomly pivoted Cholesky can lead to dramatic speedups for scientific machine learning problems: when used to identify metastable configurations from molecular dynamics trajectories, randomly pivoted Cholesky is up to three million times faster than existing implementations that manipulate the full, uncompressed kernel matrix. The talk concludes by discussing high-performance blocked implementations of randomly pivoted Cholesky that use rejection sampling to accelerate the basic algorithm by up to a factor of forty.
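
For concreteness, below is a minimal sketch of the basic (non-blocked) algorithm in Python with NumPy, assuming the kernel matrix A is symmetric positive semidefinite and can be accessed one column at a time; the function name rp_cholesky and its interface are illustrative, not code from the talk.

import numpy as np

def rp_cholesky(A, k, seed=None):
    """Rank-k approximation A ~= F @ F.T by randomly pivoted Cholesky.

    Sketch under the assumption that A is a symmetric positive
    semidefinite kernel matrix; only k columns of A, plus its
    diagonal, are ever read.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    F = np.zeros((n, k))
    d = np.diag(A).astype(float).copy()  # diagonal of the residual matrix
    for i in range(k):
        # Sample a pivot with probability proportional to the residual diagonal.
        s = rng.choice(n, p=d / d.sum())
        # Residual (Schur complement) column: remove what F already captures.
        g = A[:, s] - F[:, :i] @ F[s, :i]
        F[:, i] = g / np.sqrt(g[s])
        # Update the residual diagonal; clip small negatives from round-off.
        d = np.maximum(d - F[:, i] ** 2, 0.0)
    return F

Each iteration samples a pivot in proportion to the diagonal of the residual matrix and eliminates it with one Cholesky step, which is why only a fraction of the matrix entries is touched. The blocked implementations discussed at the end of the talk draw many candidate pivots at once and, as I understand them, use rejection sampling so that the accepted pivots match the distribution of this sequential loop.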