We give a classical algorithm for linear regression analogous to the quantum matrix inversion algorithm [Harrow, Hassidim, and Lloyd, Physical Review Letters'09] for low-rank matrices [Wossnig et al., Physical Review Letters'18], when the input matrix $A$ is stored in a data structure applicable for QRAM-based state preparation. Namely, given an $A \in \mathbb{C}^{m\times n}$ with minimum singular value $\sigma$ and which supports certain efficient $\ell_2$-norm importance sampling queries, along with a $b \in \mathbb{C}^m$, we can output a description of an $x \in \mathbb{C}^n$ such that $\|x - A^+b\| \leq \varepsilon\|A^+b\|$ in $\tilde{\mathcal{O}}\Big(\frac{\|A\|_{\mathrm{F}}^6\|A\|^2}{\sigma^8\varepsilon^4}\Big)$ time, improving on previous "quantum-inspired" algorithms in this line of research by a factor of $\frac{\|A\|^{14}}{\sigma^{14}\varepsilon^2}$ [Chia et al., STOC'20]. The algorithm is stochastic gradient descent, and the analysis bears similarities to those of optimization algorithms for regression in the usual setting [Gupta and Sidford, NeurIPS'18]. Unlike earlier works, this is a promising avenue that could lead to feasible implementations of classical regression in a quantum-inspired setting, for comparison against future quantum computers.
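To make the phrase "stochastic gradient descent" with "$\ell_2$-norm importance sampling" concrete, the following is a minimal sketch of a textbook variant and not the algorithm analysed in the paper. It samples row $i$ of $A$ with probability $\|A_{i}\|^2/\|A\|_{\mathrm{F}}^2$ and takes an unbiased gradient step on $\frac12\|Ax-b\|^2$. The function name `sgd_least_squares`, the explicit storage of $x$, and the default step size $1/\|A\|_{\mathrm{F}}^2$ (which reduces the update to randomized Kaczmarz) are assumptions made for illustration. The paper instead works through sampling and query access and outputs only a description of $x$.

```python
import random


def sgd_least_squares(A, b, num_steps, step_size=None, seed=0):
    """Illustrative SGD for min_x 0.5 * ||A x - b||^2 with row-norm sampling.

    A is a list of rows (real or complex entries), b a list of length len(A).
    This is a hypothetical sketch, not the paper's sublinear-time algorithm.
    """
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    # Squared l2 norm of each row; their sum is the squared Frobenius norm.
    row_sq = [sum(abs(a) ** 2 for a in row) for row in A]
    frob_sq = sum(row_sq)
    if step_size is None:
        # Assumed default: with this choice the update is randomized Kaczmarz.
        step_size = 1.0 / frob_sq
    x = [0j] * n
    # Draw row indices with probability ||A_i||^2 / ||A||_F^2.
    indices = rng.choices(range(m), weights=row_sq, k=num_steps)
    for i in indices:
        row = A[i]
        # Residual of the sampled equation: <A_i, x> - b_i.
        residual = sum(a * xj for a, xj in zip(row, x)) - b[i]
        # Unbiased gradient estimate: (||A||_F^2 / ||A_i||^2) * conj(A_i) * residual.
        scale = step_size * frob_sq / row_sq[i] * residual
        x = [xj - scale * a.conjugate() for xj, a in zip(x, row)]
    return x
```

On a consistent, well-conditioned system the iterates of this sketch contract in expectation at a rate governed by $\sigma^2/\|A\|_{\mathrm{F}}^2$ per step, which is the same ratio that appears in the running time stated above.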
Minor errata
In Proposition 2.3, we require \(\varepsilon\) to be smaller than stated (in fact, \(\varepsilon \lesssim (\log\frac{\|A\|_F\|A\|}{\sigma^2+\lambda})^{-\frac12}(\log\log\frac{\|A\|_F\|A\|}{\sigma^2+\lambda})^{-\frac12}\)). This is necessary for the equations at the bottom of page 11 to go through.
The \(s\) in the equation at the top of page 8 should be \(\|b\|_0\).