30 Nov

Last week, I explained why typical distributed gradient descent leads to an inexact solution, even though its convergence rate is linear (a straight line on a log-scale error plot). There is a very important paper showing that we can in fact reach the exact solution by incorporating the previous gradient. This algorithm is called...
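To make the inexactness concrete, here is a minimal sketch of plain distributed gradient descent with a constant step size. Everything in it is an assumption chosen for illustration and not taken from the paper: four agents on a ring, a hypothetical mixing matrix `W`, and scalar quadratic local costs f_i(x) = 0.5 * a_i * (x - b_i)^2. Each agent averages with its neighbours and then takes a local gradient step. The iterates settle quickly, but at a point that is off from the true minimizer by an amount that shrinks with the step size.

```python
# Toy setup (all values are illustrative assumptions).
# Ring of 4 agents; W is symmetric and doubly stochastic.
W = [
    [0.50, 0.25, 0.00, 0.25],
    [0.25, 0.50, 0.25, 0.00],
    [0.00, 0.25, 0.50, 0.25],
    [0.25, 0.00, 0.25, 0.50],
]
a = [1.0, 2.0, 3.0, 4.0]  # local curvatures
b = [1.0, 2.0, 3.0, 4.0]  # local minimizers

# Minimizer of the sum of local costs: weighted average of the b_i.
x_star = sum(ai * bi for ai, bi in zip(a, b)) / sum(a)


def dgd(step, iters):
    """Distributed gradient descent with a constant step size."""
    n = len(b)
    x = [0.0] * n
    for _ in range(iters):
        # Local gradients evaluated at each agent's own iterate.
        grads = [a[i] * (x[i] - b[i]) for i in range(n)]
        # Mix with neighbours, then take the local gradient step.
        x = [
            sum(W[i][j] * x[j] for j in range(n)) - step * grads[i]
            for i in range(n)
        ]
    return x


def max_error(step, iters):
    """Largest distance between any agent's iterate and x_star."""
    return max(abs(xi - x_star) for xi in dgd(step, iters))


err_large = max_error(0.1, 2000)    # larger step: larger residual error
err_small = max_error(0.01, 20000)  # smaller step: smaller, but still nonzero

print("x_star:", x_star)
print("residual error, step 0.1 :", err_large)
print("residual error, step 0.01:", err_small)
```

Both runs use far more iterations than they need to settle, so the remaining gap is not a matter of stopping early: it is the bias of the constant-step method itself.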
