"But more importantly, we don’t care if you can find a global minimum of the training error. We care if you can find a global minimum of the test error."
I agree but to be fair to optimization folks, this is generally done by setting up a loss that should guarantee low test error.
↓
add a skeleton here at some point
5 months ago