
346 points | swatson741 | 1 comment
jamesblonde:
I have to be contrarian here. The students were right: you didn't need to learn to implement backprop in NumPy. Any leakiness in backprop is addressed by researchers who introduce new optimizers; as a developer, you just pick the best one and find good hyperparameters for it.
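
In practice that workflow looks roughly like this (a minimal sketch assuming PyTorch; the model, data, and learning rate are made-up placeholders, not anything from the article):

    import torch

    # Hypothetical tiny setup just to illustrate "pick an optimizer, tune hyperparameters".
    model = torch.nn.Sequential(torch.nn.Linear(784, 128), torch.nn.ReLU(), torch.nn.Linear(128, 10))
    opt = torch.optim.AdamW(model.parameters(), lr=3e-4)  # chosen optimizer + a plausible lr

    x = torch.randn(64, 784)            # fake batch of inputs
    y = torch.randint(0, 10, (64,))     # fake labels

    for step in range(100):
        opt.zero_grad()
        loss = torch.nn.functional.cross_entropy(model(x), y)
        loss.backward()                 # backprop handled entirely by autograd
        opt.step()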
1. HarHarVeryFunny:
The problem isn't with backprop itself or the optimizer; it's potentially in the derivatives of the functions you are building the neural net out of, such as the Sigmoid and ReLU examples that Karpathy gave.

Just because your framework provides things like ReLU doesn't mean someone else has done all the work and you can use them blindly and expect them to work every time. When training a neural net goes wrong, you need to know where to look and what to look for, such as exploding and vanishing gradients.
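
To make the vanishing-gradient point concrete, here is a small NumPy sketch (my own illustration, not from the article) of the local gradients involved: the sigmoid's derivative collapses toward zero once its input saturates, and ReLU's gradient is exactly zero for negative inputs (the "dead ReLU" case):

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    # Sigmoid's local gradient is s * (1 - s); it shrinks rapidly as |x| grows,
    # so anything flowing backward through a saturated unit gets crushed.
    for x in [0.0, 2.0, 5.0, 10.0]:
        s = sigmoid(x)
        local_grad = s * (1.0 - s)
        print(f"x={x:5.1f}  sigmoid={s:.6f}  local grad={local_grad:.2e}")

    # ReLU's local gradient is 1 for positive inputs and 0 otherwise; a unit whose
    # pre-activations are always negative receives no gradient and stops learning.
    relu_grad = lambda x: (x > 0).astype(float)
    print(relu_grad(np.array([-3.0, -0.1, 0.5, 2.0])))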