"Whenever these kind of papers come out I skim it looking for where they actually do backprop.
Check the pseudo code of their algorithms.
"Update using gradient based optimizations""
replies(4):
Maybe you have a way of seeing it differently so that this looks like a gradient? "Gradient" keys my brain into a desired outcome expressed as an expectation function.
Gradient descent is only one way of searching for a minimum, so in that sense it is not necessary; for example, sometimes one can solve analytically for the extrema of the loss. As an alternative, one could do Monte Carlo search instead of gradient descent, though for a convex loss that would of course be less efficient.
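A minimal sketch contrasting the two search strategies on the same convex loss. The quadratic `f`, its known minimum, the step size, and the sample count are all illustrative assumptions, not taken from any paper; for this `f` the minimum could also be written down analytically.

```python
import numpy as np

rng = np.random.default_rng(0)
target = np.array([1.0, -2.0])  # analytic minimizer of f

def f(x):
    # Convex quadratic with its minimum at `target`.
    return np.sum((x - target) ** 2)

def grad_f(x):
    return 2.0 * (x - target)

# Gradient descent: repeatedly step against the gradient.
x = np.zeros(2)
for _ in range(100):
    x = x - 0.1 * grad_f(x)

# Monte Carlo search: propose random candidates, keep the best seen.
best = np.zeros(2)
for _ in range(100):
    candidate = best + rng.normal(scale=0.5, size=2)
    if f(candidate) < f(best):
        best = candidate

print("gradient descent:  ", x, f(x))
print("monte carlo search:", best, f(best))
```

With the same budget of 100 evaluations, the gradient-based run lands essentially on the minimizer, while the random search only gets close, which is the efficiency gap on convex losses the comment refers to.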