itsthecourier No.43677688
"Whenever these kind of papers come out I skim it looking for where they actually do backprop.

Check the pseudocode of their algorithms.

"Update using gradient based optimizations""

f_devd No.43677878
I mean, the only claim is that there is no backpropagation; you always need a gradient of sorts to update parameters, unless you just stumble upon the desired ones. Even genetic algorithms effectively have gradients, which are obfuscated through random projections.
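To make that last point concrete, here is a minimal sketch (the name `es_gradient_estimate` is my own, not from any paper under discussion) of how an OpenAI-ES-style estimator recovers a gradient direction purely from random perturbations and the fitness differences they cause:

```python
import numpy as np

def es_gradient_estimate(f, theta, sigma=0.1, n_samples=100, rng=None):
    """Estimate a gradient of f at theta using only random perturbations.

    No backprop anywhere: the 'gradient' emerges from correlating random
    directions with the fitness differences they produce (OpenAI-ES style).
    """
    rng = np.random.default_rng() if rng is None else rng
    grad = np.zeros_like(theta)
    for _ in range(n_samples):
        eps = rng.standard_normal(theta.shape)            # random direction
        # Antithetic pair: evaluate both +eps and -eps to reduce variance.
        delta_f = f(theta + sigma * eps) - f(theta - sigma * eps)
        grad += delta_f * eps
    return grad / (2 * sigma * n_samples)

# Toy check: for f(x) = -||x||^2 the analytic gradient at [1, -2] is [-2, 4].
f = lambda x: -np.sum(x ** 2)
print(es_gradient_estimate(f, np.array([1.0, -2.0])))     # roughly [-2, 4]
```

Averaging fitness-weighted random directions like this is exactly the "obfuscated gradient" being described.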
1. bob1029 No.43679597
In genetic algorithms, any gradient found would be implied by way of the fitness function and would not be something the algorithm explicitly pursues. There are no free lunches like there are with the chain rule of calculus.

GP is essentially isomorphic to beam search, where the population is the beam. It is a fancy search algorithm. It is not "training" anything.
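To illustrate the beam-search analogy (a toy sketch with made-up helpers, not any particular GP library): the population is the beam, mutation expands the candidate set, and the fitness function prunes it back to the beam width, with no gradient anywhere.

```python
import random

def beam_search_ga(init_population, fitness, mutate, beam_width=10, generations=50):
    """A genetic-algorithm loop phrased as beam search: the population is the
    beam, mutation expands candidates, selection prunes back to beam_width."""
    beam = list(init_population)
    for _ in range(generations):
        candidates = beam + [mutate(random.choice(beam)) for _ in range(4 * beam_width)]
        beam = sorted(candidates, key=fitness, reverse=True)[:beam_width]
    return beam[0]

# Toy usage: recover a target bit string by mutation and selection alone.
target = [1, 0, 1, 1, 0, 1, 0, 0]
fitness = lambda s: sum(a == b for a, b in zip(s, target))
mutate = lambda s: [b ^ 1 if random.random() < 0.1 else b for b in s]
best = beam_search_ga([[0] * len(target) for _ in range(10)], fitness, mutate)
print(best, fitness(best))
```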

2. f_devd No.43679880
True, in genetic algorithms the gradients are only implied, but those implied gradients are exploited by the more successful evolutionary strategies. So while they might not look like it (because they are not used in a continuous descent), when aggregated they still work very much like regular back-prop gradients, although they represent a smoother function.
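One way to see the "smoother function" point: the aggregated evolutionary-strategies update follows the standard Gaussian-smoothing (NES-style) identity, a general textbook fact rather than anything specific to the paper discussed here, which estimates the gradient of a smoothed version of the fitness F rather than of F itself:

```latex
\nabla_\theta\,\mathbb{E}_{\epsilon \sim \mathcal{N}(0,I)}\!\big[F(\theta + \sigma\epsilon)\big]
  \;=\; \frac{1}{\sigma}\,\mathbb{E}_{\epsilon \sim \mathcal{N}(0,I)}\!\big[F(\theta + \sigma\epsilon)\,\epsilon\big]
  \;\approx\; \frac{1}{n\sigma}\sum_{i=1}^{n} F(\theta + \sigma\epsilon_i)\,\epsilon_i
```

The left-hand side differentiates a Gaussian-smoothed F, which is why the aggregated estimate behaves like a back-prop gradient of a smoother objective.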