
161 points | belleville | 1 comment
itsthecourier | No.43677688
"Whenever these kind of papers come out I skim it looking for where they actually do backprop.

Check the pseudo code of their algorithms.

"Update using gradient based optimizations""

replies(4): >>43677717 >>43677878 >>43684074 >>43725019
f_devd | No.43677878
I mean, the only claim is no propagation; you always need a gradient of sorts to update parameters, unless you just stumble upon the desired parameters. Even genetic algorithms effectively have gradients, which are obfuscated through random projections.
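
To make that concrete, here is a minimal sketch (my own illustration, not from the paper) of the standard evolution-strategies estimator: averaging random perturbations weighted by how they change the loss approximates the gradient of the smoothed objective, so "gradient-free" population updates are still following a gradient estimate:

    # Sketch: random perturbations ("projections") yield a gradient estimate
    # of the smoothed objective J(theta) = E[loss(theta + sigma * eps)].
    import numpy as np

    def es_gradient_estimate(loss, theta, sigma=0.1, n_samples=100, rng=None):
        """Estimate grad J(theta) from random perturbations eps (antithetic pairs)."""
        rng = np.random.default_rng() if rng is None else rng
        grad = np.zeros_like(theta)
        for _ in range(n_samples):
            eps = rng.standard_normal(theta.shape)
            f_plus = loss(theta + sigma * eps)
            f_minus = loss(theta - sigma * eps)
            grad += (f_plus - f_minus) / (2.0 * sigma) * eps
        return grad / n_samples

    # Usage: theta -= lr * es_gradient_estimate(loss, theta)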
replies(3): >>43678034 >>43679597 >>43679675
gsf_emergency_2 | No.43679675
GP's glancing at the pseudo-code is certainly an efficient way to dismiss an article, but something tells me he missed the crucial sentence in the abstract:

>"We believe this work takes a first step TOWARDS introducing a new family of GRADIENT-FREE learning methods"

I.e., for the time being, the authors can't convince themselves not to take advantage of efficient hardware for taking gradients.

(*Checks that Oxford University is not under sanctions*)