
161 points | belleville | 1 comment
itsthecourier | No.43677688
"Whenever these kind of papers come out I skim it looking for where they actually do backprop.

Check the pseudo code of their algorithms.

"Update using gradient based optimizations""

replies(4): >>43677717 >>43677878 >>43684074 >>43725019
f_devd | No.43677878
I mean, the only claim is no propagation; you always need a gradient of sorts to update parameters, unless you just stumble upon the desired parameters. Even genetic algorithms effectively have gradients, which are obfuscated through random projections.
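
To make that concrete, here is a minimal sketch (my own illustration, not from the paper) of the standard evolution-strategies estimator: averaging random perturbations weighted by how they change the loss approximates the gradient of the smoothed objective, so "gradient-free" population updates are still following a gradient estimate:

    # Sketch: random perturbations ("projections") yield a gradient estimate
    # of the smoothed objective J(theta) = E[loss(theta + sigma * eps)].
    import numpy as np

    def es_gradient_estimate(loss, theta, sigma=0.1, n_samples=100, rng=None):
        """Estimate grad J(theta) from random perturbations eps (antithetic pairs)."""
        rng = np.random.default_rng() if rng is None else rng
        grad = np.zeros_like(theta)
        for _ in range(n_samples):
            eps = rng.standard_normal(theta.shape)
            f_plus = loss(theta + sigma * eps)
            f_minus = loss(theta - sigma * eps)
            grad += (f_plus - f_minus) / (2.0 * sigma) * eps
        return grad / n_samples

    # Usage: theta -= lr * es_gradient_estimate(loss, theta)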
replies(3): >>43678034 >>43679597 >>43679675
gsf_emergency_2 | No.43679675
GP's glancing at the pseudo-code is certainly an efficient way to dismiss an article, but something tells me he missed the crucial sentence in the abstract:

>"We believe this work takes a first step TOWARDS introducing a new family of GRADIENT-FREE learning methods"

I.e., for the time being, the authors can't convince themselves not to take advantage of efficient hardware for taking gradients.

(*Checks that Oxford University is not under sanctions*)