xanderlewis ◴[] No.40214349[source]
> Stripped of anything else, neural networks are compositions of differentiable primitives

I’m a sucker for statements like this. It almost feels philosophical, and makes the whole subject so much more comprehensible in only a single sentence.

I think François Chollet says something similar in his book on deep learning: one shouldn’t fall into the trap of anthropomorphising and mystifying models based on the ‘neural’ name; deep learning is simply the application of sequences of operations that are nonlinear (and hence capable of encoding arbitrary complexity) but nonetheless differentiable and so efficiently optimisable.
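
To make that concrete, here is a minimal sketch (the toy 2-4-1 network, the made-up training pair and all the names are my own illustration, not something from Chollet's book or the article): the forward pass is a composition of differentiable primitives, the backward pass is the chain rule applied to that composition, and the last lines are one plain gradient-descent step — the "efficiently optimisable" part.

    import numpy as np

    rng = np.random.default_rng(0)
    # Tiny 2-4-1 network: parameters of the two "primitive" affine maps.
    W1, b1 = rng.normal(size=(4, 2)), np.zeros(4)
    W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)

    def relu(z):                          # nonlinear but differentiable (a.e.)
        return np.maximum(z, 0.0)

    x, y = np.array([0.5, -1.0]), np.array([1.0])   # made-up training pair

    # Forward pass: a composition of differentiable primitives.
    z1 = W1 @ x + b1
    h = relu(z1)
    yhat = W2 @ h + b2
    loss = 0.5 * np.sum((yhat - y) ** 2)

    # Backward pass: the chain rule applied to that composition.
    dyhat = yhat - y                      # dL/dyhat
    dW2 = np.outer(dyhat, h)              # dL/dW2
    db2 = dyhat
    dh = W2.T @ dyhat                     # dL/dh
    dz1 = dh * (z1 > 0)                   # ReLU derivative as a mask
    dW1 = np.outer(dz1, x)
    db1 = dz1

    # One plain gradient-descent step.
    lr = 0.1
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

A framework like PyTorch or TensorFlow generates the backward half automatically, but it isn't doing anything more mysterious than this.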

replies(12): >>40214569 #>>40214829 #>>40215168 #>>40215198 #>>40215245 #>>40215592 #>>40215628 #>>40216343 #>>40216719 #>>40216975 #>>40219489 #>>40219752 #
andoando ◴[] No.40214569[source]
What does "differentiable primitives" mean here?
replies(4): >>40214623 #>>40214658 #>>40215206 #>>40215221 #
xanderlewis ◴[] No.40214658[source]
I think it’s referring to ‘primitive functions’ in the sense that they’re the building blocks of more complicated functions. If f and g are differentiable, f+g, fg, f/g (as long as g is never zero)… and so on are differentiable too. Importantly, f composed with g is also differentiable, and so since the output of the whole network as a function of its input is a composition of these ‘primitives’ it’s differentiable too.
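
Concretely, differentiability of the composite comes from the chain rule: (f ∘ g)′(x) = f′(g(x)) · g′(x). The derivative of the whole is assembled from the derivatives of the parts, which is exactly what backpropagation exploits, layer by layer.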

The actual primitive functions in this case would be things like the weighted sums of activations in the previous layer to get the activation of a given layer, and the actual ‘activation functions’ (traditionally something like a sigmoid function; these days a ReLU) associated with each layer.
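
Spelled out in code, those primitives and their derivatives are only a couple of lines each (a sketch with my own function names, nothing canonical):

    import numpy as np

    def affine(x, W, b):      # weighted sum of the previous layer's activations
        return W @ x + b      # its derivative w.r.t. x is just W

    def sigmoid(z):           # the traditional activation
        return 1.0 / (1.0 + np.exp(-z))

    def d_sigmoid(z):         # sigmoid'(z) = sigmoid(z) * (1 - sigmoid(z))
        s = sigmoid(z)
        return s * (1.0 - s)

    def relu(z):              # the modern default
        return np.maximum(z, 0.0)

    def d_relu(z):            # 0 below zero, 1 above (undefined exactly at 0)
        return np.where(z > 0, 1.0, 0.0)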

‘Primitives’ is also sometimes used as a synonym for antiderivatives, but I don’t think that’s what it means here.

Edit: it just occurred to me from a comment below that you might have meant to ask what the ‘differentiable’ part means. See https://en.wikipedia.org/wiki/Differentiable_function.

replies(1): >>40215568 #
andoando ◴[] No.40215568[source]
Is this function composition essentially lambda calculus then?
replies(2): >>40216384 #>>40216613 #
OJFord ◴[] No.40216384[source]
Function composition is just f(g(x)), considered as a single function that's the composition of f and g; it has the domain of g and the range of f.

In lambda calculus terminology it's an 'application' (with a function argument).
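
A toy illustration of those two notions (my own example, written in Python rather than in lambda notation): compose builds a new function out of f and g; calling the result on an argument is the application.

    # compose corresponds to the lambda term λf. λg. λx. f (g x)
    def compose(f, g):
        return lambda x: f(g(x))

    inc = lambda x: x + 1          # g
    square = lambda x: x * x       # f

    h = compose(square, inc)       # h(x) = (x + 1) ** 2
    print(h(3))                    # application of h to 3 -> prints 16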