235 points tosh | 3 comments
xanderlewis ◴[] No.40214349[source]
> Stripped of anything else, neural networks are compositions of differentiable primitives

I’m a sucker for statements like this. It almost feels philosophical, and makes the whole subject so much more comprehensible in only a single sentence.

I think François Chollet says something similar in his book on deep learning: one shouldn’t fall into the trap of anthropomorphising and mysticising models based on the ‘neural’ name; deep learning is simply the application of sequences of operations that are nonlinear (and hence capable of encoding arbitrary complexity) but nonetheless differentiable and so efficiently optimisable.

replies(12): >>40214569 #>>40214829 #>>40215168 #>>40215198 #>>40215245 #>>40215592 #>>40215628 #>>40216343 #>>40216719 #>>40216975 #>>40219489 #>>40219752 #
jxy ◴[] No.40215245[source]
> > Stripped of anything else, neural networks are compositions of differentiable primitives

> I’m a sucker for statements like this. It almost feels philosophical, and makes the whole subject so much more comprehensible in only a single sentence.

And I hate inaccurate statements like this. It pretends to be rigorously mathematical, but really just propagates erroneous information, and makes the whole article so much more amateur in only a single sentence.

The simple relu is continuous but not differentiable at 0, and its derivative is discontinuous at 0.
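
A minimal numerical sketch of the claim about differentiability at 0, in plain Python. The helper name `one_sided_slopes` and the step size `h` are illustrative assumptions, not anything from the article. It compares the left and right difference quotients of ReLU at 0; they disagree, so no two-sided derivative exists there.

```python
def relu(x: float) -> float:
    """ReLU: max(0, x)."""
    return max(0.0, x)


def one_sided_slopes(f, x0: float, h: float = 1e-6):
    """Return the (left, right) difference quotients of f at x0 with step h."""
    left = (f(x0) - f(x0 - h)) / h
    right = (f(x0 + h) - f(x0)) / h
    return left, right


left, right = one_sided_slopes(relu, 0.0)
print(f"left slope: {left}, right slope: {right}")
# prints "left slope: 0.0, right slope: 1.0"
```

Away from 0 the two quotients agree up to rounding, so 0 is the only point where the slopes differ.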

replies(3): >>40215358 #>>40215380 #>>40233579 #
xanderlewis ◴[] No.40215358[source]
It’s not ‘inaccurate’. The mark of true mastery is an ability to make terse statements that convey a huge amount without involving excessive formality or discussion of by-the-by technical details. If ever you’ve spoken to world-renowned experts in pure mathematics or other highly technical and pedantic fields, you’ll find they’ll say all sorts of ‘inaccurate’ things in conversation (or even in written documents). It doesn’t make them worthless; far from it.

If you want to have a war of petty pedantry, let’s go: the derivative of ReLU can’t be discontinuous at zero, as you say, because continuity (or indeed discontinuity) of a function at x requires the function to have a value at x (which is the negation of what your first statement correctly claims).

replies(3): >>40215704 #>>40216643 #>>40218402 #
kragen ◴[] No.40215704[source]
my experience with world-renowned experts in pure mathematics is that they are much more careful than the average bear to explicitly qualify inaccurate things as inaccurate, because their discipline requires them to be very clear about precisely what they are saying

discontinuity of a function at x does not, according to the usual definition of 'continuity', require the function to have a value at x; indeed, functions that fail to have a value at x are necessarily discontinuous there, precisely because (as you say) they are not continuous there. https://en.wikipedia.org/wiki/Continuous_function#Definition...

there are other definitions of 'discontinuous' in use, but i can't think of one that would give the result you claim

replies(1): >>40216360 #
xanderlewis ◴[] No.40216360[source]
> they are much more careful than the average bear to explicitly qualify inaccurate things as inaccurate

Sure. But what part of this very short statement, worded entirely in natural language, made you think it was a technical, formal statement? I think you’re just taking an opportunity to flex your knowledge of basic calculus, and deliberately attributing intent to the author that isn’t there in order to look clever.

Regarding a function being discontinuous at a point outside its domain: if you take a completely naive view of what ‘discontinuous’ means, then I suppose you can say so. But discontinuity is just the logical negation of continuity. Observe:

To say that f: X → Y (in this context, a real-valued function of real numbers) is continuous means precisely

∀x∈X ∀ε>0 ∃δ>0 ∀p∈X (|x - p| < δ ⇒ |f(x) - f(p)| < ε)

and so its negation looks like

∃x∈X ¬ …

that is, there is a point in X, the domain of f where continuity fails.
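
Written out in full, and assuming p in the formula above ranges over the points of X, the negation would read roughly as follows. This is a sketch of the standard ε-δ negation rather than a quote from any particular text; the thing to notice is that the witness x is drawn from X.

```latex
\neg\bigl(f \text{ is continuous on } X\bigr)
\iff
\exists x \in X \;\; \exists \varepsilon > 0 \;\; \forall \delta > 0 \;\; \exists p \in X :
\quad |x - p| < \delta \;\wedge\; |f(x) - f(p)| \ge \varepsilon
```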

For example, you wouldn’t talk about a function defined on the integers being discontinuous at pi, would you? That would just be weird.

To prove the point further, observe that the set of discontinuities (according to your definition) of any given function would actually include every number… in fact every mathematical object in the universe — which would make it not even a set in ZFC. So it’s absurd.

Even more reasons to believe functions can only be discontinuous at points of their domain: a function is said to be discontinuous if it has at least one discontinuity. By your definition, every function is discontinuous.

…anyway, I said we were going to be petty. I’m trying to demonstrate this is a waste of time by wasting my own time.

replies(2): >>40216448 #>>40219519 #
1. kragen ◴[] No.40219519[source]
you have an interesting point of view, and some of the things you have said are correct, but if you try to use gradient descent on a function from, say, ℤ → ℝ, you are going to be a very sad xanda. i would indeed describe such a function as being discontinuous not just at π but everywhere, at least with the usual definition of continuity (though there is a sense in which such a function could be, for example, scott-continuous)

even in the case of a single discontinuity in the derivative, like in relu', you lose the intermediate value theorem and everything that follows from it; it's not an inconsequential or marginally relevant fact

replies(1): >>40220344 #
2. jj3 ◴[] No.40220344[source]
Note that any function ℤ → ℝ is continuous on its domain but nowhere differentiable.
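
A short sketch of why continuity holds, assuming ℤ carries the usual distance |x - p| inherited from ℝ: with δ = 1/2, the only integer within δ of p is p itself, so the ε condition is satisfied trivially.

```latex
\forall p \in \mathbb{Z} \;\; \forall \varepsilon > 0 : \quad
\delta = \tfrac{1}{2}, \qquad
x \in \mathbb{Z},\; |x - p| < \tfrac{1}{2}
\;\Longrightarrow\; x = p
\;\Longrightarrow\; |f(x) - f(p)| = 0 < \varepsilon
```

The same isolation is what rules out differentiability: no integer is a limit point of ℤ, so the limit of the difference quotient is not defined at any point.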

A Scott-continuous function ℤ → ℝ must be monotone. So not every such function is Scott-continuous.

replies(1): >>40229945 #
3. kragen ◴[] No.40229945[source]
aha, thanks!