> 100 000 points drawn from a Gaussian distribution and passed through a randomly initialized neural network.
It gives you a sense of how complex the folding of the space by NNs can be. And also the complexity of the patterns that they can pick up.
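This kind of plot is easy to recreate. A minimal numpy sketch (the architecture behind the linked demo is unknown; the two tanh layers and widths here are arbitrary stand-ins):

```python
import numpy as np

rng = np.random.default_rng(0)

# 100,000 2-D points from a standard Gaussian.
x = rng.standard_normal((100_000, 2))

# A small randomly initialized network: two hidden tanh layers, 2-D output.
w1 = rng.standard_normal((2, 64));  b1 = rng.standard_normal(64)
w2 = rng.standard_normal((64, 64)); b2 = rng.standard_normal(64)
w3 = rng.standard_normal((64, 2));  b3 = rng.standard_normal(2)

h = np.tanh(x @ w1 + b1)
h = np.tanh(h @ w2 + b2)
y = h @ w3 + b3  # the "folded" point cloud; scatter-plot y to see the structure
```

Scatter-plotting `y` shows the Gaussian blob stretched and folded into a far more intricate shape.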
The author defines "definite scale" as meaning there is a maximum and a minimum, but that is not true for a Gaussian distribution. It is also not true that, if we keep sampling wealth (the article's example of a distribution without definite scale), there is no limit to the maximum.
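The first point is easy to check numerically: the maximum of a growing Gaussian sample keeps increasing (roughly like sqrt(2 ln n)), so there is no hard maximum, only increasingly rare extremes. A quick numpy sketch:

```python
import numpy as np

rng = np.random.default_rng(1)

# The running maximum of standard Gaussian samples is unbounded:
# it keeps creeping up (roughly like sqrt(2 ln n)) as n grows.
samples = rng.standard_normal(1_000_000)
for n in (10**2, 10**4, 10**6):
    print(n, samples[:n].max())
```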
The mean from the article is 163. So the facts check out; the author is correct.
Also very interesting: the suggestion that human height is not Gaussian.
Snip:

> Why female soldiers only? If we were to mix male and female soldiers, we would get a distribution with two peaks, which would not be Gaussian.

Which raises the question: what other human statistics are non-Gaussian if the sexes are mixed, and does this apply to other strong differentiators like historical time, nutrition, neural tribes?

> Statistics is highly non-trivial.
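The two-peaks claim is easy to reproduce. A numpy sketch with made-up numbers (means 160 and 178 cm, common SD 6 cm are illustrative, not survey values):

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical height distributions for the two sexes (not real survey data).
female = rng.normal(160, 6, 50_000)
male   = rng.normal(178, 6, 50_000)
mixed  = np.concatenate([female, male])

# With well-separated means, the mixture's histogram shows two peaks
# with a dip between them, so it is clearly not Gaussian even though
# both components are.
counts, edges = np.histogram(mixed, bins=60)
```

Note the mixture is only visibly bimodal when the means are more than about two standard deviations apart; with closer means the two humps merge into one.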
I mean, sure, it is possible for all the air molecules to randomly go to the same corner of the room at the same time (heck, in some sense it is inevitable), and you can play the motion back in reverse to check that no laws of physics were broken, but in practice that simply does not happen.
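For a sense of scale, a back-of-the-envelope sketch: treating each of N molecules as independently landing in the chosen half of the room with probability 1/2 (a toy model that ignores interactions), the chance of all N being there at once is 2^-N:

```python
from fractions import Fraction

def p_all_in_half(n: int) -> Fraction:
    # Each molecule independently in the chosen half with probability 1/2,
    # so all n coincide there with probability (1/2)**n.
    return Fraction(1, 2) ** n

print(p_all_in_half(10))           # already 1 in 1024 for just ten molecules
print(float(p_all_in_half(100)))   # vanishingly small long before real molecule counts
```

A real room holds on the order of 10^25 molecules, so the actual probability is beyond astronomically small.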
Perhaps I have a different understanding of "symbolic": the article proceeds to use various symbolic expressions and equations. Why say this above if you're not going to follow through? Visuals are there, but only peppered in.
(Not complaining about this article, which is illuminating).
It also felt like this could be a good topic for a 3b1b video... and... here's the 3b1b video on gaussians: https://www.youtube.com/watch?v=d_qvLDhkg00
To me, this is one of the most astonishing things about probability theory, as well as one of the most useful.
The normal distribution is just one of a class of "stable distributions," all sharing the property that sums of their random variables stay in the same family.
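The stability property is easy to see numerically. A sketch contrasting the Cauchy distribution (stable, but with no variance, so averaging never concentrates it) with what CLT intuition would suggest:

```python
import numpy as np

rng = np.random.default_rng(3)

# Stability: the mean of n i.i.d. standard Cauchy draws is *exactly*
# standard Cauchy again, so averaging does not concentrate it at all
# (the usual CLT does not apply: the Cauchy has no finite variance).
n, reps = 100, 200_000
cauchy_means = rng.standard_cauchy((reps, n)).mean(axis=1)

# For a standard Cauchy the interquartile range is exactly 2
# (quartiles at tan(+/- pi/4) = +/- 1); the averaged samples match it.
iqr = np.subtract(*np.percentile(cauchy_means, [75, 25]))
print(iqr)  # close to 2, despite averaging 100 draws
```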
The same idea can be generalized much further. The underlying idea is the distribution of "things" as they get asymptotically "bigger." The density of eigenvalues of random matrices with i.i.d. entries approaches the Wigner semicircle distribution, which is exactly what it sounds like. It plays the role of the normal distribution in the very practically promising theory of free (noncommutative) probability.
https://en.wikipedia.org/wiki/Wigner_semicircle_distribution
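A quick numerical check of the semicircle law (the standard random-matrix recipe, not anything from the article): the eigenvalues of a large symmetric matrix with i.i.d. Gaussian entries, suitably scaled, fill out the interval [-2, 2]:

```python
import numpy as np

rng = np.random.default_rng(4)

# Symmetrize an i.i.d. Gaussian matrix and scale by sqrt(n); the eigenvalue
# histogram then approaches the Wigner semicircle supported on [-2, 2].
n = 1000
a = rng.standard_normal((n, n))
h = (a + a.T) / np.sqrt(2 * n)
eigs = np.linalg.eigvalsh(h)

print(eigs.min(), eigs.max())  # edges land near -2 and +2
```

Plotting a histogram of `eigs` against the density (1 / (2 pi)) * sqrt(4 - x^2) shows the semicircle shape directly.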
Further reading:
https://terrytao.wordpress.com/2010/01/05/254a-notes-2-the-c...
*there are a few normal-distribution CLTs, but this is the intuitive one that usually matters in practice