
247 points nabla9 | 1 comment
gcanyon ◴[] No.41833456[source]
One that isn't listed here, and which is critical to machine learning, is the idea of near-orthogonality. In 2D or 3D space you can only have 2 or 3 mutually orthogonal directions, and relaxing to near-orthogonality doesn't really buy you anything. But in higher dimensions you can reasonably work with directions that are only approximately orthogonal, and "approximately" gets surprisingly loose once you reach thousands of dimensions -- something like 75 degrees apart is fine (I'm writing this from memory, don't quote me). And the number of orthogonal-enough directions you can fit scales as maybe as much as 10^sqrt(dimension_count), meaning that yes, if your embeddings have 10,000 dimensions, you might be able to have literally 10^100 different orthogonal-enough directions. That property is critical for turning embeddings + machine learning into LLMs.
replies(5): >>41833539 #>>41834446 #>>41835280 #>>41835565 #>>41861970 #
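
A minimal numpy sketch of the claim above (the dimensions and sample counts are arbitrary, chosen only for illustration): draw random unit vectors and see how far the most-aligned pair drifts from 90 degrees. At d=3 the closest pair among a few hundred random directions is nearly parallel; at d=10,000 every pair should land within a few degrees of 90.

    import numpy as np

    # Arbitrary sizes for illustration only.
    rng = np.random.default_rng(0)
    n = 500                                            # random directions per trial

    for d in (3, 100, 10_000):                         # dimensions to compare
        v = rng.standard_normal((n, d))
        v /= np.linalg.norm(v, axis=1, keepdims=True)  # project onto the unit sphere
        cos = v @ v.T                                  # pairwise cosine similarities
        np.fill_diagonal(cos, 0.0)                     # ignore self-similarity
        worst = np.abs(cos).max()                      # the most-aligned pair
        angle = np.degrees(np.arccos(worst))           # its angle, in degrees
        print(f"d={d:6d}  worst |cos| = {worst:.3f}  worst pair angle = {angle:.1f} deg")
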
1. sigmoid10 ◴[] No.41835280[source]
This is actually just another way to see the third example (concentration of measure). As you increase the number of dimensions, the contribution of each basis-vector component to, say, the cosine of the angle (computed via the scalar product) becomes less important. So in three dimensions you'll get a pretty large angle if one component of a vector points along a different basis vector, but in 10,000 dimensions that angle will be tiny.
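
A quick numeric check of that point (again numpy, with arbitrary sizes): take a vector with equal weight on every basis direction, move one of its components onto a fresh axis, and measure the angle between the two. The cosine works out to (d-1)/d, so the angle shrinks from about 48 degrees at d=3 to under a degree at d=10,000.

    import numpy as np

    # Arbitrary dimensions for illustration only.
    for d in (3, 100, 10_000):
        u = np.append(np.ones(d), 0.0)    # equal weight on basis vectors e_1 .. e_d
        v = np.append(np.ones(d), 0.0)
        v[0], v[-1] = 0.0, 1.0            # swap the e_1 component onto a new axis e_{d+1}
        cos = u @ v / (np.linalg.norm(u) * np.linalg.norm(v))   # equals (d-1)/d
        angle = np.degrees(np.arccos(cos))
        print(f"d={d:6d}  cos = {cos:.4f}  angle = {angle:5.2f} deg")
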