Embeddings are underrated (2024)

Yeah, the ethics around _training_ models that generate embeddings are still suspect to me, but using embeddings as a cheap, efficient way to get semantic similarity seems very valuable. I've started dipping my toes into real, honest-to-goodness "machine learning" at work. So far it has mostly involved having OpenAI create embeddings for the support logs my team generates, and we're starting to get value out of being able to cluster certain types of issues together, which I'm excited by. This kind of stuff is truly augmentative: representing complex ideas in easily searchable vector spaces and making connections in datasets too vast for humans to comb through alone. That's actual value.
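To make the clustering idea concrete, here's a minimal sketch of grouping items by cosine similarity. It is not the actual pipeline described above: the tiny hand-written vectors stand in for real embeddings (which would come from an embedding API and have hundreds or thousands of dimensions), the names `cosine_similarity` and `greedy_cluster` are made up for illustration, and the 0.9 threshold is an arbitrary assumption you'd have to tune.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def greedy_cluster(vectors, threshold=0.9):
    """Assign each vector to the first cluster whose first member is similar enough.

    Returns a list of clusters, each a list of indices into `vectors`.
    This is a simple one-pass heuristic, not k-means or a library algorithm.
    """
    clusters = []
    for i, vec in enumerate(vectors):
        for cluster in clusters:
            representative = vectors[cluster[0]]
            if cosine_similarity(vec, representative) >= threshold:
                cluster.append(i)
                break
        else:
            # No existing cluster was close enough, so start a new one.
            clusters.append([i])
    return clusters


# Toy 3-dimensional vectors standing in for embeddings of support logs.
# The labels in the comments are hypothetical examples.
toy_embeddings = [
    [0.9, 0.1, 0.0],     # "login fails"
    [0.85, 0.15, 0.05],  # "can't sign in"
    [0.0, 0.2, 0.95],    # "invoice is wrong"
    [0.05, 0.1, 0.9],    # "billed twice"
]

clusters = greedy_cluster(toy_embeddings, threshold=0.9)
print(clusters)
```

With these toy vectors, the two login-style items land in one group and the two billing-style items in another. On real data you'd likely reach for a proper clustering algorithm from a library instead of this one-pass heuristic, but the underlying idea is the same: nearby vectors mean similar text.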