223 points by benkaiser | 1 comment

ChuckMcM:
Interesting that it takes an LLM with 405 BILLION parameters to accurately recall text from a document of slightly less than 728 THOUSAND words (the document is nearly six decimal orders of magnitude smaller than the parameter count, but still).
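A quick back-of-envelope check of that gap, in Python (the 405 billion and 728 thousand figures come from the comment above; the rest is just arithmetic to make the ratio explicit):

    import math

    params = 405e9   # parameter count quoted above
    words = 728e3    # document word count quoted above

    ratio = params / words
    print(f"parameters per document word: {ratio:,.0f}")    # ~556,319
    print(f"orders of magnitude: {math.log10(ratio):.2f}")  # ~5.75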
paxys:
How many books/documents are stored in those 405 billion parameters?
ChuckMcM:
That is a good question, and it raises the question of whether a parameter should be understood as a kind of compression artifact/constant. If you've read chapter 4 of Feynman's Lectures on Computation, where he discusses information coding, you'll get a sense of where I'm coming from. There is some conceptually reversible function in LLMs that goes from book/document => parameters => book/document. The parameters are the controlling information of that function, so what does the information contained in a parameter represent with respect to a book/document?
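A toy calculation along those lines, comparing raw bit counts (a minimal sketch, not a claim about how LLMs actually encode text: the 16-bit weight width, ~5 characters per word, and the roughly 1 bit per character Shannon estimated for English entropy are all assumptions):

    PARAMS = 405e9          # parameter count from the thread
    BITS_PER_PARAM = 16     # assuming 16-bit weights
    WORDS = 728e3           # document size from the thread
    CHARS_PER_WORD = 5      # rough English average
    BITS_PER_CHAR = 1.0     # Shannon's entropy estimate for English

    capacity_bits = PARAMS * BITS_PER_PARAM                 # ~6.5e12 bits
    document_bits = WORDS * CHARS_PER_WORD * BITS_PER_CHAR  # ~3.6e6 bits
    print(f"capacity/document ratio: {capacity_bits / document_bits:.2e}")  # ~1.8e6

On that naive accounting the weights could hold the document roughly two million times over, but of course they also have to encode everything else the model can do, which is paxys's point.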