296 points todsacerdoti | 1 comments
marcosdumay ◴[] No.44367966[source]
Yeah, make the network deeper.

When all you have is a hammer... It makes a lot of sense that a transformation layer that makes the tokens more semantically relevant will help optimize the entire network after it and increase the effective size of your context window. And one of the main immediate obstacles stopping those models from being intelligent is context window size.

On the other hand, the current models already cost something along the lines of the median country GDP to train, and they are nowhere close to that in value. The saying that "if brute force didn't solve your problem, you didn't apply enough force" is meant to be taken as a joke.

jagraff ◴[] No.44368591[source]
I think the median country GDP is something like $100 billion.

https://en.wikipedia.org/wiki/List_of_countries_by_GDP_(PPP)

Models are expensive, but they're not that expensive.

kordlessagain ◴[] No.44368858[source]
The median country GDP is approximately $48.8 billion, which corresponds to Uganda at position 90 with $48.769 billion.

The largest economy (US) has a GDP of $27.7 trillion.

The smallest economy (Tuvalu) has a GDP of $62.3 million.

The $48.8 billion figure is the midpoint: half of all countries have larger GDPs and half have smaller ones.

1. hoseja ◴[] No.44374638[source]
Well then you have to agree that $48.8 billion IS "something like $100 billion".