
Zamba2-7B

(www.zyphra.com)
282 points | dataminer | 1 comment
wg0:
If a model was trained in 1837, would it be useful even today? And how will models be trained in 2037, when most of the web might be autogenerated on the fly, like in the cgi-bin era?
Etheryte:
State-of-the-art models aren't trained the way the first models were. A high-quality dataset is both more valuable and more useful than simply feeding the model everything you could possibly crawl. Throwing in the kitchen sink and then some is a great way to burn money while also hurting your model's accuracy.
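A minimal sketch of the kind of heuristic quality filtering Etheryte alludes to; the scoring rule and threshold here are invented for illustration, not any lab's actual pipeline (real ones use trained classifiers, perplexity filters, deduplication, and so on):

    # Sketch of heuristic quality filtering for pretraining data.
    # The scoring rule and 0.4 threshold are made up for illustration.

    def quality_score(doc: str) -> float:
        """Crude heuristic: reject very short docs, reward lexical diversity."""
        words = doc.split()
        if len(words) < 50:
            return 0.0
        return len(set(words)) / len(words)

    def filter_corpus(docs, threshold=0.4):
        """Keep only documents scoring above the threshold,
        instead of feeding in everything crawled."""
        return [d for d in docs if quality_score(d) >= threshold]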
kettleballroll:
Are there any publications out there analyzing this in more depth? How are these datasets scheduled? Do you present your highest-quality data first, or do you train on "dumb" data until the model establishes some general language understanding before giving it the high-quality material? There is a lot of interesting research to do here that I'm sure people have already investigated.
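For what it's worth, one scheduling pattern that shows up in published training recipes is annealing: train mostly on bulk web data, then up-weight a curated high-quality subset near the end of the run. A rough sketch, with an invented phase boundary and invented mixing weights:

    # Sketch of a two-phase data schedule: mostly bulk web data early,
    # annealing onto a curated high-quality subset late in training.
    # The 0.8 boundary and 0.1/0.9 mixing weights are invented numbers.

    import random

    def sample_doc(web_docs, curated_docs, step, total_steps, anneal_start=0.8):
        """Sample one training document; after anneal_start of the run,
        shift sampling probability toward the curated set."""
        p_curated = 0.1 if step < anneal_start * total_steps else 0.9
        pool = curated_docs if random.random() < p_curated else web_docs
        return random.choice(pool)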