
Zamba2-7B

(www.zyphra.com)
282 points | dataminer | 1 comment
wg0:
If a model was trained in 1837, would it be useful even today? And how will models be trained in 2037, when most of the web might be autogenerated on the fly, like in the cgi-bin era?
Etheryte:
State-of-the-art models aren't trained the way the first models were. A high-quality dataset is both more valuable and more useful than simply feeding the model everything you could possibly crawl. Throwing in the kitchen sink and then some is a great way to burn money while also hurting your model's accuracy.
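A minimal sketch of the kind of heuristic quality filtering Etheryte alludes to; the scoring rule and threshold here are invented for illustration, not any lab's actual pipeline (real ones use trained classifiers, perplexity filters, deduplication, and so on):

    # Sketch of heuristic quality filtering for pretraining data.
    # The scoring rule and 0.4 threshold are made up for illustration.

    def quality_score(doc: str) -> float:
        """Crude heuristic: reject very short docs, reward lexical diversity."""
        words = doc.split()
        if len(words) < 50:
            return 0.0
        return len(set(words)) / len(words)

    def filter_corpus(docs, threshold=0.4):
        """Keep only documents scoring above the threshold,
        instead of feeding in everything crawled."""
        return [d for d in docs if quality_score(d) >= threshold]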
kettleballroll:
Are there any publications out there analyzing this in more depth? How are these datasets scheduled? Do you present your highest-quality data first, or do you train on "dumb" data until the model establishes some general language understanding before giving it the high-quality material? There is a lot of interesting research to do here that I'm sure people have already investigated.
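For what it's worth, one scheduling pattern that shows up in published training recipes is annealing: train mostly on bulk web data, then up-weight a curated high-quality subset near the end of the run. A rough sketch, with an invented phase boundary and invented mixing weights:

    # Sketch of a two-phase data schedule: mostly bulk web data early,
    # annealing onto a curated high-quality subset late in training.
    # The 0.8 boundary and 0.1/0.9 mixing weights are invented numbers.

    import random

    def sample_doc(web_docs, curated_docs, step, total_steps, anneal_start=0.8):
        """Sample one training document; after anneal_start of the run,
        shift sampling probability toward the curated set."""
        p_curated = 0.1 if step < anneal_start * total_steps else 0.9
        pool = curated_docs if random.random() < p_curated else web_docs
        return random.choice(pool)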