Zamba2-7B

(www.zyphra.com)
282 points dataminer | 1 comment
potatoman22 No.41843419
I wonder how much of the performance gains can be attributed to their improved dataset rather than their architecture. That would be an expensive experiment.
replies(1): >>41854593 #
1. hack_ml No.41854593
The ablation studies and the dataset can be found here: https://www.zyphra.com/post/building-zyda-2