
171 points by martinald | 3 comments
krackers ◴[] No.44538619[source]
Probably the results were worse than the K2 model released today. No serious engineer would say it's for "safety" reasons, given that ablation nullifies any safety post-training.
replies(1): >>44538817 #
1. simonw ◴[] No.44538817[source]
I'm expecting (and indeed hoping) that the open weights OpenAI model is a lot smaller than K2. K2 is 1 trillion parameters and almost a terabyte to download! There's no way I'm running that on my laptop.

I think the sweet spot for local models may be around the 20B size - that's Mistral Small 3.x and some of the Gemma 3 models. They're very capable and run in less than 32GB of RAM.
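A rough back-of-the-envelope check on those sizes (a sketch only: it assumes ~8 bits per weight for the K2 download and ~4-bit quantization for a local model, and counts weights alone, ignoring KV cache and runtime overhead):

```python
# Back-of-the-envelope model-memory estimates.
# Assumption (illustrative): weights dominate; K2 ships at ~8 bits/param,
# local models typically run at ~4 bits/param when quantized.

def weight_size_gb(params_billion: float, bits_per_param: float) -> float:
    """Approximate size of the weights alone, in gigabytes."""
    bytes_total = params_billion * 1e9 * bits_per_param / 8
    return bytes_total / 1e9

# Kimi K2: ~1 trillion parameters at 8-bit -> roughly a terabyte to download.
print(f"K2 @ 8-bit:   ~{weight_size_gb(1000, 8):.0f} GB")

# A ~20B model at 4-bit quantization fits well under 32 GB of RAM,
# leaving headroom for the KV cache and the OS.
print(f"20B @ 4-bit:  ~{weight_size_gb(20, 4):.0f} GB")
# Mistral Small 3.x (24B) and Gemma 3 27B land in the same ballpark.
print(f"24B @ 4-bit:  ~{weight_size_gb(24, 4):.0f} GB")
print(f"27B @ 4-bit:  ~{weight_size_gb(27, 4):.0f} GB")
```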

I really hope OpenAI puts one out in that weight class, personally.

replies(2): >>44539739 #>>44541806 #
2. NitpickLawyer ◴[] No.44539739[source]
Early rumours (from a hosting company that apparently got early access) were that you'd need "multiple H100s to run it", so I doubt it's a Gemma / Mistral Small tier model.
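For scale, a sketch of what "multiple H100s" would imply, assuming 80 GB of HBM per card and counting weights only (real deployments also need KV-cache and activation headroom):

```python
# If a model needs more than one 80 GB H100 just for its weights,
# that bounds the parameter count from below.
H100_GB = 80

def min_params_billion(num_gpus: int, bits_per_param: float) -> float:
    """Parameter count whose weights alone would fill the given GPUs."""
    total_bytes = num_gpus * H100_GB * 1e9
    return total_bytes / (bits_per_param / 8) / 1e9

# Overflowing a single H100 at 8-bit already implies more than ~80B params,
# well above the Gemma 3 / Mistral Small tier.
print(f">1 H100 @ 8-bit  -> >{min_params_billion(1, 8):.0f}B params")
print(f">2 H100 @ 16-bit -> >{min_params_billion(2, 16):.0f}B params")
```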
3. aabhay ◴[] No.44541806[source]
You will get a 20GB model. Distillation is so compute-efficient that it's all but inevitable: if OpenAI doesn't do it, numerous other companies will.
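For context on why distillation is so cheap relative to pre-training: the student only has to match the teacher's output distribution rather than relearn everything from raw data. A minimal sketch of the standard soft-label distillation loss (dummy tensors; the temperature and mixing weight are illustrative):

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      labels: torch.Tensor,
                      temperature: float = 2.0,
                      alpha: float = 0.5) -> torch.Tensor:
    """Blend of soft-label KL against the teacher and hard-label cross-entropy."""
    # Soften both distributions with the same temperature.
    soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    # KL(teacher || student), scaled by T^2 as in the original distillation paper.
    kd = F.kl_div(soft_student, soft_teacher, reduction="batchmean") * temperature ** 2
    # Ordinary next-token cross-entropy on the hard labels.
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1 - alpha) * ce

# Dummy shapes: batch of 4, vocabulary of 32_000.
student_logits = torch.randn(4, 32_000, requires_grad=True)
teacher_logits = torch.randn(4, 32_000)  # produced by the big frozen teacher
labels = torch.randint(0, 32_000, (4,))
loss = distillation_loss(student_logits, teacher_logits, labels)
loss.backward()
```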

I would rather have an open weights model that's the best possible one I can run and fine-tune myself, allowing me to exceed SOTA models on the narrower domain my customers care about.
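As a sketch of that narrow-domain fine-tuning workflow, here is the usual parameter-efficient (LoRA) setup with the Hugging Face peft library. The model ID and target module names are placeholders, since the open weights model doesn't exist yet:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

# Placeholder checkpoint: swap in whatever open weights model OpenAI ships.
model_id = "openai/open-weights-20b"  # hypothetical name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Low-rank adapters on the attention projections: only a tiny fraction of
# the weights are trained, so a single consumer GPU can handle it.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # module names vary by architecture
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of total params

# From here: train on the customer-domain dataset with the usual transformers
# Trainer or a hand-rolled loop, then merge or ship the adapters.
```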