
164 points by ksec | 5 comments
vessenes
Short version: a Qwen2.5 7B model that has been turned into a diffusion model.

A couple of notable things: first, that you can do this at all (left-to-right model -> out-of-order diffusion via fine-tuning), which is really interesting. Second, the final version beats the original by a small margin on some benchmarks. Third, it's in the ballpark of Gemini Diffusion, although not competitive; that's to be expected for any 7B-parameter model.
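
For the curious, the usual recipe for this kind of conversion in the masked-diffusion literature is to drop the causal attention mask and fine-tune on a masked-token denoising loss. A minimal sketch of that objective; the mask-token id and function names here are illustrative, not the authors' actual training code:

```python
import torch
import torch.nn.functional as F

MASK_ID = 151666  # hypothetical: a [MASK] token id added to the tokenizer

def masked_diffusion_loss(model, input_ids):
    """One fine-tuning step: corrupt a random fraction of tokens, predict them back."""
    batch, seq_len = input_ids.shape
    t = torch.rand(batch, 1, device=input_ids.device)  # corruption level per sequence
    mask = torch.rand(batch, seq_len, device=input_ids.device) < t
    mask[:, 0] = True  # guarantee at least one target per row
    corrupted = input_ids.masked_fill(mask, MASK_ID)
    # NOTE: the model must run with bidirectional (non-causal) attention here;
    # that switch is the heart of the left-to-right -> diffusion conversion.
    logits = model(corrupted).logits  # (batch, seq_len, vocab)
    loss = F.cross_entropy(logits[mask], input_ids[mask], reduction="none")
    # Reweight by 1/t so lightly- and heavily-corrupted sequences contribute comparably.
    return (loss / t.expand(batch, seq_len)[mask]).mean()
```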

A diffusion model comes with a lot of benefits in terms of parallelization and therefore speed; to my mind the architecture is a better fit for coding than strict left-to-right generation.
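
Concretely, the speedup comes from committing several tokens per denoising pass rather than one token per forward pass. A toy confidence-based sampler, assuming hypothetical names and an HuggingFace-style model that exposes `.logits`:

```python
import torch

@torch.no_grad()
def parallel_decode(model, prompt_ids, gen_len=64, steps=8, mask_id=151666):
    """Fill gen_len masked positions in ~steps denoising passes,
    instead of gen_len strictly left-to-right ones."""
    seq = torch.cat([prompt_ids,
                     torch.full((gen_len,), mask_id, dtype=prompt_ids.dtype)])
    per_step = (gen_len + steps - 1) // steps  # tokens to commit per pass
    while True:
        masked = (seq == mask_id).nonzero(as_tuple=True)[0]
        if masked.numel() == 0:
            break
        logits = model(seq.unsqueeze(0)).logits[0]  # (seq_len, vocab)
        conf, pred = logits[masked].softmax(-1).max(-1)
        # Commit only the most confident predictions; the rest stay masked
        # and get re-predicted next pass with more context filled in.
        top = conf.topk(min(per_step, masked.numel())).indices
        seq[masked[top]] = pred[top]
    return seq
```

With steps=8 and gen_len=64 that's 8 forward passes where an autoregressive model would need 64; real samplers use smarter unmasking schedules, but the shape of the win is the same.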

Overall, interesting. At some point these local models will get good enough for ‘real work’, and they will be slotted in at API providers rapidly. Apple’s game is on-device; I think we’ll see descendants of these start shipping with Xcode in the next year, just as part of the coding experience.

koakuma-chan
> At some point these local models will get good enough for ‘real work’

Are these small models good enough for anything but autocomplete?

_heimdall
Isn't that all they're designed for?

They predict more than just the second half of the word you're typing, but at the end of the day they're still just predicting what a human would have typed.

koakuma-chan
I'm disappointed because I don't use autocomplete.
MangoToupe
Given that's 99% of my usage of it, that alone would make me quite happy.
Eggpants
Most of the "magic" of large models is really just function calls, so as long as the small models have access to the same functions, they work well. They fixed the "how many R's in strawberry" issue by offloading the question to a function, not by spending a godly amount of money/energy on training another model.

Oops, sorry, "tools". Gotta maintain the grift that these statistics-based, lossy-text-compression bar tricks are "thinking".
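
For what it's worth, the offloading pattern being described is trivial to sketch; tool name and dispatch wiring here are hypothetical, but the point stands: the model only has to emit a structured call, and deterministic code does the counting:

```python
def count_letter(text: str, letter: str) -> int:
    """Deterministic helper: count occurrences of a letter, case-insensitively."""
    return text.lower().count(letter.lower())

TOOLS = {"count_letter": count_letter}

def dispatch(tool_call: dict) -> str:
    """Run a model-emitted call such as
    {"name": "count_letter", "arguments": {"text": "strawberry", "letter": "r"}}."""
    fn = TOOLS[tool_call["name"]]
    return str(fn(**tool_call["arguments"]))

print(dispatch({"name": "count_letter",
                "arguments": {"text": "strawberry", "letter": "r"}}))  # -> 3
```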