
190 points | arittr | 8 comments
firejake308 ◴[] No.43999554[source]
I'm confused why they are working on their own frontier models if they are going to be bought by OpenAI anyway. I guess this is something they were working on before the announcement?
replies(6): >>44002084 #>>44002155 #>>44002246 #>>44002598 #>>44002975 #>>44003869 #
allenleee ◴[] No.44002246[source]
It seems OpenAI acquired Windsurf but is letting it operate independently, keeping its own brand and developing its own coding models. That way, if Windsurf runs into technical problems, the backlash lands on Windsurf—not OpenAI. It’s a smart way to innovate while keeping the main brand safe.
replies(4): >>44002285 #>>44003466 #>>44005701 #>>44006118 #
riffraff ◴[] No.44002285[source]
But doesn't this mean they have twice the costs in training? I was under the impression that was still the most expensive item on these companies' balance sheets.
replies(2): >>44002309 #>>44003091 #
1. rfoo ◴[] No.44003091[source]
Mid/post-training does not cost that much, except maybe large-scale RL, and even that is more of an infra problem than a cost problem. If anything, the cost is mostly in running various experiments (i.e. the process of doing research).

It is very puzzling why "wrapper" companies don't (and religiously say they won't ever) do something on this front. The only barrier is talent.

replies(1): >>44003222 #
2. anshumankmr ◴[] No.44003222[source]
You might be underestimating the barrier to hiring the really smart people. OpenAI/Google etc. would be hiring and poaching people like crazy, offering cushy bonuses and TCs that would blow your mind (like, say, Noam Brown at OpenAI). And some of the more ambitious ones would start their own ventures (like, say, Ilya etc.).

That being said, I am sure a lot of the so-called wrapper companies are paying insanely well too, but competing with FAANGMULA might be trickier for them.

replies(2): >>44003495 #>>44003716 #
3. NitpickLawyer ◴[] No.44003495[source]
FAANGMULA ... Microsoft, Uber?, L??, Anthropic? Who's the L?
replies(2): >>44003605 #>>44006161 #
4. Archonical ◴[] No.44003605{3}[source]
Lyft.
5. whywhywhywhy ◴[] No.44003716[source]
Any half-decent and methodical software engineer can fine-tune/repurpose a model if they have the data and the money to burn on compute and experiment runs, which these companies do.
replies(2): >>44004352 #>>44004373 #
6. anshumankmr ◴[] No.44004352{3}[source]
Fine-tuning/distilling etc. is fine. I was speaking to the original commenter's question about research, which is where things are trickier. Fine-tuning is something even I have managed, and Unsloth has removed even more of the barriers to training some of the more commonly used open-source models.
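[Editor's note: the "fine-tuning is cheap and accessible" point in the comments above rests on parameter-efficient methods such as LoRA, which is roughly what tools like Unsloth optimize. A toy numpy sketch of the core idea follows; all names, shapes, and numbers here are illustrative, not Unsloth's or any library's actual API.]

```python
# Toy sketch of the LoRA idea behind cheap fine-tuning:
# freeze the pretrained weight W and train only a low-rank adapter B @ A.
import numpy as np

rng = np.random.default_rng(0)

d, k, r = 64, 64, 4          # layer dims and adapter rank (r << d, k)
W = rng.normal(size=(d, k))  # frozen pretrained weight (never updated)

# Trainable adapter: A starts small, B starts at zero so the
# adapted layer is initially identical to the frozen base.
A = rng.normal(size=(r, k)) * 0.01
B = np.zeros((d, r))

def forward(x, W, B, A, alpha=1.0):
    # Effective weight is W + alpha * (B @ A), applied without
    # ever materializing the full d-by-k update.
    return x @ W.T + alpha * (x @ A.T @ B.T)

x = rng.normal(size=(8, k))
# At init the adapted model matches the frozen base exactly.
assert np.allclose(forward(x, W, B, A), x @ W.T)

# Why it is cheap: trainable parameter counts.
full_params = d * k          # full fine-tune: 4096
lora_params = r * (d + k)    # adapter only:   512
print(full_params, lora_params)
```

With rank 4 on a 64x64 layer, the adapter trains 8x fewer parameters than a full fine-tune; at real model scale the ratio is far larger, which is why the barrier is data and taste rather than raw compute.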
7. brookst ◴[] No.44004373{3}[source]
They can absolutely do it, but they will get poorer results than someone who really understands LLMs. There is still a huge amount of taste and art in the sourcing and curation of data for fine tuning.
8. riffraff ◴[] No.44006161{3}[source]
A is Airbnb, afair.