
190 points | arittr | 8 comments
firejake308 ◴[] No.43999554[source]
I'm confused why they are working on their own frontier models if they are going to be bought by OpenAI anyway. I guess this is something they were working on before the announcement?
replies(6): >>44002084 #>>44002155 #>>44002246 #>>44002598 #>>44002975 #>>44003869 #
allenleee ◴[] No.44002246[source]
It seems OpenAI acquired Windsurf but is letting it operate independently, keeping its own brand and developing its own coding models. That way, if Windsurf runs into technical problems, the backlash lands on Windsurf—not OpenAI. It’s a smart way to innovate while keeping the main brand safe.
replies(4): >>44002285 #>>44003466 #>>44005701 #>>44006118 #
riffraff ◴[] No.44002285[source]
But doesn't this mean they have twice the costs in training? I was under the impression that was still the most expensive item on these companies' balance sheets.
replies(2): >>44002309 #>>44003091 #
1. rfoo ◴[] No.44003091[source]
Mid/post-training does not cost that much, except maybe large-scale RL, and even that is more of an infra problem than a cost problem. If anything, the cost is mostly in running various experiments (i.e. the process of doing research).

It is very puzzling why "wrapper" companies don't (and religiously say they won't ever) do something on this front. The only barrier is talent.

replies(1): >>44003222 #
2. anshumankmr ◴[] No.44003222[source]
You might be underestimating the barrier to hiring the really smart people. OpenAI/Google etc. would be hiring and poaching people like crazy, offering cushy bonuses and TCs that would blow your mind (like, say, Noam Brown at OpenAI). And some of the more ambitious ones would start their own ventures (like, say, Ilya etc.).

That being said, I am sure a lot of the so-called wrapper companies are paying insanely well too, but competing with FAANGMULA might be trickier for them.

replies(2): >>44003495 #>>44003716 #
3. NitpickLawyer ◴[] No.44003495[source]
FAANGMULA ... Microsoft, Uber?, L??, Anthropic? Who's the L?
replies(2): >>44003605 #>>44006161 #
4. Archonical ◴[] No.44003605{3}[source]
Lyft.
5. whywhywhywhy ◴[] No.44003716[source]
Any half-decent and methodical software engineer can fine-tune/repurpose a model if they have the data and the money to burn on compute and experiment runs, which these companies do.
replies(2): >>44004352 #>>44004373 #
6. anshumankmr ◴[] No.44004352{3}[source]
Fine-tuning/distilling etc. is fine. I was speaking to the original commenter's question about research, which is where things are trickier. Fine-tuning is something even I have managed, and Unsloth has removed even more of the barriers to training some of the more commonly used open-source models.
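[Editor's note: the "fine-tuning is cheap and accessible" point in the comments above rests on parameter-efficient methods such as LoRA, which is roughly what tools like Unsloth optimize. A toy numpy sketch of the core idea follows; all names, shapes, and numbers here are illustrative, not Unsloth's or any library's actual API.]

```python
# Toy sketch of the LoRA idea behind cheap fine-tuning:
# freeze the pretrained weight W and train only a low-rank adapter B @ A.
import numpy as np

rng = np.random.default_rng(0)

d, k, r = 64, 64, 4          # layer dims and adapter rank (r << d, k)
W = rng.normal(size=(d, k))  # frozen pretrained weight (never updated)

# Trainable adapter: A starts small, B starts at zero so the
# adapted layer is initially identical to the frozen base.
A = rng.normal(size=(r, k)) * 0.01
B = np.zeros((d, r))

def forward(x, W, B, A, alpha=1.0):
    # Effective weight is W + alpha * (B @ A), applied without
    # ever materializing the full d-by-k update.
    return x @ W.T + alpha * (x @ A.T @ B.T)

x = rng.normal(size=(8, k))
# At init the adapted model matches the frozen base exactly.
assert np.allclose(forward(x, W, B, A), x @ W.T)

# Why it is cheap: trainable parameter counts.
full_params = d * k          # full fine-tune: 4096
lora_params = r * (d + k)    # adapter only:   512
print(full_params, lora_params)
```

With rank 4 on a 64x64 layer, the adapter trains 8x fewer parameters than a full fine-tune; at real model scale the ratio is far larger, which is why the barrier is data and taste rather than raw compute.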
7. brookst ◴[] No.44004373{3}[source]
They can absolutely do it, but they will get poorer results than someone who really understands LLMs. There is still a huge amount of taste and art in the sourcing and curation of data for fine tuning.
8. riffraff ◴[] No.44006161{3}[source]
A is Airbnb, afair.