←back to thread

135 points lnyan | 1 comments | | HN request time: 0.212s | source
Show context
llm_trw ◴[] No.42731112[source]
>The estimated training time for the end-to-end model on an 8×H100 machine is 2.6 days.

That's a $250,000 machine for the micro budget. Or if you don't want to do it locally ~$2,000 to do it on someone else's machine for the one model.

replies(4): >>42731300 #>>42731535 #>>42732654 #>>42738646 #
1. vessenes ◴[] No.42738646[source]
It’s still super cheap. I think the better nit to pick is that it’s almost a freebie after NVIDIA’s (hardware) and the rest of the world (software) billions of dollars of R&D and CapEx have gone into the full training stack. So, this is more like a blessing of the commons.