
41 points by fangpenlin | 1 comment
colordrops No.43235976
This looks fun. The author mentions machine learning workloads. What are typical machine learning use cases for a cluster of lower-end GPUs?

While on that topic, why must large-model inference be done on a single large GPU and/or bank of memory rather than a cluster of them? Is there any promise of eventually being able to run large models on clusters of weaker GPUs?

fangpenlin No.43236735
You can check Exo out:

https://github.com/exo-explore/exo

It's a project designed to run a large model in a distributed manner. My need for GPUs comes from my own machine-learning research pet project (mostly evolutionary neural network models for now), which is a bit different from inference needs. Training is yet another story.
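
To give a rough sense of what "running a large model in a distributed manner" means, here's a generic pipeline-parallel sketch in PyTorch (my own toy illustration, not Exo's actual API): split the layer stack into stages, put each stage on a different device, and pass only activations across the boundary.

    # Toy pipeline-parallel inference: shard a layer stack across two
    # devices so neither one has to hold all the weights. Falls back
    # to CPU if two CUDA devices aren't available.
    import torch
    import torch.nn as nn

    multi_gpu = torch.cuda.device_count() >= 2
    dev0 = torch.device("cuda:0" if multi_gpu else "cpu")
    dev1 = torch.device("cuda:1" if multi_gpu else "cpu")

    # A stand-in "large" model: a stack of linear layers.
    layers = [nn.Linear(1024, 1024) for _ in range(8)]
    stage0 = nn.Sequential(*layers[:4]).to(dev0)
    stage1 = nn.Sequential(*layers[4:]).to(dev1)

    @torch.no_grad()
    def forward(x: torch.Tensor) -> torch.Tensor:
        # Only the activation tensor crosses the device boundary,
        # never the weights.
        h = stage0(x.to(dev0))
        return stage1(h.to(dev1))

    print(forward(torch.randn(1, 1024)).shape)  # torch.Size([1, 1024])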

But yeah, I agree. I think machine learning should become more distributed in the future.
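
The evolutionary side, for what it's worth, is an especially natural fit for a cluster of weak GPUs: scoring a population of candidate networks is embarrassingly parallel, so each node can evaluate its own slice with no cross-device traffic. A toy loop (the fitness function here is a stand-in; a real run would push each evaluation onto one of the cluster's GPUs):

    # Minimal evolutionary search: the evaluate() calls are independent,
    # which is exactly the part a cluster can fan out across GPUs.
    import random

    POP_SIZE, GENES = 64, 32

    def evaluate(genome):
        # Stand-in fitness; a real one would run the candidate network.
        return -sum((g - 0.5) ** 2 for g in genome)

    def mutate(genome):
        return [g + random.gauss(0, 0.1) for g in genome]

    population = [[random.random() for _ in range(GENES)]
                  for _ in range(POP_SIZE)]

    for generation in range(10):
        ranked = sorted(population, key=evaluate, reverse=True)
        elite = ranked[: POP_SIZE // 4]
        population = elite + [mutate(random.choice(elite))
                              for _ in range(POP_SIZE - len(elite))]

    print("best fitness:", evaluate(max(population, key=evaluate)))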

colordrops No.43237473
Exo looks awesome, exactly what I had in mind, thank you.