
1311 points msoad | 2 comments
1. abhimskywalker No.35397063
"The recent change also means you can run multiple LLaMA ./main processes at the same time, and they'll all share the same memory resources." So this could have a main process and multiple sub-worker LLM processes, possibly collaborating, while sharing the same memory footprint?
replies(1): >>35398555
2. l33tman No.35398555
Yes, if the model is mmap'ed read-only (as I'm sure it is).

There are bottlenecks other than CPU cores, though, so it might not be very useful to run multiple in parallel.
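
For anyone who wants to see the read-only mmap sharing in action, here is a minimal sketch in Python. It is not llama.cpp's actual code. The file name, the helper names (`write_fake_weights`, `checksum_via_mmap`) and the 1 MiB size are made up for illustration. Each worker maps the same file read-only, so the OS can back all the mappings with the same page-cache pages instead of giving every process its own copy.

```python
import mmap
import os
import tempfile
from multiprocessing import Pool

CHUNK = 64 * 1024  # read the mapping in 64 KiB slices to keep private copies small


def write_fake_weights(path, n_bytes):
    """Write a stand-in 'model file' (hypothetical; real weights are far larger)."""
    pattern = bytes(range(256))
    with open(path, "wb") as f:
        f.write((pattern * (n_bytes // 256 + 1))[:n_bytes])


def checksum_via_mmap(path):
    """Map the file read-only and sum its bytes.

    ACCESS_READ asks for a read-only mapping; the pages come from the OS
    page cache, which every process mapping this file can share.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            total = 0
            for start in range(0, len(mapped), CHUNK):
                total += sum(mapped[start:start + CHUNK])
            return total


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "fake_weights.bin")
        write_fake_weights(path, 1 << 20)  # 1 MiB stand-in for a model file
        # Four worker processes, each creating its own read-only mapping
        # of the same file.
        with Pool(processes=4) as pool:
            results = pool.map(checksum_via_mmap, [path] * 4)
        print(results)
```

This only shows the sharing of the mapped file. It says nothing about throughput, which is where the other bottlenecks come in.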