1311 points msoad | 1 comment
abhimskywalker ◴[] No.35397063[source]
"The recent change also means you can run multiple LLaMA ./main processes at the same time, and they'll all share the same memory resources." So this could have a main process and multiple sub-worker LLM processes, possibly collaborating while sharing the same memory footprint?
replies(1): >>35398555 #
1. l33tman ◴[] No.35398555[source]
Yes, if the model is mmap'ed read-only (as I'm sure it is).

There are bottlenecks other than CPU cores, though, so it might not be very useful to run multiple instances in parallel.
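
To make the read-only mmap point concrete, here is a minimal Python sketch. It is not llama.cpp's actual loading code. The file is a small stand-in for model weights, and `make_fake_weights`, `read_in_children`, and `is_read_only` are hypothetical helper names made up for this illustration. Each child process maps the same file with `mmap.ACCESS_READ`; on typical operating systems such mappings are backed by the same page-cache pages, though the script only demonstrates the mapping pattern and does not measure actual memory sharing.

```python
import mmap
import os
import subprocess
import sys
import tempfile

# Script run by each child process: map the file read-only, print one byte.
CHILD = """
import mmap, sys
path, offset = sys.argv[1], int(sys.argv[2])
with open(path, "rb") as f:
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        print(mm[offset])
"""


def make_fake_weights(size=1 << 20):
    """Write a stand-in 'model' file (assumption: not a real LLaMA file)."""
    fd, path = tempfile.mkstemp(suffix=".bin")
    with os.fdopen(fd, "wb") as f:
        f.write(bytes(range(256)) * (size // 256))
    return path


def read_in_children(path, offsets):
    """Start one child per offset; each maps the same file independently."""
    procs = [
        subprocess.Popen(
            [sys.executable, "-c", CHILD, path, str(offset)],
            stdout=subprocess.PIPE,
            text=True,
        )
        for offset in offsets
    ]
    return {o: int(p.communicate()[0]) for o, p in zip(offsets, procs)}


def is_read_only(path):
    """Return True if writing through an ACCESS_READ mapping is rejected."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            try:
                mm[0] = 1
            except TypeError:
                return True
    return False


if __name__ == "__main__":
    path = make_fake_weights()
    try:
        print(read_in_children(path, [0, 300, 1000]))
        print(is_read_only(path))
    finally:
        os.remove(path)
```

Because no process can write through the mapping, the kernel never needs to give any of them a private copy of a page, which is what lets several workers sit on one copy of the weights.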