Erlang is not as performant for heavy computational loads. That is borne out in many benchmarks; it's just not what it's good at.
Share-nothing message passing adds overhead when processes need to reference the same data, because the data must be copied. How would you do multithreaded processing of a large amount of data without a shared heap in a performant way? Only one thread can work on a given heap at a time, so what do you do? Chop up the data, copy it around, then piece it back together afterward? That's overhead compared to just working on a single heap in a lockless way. As far as I can tell, the main Erlang image processing libraries just call out to C libraries, which says something about that kind of work.
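To make the copy-versus-shared-heap point concrete, here is a rough sketch in Python rather than Erlang. The function names `brighten_copy` and `brighten_shared` and the "add 10 to every pixel" workload are made up for illustration. Note that CPython's GIL means these threads won't actually run in parallel, so this only shows the difference in data movement, not a benchmark.

```python
from concurrent.futures import ThreadPoolExecutor


def brighten_copy(pixels: bytes, workers: int = 4) -> bytes:
    """Share-nothing style: copy each chunk out, process it, stitch results together."""
    step = max(1, (len(pixels) + workers - 1) // workers)
    # Slicing bytes copies the data, much like sending it in a message.
    chunks = [pixels[i:i + step] for i in range(0, len(pixels), step)]
    with ThreadPoolExecutor(workers) as pool:
        done = pool.map(lambda c: bytes(min(p + 10, 255) for p in c), chunks)
        # Joining is a second copy to reassemble the result.
        return b"".join(done)


def brighten_shared(pixels: bytearray, workers: int = 4) -> None:
    """Shared-heap style: each worker mutates its own disjoint range in place."""
    step = max(1, (len(pixels) + workers - 1) // workers)

    def work(start: int) -> None:
        # Ranges never overlap, so no lock is needed.
        for i in range(start, min(start + step, len(pixels))):
            pixels[i] = min(pixels[i] + 10, 255)

    with ThreadPoolExecutor(workers) as pool:
        list(pool.map(work, range(0, len(pixels), step)))
```

Both produce the same pixels, but the first version touches every byte two extra times (copy out, copy back), and that is the overhead I mean.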
Yes, Erlang dispatches computation onto a pool of OS threads, and multiplexing all those little Erlang processes onto real threads creates scheduling overhead. Those threads cannot work on the same data at the same time unless they call out to a library written in another language, like C, to do the heavy lifting.
.NET does something similar for, say, web server implementations: it uses a thread pool to execute many concurrent requests, and if you use async it can yield those threads back to the pool while, say, waiting on a DB call to complete. You would not create a thread per HTTP connection, so the 4 MB stack size is not an issue, just as it's not with Erlang's thread pool.
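A small sketch of the "async yields the thread while waiting" idea, again in Python rather than C#. It uses asyncio's single-threaded event loop instead of a .NET-style thread pool, so it is only an analogy; `fake_db_call`, `handle_request`, `serve` and `run_demo` are hypothetical names, and the sleep stands in for a real database round trip.

```python
import asyncio
import threading


async def fake_db_call(request_id: int) -> int:
    # Stand-in for waiting on a database; the thread is free during the wait.
    await asyncio.sleep(0.01)
    return request_id * 2


async def handle_request(request_id: int, threads_seen: set) -> int:
    threads_seen.add(threading.get_ident())
    # Awaiting hands control back so other requests can run on the same thread.
    result = await fake_db_call(request_id)
    threads_seen.add(threading.get_ident())
    return result


async def serve(n: int):
    threads_seen: set = set()
    results = await asyncio.gather(
        *(handle_request(i, threads_seen) for i in range(n))
    )
    return results, threads_seen


def run_demo(n: int = 1000):
    # Returns the per-request results and the set of OS thread ids used.
    return asyncio.run(serve(n))
```

Calling `run_demo(1000)` handles a thousand overlapping "requests" without creating a thousand threads, which is why per-thread stack size stops mattering in this model.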