What's currently stopping me (apart from library support) from running a single command that starts up WSGI workers and Celery workers in a single process?
Nothing, it's just that these aren't first-class features of the language. Also, as someone already explained, the GIL is mostly about technical debt in the CPython interpreter, so there are reasons other than full parallelism to get rid of it.
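To make the "nothing" concrete, here is a rough stdlib-only sketch of a web worker and a task worker sharing one process. It is not how Celery or any real WSGI server does it: the in-memory `queue.Queue` is just a stand-in for a broker, and `make_app`, `task_worker` and `run_once` are names I made up for illustration. With the GIL, the two threads take turns executing Python bytecode rather than running it in parallel.

```python
import http.client
import queue
import threading
from wsgiref.simple_server import WSGIRequestHandler, make_server


def make_app(jobs):
    """Build a WSGI app that enqueues each request path as a job (hypothetical)."""
    def app(environ, start_response):
        jobs.put(environ["PATH_INFO"])
        start_response("202 Accepted", [("Content-Type", "text/plain")])
        return [b"queued\n"]
    return app


def task_worker(jobs, results):
    """Stand-in for a Celery worker: process jobs until a None sentinel arrives."""
    while True:
        job = jobs.get()
        if job is None:
            break
        results.append(job.upper())  # the "task" is just upper-casing the path


class QuietHandler(WSGIRequestHandler):
    def log_message(self, *args):
        pass  # suppress the per-request access log


def run_once(path="/hello"):
    """Start both workers in this process, send one request, then shut down."""
    jobs, results = queue.Queue(), []
    worker = threading.Thread(target=task_worker, args=(jobs, results))
    worker.start()
    # Port 0 asks the OS for any free port.
    with make_server("127.0.0.1", 0, make_app(jobs), handler_class=QuietHandler) as server:
        port = server.server_address[1]
        web = threading.Thread(target=server.serve_forever)
        web.start()
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", path)
        body = conn.getresponse().read()
        conn.close()
        server.shutdown()
        web.join()
    jobs.put(None)  # tell the task worker to stop once the queue is drained
    worker.join()
    return body, results
```

Calling `run_once("/hello")` serves one request and lets the background thread process the job it queued, all from a single command and a single process.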