How does this differ from direct-threaded interpreters?
It seems to solve the same problem (saving the function-call overhead on dispatch) and to have the same downside (it requires non-standard compiler extensions).
EDIT: It seems the answer is that compilers do not play well with direct-threaded interpreters: they can perform more and better optimizations on normal-sized functions than on one massive block.
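For anyone who wants to see the two shapes side by side, here is a rough sketch in C. The three-opcode bytecode and every name in it (`demo_prog`, `run_threaded`, `run_tailcall`, `MUSTTAIL`, `DISPATCH`) are made up for illustration. The first version uses GNU "labels as values" (computed goto), which is how threaded dispatch is usually written in C; strictly speaking it is token-threaded, since it looks each opcode up in a table. The second puts each handler in its own small function and dispatches with a tail call, using the `musttail` attribute where the compiler reports support for it and otherwise just hoping the optimizer turns the call into a jump.

```c
#include <stdint.h>

/* Hypothetical three-opcode bytecode, invented for this sketch. */
enum { OP_INC, OP_DEC, OP_HALT };

static const uint8_t demo_prog[] = { OP_INC, OP_INC, OP_INC, OP_DEC, OP_HALT };

/* Style 1: computed-goto dispatch (GCC/Clang "labels as values" extension).
   Every handler lives inside one function, so the compiler has to allocate
   registers and optimize across the whole block at once. */
long run_threaded(const uint8_t *pc) {
    static void *const labels[] = { &&do_inc, &&do_dec, &&do_halt };
    long acc = 0;
    goto *labels[*pc++];
do_inc:  acc++; goto *labels[*pc++];
do_dec:  acc--; goto *labels[*pc++];
do_halt: return acc;
}

/* Style 2: one small function per opcode, chained by tail calls.
   musttail is used only if the compiler says it supports it; the fallback
   is an ordinary call that the optimizer may or may not turn into a jump. */
#if defined(__has_attribute)
#  if __has_attribute(musttail)
#    define MUSTTAIL __attribute__((musttail))
#  endif
#endif
#ifndef MUSTTAIL
#  define MUSTTAIL /* no guarantee: relies on the optimizer */
#endif

typedef long handler_fn(const uint8_t *pc, long acc);
static handler_fn op_inc, op_dec, op_halt;
static handler_fn *const handlers[] = { op_inc, op_dec, op_halt };

/* Fetch the opcode at pc and jump to its handler, passing state as arguments. */
#define DISPATCH(pc, acc) MUSTTAIL return handlers[*(pc)]((pc) + 1, (acc))

static long op_inc(const uint8_t *pc, long acc)  { DISPATCH(pc, acc + 1); }
static long op_dec(const uint8_t *pc, long acc)  { DISPATCH(pc, acc - 1); }
static long op_halt(const uint8_t *pc, long acc) { (void)pc; return acc; }

long run_tailcall(const uint8_t *pc) { return handlers[*pc](pc + 1, 0); }
```

Both `run_threaded(demo_prog)` and `run_tailcall(demo_prog)` compute the same accumulator value. The difference the EDIT points at is only in what the compiler sees: one large function full of indirect jumps in the first case, several ordinary small functions in the second.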
replies(3):