The abstraction makes it possible to submit multiple requests and only then begin to inquire about their results.
The abstraction allows for, but does not require, a concurrent implementation.
However, the intent behind the abstraction is that there be concurrency. The motivation is to obtain certain benefits which will not be realized without concurrency.
Some asynchronous abstractions cannot be implemented without some concurrency. Suppose the manner by which the requestor is informed about the completion of a request is not a blocking request on a completion queue, but a callback.
Now, yes, a callback can be issued in the context of the requesting thread, so everything is single-threaded. But if the requesting thread holds a non-recursive mutex, that ruse will reveal itself by causing a deadlock.
In other words, we can have an asynchronous request abstraction that positively will not work single-threaded:
1. caller locks a mutex
2. caller submits request
3. caller unlocks mutex
4. completion callback occurs
If step 2 generates a callback in the same thread, and that callback needs the mutex, it blocks on a lock its own thread already holds, so step 3 is never reached.
The implementation must use some minimal concurrency: a second thread that delivers the callback and, if need be, waits on the mutex until step 3, while the requestor remains free to reach that step.
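
The four steps can be sketched in Python, whose `threading.Lock` is non-recursive. This is only an illustration under assumptions: `submit_inline`, `submit_threaded`, `on_complete`, and `run` are hypothetical names, not part of any real API, and the callback is assumed to need the same mutex the caller holds. A timeout on `acquire` stands in for the deadlock so the demo terminates instead of hanging.

```python
import threading

mutex = threading.Lock()   # non-recursive: a second acquire in the same thread blocks
results = []

def on_complete(value):
    # Hypothetical completion callback; assumed to need the caller's mutex.
    # The timeout stands in for a deadlock so the demo can finish.
    if not mutex.acquire(timeout=0.5):
        return False       # without the timeout this would block forever
    try:
        results.append(value)
        return True
    finally:
        mutex.release()

def submit_inline(value, outcome):
    # "Asynchronous" in name only: the callback runs in the requesting thread.
    outcome.append(on_complete(value))
    return None

def submit_threaded(value, outcome):
    # Minimal concurrency: a second thread delivers the callback.
    worker = threading.Thread(target=lambda: outcome.append(on_complete(value)))
    worker.start()
    return worker

def run(submit):
    outcome = []
    with mutex:                        # step 1: caller locks a mutex
        worker = submit(42, outcome)   # step 2: caller submits request
    # step 3: leaving the with-block unlocks the mutex
    if worker is not None:
        worker.join()                  # step 4: completion callback occurs
    return outcome[0]

print(run(submit_inline))    # False: the callback could not get the mutex
print(run(submit_threaded))  # True: the callback ran once step 3 was reached
```

In the inline case the callback is attempted while the requesting thread still holds the mutex, so it can never succeed. In the threaded case the worker simply waits until the requestor unlocks.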