Eventually I wrote an "image sorter" that I found was hanging when the browser tried to download images in parallel. The image serving should not have been CPU bound (I was even using sendfile()), but I think other requests would hold up the CPU and block the tiny amount of CPU needed to set up that sendfile.
So I switched from aiohttp to the Flask API and serve with either Flask or Gunicorn. I even front it with Microsoft IIS or nginx to handle the images so Python doesn't have to. It is a minor hassle because I develop on Windows, so I have to run Gunicorn inside WSL2, but it works great and I don't have to think about server performance anymore.
Reminds me of how long it took some to go from Python 2 to Python 3.
I personally blame low async adoption in Python on 1) general reduction in its popularity vs Typescript+node, which is driven by the desire to have a single stack on the frontend and backend, not by bad or good async implementations in Python (see also: Rails, once the poster child of the Web, now nearly forgotten) 2) lack of good async stdlib. parallelism and concurrency are distant thirds.
Not worth the trouble. Shell pipelines are way easier to use. Or simply waiting —no pun intended— for the synchronous to finish.
"There should be one-- and preferably only one --obvious way to do it : Aim for a single, clear solution to a problem. "
I'm personally halfway through that journey (having spent like 4h reading docs/learning, on top of the development). I suspect it could have been designed in such a way so that it's less trivially easy to mess up.
The truth is that in python, async was too little, too late. By the time it was introduced, most people who actually needed to do lots of io concurrently had their own workarounds (forking, etc) and people who didn't actually need it had found out how to get by without it (multiprocessing etc).
Meanwhile, go showed us what good green threads can look like. Then java did it too. Meanwhile, js had better async support the whole time. But all it did was show us that async code just plain sucks compared to green thread code that can just block, instead of having to do the async dances.
So, why engage with it when you already had good solutions?
Asyncio means learning different syntax that buys me nothing over the existing tools. Why would I bother?
The traditional argument against the above assertion has been that asyncio is good for I/O work, not for CPU work, but this constraint is not realistic because CPU usage is guaranteed to creep in.
In summary, I can use threading/process/interpreter pools and concurrent futures, considering I need them anyway, without really needing to introduce yet another unnecessary concurrency paradigm (of asyncio).
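For readers who have not used the pool-based approach described above, here is a minimal sketch with `concurrent.futures`. The `fetch` function is a hypothetical stand-in for any blocking call; nothing here comes from the commenter's own code.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch(item: int) -> int:
    """Hypothetical stand-in for a blocking I/O call."""
    time.sleep(0.1)
    return item * 2


def run_all(items):
    # Each call blocks its own worker thread; the pool overlaps the waits.
    with ThreadPoolExecutor(max_workers=8) as pool:
        # map() returns results in input order, whatever order they finish in.
        return list(pool.map(fetch, items))


if __name__ == "__main__":
    print(run_all(range(5)))
```

Swapping `ThreadPoolExecutor` for `ProcessPoolExecutor` keeps the same shape for CPU-heavy work, as long as the function and its arguments can be pickled.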
The vast majority of the Python code I wrote in the last 5-6 years uses asyncio, and most of the complaints I see about it (hard to debug, getting stuck, etc.) were -- at least in my case -- because there were some other libraries doing unexpected things (like threading or hard sleep()).
Coming from a networking background, the way I can deal with I/O has been massively simplified, and coroutines are quite useful.
But as always in HN, I'm prepared for that to be an unpopular opinion.
Even then, nginx might be a better solution.
One of the most memorable "real software engineering" bugs of my career involved async Python. I was maintaining a FastAPI server which was consistently leaking file descriptors when making any outgoing HTTP requests due to failing to close the socket. This manifested in a few ways: once the server ran out of available file descriptors, it degraded to a bizarre world where it would accept new HTTP requests but then refuse to transmit any information, which was also exciting due to increasing the difficulty of remotely debugging this. Occasionally the server would run out of memory before running out of file descriptors on the OS, which was a fun red herring that resulted in at least one premature "I fixed the problem!" RAM bump.
The exact culprit was never found - I spent a full week debugging it, and concluded that the problem had to do with someone on the library/framework/system stack of FastAPI/aiohttp/asyncio having expectations about someone else in the stack closing the socket after picking up the async context, but that never actually occurring. It was impenetrable to me due to the constant context switching between the libraries and frameworks, such that I could not keep the thread of who (above my application layer) should have been closing it.
My solution was to monkey patch the native python socket class and add a FastAPI middleware layer so that anytime an outgoing socket opened, I'd add it to a map of sockets by incoming request ID. Then when the incoming request concluded I'd lookup sockets in the map and close them manually.
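A rough sketch of what such a workaround could look like, not the poster's actual code. All names (`TrackedSocket`, `current_request_id`, `close_sockets_for`) are hypothetical, and it assumes a middleware sets `current_request_id` when a request starts and calls `close_sockets_for` when it ends.

```python
import collections
import contextvars
import socket

# Hypothetical names throughout; this only illustrates the idea.
current_request_id = contextvars.ContextVar("current_request_id", default=None)
sockets_by_request = collections.defaultdict(list)
_OriginalSocket = socket.socket


class TrackedSocket(_OriginalSocket):
    """socket.socket subclass that records itself under the active request ID."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        request_id = current_request_id.get()
        if request_id is not None:
            sockets_by_request[request_id].append(self)


def install():
    """Monkey patch: code that looks up socket.socket afterwards gets TrackedSocket."""
    socket.socket = TrackedSocket


def close_sockets_for(request_id):
    """Close whatever is still open for a finished request; return how many were closed."""
    closed = 0
    for sock in sockets_by_request.pop(request_id, []):
        if sock.fileno() != -1:  # -1 means the socket was already closed
            sock.close()
            closed += 1
    return closed
```

A context variable fits here because asyncio tasks copy the current context when they are created, so sockets opened in child tasks would still be attributed to the request that spawned them.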
It worked, the servers were stable, and the only follow-up request was to please delete the annoying "Socket with file descriptor <x> manually closed" message from the logs, because they were cluttering things up. And thus, another brick in the wall of my opinion that I do not prefer Python for reliable, high-performance HTTP servers.
I take so much flak for this opinion at work, but I agree with you 100%.
Code that looks synchronous, but is really async, has funny failure modes and idiosyncrasies, and I generally see more bugs in the async parts of our code at work.
Maybe I’m just old, but I don’t think it’s worth it. Syntactic sugar over continuations/closures basically..
greenlet which is sort of minimal stackless .. before 2008
pycoev which is on one hand greenlets without memmove()s, on the other hand sort of io-scheduled m:n threading I wrote myself in 2009.
so, at least idk, 20 years?
It was first needed. Then 10 years passed, people got around to pushing it through the process aaand by the time it was done it was already not needed. so it all stalled. Same with Rust.
Nowadays server-side async is handled very differently. And client-side is dominated by that abomination called JS.
async is a concurrency mechanism.
I realized, years later, that the (non-)documentation was directed at people who were already familiar with the feature from Javascript. But I hadn't been familiar with it from Javascript and I didn't even know that Javascript had had such a feature.
So that's my tiny contribution to this discussion, one data point: Python's async might have been one unit more popular if it had had any documentation, or even a crossreference to the Javascript documentation.
AWSCLI was broken for over a year- we had to do a ton of work to deal with the various packaging issues.
Don't break userspace.
Technical things are largely popular for the same reason non-technical things are popular: trends. In other words, they are popular because other people perceive them to be popular. Humans are herd animals.
async is harder and associated with Node.js/JavaScript which probably makes it uncool for a certain influential python subculture.
But actually FastAPI has basically taken over and now I think people should recognize that means async IS popular in python at this point.
Snide remark aside, I actually like the Zen of Python as programming language folklore but in 2025 AD it's kinda crazy to pretend that Python actually adheres to those tenets or whatever you wish to call them, and I'd go as far as to claim that it does a disservice to a language flexible enough for a lot of use cases. There's even someone on YouTube developing a VR game with Python.
Use a tuple, maybe walrus, and return the last item[-1].
That is, if you use external stuff and can delegate work to them, then async is concurrent (async io for instance)
But if you do not, then async is regular code with extra steps
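A small timing sketch of that distinction. The function names are made up for illustration: `cooperative` delegates its wait to the event loop, while `hogging` does the same wait without ever yielding.

```python
import asyncio
import time


async def cooperative(delay):
    await asyncio.sleep(delay)  # hands control back to the event loop


async def hogging(delay):
    time.sleep(delay)  # never yields; the loop can run nothing else meanwhile


async def timed(worker, n=3, delay=0.2):
    """Run n copies of worker 'concurrently' and return the wall-clock time."""
    start = time.perf_counter()
    await asyncio.gather(*(worker(delay) for _ in range(n)))
    return time.perf_counter() - start


if __name__ == "__main__":
    print(f"cooperative: {asyncio.run(timed(cooperative)):.2f}s")
    print(f"hogging:     {asyncio.run(timed(hogging)):.2f}s")
```

The first run should take roughly one delay, the second roughly three, even though both are written with `async def` and `gather`.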
But, to sum it all up for those who want to talk here, there are several ways to look at concurrency but only one that matters. Is my program correct? How long will it take to make my program correct? Structured concurrency makes that clear(er) in the syntax of the language. Unstructured concurrency requires that you hold all the code in your head.
[1]: https://glyph.twistedmatrix.com/2014/02/unyielding.html
[2]: https://journal.stuffwithstuff.com/2015/02/01/what-color-is-...
[3]: https://vorpus.org/blog/notes-on-structured-concurrency-or-g...
the documentation is directed at people who want coroutines and futures, and know what that means. if you don't know what coroutines and futures are, the python docs aren't going to help you. the documentation isn't going to guide anybody into using the async features who aren't already seeking them out. and maybe that's intentional, but it's not going to grow adoption of the async features.
1) its infectious. You need to wrap everything in async or nothing.
2) it has non-obvious program flow.
Even though it is faster in a lot of cases (I had a benchmark off for a web/socket server for multi-threaded vs async with a colleague, and the async was faster.) for me it is a shit to force into a class.
The thing I like about threads is that the flow of data is there and laid out neatly _per thread_, whereas to me, async feels like surprise goto. async feels like it accepts a request, and then will at some point in the future either trigger more async, or crap out mixing loads of state from different requests all over the place.
To me it feels like a knotted wool bundle, whereas threaded/multi-process feels like a freshly wound bobbin.
Now, this is all viiiiiibes man, so it's subjective.
asyncio is easier than threads or multiprocess: fewer locking issues, easier to run small chunks of code in parallel (easier to await something than to create a thread that runs some method)
I'd add one other aspect that we sort of take for granted these days, but affordable multi-threaded CPUs have really taken off in the last 10 years.
Not only does the stack based on green-threads "just work" without coloring your codebase with async/no-async, it allows you to scale a single compute instance gracefully to 1 instance with N vCPUs vs N pods of 2-vCPU instances.
The comment you are responding to prefers green threads to be managed like goroutines, where the code looks synchronous, but really it's cooperative multitasking managed by the runtime, to explicit async/await.
But then you criticize "code that looks synchronous but is really async". So you prefer the explicit "async" keywords? What exactly is your preferred model here?
For any new app that is mostly I/O-bound I'd still encourage the use of asyncio from the beginning.
The bug has been open for 2 years, with zero fucks given. The workaround is "just use libuv": https://github.com/encode/uvicorn/issues/2167
I've seen other such cases, and I just gave up on trying to use async.
You may think of use of an async keyword as explicit async code but that is very much not the case.
If you want to see async code without the keyword, most of the code of Linux is asynchronous.
To be fair that also happens with other solutions.
I'll be sold on this when a green thread native UI paradigm becomes popular but it seems like all the languages with good native UI stories have async support.
> Because parallelism in Python using threads has always been so limited, the APIs in the standard library are quite rudimentary. I think there is an opportunity to have a task-parallelism API in the standard library once free-threading is stabilized.
> I think in 3.14 the sub-interpreter executor and free-threading features make more parallel and concurrency use cases practical and useful. For those, we don’t need async APIs and it alleviates much of the issues I highlighted in this post.
Armin recently put up a post that goes into those issues in more depth: https://lucumr.pocoo.org/2025/7/26/virtual-threads/
Which led me to a pre-PEP discussion regarding the possibility of Virtual Threads in Python, which was probably way more than I needed to know but found interesting: https://discuss.python.org/t/add-virtual-threads-to-python/9...
The memory and execution model for higher level work needs to not have async. Go is the canonical example of it done well from the user standpoint IMO.
Goroutines feel like old-school, threaded code to me. I spawn a goroutine and interact with other “threads” through well defined IPC. I can’t tell if I’m spawning a green thread or a “real” system thread.
C#’s async/await is different IMO and I prefer the other model. I think the async-concept gets overused (at my workplace at least).
If you know Haskell, I would compare it to overuse of laziness, when strictness would likely use fewer resources and be much easier to reason about. I see many of the same problems/bugs with async/await..
Doing async in python has the same fundamental design. You have an executor, a scheduler, and event-driven wakers on futures or promises. But you’re doing it in a fundamentally hand-cuffed environment.
You don’t get benefits like static compilation, real work-stealing, a large library ecosystem, or crazy performance boosts. Except in certain places in the stack.
Using fastapi with async is a game-changer. Writing a cli to download a bunch of stuff in parallel is great.
But if you want to use async to parse faster or make a parallel-friendly GUI, you are more than likely wasting your time using python. The benefits will be bottlenecked by other language design features. Still the GIL mostly.
I guess there is no reason you can’t make tokio in python with multiprocessing or subinterpreters, but to my knowledge that hasn’t been done.
Learning tokio was way more fun, too.
By now, the downsides are well-known, but I think Python's implementation did a few things that made it particularly unpleasant to use.
There is the usual "colored functions" problem. Python has that too, but on steroids: There are sync and async functions, but then some of the sync functions can only be called from an async function, because they expect an event loop to be present, while others must not be called from an async function because they block the thread or take a lot of CPU to run or just refuse to run if an event loop is detected. That makes at least four colors.
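One of those extra "colors" can be shown in a few lines: a plain `def` function that only works while an event loop is running. `schedule_ping` is a made-up name for illustration.

```python
import asyncio


def schedule_ping(results):
    """Plain `def`, but only callable while an event loop is running."""
    loop = asyncio.get_running_loop()  # raises RuntimeError without a running loop
    loop.call_soon(results.append, "ping")


async def main():
    results = []
    schedule_ping(results)   # fine: we are inside a running loop
    await asyncio.sleep(0)   # yield once so the loop runs the scheduled callback
    return results


def call_from_sync_code():
    try:
        schedule_ping([])
    except RuntimeError:
        return "no running event loop"
    return "ok"
```

Nothing in the signature of `schedule_ping` tells the caller which of the two contexts it needs, which is the complaint being made.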
The API has the same complexity: In JS, there are 3 primitives that you interact with in code: Sync functions, async functions and promises. (Understanding the event loop is needed to reason about the program, but it's never visible in the code).
Whereas Python has: Generators, Coroutines, Awaitables, Futures, Tasks, Event Loops, AsyncIterators and probably a few more.
All that for not much benefit in everyday situations. One of the biggest advantages of async/await was "fearless concurrency": The guarantee that your variables can only change at well-defined await points, and can only change "atomically". However, python can't actually give the first guarantee, because threaded code may run in parallel to your async code. The second guarantee already comes for free in all Python code, thanks to the GIL - you don't need async for that.
I recognise that this situation is possible, but I don't think I've ever seen it happen. Can you give an example?
In contrast preemptive green threads are too easy. Be it IO or CPU load all threads will get their slice of CPU time. Nothing is blocked so you can debug your logic errors instead of deadlocks everywhere.
Async works in JS so well because the entire language is designed for it, instead of async being just bolted on. You can't even run plain `sleep` to block, you need setTimeout.
Taking a general case, let's say a forum, in order to render a thread one needs to search for all posts from that thread, then get all the extra data needed for rendering and finally send the rendered output to the client.
In the "regular" way of doing this, one will compose a query, that will filter things out, join all the required data bla bla, send it to the database, wait for the answer from the database and all the data to be transferred over, loop over the results and do some rendering and send the thing over to the client.
It doesn't matter how async your app code is, in this way of doing things, the bottle neck is the database, as there is a fixed limit on how many things a db server can do at once and if doing one of these things takes a long time, you still end up waiting too much.
In order for async to work, one needs to split the work load into very small chunks that can be done in parallel and very fast, therefore, sending a big query and waiting for all the result data is out of the window.
An async approach would split the db query into a search query, that returns a list of object ids, say posts, then create N number of async tasks that given a post id will return a rendered result. These tasks will do their own query to retrieve the post data, then assemble another list of async tasks to get all the other data required and render each chunk and so on. Throw in a bunch of db replicas and you get the benefits of async.
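A toy sketch of that fan-out shape, with in-memory dictionaries standing in for the database. All names and data are hypothetical, and the sleeps only simulate query latency.

```python
import asyncio

# Hypothetical in-memory stand-ins for the database tables.
POSTS = {1: {"author": 10, "body": "first"}, 2: {"author": 11, "body": "second"}}
AUTHORS = {10: "alice", 11: "bob"}


async def search_thread(thread_id):
    await asyncio.sleep(0.01)  # simulated latency of the small search query
    return sorted(POSTS)       # returns only the post ids


async def load_post(post_id):
    await asyncio.sleep(0.01)
    return POSTS[post_id]


async def load_author(author_id):
    await asyncio.sleep(0.01)
    return AUTHORS[author_id]


async def render_post(post_id):
    # Each post fetches its own data with small, fast queries.
    post = await load_post(post_id)
    author = await load_author(post["author"])
    return f"{author}: {post['body']}"


async def render_thread(thread_id):
    post_ids = await search_thread(thread_id)
    # One coroutine per post; gather() returns results in input order.
    return await asyncio.gather(*(render_post(pid) for pid in post_ids))
```

Whether this beats one big join depends entirely on the database and the replicas behind it; the sketch only shows the shape of the code.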
This approach is not generally used, because, let's face it, we like making the systems we use do complicated things, eg complicated sql requests.
In general, the architectures developed because of the GIL, like Celery and gunicorn and stuff like that, handle most of the problems we run into that async/await solves with slightly better horizontal scaling IMO. The problem with a lot of async code is that it tends not to think beyond the single machine that's running it, and by the time you do, you need to rearchitect things to scale better horizontally anyway.
For most Python applications, especially with web development, just start with something like Celery and you're probably fine.
Thank you for explaining much more clearly than I could.
> none of the function coloring stuff
And it’s this part that I don’t like (and see colleagues struggling to implement correctly at work).
Ultimately Python already has function coloring, and libraries are forced into that. This proposal seems poorly thought out, and also too little too late.
If I remember correctly, the Python async API was still in experimental phase at that time.
I am happy to hear stories of using pypy or something to radically improve an architecture. I don’t have any from personal experience.
I guess twisted and stackless, a long time ago.
I use async all the time.
The evidence this post provides is that flask and Django aren’t all in on async.
That’s meaningless.
Let me try to clarify my point of view:
I don’t mean that async/await is more or less explicit than goroutines. I mean regular threaded code is more explicit than async/await code, and I prefer that.
I see colleagues struggle to correctly analyze resource usage for instance. Someone tries to parallelize some code (perhaps naively) by converting it to async/await and then runs out of memory.
Again, I don’t mean to judge anyone. I just observe that the async/await-flavor has more bugs in the code bases I work on.
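One common way to keep that kind of naive fan-out from exhausting memory is to cap how many units of work run at once. A minimal sketch with `asyncio.Semaphore`; the `process` function and the `stats` bookkeeping are hypothetical and exist only to make the cap observable.

```python
import asyncio


async def process(item, limiter, stats):
    async with limiter:  # at most `limit` bodies run at the same time
        stats["running"] += 1
        stats["peak"] = max(stats["peak"], stats["running"])
        await asyncio.sleep(0.01)  # hypothetical stand-in for the real work
        stats["running"] -= 1
    return item * 2


async def run_bounded(items, limit=3):
    limiter = asyncio.Semaphore(limit)
    stats = {"running": 0, "peak": 0}
    results = await asyncio.gather(*(process(i, limiter, stats) for i in items))
    return results, stats["peak"]
```

Note that `gather` still creates one task per item up front, so this bounds the working set of in-flight work rather than the number of task objects.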
> and also too little too late.
I think it very likely that Python will still be around and popular 10 years from now. Probably 20 years from now. And maybe 30 years from now. I think that's plenty of time for a new and good idea that addresses significant pain points to take root and become a predominant paradigm in the ecosystem.
So I don't agree that it's too little too late. But whether or not a Virtual Threads implementation can/will be developed and be good enough to gain wide adoption, I just can't speak to. If it's possible to create a better devx than async and get multi-core performance and usage, I'm all for the effort.
* Asyncio is pretty good, and is usually the best choice for non-blocking I/O in python these days.
* Asyncio doesn't add multi-core scaling to python. It's not a replacement for threads and doesn't lift the GIL-imposed scaling limitations. If these things are what you're after from asyncio you'll be disappointed, but they're not what it's trying to add and not adding them doesn't make it a failure.
* "Coloured functions" is a nonsense argument and that article made the whole world slightly more dumb.
* The GIL is part of the reason for python's success. I hope nogil either somehow manages to succeed without compromising the benefits the GIL has brought (I'll be amazed if that happens) or fails entirely. Languages are tools and every tool in your toolbox doesn't have to eventually turn into a drill. If your use case requires in-process parallelisation of interpreted CPU-bound workloads across multiple cores, python is just the wrong thing to use.
* It is indeed extremely annoying that we don't have async file access yet. I hope we get it soon.
Kernel-style async code, where everything is explicit:
* You write a poller that opens up queues and reads structs representing work
* Your functions are not tagged as "async" but they do not block
* When those functions finish, you explicitly put that struct in another queue based on the result
Async-await code, where the runtime is implicit:
* All async functions are marked and you await them if they might block
* A runtime of some sort handles queueing and runnability
Green threads, where all asynchrony is implicit:
* Functions are functions and can block
* A runtime wraps everything that can block to switch to other local work before yielding back to the kernel
[1] https://www.tomshardware.com/pc-components/cpus/amd-announce...
This point doesn't get enough coverage. When I saw async coming into Python and C# (the two ecosystems I was watching most closely at the time) I found it depressing just how much work was going into it that could have been productively expended elsewhere if they'd have gone with blocking calls to green threads instead.
To add insult to injury, when implementing async it seems inevitable that what's created is a bizarro-world API that mostly-mirrors-but-often-not-quite the synchronous API. The differences usually don't matter, until they do.
So not only does the project pay the cost of maintaining two APIs, the users keep paying the cost of dealing with subtle differences between them that'll probably never go away.
> I do not prefer Python for reliable, high-performance HTTP servers
I don't use it much anymore, but Twisted Matrix was (is?) great at this. Felt like a superpower to, in the oughties, easily saturate a network interface with useful work in Python.
This is used by most of asyncio's synchronization primitives, e.g. asyncio.Queue.
A consequence is that you cannot use asyncio Queues to pass messages or work items between async functions and worker threads. (And of course you can't use regular blocking queues either, because they would block).
The only solution is to build your own ad-hoc system using loop.call_soon_threadsafe() or use third-party libs like Janus[2].
[1] https://github.com/python/cpython/blob/e4e2390a64593b33d6556...
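A minimal sketch of the ad-hoc approach mentioned above, for the thread-to-loop direction only. The `producer` and `consume` names are hypothetical, and `None` is used as an end-of-stream sentinel purely for illustration.

```python
import asyncio
import threading


def producer(loop, queue, items):
    """Runs in a worker thread and hands items to the event loop thread."""
    for item in items:
        # asyncio.Queue is not thread-safe, so ask the loop to call
        # put_nowait on its own thread instead of calling it here.
        loop.call_soon_threadsafe(queue.put_nowait, item)
    loop.call_soon_threadsafe(queue.put_nowait, None)  # sentinel: no more items


async def consume(items):
    loop = asyncio.get_running_loop()
    queue = asyncio.Queue()
    worker = threading.Thread(target=producer, args=(loop, queue, items))
    worker.start()
    received = []
    while (item := await queue.get()) is not None:
        received.append(item)
    worker.join()  # the thread has nothing left to do at this point
    return received
```

Going the other way (coroutine to worker thread) needs a different mechanism, which is part of why this feels like building your own system.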
You must be an experienced developer to write maintainable code with Twisted, otherwise, when the codebase grows a little, it will quickly become a bunch of spaghetti code.
My take on gunicorn is that it doesn't need any tuning or care to handle anything up to the large workgroup size other than maybe "buy some more RAM" -- and now if I want to do some inference in the server or use pandas to generate a report I can do it.
If I had to go bigger I probably wouldn't be using Python in the server and would have to face up to either dual language or doing the ML work in a different way. I'm a little intimidated about being on the public web in 2025 though with all the bad webcrawlers. Young 'uns just never learned everything that webcrawler authors knew in 1999. In 2010 there were just two bad Chinese webcrawlers that never sent a lick of traffic to anglophone sites, but now there are new bad webcrawlers every day it seems.
The more significant problem was that Stackless was a separate distribution. Every time CPython updated, there would be a delay until Stackless updated, and tooling like Python IDEs varied in whether they supported Stackless.
A lot of the async problems in other languages are because they haven't fully bought into the concept, with some 3rd party code using it and some not. JS went all-in with async.
[1]: Yes I know about service workers, but they are not threads in the sense that there is no shared memory*. It is good for some types of parallelization problems, but not others because of all the memory copying required.
[2]: Yes I know about SharedArrayBuffer and there are a bunch of proposals to add support for locks and all that fun stuff to them, which also brings all the complexity back.
When I need to do concurrent stuff I either use fork to multiprocess or use the threading library, no import necessary, couple of lines of code, no need to make specialized code with await keywords and stuff.
This line made me question myself though:
"Then Flask is and probably always will be synchronous (Quart is an async alternative with similar APIs)."
I use flask, and I literally spent the last hour questioning whether I was an idiot and needed to dm my previous clients asking them to fix my code. I'm wondering how my apps passed stress tests of thousands of concurrent users, maybe I did the tests wrong?
Chatgpt says
"Is flask asynchronous? ChatGPT said:
Flask itself is not asynchronous. It is a WSGI-based framework, which means it is synchronous by design — it handles one request at a time per worker. Each request is processed sequentially, and concurrency is typically achieved by running multiple worker processes"
Oh shit, I didn't use gunicorn, I just run the python script raw. I'm an idiot. Let's write a test server that sleeps for 1 second before responding to a request:

    import flask
    import requests
    import time

    app = flask.Flask("test")

    @app.route("/")
    def hi():
        time.sleep(1)
        #requests.get("https://google.com")
        return "Hello, World!"

    app.run("0.0.0.0",8088)

This should block for like 25ms, if 50 concurrent users ask for this resource, there will be an average 500ms of extra latency!
And a Test client that does 50 calls at once, will it take 50 seconds?:

    import threading
    import requests

    URL = "http://127.0.0.1:8088/"

    def make_request(i):
        try:
            print("req")
            response = requests.get(URL)
            print("res")
        except:
            print("fail")

    threads = []

    for i in range(5):
        t = threading.Thread(target=make_request, args=(i,))
        threads.append(t)
        t.start()

    for t in threads:
        t.join()

    print("All requests completed")

Then we run it with the time binary on Linux:

    >time python3 client.py
    All requests completed

    real 0m1.216s
    user 0m0.203s
    sys 0m0.039s

Ok, turn off the alarms, Flask is fine.
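A likely explanation for the result above is that Flask's development server handles each request in its own thread by default, so the sleeps overlap. The sketch below reproduces the same effect with only the standard library, as an assumption-labeled analogue rather than Flask itself: `ThreadingHTTPServer` stands in for the dev server, the sleep is shortened to 0.2 seconds, and `timed_burst` is a made-up helper.

```python
import threading
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class SlowHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        time.sleep(0.2)  # stand-in for the 1-second sleep in the Flask route
        body = b"Hello, World!"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the output quiet
        pass


def timed_burst(n=5):
    """Fire n requests at once at a thread-per-request server; return elapsed seconds."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), SlowHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = f"http://127.0.0.1:{server.server_address[1]}/"
    # Ignore any proxy settings from the environment for this local request.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))

    def hit():
        with opener.open(url) as response:
            response.read()

    start = time.perf_counter()
    clients = [threading.Thread(target=hit) for _ in range(n)]
    for client in clients:
        client.start()
    for client in clients:
        client.join()
    elapsed = time.perf_counter() - start
    server.shutdown()
    server.server_close()
    return elapsed


if __name__ == "__main__":
    print(f"5 concurrent requests took {timed_burst():.2f}s")
```

With one thread per request the burst takes about one sleep, not five. A strictly single-threaded server would serialize them.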
I'm not sure what's going on with async, but the only experience I had with it was a junior dev that came from writing horrible node apps with react and nest (his frontend connected to a supabase db directly with credentials exposed, even if there was a node backend).
He wanted to pivot to python because that's what I used and I had good results, so he installed Quart instead of Flask, and he was writing Node-like code in python, and it was of course a mess.
Not saying that it's always going to be a mess, but you are better off learning the native way of a language instead of trying to shoehorn other abstractions and claiming that the way it is done in python is inefficient, it's one of the most popular languages in the world, these are massively used libraries, it's unlikely that "something is terribly wrong". It's more of a meme that python is slow.
What async is, is an alternative and supposedly cleaner abstraction to do multithreading. What ends up happening is that people use it without understanding multithreading and operating systems in general, they just think that they need to use it to get parallelism.
There's 15 solutions to do parallelism, 1 is the native, vanilla solution (threading library), then there's 3 additional experimental ways in the standard library or futures library, and 11 solutions that you need to pip install. Newbies ask chatgpt or see a stackoverflow thread (or come from node), and they have a 1 in 15 chance of using the regular solution that newbies should be using, because they can't distinguish the wheat from the chaff.
OP might have suffered from this and even believed that this 15th "async" way to do concurrency was the only way, and is judging python's concurrency by this feature. OP maybe believes that python is just now getting multithreading support? That we are all cavemen running toy applications that serve 2 or 3 users? Word to the wise: focus on features that have existed since early versions like python2 BEFORE you focus on features that are being introduced in later versions like 3.14. In general, you should first learn how a UNIX machine from the 90s did its thing before you learn the kubernetes spark majiggy.
During development, asyncio was called tulip. A quick search turns up this talk by Guido:
https://www.youtube.com/watch?v=aurOB4qYuFM
I seem to recall that Guido was in touch with the author of Twisted at the time, so design ideas from that project may have helped shape asyncio.
Before asyncio, Python had asyncore, a minimal event loop/callback module. I think it was introduced in Python 1.5.2, and it remained part of the standard library until it was removed in 3.12.
The problem is not python, it's a skill issue.
First of all forking is not a workaround, it's the way multiprocessing works at the low level in Unix systems.
Second of all, forking is multiprocessing, not multithreading.
Third of all, there's the standard threading library which just works well. There's no issue here, you don't need async.
I still remember the days when all the libs started adopting async and how so many of them (to this day) support both passing callbacks or returning promises. Async just so naturally fixed the callback hell of 2010s JS that it just became standard even though it is not even heavily used in the browser APIs.
My understanding is that JS can't do that (besides service workers which are non-shared memory), but it still has multiple concurrent code-blocks being executed at the same time, just in linear fashion. It will just never use multiple CPU cores at the same time (unless calling some non-JS non-shared-memory code)
Function colours can get pretty verbose when you want to write functional wrappers. You can end up writing nearly the exact same code twice because one needs to be async to handle an async function argument, even if the real functionality of the wrapper isn't async.
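A small sketch of that duplication: a decorator whose actual job (recording the call) is not async at all, yet it still needs two nearly identical wrappers. The `logged` decorator is hypothetical.

```python
import asyncio
import functools
import inspect


def logged(calls):
    """Hypothetical decorator that records each called function's name in `calls`."""
    def decorate(func):
        if inspect.iscoroutinefunction(func):
            @functools.wraps(func)
            async def async_wrapper(*args, **kwargs):
                calls.append(func.__name__)
                return await func(*args, **kwargs)
            return async_wrapper

        # Same logic again, minus `async` and `await`.
        @functools.wraps(func)
        def sync_wrapper(*args, **kwargs):
            calls.append(func.__name__)
            return func(*args, **kwargs)
        return sync_wrapper
    return decorate
```

Usage looks the same for both colors (`@logged(calls)` above a `def` or an `async def`), but only because the wrapper body was written twice.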
Coroutines vs futures vs tasks are odd. More than is pleasant, you have one but need the other for an API for no intuitive reason. Some waiting functions work on some types and not on others. But you can usually easily convert between them - so why make a distinction in the first place?
I think if you create a task but don't await it (which is plausible in a server type scenario), it's not guaranteed to run because of garbage collection or something. That's weird. Such behaviour should be obviously defined in the API.
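For what it's worth, the asyncio documentation does describe this: the event loop only keeps weak references to tasks, so a fire-and-forget task should be referenced from somewhere until it finishes. A minimal sketch of the usual pattern, with hypothetical `spawn` and `record` helpers:

```python
import asyncio

background_tasks = set()


def spawn(coro):
    """Start a fire-and-forget task and hold a strong reference until it finishes."""
    task = asyncio.create_task(coro)
    background_tasks.add(task)
    # Drop the reference once the task is done so the set does not grow forever.
    task.add_done_callback(background_tasks.discard)
    return task


async def record(log, value):
    await asyncio.sleep(0.01)
    log.append(value)


async def main():
    log = []
    spawn(record(log, "done"))  # never awaited directly
    await asyncio.sleep(0.05)   # the rest of the program carries on
    return log
```

It works, but the complaint stands that the obvious one-liner, a bare `asyncio.create_task(...)` whose result is discarded, is the version that can misbehave.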
In JavaScript async doesn’t have a good way to nice your tasks, which is an important feature of green threads. Sindre Sorhus has a bunch of libraries that get close, but there’s still a hole.
What coroutines can do is optimize the instruction cache. But I’m not sure goroutines entirely accomplish that. There’s nothing preventing them from doing so but implementation details.
More explicit in what sense? I've written both regular threaded Python and async/await Python. Only the latter shows me precisely where the context switches occur.
The article says SQLAlchemy added async support in 2023 but actually it was 2020.
In order to appease the various flavours they mixed and matched stuff from Tornado, gevent, etc.
They should have stuck with the most seamless of those (gevent) and instead of having it monkey-patch the runtime go the Java VirtualThread route and natively yield in all the I/O APIs.
This would have given a Go-esque ease of use and likely would have been immensely more popular.
Hey! We have a product that we clearly want to release worldwide. Let's build it on something that doesn't have Unicode. Or any real threading. And is slow as hell.
You picked a platform that was going to have to break user space.
At least it wasn't JavaScript
which are no different from app POV from kernel threads, or any threads for that matter.
the whole async stuff came up because context switch per event is way more expensive than just shoveling down a page of file descriptor state.
thus poll, kqueue, epoll, io_uring, whatever.
think of it as batch processing
The default linter in VS Code keeps marking those functions with warnings, though. It says I should mark them as async.
C# has a dictator with a budget: Microsoft integrated async into C# in a formal way, with 5.0, including standard libs, debugging, docs, samples, clear guidance going forward, etc. What holes there were were dealt with in an orderly and timely manner.
JavaScript actually had a pretty messy start with async, with divergent conventions and techniques. Ultimately this got smoothed out with language additions, but it wasn't all that wonderful in the early days. Also, JavaScript started from a simpler place (single-threaded event loop) that never had "fork" and threads and all that comes with those, so there was less legacy to accommodate and fewer problems to overcome.
Python had a vast base of existing non-async software chock full of blocking code, plus an incomplete and haphazard concurrency evolution. There are several legacy concurrency solutions in Python, most still in use today. Python async is still competing and conflicting with it all. Not unlike the Python 2->3 transition.
How do I get variables for not redoing long-running computations that depend on one-another? So, what if the third tuple value depends on the second and the second in turn depends on the first?
1) Use the network thread pool to also run application code. Then your entire program has to be super careful to not block or do CPU intensive work. This is efficient but leads to difficult to maintain programs.
2) The network thread pool passes work back and forth between an application executor. That way, the network thread pool is never starved by the application, since it is essentially two different work queues. This works great, but now every request performs multiple thread hops, which increases latency.
There has been a lot of interest lately to combine scheduling and work stealing algorithms to create a best of both worlds executor.
You could imagine, theoretically, an executor that auto-scales, and maintains different work queues and tries to avoid thread hops when possible. But ensures there are always threads available for the network.
Everything is in a run loop that does not exist in my codebase.
The context switching points are obvious but the execution environment is opaque.
At least that's how it looks to me.
Maybe a useful approach for a language would be to make "colors" a first-class part of the type system and support them in generics, etc.
Or go a step further and add full-fledged time complexity tracking to the type system.
I would just rather write JS where everything is async by default.
But I think generators are still sometimes mentioned in tutorials for this reason.
    [x
     for x in [some_complicated_expression]
     if x > 0
     for y in [x + 1]
     ...
    ][0]
That said, I wouldn't recommend this because of poor readability.
In JS you can do:
    async function foo(){...}
    function bar(){foo().then(...);}
In Python, though, async and sync code run in a fundamentally different way, as far as I understand it.
Promises/thenables gave people the time to get used to the idea of deferred evaluation via a familiar callback approach... Then when async/await came along, people didn't see it as a radically new feature but more as syntactic sugar to do what they were already doing in a more succinct way without callbacks.
People in the Node.js community were very aware of async concepts since the beginning and put a lot of effort in not blocking the event loop. So Promises and then async/await were seen as solutions to existing pain points which everyone was already familiar with. A lot of people refactored their existing code to async/await.
DESPITE THAT: even if you're doing everything "right" (TM) -- using a single thread and doing all your networking I/O sequentially is simply slow as hell. A very, very good example of this is bottle.py. Let's say you host a static web server with bottle.py. Every single web request for files leads to sequential loading, which makes page load times absolutely laughable. This isn't the case for every Python web framework, but it seems to be a common theme to me. (Cause: single thread, event loop.)
With asyncio, the most consistent behavior I've had seems to come from avoiding multiple processes with event loops running inside them. Even though that approach (or at least threading) seems like it's necessary to avoid the massive downsides of the event loop. But yeah, you have to keep everything simple. In my own library I use a single event loop and don't do anything fancy. I've learned the hard way how asyncio punishes trying to improve it. It's a damn cool piece of software, it just has some huge limitations for performance.
Anyway I think the main difference is that in Python you control the event loop whereas in JS there's one fixed event loop and you have no choice about it.
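A small sketch of what "you control the event loop" means with stdlib asyncio: the loop is an ordinary object that your code creates, runs and closes. `answer` and `run_with_own_loop` are illustrative names.

```python
import asyncio


async def answer():
    await asyncio.sleep(0)
    return 42


def run_with_own_loop():
    """Create an event loop explicitly, drive one coroutine on it, then close it."""
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(answer())
    finally:
        loop.close()


if __name__ == "__main__":
    print(run_with_own_loop())
```

`asyncio.run()` does roughly this for you, but nothing stops a program from creating several loops, for example one per thread.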
If you try to do that with Python, you get performance that is not acceptable. So why even bother?
- Async is legitimately hard to get if you are just starting to learn it, which is probably why it isn't more popular in the Python community.
- If you need async, that implies you need high I/O performance. At that point, you probably should have picked a more performant language + runtime (Java, Node), bc use case should dictate tooling.
- It's not enough to make a language + web framework async -- the DB drivers need to be async too (the author mentions SQLAlchemy got async support in 2023 and the Django ORM is a WIP).
I like python, but not bc its async or multi-threaded. I like it bc when I use it, I know I do not have to worry about those things and the new set of problems I have to handle when I do.
For the i/o and multi-threaded perf, give me java and node (maybe erlang/elixir if I am feeling extra spicy). For the fast and easy scripting, with massive community of open source of talent and high quality libraries (including the vast majority of web app slop), give me python.
Green threads are better (IMHO), because they actually do hide all the machinery. As a developer in a language with mature green threads (Erlang), I don't have to know about the machinery[1], I just write code that blocks from my perspective and BEAM makes magic happen. As I understand it, that's the model for Java's Project Loom aka Java Green Threads 2: 2 Green 2 Threads. The first release had some issues with the machinery, but I think I read the second release was much better, and I haven't seen much since... I'm not a Cafe Babe, so I don't follow Java that closely.
[1] It's always nice to know about the machinery, but I don't have to know about it, and I was able to get started pretty quick and figure out the machinery later.
It's either because it's the only language they know or they just don't really care about performance and want to finish the project fast.
And there is nothing wrong with that. In fact, this should be the norm.
Which isn't to argue that they did a good or a bad job adding the ability to the language. It just isn't the long pole in performance concerns for most programs.
Async Python is practically a new language. I think for most devs, it's a larger change than 2 to 3 was. One of the things that made Python uptake easy was the vast number of libraries and bindings to C libraries. With async you need new versions of that stuff. You can definitely use synchronous libraries, but then you get to debug why your stuff blocks.
Async Python is a different debugging experience for most Python engineers. I support a small handful of async Python services and think it would be an accelerator for our team to rewrite them in Go.
When you hire python engineers, most don't know async that well, if at all.
If you have a mix of synchronous and asynchronous code in your org, you can't easily intermix it. Well you can, but it won't behave as you usually desire it to, it's probably more desirable to treat them as different code bases.
Not to be too controversial, but depending upon your vintage and the way you've learned to write software, I think you can come to Python and think async is divine manna. I think there are many more devs that come to Python from data science or scripting or maybe as a first language, and I think they have a harder time accepting the value and need of async. Like I said above, it's almost an entirely different language.
Sorry for the possibly naive question. If I need to call a synchronous function from an async function, why can't I just call await on the async argument?
    from typing import Awaitable

    def foo(bar: str, baz: int):
        # some synchronous work
        pass

    async def other(bar: Awaitable[str]):
        foo(await bar, 0)
And while Python implements async directly in the VM, its semantics is such that it can be treated as syntactic sugar for callbacks there also.
Actually, I was and am primarily a Dart developer, not a JS developer. But function color is a problem in any language that uses that style of asynchrony: JS, Dart, etc.
Rust has been trying to do that with "keyword generics": https://blog.rust-lang.org/inside-rust/2023/02/23/keyword-ge...
async is popular in JS because the browser is often waiting on many requests.
command-line tools are commonly computing something. even grep has to process the pattern matching so concurrent IO doesn't help a single-threaded pattern match.
Sure there are applications where async would help a CLI app, but there are fewer than JS.
Plus JS devs love rewriting code every 3 months.
    future = lambda age: (
        print('Your age is:', age),
        older := age + 5,
        print('Your age in the future:', older),
        older,
    )[-1]
    print(future(20))
    # out
    Your age is: 20
    Your age in the future: 25
    25
However, gevent has to do its magic by monkeypatching. Wanting to avoid that, IIRC, was a significant reason why the async/await syntax and the underlying runtime implementation was developed for Python.
Another significant reason, of course, was wanting to make async functions look more like sync functions, instead of having to be written very differently from the ground up. Unfortunately, requiring the "async" keyword for any async function seriously detracted from that goal.
To me, async functions should have worked like generator functions: when generators were introduced into Python, you didn't have to write "gen def" or something like it instead of just "def" to declare one. If the function had the "yield" keyword in it, it was a generator. Similarly, if a function has the "await" keyword in it, it should just automatically be an async function, without having to use "async def" to declare it.
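The asymmetry is easy to see with the stdlib `inspect` module. In this sketch (function names are illustrative), a plain `def` turns into a generator function just by containing `yield`, while a coroutine function has to be declared with `async def`:

```python
import inspect


def plain():
    return 1


def gen():
    # No special declaration needed: the `yield` alone makes this a generator function.
    yield 1


async def coro():
    # Here the `async` keyword is mandatory.
    return 1


async def agen():
    # `async def` plus `yield` gives an async generator, not a coroutine function.
    yield 1


if __name__ == "__main__":
    for fn in (plain, gen, coro, agen):
        print(fn.__name__,
              inspect.isgeneratorfunction(fn),
              inspect.iscoroutinefunction(fn),
              inspect.isasyncgenfunction(fn))
```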
The Trio library felt easy to learn and just worked without much fuss.
This is what languages with higher-kinded types do and it's glorious. In Scala you write your code in terms of a generic monad and then you can reuse it for sync or async.
I would say that green threads still have "function coloring stuff", we just decided that every function will be async-colored.
Now, what happens if you try to cross an FFI-border and try to call a function that knows nothing about your green-thread runtime is an entirely different story...
One thing that I don't see being mentioned in any of the threads here talking about green threads is cancellation. A huge benefit, IMO, of anyio is that it makes cancellation really easy to handle. With asyncio, cancellation is pretty hard. And with green threads, cancellation is often impossible.
The same thing happened with Perl and its weird threading (for different reasons, but still)... I guess Python didn't learn that lesson. Perl also gained async and coroutine support, but I think they were added a while after I left the community. I doubt many people use them today. Anyone used them and can comment on ease vs Python?
I didn't go digging into it, but I'd guess they used the ubiquitous "six" library for backporting unicode functionality, but the point is likely "but why start underwater?!"
Structured concurrency libraries like anyio or trio are actually pretty nice -- "stacks" and stack traces are good things. Python multi exception concept is weird --- but also I think probably good ish.
It is still a pita to orchestrate around the gil/how terrible python multiprocessing side effects are wherever cpu bound workloads actually exist ...
What does `create_client` return?! don't you worry your pretty head about it, it'll be whatever you want it to be! flexability!!11
Well I had a fix https://news.ycombinator.com/item?id=43982570
python is kind of a slow choice for that sort of thing regardless and i don't think the complexity of async is all that justified for most usecases.
i still maintain my position that a good computer system should let you write logic synchronously and the system will figure out how to do things concurrently with high performance. (although getting this right would be very hard!)
Generations of programmers have given up on downloading data async in their Python scripts and just gone to bash and added a & at the end of a curl call inside a loop.
And Python's async cancellation model is pretty nice! You can reason about interruptions, timeouts, and the like pretty well. It's not all roses: things can ignore/defer cancellations, and the various wrappers people layer on make it hard to tell where, exactly, Tasks get cancelled--awaitable functions are simple here, at least. But even given that, Python's approach is a decent happy medium between Node's dangling coroutines and Rust's no-cleanup-ever disappearing ones (glib descriptor: "it's pre-emptive parallelism, but without the parallelism").
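A minimal sketch of the timeout case with stdlib asyncio (`job` and `main` are illustrative names): `asyncio.wait_for` cancels the awaited coroutine when the timeout expires, and the coroutine's `finally` block still gets to run its cleanup before the timeout error reaches the caller.

```python
import asyncio


async def job(log):
    try:
        await asyncio.sleep(10)  # stand-in for a slow operation
    finally:
        # Runs when the cancellation is delivered at the await point above.
        log.append("cleanup")


async def main():
    log = []
    try:
        await asyncio.wait_for(job(log), timeout=0.01)
    except asyncio.TimeoutError:
        log.append("timeout")
    return log


if __name__ == "__main__":
    print(asyncio.run(main()))
```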
More than a little, I think, of the "nobody does it this way" weirdness and frustration in Python asyncio arises from that. That doesn't excuse the annoyances imposed by the resulting APIs, but it is good to know.
Once you have this in place, you can notice that you can "submit the task to the same thread", and just switch between tasks at every `await` point; you get coroutines. This is how generators work: `yield` is the `await` point.
If all the task is doing is waiting for I/O, and your runtime is smart enough to yield to another coroutine while the I/O is underway, you can do something useful, or at least issue another I/O task, not waiting for the first one to complete. This allows typical server code that does a lot of different I/O requests to run faster.
Older things like `gevent` just automatically added yield / await points at certain I/O calls, with an event loop running implicitly.
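The "switch between tasks on the same thread" idea can be sketched with plain generators and a toy round-robin scheduler. This is only an illustration of the principle, not how asyncio is implemented; `worker`, `run_round_robin` and `demo` are made-up names.

```python
from collections import deque


def worker(name, steps, log):
    """A task that does `steps` units of work, yielding control after each one."""
    for i in range(steps):
        log.append(f"{name}{i}")
        yield  # the switch point, playing the role of `await`


def run_round_robin(tasks):
    """Toy scheduler: resume each generator in turn until all are exhausted."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)
        except StopIteration:
            continue  # task finished, do not reschedule it
        queue.append(task)


def demo():
    log = []
    run_round_robin([worker("a", 2, log), worker("b", 3, log)])
    return log


if __name__ == "__main__":
    print(demo())
```

The two workers interleave on a single thread, and a switch can only happen at a `yield`.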
No, if you call both functions, one will try to fetch a non-responding URL and the other will immediately raise an exception.
If any code in your coroutine, including library code, has a broad try/except, there's good chances that eventually the cancellation exception will be swallowed up and ignored.
Catch-all try/except of course isn't the pinnacle of good software engineering, but it happens a lot, in particular in server-type applications. You may have some kind of handler loop that handles events periodically, and if one such handling fails with an unknowable exception, you want to log it and continue. So then you have to remember to explicitly re-raise cancellation errors.
Maybe it's the least bad Pythonic option, but it's quite clunky for sure.
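A sketch of the handler-loop problem, assuming stdlib asyncio on Python 3.8 or later, where `asyncio.CancelledError` derives from `BaseException`. There a plain `except Exception` no longer swallows it, but a bare `except:` or `except BaseException` still does. `careless`, `careful` and `cancel_after_start` are illustrative names.

```python
import asyncio


async def careless(log):
    for i in range(3):
        try:
            await asyncio.sleep(0.01)  # stand-in for handling one event
            log.append(f"handled {i}")
        except BaseException:  # also catches CancelledError
            log.append(f"failed {i}")


async def careful(log):
    for i in range(3):
        try:
            await asyncio.sleep(0.01)
            log.append(f"handled {i}")
        except asyncio.CancelledError:
            raise  # let the cancellation propagate
        except BaseException:
            log.append(f"failed {i}")


async def cancel_after_start(worker):
    """Start `worker`, cancel it at its first await, report what happened."""
    log = []
    task = asyncio.create_task(worker(log))
    await asyncio.sleep(0)  # let the task run up to its first sleep
    task.cancel()
    try:
        await task
        cancelled = False
    except asyncio.CancelledError:
        cancelled = True
    return cancelled, log


if __name__ == "__main__":
    print(asyncio.run(cancel_after_start(careless)))
    print(asyncio.run(cancel_after_start(careful)))
```

The careless loop treats the cancellation as just another failed event and carries on to the end. The careful one stops immediately.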
IME writing an asyncio Python application is a bit like fixing a broken Linux boot. You frantically Google things, the documentation doesn't mention it, and eventually you find a rant on a forgotten Finnish embedded electronics forum where someone has the same problem as you, and is kindly sharing a solution. After 30 mins of C&P of random commands from a stranger on the web, it works, for no reason you can decipher. Thank goodness for the Finns and Google Translate.
But it took me some time to realize I can do the same idioms in Go as in Scala:
    // Scala
    val f = Future(x)
    // Do something else until you need f
    ...
    for (r <- f) { ... }
can be written in Go as
    c := channel
    // Do something else until you need the result
    ...
    r := <-c
My mental model was of a channel as a queue, but it can just as easily be used as a future for a single value. And there is `select` for more complicated versions.
I miss the easy composition and delaying of futures though (f.map etc.)
At its heart it's kind of like an asynchronous task execution engine that sits on top of an I/O layer which allows the high-level code to coordinate the activities of various equipment. Stuff like robot arms, furnace PID controllers, gantry systems, an automatic hydraulic press/spot welder (in one case), various kinds of pneumatic or stepper actuated mechanisms, and of course, measurement instruments. Often there might be a microcontroller intermediary, but the vast majority of the work is handled by Python.
My experience with async Python has been pretty positive, and I'm very happy with our choice to lean heavily into async. Contrary to some of the comments here I don't find the language's async facilities to be rough at all. Having cancellation work smoothly is also pretty important to us and I can't say I've experienced any pain points with exception-based cancellation. Maybe we've been lucky, but injecting an exception into a task to cancel it actually does work pretty reliably. Integrating dependencies that expose blocking APIs has never been a big deal either. Usually you want to have an interface layer for every third party dependency anyways, and it's no big deal to just write an async wrapper that uses a thread or a thread pool to keep the blocking stuff off of the main thread.
I personally think that a lot of people's negative experiences here might have more to do with asyncio than the language's async features. Prior to stepping into my current role, I also had some rough experiences with asyncio, which is why we chose to build all of our async code on top of curio. There was some uncertainty at first about how well supported it would be compared to a package in the standard library, but honestly curio is a really well put together package that just works really smoothly.
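The "async wrapper over a blocking dependency" pattern mentioned above can be sketched as follows. This uses stdlib asyncio (`asyncio.to_thread`, Python 3.9+) rather than curio, and `BlockingInstrument` / `AsyncInstrument` are hypothetical stand-ins for a third-party driver and its interface layer.

```python
import asyncio
import time


class BlockingInstrument:
    """Hypothetical third-party driver exposing only a blocking API."""

    def read(self):
        time.sleep(0.05)  # stand-in for slow blocking I/O
        return 42


class AsyncInstrument:
    """Interface layer: same operation, but awaitable."""

    def __init__(self, inner):
        self._inner = inner

    async def read(self):
        # Run the blocking call in a worker thread so the event loop stays responsive.
        return await asyncio.to_thread(self._inner.read)


async def main():
    instruments = [AsyncInstrument(BlockingInstrument()) for _ in range(4)]
    # The four blocking reads overlap instead of running back to back.
    return await asyncio.gather(*(inst.read() for inst in instruments))


if __name__ == "__main__":
    print(asyncio.run(main()))
```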
I agree that that's annoying but tbh it sounds like any other piece of code to me that relies on global state. (Man, I can't wait for algebraic effects to become mainstream…)
An obvious advantage of doing it that way is you don’t need any runtime/OS-level support. Eg your runtime doesn’t need to even have a concept of threads. It works on bare metal embedded.
Another advantage is that it’s fully cooperative model. No magic preemption. You control the points where the switch can happen, there is no magic stuff suddenly running in background and messing up the state.
This is sort of like the article's Problem 3, but it's not just maintaining two APIs, it's even creating the second API in the first place.
Some time ago I tried to run just 10k OS threads on a small PC and it just crashed. So clearly OS threads have not improved much.
I really tried to make it work, but eventually gave up and went back to deque. Now life is great.
I already kinda had this idea while working with Rust. In Rust, Futures won’t execute unless `await`ed. In practice, that meant that all my futures were joined. It was just the only way I could wrap my head around doing anything useful with async.
The problem is that it is self-reinforcing, and before you know it every little function is suddenly async.
The irony is that it is used where you want to write in a synchronous style...
To me, Go is really well designed when it comes to multithreading because it is built upon a mutual contract where it will break easily and at compile time when you mess up the contract between the scheduling thread and the sub threads.
But, for the love of Go, I have no idea who the person was that decided that the map data type has to be not threadsafe. Once you start scaling / rewriting your code to use multiple goroutines, it's like you're being thrown in the cold water without having learnt to swim before.
Mutexes are a real pain to use in Go, and they could have been avoided if the language just decided to make read/write access threadsafe for at least maps that are known to be accessed from different threads.
I get the performance aspect of that decision, but man, this is so painful because you always have to rewrite large parts of your data structures everywhere, and abstract the former maps away into a struct type that manages the mutexes, which in return feels so dirty and unclean as a provided solution.
For production systems I just use haxmap from the start, because I know its limitations (of hashes of keys due to atomics), because that is way easier to handle than forgetting about mutexes somewhere down the codebase when you are still before the optimization phase of development.
    async with mk_nursery() as nursery:
        with os.fopen(...) as file:
            nursery.start_soon(lambda: file.read())
The with block may have ended before the task starts...
>Having cancellation work smoothly is also pretty important to us
+10000. Threads don't have good cancellation semantics, so we never had a robust solution to the "emergency shutdown" problem where you need to tell all the running equipment to stop whatever they're doing and return to safe positions.
Every day I worked on that codebase I wished it had been async from the beginning, but I couldn't see a way to migrate gradually because function coloring makes it an all-or-nothing affair.
Many people in my bubble (around 2013-2017) just never went with python 3, but chose other languages.
The company I was working for started important applications in python 2 as late as 2014 because the libraries we needed weren't ported yet. We never went python 3 later, but went to go instead, so we completely missed any python async thing.
The main difference being that now both models are simultaneously supported instead of being an implementation detail of each JVM.
Like others are saying, if I want it fast and efficient (processing), I'll just use Go. Python isn't like JS in browsers: you don't have to use it, you have to want to use it. And the same goes for its features. Maybe if Python tutorials/books and "How do I ____ in Python?" search results used async, map, filter, collections, etc., these awesome Python features would be more prevalent. But I can see how mature projects should probably mandate their usage where it makes sense.
Async in C# is awesome, and there's nothing stopping you from writing sync code where appropriate or using threads if you want proper multi threading. Async is primarily used to avoid blocking for non-cpu-bound work, like waiting for API/db/filesystem etc. If you use it everywhere then it's used everywhere, if you don't then it isn't. For a lot of apps it makes sense to use it a lot, like in web apis that do lots of db calls and such. This incurs some overhead but it has the benefit of avoiding blocked threads so that no threads sit idle waiting for I/O.
You can imagine that in a web API receiving a large number of requests per second there's a lot of this waiting going on, and if threads were idle waiting for responses you wouldn't be able to handle nearly as much throughput.
What I did have issues with, though, was async. For example, pytest's async thingy has been buggy for years with no fix in sight, so in one project I had to switch to manually making an event loop in those tests.
But isn't the whole purpose of async, that it enabled concurrency, not parallelism, without the weight of a thread? I agree that in most cases it is not necessary to go there, but I can imagine systems with not so many resources, that benefit from such an approach when they do lots of io.
However, async tasks on a single core means potentially a lot of switching between those tasks. So async alone does not save the day here. It will have to be combined with true parallelism, to result in the speedup we want. Otherwise a single task rendering all the parts in sequence would be faster.
Also note that it depends on where your DB is. The process you describe implies at least 2 rounds of DB communication: first the initial "get forum thread" query, then a second round for all the async "get forum replies" requests. So if communication with the DB takes a long time, you might as well lose what you gained, because you did 2 rounds of that communication.
So I guess it's not a trivial matter.
https://nodejs.org/en/learn/asynchronous-work/overview-of-bl...
I agree with what some have said above, BEAM is pretty great. I have been using it recently through Elixir.
I think that use case doesn't work well in async, because async effectively creates a tree of Promises that resolve in order. A task that doesn't get await-ed is effectively outside its own tree of Promises because it may outlive the Promise it is a child of.
I think the solution would be something like Linux's zombie process reaping, and I can see how the devs prefer just not running those tasks to dealing with that mess.
If you choose a non-preemptive system, you naturally need yield points for cooperation. Those can either be explicit (await) or implicit (e.g. every function call). But you can get away with a minimal runtime and a stackless design.
Meanwhile, in a preemptive system you need a runtime that can interrupt other units of work. And it pushes you towards a stackful design.
All those decisions are downstream of the preemptive vs. cooperative.
In either case, you always need to be able to interface with CPU-heavy work. Either through preemption, or by isolating the CPU-heavy work.
The essential idea was I could be processing ~100 requests per vCPU in the async event loop while threading would max out 2-4 threads per CPU. Of course let us assume for either model we're waiting for 50-2000ms DB query or service call to finish before sending the response.
Is this not true? And if it is true, why isn't the juice worth the squeeze: more than an order of magnitude more saturation/throughput for the same hardware and same language, just with a new engine at its heart?
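The I/O-bound part of that claim is easy to reproduce in a toy setting. In this sketch `asyncio.sleep` stands in for the DB query or service call, and `handle_request` and `serve` are illustrative names. It shows only that the waits overlap on one thread, nothing about real server throughput.

```python
import asyncio
import time


async def handle_request(i, delay):
    await asyncio.sleep(delay)  # stand-in for waiting on a DB query or service call
    return i


async def serve(n=100, delay=0.05):
    """Handle `n` simulated requests concurrently on a single thread."""
    start = time.perf_counter()
    results = await asyncio.gather(*(handle_request(i, delay) for i in range(n)))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = asyncio.run(serve())
    print(len(results), "requests in", round(elapsed, 3), "seconds")
```

Done one after another, 100 waits of 50 ms would take about 5 seconds. Here the total should be close to a single wait.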
Similarly, a function that calls an async function wouldn't itself be async unless it also had the await keyword. But of course the usual way of calling an async function would be to await it. And calling it without awaiting it wouldn't return a value, just as with a generator; calling a generator function without yielding from it returns a generator object, and calling an async function without awaiting it would return a future object. You could then await the future later, or pass it to some other function that awaited it.
If you just do
    async def myAsyncFunction():
        ...
        await someOtherAsyncFunction()
        ...
then the call to someOtherAsyncFunction will not spawn any kind of task or delegate to the event loop at all - it will just execute someOtherAsyncFunction() within the task and event loop iteration that myAsyncFunction() is already running in. This is a major difference from JS.
If you just did
    someOtherAsyncFunction()
without await, this would be a fire-and-forget call in JS, but in Python, it doesn't do anything. The statement creates a coroutine object for the someOtherAsyncFunction() call, but doesn't actually execute the call and instead just throws the object away again.
I think this is what triggers the "coroutine is not awaited" warning: It's not complaining about fire-and-forget being bad style, it's warning that your code probably doesn't do what you think it does.
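A small sketch to check this with stdlib asyncio (function names are illustrative): calling the async function only builds a coroutine object, and the body does not run until it is awaited. The coroutine is closed explicitly here only to keep the "never awaited" warning out of the output.

```python
import asyncio

ran = []


async def some_other_async_function():
    ran.append("ran")


async def call_without_await():
    coro = some_other_async_function()  # creates a coroutine object, runs nothing
    is_coroutine = asyncio.iscoroutine(coro)
    coro.close()  # discard it explicitly instead of letting Python warn about it
    return is_coroutine


async def call_with_await():
    await some_other_async_function()  # the body actually runs here


if __name__ == "__main__":
    print(asyncio.run(call_without_await()), ran)
    asyncio.run(call_with_await())
    print(ran)
```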
The same pitfall is running things concurrently. In JS, you'd do:
    task1 = asyncFunc1();
    task2 = asyncFunc2();
    await task1;
    await task2;
In Python, the functions will be run sequentially, in the await lines, not in the lines with the function calls.
To actually run things in parallel, you have to do
    loop.create_task(asyncFunc())
or one of the related methods. The method will schedule a new task and return a future that you can await on, but don't have to. But that "await" would work completely differently from the previous awaits internally.
If you do `someOtherAsyncFunction()` without await and Python tried to execute it similarly to a version with `await`, then the one without await would happen in the same task and event loop iteration, but there's no guarantee that it's done by the time the outer function is. Thus the existing task/event loop iteration has to be kept alive or the non-await'ed task needs to be reaped to some other task/event loop iteration.
> loop.create_task(asyncFunc())
This sort of intuitively makes sense to me because you're creating a new "context" of sorts directly within the event loop. It's similar-ish to creating daemons as children of PID 1 rather than children of more-ephemeral random PIDs.
As far as I understood it, calling an async function without await (or create_task()) does not run the function at all - there is no uncertainty involved.
Async functions work sort of like generators in that the () operator just creates a temporary object to store the parameters. The 'await' or create_task() are the things that actually execute the function - the first immediately runs it in the same task as the containing function, the second creates a new task and puts that in the event queue for later execution.
So
    asyncFunc()
without anything else is a no-op. It creates the object for parameter storage ("coroutine object") and then throws it away, but never actually calls (or schedules) asyncFunc.
When queuing the function in a new task with create_task(), then you're right - there is no guarantee the function would finish, or even would have started before the outer function completed. But the new task won't have any relationship to the task of the outer function at all, except if the outer function explicitly chooses to wait for the other task, using the Future object that was returned by create_task.
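The difference between the two ways of starting work can be made visible by logging the order of events. A minimal sketch with stdlib asyncio (`work`, `sequential` and `concurrent` are illustrative names): with bare coroutine objects nothing runs until each `await`, so the two run back to back. With `create_task()` both are scheduled on the loop and interleave.

```python
import asyncio


async def work(name, log):
    log.append(f"{name} start")
    await asyncio.sleep(0)  # yield to the event loop once
    log.append(f"{name} end")


async def sequential():
    log = []
    c1 = work("a", log)  # coroutine objects only; nothing has run yet
    c2 = work("b", log)
    log.append("created")
    await c1  # "a" runs to completion here, inside the current task
    await c2  # only then does "b" run
    return log


async def concurrent():
    log = []
    t1 = asyncio.create_task(work("a", log))  # scheduled as separate tasks
    t2 = asyncio.create_task(work("b", log))
    log.append("created")
    await t1
    await t2
    return log


if __name__ == "__main__":
    print(asyncio.run(sequential()))
    print(asyncio.run(concurrent()))
```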