OFC it would be nice to just write Python and have everything be 12x faster, but I don't see how there wouldn't be drawbacks that interfere with what makes Python so approachable.
With the critical mass Python has acquired over the years, the GIL becomes a very sore bottleneck in some cases. This is why I decided to learn Go, for example: a properly threaded (and green-threaded) programming language that is higher level than C/C++ but lower level than Python, which lets me do things I can't do with Python. Compilation is another reason, but it was secondary to threading.
https://www.youtube.com/watch?v=_9B__0S21y8 is fairly concise and gives some recommendations for literature and techniques, obviously making an effort in promoting PlusCal/TLA+ along the way but showcases how even apparently simple algorithms can be problematic as well as how deep analysis has to go to get you a guarantee that the execution will be bug free.
Of course, while the transcription is in action the rest of the UI (Qt via Pyside) should remain usable. And multiple transcription requests should be supported - I'm thinking of a pool of transcription threads, but I'm uncertain how many to allocate. Half the quantity of CPUs? All the CPUs under 50% load?
Advice welcome!
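One way to frame the sizing question, as a rough sketch rather than a recommendation: start with a small fixed pool and measure. Here `transcribe` is a hypothetical stand-in for the real transcription call, the file names are made up, and "half the CPUs" is just the first guess floated above, not a tested value.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def transcribe(path):
    # Hypothetical placeholder for the real transcription call.
    return f"transcript of {path}"


# Assumption: half the CPUs as a starting point; tune after measuring.
max_workers = max(1, (os.cpu_count() or 2) // 2)
pool = ThreadPoolExecutor(max_workers=max_workers)

# submit() returns immediately, so the caller (e.g. a UI handler) is not blocked.
futures = [pool.submit(transcribe, p) for p in ("a.wav", "b.wav", "c.wav")]

# result() blocks; a real UI would attach callbacks or signals instead.
results = [f.result() for f in futures]
pool.shutdown()
```

Requests beyond `max_workers` simply queue inside the executor, so supporting "multiple transcription requests" does not require one thread per request.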
What changes for you? Nothing unless you start using threads. You probably weren't using threads anyway because there is little to no point in python to using them. Most python code bases completely ignore the threading module and instead use non blocking IO, async, or similar things. The GIL thing only kicks in if you actually use threads.
If you don't use threads, removing the GIL changes nothing. There's no code that will break. All those C libraries that aren't thread safe are still single threaded, etc. Only if you now start using threads do you need to pay attention.
There's some threaded python code of course that people may have written in python somewhat naively in the hope that it would make things faster that is constantly hitting the GIL and is effectively single threaded. That code now might run a little faster. And probably with more bugs because naive threaded code tends to have those.
But a simple solution to address your fears: simply don't use threads. You'll be fine.
Or learn how to use threads. Because now you finally can and it isn't that hard if you have the right abstractions. I'm sure those will follow in future releases. Structured concurrency is probably high on the agenda of some people in the community.
I'm not worried about new code. I'm worried about stuff written 15 years ago by a monkey who had no idea how threads work and just read something on Stack Overflow that said to use threading. This code will likely break when run post-GIL. I suspect there is actually quite a bit of it.
Most C extensions that will break are not written by monkeys, but by conscientious developers that followed best practices.
Older code will break, but it breaks all the time. A language changes how something behaves in a new revision, and suddenly 20-year-old bedrock tools are getting massively patched to accommodate both new and old behavior.
Is it painful, ugly, unpleasant? Yes, yes and yes. However change is inevitable, because some of the behavior was rooted in inability to do some things with current technology, and as hurdles are cleared, we change how things work.
My father's friend told me that length of a variable's name used to affect compile/link times. Now we can test whether we have memory leaks in Rust. That thing was impossible 15 years ago due to performance of the processors.
    def f(x):
        for _ in range(N):
            l.append(x)
I've tried it out and they start interleaving when N is set to 1000000.
I feel some trepidation about threads, but at least for debugging purposes there's only one process to attach to.
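For anyone who wants to reproduce the append experiment, a self-contained version might look like the sketch below. The two threads, the values "a" and "b", and the smaller `N` are my own choices; how much the runs interleave depends on the interpreter build, the switch interval, and the machine.

```python
import threading

N = 100_000
l = []


def f(x):
    for _ in range(N):
        l.append(x)


threads = [threading.Thread(target=f, args=(x,)) for x in ("a", "b")]
for t in threads:
    t.start()
for t in threads:
    t.join()

# No appends are lost: list.append is safe to call from several threads.
print(len(l))

# Count how often the value changes from one element to the next.
# A count of 1 means one thread finished before the other got a turn.
switches = sum(1 for prev, cur in zip(l, l[1:]) if prev != cur)
print(switches)
```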
Use SharedMemory to pass the data back and forth.
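A minimal sketch of what that could look like with the standard library's `multiprocessing.shared_memory` module. To keep it runnable as a single script, both handles live in one process here; that is my simplification, and in real use the "consumer" part would run in a second process that was given the block's name and the payload length.

```python
from multiprocessing import shared_memory

payload = b"hello from the producer"

# Producer side: create a named block and copy the bytes in.
shm = shared_memory.SharedMemory(create=True, size=len(payload))
shm.buf[:len(payload)] = payload

# Consumer side: attach by name. In real use this would run in another
# process that received shm.name and len(payload) over a pipe or queue.
view = shared_memory.SharedMemory(name=shm.name)
received = bytes(view.buf[:len(payload)])

view.close()
shm.close()
shm.unlink()  # the creator frees the block once everyone is done
```

The block may be rounded up to a page size, which is why the payload length is passed along instead of relying on the size of `view.buf`.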
The only code that is going to break because of "No GIL" are C extensions and for very obvious reasons: You can now call into C code from multiple threads, which wasn't possible before, but is now. Python code could always be called from multiple python threads even in the presence of the GIL in python.
In a language conceived for this kind of work it's not as easy as you'd like. In most languages you're going to write nonsense which has no coherent meaning whatsoever. Experiments show that humans can't successfully understand non-trivial programs unless they exhibit Sequential Consistency - that is, they can be understood as if (which is not reality) all the things which happen do happen in some particular order. This is not the reality of how the machine works, for subtle reasons, but without it merely human programmers are like "Eh, no idea, I guess everything is computer?". It's really easy to write concurrent programs which do not satisfy this requirement in most of these languages, you just can't debug them or reason about what they do - a disaster.
As I understand it Python without the GIL will enable more programs that lose SC.
No it does not. I hate that analogy so much because it leads to such bad behavior. Software is a digital artifact that does not degrade. With the right attitude, you'd be able to execute the same binary on new machines for as long as you desired. That is not true of organic matter that actually rots.
The only reason we need to change software is that we trade that off against something else. Instructions are reworked, because chasing the universal Turing machine takes a few sacrifices. If all software has to run on the same hardware, those two artifacts have to have a dialogue about what they need from each other.
If we didn't want the universal machine to do anything new, and we had a valuable product, we could just keep making the machine that executes that product. It never rots.
    if len(my_list) > 5:
        print(my_list[5])
(i.e. because a different thread can pop from the list in between the check and the print), that could just as easily happen today. The GIL makes sure that only one thread runs Python bytecode at a time, but it's entirely possible that the GIL is released and execution switches to a different thread after the check but before the print, so there's no extra thread-safety issue in free-threaded mode.
The problems (as I understand it, happy to be corrected) are mostly two-fold: performance and ecosystem. Using fine-grained locking is potentially much less efficient than using the GIL in the single-threaded case (you have to take and release many more locks, and reference count updates have to be atomic), and many, many C extensions are written under the assumption that the GIL exists.
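Going back to the check-then-print snippet: with or without the GIL, the usual fix is to hold a lock across both the check and the access, and to take the same lock wherever the list is mutated. A small sketch, where `my_list_lock`, `get_sixth`, and `pop_one` are names I made up for illustration:

```python
import threading

my_list = list(range(10))
my_list_lock = threading.Lock()


def get_sixth():
    # Hold the lock across both the length check and the read.
    with my_list_lock:
        if len(my_list) > 5:
            return my_list[5]
        return None


def pop_one():
    # Every mutation must take the same lock for this to help.
    with my_list_lock:
        if my_list:
            my_list.pop()
```

This only works if all code touching `my_list` cooperates; a single unlocked `pop()` elsewhere brings the race back.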
But if you tried to compile it on today’s libc, making today’s syscalls… good luck with that.
Software “rots” in the sense that it has to be updated to run on today’s systems. They’re a moving target. You can still run HyperCard on an emulator, but good luck running it unmodded on a Mac you buy today.
If software is implicitly built on wrong understanding, or undefined behaviour, I consider it rotting when it starts to fall apart as those undefined behaviours get defined. We do not need to sacrifice a stable future because of a few 15 year old programs. Let the people who care about the value that those programs bring, manage the update cycle and fix it.
Coming from the Java world, you don't know what you're missing. Looking inside an application and seeing a bunch of threadpools managed by competing frameworks, debugging timeouts and discovering that tasks are waiting more than a second to get scheduled on the wrong threadpool, tearing your hair out because someone split a tiny sub-10μs bit of computation into two tasks and scheduling the second takes a hundred times longer than the actual work done, adding a library for a trivial bit of functionality and discovering that it spins up yet another threadpool when you initialize it.
(I'm mostly being tongue in cheek here because I know it's nice to have threading when you need it.)
A fairly common pattern for me is to start a terminal UI updating thread that redraws the UI every second or so while one or more background threads do their thing. Sometimes, it’s easier to express something with threads and we do it not to make the process faster (we kind of accept it will be a bit slower).
The real enemy is state that can be mutated from more than one place. As long as you know who can change what, threads are not that scary.
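A stripped-down sketch of that pattern, assuming a plain list of strings stands in for the terminal redraw and `time.sleep` stands in for real work. The names (`progress`, `ui_loop`, `worker`) and the timings are illustrative only.

```python
import threading
import time

progress = {"done": 0}           # only workers write, only the UI thread reads
progress_lock = threading.Lock()
finished = threading.Event()
frames = []                      # what the "UI" drew, kept for inspection


def ui_loop(interval=0.05):
    # Redraw until told to stop, then draw one final frame.
    while not finished.wait(interval):
        with progress_lock:
            frames.append(f"done: {progress['done']}")
    with progress_lock:
        frames.append(f"done: {progress['done']}")


def worker(items):
    for _ in range(items):
        time.sleep(0.01)         # stand-in for real work
        with progress_lock:
            progress["done"] += 1


ui = threading.Thread(target=ui_loop)
workers = [threading.Thread(target=worker, args=(10,)) for _ in range(3)]
ui.start()
for w in workers:
    w.start()
for w in workers:
    w.join()
finished.set()                   # all work is done; let the UI thread exit
ui.join()
print(frames[-1])                # prints "done: 30"
```

The shared state is one counter behind one lock, with a clear rule about who writes and who reads, which is the "know who can change what" part.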
Even more fun: allocating memory could trigger Python's garbage collector, which would also run `__del__` methods. So every allocation was also a possible (but rare) thread switch.
The GIL was only ever intended to protect Python's internal state (esp. the reference counts themselves); any extension modules assuming that their own state would also be protected were likely already mistaken.
I think if someone set out to write a new dynamic scripting language today, from scratch, multithreading it would not pose any particular challenge. Beyond the fact that it's naturally a difficult problem, I mean, but nothing special compared to the many other languages that have implemented threading. It's all that code from before the threading era that's the problem, not the threading itself. And Python has a loooot of that code.
"Python programmers are so incompetent that Python succeeds as a language only because it lacks features they wouldn't know to use"
Even if it's circumstantially true, doesn't mean it's the right guiding principle for the design of the language.
I'm thankful that it does, or I would have been out of work long ago. It's not that the files change (literal rot), it is that hardware, OSes, libraries, and everything else changes. I'm also thankful that we have not stopped innovating on all of the things the software I write depends on. You know, another thing changes - what we are using the software for. The accounting software I wrote in the late 80s... would produce financial reports that were what was expected then, but would not meet modern GAAP requirements.
> A global interpreter lock (GIL) is used internally to ensure that only one thread runs in the Python VM at a time. In general, Python offers to switch among threads only between bytecode instructions; how frequently it switches can be set via sys.setswitchinterval(). Each bytecode instruction and therefore all the C implementation code reached from each instruction is therefore atomic from the point of view of a Python program.
https://docs.python.org/3/faq/library.html#what-kinds-of-glo...
If this is not the case please let the official python team know their documentation is wrong. It indeed does state that if Py_DECREF is invoked the bets are off. But a ton of operations never do that.
Software doesn't rot, it remains constant. But the context around it changes, which means it loses usefulness slowly as time passes.
What is the name for this? You could say 'software becomes anachronistic'. But is there a good verb for that? It certainly seems like something that a lot more than just software experiences. Plenty of real world things that have been perfectly preserved are now much less useful because the context changed. Consider ox yokes, typewriters, horse-drawn carriages, envelopes, phone switchboards, etc.
It really feels like this concept should have a verb.
When you look from the program's perspective, the context changes and becomes unrecognizable, IOW, it rots.
When you look from the context's perspective, the program changes by not evolving and keeping up with the context, IOW, it rots.
Maybe we anthropomorphize both and say "they grow apart". :)
Multithreaded code is incredibly hard to reason about. And reasoning about it becomes a lot easier if you have certain guarantees (e.g. this argument / return value always has this type, so I can always do this to it). Code written in dynamic languages will more often lack such guarantees, because of the complicated signatures. This makes it even harder to reason about multithreaded code, increasing the risk it poses.
I don't fully understand the challenge with removing it, but thought it was something about C extensions, not something most users have to directly worry about.
I don't know how that's done in Pyside, though. I couldn't find a clear example. You might have to use a QThread instead to handle it.
I was with OP's point but then you lost me. You'll always have to deal with that coworker's shitty code, GIL or not.
Could they make a worse mess with multithreading? Sure. Is their single-threaded code as bad anyway because at the end of the day, you can't even begin to understand it? Absolutely.
But yeah I think Python people don't know what they're asking for. They think GIL-less Python is gonna give everyone free puppies.
Please don't - it isn't relevant.
15 years ago, new Python code was still dominantly for 2.x. Even code written back then with an eye towards 3.x compatibility (or, more realistically, lazily run through `2to3` or `six`) will have quite little chance of running acceptably on 3.14 regardless. There have been considerable removals from the standard library, `async` is no longer a valid identifier name (you laugh, but that broke Tensorflow once). The attitude taken towards """strings""" in a lot of 2.x code results in constructs that can be automatically made into valid syntax that appears to preserve the original intent, but which are not at all automatically fixed.
Also, the modern expectation is of a lock-step release cadence. CPython only supports up to the last 5 versions, released annually; and whenever anyone publishes a new version of a package, generally they'll see no point in supporting unsupported Python versions. Nor is anyone who released a package in the 3.8 era going to patch it if it breaks in 3.14 - because support for 3.14 was never advertised anyway. In fact, in most cases, support for 3.9 wasn't originally advertised, and you can't update the metadata for an existing package upload (you have to make a new one, even if it's just a "post-release") even if you test it and it does work.
Practically speaking, pure-Python packages usually do work in the next version, and in the next several versions, perhaps beyond the support window. But you can really never predict what's going to break. You can only offer a new version when you find out that it's going to break - and a lot of developers are going to just roll that fix into the feature development they were doing anyway, because life's too short to backport everything for everyone. (If there's no longer active development and only maintenance, well, good luck to everyone involved.)
If 5 years isn't long enough for your purposes, practically speaking you need to maintain an environment with an outdated interpreter, and find a third party (RedHat seems to be a popular choice here) to maintain it.
In my estimation, the only "20 year old bedrock tools" in Python are in the standard library - which currently holds itself free to deprecate entire modules in any minor version, and remove them two minor versions later - note that this is a pseudo-calver created by a coincidentally annual release cadence. (A bunch of stuff that old was taken out recently, but it can't really be considered "bedrock" - see https://peps.python.org/pep-0594/).
Unless you include NumPy's predecessors when dating it (https://en.wikipedia.org/wiki/NumPy#History). And the latest versions of NumPy don't even support Python 3.9 which is still not EOL.
Requests turns 15 next February (https://pypi.org/project/requests/#history).
Pip isn't 20 years old yet (https://pypi.org/project/pip/#history) even counting the version 0.1 "pyinstall" prototype (not shown).
Setuptools (which generally supports only the Python versions supported by CPython, hasn't supported Python 2.x since version 45 and is currently on version 80) only appears to go back to 2006, although I can't find release dates for versions before what's on PyPI (their own changelog goes back to 0.3a1, but without dates).
It seems the way to do it in Qt is with signals and slots, emitting a signal from your QThread and binding it to a slot in the UI thread, making sure to specify a "queued connection" [1]. There's also a lower-level postEvent method [2] but people disagree [3] on whether that's OK to call from a regular Python thread or has to be called from a QThread.
So I would try doing it with Qt's thread classes, not with concurrent.futures.
[1] https://doc.qt.io/qt-5/threads-synchronizing.html#high-level...
[2] https://doc.qt.io/qt-6/qcoreapplication.html#postEvent
[3] https://www.mail-archive.com/pyqt@riverbankcomputing.com/msg...
Especially when they've already been force-fed with ungodly amounts of buggy threaded code that has been mistakenly advertised as bug-free simply because nobody managed to catch the problem with a fuzzer yet (and which is more likely to expose its faults in a no-GIL environment, even though it's still fundamentally broken with a GIL)?
For the web/network workloads most of us write, I'd highly recommend this.
Certain operations that look atomic to the user are actually comprised of multiple bytecode instructions. Now, if you are unlucky, the interpreter decides to release the GIL and yield to another thread exactly during such instructions. You won't get a segfault, but you might get unexpected results.
See also https://github.com/google/styleguide/blob/91d6e367e384b0d8aa...
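To see the multi-instruction point concretely, the standard `dis` module can list the bytecode behind a one-line increment. This is only a sketch: `counter` and `increment` are made-up names, and the exact instruction names vary between CPython versions.

```python
import dis

counter = 0


def increment():
    global counter
    counter += 1


# List the bytecode instruction names behind the single line `counter += 1`.
ops = [ins.opname for ins in dis.get_instructions(increment)]
print(ops)
```

The global is loaded, the addition is performed, and the result is stored back in separate instructions, so a thread switch between the load and the store can lose an update made by another thread.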
One lesson I have learned is that good design cannot survive popularity and bureaucracy that comes with it. Over time people just beat down your door with requests to do cases you explicitly avoided. You’re blocking their work and not being pragmatic! Eventually nobody is left to advocate for them.
And part of that is the community has more resources and can absorb some more complexity. But this is also why I prefer tools with smaller communities.
Still, that's only a marketing move, technically the choice was still the right one, just like this one is.