Leading a large open source project must be terrible in this age of constant outrage :-(
I do understand people's points about "the age of outrage" and "internet 2018" but still: the PEP wasn't generally accepted as being a fantastic improvement, so why did he feel the need to fight so hard for it?
Interestingly, C++ is going through the same process, with lots of great ideas being proposed, but the sum total of them being an even more complicated language (on top of what is probably already the most complicated language out there).
Python has been successful, IMHO, because Guido has made several brave, controversial calls. Python 3 breakage and async turned out to be prescient, fantastic decisions.
The jury is still out on the Python 3 decision, to be honest. Heck, Python 2 is still officially supported until 2020.
Python 3 adoption is increasing, but the instability and breakage that it introduced caused a lot of knock-on effects throughout the Python community that held it back and hindered its adoption and mindshare. It'll take a while before we can really say whether the long-term gains will make up for that.
It seems Python 3 is now dominant but not total, according to TFA.
A lot of companies are choosing new languages over porting Python from 2 to 3.
It's not. Python 3 has overtaken 2 and there is no stopping the migration to it now. Python 3.7 is a lot better than 2.7. On memory use alone, 3.7 is massively better. Sure, there will be some holdouts on 2.7 for a long time. That's fine.
Also, this is not to say that the migration from 2 to 3 was handled well. It wasn't. Python 3.0 should have had backwards-compatible features like allowing the 'u' string prefix. Indexing byte strings should have returned length-one byte strings. Byte strings should have supported at least a minimal amount of %-style formatting. Etc. (See the snippet below for how these eventually shook out.)
That has all been mostly resolved and is in the past. Mistakes were made because, shock, the Python core developers are not perfect and didn't foresee all the migration issues. However, there is no way that we are going back and reviving the Python 2.x branch.
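For concreteness, here is how those pain points look in a modern interpreter (the u'' prefix came back in 3.3 via PEP 414, and bytes %-formatting in 3.5 via PEP 461):

```python
s = u"text"                   # u'' prefix: rejected in 3.0-3.2, restored in 3.3 (PEP 414)
b = b"bytes"
print(b[0])                   # 98 -- indexing bytes gives an int, not b"b" as in Python 2
print(b[0:1])                 # b'b' -- slicing is the length-one workaround
print(b"%s: %d" % (b"x", 1))  # b'x: 1' -- bytes %-formatting returned in 3.5 (PEP 461)
```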
So yes some breaking change was indicated, but the particular change that was made was the wrong one.
To ignore that is to straight-up deny what can only be described as flamebait.
Apparently it's the right thread to be rude and to assign intentions to people you don't know though?
And all because they dared say their opinion on a subject you're sensitive about?
How about that: people can have any opinion they like on Python 3, including considering it a botched migration process and a ho-hum update. And it's totally legit for them to speak about that. And it's not your place to censor them, or act up any time they express their opinions.
You can either add your arguments, or skip reading their comments. How about that?
The Python 3 implementation was a step in the right direction, but the decision to let the old language co-exist with the new one while breaking backwards compatibility between the two (for instance, 'print') in places where it didn't need to break makes no sense to me.
A lot of goodwill got burned with that.
Until way too recently essentially all Python code couldn't handle basic SSL/TLS certificate validation for Internationalized Domain Names. Once you understand what's going on, this situation is a no brainer: In order to connect to a machine named X, we must have turned X into DNS A-labels that we could look up, and we can treat those as bytes. The certificate must have SAN dnsNames matching its names, and those too are written as A-labels. So we can almost just compare the literal bytes (actually DNS A-labels are "case-insensitive" so we need to handle that, and the asterisk "wild card")
But Python, in its wisdom, defined this API to take a string, and strings, as you observe, aren't bytes. So instead of the above unarguable approach they wasted precious months trying to figure out how best to turn the A-labels from a SAN dnsName into Unicode strings, which isn't even the right problem to solve.
Eventually sanity prevailed: https://bugs.python.org/issue28414
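To illustrate, a minimal sketch of the byte-comparison approach described above (the function names are made up; this is not the actual ssl module code, and it glosses over details like partial-wildcard labels):

```python
def to_a_labels(name):
    # IDNA-encode the whole name; ASCII labels pass through unchanged,
    # e.g. "bücher.example" -> b"xn--bcher-kva.example".
    return name.encode("idna")

def dnsname_matches(hostname, san_dnsname):
    # Compare DNS A-labels as case-folded bytes, label by label.
    host = to_a_labels(hostname).lower().split(b".")
    san = to_a_labels(san_dnsname).lower().split(b".")
    if len(host) != len(san):
        return False
    # A bare "*" wildcard label matches exactly one label.
    return all(s == b"*" or h == s for h, s in zip(host, san))

print(dnsname_matches(u"BÜCHER.example", u"bücher.example"))  # True
print(dnsname_matches(u"www.example.com", u"*.example.com"))  # True
```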
You should be supporting return being made into a function too, right? That would be much more regular.
Reminder that a guideline of Python was supposed to be "practicality beats purity", which stands in stark contrast to the changes to print and strings. [1] Reminds me of the gradual shift of the message on the wall in Animal Farm from "four legs good, two legs bad" to "four legs good, two legs better."
The Python 3 Readiness Project now lists [341](http://py3readiness.org/) of the 360 most common packages as Python 3 compatible.
Even that's underselling it, really. For example, it lists BeautifulSoup as not converted, but the link goes to BeautifulSoup 3.2.1; BeautifulSoup4 works great on Python 3. And for MySQL there's mysqlclient and several others, and since database packages usually follow PEP 249 pretty closely, it's very easy to switch. So in reality, rather than 341/360, it's more like "everything worth converting has been converted." Or just "everything" for short.
Long and bitter experience has shown that people who think they can "simply ignore" the "issue" of encoding actually can't. That mindset is mostly a more polite way of saying "people who assume everything is ASCII all the time, or at most an encoding that always has one byte == one character". Those assumptions break sooner or later. I prefer having them break sooner, because I've been the person who had to clean up the mess at an unpleasant time when it was kicked to "later".
Which in turn means Python 3 made the right choice: text is text and bytes are bytes, and you should never ever pretend bytes are text no matter how much you think you'll never run into a case where the assumption fails.
Making people actually do the conversion has the advantage that when writing their string conversion code they might actually do something with the exception beyond maybe logging it and then pressing on anyway. It also gives you the opportunity to explicitly offer them alternatives like treating everything we can't encode as some sort of replacement character (works well for Unicode, not so much for ASCII), which is way too much to ask of every single function that takes bytes.
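A small illustration of that point, using nothing beyond the standard bytes.decode API:

```python
raw = b"caf\xe9"  # Latin-1 bytes for "café"; invalid as UTF-8

try:
    text = raw.decode("utf-8")  # strict: fails fast, right here
except UnicodeDecodeError:
    # The caller explicitly chooses the fallback instead of silently
    # pressing on with corrupted data.
    text = raw.decode("utf-8", errors="replace")  # -> 'caf\ufffd'
```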
That's the Ruby solution, and I like the Ruby API here. When Ruby introduced it in 1.9, it caused similar upgrade pain: you weren't used to having your strings tagged with an encoding, and suddenly they all kind of need to be if you want to do anything with them as strings-not-bytes.
As someone else predicted would happen, the result was indeed lots of "incompatible encoding" exceptions.
I think Ruby actually has a pretty reasonable API here, but several years on, there are still _plenty_ of developers who don't understand it.
A lot of corporate dev environments operate on a policy where you're basically allowed to fix things that sales and customer support explicitly ask to fix, but nothing else. Which in turn means an environment where doing maintenance work that indirectly sustains the software is off-limits. Which in turn means they never ever upgrade the underlying platform (that's off-limits maintenance work), and so they end up on an EOL'd platform. At which point they blame the platform, and announce they're going to switch to something better that doesn't impose this problem on them.
Those types of places were never going to upgrade to Python 3 under any circumstances. They probably would not have even upgraded to a completely-backwards-compatible Python 2.8, if that had been released. So blaming Python 3 is a red herring here.
Show me the 1995 software that gets this right, and I'll show you proof that time travel is possible, by simply showing you that software right back again.
Abstractly, yes, it's true. Concretely, though, it's not a particularly valid criticism.
So your customer, not knowing any better, used whatever was on the machine already. If that's the case, that's really not the Python community's fault.
The problem was that after 8 years there were still libraries and frameworks around that worked only with Python 2. That's a huge failure. If developers want to keep using the old stuff, it means that the new one is either badly designed or badly managed.
Compare it with Ruby. There were big changes from 1.8 to 1.9 (Unicode handling, among other things) and again with the 2.x series. The language mostly maintained backward compatibility, and we can still write Ruby on 2.5 with the old 1.8 syntax. The community ported libraries and frameworks, started using the new features, and all went well.
Sorry if I wasn't clear. I meant to suggest that the byte-functions wouldn't know or do anything about encodings. They just work with bytes. It's the other functions, that take the encoding-annotated bytes (or optionally a "pure" unicode type), that would care about encodings.
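A toy sketch of that separation (EncodedBytes is a made-up illustration, not an existing library type):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class EncodedBytes:
    """Bytes annotated with the encoding they are known to be in."""
    data: bytes
    encoding: str  # e.g. "utf-8", "latin-1"

    def to_text(self):
        # Only encoding-aware call sites pay the decoding cost (and risk).
        return self.data.decode(self.encoding)

def byte_length(b):
    # A pure byte-function: knows and does nothing about encodings.
    return len(b)

print(byte_length(b"caf\xe9"))                        # 4
print(EncodedBytes(b"caf\xe9", "latin-1").to_text())  # café
```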
This shows a superficial, almost entertainment-industry sort of view of a software development lifecycle.
On the latest Ubuntu Bionic with apps, building right now on a local machine, I see 143 python-xx packages installed and 43 python3-xxx.
Savvy package authors use `future` and `six` to side-step the whole issue, while core maintainers struggle and outsiders invoke a mob voice. (See the example below.)
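A typical straddling module looks something like this (six and __future__ are real, widely used tools; greet is a made-up example):

```python
from __future__ import print_function, unicode_literals

import six  # pip install six

def greet(name):
    # six.text_type is unicode on Python 2 and str on Python 3,
    # so the same source runs unmodified on both.
    if not isinstance(name, six.text_type):
        name = name.decode("utf-8")
    print("Hello, {}!".format(name))

greet(b"world")       # works on both 2.7 and 3.x
greet(u"w\u00f6rld")  # so does this
```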
Python 2.7 for LTS
Cro (https://cro.services) is a set of libraries for building reactive distributed systems. Comma IDE (https://commaide.com) is an IDE for Perl 6, based on the JetBrains IDEA platform, now in (paid) beta.
If you want to keep up-to-date on Perl 6 development, check out the Perl 6 Weekly (https://p6weekly.wordpress.com).
Fail fast. It's better to break right away than to have a false sense of security. There is always __future__, too.
There are so many languages in which the type "string" is not the one to use for strings...
No, they weren't. Impossible as it seems, we had encodings (including multi-byte encodings) for decades before UTF.
Python couldn't use UTF-8 in 1991, but it could very well tag strings with a specific encoding, instead of treating them as a bucket of bytes C-style.
> Java did do it right, but by 1995 the industry had already seen the problems of differing character sets.
We had seen the problems of "differing character sets" for decades already (Windows vs DOS version of the same language encodings was a classic example for most users, but the problems go back to EBCDIC and so on).
Java just did the more-right thing, but we already had a need for generic string types that can handle multiple encodings, know what they contain, and know how to convert from one to another.
If you truly don't care whether or not the text is decodable (which is sometimes the case), then don't read it as a string, read it as a byte array.
You can still have methods that are generic over byte arrays and text, since that is an orthogonal concern.
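A small sketch of that split (the filename is hypothetical; surrogateescape is a real error handler that round-trips undecodable bytes):

```python
# Don't care whether it's decodable? Read bytes and never decode.
with open("dump.bin", "rb") as f:  # hypothetical input file
    data = f.read()                # bytes; no encoding assumptions made

# Need text after all? Opt in explicitly -- and reversibly.
text = data.decode("utf-8", errors="surrogateescape")
assert text.encode("utf-8", errors="surrogateescape") == data
```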
Python preceded Unicode, though.
Since Google couldn't convince folks to let them make Python faster, they created a NEW language instead.