https://mail.python.org/pipermail/python-dev/2006-February/0...
https://mail.python.org/pipermail/python-dev/2006-February/0...
Things that were not useful in 2006 might be totally useful in 2026 ;P
Still, like you, I'm curious wether he has anything to say about it.
It explains a lot about the design of Python container classes, and the boundaries of polymorphism / duck typing with them, and mutation between them.
I don't always agree with the choices made in Python's container APIs...but I always want to understand them as well as possible.
Also worth noting that understanding changes over time. Remember when GvR and the rest of the core developers argued adamantly against ordered dictionaries? Haha! Good times! Thank goodness their first wave of understanding wasn't their last. Concurrency and parallelism in Python was a TINY issue in 2006, but at the forefront of Python evolution these days. And immutability has come a long way as a design theme, even for languages that fully embrace stateful change.
... Well, yes; it doesn't support the methods for mutation. Thinking of ImmutableFoo as a subclass of Foo is never going to work. And, indeed, `set` and `frozenset` don't have an inheritance relationship.
I normally find Hettinger very insightful so this one is disappointing. But nobody's perfect, and we change over time (and so do the underlying conditions). I've felt like frozendict was missing for a long time, though. And really I think the language would have been better with a more formal concept of immutability (e.g. linking it more explicitly to hashability; having explicit recognition of "cache" attributes, ...), even if it didn't go the immutable-by-default route.
The new implementation has saved space, but there are opportunities to save more space (specifically after deleting keys) that they've now denied themselves by offering the ordering guarantee.
Theoretically, could `set` be a subclass of `frozenset` (and `dict` of `frozendict`)? Do other languages take that approach?
> linking [immutability] more explicitly to hashability
AFAIK immutability and hashability are equivalent for the language's "core" types. Would it be possible to enforce that equivalence for user-defined types, given that mutability and the implementation of `__hash__` are entirely controlled by the programmer?
“What has been is what will be, and what has been done is what will be done; there is nothing new under the sun.”
At one extreme: sure, anything can be made a subclass of anything else, if we wanted to.
At the other extreme: no, since Liskov substitution is an impossibly-high bar to reach; especially in a language that's as dynamic/loose as Python. For example, consider an expression like '"pop" in dir(mySet)'
This is optimizing for the common case, where memory is generally plentiful and dicts grow more than they shrink. Python has so many memory inefficiencies that occasional tombstones in the dict internal structure is unlikely to be a major effect. If you're really concerned, do `d = dict(d)` after aggressive deletion.
I can't say I've noticed any good reasons to rely on it. Didn't reach for `OrderedDict` often back in the day either. I've had more use for actual sorting than for preserving the insertion order.
He doesn't address the reason that most of us in 2025 immediately think of, which is that it's easier to reason about code if you know that certain values can't change after they're created.
What a change in culture over the last 20 years!
Given how dynamic Python is, such a subclass relationship need not be evident at the C level. You can totally make one class whose implementation is independent of another class a subclass of the other, using PEP 3119. This gives implementations complete flexibility in how to implement the class while retaining the ontological subclass relationship.
Still well before the talk.
I never said it's a problem (and I never said it's not!). I was specifically addressing two things:
- The "theoretical" nature of the question I quoted (i.e. ignoring other aspects like subjectivity, practicality, convention, etc.)
- The reasoning about "Liskov violation", which was quoted further up this thread.
For context, here's Liskov's definition of their principle (from https://en.wikipedia.org/wiki/Liskov_substitution_principle ):
> Barbara Liskov and Jeannette Wing described the principle succinctly in a 1994 paper as follows:[1]
> > Subtype Requirement: Let ϕ(x) be a property provable about objects x of type T. Then ϕ(y) should be true for objects y of type S where S is a subtype of T.
My expression `"pop" in dir(mySet)` gives an explicit example of how `set` and `frozenset` are not subtypes of each other (regardless of how they're encoded in the language, with "subclasses" or whatever). In this case `ϕ(x)` would be a property like `'"pop" in dir(x)' = 'False'`, which holds for objects x of type frozenset. Yet it does not hold for objects y of type set.
Your example of `hasattr(mySet, 'pop')` gives another property that would be violated.
My point is that avoiding "Liskov violations" is ("theoretically") impossible, especially in Python (which allows programs to introspect/reflect on values, using facilities like 'dir', 'hasattr', etc.).
(FYI I became rather jaded on the Liskov substitution principle after reading https://okmij.org/ftp/Computation/Subtyping )
I would expect to use a different data structure if I needed an ordered set.
This says "if hasattr(parent, 'pop') == True then hasattr(child, 'pop') must be True". This is not violated in this case, since hasattr(parent, 'pop') is False. If you want to extend the above definition so that negative proofs concerning the parent should also hold true for the child, then subtyping becomes impossible since all parent and child types must be identical, by definition.
This morning for example, I tested an object serialized through a JSON API. My test data seems to never match the next run.
After a while, I realized one of the objects was using a set of objects, which in the API was turned into a JSON array, but the order of said array would change depending of the initial Python VM state.
3 days ago, I used itertools.group by to group a bunch of things. But itertools.group by only works on iterable that are sorted by the grouping key.
Now granted, none of those recent example are related to dicts, but dict is not a special case. And it's iterated over regularly.
Kotlin _kinda_ does this as well, but if you have a reference to an immutable map in Kotlin, you are still free to mutate the values (and even keys!) as much as you like.
Granted, I live and work in TypeScript, where I can't `===` two objects but I could see this deterministic behavior making it easier for a language to compare two objects, especially if equality comparison is dependent on a generated hash.
The other is guaranteed iteration order, if you are reliant on the index-contents relationship of an iterable, but we're talking about Dicts which are keyed, but extending this idea to List, I see this usefulness in some scenarios.
Beyond that, I'm not sure it matters, but I also realize I could simply not have enough imagination at the moment to think of other benefits
For me, it creates more reproducible programs and scripts, even simple ones.
str
bytes
int, float
complex
tuple
frozenset
Aside from int and float, you cannot perform comparisons between objects of different types. Moreover, you cannot sort complex numbers at all.I have crossed that bridge, and I'm telling you (again) that a sorted tuple is not a generic solution.
The root of the issue here is that Liskov substitution principle simply references ϕ(x) to be some property satisfied by objects of a class. It does not distinguish between properties that are designed by the author of the class to be satisfied or properties that happen to be satisfied in this particular implementation. But the Hyrum’s Law also states that properties that are accidentally true can become relied upon and as time passes become an intrinsic property. This to me suggests that the crux of the problem is that people don’t communicate sufficiently about invariants and non-invariants of their code.
When you interpret Liskov substitution properly, it's very rare that anything Liskov-substitutes anything, making the entire property meaningless. So just do things based on what works best in the real world and aim for as much Liskov-substitution as is reasonable. Python is duck-typed anyway.
It's a decent guiding principle - Set and ImmutableSet are more substitutable than Set and Map, so Set deriving from ImmutableSet makes more sense than Set deriving from Map. It's just not something you can ever actually achieve.
> If you want to extend the above definition so that negative proofs concerning the parent should also hold true for the child, then subtyping becomes impossible since all parent and child types must be identical, by definition.
The distinction isn’t “negative proofs”, but yes, that’s their point. In Python, you have to draw a line as to which observable properties are eligible.
If you in fact return e.g. an Rc::new(thing) or Arc::new(thing), that's forever (though of course you can unwrap the last reference!)
Type the dict as a mapping when you want immutability:
x: Mapping[int, int] = {1: 1}
x[1] = 2 # Unsupported target for indexed assignment ("Mapping[int, int]").
The only problem I've seen with this is: y = {}
y[x] = 0 # Mypy thinks this is fine. Mapping is hashable, after all!
The issue here is less that dict isn't hashable than that Mapping is, though.Might be worth noting that "dropped" in this context doesn't necessarily correspond to the reference going out of scope:
fn get_first(v: &Vec<i32>) -> &i32 {
&v[0]
}
fn main() {
let mut v = vec![0, 1, 2];
let first = get_first(&v);
print!("{}", first});
v.push(3); // Works!
// print!("{}", first); // Doesn't work
} x: dict[list, int] = {}
x[[1, 2, 3]] = 0Not that it's entirely unwarranted, of course.