So sets can be viewed as implicitly sorted, which is why the order of the elements cannot be used to differentiate two sets.
The fact that a set is sorted internally, to enforce equivalence between sets whose elements are provided in different orders, implies nothing about the existence of operations that retrieve elements in a desired order or that return the subsets below or above a threshold. When such operations are desired, an order relation must be defined on the set externally.
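A minimal illustration in Python: two sets built from the same elements in different orders compare equal, while retrieving elements in a desired order, or taking the subset below a threshold, requires supplying the order externally (here via `sorted()` and a comparison).

```python
a = {3, 1, 2}
b = {2, 3, 1}
assert a == b  # element order cannot be used to distinguish two sets

# The set itself exposes no usable order; an external order relation is applied:
ascending = sorted(a)                 # [1, 2, 3]
descending = sorted(a, reverse=True)  # [3, 2, 1]

# "Subsets less than a threshold" likewise rely on the external order:
threshold = 2
below = {x for x in a if x < threshold}  # {1}
print(ascending, descending, below)
```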
So a possible definition of sets and multisets is as sorted arrays, with or without uniqueness of elements, while sequences are unsorted arrays (which may also carry a uniqueness constraint). However, the standard set operations do not provide external access to this internal order: it is an order between arbitrary identifiers attached to the elements of the set, and those identifiers have no meaning externally.
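A sketch of that "sorted array" view, assuming nothing beyond the standard library: a set as a sorted array with unique elements, and a multiset as a sorted array allowing duplicates. The class names are hypothetical, not a standard API; the point is that equality falls out of the sorted representation, since insertion order is erased.

```python
import bisect

class SortedSet:
    def __init__(self):
        self._items = []  # kept sorted, no duplicates

    def add(self, x):
        i = bisect.bisect_left(self._items, x)
        if i == len(self._items) or self._items[i] != x:  # enforce uniqueness
            self._items.insert(i, x)

    def __eq__(self, other):
        # Sorted storage makes equality independent of insertion order.
        return self._items == other._items

class SortedMultiset(SortedSet):
    def add(self, x):
        bisect.insort(self._items, x)  # duplicates allowed

s, t = SortedSet(), SortedSet()
for x in (3, 1, 2): s.add(x)
for x in (2, 3, 1): t.add(x)
assert s == t  # same elements, different insertion orders
```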
The problem was that the assumption that keys would come out in a predictable order was frequently true, but never guaranteed. An alternative solution would have been to randomize the order further, but that would probably have broken a lot of old code. Guaranteeing the order makes no difference if you weren't expecting one, but it does make switching to another language a greater surprise now.
And the reason we have ordered dict keys now is that it's trivial with the new compact structure: the actual hash table contains only indices into an auxiliary entries array, which can simply be appended to on every insertion. It has nothing to do with any randomization of the hashing process.
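A simplified sketch of that layout, loosely modeled on the CPython 3.6+ compact dict but heavily abridged (the class name is hypothetical, and resizing, deletion, and perturbed probing are omitted): the sparse hash table holds only small indices into a dense entries array, and every insertion appends to that array, so iterating the entries yields insertion order for free, whatever the hashes are.

```python
class CompactDict:
    def __init__(self, size=8):
        self._indices = [None] * size  # sparse hash table: slot -> entry index
        self._entries = []             # dense array of (hash, key, value)

    def _slot(self, key):
        # Linear probing; real implementations perturb the probe and resize.
        i = hash(key) % len(self._indices)
        while self._indices[i] is not None:
            idx = self._indices[i]
            if self._entries[idx][1] == key:
                return i, idx          # key already present
            i = (i + 1) % len(self._indices)
        return i, None                 # free slot found

    def __setitem__(self, key, value):
        i, idx = self._slot(key)
        if idx is None:
            self._indices[i] = len(self._entries)          # slot points at new entry
            self._entries.append((hash(key), key, value))  # append preserves order
        else:
            self._entries[idx] = (hash(key), key, value)   # overwrite in place

    def __getitem__(self, key):
        _, idx = self._slot(key)
        if idx is None:
            raise KeyError(key)
        return self._entries[idx][2]

    def keys(self):
        return [k for _, k, _ in self._entries]  # insertion order, hash-independent

d = CompactDict()
for k in ("banana", "apple", "cherry"):
    d[k] = len(k)
print(d.keys())  # ['banana', 'apple', 'cherry']
```

Note how the ordering is a side effect of the dense entries array, not of anything done to the hash function: the hash only decides which slot holds the index.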