←back to thread

Indices, not Pointers

(joegm.github.io)
102 points vitalnodo | 4 comments | | HN request time: 0.002s | source
Show context
zahlman ◴[] No.45110423[source]
> There is a pattern I’ve learned while using Zig which I’ve never seen used in any other language.

I've done this in small applications in C (where nodes were already being statically allocated) and/or assembly (hacking on an existing binary).

No idea about the effect on speed in general; I was trying to save a few bytes of storage in a place where that mattered.

replies(5): >>45111351 #>>45111431 #>>45111876 #>>45112740 #>>45112993 #
anonymousiam ◴[] No.45111431[source]
But in C, there's not really any difference between pointers and indices. You can access array elements using either method, and they're both equally efficient.
replies(4): >>45111456 #>>45111634 #>>45112113 #>>45112598 #
kazinator ◴[] No.45111634[source]
There are some differences.

- You can check indices that are out of bounds without running into formal undefined behavior. ISO C does not require pointers to distinct objects to be comparable via inequality, only exact equality. (In practice it works fine in any flat-address-space implementation and may be regarded as a common extension.)

- Indices are implicitlyl scaled. If you have done a range check that index is valid, then it refers to an entry in your array. At worst it is some unoccupied/free entry that the caller shouldn't be using. If you have checked that a pointer points into the array, you don't know that it's valid; you also have to check that its displacement from the base of the array is a multiple of the element size; i.e. it is aligned.

replies(1): >>45112142 #
cyber_kinetist ◴[] No.45112142[source]
You do have to take care of the ABA problem - if you access memory using an index that became invalid before and another object is using instead, you will have some weird hard-to-debug logic errors (worse than use-after-free, since even Valgrind can't save you). To prevent this you need another generational counter to store along with your id (which is either incremented for every usage or assigned a random hash)
replies(1): >>45112628 #
adrian_b ◴[] No.45112628[source]
This matters only for shared data structures. It is irrelevant for thread-local data.

For shared data structures, you have more to worry about, so regardless if you use indices or pointers you must use either atomic operations or means to ensure exclusive access to the entire data structure or means to detect the need for retries when using optimistic accesses.

replies(1): >>45112672 #
1. kazinator ◴[] No.45112672[source]
Well, solving/mitigating the ABA ambiguity can debug use-after-free errors in single-threaded programs also. Because when a pointer A is freed to B, and then recycled again for a new object, we can make it into a different pointer A' (e.g. with a tagging scheme). So then when the old A pointer copies are lingering around, we can tell they are invalid due to having the wrong tag.

Solving ABA is probably a point in favor of indices (if we are working in a higher level language) because their type supports the bit operations for tagging. However, some hardware has support for hardware tagging for pointers. E.g. ARM; Android uses it.

replies(1): >>45114096 #
2. adrian_b ◴[] No.45114096[source]
With indices what you say can be implemented trivially, much simpler than with pointers, by always incrementing a reallocated index (i.e. an index extracted from the free list) with the array size and always addressing the array with the indices modulo the array size.

With the array size chosen to be a power of two, this adds negligible overhead in time and no overhead in space.

replies(1): >>45114681 #
3. duped ◴[] No.45114681[source]
That's equivalent to packing two counters into one.
replies(1): >>45115063 #
4. adrian_b ◴[] No.45115063{3}[source]
True.