I love C because it doesn't make my life very inconvenient to protect me from stubbing my toe in it. I hate C when I stub my toe in it.
I understand where this is coming from, but I think this is less true than it used to be, and (for that reason) it often devolves into arguments about whether the C standard is the actual source of truth for what you're "really" allowed to do in C. For example, the standard says I must never:
- cast a `struct Foo*` into a `struct Bar*` and access the Foo through it (in practice we teach this as the "strict aliasing" rules, and that's how all(?) compilers implement it, but that's not what §6.5 paragraph 7 of the standard says!)
- allow a signed integer to overflow
- pass a NULL pointer to memcpy, even if the length is zero
- read an uninitialized object, even if I "don't care" what value I get
- read and write a value from different threads without locking or atomics, even if I know exactly what instructions those reads and writes compile into and the ISA manual says it's 100% fine to do that
All of these are ways that (modern, standard) C doesn't really "do what the programmer said". A lot of big real-world projects build with flags like -fno-strict-aliasing, so that they can get away with doing these things even though the standard says they shouldn't. But then, are they really writing C or "C with custom extensions"? When we compare C to other languages, whose extensions are we talking about?
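For what it's worth, two of the items on that list (signed overflow and the zero-length memcpy) can be sidestepped in portable C with small wrappers. A minimal sketch, where `checked_add_int` and `copy_bytes` are hypothetical names made up for illustration and not part of any standard library:

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: adds a and b only if the sum fits in an int.
   Returns false and leaves *out untouched when it would overflow,
   so the overflowing addition is never actually evaluated. */
bool checked_add_int(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *out = a + b;
    return true;
}

/* Hypothetical helper: never hands a possibly-NULL pointer to memcpy
   when there is nothing to copy. */
void *copy_bytes(void *dst, const void *src, size_t n)
{
    if (n == 0)
        return dst;
    assert(dst != NULL && src != NULL);
    return memcpy(dst, src, n);
}
```

Whether this kind of boilerplate counts as "inconvenient" is more or less the argument above.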
> cast a `struct Foo*` into a `struct Bar*` and access the Foo through it (in practice we teach this as the "strict aliasing" rules, and that's how all(?) compilers implement it, but that's not what §6.5 paragraph 7 of the standard says!)
Use the union type. Abusing it for aliasing violates the standard too, but GCC and Clang implement an extension that permits this. Alternatively, just allocate a char array and cast it as you please. Strict aliasing does not apply to char arrays if I recall.
> allow a signed integer to overflow
Is this still true? I thought the reason for this was that C left it to the implementation to define how signed arithmetic worked, meaning you could not assume two's complement, but the most recent C standard was supposed to mandate two's complement.
> pass a NULL pointer to memcpy, even if the length is zero
There is a reason for this. memcpy is allowed to start reading early as a performance optimization, before it gets to the branch that decides whether to read at all. I do wonder what happens if you only want to copy 1 byte and that byte has invalid memory right next to it. Presumably, this optimization would read more than a byte.
> read an uninitialized object, even if I "don't care" what value I get
You are probably doing something wrong if you do this. It is not even good as an entropy source.
> read and write a value from different threads without locking or atomics, even if I know exactly what instructions those reads and writes compile into and the ISA manual says it's 100% fine to do that
Earlier C standards likely did not say anything about this because they did not support multithreading, but outside of possibly reading/writing to hardware registers, you do not want to do this because of races. Even if you think you know better, you almost certainly do not.
> Use the union type. Abusing it for aliasing violates the standard too, but GCC and Clang implement an extension that permits this. Alternatively, just allocate a char array and cast it as you please. Strict aliasing does not apply to char arrays if I recall.
I could be misreading, but you seem to be implying that you can trick the aliasing rules by casting Foo* to char* and then cast the char* to Bar*, but that still violates the rule. Even a union isn't allowed as a way of aliasing, but as you say it's often allowed in practice and is heavily used in the Linux kernel (and Linus has made his opinion on this part of the language standard very clear).
In theory, the right way to access the bits of a Foo as a Bar is to memcpy to a fresh Bar object, and then memcpy back if you want to update the original variable. The compiler is then allowed to optimise this into a direct access of the bits.
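The memcpy round trip described above can be sketched roughly like this. The definitions of `struct Foo` and `struct Bar` are hypothetical stand-ins (the thread never defines them), as are the helper names, and the sketch assumes `float` is 32 bits wide, which the `static_assert` checks:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for Foo and Bar: same size, unrelated types. */
struct Foo { float value; };
struct Bar { uint32_t bits; };

static_assert(sizeof(struct Foo) == sizeof(struct Bar),
              "sizes must match for a bitwise copy");

/* Read the bytes of a Foo as a Bar by copying into a fresh Bar object. */
struct Bar foo_as_bar(const struct Foo *f)
{
    struct Bar b;
    memcpy(&b, f, sizeof b);
    return b;
}

/* Copy (possibly modified) bits back into the original Foo. */
void bar_into_foo(struct Foo *f, const struct Bar *b)
{
    memcpy(f, b, sizeof *f);
}
```

No `struct Foo` is ever accessed through a `struct Bar` lvalue here, so the aliasing rule is not involved, and for small fixed sizes like this optimising compilers commonly turn the memcpy calls into plain loads and stores.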
If you insist on doing what you described, just skip char * and mark the pointer with __attribute__((may_alias)) and then it will be okay. That is a compiler extension that lets you turn off strict aliasing rules.
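A minimal sketch of what that looks like with GCC or Clang, assuming the attribute is applied to a typedef, which is the pattern the GCC documentation uses. The names `u32_alias` and `float_bits` are made up for illustration, and this is not portable to compilers without the extension:

```c
#include <assert.h>
#include <stdint.h>

static_assert(sizeof(float) == sizeof(uint32_t),
              "sketch assumes a 32-bit float");

/* GCC/Clang extension: accesses through this type are exempt from
   type-based alias analysis, much like accesses through char. */
typedef uint32_t __attribute__((may_alias)) u32_alias;

/* Read the object representation of a float through the may_alias type. */
uint32_t float_bits(const float *f)
{
    const u32_alias *p = (const u32_alias *)f;
    return *p;
}
```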
char x[sizeof(struct Foo)];
struct Foo* f = (struct Foo*)&x;
struct Bar* b = (struct Bar*)&x;
An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
— a type compatible with the effective type of the object,
— a qualified version of a type compatible with the effective type of the object,
— a type that is the signed or unsigned type corresponding to the effective type of the object,
— a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
— an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
— a character type.
https://www.iso-9899.info/n1570.html#6.5p7
I realise that many implementations will allow it anyway but if you're relying on that then you may as well fall back to a straight cast from Foo* to Bar*, which is also not allowed in theory.