Hacktical C: practical hacker's guide to the C programming language

(github.com)

9d ◴[14 Apr 25 13:49 UTC] No.43681256[source]▶

> C doesn't try to save you from making mistakes. It has very few opinions about your code and happily assumes that you know exactly what you're doing. Freedom with responsibility.

I love C because it doesn't make my life very inconvenient to protect me from stubbing my toe in it. I hate C when I stub my toe in it.

replies(5): >>43682578 #>>43683142 #>>43683157 #>>43683835 #>>43684772 #

oconnor663 ◴[14 Apr 25 18:54 UTC] No.43684772[source]▶

>>43681256 #

> It has very few opinions about your code

I understand where this is coming from, but I think this is less true than it used to be, and (for that reason) it often devolves into arguments about whether the C standard is the actual source of truth for what you're "really" allowed to do in C. For example, the standard says I must never:

- cast a `struct Foo*` into a `struct Bar*` and access the Foo through it (in practice we teach this as the "strict aliasing" rules, and that's how all(?) compilers implement it, but that's not what §6.5 paragraph 7 of the standard says!)

- allow a signed integer to overflow

- pass a NULL pointer to memcpy, even if the length is zero

- read an uninitialized object, even if I "don't care" what value I get

- read and write a value from different threads without locking or atomics, even if I know exactly what instructions those reads and writes compile into and the ISA manual says it's 100% fine to do that

All of these are ways that (modern, standard) C doesn't really "do what the programmer said". A lot of big real-world projects build with flags like -fno-strict-aliasing, so that they can get away with doing these things even though the standard says they shouldn't. But then, are they really writing C or "C with custom extensions"? When we compare C to other languages, whose extensions are we talking about?

replies(1): >>43701472 #

1. ryao ◴[16 Apr 25 04:27 UTC] No.43701472[source]▶

>>43684772 #

  cast a `struct Foo*` into a `struct Bar*` and access the Foo through it (in practice we teach this as the "strict aliasing" rules, and that's how all(?) compilers implement it, but that's not what §6.5 paragraph 7 of the standard says!)

Use the union type. Abusing it for aliasing violates the standard too, but GCC and Clang implement an extension that permits this. Alternatively, just allocate a char array and cast it as you please. Strict aliasing does not apply to char arrays if I recall.
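A rough sketch of the union approach, for illustration only. GCC documents that type punning through a union is allowed even with -fstrict-aliasing, as long as the access goes through the union type. The names `float_pun` and `float_bits_union` are made up here, and the sketch assumes a 32-bit IEEE 754 `float`.

```c
#include <stdint.h>

/* Hypothetical example type: both members share the same storage. */
union float_pun {
    float    f;
    uint32_t u;
};

/* Assumption for this sketch: float and uint32_t have the same size. */
_Static_assert(sizeof(float) == sizeof(uint32_t), "assumes a 32-bit float");

/* Store through one member, then read the same bytes through the other.
 * Every access goes through the union type itself. */
uint32_t float_bits_union(float value)
{
    union float_pun pun;
    pun.f = value;
    return pun.u;
}
```

Taking the address of one member and accessing it through an unrelated pointer type elsewhere is not covered by that GCC guarantee.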

  allow a signed integer to overflow

Is this still true? I thought the reason was that C left it to the implementation to define how signed arithmetic worked, meaning you could not assume two’s complement, but the most recent C standard was supposed to mandate two’s complement.

  pass a NULL pointer to memcpy, even if the length is zero

There is a reason for this. memcpy is allowed to start reading early as a performance optimization, before it does a branch that checks whether reading is needed. I do wonder what happens if you only want to copy 1 byte and that byte has invalid memory right next to it. Presumably, this optimization would read more than a byte.

  read an uninitialized object, even if I "don't care" what value I get

You are probably doing something wrong if you do this. It is not even good as an entropy source.

  read and write a value from different threads without locking or atomics, even if I know exactly what instructions those reads and writes compile into and the ISA manual says it's 100% fine to do that

Earlier C standards likely did not say anything about this because they did not support multithreading, but outside of possibly reading/writing to hardware registers, you do not want to do this because of races. Even if you think you know better, you almost certainly do not.
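For what it is worth, C11 added `<stdatomic.h>` for exactly this situation. Below is a minimal sketch of a one-shot handoff between two threads using release/acquire ordering. The names `payload`, `ready`, `publish` and `try_consume` are invented for the example, and it only covers a single writer publishing once.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical shared state: a plain payload plus an atomic flag. */
static int payload;
static atomic_bool ready;   /* static storage, so it starts out false */

/* Writer side: fill in the payload, then set the flag with release
 * ordering so the payload write is visible before the flag is. */
void publish(int value)
{
    payload = value;
    atomic_store_explicit(&ready, true, memory_order_release);
}

/* Reader side: only touches the payload after observing the flag with
 * acquire ordering. Returns false if nothing has been published yet. */
bool try_consume(int *out)
{
    if (!atomic_load_explicit(&ready, memory_order_acquire))
        return false;
    *out = payload;
    return true;
}
```

The point of the flag is that the compiler and the CPU are both told about the ordering, instead of the programmer relying on what a particular instruction sequence happens to do.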

replies(3): >>43701746 #>>43703321 #>>43704285 #

2. lifthrasiir ◴[16 Apr 25 05:17 UTC] No.43701746[source]▶

>>43701472 (TP) #

> the most recent C standard was supposed to mandate two’s complement.

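While that's true, signed overflow is still undefined behavior, so it does not automatically wrap: implementations may instead trap, for several reasons. (C++20 is in the same position: it mandates two's complement but still leaves signed overflow undefined. [1])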

[1] https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2412.pdf
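Since signed overflow is not something you can rely on wrapping, one common way to stay inside defined behavior is to test before adding. A minimal sketch, with `checked_add_int` being a name made up for this example:

```c
#include <limits.h>
#include <stdbool.h>

/* Adds a and b without ever evaluating an overflowing expression.
 * Returns false (leaving *sum untouched) if the result would not fit. */
bool checked_add_int(int a, int b, int *sum)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *sum = a + b;
    return true;
}
```

GCC and Clang also provide `__builtin_add_overflow`, which does the same job as a compiler extension.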

> memcpy is allowed to start reading early as a performance optimization, [...]

Most modern memcpy implementations would branch on the length anyway, because word-based copying is generally faster than byte-based copying whenever possible. Also many would try SIMD when the copy size exceeds some threshold for the same reason.

>> read an uninitialized object, even if I "don't care" what value I get

> You are probably doing something wrong if you do this.

The GP meant the case like this. Consider `struct foo { bool avail; int value; } foos[100];` where `value` would be only set when `avail` is true. If we are summing all available `value`s, we may want to avoid a branch misprediction by something like `accum += foos[i].avail * foos[i].value;` for each `foos[i]`, since the actual `value` shouldn't matter when `avail` is false. But the current specification prohibits this construction because it considers that each read from `foos[i].value` may be different from each other (!). In reality, this kind of issue is so widespread that LLVM has a special "poison" value which gets resolved to some fixed value after the first use.
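Spelled out as a function, the branch-free sum might look like the sketch below. The function name `sum_available` and the `long long` accumulator are choices made for this example. It is only well defined if every `value` has been initialized, including the ones whose `avail` is false.

```c
#include <stdbool.h>
#include <stddef.h>

struct foo {
    bool avail;
    int  value;
};

/* Sums the values of available entries without branching on avail.
 * Precondition for this sketch: every foos[i].value holds an initialized
 * value (for example 0), even where avail is false. */
long long sum_available(const struct foo *foos, size_t count)
{
    long long accum = 0;
    for (size_t i = 0; i < count; i++)
        accum += (long long)foos[i].avail * foos[i].value;
    return accum;
}
```

For example, with entries `{true, 5}`, `{false, 0}` and `{true, -2}` the function returns 3.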

replies(1): >>43702937 #

3. ryao ◴[16 Apr 25 08:30 UTC] No.43702937[source]▶

>>43701746 #

Thanks for the explanations.

As for the last one, I would probably bzero() that structure, as it is faster than setting just 1 field to zero in a loop, which presumably is what you would do until you have some need to “allocate” a value. That would avoid the problem entirely.

I know bzero() was removed from POSIX, but “bzero()” is nicer to write than “memset() it to zero”.

4. quietbritishjim ◴[16 Apr 25 09:31 UTC] No.43703321[source]▶

>>43701472 (TP) #

> > cast a `struct Foo*` into a `struct Bar*` and access the Foo through it (in practice we teach this as the "strict aliasing" rules, and that's how all(?) compilers implement it, but that's not what §6.5 paragraph 7 of the standard says!)

> Use the union type. Abusing it for aliasing violates the standard too, but GCC and Clang implement an extension that permits this. Alternatively, just allocate a char array and cast it as you please. Strict aliasing does not apply to char arrays if I recall.

I could be misreading, but you seem to be implying that you can trick the aliasing rules by casting Foo* to char* and then cast the char* to Bar*, but that still violates the rule. Even a union isn't allowed as a way of aliasing, but as you say it's often allowed in practice and is heavily used in the Linux kernel (and Linus has made his opinion on this part of the language standard very clear).

In theory, the right way to access the bits of a Foo as a Bar is to memcpy to a fresh Bar object, and then memcpy back if you want to update the original variable. The compiler is then allowed to optimise this into a direct access of the bits.
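A minimal sketch of that memcpy round trip. The layouts of `struct Foo` and `struct Bar` are invented here purely so the example compiles, and the helper names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical layouts chosen for illustration. */
struct Foo { uint32_t lo; uint32_t hi; };
struct Bar { uint64_t bits; };

_Static_assert(sizeof(struct Foo) == sizeof(struct Bar),
               "Foo and Bar must have the same size");

/* Copy the bytes of a Foo into a fresh Bar object. */
struct Bar foo_as_bar(const struct Foo *foo)
{
    struct Bar bar;
    memcpy(&bar, foo, sizeof bar);
    return bar;
}

/* Copy the (possibly modified) bytes back into the original Foo. */
void bar_into_foo(struct Foo *foo, const struct Bar *bar)
{
    memcpy(foo, bar, sizeof *foo);
}
```

No object is ever accessed through a pointer of the wrong type, and compilers commonly turn fixed-size copies like these into plain loads and stores.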

replies(1): >>43705218 #

5. RustyRussell ◴[16 Apr 25 12:01 UTC] No.43704285[source]▶

>>43701472 (TP) #

>> pass a NULL pointer to memcpy, even if the length is zero

> There is a reason for this. memcpy is allowed to start reading early as a performance optimization, before it does a branch that checks whether reading is needed.

Where did you get this idea from? It's not possible, since you can hand it an address at the end of an array and length 0, and the array may end at the end of a page.

You can't read extra bytes in this case!

replies(1): >>43705187 #

6. ryao ◴[16 Apr 25 13:22 UTC] No.43705187[source]▶

>>43704285 #

Handing memcpy() the address at the end of an array and length 0 is undefined behavior. It is often said that the reason for this is to allow memcpy() to read before it branches to make it fast.

This led me to think of the case where you hand it the address of the last byte of a byte array, where the byte after it lies in an unmapped page, and tell it to copy 1 byte. I suspect systems that have such an optimization would read beyond 1 byte into invalid memory. This is my criticism of the idea of having memcpy(NULL, NULL, 0) be undefined to make that speed trick legal. I am suggesting that, by the same logic, copies of some unspecified range of small lengths would also have to be undefined, yet they are not under the standard.
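Whatever the rationale, code that might end up with a NULL pointer and a zero length can sidestep the question with a small wrapper. A sketch, where `copy_bytes` is a made-up name:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: never calls memcpy for an empty copy, so the
 * pointers are not required to be valid when n is 0. */
void *copy_bytes(void *dst, const void *src, size_t n)
{
    if (n == 0)
        return dst;
    return memcpy(dst, src, n);
}
```

With this, `copy_bytes(NULL, NULL, 0)` simply returns NULL without ever reaching the library call.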

7. ryao ◴[16 Apr 25 13:24 UTC] No.43705218[source]▶

>>43703321 #

You are misreading. I said to take a char * and then cast it to whatever you want. You can cast it to struct A *. Then you can cast the original char * to struct B *. The compiler will be fine with this since the strict aliasing rule excludes char *.

If you insist on doing what you described, just skip char * and mark the pointer with __attribute__((may_alias)) and then it will be okay. That is a compiler extension that lets you turn off strict aliasing rules.
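Roughly, the attribute is applied to a typedef and the access then goes through that type. A sketch for GCC and Clang only. The names `aliasing_u32` and `float_bits_may_alias` are invented here, and it assumes a 32-bit IEEE 754 `float` with the same alignment as `uint32_t`.

```c
#include <stdint.h>

/* GCC/Clang extension: accesses through this type are not subject to
 * type-based alias analysis, much like accesses through char. */
typedef uint32_t __attribute__((__may_alias__)) aliasing_u32;

_Static_assert(sizeof(float) == sizeof(uint32_t), "assumes a 32-bit float");

/* Reads the bits of a float through the may_alias type. */
uint32_t float_bits_may_alias(const float *value)
{
    return *(const aliasing_u32 *)value;
}
```

This is not standard C, so it ties the code to compilers that implement the attribute.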

replies(1): >>43706750 #

8. quietbritishjim ◴[16 Apr 25 15:33 UTC] No.43706750{3}[source]▶

>>43705218 #

Ah, I see. Like this:

    char x[sizeof(struct Foo)];
    struct Foo* f = (struct Foo*)&x;
    struct Bar* b = (struct Bar*)&x;

replies(1): >>43718026 #

9. quietbritishjim ◴[17 Apr 25 15:08 UTC] No.43718026{4}[source]▶

>>43706750 #

(I can't edit so replying instead.) But this isn't allowed either. You can access a struct Foo variable through a char* pointer but you can't use struct Foo* to access an object whose actual type ("effective type" in the words of the standard) is char array. The standard says:

An object shall have its stored value accessed only by an lvalue expression that has one of the following types:

— a type compatible with the effective type of the object,

— a qualified version of a type compatible with the effective type of the object,

— a type that is the signed or unsigned type corresponding to the effective type of the object,

— a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,

— an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or

— a character type.

https://www.iso-9899.info/n1570.html#6.5p7

I realise that many implementations will allow it anyway but if you're relying on that then you may as well fall back to a straight cast from Foo* to Bar*, which is also not allowed in theory.
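The direction that the last bullet does allow looks like the sketch below: inspecting the bytes of any object through a character type. The function name `count_nonzero_bytes` is made up for the example.

```c
#include <stddef.h>

/* Counts the nonzero bytes of an object by reading it through
 * unsigned char, which is permitted whatever the object's effective type. */
size_t count_nonzero_bytes(const void *object, size_t size)
{
    const unsigned char *bytes = object;
    size_t count = 0;
    for (size_t i = 0; i < size; i++) {
        if (bytes[i] != 0)
            count++;
    }
    return count;
}
```

It can be called on a `struct Foo` as `count_nonzero_bytes(&foo, sizeof foo)`. The reverse, reading a char array through a `struct Foo*`, is the part the quoted paragraph does not cover.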
