
218 points signa11 | 1 comments | | HN request time: 0.205s | source
9d ◴[] No.43681256[source]
> C doesn't try to save you from making mistakes. It has very few opinions about your code and happily assumes that you know exactly what you're doing. Freedom with responsibility.

I love C because it doesn't make my life very inconvenient to protect me from stubbing my toe in it. I hate C when I stub my toe in it.

replies(5): >>43682578 #>>43683142 #>>43683157 #>>43683835 #>>43684772 #
oconnor663 ◴[] No.43684772[source]
> It has very few opinions about your code

I understand where this is coming from, but I think this is less true than it used to be, and (for that reason) it often devolves into arguments about whether the C standard is the actual source of truth for what you're "really" allowed to do in C. For example, the standard says I must never:

- cast a `struct Foo*` into a `struct Bar*` and access the Foo through it (in practice we teach this as the "strict aliasing" rules, and that's how all(?) compilers implement it, but that's not what §6.5 paragraph 7 of the standard says!)

- allow a signed integer to overflow

- pass a NULL pointer to memcpy, even if the length is zero

- read an uninitialized object, even if I "don't care" what value I get

- read and write a value from different threads without locking or atomics, even if I know exactly what instructions those reads and writes compile into and the ISA manual says it's 100% fine to do that

All of these are ways that (modern, standard) C doesn't really "do what the programmer said". A lot of big real-world projects build with flags like -fno-strict-aliasing, so that they can get away with doing these things even though the standard says they shouldn't. But then, are they really writing C or "C with custom extensions"? When we compare C to other languages, whose extensions are we talking about?

replies(1): >>43701472 #
ryao ◴[] No.43701472[source]

> cast a `struct Foo*` into a `struct Bar*` and access the Foo through it (in practice we teach this as the "strict aliasing" rules, and that's how all(?) compilers implement it, but that's not what §6.5 paragraph 7 of the standard says!)

Use the union type. Abusing it for aliasing violates the standard too, but GCC and Clang implement an extension that permits this. Alternatively, just allocate a char array and cast it as you please. Strict aliasing does not apply to char arrays if I recall.
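
For comparison, a minimal sketch of a third route: copying the bytes with memcpy, which I believe avoids the aliasing question because no object is ever accessed through an lvalue of the wrong type. `float_bits` is a made-up name, and the expected bit patterns assume IEEE 754 single-precision floats:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the thread): return the bit pattern of a float.
   The bytes are copied with memcpy, so the float is never read through a
   uint32_t lvalue. */
static uint32_t float_bits(float f)
{
    _Static_assert(sizeof(uint32_t) == sizeof(float), "float must be 32 bits wide");
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

On an IEEE 754 platform, `float_bits(1.0f)` should come back as `0x3f800000`. Compilers I have looked at turn the memcpy into a plain register move, but that is an observation, not a guarantee.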

> allow a signed integer to overflow

Is this still true? I thought the reason for this was that C left it to the implementation to define how signed arithmetic worked, meaning you could not assume two’s complement, but the most recent C standard was supposed to mandate two’s complement.
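
Whatever the answer, a check made before the addition sidesteps the question. A minimal sketch (`checked_add` is a hypothetical name, not a standard function):

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: store a + b in *out and return true, or return false
   (leaving *out untouched) if the mathematical sum does not fit in an int.
   The range test itself never overflows. */
static bool checked_add(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *out = a + b;
    return true;
}
```

So `checked_add(INT_MAX, 1, &r)` reports failure instead of evaluating an overflowing expression.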

> pass a NULL pointer to memcpy, even if the length is zero

There is a reason for this. memcpy is allowed to start reading early as a performance optimization, before it does a branch that checks if reading is only. I do wonder what happens if you only want to copy 1 byte and that byte has invalid memory right next to it. Presumably, this optimization would read more than a byte.
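
If you need to tolerate NULL with a zero length regardless, one way is to never reach memcpy in that case. A minimal sketch, with `copy_bytes` as a made-up wrapper name:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: memcpy is only called when n is non-zero, so a NULL
   pointer paired with a zero length never reaches it. For n > 0 the usual
   memcpy requirements (valid, non-overlapping buffers) still apply. */
static void *copy_bytes(void *dst, const void *src, size_t n)
{
    if (n != 0)
        memcpy(dst, src, n);
    return dst;
}
```

With this, `copy_bytes(NULL, NULL, 0)` is an ordinary function call that does nothing and returns NULL.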

> read an uninitialized object, even if I "don't care" what value I get

You are probably doing something wrong if you do this. It is not even good as an entropy source.

> read and write a value from different threads without locking or atomics, even if I know exactly what instructions those reads and writes compile into and the ISA manual says it's 100% fine to do that

Earlier C standards likely did not say anything about this because they did not support multithreading, but outside of possibly reading/writing to hardware registers, you do not want to do this because of races. Even if you think you know better, you almost certainly do not.
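
For a plain shared value, the C11 route looks roughly like this. It is a sketch only: `hits`, `record_hit` and `read_hits` are made-up names, and relaxed ordering makes each individual access atomic without ordering anything else around it:

```c
#include <stdatomic.h>

/* Hypothetical shared counter. Static storage duration means it starts at
   zero in a valid state. */
static atomic_int hits;

/* Safe to call from any thread: the increment is a single atomic
   read-modify-write. */
static void record_hit(void)
{
    atomic_fetch_add_explicit(&hits, 1, memory_order_relaxed);
}

/* Safe to call from any thread: returns some value the counter actually held,
   never a torn one. */
static int read_hits(void)
{
    return atomic_load_explicit(&hits, memory_order_relaxed);
}
```

If other data must become visible along with the counter, relaxed ordering is not enough and you would want release/acquire or a lock.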
replies(3): >>43701746 #>>43703321 #>>43704285 #
RustyRussell ◴[] No.43704285[source]
> ryao 7 hours ago | parent | context | flag | on: Hacktical C: practical hacker's guide to the C pro...

>> pass a NULL pointer to memcpy, even if the length is zero

> There is a reason for this. memcpy is allowed to start reading early as a performance optimization, before it does a branch that checks if reading is only.

Where did you get this idea from? It's not possible, since you can hand it an address at the end of an array and length 0. Suppose the array ends at the end of a page.

You can't read extra bytes in this case!

replies(1): >>43705187 #
1. ryao ◴[] No.43705187[source]
Handing memcpy() the address at the end of an array and length 0 is undefined behavior. It is often said that the reason for this is to allow memcpy() to read before it branches to make it fast.

This led me to think of the case where you hand it the address right before the end of a byte array, where the byte after the last byte is in an unmapped page, and tell it to copy 1 byte. I suspect systems that have such an optimization would read beyond 1 byte into invalid memory. This is my criticism of the idea of having memcpy(NULL, NULL, 0) be undefined to make that speed trick legal. I am suggesting that, for the trick to be legal, copies of some unspecified range of small lengths would also have to be undefined, yet under the standard they are not.