The C23 edition of Modern C

(gustedt.wordpress.com)
397 points bwidlar | 36 comments
1. ralphc ◴[] No.41851601[source]
How does "Modern" C compare safety-wise to Rust or Zig?
replies(4): >>41852048 #>>41852113 #>>41852498 #>>41856856 #
2. renox ◴[] No.41852048[source]
You'd be surprised: Zig has one UB (Undefined Behaviour) that C doesn't have!

In release fast mode, unsigned overflow/underflow is undefined in Zig whereas in C it wraps.

:-)

Of course C has many UBs that Zig doesn't have, so C is far less safe than Zig, especially since you can use ReleaseSafe in Zig.

replies(2): >>41852363 #>>41852615 #
3. WalterBright ◴[] No.41852113[source]
Modern C still promptly decays an array to a pointer, so no array bounds checking is possible.

D does not decay arrays, so D has array bounds checking.

Note that array overflow bugs are consistently the #1 problem with shipped C code, by a wide margin.

replies(1): >>41852316 #
4. layer8 ◴[] No.41852316[source]
> no array bounds checking is possible.

This isn’t strictly true: a C implementation is allowed to associate memory-range (or more generally, pointer provenance) metadata with a pointer.

The DeathStation 9000 features a conforming C implementation which is known to catch all array bounds violations. ;)

replies(4): >>41852348 #>>41852932 #>>41854734 #>>41855111 #
5. uecker ◴[] No.41852348{3}[source]
Right. Also, it might sound like array-to-pointer decay is forced onto the programmer. Instead, you can take the address of an array just fine without letting it decay. The type then preserves the length.
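A minimal sketch of that point, assuming a fixed length of 3 (the helper names are made up for illustration): because the parameter is a pointer to the whole array, `sizeof` can still recover the element count inside the callee.

```c
#include <stddef.h>

/* Hypothetical helper: the parameter is a pointer to an array of exactly
   3 ints, so the length is part of the type and sizeof can recover it. */
size_t length_via_array_pointer(int (*arr)[3])
{
    return sizeof *arr / sizeof (*arr)[0];
}

/* Hypothetical helper: read the last element through the array pointer. */
int last_element(int (*arr)[3])
{
    return (*arr)[length_via_array_pointer(arr) - 1];
}
```

A caller would pass `&a` for an `int a[3]`. Passing the address of an array with a different length is a pointer type mismatch that the compiler can diagnose.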
replies(2): >>41853029 #>>41854211 #
6. uecker ◴[] No.41852363[source]
UB does not automatically make things unsafe. You can have a compiler that implements safe defaults for most UB, and then it is not unsafe.
replies(4): >>41852548 #>>41853004 #>>41853083 #>>41853762 #
7. jandrese ◴[] No.41852498[source]
Modern C is barely any different from older C. The language committee for C is extremely conservative; changes tend to happen only around the edges.
8. ahoka ◴[] No.41852548{3}[source]
By definition UB cannot be safe.
replies(2): >>41853174 #>>41854910 #
9. secondcoming ◴[] No.41852615[source]
Does C automatically wrap? I thought you need to pass `-fwrapv` to the compiler to ensure that.
replies(3): >>41852833 #>>41852848 #>>41852877 #
10. greyw ◴[] No.41852833{3}[source]
Unsigned overflow wraps. Signed overflow is undefined behavior.
replies(1): >>41852909 #
11. renox ◴[] No.41852848{3}[source]
-fwrapv is for signed integer overflow not unsigned.
replies(1): >>41853085 #
12. ◴[] No.41852877{3}[source]
13. kbolino ◴[] No.41852909{4}[source]
This distinction does not exist in K&R 2/e which documents ANSI C aka C89, but maybe it was added in a later version of the language (or didn't make it into the book)? According to K&R, all overflow is undefined.
replies(1): >>41853245 #
14. TZubiri ◴[] No.41852932{3}[source]
"The DeathStation 9000"

The what now?

replies(2): >>41853018 #>>41853918 #
15. duped ◴[] No.41853004{3}[source]
That's implementation defined behavior, not undefined behavior. Undefined behavior explicitly refers to something the compiler does not provide a definition for, including "safe defaults."
replies(2): >>41853169 #>>41854616 #
16. layer8 ◴[] No.41853018{4}[source]
Google it.
replies(1): >>41854150 #
17. codr7 ◴[] No.41853029{4}[source]
Nice, when you know the length at compile time, which is rarely from my experience.

The holy grail is runtime access to the length, which means an array would have to be backed by something more elaborate.
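One way to picture "something more elaborate" is a pointer bundled with its length. This is only a sketch: `struct int_slice` and `int_slice_get` are hypothetical names, not anything the C standard provides.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical "fat pointer": a data pointer bundled with its element count. */
struct int_slice {
    int   *data;
    size_t len;
};

/* Bounds-checked read: stores the element in *out and returns true,
   or returns false without touching memory when index is out of range. */
bool int_slice_get(struct int_slice s, size_t index, int *out)
{
    if (index >= s.len)
        return false;
    *out = s.data[index];
    return true;
}
```

The length travels with the pointer at runtime, so the check works even when the size is not known at compile time. The cost is an extra word per reference and a comparison per access.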

replies(1): >>41856489 #
18. ◴[] No.41853083{3}[source]
19. sp1rit ◴[] No.41853085{4}[source]
Yes, as unsigned overflow is fine by default. AFAIK the issue was originally that there were still machines that used ones' complement for representing negative integers instead of the now customary two's complement.
20. fuhsnn ◴[] No.41853169{4}[source]
Compilers are not prohibited from providing their own definition for UB; that's how UBSan exists.
21. marssaxman ◴[] No.41853174{4}[source]
this depends on your choice of definition for "safe"
22. wahern ◴[] No.41853245{5}[source]
I don't have my copy of K&R handy, but this distinction has existed since the initial codification. From C89:

  3.1.2.5 Types

  [...] A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting unsigned integer type.
Source: C89 (draft) at https://port70.net/~nsz/c/c89/c89-draft.txt
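As a small illustration of the quoted rule (function names are made up for the example), unsigned arithmetic is reduced modulo `UINT_MAX + 1`, so these are well defined for every input:

```c
#include <limits.h>

/* Unsigned addition: the result is reduced modulo UINT_MAX + 1. */
unsigned add_wrapping(unsigned x, unsigned y)
{
    return x + y;
}

/* Unsigned subtraction: going below zero wraps around from the top. */
unsigned sub_wrapping(unsigned x, unsigned y)
{
    return x - y;
}

/* The largest unsigned value plus one wraps around to zero. */
unsigned max_plus_one(void)
{
    return add_wrapping(UINT_MAX, 1u);
}
```

The same expressions on `int` operands would be undefined behavior on overflow, which is the case `-fwrapv` addresses.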
23. renox ◴[] No.41853762{3}[source]
Well, Zig has ReleaseSafe for this. ReleaseFast is for using these UBs to generate the fastest code.
24. bsder ◴[] No.41853918{4}[source]
Nasal daemons for those of us of a slightly older vintage ...
25. TZubiri ◴[] No.41854150{5}[source]
Yeah, why have any type of human interaction in a forum when you can just refer your fellow brethren to the automaton.
replies(1): >>41854271 #
26. WalterBright ◴[] No.41854211{4}[source]
C:

    int foo(int a[]) { return a[5]; }

    int main() {
        int a[3];
        return foo(a);
    }

    > gcc test.c
    > ./a.out
Oops.

D:

    int foo(int[] a) { return a[5]; }

    int main() {
        int[3] a;
        return foo(a);
    }

    > ./cc array.d
    > ./array
    core.exception.ArrayIndexError@array.d(1): index [5] is out of bounds for array of length 3
Ah, Nirvana!

How to fix it for C:

https://www.digitalmars.com/articles/C-biggest-mistake.html

replies(1): >>41856518 #
27. layer8 ◴[] No.41854271{6}[source]
I’m saying this because any explanation I could offer would provide less insight than the Google results.
replies(1): >>41854957 #
28. Maxatar ◴[] No.41854616{4}[source]
The C standard says, and I quote:

>Possible undefined behavior ranges from ignoring the situation completely with unpredictable results ... or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message)

So a compiler is absolutely welcome to make undefined behavior safe. In fact, every compiler I know of, such as GCC, Clang, and MSVC, has flags to make various undefined behavior safe, such as signed integer overflow, type punning, and casting function pointers to void pointers.

The Linux kernel is notorious for leveraging undefined behavior in C for which GCC guarantees specific and well defined behavior.

It looks like there is also the notion of unspecified behavior, which gives compilers a choice about the behavior and does not require compilers to document that choice or even choose consistently.

And finally there is what you bring up, which is implementation defined behavior which is defined as a subset of unspecified behavior in which compilers must document the choice.

29. trealira ◴[] No.41854734{3}[source]
> The DeathStation 9000 features a conforming C implementation which is known to catch all array bounds violations. ;)

That actually really does exist already with CHERI CPUs, whose pointers are tagged with "capabilities," which catch buffer overruns at runtime.

https://tratt.net/laurie/blog/2023/two_stories_for_what_is_c...

https://msrc.microsoft.com/blog/2022/01/an_armful_of_cheris/

30. Maxatar ◴[] No.41854910{4}[source]
The definition given by the C standard allows for safe undefined behavior.
31. TZubiri ◴[] No.41854957{7}[source]
Less insight, perhaps, but of higher quality, which is subjective.

I personally find that googling stuff provides little connection to the subject of study. It is very impersonal, and I try to avoid it.

For example I did google the concept, and found this https://github.com/cousteaulecommandant/ds9k.

That is not trivial to parse. Bing presented the answer as authoritative, but if you look at the code it is really nothing. It seems to be a folklore concept, and as such it is much more aptly transmitted by speaking to a human and getting a live version than by googling an authoritative static answer.

32. Rusky ◴[] No.41855111{3}[source]
A worked example: https://github.com/pizlonator/llvm-project-deluge/blob/delug...
33. uecker ◴[] No.41856489{5}[source]
Oh, it also works for runtime length:

https://godbolt.org/z/PnaWWcK9o

replies(1): >>41856543 #
34. uecker ◴[] No.41856518{5}[source]
You need to take the address of the array instead of letting it decay and then size is encoded in the type:

  int foo(int (*a)[6]) { return (*a)[5]; }
  int main() {
    int a[3];
    return foo(&a);
  }
Or for run-time length:

  int foo(int n, int (*a)[n]) { return (*a)[5]; }
  int main() {
    int a[3];
    return foo(ARRAY_SIZE(a), &a);
  }
  /app/example.c:4:38: runtime error: index 5 out of bounds for type 'int[n]'
https://godbolt.org/z/dxx7TsKbK
35. pjmlp ◴[] No.41856543{6}[source]
Now try that on a compiler without -fsanitize=bounds, yet fully ISO C compliant.
36. pornel ◴[] No.41856856[source]
There's finally a way to safely add two signed numbers, without tricky overflow checks that may trigger UB themselves!