GCC 15.1

(gcc.gnu.org)
270 points jrepinc | 26 comments
Calavar ◴[] No.43792948[source]
> {0} initializer in C or C++ for unions no longer guarantees clearing of the whole union (except for static storage duration initialization), it just initializes the first union member to zero. If initialization of the whole union including padding bits is desirable, use {} (valid in C23 or C++) or use -fzero-init-padding-bits=unions option to restore old GCC behavior.

This is going to silently break so much existing code, especially union-based type punning in C code. {0} used to guarantee full zeroing and {} did not, and step by step we've flipped the situation to the reverse. The only sensible thing, in terms of not breaking old code, would be to have both {0} and {} zero-initialize the whole union.

I'm sure this change was discussed in depth on the mailing list, but it's absolutely mind-boggling to me.
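
For code that needs every byte of a union cleared regardless of how a given compiler version treats {0} or {}, one option is to clear the object with memset. The following is only a minimal sketch: the union `packet` and the helpers `packet_clear` and `packet_is_all_zero` are hypothetical names made up for illustration and do not come from GCC or the release notes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical union for illustration: the first member is much
   smaller than the largest member. */
union packet {
    unsigned char tag;      /* first member: 1 byte */
    unsigned char raw[16];  /* largest member: 16 bytes */
};

/* Zero every byte of the object representation. This does not depend
   on how the compiler interprets a {0} or {} initializer. */
void packet_clear(union packet *p)
{
    assert(p != NULL);
    memset(p, 0, sizeof *p);
}

/* Return 1 if every byte of the union is zero, 0 otherwise.
   Inspecting an object through unsigned char is always permitted. */
int packet_is_all_zero(const union packet *p)
{
    const unsigned char *bytes = (const unsigned char *)p;
    for (size_t i = 0; i < sizeof *p; i++) {
        if (bytes[i] != 0)
            return 0;
    }
    return 1;
}
```

Calling `packet_clear(&p)` right after declaring `union packet p;` makes the intent explicit, at the cost of one extra call compared with an initializer.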

replies(14): >>43793036 #>>43793080 #>>43793121 #>>43793150 #>>43793166 #>>43794045 #>>43794558 #>>43796460 #>>43798312 #>>43798826 #>>43800132 #>>43800234 #>>43800932 #>>43800975 #
ogoffart ◴[] No.43793121[source]
> This is going to silently break so much existing code

The code was already broken. It was undefined behavior.

That's a problem with C and its undefined behavior minefields.

replies(3): >>43793132 #>>43793486 #>>43796042 #
ryao ◴[] No.43793132[source]
GCC has long been known to define undefined behavior in C unions. In particular, type punning in unions is undefined behavior under the C and C++ standards, but GCC (and Clang) define it.
replies(3): >>43793225 #>>43793908 #>>43794163 #
1. mtklein ◴[] No.43793225[source]
I have always thought that punning through a union was legal in C but UB in C++, and that punning through incompatible pointer casting was UB in both.

I am basing this entirely on memory and the wikipedia article on type punning. I welcome extremely pedantic feedback.

replies(3): >>43793282 #>>43794008 #>>43794170 #
2. ryao ◴[] No.43793282[source]
There has been plenty of misinformation spread on that. One of the GCC developers told me explicitly that type punning through a union was UB in C, but defined by GCC when I asked (after I had a bug report closed due to UB). I could find the bug report if I look for it, but I would rather not do the search.
replies(3): >>43793958 #>>43794040 #>>43794333 #
3. trealira ◴[] No.43793958[source]
From a draft of the C23 standard, this is what it has to say about union type punning:

> If the member used to read the contents of a union object is not the same as the member last used to store a value in the object the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called type punning). This might be a non-value representation.

In past standards, it said "trap representation" rather than "non-value representation," but in none of them did it say that union type punning was undefined behavior. If you have a PDF of any standard or draft standard, just doing a search for "type punning" should direct you to this footnote quickly.

So I'm going to say that if the GCC developer explicitly said that union type punning was undefined behavior in C, then they were wrong, because that's not what the C standard says.
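
The pattern that the footnote describes (store through one member, read through another) can be sketched as follows. This assumes that float is a 32-bit IEEE 754 value, which the C standard does not require; the names `float_bits` and `float_to_bits` are hypothetical and chosen only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: float and uint32_t have the same size on this platform. */
static_assert(sizeof(float) == sizeof(uint32_t),
              "this sketch assumes a 32-bit float");

/* Hypothetical union used to view the bytes of a float as an integer. */
union float_bits {
    float    f;
    uint32_t u;
};

/* Store a value through the float member, then read the same bytes
   back through the integer member. */
uint32_t float_to_bits(float value)
{
    union float_bits pun;
    pun.f = value;
    return pun.u;
}
```

On a platform with IEEE 754 single-precision floats, `float_to_bits(1.0f)` would be expected to yield 0x3F800000.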

replies(2): >>43794268 #>>43794438 #
4. jotux ◴[] No.43794008[source]
Saw this recently and thought it was good: https://www.youtube.com/watch?v=NRV_bgN92DI
5. jotux ◴[] No.43794040[source]
https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#Typ...
replies(1): >>43794458 #
6. jcranmer ◴[] No.43794170[source]
> punning through a union was legal in C

In C89, it was implementation-defined. In C99, it was made expressly legal, but it was erroneously included in the list of undefined behavior annex. From C11 on, the annex was fixed.

> but UB in C++

C++11 adopted "unrestricted unions", which added the concept of an active member: accessing any other member is UB unless you first make it the active one. Except that active members rely on constructors and destructors, which primitive types don't have, so the standard isn't particularly clear on what happens here. The current consensus is that it's UB.

C++20 added std::bit_cast which is a much safer interface to type punning than unions.

> punning through incompatible pointer casting was UB in both

There is a general rule that accessing an object through an 'incompatible' lvalue is illegal in both languages. In general, changing the const or volatile qualifier on the object is legal, as is reading via a different signed or unsigned variant, and char pointers can read anything.
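
As a rough illustration of the permitted alternatives to casting between incompatible pointer types, the sketch below copies the object representation with memcpy and inspects an object through an unsigned char pointer. The function names `bits_via_memcpy` and `first_byte` are hypothetical, and the code assumes a 32-bit float.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: float and uint32_t have the same size on this platform. */
static_assert(sizeof(float) == sizeof(uint32_t),
              "this sketch assumes a 32-bit float");

/* Copy the bytes of the float into an integer instead of reading the
   float through a uint32_t pointer. */
uint32_t bits_via_memcpy(float value)
{
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);
    return bits;
}

/* Character-type access may read the bytes of any object. */
unsigned char first_byte(const void *object)
{
    const unsigned char *bytes = object;
    return bytes[0];
}
```

Compilers commonly optimize a fixed-size memcpy like this into a plain register move, so the copy usually has no runtime cost.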

replies(2): >>43794240 #>>43794425 #
7. trealira ◴[] No.43794240[source]
> In C99, it was made expressly legal, but it was erroneously included in the list of undefined behavior annex.

In C99, union type punning was put under Annex J.1, which is unspecified behavior, not undefined behavior. Unspecified behavior is basically implementation-defined behavior, except that the implementor is not required to document the behavior.

replies(1): >>43794733 #
8. amboar ◴[] No.43794268{3}[source]
Section J.1 _Unspecified_ behavior says

> (11) The values of bytes that correspond to union members other than the one last stored into (6.2.6.1).

So it's a little more constrained in the ramifications, but the outcomes may still be surprising. It's a bit unfortunate that "UB" aliases to both "Undefined behavior" and "Unspecified behavior" given they have subtly different definitions.

From section 4 we have:

> A program that is correct in all other aspects, operating on correct data, containing unspecified behavior shall be a correct program and act in accordance with 5.1.2.4.

9. uecker ◴[] No.43794333[source]
Union type punning is allowed and supported by GCC: https://godbolt.org/z/vd7h6vf5q
replies(1): >>43794384 #
10. ryao ◴[] No.43794384{3}[source]
I said that GCC defines type punning via unions. It is an extension to the C standard that GCC provides.

That said, using “the code compiles in godbolt” as proof that it is not relying on what the standard specifies to be UB is fallacious.

replies(1): >>43796963 #
11. ryao ◴[] No.43794425[source]
The GCC developers disagree as of last December:

> Type punning via unions is undefined behavior in both c and c++.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118141#c13

replies(1): >>43800862 #
12. ryao ◴[] No.43794438{3}[source]
Here is what was said:

> Type punning via unions is undefined behavior in both c and c++.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118141#c13

Feel free to start a discussion on the GCC mailing list.

replies(1): >>43794612 #
13. ryao ◴[] No.43794458{3}[source]
What is your point? I already said that GCC defines it even though the C standard does not. As per the GCC developers:

> Type punning via unions is undefined behavior in both c and c++.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118141#c13

replies(1): >>43794667 #
14. trealira ◴[] No.43794612{4}[source]
I actually might, although not now. Thanks for the link. I'm surprised he directly contradicted the C standard, rather than it just being a misunderstanding.
replies(1): >>43794646 #
15. ryao ◴[] No.43794646{5}[source]
According to another comment, the C standard contradicts the C standard on this:

https://news.ycombinator.com/item?id=43794268

Taking snippets of the C standard out of context of the whole seems to result in misunderstandings on this.

replies(1): >>43794681 #
16. jotux ◴[] No.43794667{4}[source]
> One of the GCC developers told me explicitly that type punning through a union was UB in C, but defined by GCC when I asked

I just was citing the source of this for reference.

replies(1): >>43794683 #
17. trealira ◴[] No.43794681{6}[source]
It doesn't. That commenter is saying that in C99, it was unspecified behavior. From C11 onward, it's been removed from the unspecified behavior annex and type punning is allowed, though it may generate a trap/non-value representation. It was never undefined behavior, which is different.

Edit: no, it's still in the unspecified behavior annex, that's my mistake. It's still not undefined, though.

replies(1): >>43794709 #
18. ryao ◴[] No.43794683{5}[source]
I see. Carry on then. :)
19. ryao ◴[] No.43794709{7}[source]
Most of the C code I write is C99 code, so it is undefined behavior either way for me (if I care about compilers other than GCC and Clang).

That said, I am going to defer to the GCC developers on this since I do not have time to make sense of all versions of the C standard.

replies(1): >>43794754 #
20. ryao ◴[] No.43794733{3}[source]
We can use UB to refer to both. :)
replies(2): >>43794893 #>>43796589 #
21. trealira ◴[] No.43794754{8}[source]
That's fair. In the end, what matters is how C is implemented in practice on the platforms your code targets, not what the C standard says.
22. trealira ◴[] No.43794893{4}[source]
Maybe, but we were talking about "undefined behavior," not "UB," so the point is moot.
replies(1): >>43794969 #
23. ◴[] No.43794969{5}[source]
24. hermitdev ◴[] No.43796589{4}[source]
> We can use UB to refer to both. :)

You can, but in the context of the standard, you'd be wrong to do so. Undefined behavior and unspecified behavior have specific, different meanings in the context of the C and C++ standards.

Conflate them at your own peril.

25. uecker ◴[] No.43796963{4}[source]
I am a member of the standards committee and a GCC maintainer. The C standard supports union punning. (You are right though that relying on godbolt examples can be misleading.)
26. saagarjha ◴[] No.43800862{3}[source]
I think they're wrong about C.