
GCC 15.1

(gcc.gnu.org)
270 points | jrepinc | 4 comments
Calavar ◴[] No.43792948[source]
> {0} initializer in C or C++ for unions no longer guarantees clearing of the whole union (except for static storage duration initialization), it just initializes the first union member to zero. If initialization of the whole union including padding bits is desirable, use {} (valid in C23 or C++) or use -fzero-init-padding-bits=unions option to restore old GCC behavior.

This is going to silently break so much existing code, especially union-based type punning in C code. {0} used to guarantee full zeroing and {} did not, and step by step we've flipped the situation to the reverse. The only sensible thing, in terms of not breaking old code, would be to have both {0} and {} zero-initialize the whole union.

I'm sure this change was discussed in depth on the mailing list, but it's absolutely mind-boggling to me.
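
A minimal sketch of the distinction described in the release note, assuming a hypothetical union `value` whose first member is narrower than the union itself. The names `value`, `value_first_zeroed`, `value_clear`, and `value_is_all_zero` are illustrative and do not come from GCC or from this thread. Per the quoted note, `{0}` is only guaranteed to zero the first member under GCC 15, whereas `memset` clears every byte on any compiler version:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical union: the first member is smaller than the union as a whole. */
union value {
    char tag;
    unsigned long long wide;
};

/* {0} initializes the first member (tag). Under GCC 15, the bytes beyond it
   are no longer guaranteed to be cleared for automatic storage. */
static union value value_first_zeroed(void) {
    union value v = {0};
    return v;
}

/* Clears every byte of the object, independent of compiler version. */
static void value_clear(union value *v) {
    memset(v, 0, sizeof *v);
}

/* Returns 1 if every byte of the object representation is zero, else 0. */
static int value_is_all_zero(const union value *v) {
    const unsigned char *p = (const unsigned char *)v;
    for (size_t i = 0; i < sizeof *v; i++) {
        if (p[i] != 0) {
            return 0;
        }
    }
    return 1;
}
```

Code that depends on every byte being zero could also use `{}` (C23 or C++) or build with `-fzero-init-padding-bits=unions`, as the release note suggests.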

replies(14): >>43793036 #>>43793080 #>>43793121 #>>43793150 #>>43793166 #>>43794045 #>>43794558 #>>43796460 #>>43798312 #>>43798826 #>>43800132 #>>43800234 #>>43800932 #>>43800975 #
ogoffart ◴[] No.43793121[source]
> This is going to silently break so much existing code

The code was already broken. It was undefined behavior.

That's a problem with C and its undefined behavior minefields.

replies(3): >>43793132 #>>43793486 #>>43796042 #
ryao ◴[] No.43793132[source]
GCC has long been known to define undefined behavior in C unions. In particular, type punning in unions is undefined behavior under the C and C++ standards, but GCC (and Clang) define it.
replies(3): >>43793225 #>>43793908 #>>43794163 #
flohofwoe ◴[] No.43794163[source]
> type punning in unions is undefined behavior under the C and C++ standards

Union type punning is entirely valid in C, but UB in C++ (one of the surprisingly many subtle but still fundamental differences between C and C++). There's specifically a (somewhat obscure) footnote about this in the C standard, which has also been clarified further in one of the recent C standards.

replies(1): >>43794351 #
ryao ◴[] No.43794351[source]
There is no footnote about it in the C standard. Someone proposed adding one to standardize the behavior, but it was never accepted. Ever since then, people keep quoting it even though it is a rejected amendment.
replies(1): >>43794374 #
jcranmer ◴[] No.43794374[source]
Footnote 107 in C23, on page 75 in §6.5.2.3:

> If the member used to read the contents of a union object is not the same as the member last used to store a value in the object the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called type punning). This might be a non-value representation.

(though this footnote has been present as far back as C99, albeit with different numbers as the standard has added more text in the intervening 24 years).
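
For illustration, the kind of read that footnote describes might look like the sketch below. It assumes that `float` and `uint32_t` are both 32 bits wide and that `float` uses the IEEE 754 binary32 format; the function names are hypothetical. The `memcpy` variant is included because it is well-defined in both C and C++:

```c
#include <stdint.h>
#include <string.h>

/* Assumption: float is 32 bits wide on this platform. */
_Static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Stores through one union member and reads through another.
   In C, the bytes are reinterpreted as the type of the member being read. */
static uint32_t float_bits_union(float f) {
    union {
        float f;
        uint32_t u;
    } pun;
    pun.f = f;
    return pun.u;
}

/* Copies the object representation byte by byte; valid in both C and C++. */
static uint32_t float_bits_memcpy(float f) {
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}
```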

replies(1): >>43794415 #
ryao ◴[] No.43794415[source]
The GCC developers disagree with your interpretation:

> Type punning via unions is undefined behavior in both c and c++.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118141#c13

replies(2): >>43794552 #>>43804876 #
flohofwoe ◴[] No.43794552[source]
I'm not sure tbh what's there to 'interpret' or how a compiler developer could misread that, the wording is quite clear.
replies(1): >>43794607 #
ryao ◴[] No.43794607[source]
It is an excerpt being taken out of context. Of course it is quite clear. Taking it out of context ignores everything else that the standard says. That interpretation is wrong as far as compiler authors are concerned.
replies(1): >>43794959 #
trealira ◴[] No.43794959[source]
The context is that it's a footnote. The footnote is referenced in this paragraph:

> A postfix expression followed by the . operator and an identifier designates a member of a structure or union object. The value is that of the named member (106), and is an lvalue if the first expression is an lvalue. If the first expression has qualified type, the result has the so-qualified version of the type of the designated member.

> 106) If the member used to read the contents of a union object is not the same as the member last used to store a value in the object the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called type punning). This might be a non-value representation.

In that same document, union type punning is explicitly listed under Annex J.1, Unspecified Behavior:

> (11) The values of bytes that correspond to union members other than the one last stored into (6.2.6.1).

The standard is extremely clear and explicit that it's not undefined behavior.
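
As a rough sketch of what that Annex J.1 entry covers, consider a hypothetical union `mixed` in which a one-byte member overlaps a four-byte member (the names `mixed` and `narrow_store` are made up for this example). After a store to the narrow member, the byte it occupies has a known value, but the remaining bytes of the wide member take unspecified values, so no particular result is guaranteed when reading `big`:

```c
#include <stdint.h>

/* Hypothetical union: small overlaps only the first byte of big. */
union mixed {
    unsigned char small;
    uint32_t big;
};

/* Stores to the wide member first, then to the narrow one. */
static void narrow_store(union mixed *m) {
    m->big = 0xFFFFFFFFu;
    m->small = 0;
    /* The byte shared with small is now 0. The other three bytes of big
       take unspecified values: reading m->big afterwards is not undefined
       behavior, but its value is not something portable code can rely on. */
}
```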

replies(1): >>43795063 #
1. ryao ◴[] No.43795063[source]
This is not considering the document as a whole. I will defer to the GCC developers on what the document means on this.
replies(2): >>43795578 #>>43795596 #
2. trealira ◴[] No.43795578[source]
I'm interested in hearing how considering the document as a whole leads to a different conclusion.
3. jcranmer ◴[] No.43795596[source]
I am a member of the C standards committee, and I'm telling you you're wrong here. Martin Uecker is also a member of the C standards committee, and has just responded to that bug saying that the comment you linked is wrong. I, and others here, have quoted literal standards text to you explaining why type punning through unions is well-defined behavior in C.

I don't know who Andrew Pinski is, but they're factually incorrect regarding the legality of type punning via unions in C.

replies(1): >>43797252 #
4. uecker ◴[] No.43797252[source]
Andrew is a GCC developer who is very competent (much more than myself regarding GCC), but I think he was mistakenly assuming the C++ rules apply to C here as well.