←back to thread

1371 points yett | 1 comments | | HN request time: 0.198s | source
Show context
jandrese ◴[] No.43774091[source]
> Not ignore the compilation warnings – this code most likely threw a warning in the original code that was either ignored or disabled!

What compiler error would you expect here? Maybe not checking the return value from scanf to make sure it matches the number of parameters? Otherwise this seems like a data file error that the compiler would have no clue about.

replies(3): >>43775089 #>>43777497 #>>43778191 #
burch45 ◴[] No.43775089[source]
Undefined behavior to access the uninitialized memory. A sanitizer would have flagged that.
replies(2): >>43775220 #>>43775229 #
jandrese ◴[] No.43775220[source]
The compiler has no way of knowing that the memory would be undefined, not unless it somehow can verify the data file. The most I think it can do is flag the program for not checking the return value of scanf, but even that is unlikely to be true since the program probably was checking for end of file which is also in the return value. It was failing to check the number of matched parameters. This is the kind of error that is easy to miss given the semantics of scanf.
replies(2): >>43775241 #>>43775941 #
nayuki ◴[] No.43775941[source]
> The compiler has no way of knowing that the memory would be undefined

Yes it would. -fsanitize=address does a bunch of instrumentation - it allocates shadow memory to keep track of what main memory is defined, and it checks every read and write address against the shadow memory. It is a combination of compile-time instrumentation and run-time checking. And yes, it is expensive, so it should be used for debugging and not the final release.

https://clang.llvm.org/docs/AddressSanitizer.html , https://learn.microsoft.com/en-us/cpp/sanitizers/asan?view=m...

replies(3): >>43776057 #>>43776483 #>>43776929 #
hoten ◴[] No.43776483[source]
You both may be right. It could be that ASAN is not instrumenting scanf (or some other random standard lib function). Though since 2015, it certainly has been. https://github.com/google/sanitizers/issues/108

The simpler policy of "don't allow unintialized locals when declared" would also have caught it with the tools available when the game was made (though a bit ham-fisted).

replies(1): >>43776808 #
nayuki ◴[] No.43776808[source]
The problem is that after calling scanf(), the number of variables that are defined is a variable number. For example:

    int x, y, z;
    int n = scanf("%d %d %d", &x, &y, &z);
At compile time, you can make no inferences about which of x, y, and z are defined, because that depends on the returned value n. There are many ways to branch out from this.

One is to insist on definite assignment - so if we cannot prove all of them are always assigned, then we can treat them as "possibly undefined" and err out.

Another way is to avoid passing references and instead allow multiple returns, like Python (this is pseudocode):

    x, y, z = scanf("%d %d %d")
In that case, if the hypothetical `scanf()` returns a tuple that is less than 3 elements or more than 3 elements, then the unpacking will fail at run time and crash exactly at that line.

Another way is like Java, which insists that the return value is a scalar, so it can't do what C and Python can do. This can be painful on the programmer, of course.

replies(2): >>43777791 #>>43777955 #
twic ◴[] No.43777955[source]
I interpret "don't allow unintialized locals when declared" as meaning that this call:

    int n = scanf("%d %d %d", &x, &y, &z);
Would be caught, because it takes references to undeclared variables. To be allowed, the programmer would have to initialize the variables beforehand.
replies(1): >>43778926 #
dwattttt ◴[] No.43778926[source]
Then people would complain about the wasteful initialisation of out-params. Foolishly, perhaps
replies(1): >>43780936 #
1. da_chicken ◴[] No.43780936[source]
I think it would make sense to have a keyword that permits unsafe instantiation specifically for the edge cases where initialization is too expensive. But I think it makes sense for the lazy case to be a little bit safer.