475 points danielstocks | 1 comment
henvic No.27304014
As a software engineer, I hate it when I add a check for something "that will never happen" but that would be awful if it did happen, and people complain.

A classic example: you need to get a user from a session, check against a database, and continue if they're signed in.

Then I add a simple if databaseUser.Username != form.Username and people will say "if that happens, we've got something worse wrong". Geez, something might be wrong, and such double checking might prove to be useful.
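
To be concrete, here's a minimal Go sketch of the kind of check I mean; loadUserForSession and the handler are hypothetical stand-ins for whatever session/DB layer you already have:

    package main

    import (
        "log"
        "net/http"
    )

    type User struct {
        Username string
    }

    // loadUserForSession stands in for whatever session + database lookup you
    // already have; it's a hypothetical placeholder, not a real API.
    func loadUserForSession(r *http.Request) (User, error) {
        return User{Username: "joeuser"}, nil
    }

    func handler(w http.ResponseWriter, r *http.Request) {
        databaseUser, err := loadUserForSession(r)
        if err != nil {
            http.Error(w, "not signed in", http.StatusUnauthorized)
            return
        }

        formUsername := r.FormValue("username")

        // The "silly assertion": the row loaded for this session should always
        // belong to the user named in the form. If it ever doesn't, something
        // upstream (session handling, caching, ...) is badly wrong, and failing
        // loudly here is far better than continuing.
        if databaseUser.Username != formUsername {
            log.Printf("user mismatch: db=%q form=%q", databaseUser.Username, formUsername)
            http.Error(w, "internal error", http.StatusInternalServerError)
            return
        }

        // ... continue with the signed-in user ...
        w.Write([]byte("ok " + databaseUser.Username))
    }

    func main() {
        http.HandleFunc("/", handler)
        log.Fatal(http.ListenAndServe(":8080", nil))
    }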

On a smaller scale, bits flip due to cosmic rays and so on. Of course, there must be a limit where we stop, but people actively avoid such "silly assertions" even for important steps.

¯\_(ツ)_/¯

geofft No.27304382
I think there's merit in objecting to "that will never happen" checks in some cases (though, to be clear, I'm not saying the people objecting to your code are thinking about the same thing I am).

Specifically, if you have data that is loaded from some other source, your extra safety check might be validating data that's loaded from that same source, in a way where if something did go wrong, it went wrong in both places you're checking.

In this case, it seems pretty unlikely that Klarna's bug was that they ran "SELECT * FROM users WHERE Username = 'joeuser'" and they got back a row where Username != 'joeuser'. I don't think there's a recorded case of that ever happening with databases.

However, it seems much more likely that Klarna's bug was in HTTP caching or something, such that results were returned for the wrong user. Then there's no opportunity to see databaseUser.Username != form.Username: that check would have indicated that things are correct, but the username being passed into this code was wrong in the first place. That sort of problem definitely happens in the wild - see the "Kenneth" story elsewhere in these comments, or, off the top of my head, https://blog.zulip.com/2021/03/20/zulip-cloud-security-incid... from two months ago.

And if it is, somehow, a database bug, why do you trust the database at that point? What if the database returns part of one row and part of another? What if it returns the username you sent in because of some optimization to avoid copying data, but thanks to a bug (or a cosmic ray) it reads in the rest of the data from an unrelated row? In the unlikely but not totally impossible case that you need to protect yourself against this, validating the username isn't enough; you'd better sign the entire database row and validate the signature before trying to use any of the data that's been returned. (And come up with some reason why you trust your own app code more than the database.)
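
(To be concrete about what that would even mean, here's a rough Go sketch of row signing with an HMAC. rowMAC, verifyRow, the columns, and the key handling are all made up for illustration, and I'm not recommending this:)

    package main

    import (
        "crypto/hmac"
        "crypto/sha256"
        "encoding/hex"
        "fmt"
    )

    // rowMAC computes an HMAC over the fields we care about, length-prefixed and
    // in a fixed order so they can't be confused with each other. The key lives
    // only in the application, never in the database.
    func rowMAC(key []byte, username, email string) string {
        mac := hmac.New(sha256.New, key)
        fmt.Fprintf(mac, "%d:%s\n%d:%s\n", len(username), username, len(email), email)
        return hex.EncodeToString(mac.Sum(nil))
    }

    // verifyRow checks the signature column stored alongside the row before any
    // of the row's data is trusted. Returns false if any field was altered.
    func verifyRow(key []byte, username, email, storedMAC string) bool {
        return hmac.Equal([]byte(rowMAC(key, username, email)), []byte(storedMAC))
    }

    func main() {
        key := []byte("app-only-secret") // hypothetical key management
        sig := rowMAC(key, "joeuser", "joe@example.com")

        fmt.Println(verifyRow(key, "joeuser", "joe@example.com", sig)) // true
        fmt.Println(verifyRow(key, "kenneth", "joe@example.com", sig)) // false
    }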

The problem with such "silly assertions" is that they make you feel like you've added test coverage, when the thing you're testing is something like a database that is extensively tested by its vendor and by everyone else using the database, and there are other seams in your code which are much more likely to break. Meanwhile, they make the code longer and harder to read, which prevents readers of the code from easily identifying what those seams are.

(And by slowing down the API endpoint that talks to the database, it motivates other developers to try to put some caching in front of that endpoint, which may actually cause this sort of problem!)

henvic No.27304636
> I don't think there's a recorded case of that ever happening with databases.
> and there are other seams in your code which are much more likely to break.

One such thing is the abuse of layers and layers of abstraction. For example, many people (unfortunately, in my view) love to use ORMs and query builders, and things like this are much more likely to happen when everything is too generic.

And signing the entire database row and validating it, and so on, might be unjustified for most people, especially if you already count on the integrity checks of the TLS layer, and you can just take the trade-off of adding a simple conditional to check that the data you receive is sane.

This is not essential everywhere, but it is nice to have, especially the further things are outside your control.

For example, if you retrieve data from an external API you should not trust it blindly, but rely on your internal references (security concerns aside, I'm talking about other kinds of erratic behavior or bad data).
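
As a rough Go sketch of what I mean, with a made-up order payload and a hypothetical sanityCheck helper that compares the external response against our own record:

    package main

    import (
        "encoding/json"
        "fmt"
    )

    // externalOrder is what the third-party API claims about an order.
    type externalOrder struct {
        OrderID  string `json:"order_id"`
        Amount   int    `json:"amount_cents"`
        Currency string `json:"currency"`
    }

    // internalOrder is our own record: the reference we actually trust.
    type internalOrder struct {
        OrderID  string
        Currency string
    }

    // sanityCheck cross-checks the external payload against our internal record
    // instead of trusting it blindly: IDs must match and values must be plausible.
    func sanityCheck(ext externalOrder, ours internalOrder) error {
        if ext.OrderID != ours.OrderID {
            return fmt.Errorf("order ID mismatch: got %q, expected %q", ext.OrderID, ours.OrderID)
        }
        if ext.Currency != ours.Currency {
            return fmt.Errorf("currency mismatch: got %q, expected %q", ext.Currency, ours.Currency)
        }
        if ext.Amount < 0 {
            return fmt.Errorf("negative amount: %d", ext.Amount)
        }
        return nil
    }

    func main() {
        payload := []byte(`{"order_id":"A-42","amount_cents":1999,"currency":"SEK"}`)

        var ext externalOrder
        if err := json.Unmarshal(payload, &ext); err != nil {
            panic(err)
        }

        ours := internalOrder{OrderID: "A-42", Currency: "SEK"}
        if err := sanityCheck(ext, ours); err != nil {
            // Fail loudly instead of propagating bad external data.
            panic(err)
        }
        fmt.Println("external data matches internal references")
    }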