
462 points jakevoytko | 1 comments
aetimmes ◴[] No.43493994[source]
(disclaimer: I know OP IRL.)

I'm seeing a lot of comments saying "only 2 days? must not have been that bad of a bug". Some thoughts here:

At my current day job, our postmortem template asks "Where did we get lucky?" In this instance, the author definitely got lucky that they were working at Google, where 1) there were enough users to generate this Heisenbug consistently and 2) they had direct access to Chrome devs.

Additionally, the author (and his team) triaged, root-caused, and remediated a JS compiler bug in 2 days. The sheer amount of complexity involved in trying to narrow down where in the browser code this could all be going wrong is staggering. Consider that the reason it took him "only" two days is that he is very, _very_ good at what he does.

replies(5): >>43494924 #>>43495048 #>>43495849 #>>43496185 #>>43497031 #
seeingnature ◴[] No.43495048[source]
I'd love to see the rest of your postmortem template! I never thought about adding a "Where did we get lucky?" question.

I recently realized that one question for me should be, "Did you panic? What was the result of that panic? What caused the panic?"

I had taken down a network, and the device led me down a pathway that required multiple apps and multiple logins I didn't have to regain access. I panicked and, because the network was small, roamed and moved all devices to my backup network.

The following day, under no stress, I realized that my mistake was that I was scanning a QR code 90 degrees off from its proper orientation. I didn't realize that QR codes had a proper orientation and figured that their corner identifiers handled any orientation. Then it was simple to gain access to that device. I couldn't even replicate the other odd path.

replies(6): >>43495406 #>>43495546 #>>43495814 #>>43496045 #>>43496082 #>>43496261 #
1. somat ◴[] No.43495814[source]
One of my favorite man pages is scan_ffs https://man.openbsd.org/scan_ffs

    The basic operation of this program is as follows:

    1. Panic. You usually do so anyways, so you might as well get it over with. Just don't do anything stupid. Panic away from your machine. Then relax, and see if the steps below won't help you out.

    2. ...