←back to thread

462 points jakevoytko | 1 comments | | HN request time: 0s | source
Show context
BobbyTables2 ◴[] No.43490119[source]
Interesting writeup, but 2 days to debug “the hardest bug ever”, while accurate, seems a bit overdone.

Though abs() returning negative numbers is hilarious.. “You had one job…”

To me, the hardest bugs are nearly irreproducible “Heisenbugs” that vanish when instrumentation is added.

I’m not just talking about concurrency issues either…

The kind of bug where a reproduction attempt takes a week, not parallelizable due to HW constraints, and logging instrumentation makes it go away or fail differently.

2 days is cute though.

replies(13): >>43490149 #>>43490287 #>>43490459 #>>43490557 #>>43491079 #>>43491823 #>>43492539 #>>43492555 #>>43492647 #>>43493115 #>>43493245 #>>43493811 #>>43497018 #
userbinator ◴[] No.43490287[source]
The kind of bug where a reproduction attempt takes a week, not parallelizable due to HW constraints, and logging instrumentation makes it go away or fail differently.

The hardest one I've debugged took a few months to reproduce, and would only show up on hardware that only one person on the team had.

One of the interesting things about working on a very mature product is that bugs tend to be very rare, but those rare ones which do appear are also extremely difficult to debug. The 2-hour, 2-day, and 2-week bugs have long been debugged out already.

replies(3): >>43490473 #>>43491034 #>>43491397 #
gmueckl ◴[] No.43490473[source]
That reminded me of a former colleague at the desk next to me randomly exclaiming one day that he had just fixed a bug he had created 20 years ago.

The bug was actually quite funny in a way: it was in the code displaying the internal temperature of the electronics box of some industrial equipment. The string conversion was treating the temperature variable as an unsigned int when it was in fact signed. It took a brave field technician in Finland in winter, inspecting a unit in an unheated space to even discover this particular bug because the units' internal temperatures were usually about 20C above ambient.

replies(2): >>43490560 #>>43493833 #
treyd ◴[] No.43490560[source]
This is a surprisingly common mistake with temperature readings. Especially when the system has a thermal safety power off that triggers if it's above some temperature, but then interprets -1 deg C as actually 255 deg C.
replies(2): >>43490862 #>>43491874 #
1. selimthegrim ◴[] No.43490862[source]
I have seen this on Walgreens signs in suburban New Orleans, oddly enough.