←back to thread

462 points jakevoytko | 1 comments | | HN request time: 0.206s | source
Show context
BobbyTables2 ◴[] No.43490119[source]
Interesting writeup, but 2 days to debug “the hardest bug ever”, while accurate, seems a bit overdone.

Though abs() returning negative numbers is hilarious.. “You had one job…”

To me, the hardest bugs are nearly irreproducible “Heisenbugs” that vanish when instrumentation is added.

I’m not just talking about concurrency issues either…

The kind of bug where a reproduction attempt takes a week, not parallelizable due to HW constraints, and logging instrumentation makes it go away or fail differently.

2 days is cute though.

replies(13): >>43490149 #>>43490287 #>>43490459 #>>43490557 #>>43491079 #>>43491823 #>>43492539 #>>43492555 #>>43492647 #>>43493115 #>>43493245 #>>43493811 #>>43497018 #
userbinator ◴[] No.43490287[source]
The kind of bug where a reproduction attempt takes a week, not parallelizable due to HW constraints, and logging instrumentation makes it go away or fail differently.

The hardest one I've debugged took a few months to reproduce, and would only show up on hardware that only one person on the team had.

One of the interesting things about working on a very mature product is that bugs tend to be very rare, but those rare ones which do appear are also extremely difficult to debug. The 2-hour, 2-day, and 2-week bugs have long been debugged out already.

replies(3): >>43490473 #>>43491034 #>>43491397 #
gmueckl ◴[] No.43490473[source]
That reminded me of a former colleague at the desk next to me randomly exclaiming one day that he had just fixed a bug he had created 20 years ago.

The bug was actually quite funny in a way: it was in the code displaying the internal temperature of the electronics box of some industrial equipment. The string conversion was treating the temperature variable as an unsigned int when it was in fact signed. It took a brave field technician in Finland in winter, inspecting a unit in an unheated space to even discover this particular bug because the units' internal temperatures were usually about 20C above ambient.

replies(2): >>43490560 #>>43493833 #
treyd ◴[] No.43490560[source]
This is a surprisingly common mistake with temperature readings. Especially when the system has a thermal safety power off that triggers if it's above some temperature, but then interprets -1 deg C as actually 255 deg C.
replies(2): >>43490862 #>>43491874 #
1. shakna ◴[] No.43491874[source]
The rollout is still happening, but the new resident water meters for Victoria, Australia come with a temperature fix.

Prior to this year, they could only handle 0-127 degrees for the water temperature. Which used to be sensible, but there were some issues with pressurised water starting to be delivered to houses resulting in negative temperatures being reported, like -125C, which immediately has the water switch off to prevent icing problems.

The software side also switched from COBOL to Ada. So that's kewl.