
1087 points smartmic | 3 comments
titanomachy ◴[] No.44305194[source]
“Good debugger worth weight in shiny rocks, in fact also more”

I’ve spent time at small startups and on “elite” big tech teams, and I’m usually the only one on my team using a debugger. Almost everyone in the real world (at least in web tech) seems to do print statement debugging. I have tried and failed to get others interested in using my workflow.

I generally agree that it’s the best way to start understanding a system. Breaking on an interesting line of code during a test run and studying the call stack that got me there is infinitely easier than trying to run the code forwards in my head.

Young grugs: learning this skill is a minor superpower. Take the time to get it working on your codebase, if you can.

geophile ◴[] No.44305628[source]
I am also in the camp that has very little use for debuggers.

A point that may be pedantic: I don't add (and then remove) "print" statements. I add logging code, that stays forever. For a major interface, I'll usually start with INFO level debugging, to document function entry/exit, with param values. I add more detailed logging as I start to use the system and find out what needs extra scrutiny. This approach is very easy to get started with and maintain, and provides powerful insight into problems as they arise.
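A minimal Python sketch of permanent entry/exit logging on a major interface, assuming the standard `logging` module. The logger name `app.interface` and the function `reserve_seats` are hypothetical and exist only for illustration:

```python
import functools
import logging

log = logging.getLogger("app.interface")  # hypothetical logger name


def logged(func):
    """Log entry (with arguments) and exit (with result) at INFO level."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        log.info("enter %s args=%r kwargs=%r", func.__name__, args, kwargs)
        try:
            result = func(*args, **kwargs)
        except Exception:
            # Record the failure with a traceback, then let it propagate.
            log.exception("error in %s", func.__name__)
            raise
        log.info("exit %s result=%r", func.__name__, result)
        return result
    return wrapper


@logged
def reserve_seats(order_id, count=1):  # hypothetical interface function
    return {"order_id": order_id, "reserved": count}
```

Because the decorator stays in the code, raising or lowering the level of `app.interface` is enough to turn this output on or off without touching the function.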

I also put a lot of work into formatting log statements. I once worked on a distributed system, and getting the prefix of each log statement exactly right was very useful -- node id, pid, timestamp, all of it fixed width. I could download logs from across the cluster, sort, and have a single file that interleaved actions from across the cluster.
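One way a fixed-width prefix like this might look in Python's standard `logging` module. The field order and widths are assumptions (the comment does not specify them); the timestamp is placed first here so that a plain lexical sort interleaves lines by time. `NodeFilter`, `make_formatter`, and `merge_logs` are hypothetical names:

```python
import logging
import time


class NodeFilter(logging.Filter):
    """Attach a node id to every record so the formatter can print it."""

    def __init__(self, node_id):
        super().__init__()
        self.node_id = node_id

    def filter(self, record):
        record.node = self.node_id
        return True


def make_formatter():
    # Every prefix field has a fixed width:
    # 23-char UTC timestamp, 8-char node id, 7-digit pid, 5-char level.
    fmt = ("%(asctime)s.%(msecs)03d %(node)-8.8s %(process)07d "
           "%(levelname)-5.5s %(message)s")
    formatter = logging.Formatter(fmt, datefmt="%Y-%m-%dT%H:%M:%S")
    formatter.converter = time.gmtime  # same clock basis on every node
    return formatter


def merge_logs(*logs):
    """Interleave lines from several nodes by sorting on the prefix."""
    return sorted(line for log in logs for line in log)
```

To use it, add a `NodeFilter` and this formatter to the same handler on each node. The merged file is only as well ordered as the nodes' clocks are synchronized.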

replies(4): >>44305698 #>>44306106 #>>44306184 #>>44308522 #
AdieuToLogic ◴[] No.44306184[source]
> A point that may be pedantic: I don't add (and then remove) "print" statements. I add logging code, that stays forever. For a major interface, I'll usually start with INFO level debugging, to document function entry/exit, with param values.

This is an anti-pattern that produces voluminous log "noise" when the system operates as expected; I have personally seen it generate gigabytes of logs per day. It can also litter the solution with transient concerns that were once thought important and are no longer relevant.

If detailed method invocation history is a requirement, consider using the Writer Monad[0] and only emitting log entries when either an error is detected or in an "unconditionally emit trace logs" environment (such as local unit/integration tests).
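A rough Python sketch of the idea, for readers who don't use Haskell. It is Writer-style accumulation rather than a real monad: each step returns its value together with its trace entries, and the runner emits the trace only on failure or when asked to. The names `run`, `halve`, and `reciprocal` are hypothetical:

```python
def halve(x):
    # Each step returns (new_value, trace_entries) instead of logging directly.
    return x / 2, [f"halve({x})"]


def reciprocal(x):
    return 1 / x, [f"reciprocal({x})"]


def run(value, steps, emit, trace_always=False):
    """Thread a value through steps, collecting trace entries as it goes.

    The trace reaches `emit` only if a step raises, or if trace_always is
    set (for example in local unit or integration tests).
    """
    trace = []
    try:
        for step in steps:
            value, entries = step(value)
            trace.extend(entries)
    except Exception as exc:
        trace.append(f"failed: {exc!r}")
        for entry in trace:
            emit(entry)
        raise
    if trace_always:
        for entry in trace:
            emit(entry)
    return value
```

On the happy path `run(8, [halve, halve], print)` prints nothing; when a step raises, everything collected up to the failure is emitted before the exception propagates.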

0 - https://williamyaoh.com/posts/2020-07-26-deriving-writer-mon...

replies(2): >>44306322 #>>44307569 #
strken ◴[] No.44306322[source]
It's absolutely not an anti-pattern if you have appropriate tools to handle different levels of logging, and especially not if you can filter debug output by area. You touch on this, but it's a bit strange to me that the default case is assumed to be "all logs all the time".

I usually roll my own wrapper around an existing logging package, but https://www.npmjs.com/package/debug is a good example of what life can be like if you're using JS. Want to debug your rate limiter? Write `DEBUG=app:middleware:rate-limiter npm start` and off you go.
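For comparison, a similar per-area toggle can be sketched in Python with the standard `logging` hierarchy. The environment variable `APP_DEBUG` and the helper `enable_debug_areas` are hypothetical and are not part of the `debug` package or of `logging`:

```python
import logging
import os


def enable_debug_areas(spec):
    """Set DEBUG level on each comma-separated logger name in spec.

    Logger names are hierarchical, so enabling "app.middleware" also
    enables "app.middleware.rate_limiter". Returns the names enabled.
    """
    names = [name.strip() for name in spec.split(",") if name.strip()]
    for name in names:
        logging.getLogger(name).setLevel(logging.DEBUG)
    return names


# Quiet by default; areas are switched on per run, e.g. (hypothetical):
#   APP_DEBUG=app.middleware.rate_limiter python server.py
logging.basicConfig(level=logging.WARNING)
enable_debug_areas(os.environ.get("APP_DEBUG", ""))

limiter_log = logging.getLogger("app.middleware.rate_limiter")
limiter_log.debug("bucket refilled")  # dropped unless the area is enabled
```

The debug calls stay in the code permanently; with nothing enabled, a call costs little more than a level check.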

replies(3): >>44306540 #>>44307369 #>>44309715 #
1. TeMPOraL ◴[] No.44307369[source]
I know a lot of people do that in all kinds of software (especially enterprise). Still, I can't help but notice this is getting close to Greenspunning[0] territory.

What you describe is leaving around hand-rolled instrumentation code that conditionally executes expensive reporting actions, which you can toggle on demand between executions. Thing is, this is already all done automatically for you[1] - all you need is the right build flag to prevent optimizing away information about function boundaries, and then you can easily add and remove such instrumentation code on the fly with a debugger.

I mean, tracing function entry and exit with params is pretty much the main task of a debugger. In some way, it's silly that we end up duplicating this by hand in our own projects. But it goes beyond that; a lot of logging and tracing I see is basically hand-rolling an ad hoc, informally-specified, bug-ridden, slow implementation of 5% of GDB.
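As a small illustration of tracing entry and exit without writing any log lines in the traced function, here is a sketch using Python's `sys.setprofile` hook. This is a Python stand-in for the idea, not GDB, and it is not a claim about how any particular debugger is implemented. `make_tracer` and `check_limit` are hypothetical names:

```python
import sys


def make_tracer(names, emit):
    """Return a profile hook reporting entry/exit of the named functions."""
    def tracer(frame, event, arg):
        code = frame.f_code
        if code.co_name not in names:
            return
        if event == "call":
            # Positional parameters are already bound in the new frame.
            arg_names = code.co_varnames[:code.co_argcount]
            params = {name: frame.f_locals[name] for name in arg_names}
            emit(f"enter {code.co_name} {params}")
        elif event == "return":
            emit(f"exit {code.co_name} -> {arg!r}")
    return tracer


def check_limit(client, used, quota=10):  # hypothetical function under observation
    return used < quota


events = []
sys.setprofile(make_tracer({"check_limit"}, events.append))
try:
    check_limit("alice", 3)
finally:
    sys.setprofile(None)  # remove the instrumentation again
```

The function body contains no logging, and the set of traced names can be changed between runs, which is roughly the kind of toggling described above.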

Why not accept that you need instrumentation in production too, and run everything in a lightweight, non-interactive debugging session? It's literally the same thing, just done better, and a couple of layers of abstraction below your own code, so it's more efficient too.

--

[0] - https://en.wikipedia.org/wiki/Greenspun%27s_tenth_rule

[1] - Well, at least in most languages used on the backend, it is. I'm not sure how debugging works in Node at the JS level, if it exists at all.

replies(2): >>44315207 #>>44322889 #
2. strken ◴[] No.44315207[source]
I agree that logging all functions is reinventing the wheel.

I think there's still value in adding toggleable debug output to major interfaces. It tells you exactly what and where the important events are happening, so that you don't need to work out where to stick your breakpoints.

3. geophile ◴[] No.44322889[source]
A logging library is very, very far from a Turing-complete language, so no Greenspunning. (Yes, I know about that Java logger fiasco from a few years ago. Not my idea.)

I don't want logging done automatically for me; what I want is too idiosyncratic. While I will log every call on major interfaces, I do want to control exactly what is printed. Maybe some parameter values are not of interest. Maybe I want special formatting. Maybe I want the same log line to include something computed inside the function. Also, most of my logging is not on entry/exit. It's deeper down, to look at very specific things.

Look, I do not want a debugger, except for tiny programs or for debugging unit tests. In a system with lots of processes running on lots of nodes, a debugger, if it is even possible to use one, is just too much of a PITA and provides far too minuscule a view of things. I don't want to deal with running to just before the failure, repeatedly, resetting the environment on each attempt, blah, blah, blah. It's a ridiculous way to debug a large and complex system.

What a debugger can do, that is harder with logging, is to explore arbitrary code. If I chase a problem into a part of my system that doesn't have logging, okay, I add some logging, and keep it there. That's a good investment in the future. (This new logging is probably at a detailed level like DEBUG, and therefore only used on demand. Obvious, but it seems like a necessary thing to point out in this conversation.)