
1087 points smartmic | 2 comments
titanomachy ◴[] No.44305194[source]
“Good debugger worth weight in shiny rocks, in fact also more”

I’ve spent time at small startups and on “elite” big tech teams, and I’m usually the only one on my team using a debugger. Almost everyone in the real world (at least in web tech) seems to do print statement debugging. I have tried and failed to get others interested in using my workflow.

I generally agree that it’s the best way to start understanding a system. Breaking on an interesting line of code during a test run and studying the call stack that got me there is infinitely easier than trying to run the code forwards in my head.

Young grugs: learning this skill is a minor superpower. Take the time to get it working on your codebase, if you can.

geophile ◴[] No.44305628[source]
I am also in the camp that has very little use for debuggers.

A point that may be pedantic: I don't add (and then remove) "print" statements. I add logging code that stays forever. For a major interface, I'll usually start with INFO level debugging, to document function entry/exit, with param values. I add more detailed logging as I start to use the system and find out what needs extra scrutiny. This approach is very easy to get started with and maintain, and provides powerful insight into problems as they arise.

I also put a lot of work into formatting log statements. I once worked on a distributed system, and getting the prefix of each log statement exactly right was very useful -- node id, pid, timestamp, all of it fixed width. I could download logs from across the cluster, sort, and have a single file that interleaved actions from across the cluster.
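For illustration, a minimal Python sketch of such a prefix; the field order and widths here are assumptions, not the original system's format:

```python
import logging
import time

# Sketch of a fixed-width log prefix: UTC timestamp, node id, pid, level,
# all constant width, so logs downloaded from every node can be
# concatenated and `sort`ed into one interleaved timeline.
class ClusterFormatter(logging.Formatter):
    def __init__(self, node_id: int):
        super().__init__()
        self.node_id = node_id

    def format(self, record: logging.LogRecord) -> str:
        # Leading ISO-8601 UTC timestamp: lexicographic order == time order.
        ts = time.strftime("%Y-%m-%dT%H:%M:%S", time.gmtime(record.created))
        ts += ".%03dZ" % record.msecs
        # Zero-padded node id / pid and a padded level name keep the
        # prefix the same width on every line.
        return (f"{ts} n{self.node_id:04d} p{record.process:07d} "
                f"{record.levelname:<8} {record.getMessage()}")
```

Because the timestamp leads and every field is fixed width, `cat node*.log | sort` reconstructs a cluster-wide timeline.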

AdieuToLogic ◴[] No.44306184[source]
> A point that may be pedantic: I don't add (and then remove) "print" statements. I add logging code, that stays forever. For a major interface, I'll usually start with INFO level debugging, to document function entry/exit, with param values.

This is an anti-pattern which results in voluminous log "noise" when the system operates as expected; I have personally seen it produce gigabytes per day. It can also litter the solution with transient concerns that were once thought important but are no longer relevant.

If detailed method invocation history is a requirement, consider using the Writer Monad[0] and only emitting log entries when either an error is detected or in an "unconditionally emit trace logs" environment (such as local unit/integration tests).

0 - https://williamyaoh.com/posts/2020-07-26-deriving-writer-mon...
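As a rough illustration of the Writer idea outside Haskell, here is a minimal Python sketch in which each step returns its result paired with log entries, and the caller alone decides whether to emit the accumulated trace (function names are hypothetical):

```python
# Writer-style pair: each step returns (value, log_entries).
# The log stays pure data until someone chooses to emit it.
def unit(value):
    """Lift a plain value into a Writer pair with an empty log."""
    return value, []

def bind(writer, step):
    """Run the next step, concatenating its log onto the accumulated one."""
    value, log = writer
    result, entries = step(value)
    return result, log + entries

# Hypothetical steps that record their own invocation.
def parse(raw):
    return int(raw), [f"parse({raw!r})"]

def double(n):
    return n * 2, [f"double({n})"]

result, trace = bind(bind(unit("21"), parse), double)
# On the happy path `trace` is simply dropped; it is emitted only when an
# error is detected or in an "unconditionally emit trace logs" environment.
```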

strken ◴[] No.44306322[source]
It's absolutely not an anti-pattern if you have appropriate tools to handle different levels of logging, and especially not if you can filter debug output by area. You touch on this, but it's a bit strange to me that the default case is assumed to be "all logs all the time".

I usually roll my own wrapper around an existing logging package, but https://www.npmjs.com/package/debug is a good example of what life can be like if you're using JS. Want to debug your rate limiter? Write `DEBUG=app:middleware:rate-limiter npm start` and off you go.
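A comparable workflow is possible with Python's stdlib `logging`, using hierarchical logger names as the "areas"; the `DEBUG_AREAS` variable name is an assumption of this sketch:

```python
import logging
import os

# Hierarchical logger names play the role of debug's "app:middleware:..."
# areas. DEBUG_AREAS is a hypothetical env var for this sketch, e.g.:
#   DEBUG_AREAS=app.middleware.ratelimit python app.py
logging.basicConfig(level=logging.INFO)
for area in filter(None, os.environ.get("DEBUG_AREAS", "").split(",")):
    logging.getLogger(area).setLevel(logging.DEBUG)

log = logging.getLogger("app.middleware.ratelimit")
log.debug("token bucket refilled")  # emitted only when this area is enabled
```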

AdieuToLogic ◴[] No.44306540[source]
> It's absolutely not an anti-pattern if you have appropriate tools to handle different levels of logging, and especially not if you can filter debug output by area.

It is an anti-pattern due to what was originally espoused:

  I add logging code, that stays forever. For a major 
  interface, I'll usually start with INFO level debugging, to 
  document function entry/exit, with param values.
There is no value for logging "function entry/exit, with param values" when all collaborations succeed and the system operates as intended. Note that service request/response logging is its own concern and is out of scope for this discussion.

Also, you did not address the non-trivial cost implications of voluminous log output.

> You touch on this, but it's a bit strange to me that the default case is assumed to be "all logs all the time".

Regarding the above, production-ready logging libraries such as Logback[0], log4net[1], log4cpp[2], et al, allow for run-time configuration to determine what "areas" will have their entries emitted. So "all logs all the time" is a non sequitur in this context.
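The same per-area, run-time configuration these libraries offer can be sketched with Python's `logging.config.dictConfig`; area names here are hypothetical, and in production the dict would be loaded from a file chosen at deployment time:

```python
import logging.config

# Each named "area" gets its own level, so entries are emitted only where
# extra scrutiny is wanted; everything else stays at the quiet root level.
logging.config.dictConfig({
    "version": 1,
    "handlers": {"console": {"class": "logging.StreamHandler"}},
    "root": {"level": "WARNING", "handlers": ["console"]},
    "loggers": {
        # Hypothetical area names for this sketch.
        "app.billing": {"level": "DEBUG"},
        "app.search": {"level": "INFO"},
    },
})

logging.getLogger("app.billing").debug("emitted")    # area opted in to DEBUG
logging.getLogger("app.search").debug("suppressed")  # below this area's level
```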

What is relevant is the technique identified of emitting execution context when it matters and not when it doesn't. As to your `npm` example, I believe this falls under the scenario I explicitly identified thusly:

  ... an "unconditionally emit trace logs" environment
  (such as local unit/integration tests).
0 - https://logback.qos.ch/

1 - https://logging.apache.org/log4net/index.html

2 - https://log4cpp.sourceforge.net/

geophile ◴[] No.44318706[source]
> There is no value for logging "function entry/exit, with param values" when all collaborations succeed and the system operates as intended.

Well, I agree completely, but those conditions are a tall order. The whole point of debugging (by whatever means you prefer) is for those situations in which things don't succeed or operate as intended. If I have a failure, and suspect a major subsystem, I sure do want to see all calls and param values leading up to a failure.

In addition to this point, you have constructed a strawman in which logging is on all the time. Have you ever looked at syslog? On my desktop Linux system, output there counts as voluminous. It isn't so much space, or so CPU-intensive that I would consider disabling syslog output (even if I could).

The large distributed system I worked on would produce a few GB per day, and the logs were rotated. A complete non-issue. And for the rare times that something did fail, we could turn up logging with precision and get useful information.

AdieuToLogic ◴[] No.44324026[source]
>> There is no value for logging "function entry/exit, with param values" when all collaborations succeed and the system operates as intended.

> Well, I agree completely, but those conditions are a tall order.

Every successful service invocation satisfies "all collaborations succeed and the system operates as intended." Another way to state this is every HTTP `1xx`, `2xx`, and `3xx` response code produced by an HTTP service qualifies as such.

> The whole point of debugging (by whatever means you prefer) is for those situations in which things don't succeed or operate as intended.

Providing sufficient context in the presence of errors, or "situations in which things don't succeed or operate as intended", was addressed thusly:

  If detailed method invocation history is a requirement, 
  consider using the Writer Monad and only emitting log 
  entries when either an error is detected or in an 
  "unconditionally emit trace logs" environment (such as 
  local unit/integration tests).
> If I have a failure, and suspect a major subsystem, I sure do want to see all calls and param values leading up to a failure.

See above.

> In addition to this point, you have constructed a strawman in which logging is on all the time.

No, I addressed your original premise in the context of a production web application, where logging is configured during deployment.

See also your own contradiction by previously asserting, "I sure do want to see all calls and param values leading up to a failure."

So which is it?

Did I construct a strawman "in which logging is on all the time"?

Or do you "want to see all calls and param values leading up to a failure", which requires "logging is on all the time"?

> Have you ever looked at syslog?

This is a strawman. Syslog is a component for logging and has nothing to do with the programs which use it.

> The large distributed system I worked on would produce a few GB per day, and the logs were rotated. A complete non-issue.

If this is the same system you described in another comment in this thread, I identified the standard industry practice of unconditionally sending log entries produced by application nodes in a production environment to a log aggregator rather than storing them in a local file system. The reasons for this are well documented.

geophile ◴[] No.44329492[source]
I'm not sure what point you are making with your scenario involving HTTP response codes. What if the HTTP server crashes, and doesn't send a response at all?

I don't know from Writer Monads. But if you only emit log entries on some future failure or request, that's potentially a lot of logging to keep somewhere. Where? A log aggregator? Local files? Memory? What about log volume? Does this writer monad implement log rotation? It sounds like you are sweeping a lot of the things you object to under this writer monad rug.

Let me be real clear about all calls and param values leading up to a failure.

- In normal operation, turn logging off completely, or turn on some level that produces tolerable log volume (it seems like your threshold is much lower than mine).

- When a failure occurs: restart the service with more logging enabled (hence the all calls and param values), so that you have logging when the failure occurs again.

About local logs vs a log aggregator: The system I worked on was a shared-nothing archive. To add storage and metadata capacity, you add nodes. Each node also stored its own log files. I get that this may not be the answer all the time, and that a log aggregator is useful in some scenarios. However, even in that case, your concerns about log volume seem overblown to me.

AdieuToLogic ◴[] No.44333995[source]
> I'm not sure what point you are making with your scenario involving HTTP response codes.

My point was to identify how common the "happy path" scenario is and was in direct response to:

>> There is no value for logging "function entry/exit, with param values" when all collaborations succeed and the system operates as intended.

> Well, I agree completely, but those conditions are a tall order.

Regarding your question:

> What if the HTTP server crashes, and doesn't send a response at all?

Again, the HTTP status codes were used to illustrate the frequency of successful invocations. But to your point, if an HTTP server crashes then log entries for in-flight workflows would likely not be emitted. A similar possibility also holds for local file system logging as buffering exists (sometimes on multiple levels).

> I don't know from Writer Monads.

No worries. All it is is a formal definition of a type with specific capabilities, which are called "laws" in the functional programming world.

> But if you only emit log entries on some future failure or request, then that's potentially a lot of logging to keep somewhere. Where? Log aggregator? Local files? Memory?

What is meant by "future failure" is a failure potentially encountered during the evaluation of a single invocation. In the context of a HTTP server, this would be the handling of each submission to a specific HTTP endpoint and verb. This is often defined within an IO Monad[0], but does not have to be, and is out of scope for this discussion.

The answer to the rest of your questions is that the deferred log entry definitions are held in memory for the duration of the single service invocation, with any log emissions produced being transmitted to a network-accessible log aggregator via the logging component used.
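For what it's worth, Python's stdlib can approximate this deferral with `logging.handlers.MemoryHandler`, which buffers records in memory and flushes them to its target only when a record at `flushLevel` or above arrives. This sketch assumes a stderr handler stands in for the aggregator-facing one:

```python
import logging
from logging.handlers import MemoryHandler

# DEBUG records for one invocation are buffered in memory; they reach the
# real handler only if a record at ERROR or above arrives first. On the
# happy path the buffer is discarded, so the "first error" keeps its full
# lead-up trace. In production the target would be the network appender
# pointed at the log aggregator.
target = logging.StreamHandler()
buffered = MemoryHandler(capacity=10_000, flushLevel=logging.ERROR, target=target)

log = logging.getLogger("service.request")
log.setLevel(logging.DEBUG)
log.addHandler(buffered)
log.propagate = False

def handle_request(payload, fail=False):
    log.debug("entry: payload=%r", payload)
    try:
        if fail:
            raise ValueError("boom")
        log.debug("exit: ok")
        buffered.buffer.clear()      # success: drop the deferred entries
    except ValueError:
        log.error("request failed")  # >= flushLevel: buffered trace is emitted
```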

> Let me be real clear about all calls and param values leading up to a failure.

The scenario you kindly shared is understandable, yet is one which has been unacceptable in teams I have worked with. Losing the "first error" is not an option in those environments.

0 - https://en.wikibooks.org/wiki/Haskell/Understanding_monads/I...