
88 points | BrainBacon | 2 comments

This project is inspired by some of the asserts in Unreal Engine.

Due to reliance on core_intrinsics, it is necessary to develop using nightly Rust, but there are stubs in place so a production build will not require nightly.
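
For illustration, here is a minimal sketch of how such a nightly/stable split can be wired up; the function name `debug_break` and the use of `debug_assertions` as the gate are my assumptions, not the crate's actual internals:

```rust
// Hypothetical gating: development (debug) builds use the nightly-only
// intrinsic; release builds compile on stable and get a no-op stub.
#![cfg_attr(debug_assertions, feature(core_intrinsics))]

#[cfg(debug_assertions)]
fn debug_break() {
    // Nightly intrinsic that lowers to the platform's breakpoint
    // instruction (e.g. int3 on x86).
    unsafe { core::intrinsics::breakpoint() }
}

#[cfg(not(debug_assertions))]
fn debug_break() {
    // Stub: production builds do nothing here, so stable Rust suffices.
}
```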

I recently released version 0.2 which includes no_std support and adds optional log message arguments to the ensure macro.
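
For illustration, a hypothetical call using the new optional log-message arguments (the crate's exact macro signature may differ):

```rust
// Hypothetical usage: check a condition and, on failure, break into the
// debugger with a formatted log message.
ensure!(index < len, "index {} out of bounds (len = {})", index, len);
```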

JoshTriplett No.42194057
You could potentially build on stable Rust by emitting the breakpoint instructions yourself, at least on popular platforms. For instance, `core::arch::asm!("int3")` on x86, or `core::arch::asm!("brk #1")` on ARM.
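
A minimal sketch of that approach, using only the stable `core::arch::asm!` macro (the function name and target selection here are illustrative):

```rust
/// Emit a platform breakpoint instruction on stable Rust.
#[inline(always)]
pub fn breakpoint() {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    unsafe {
        core::arch::asm!("int3");
    }
    #[cfg(target_arch = "aarch64")]
    unsafe {
        core::arch::asm!("brk #1");
    }
}
```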

Also, this provides motivation to stabilize a breakpoint mechanism, perhaps `core::arch::breakpoint()`. I'm going to propose an API Change Proposal (ACP) to the libs-api team to see if we can provide that in stable Rust.

replies(3): >>42194435 #>>42195778 #>>42197034 #
amluto No.42195778
Plain int3 is a footgun: the CPU does not keep track of the address of the int3 (at least not until FRED), so it reports the address after the int3. It is impossible to reliably undo that in software, most debuggers don't even try, and the result is a failure to identify the location of the breakpoint. It's problematic if the int3 is the last instruction in a basic block, and even worse if the optimizer thinks that whatever follows the int3 is unreachable.
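
A contrived sketch of the failure mode (hypothetical function, x86-64 only): if the trap is the last thing a function emits, the reported address already points past the function's code:

```rust
// The CPU reports the address *after* the int3; here that is past the end
// of the function, at whatever the linker happened to place next.
#[cfg(target_arch = "x86_64")]
#[inline(never)]
fn abort_with_breakpoint() -> ! {
    unsafe { core::arch::asm!("int3", options(noreturn)) }
}
```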

If Rust’s standard library does this, please consider using int3;nop instead.
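
A sketch of the suggested mitigation (names are mine): the trailing nop gives the reported address an instruction that still belongs to the breakpoint site:

```rust
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[inline(always)]
fn breakpoint_padded() {
    // The reported address lands on the nop, inside the same basic block.
    unsafe { core::arch::asm!("int3", "nop") }
}
```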

replies(2): >>42196045 #>>42196978 #
1. rep_lodsb No.42196978
The "canonical" INT 3 is a single byte opcode (CCh), so the debugger can just subtract 1 from the address pushed on the stack to get the breakpoint location.

There is another encoding (CD 03), but no assembler should emit it. Adversarial code could once use it to confuse debug interrupt handlers, but that should be fixed by now.

replies(1): >>42198764 #
2. amluto No.42198764
This would require the debugger to be structured in a way that makes this make sense. A debugger like GDB has a gnarly data structure that represents the machine state, and it contains things like EIP/RIP. There is a command 'backtrace' that takes the machine state and attempts to generate a backtrace, and there's a command 'continue' that resumes execution.

int3 is a "trap". continue will resume execution at the instruction after int3, as intended. But backtrace should, by some ill-defined magic, generate the backtrace as though RIP was (saved RIP - 1). And the condition for doing this isn't something that is (AFAIK) representable at all in GCC's worldview. Sure, GCC knows, or at least ought to know [0], that it gained control because of vector 3, and the Intel and AMD manuals say that vector 3 is a trap. But there isn't a bit in memory or anything you would see in 'info regs' that will say "hey, this is a 'trap', and backtraces and such should be done as though RIP was actually RIP-1".

Maybe the right solution would be to split the program counter, from the perspective of the debugger, into two fields: program counter for backtracing, and program counter for resumption.
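
A sketch of that split (types and names are mine, not any debugger's actual internals):

```rust
// Two views of the program counter after an int3 trap.
struct StopState {
    resume_pc: u64, // what `continue` uses: the saved RIP, after the int3
    unwind_pc: u64, // what `backtrace` uses: attributed to the int3 itself
}

fn stop_from_int3(saved_rip: u64) -> StopState {
    StopState {
        resume_pc: saved_rip,
        unwind_pc: saved_rip - 1, // step back over the one-byte CC opcode
    }
}
```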

And yes, I know that GDB gets this wrong. Been there, seen the failures. I have not checked, but I expect that LLDB behaves exactly like GDB in this regard.

[0] ptrace on Linux exposes the vector number, somewhat awkwardly. Or you can infer it from the fact that the signal was SIGTRAP.