Rather than leave it out, I added a status column which shows different statuses for intermediate hops (blue if the hop responds to less than 100% of probes and brown if it responds to 0%) vs the target hop (which shows amber and red respectively).
Where this breaks down is when dealing with ECMP for UDP & TCP tracing, as a given hop (ttl) may represent the target for one round of tracing but not for the next. The mistake, imho, is to associate _any_ data with a hop (ttl) rather than with the hop in the context of a tracing flow.
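To illustrate the point (this is just a sketch, not Trippy's actual data model): keying per-hop statistics by (flow, ttl) rather than by ttl alone keeps those two cases separate.

    use std::collections::HashMap;

    /// Identifies a single tracing flow; for UDP/TCP this would be derived from
    /// the 5-tuple. (Hypothetical type, for illustration only.)
    #[derive(Clone, Copy, PartialEq, Eq, Hash)]
    struct FlowId(u64);

    #[derive(Default)]
    struct HopStats {
        probes_sent: u32,
        probes_received: u32,
    }

    /// Keying by (flow, ttl) keeps statistics for different ECMP paths separate,
    /// so a hop that is the target in one flow but an intermediate hop in
    /// another is never conflated.
    #[derive(Default)]
    struct TraceData {
        hops: HashMap<(FlowId, u8), HopStats>,
    }

    impl TraceData {
        fn record(&mut self, flow: FlowId, ttl: u8, responded: bool) {
            let stats = self.hops.entry((flow, ttl)).or_default();
            stats.probes_sent += 1;
            if responded {
                stats.probes_received += 1;
            }
        }
    }

    fn main() {
        let mut trace = TraceData::default();
        // The same ttl recorded under two different flows stays separate.
        trace.record(FlowId(1), 5, true);
        trace.record(FlowId(2), 5, false);
        println!("tracked {} (flow, ttl) entries", trace.hops.len()); // prints 2
    }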
That is why Trippy has a number of features aimed at helping with ECMP, such as Paris and Dublin tracing, and the ability to filter tracing by unique flow id. I've covered these quite a bit in the 0.8.0 [0] and 0.9.0 [1] release notes if you want to know more; there is also a small illustration of the ECMP hashing issue after the links below.
[0] https://github.com/fujiapple852/trippy/releases/tag/0.8.0
[1] https://github.com/fujiapple852/trippy/releases/tag/0.9.0
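For anyone wondering why ECMP scatters classic traceroute probes in the first place: routers typically hash a flow identifier (often the 5-tuple) to pick one of several equal-cost next hops. The toy hash below is purely illustrative (real routers use their own hash functions), but it shows why varying the destination port per probe, as classic UDP traceroute does, can land each probe on a different path, while Paris-style tracing keeps the tuple fixed so every probe follows the same one.

    use std::hash::{Hash, Hasher};
    use std::net::Ipv4Addr;

    /// Toy ECMP-style flow hash over the classic 5-tuple; real routers use
    /// their own (often proprietary) hash functions, so this is illustrative only.
    fn flow_hash(src: Ipv4Addr, dst: Ipv4Addr, src_port: u16, dst_port: u16, proto: u8) -> u64 {
        let mut h = std::collections::hash_map::DefaultHasher::new();
        (u32::from(src), u32::from(dst), src_port, dst_port, proto).hash(&mut h);
        h.finish()
    }

    fn main() {
        let (src, dst) = (Ipv4Addr::new(192, 0, 2, 1), Ipv4Addr::new(198, 51, 100, 1));
        // Classic UDP traceroute varies the destination port per probe, so each
        // probe may hash onto a different equal-cost path:
        for port in 33434u16..33437 {
            println!("dst port {port}: path bucket {}", flow_hash(src, dst, 40000, port, 17) % 4);
        }
        // Paris-style tracing keeps the 5-tuple fixed, so every probe hashes to
        // the same bucket and follows a single path.
        println!("fixed tuple: path bucket {}", flow_hash(src, dst, 40000, 33434, 17) % 4);
    }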
Whether the packet loss starts at your wifi router, at your ISP's router, or at the next hop after your ISP gives you a bit of an idea of where the problem likely is. I solve problems like that all the time.
It also mentions that increased latency or loss at isolated hops is most likely throttling on the device itself. However, if that latency or loss persists on further hops, that indicates a real problem.
Another issue with traceroute is that it usually doesn't account for asymmetry in the return path. What I would find interesting to see is something like isoping/splitping (see this blog post https://blog.benjojo.co.uk/post/ping-with-loss-latency-split) to account for that asymmetry.
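For anyone curious, the core idea behind those tools is to have both endpoints cooperate: each side numbers its probes and reports which ones it saw, so loss can be attributed to a direction instead of only showing up as round-trip loss. A minimal sketch of that bookkeeping (the function and numbers here are illustrative, not how isoping/splitping actually work):

    /// Split observed loss into forward and return directions, given that each
    /// endpoint has reported how many of the other side's numbered probes it
    /// actually received. Illustrative only.
    fn split_loss(sent_a: u32, received_by_b: u32, sent_b: u32, received_by_a: u32) -> (f64, f64) {
        let forward_loss = 1.0 - f64::from(received_by_b) / f64::from(sent_a);
        let return_loss = 1.0 - f64::from(received_by_a) / f64::from(sent_b);
        (forward_loss, return_loss)
    }

    fn main() {
        // A sent 100 numbered probes and B saw 95 of them; B sent 100 numbered
        // replies and A saw 99 of those: the loss is mostly on the outbound leg.
        let (fwd, ret) = split_loss(100, 95, 100, 99);
        println!("forward loss {:.0}%, return loss {:.0}%", fwd * 100.0, ret * 100.0);
    }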
Regarding the tool trippy itself: awesome visualizations!
The only way to reliably isolate packet loss to a hop on the path is to have a test destination that packets reach through that hop, that is in the same bailiwick as that hop, and that doesn't perform rate limiting or policing of ICMP traffic.
I always like to say that traceroute gives you the approximate path a packet will take, whereas ping is for end-to-end measurement of latency and loss. I'm personally not a big fan of combining the two tools.
Being able to indicate cascading loss (e.g. on path A->B->C->D, loss shown at B, C, and D) is worth bubbling up to the user to say there might be a real issue. Any indication of loss at D is also an issue. Trying to reconcile these scenarios in the UI matters, but I don't think there's an easy way. What I think is more important than the UI, and is sorely needed, is documentation / a users guide explaining how to read and understand these indicators. I know documentation is usually overlooked by users first trying out a program, but having it documented and explained gives you a reference to point to when a user misunderstands the tool. I found that because MTR didn't have this much-needed documentation / reference, people would easily misunderstand the tool and it was a herculean effort to correct them.
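As a rough sketch of the heuristic I have in mind (illustrative only, not how any existing tool classifies loss): intermediate-hop loss that doesn't persist to the destination is usually just ICMP rate limiting, whereas loss that does reach the destination is worth flagging from the first affected hop onwards.

    /// Given per-hop loss percentages ending at the destination, return the
    /// index of the hop where a suspected real problem begins, or None if the
    /// destination sees no loss (intermediate loss alone is likely rate limiting).
    fn likely_real_loss(per_hop_loss_pct: &[f64]) -> Option<usize> {
        let dest_loss = *per_hop_loss_pct.last()?;
        if dest_loss == 0.0 {
            return None; // destination gets everything: intermediate loss is cosmetic
        }
        // Flag the first hop at which any loss appears as the likely trouble spot.
        per_hop_loss_pct.iter().position(|&l| l > 0.0)
    }

    fn main() {
        // A -> B -> C -> D with loss at B, C and D: flag B (index 1) as suspect.
        println!("{:?}", likely_real_loss(&[0.0, 50.0, 50.0, 1.0])); // Some(1)
        // Loss only at B and C but a clean destination: nothing to flag.
        println!("{:?}", likely_real_loss(&[0.0, 50.0, 50.0, 0.0])); // None
    }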
I would also like to point out that a 0% loss indicator at the destination isn't reliable either if the packets are spaced out with enough slack. One of my go-to moves when testing packet loss on a link I've brought up is to smash a destination host with a ping flood, e.g. ping -c 100 -f 1.1.1.1. Inundating the link provides a clear indicator of whether there is loss somewhere on the path (usually in the first mile or the last). Cloudflare's speed test now has a packet loss tester that floods 1000 packets, although I'm not sure whether it runs over an unreliable transport or not.
Regarding sending a ping flood: Trippy allows you to reduce the minimum and maximum round duration (and the grace period) so you can send packets almost as fast as you like. For example, to send at 50ms intervals (with a 10ms grace period):
> trip example.com -i 50ms -T 50ms -g 10ms
If D has 1% loss and B and C have 50%, is it fair to say A=0%, B=1%, C=1%, D=1%?
MTR's display of loss is indeed confusing, but when weird things are going on it can be helpful just to stare at it for a while to see what's happening. Trippy looks fantastic, and I need to play with it, but there are cases where I just want to stare at the path loss for a while.
There's no way to influence the TTL on TTL-exceeded responses, is there? It'd be pretty cool if there were some way to get the return path of the intermediaries' replies.