For NVIDIA cards, you can use Nsight. There's also RenderDoc, which works on a wide range of GPUs.
RenderDoc is very cool, but more of a high-level debugger, I guess? It's also good for analyzing performance issues, e.g. when you're working with QML and QSG_VISUALIZE=overdraw or QSG_VISUALIZE=batches (both very high level) don't cut it anymore, or just to get a different perspective. Watching a scene get drawn API call by API call is fun.
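
In case it helps, here's a rough sketch of how you could flip between the QSG_VISUALIZE modes without exporting the variable by hand each time. It's a small Python launcher, not anything Qt ships: `qsg_env` and `launch` are names I made up, and the command you pass in is whatever your own Qt Quick executable is. The mode list is the one I remember from the Qt Quick scene graph docs, so double-check it against your Qt version.

```python
import os
import subprocess

# Modes as I recall them from the Qt Quick scene graph renderer docs
# (assumption: verify against the Qt version you are using).
QSG_VISUALIZE_MODES = ("overdraw", "batches", "clip", "changes")


def qsg_env(mode, base_env=None):
    """Return a copy of the environment with QSG_VISUALIZE set to `mode`.

    Hypothetical helper: the caller's environment is never modified.
    """
    if mode not in QSG_VISUALIZE_MODES:
        raise ValueError(f"unknown QSG_VISUALIZE mode: {mode!r}")
    env = dict(os.environ if base_env is None else base_env)
    env["QSG_VISUALIZE"] = mode
    return env


def launch(app_cmd, mode):
    """Start `app_cmd` (a list such as ["./myapp"]) with the given mode.

    Hypothetical helper: returns the subprocess.Popen handle so the
    caller decides whether to wait for the application to exit.
    """
    return subprocess.Popen(app_cmd, env=qsg_env(mode))
```

Something like `launch(["./myapp"], "overdraw").wait()` would then start the app with the overdraw view, where `./myapp` is only a placeholder. Once that view stops telling you enough, capturing the same app in RenderDoc is the next step.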