Unfortunately "mass scraping the internet for training data" and an "LLM powered user agent" get lumped together too much as "AI Crawlers". The user agent shouldn't actually be crawling.
How does this make you any different from the bad-faith LLM actors they are trying to block?
This is not banning you for following <h1><a>Today's Weather</a></h1>.
If you are a robot so poorly coded that it follows links that are explicitly enumerated as not to be followed, that's a problem. From an operator's perspective, how is this different from the case you described?
If a Googler kicked off Googlebot manually from a session every morning, should it not respect robots.txt either?
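To be concrete about what "respecting robots.txt" means mechanically: Python's standard library ships a parser for it. A minimal sketch, with made-up rules and URLs:

    from urllib.robotparser import RobotFileParser

    # Hypothetical robots.txt: paths explicitly enumerated as off limits
    rules = [
        "User-agent: *",
        "Disallow: /admin/",
        "Disallow: /search",
    ]

    parser = RobotFileParser()
    parser.parse(rules)

    # Check each URL before fetching it
    parser.can_fetch("MyAgent/1.0", "https://example.com/weather")      # True
    parser.can_fetch("MyAgent/1.0", "https://example.com/admin/users")  # False

However the fetch was kicked off, the check is the same.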
Just trying to make the point that an LLM-powered user agent fetching a single page at my request isn't a robot.
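For contrast, this is the entire behavior in question: one request, made once, because a human asked. A sketch (the agent name and URL are hypothetical):

    import urllib.request

    # A single page, fetched at the user's explicit request.
    # No link-following, no crawl queue, and an honest User-Agent.
    req = urllib.request.Request(
        "https://example.com/weather",
        headers={"User-Agent": "ExampleAssistant/1.0 (fetching on user request)"},
    )
    with urllib.request.urlopen(req) as resp:
        page = resp.read().decode("utf-8", errors="replace")

Whether that should also consult robots.txt is exactly what's being argued; the point is that it isn't crawling.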