Assuming one trusts the user-agent in this case, one could shrink the reply sent to these bots and avoid touching the disk or any backend application by putting something like this in Nginx:
  if ($http_user_agent ~ (crawler|some-other-bot)) {
      return 200 '\n\n\n\nBot quota exceeded, check back in 2150 years.\n\n\n\n';
  }
There are other variables one can check to decide whether something is a bot, such as $http_accept_language and $http_sec_fetch_mode, but rules built on them should be very well tested before they are allowed to block anything.

I don't use Cloudflare, but maybe they have a way to block the entire ASN for AWS on your account, assuming one does not need inbound connections from AWS. I just blackhole their CIDR blocks [1], but that won't help someone using a CDN.
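
As a rough sketch of the header checks mentioned above, something like the following could flag requests that send neither Accept-Language nor Sec-Fetch-Mode. The variable name $suspect_client and the server name are made up for illustration, and the underlying assumption (that real browsers send at least one of these headers) does not hold for curl, feed readers, monitoring probes or API clients, so this is only a starting point to test against real logs rather than a finished rule.

```nginx
# Goes in the http {} context.
# $suspect_client is a hypothetical name chosen for this example.
# The two header values are joined with ":" so that a request with
# both headers missing or empty produces exactly the string ":".
map "$http_accept_language:$http_sec_fetch_mode" $suspect_client {
    default 0;
    ":"     1;
}

server {
    listen 80;
    server_name example.com;  # placeholder

    location / {
        # "if" treats an empty string or "0" as false, anything else as true.
        if ($suspect_client) {
            return 200 '\n\n\n\nBot quota exceeded, check back in 2150 years.\n\n\n\n';
        }
        # normal request handling continues here
    }
}
```

Requiring both headers to be absent is deliberately conservative: browsers generally omit Sec-Fetch-Mode on plain HTTP, so matching on that header alone would catch real visitors. It is probably safer to write $suspect_client to the access log for a while and review what it matches before letting it return anything.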