What controls do you suggest?
Saying that a handful of mass copyright infringers with billion-dollar investors are simply part of the "public", like every regular visitor, seriously distorts the issue here.
Sites with a robots.txt banning bots are only "unrestricted" in a strictly technical sense. They are clearly setting terms of use that these rogue bots are violating. Besides, robots.txt is legally binding in certain jurisdictions; it's not just a polite plea. And if we decide that anything not technically prevented is legal, then we're also legitimising botnets, DDoS attacks, and a lot more. Hacking into a corporate system through a misconfiguration or vulnerability is also illegal, despite the fact that the defenses failed.
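To make the robots.txt point concrete, here is a rough sketch of the check a well-behaved crawler is expected to run before fetching anything. It uses Python's standard `urllib.robotparser`; the bot name `ExampleAIBot`, the URL, and the helper `may_fetch` are made up for illustration and don't refer to any real crawler or site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named bot is banned, everyone else is allowed.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/art/piece.png"  # placeholder URL
    print(may_fetch(ROBOTS_TXT, "ExampleAIBot", url))
    print(may_fetch(ROBOTS_TXT, "Mozilla/5.0", url))
```

Nothing in this check is enforced by the server. A bot that skips it, or lies about its user agent, gets the file anyway, which is exactly why "technically accessible" says nothing about what the site owner permitted.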
Finally, we all know that the only reason these bots are scraping is mass copyright infringement. That's another layer where the "if it's accessible, it's fair game" logic falls apart. I can download a lot of publicly accessible art, music, or software, but that doesn't mean I can do whatever I want with those files. The only reason these AI companies haven't been sued out of existence yet, like they should've been, is that it's trickier to prove provenance than if they straight up served the unmodified files.