I'm interested to see how this plays out. I'd like a similar policy for my projects, plus a T&C clause that prohibits crawling of the content.
The only way to prohibit crawling is to go back to invite-only, probably self-hosted, repositories. These companies have no shame: your T&Cs won't mean anything to them, and you have no way of proving they violated them without some kind of discovery into their training data.