"respecting web crawling opt-outs during data acquisition produces virtually no performance degradation"
Great to read that!
replies(3):
Great to read that!
How are you going to serve users if web site owners decide to wall their content? You can't ignore one side of the market.
It's a fair point, but how strong a point remains to be seen. Some architectures are better than others even with the same training data, so it's not impossible that at some point we'll see innovative architectures beating the current proprietary ones. That would probably be short-lived, though, as the proprietary ones would obviously improve in their next release.