FWIW: I started a blog/website on neocities.org a few months ago and found some web crawler blockers there, at least a couple dozen at the time, but they were commented out and had to be uncommented to take effect. So I am using them and, presumably, they are blocking the various crawlers. I still have not put that to the test. By deactivating the blockers and comparing the site traffic stats before and after, one might be able to determine whether they are working or not. You might want to build a test site to try it yourself. Neocities is free for up to one GB of space.
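If those blockers are robots.txt rules (that is my assumption here), a quick sanity check is to ask Python's standard-library parser which crawlers the file actually disallows once lines are uncommented. This is only a rough sketch: the sample file and the bot list below are made up for illustration, not copied from Neocities, and it only tells you what the rules say. robots.txt is advisory, so it cannot show whether a crawler obeys it; the traffic stats would still be needed for that.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration only (not the real Neocities file).
# The CCBot block is still commented out, so it should have no effect.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

# User-agent: CCBot
# Disallow: /

User-agent: *
Allow: /
"""


def blocked_agents(robots_txt, agents, url="/"):
    """Return the user agents that the given robots.txt text disallows for url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


# Example bot names to check; swap in whichever crawlers you care about.
result = blocked_agents(SAMPLE_ROBOTS_TXT, ["GPTBot", "CCBot", "Googlebot"])
print(result)
```

To try it on a real site, paste the contents of your own robots.txt into the string, or use the parser's `set_url()` and `read()` methods to fetch it.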
Years ago, I used Google's Blogspot quite a bit. After every post was published, the first hit came from a server in Germany. Maybe a mirror or a surveillance bot? I'll never know. But sure, it's a little spooky to be training AI with every post without knowing whose AI it is, and it's annoying not to be compensated. I've trained a lot of technical people over the years and it paid well. With AI, we get nothing.
Hey, if you have enough interesting material, why not write a print book?