If there's a pattern, I will find it, and I will exploit it. <3
You can't make money off something that has no value to people. People have to "want" the thing.
You can make a similar argument that the RIAA/MPAA going after piracy is a waste of time. Again, focus on delivering value to actual customers.
If Facebook spent more time on making a friendly ecosystem / community I would be more open to signing back up again. Instead, it seems they are hyper-focused on advertisements at the expense of everything else.
The problem here of course is that circumventing the ad blocking is the most direct way Facebook can find value.
Even if you absolutely mangled the HTML/selectors/DOM/etc., I feel you could always have it process screenshots of the interfaces to rip text, figure out how to interact, etc. If it's human readable, it's bot readable imo. (But in years of botting it has never come to this: I've always been able to figure out how to use the existing DOM/selectors to do my work, even with anti-bot measures.)
Pretty easy to build a randomizing span algo that you can't hardcode.
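To make that concrete, here is a rough sketch of what such a randomizer could look like, assuming the goal is just to wrap text in spans whose boundaries and class names change on every render. The helper names (`random_class`, `obfuscate`, `visible_text`) are made up for illustration and are not anything Facebook is known to use.

```python
import html
import random
import re
import string


def random_class(rng, length=8):
    """Return a throwaway class name; a fresh one is drawn for every span."""
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def obfuscate(text, rng=None):
    """Split text into random-length chunks, each wrapped in a randomly named span."""
    rng = rng or random.Random()
    spans = []
    i = 0
    while i < len(text):
        step = rng.randint(1, 4)  # chunk boundaries differ on every call
        chunk = html.escape(text[i:i + step])
        spans.append(f'<span class="{random_class(rng)}">{chunk}</span>')
        i += step
    return "".join(spans)


def visible_text(markup):
    """What a scraper that ignores class names sees: tags stripped, entities decoded."""
    return html.unescape(re.sub(r"<[^>]+>", "", markup))
```

Two renders of the same string come out as different markup, so a hardcoded selector has nothing stable to match. On the other hand, `visible_text` still recovers the original string from either render, so this only defeats scrapers that key on class names or span structure.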
With all the easy-to-use tools available to programmers today, it would not be terribly hard to use OCR on a screenshot to find the text of interest and derive the scraping code by searching for the OCR'd text in the markup.
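A minimal sketch of the second half of that idea, assuming the OCR step has already produced a string: walk the markup and report the tag path of every text node that contains it. It uses only Python's standard `html.parser`; `TextLocator` and `selectors_for` are hypothetical names, and the selector it emits is a simplification (first id, else classes).

```python
from html.parser import HTMLParser

VOID_TAGS = {"br", "img", "input", "hr", "meta", "link"}


class TextLocator(HTMLParser):
    """Record the tag path of every text node containing the target string."""

    def __init__(self, target):
        super().__init__()
        self.target = " ".join(target.split()).lower()
        self.stack = []  # open elements as (tag, label), outermost first
        self.hits = []   # one CSS-like selector per matching text node

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return  # void elements never contain text
        attrs = dict(attrs)
        label = tag
        if attrs.get("id"):
            label += "#" + attrs["id"]
        elif attrs.get("class"):
            label += "." + ".".join(attrs["class"].split())
        self.stack.append((tag, label))

    def handle_endtag(self, tag):
        # Pop back to the most recent matching open tag, tolerating sloppy markup.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        normalized = " ".join(data.split()).lower()
        if self.target and self.target in normalized:
            self.hits.append(" > ".join(label for _, label in self.stack))


def selectors_for(ocr_text, markup):
    """Return a selector for each text node that contains the OCR'd text."""
    locator = TextLocator(ocr_text)
    locator.feed(markup)
    locator.close()
    return locator.hits
```

Matching is case-insensitive and whitespace-insensitive to absorb some OCR noise. Text that the site splits across several elements would not match a single text node, so a real version would have to join adjacent text nodes first.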
If none of your extant parsers can extract the info you want from the page, send it to an OCR pipeline (or, hell, Mechanical Turk) and generate a new one.
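Roughly, that fallback could be wired up like this. It's only a sketch: the two parsers and their layouts are invented for the example, and the "OCR pipeline / Mechanical Turk" step is reduced to a plain list that something else would drain.

```python
import re


def parse_v1(page):
    """Hypothetical parser for an old layout: <p class="msg">...</p>."""
    m = re.search(r'<p class="msg">(.*?)</p>', page, re.S)
    return m.group(1) if m else None


def parse_v2(page):
    """Hypothetical parser for a newer layout: <div data-role="msg">...</div>."""
    m = re.search(r'<div data-role="msg">(.*?)</div>', page, re.S)
    return m.group(1) if m else None


def extract(page, parsers, fallback_queue):
    """Try each known parser in turn; queue the page for the slow path if all fail."""
    for parser in parsers:
        result = parser(page)
        if result is not None:
            return result
    # Nothing matched: hand the page to OCR or a human, who writes the next parser.
    fallback_queue.append(page)
    return None
```

The point of the queue is that the expensive path only runs when the layout actually changes; once a new parser is added to the list, pages go back to the cheap path.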
If they turn all their posts into <canvas> then it'd kill any accessibility features and the ability to copy-paste text and such so I doubt they'd go that far.
Even then, a scraper could run OCR on the canvas image to get the text out of it.
And advertisers are asking them to show what they are doing to combat adblocking. FB isn't doing this to target customers least likely to convert, they're doing it to check a box for their ad sales team.
Facebook thinks otherwise. Between you and them, I suspect they are more likely to have some evidence or trials to support their position.
Did you mean to say "rectify" as in "fix/adjust"? It sounds like you might have meant "reify" – as in, "create" – but I don't know whether you had the scraper before that.
Revenue is (simplistically) the product of impressions and value per impression. It’s therefore not immediately obvious that moves like this actually do increase revenue for them, especially over the long term, since one potential side effect of doing this is giving more ammunition to the ‘delete Facebook’ crowd.
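To put toy numbers on that (all of them made up, purely to show how the product can go either way):

```python
def revenue(impressions, value_per_impression):
    """Simplistic model: revenue = impressions * value per impression."""
    return impressions * value_per_impression


# Hypothetical baseline: ad-block users see no ads at all.
baseline = revenue(1_000_000, 0.010)

# Hypothetical scenario: circumvention adds 20% more impressions, but the extra
# viewers rarely convert, so the average value per impression is assumed to drop.
forced = revenue(1_200_000, 0.008)

print(f"baseline={baseline:.2f} forced={forced:.2f}")
```

With these assumed inputs the extra impressions do not make up for the lower value of each one. Pick a smaller drop in value per impression and the result flips, which is the whole point: the sign depends on numbers only Facebook has.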
What I meant is that I can hammer out some Node/Python that will grab an image w/text and put it through OCR for character extraction. "Programming" it would take me a handful of minutes.
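Something along these lines, as a sketch rather than a finished tool. It assumes Tesseract as the OCR engine, called through the `pytesseract` wrapper with Pillow for image loading (both need to be installed, plus the tesseract binary); `extract_text` and its `ocr` parameter are my own names, and the parameter exists only so another engine can be swapped in.

```python
def extract_text(image_path, ocr=None):
    """Return the text found in an image, with whitespace collapsed.

    `ocr` is any callable taking a path and returning a string. When omitted,
    Tesseract is used via pytesseract (an assumption, not a requirement).
    """
    if ocr is None:
        # Imported lazily so the function can be used with another OCR backend.
        from PIL import Image
        import pytesseract

        def ocr(path):
            return pytesseract.image_to_string(Image.open(path))

    raw = ocr(image_path)
    return " ".join(raw.split())
```

Usage would be `extract_text("screenshot.png")` for a saved screenshot. Accuracy on small or stylized UI text is a separate question and usually needs some preprocessing (scaling, thresholding) that is left out here.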