Again, this is not about what data is being collected - we do know how much data Google collects. It is all about what is being done with that data and, by extension, how good the end result is.
This touches on the broader subject of systems engineering, and especially validation. As far as I am aware, there are currently no mature tools or methods for validating machine learning models, and the task gets exponentially harder with every degree of freedom given to the ML system. The more data Google collects and tries to use in ranking, the less bounded the ranking task becomes, therefore the less validatable it is, and the more prone to errors.
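To make the "exponentially harder" point concrete, here is a back-of-the-envelope sketch (the signal counts and the coarse coverage scheme are made up for illustration, not anything Google publishes): even crude exhaustive coverage of a model's input space blows up with every additional signal it consumes.

```python
# Hypothetical numbers: each ranking signal is reduced to a few coarse levels
# (low / medium / high), and we count how many combinations a validator would
# have to exercise to cover the input space exhaustively.

def naive_coverage_cases(num_signals: int, levels_per_signal: int = 3) -> int:
    """Test cases needed to cover every combination of coarse signal levels."""
    return levels_per_signal ** num_signals

for n in (5, 20, 100, 200):
    print(f"{n:>3} signals -> {naive_coverage_cases(n):.3e} combinations")
```

Already at a couple hundred signals the combination count dwarfs anything testable, which is the sense in which each added degree of freedom makes validation less tractable.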
Google is such a big player in the search space that they can quantify and qualify the behavior of their ranking system, publish that as SEO guidelines, and have the majority of good-faith actors act accordingly, which reinforces the quality of the model: the more good-faith actors actively compete for the top spots, the more of the top results belong to good-faith actors. However, as evidenced by the OP and other black hat SEO stories, the ranking system can be gamed: signals which should produce a negative ranking score are either not weighted appropriately or, in some cases, even contribute to a positive score.
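As a toy illustration of that failure mode (made-up signal names, values, and weights, nothing to do with Google's actual system): in a simple linear scorer, a signal that should penalize a page ends up boosting it if its weight has the wrong sign or magnitude.

```python
# Hypothetical per-page signal values, normalized to [0, 1].
SIGNALS = {
    "keyword_match": 0.8,
    "inbound_links": 0.9,       # easy to inflate with link farms
    "doorway_page_score": 0.7,  # should push the score DOWN...
}

# Hypothetical weights chosen by the ranking system.
WEIGHTS = {
    "keyword_match": 1.0,
    "inbound_links": 2.0,
    "doorway_page_score": 0.5,  # ...but a positive weight makes it help instead
}

# Linear scoring pass: a black hat page that maxes out the spammy signals
# ends up with a HIGHER score than it would without them.
score = sum(WEIGHTS[name] * value for name, value in SIGNALS.items())
print(f"ranking score: {score:.2f}")
```

The point is not that Google literally uses a linear model, only that a mis-weighted signal is indistinguishable from a positive one from the outside, which is exactly what gaming exploits.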
Google search results are notoriously plagued with Pinterest results, shop-looking sites which redirect to Chinese marketplaces, and the like. It looks like the only tool Google has to combat such actors is manual domain-based blacklisting - otherwise, well, they would have done something systematic about it by now. It seems to me that Google's ranking algorithm is given so many different inputs that it essentially lives a life of its own, and changes are no longer proactive but reactive, because Google does not have sufficient tooling to monitor black hat SEO activity and punish sites accordingly.
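For contrast, this is roughly what that reactive, manual mitigation looks like (hypothetical domains, purely illustrative): results get filtered against a hand-maintained blocklist after ranking, instead of the model itself learning to demote such sites.

```python
from urllib.parse import urlparse

# Hand-maintained blocklist of offending domains (hypothetical examples).
BLOCKLISTED_DOMAINS = {
    "fake-shop.example",
    "pin-scraper.example",
}

def filter_results(ranked_urls):
    """Drop results whose domain is on the manual blocklist, after ranking."""
    return [u for u in ranked_urls if urlparse(u).hostname not in BLOCKLISTED_DOMAINS]

print(filter_results([
    "https://legit-site.example/article",
    "https://fake-shop.example/deal?id=123",
]))
```

A blocklist like this only ever catches domains someone has already reported, which is the "reactive rather than proactive" problem in a nutshell.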