My point is that they're not _simply regurgitating training data_, and it's reductionist to suggest that's all they do. I don't doubt there's plenty of contamination in OpenAI's models, and I don't doubt there's some level of regurgitation happening, but that's not all that's going on, and we need to take seriously the possibility that LLMs, combined with well-engineered prompts, can or will be able to tackle problems that aren't in their training data. Where do you even draw the line anyway?
The conversation about contamination (also very important) doesn't need to be mutually exclusive with conversations about social and economic impact, and with respect to those issues I'm pretty sure the results on standardized tests, however sensationalist, however contaminated, are an important wake-up call for ordinary people who haven't been following along. Something is happening now.