No wall yet and I think we might have crossed the threshold of models being as good or better than most engineers already.
GDPval will be an interesting benchmark and I'll happily use the new model to test spreadsheet (and other office work) capabilities. If they can going like this just a little bit further, much of the office workers will stop being useful.... I don't know yet how to feel about this.
Great for humanity probably but but for the individuals?