https://www.youtube.com/watch?v=QWWgr2rN45o&t=46m20s
The truth is in the middle, I think. LLMs do learn in-context, but not as well as humans.
The approach in the article masks the unreliability of current LLMs by generating thousands of candidate programs, and even then the results aren't human-level. (This is impressive work though -- I'm not criticizing it.)