So 1) 2) and 3) were out by 1,1 and 3 orders of magnitude respectively (the errors partially cancelled out) and 4) was nonsensical.
This little experiment made my skeptical about the state of the art of AI. I have seen much AI output which is extraordinary it's funny how one serious fail can impact my point of view so dramatically.
Looking up the math ability of the average American this is given as an example for the median (from https://www.wyliecomm.com/2021/11/whats-the-latest-u-s-numer...):
>Review a motor vehicle logbook with columns for dates of trip, odometer readings and distance traveled; then calculate trip expenses at 35 cents a mile plus $40 a day.
Which is ok but easier than golf balls in a 747 and hugely easier than USAMO.
Another question you could try from the easy math end is: Someone calculated the tariff rate for a country as (trade deficit)/(total imports from the country). Explain why this is wrong.