
504 points Terretta | 1 comments
esafak ◴[] No.45064606[source]
"On the full subset of SWE-Bench-Verified, grok-code-fast-1 scored 70.8% using our own internal harness."

Let's see this harness, then, because third party reports rate it at 57.6%

https://www.vals.ai/models/grok_grok-code-fast-1

replies(2): >>45067265 #>>45069650 #
jiggawatts ◴[] No.45069650[source]
I know this sounds like a nitpick, but the first thing I noticed when opening the site is the use of gibberish date order where the day, month, and year parts are out of order.[1]

This doesn't just cause confusion, it's also hard to sort. To confirm my suspicion of sloppy coding, I tried to sort the date column and to my surprise I got this madness:

    1/31/2025
    2/29/2024
    2/29/2024
    4/28/2024
    3/27/2024
    9/27/2023

Which is sorting by the day column -- the bit in the middle -- instead of the year!
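For anyone who wants to reproduce this, here is a minimal Python sketch, assuming the column holds plain M/D/YYYY strings like the ones quoted above. The helper names (`parse_us_date`, `day_field`) are made up for illustration, and this is only a guess at the kind of bug involved, not the site's actual code. It compares a sort keyed on the middle field with a sort keyed on the parsed date.

```python
from datetime import datetime

# The six values quoted above, in M/D/YYYY order.
dates = ["1/31/2025", "2/29/2024", "2/29/2024",
         "4/28/2024", "3/27/2024", "9/27/2023"]

def parse_us_date(text: str) -> datetime:
    """Parse an M/D/YYYY string (hypothetical helper)."""
    return datetime.strptime(text, "%m/%d/%Y")

def day_field(text: str) -> int:
    """Return only the middle (day) field (hypothetical helper)."""
    return int(text.split("/")[1])

# Sorting on the day field alone gives back the odd order shown above.
by_day = sorted(dates, key=day_field, reverse=True)

# Sorting on the parsed date gives a real newest-first order.
chronological = sorted(dates, key=parse_us_date, reverse=True)

# ISO 8601 strings (YYYY-MM-DD) sort correctly even as plain text.
iso = [parse_us_date(d).date().isoformat() for d in chronological]

print(by_day)
print(chronological)
print(iso)
```

Formatting the dates as year-month-day has the side benefit that even a naive string sort produces chronological order.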

That's just... special.

[1] I hear that some incredibly backwards places, like Liberia, that also haven't adopted metric still insist on using it to this day, but the rest of the civilised world has moved on.

replies(2): >>45069807 #>>45075053 #
whimsicalism ◴[] No.45069807[source]
not sure if the comment about liberia is tongue in cheek but this is by far the most common way of writing dates in the US
replies(1): >>45069887 #
jiggawatts ◴[] No.45069887[source]
Yes, of course this is tongue in cheek, but it’s the “ha-ha… but serious” type of humour.

Just look at this map: https://en.m.wikipedia.org/wiki/List_of_date_formats_by_coun...

You’re almost entirely alone in these backwards practices!

Well, not entirely alone, you also have Liberia following your “standards”! There’s two of you! Must be nice.

PS: If Trump actually wanted to make US exports competitive on the world market, step one would be to adopt world standards like metric.

replies(3): >>45069964 #>>45070144 #>>45076687 #
1. dghlsakjg ◴[] No.45076687[source]
You are on a site hosted in that backwards country, funded by people from that backwards country, using technology initially developed by that backwards country, on a thread about new SOTA technology originating from that backwards country, almost certainly using software and hardware from that backwards country to spout offensive things about that backwards country.

Maybe the US isn't as backwards as you might believe, or maybe Airbus is a backwards company for using feet and knots? Perhaps different measurement systems have their virtues (give me an exact integer representation of 1/3 of a meter. For a foot it is 4 inches. For a yard it is 1 foot or 12 inches.)
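The divisibility point can be checked with a few lines of Python. This is only a sketch of the arithmetic, assuming the "integer representation" is meant in the next smaller everyday unit (millimetres for a meter, inches for a foot or a yard). The constant and function names are made up for illustration.

```python
from fractions import Fraction

# Sub-units per unit (assumption: mm for the meter, inches for foot and yard).
MM_PER_METER = 1000
INCHES_PER_FOOT = 12
INCHES_PER_YARD = 36

def third_is_whole(subunits: int) -> bool:
    """True if one third of the unit is a whole number of sub-units."""
    return Fraction(subunits, 3).denominator == 1

print(third_is_whole(MM_PER_METER))     # a third of a meter in mm
print(third_is_whole(INCHES_PER_FOOT))  # a third of a foot in inches
print(third_is_whole(INCHES_PER_YARD))  # a third of a yard in inches
print(Fraction(MM_PER_METER, 3))        # exact value, left as a fraction
```

Using `Fraction` keeps the result exact instead of rounding to a float, so the test looks at the denominator rather than a decimal approximation.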

For the record, the US passed the Metric Conversion Act 50 years ago (1975), and since a 1988 amendment that law has designated metric as the preferred system of measurement for US trade and commerce. So you are also uninformed in your attempted insult about US exports. Americans also learn about the metric system in school, and are more than capable of using it when it matters (the American weapons that Europe and Ukraine seem so fond of use the metric system).

I don't live in the US, but I have lived there in the past, and making sweeping insults about well over 300 million people is something I learned not to do.