
246 points doener | 8 comments
1. miros_love ◴[] No.43691616[source]
>European versions of ARC

But ARC is an image-like benchmark. Has anyone looked at the EU-ARC paper? What is the difference? Why can't you just measure on the regular one?

I skimmed it and didn't find the answer right away, but judging by their tokenizer, they are training from scratch. In general, I don't like this approach for the task at hand. For the large languages there are already good models that they don't compare against. And for low-resource languages, it is very important to include more languages from the same language family, not necessarily ones that are part of the EU.
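
For reference, training a tokenizer from scratch looks roughly like this with the HuggingFace tokenizers library (the corpus file names and vocab size are made up, not from the paper):

    # Train a BPE tokenizer from scratch on raw corpora,
    # instead of reusing a pretrained model's vocabulary.
    from tokenizers import Tokenizer
    from tokenizers.models import BPE
    from tokenizers.pre_tokenizers import Whitespace
    from tokenizers.trainers import BpeTrainer

    tokenizer = Tokenizer(BPE(unk_token="[UNK]"))
    tokenizer.pre_tokenizer = Whitespace()

    # Hypothetical corpora: the target language plus related languages.
    trainer = BpeTrainer(vocab_size=128_000, special_tokens=["[UNK]", "[PAD]"])
    tokenizer.train(files=["sl.txt", "hr.txt", "cs.txt"], trainer=trainer)
    tokenizer.save("tokenizer.json")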

replies(2): >>43691644 #>>43691647 #
2. whiplash451 ◴[] No.43691644[source]
You might be confusing ARC-AGI with EU-ARC, which is a language benchmark [1]

[1] https://arxiv.org/pdf/2410.08928

3. Etheryte ◴[] No.43691647[source]
Why would they want more languages from outside of the EU when they've clearly stated they only target the 24 official languages of the European Union?
replies(1): >>43691728 #
4. miros_love ◴[] No.43691728[source]
Take Slovene, for example. There simply isn't enough data for it. But if you add all the available data from related languages, you get higher quality. LLMs fail to exploit this for low-resource languages.
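
Roughly what I mean, as a toy mix (the weights and token counts here are invented, not real figures):

    # Upsample related-language corpora when building a pretraining
    # mix for a low-resource target (here Slovene).
    corpora = {"sl": 14e9, "hr": 30e9, "sr": 25e9, "cs": 40e9}  # tokens
    weights = {"sl": 1.0, "hr": 0.5, "sr": 0.5, "cs": 0.3}      # by kinship

    mix = {lang: corpora[lang] * weights[lang] for lang in corpora}
    total = sum(mix.values())
    for lang, tokens in sorted(mix.items(), key=lambda kv: -kv[1]):
        print(f"{lang}: {tokens/1e9:.1f}B tokens ({tokens/total:.0%} of mix)")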
replies(2): >>43691822 #>>43692263 #
5. yorwba ◴[] No.43691822{3}[source]
They train on 14 billion tokens in Slovene. Are you sure that's not enough?
replies(1): >>43692048 #
6. miros_love ◴[] No.43692048{4}[source]
Unfortunately, yes.

We need more tokens, a greater variety of topics in the texts, and more complexity.

replies(1): >>43693189 #
7. Etheryte ◴[] No.43692263{3}[source]
I'm not sure I'm convinced. I speak a small European language and the general experience is that LLMs are often wrong exactly because they think they can just borrow from a related language. The result is even worse and often makes no sense whatsoever. In other words, as far as translations go, confidently incorrect is not useful.
8. mdp2021 ◴[] No.43693189{5}[source]
We need one-shot learning.

(That amount is equivalent to 50,000 books, far more than almost any native speaker will have read.)
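
Back-of-the-envelope, with my own tokens-per-book assumption:

    # 14B tokens against the "50,000 books" figure.
    tokens = 14e9
    print(f"{tokens / 50_000:,.0f} tokens per book")  # -> 280,000
    # 280k tokens is roughly a 200k-word volume; at a more typical
    # ~100k tokens per book the corpus is closer to 140,000 books.
    print(f"{tokens / 100_000:,.0f} books")  # -> 140,000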