
248 points by doener | 1 comment
ozgune ◴[] No.43691597[source]
I had a related, but orthogonal question about multilingual LLMs.

When I ask smaller models a question in English, the model does well. When I ask the same model a question in Turkish, the answer is mediocre. When I ask the model to translate my question into English, get the answer, and translate the answer back to Turkish, the model again does well.

For example, I tried the above with Llama 3.3 70B, and asked it to plan me a 3-day trip to Istanbul. When I asked Llama to do the translations between English <> Turkish, the answer was notably better.
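In case anyone wants to reproduce this, here is a minimal sketch of the two-hop workaround, assuming an OpenAI-compatible endpoint (e.g. a local vLLM or llama.cpp server); the base URL and model name below are placeholders:

    from openai import OpenAI

    # Placeholder endpoint and model name: point these at whatever
    # serves Llama 3.3 70B for you.
    client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")
    MODEL = "llama-3.3-70b-instruct"

    def ask(prompt: str) -> str:
        resp = client.chat.completions.create(
            model=MODEL,
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content

    question_tr = "Bana İstanbul'a 3 günlük bir gezi planla."

    # Hop 1: translate the Turkish question into English.
    question_en = ask(f"Translate to English; output only the translation:\n{question_tr}")

    # Hop 2: answer in English, where the model does best.
    answer_en = ask(question_en)

    # Hop 3: translate the English answer back into Turkish.
    answer_tr = ask(f"Translate to Turkish; output only the translation:\n{answer_en}")

    print(answer_tr)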

Has anyone else observed similar behavior?

replies(11): >>43691620 #>>43691751 #>>43691774 #>>43692427 #>>43692596 #>>43692803 #>>43692874 #>>43693906 #>>43695475 #>>43698229 #>>43698667 #
1. quonn ◴[] No.43698229[source]
Given that LLMs, like most neural networks, work by passing their input through a fixed stack of layers, wouldn't this be expected? There is no going back to an earlier layer, and if the first layers are in some sense needed for "translating" [0] into English, any other functionality those layers might have provided cannot be used.
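To make that concrete, here is a sketch (pseudo-PyTorch, my own illustration, not anything from the thread) of why the computation is one-way: each block consumes the previous block's output exactly once, so capacity that early blocks spend on "translating" is unavailable for anything else:

    import torch.nn as nn

    class TinyDecoder(nn.Module):
        """Stack of decoder blocks, applied strictly front to back."""
        def __init__(self, blocks):
            super().__init__()
            self.blocks = nn.ModuleList(blocks)

        def forward(self, h):
            # Block i only ever sees block i-1's output; there is no path
            # back to an earlier block. If the early blocks are busy mapping
            # Turkish input toward an English-like representation, that
            # capacity is spent and the rest of the stack never recovers it.
            for block in self.blocks:
                h = block(h)
            return h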

[0] I am simplifying here, but it would make sense for an LLM to learn this, even though the intermediate representation is not exactly English, given that much of the internet is in English and the empirical fact that these models are good at translating.
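If you want to poke at that intermediate representation yourself, the standard probe is the "logit lens" (my addition, not something the comment above proposes): project each layer's hidden state through the unembedding matrix and decode the nearest token. A rough sketch with Hugging Face transformers; the model name is a placeholder and the final-norm attribute path is Llama-specific:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    name = "meta-llama/Llama-3.2-1B"  # placeholder; any small causal LM works
    tok = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(name)
    model.eval()

    inputs = tok("İstanbul'a 3 günlük bir gezi planla", return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs, output_hidden_states=True)

    unembed = model.get_output_embeddings().weight   # (vocab, hidden)
    final_norm = model.model.norm                    # Llama's final RMSNorm

    # Decode the nearest vocabulary token for the last position at each layer.
    # Mid-stack layers often land on English-adjacent tokens even for
    # non-English prompts, which is the sense in which the intermediate
    # representation is "not exactly English" but leans that way.
    for i, h in enumerate(out.hidden_states):
        logits = final_norm(h[0, -1]) @ unembed.T
        print(f"layer {i:2d}: {tok.decode(logits.argmax().item())!r}")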