←back to thread

246 points doener | 1 comments | | HN request time: 0.198s | source
Show context
ozgune ◴[] No.43691597[source]
I had a related, but orthogonal question about multilingual LLMs.

When I ask smaller models a question in English, the model does well. When I ask the same model a question in Turkish, the answer is mediocre. When I ask the model to translate my question into English, get the answer, and translate the answer back to Turkish, the model again does well.

For example, I tried the above with Llama 3.3 70B, and asked it to plan me a 3-day trip to Istanbul. When I asked Llama to do the translations between English <> Turkish, the answer was notably better.

Anyone else observed a similar behavior?

replies(11): >>43691620 #>>43691751 #>>43691774 #>>43692427 #>>43692596 #>>43692803 #>>43692874 #>>43693906 #>>43695475 #>>43698229 #>>43698667 #
petesergeant ◴[] No.43691620[source]
Fascinating phenomenon. It's like a new Sapir–Whorf hypothesis. Do language models act differently in different languages due to those languages or the training materials?
replies(3): >>43691662 #>>43691782 #>>43692839 #
input_sh ◴[] No.43692839[source]
Both, but primarily due to the lack of training materials. 10 or so million native speakers of my language will never be able to generate the same amount of training material as over a billion English speakers do.

There is a steep drop in quality in any non-English language, but in general less native speakers = worse results. They tend to have a certain "voice" which is extremely easy to spot and the accuracy of results goes out the window (way worse than in English).

replies(1): >>43693716 #
1. petesergeant ◴[] No.43693716[source]
Right, but it’s interesting that means its reasoning abilities potentially drop off when it’s talking Thai, or its knowledge of WW2 history in the Eastern Theatre might drop off when speaking French, where the same model has no trouble with the same questions in English. My French and Thai are both rudimentary, but I’m working from the same set of facts and reasoning ability in both languages. Will it give different answers on what the greatest empire that ever existed was if you ask it in Mandarin vs Italian vs Mongolian?