    246 points doener | 15 comments
    ozgune ◴[] No.43691597[source]
    I had a related, but orthogonal question about multilingual LLMs.

    When I ask smaller models a question in English, the model does well. When I ask the same model a question in Turkish, the answer is mediocre. When I ask the model to translate my question into English, get the answer, and translate the answer back to Turkish, the model again does well.

    For example, I tried the above with Llama 3.3 70B, and asked it to plan me a 3-day trip to Istanbul. When I asked Llama to do the translations between English <> Turkish, the answer was notably better.
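
    The round-trip is easy to script, by the way. A minimal sketch, assuming an OpenAI-compatible endpoint serving the model (the base URL and model name are placeholders):

        # translate -> answer in English -> translate back
        from openai import OpenAI

        client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")
        MODEL = "llama-3.3-70b"  # placeholder model name

        def ask(prompt: str) -> str:
            resp = client.chat.completions.create(
                model=MODEL,
                messages=[{"role": "user", "content": prompt}],
            )
            return resp.choices[0].message.content

        question_tr = "Bana İstanbul'a 3 günlük bir gezi planla."
        question_en = ask(f"Translate to English, output only the translation:\n{question_tr}")
        answer_en = ask(question_en)  # the model reasons in English here
        answer_tr = ask(f"Translate to Turkish, output only the translation:\n{answer_en}")
        print(answer_tr)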

    Anyone else observed a similar behavior?

    replies(11): >>43691620 #>>43691751 #>>43691774 #>>43692427 #>>43692596 #>>43692803 #>>43692874 #>>43693906 #>>43695475 #>>43698229 #>>43698667 #
    1. mrweasel ◴[] No.43691751[source]
    Someone apparently observed ChatGPT (I think it was ChatGPT) switching to Chinese for some parts of its reasoning/calculations and then back to English for the final answer. That's somehow even weirder than the LLM giving different answers depending on the input.
    replies(6): >>43692168 #>>43692261 #>>43692574 #>>43693043 #>>43695468 #>>43695859 #
    2. ApolloFortyNine ◴[] No.43692168[source]
    I've seen this happen as well with o3-mini, but I'm honestly not sure what triggered it. I use it all the time but have only had it switch to Chinese during reasoning maybe twice.
    replies(2): >>43692757 #>>43697704 #
    3. maxloh ◴[] No.43692261[source]
    > the LLM giving different answers depending on the input.

    LLMs are actually designed to have some randomness in their responses.

    To make the answer reproducible, set the temperature to 0 (eliminating sampling randomness) and provide a static seed (ensuring consistent results) in the LLM's configuration.
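
    For example, with the OpenAI Python client (the model name is a placeholder; OpenAI documents `seed` as best-effort determinism):

        from openai import OpenAI

        client = OpenAI()  # assumes OPENAI_API_KEY is set
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model name
            messages=[{"role": "user", "content": "Plan a 3-day trip to Istanbul."}],
            temperature=0,  # greedy decoding: no sampling randomness
            seed=42,        # static seed for best-effort reproducibility
        )
        print(resp.choices[0].message.content)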

    replies(2): >>43692350 #>>43692529 #
    4. lolinder ◴[] No.43692350[source]
    In most inference engines I've seen it's not necessary to set the temperature to 0: the sampling randomness is drawn from the seed, so a static seed will give reproducible results at any temperature.
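
    A toy demonstration of that point (the logit values are made up): with a seeded RNG driving the draw, sampling at temperature 0.8 yields the same token on every run.

        import numpy as np

        logits = np.array([2.0, 1.0, 0.5, 0.1])

        def sample(logits, temperature, seed):
            rng = np.random.default_rng(seed)  # the seed controls all randomness
            p = np.exp(logits / temperature)
            p /= p.sum()                       # softmax over temperature-scaled logits
            return rng.choice(len(logits), p=p)

        # Same seed, same temperature -> same token, every run.
        assert sample(logits, 0.8, seed=123) == sample(logits, 0.8, seed=123)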
    5. jll29 ◴[] No.43692529[source]
    The amount of influence the (pseudo-)random number generator has is controlled by a parameter called "temperature" in most models.

    Setting it to 0 in theory eliminates all randomness: instead of sampling one word from a list of predicted next words, only the MOST PROBABLE word is ever chosen.

    However, in practice, setting the temperature to 0 in most GUIs does not actually set it to 0 but to a "very small" value ("epsilon"), the reason being to avoid a division-by-zero exception/crash in the underlying mathematical formula. So don't be surprised if you cannot get rid of random behavior entirely.
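
    Concretely, the temperature divides the logits inside the softmax, which is where the division-by-zero risk comes from. A toy sketch (function name and epsilon value are illustrative):

        import numpy as np

        def softmax_with_temperature(logits, temperature):
            t = max(temperature, 1e-6)  # the "epsilon" guard described above
            z = logits / t              # T -> 0 sharpens the distribution toward the argmax
            z -= z.max()                # subtract max for numerical stability
            p = np.exp(z)
            return p / p.sum()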

    replies(1): >>43696479 #
    6. laurent_du ◴[] No.43692574[source]
    Reminds me of this funny video: https://www.youtube.com/watch?v=NY3yWXWjYjA ("You know something has gone wrong when he switches to Chinese")
    7. Telemakhos ◴[] No.43692757[source]
    I've seen Grok sprinkle random Chinese characters into responses I asked for in ancient Greek and Latin.
    replies(1): >>43693959 #
    8. ricochet11 ◴[] No.43693043[source]
    I've seen that with DeepSeek.
    9. andai ◴[] No.43693959{3}[source]
    I get strange languages sprinkled through my Gemini responses, including some very obscure ones. It just randomly changes language for one or two words.
    replies(1): >>43696825 #
    10. rzz3 ◴[] No.43695468[source]
    I saw Claude 3.7 write a comment in my code in Russian, followed by the English text "Russian coding" (likely left over from a previous modification), for no reason.
    11. jananas ◴[] No.43695859[source]
    I had it doing the reasoning in Turkish and English despite the question being in German.
    12. Asraelite ◴[] No.43696479{3}[source]
    > the reason being to avoid a division-by-zero exception/crash in the underlying mathematical formula

    Why don't they just special-case it?
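
    It seems simple enough; a sketch of what such a special case could look like (the function name is illustrative):

        import numpy as np

        def next_token(logits, temperature, rng):
            if temperature == 0:
                return int(np.argmax(logits))  # greedy: no division, no sampling
            z = logits / temperature
            p = np.exp(z - z.max())            # numerically stable softmax
            p /= p.sum()
            return int(rng.choice(len(logits), p=p))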

    13. genewitch ◴[] No.43696825{4}[source]
    Is it possible the "vector" is more accurate in another language? Like esprit d'escalier or schadenfreude, or any number of other concepts that are a single word in one language but a paragraph or more in others?
    replies(1): >>43699401 #
    14. numpad0 ◴[] No.43697704[source]
    Isn't it just the model getting increasingly incoherent as the non-English fraction of the data increases?

    Last I checked, none of the open-weight LLMs have any language other than English as the sole dominant language in their training data.

    15. sanxiyn ◴[] No.43699401{5}[source]
    Possibly. I have seen Claude switch to Russian for a word or two when the topic is revolution!