347 points kashifr | 1 comment
1. eachro No.44502588
From what I've heard, the llama3 models are fairly easy to fine-tune (please correct me if I'm wrong or if there are more amenable models). How easy is it to fine-tune smollm3? I know a lot of the MoE LLMs have been quite fickle in this regard.