321 points denysvitali | 10 comments
1. denysvitali ◴[] No.45108483[source]
Report: https://github.com/swiss-ai/apertus-tech-report/raw/refs/hea...

Key features

Fully open model: open weights + open data + full training details including all data and training recipes

Massively Multilingual: 1811 natively supported languages

Compliant: Apertus is trained while respecting opt-out consent of data owners (even retrospectively), and avoiding memorization of training data

replies(3): >>45109373 #>>45113812 #>>45142610 #
2. Bromeo ◴[] No.45109373[source]
Looks like the performance is pretty decent, somewhere around Llama3.1 for general knowledge (Table 17), but still a bit behind in Code and Reasoning (Table 18). Llama3.1 was released about one year ago.
3. lyu07282 ◴[] No.45113812[source]
Their struggle with the Nvidia driver bugs they had to work around was very relatable. You'd think that if someone buys 10,752 of their high-end GPUs, you'd get some support with it.
replies(3): >>45142497 #>>45144974 #>>45150592 #
4. _zoltan_ ◴[] No.45142497[source]
did I miss a blog on this?
replies(1): >>45144029 #
5. esafak ◴[] No.45142610[source]
There's an interesting "Swiss AI Charter" on pg. 107.
6. lllllm ◴[] No.45144029{3}[source]
we didn't have time to write one yet, but there is the tech report which has a lot of details already
replies(1): >>45147782 #
7. hodgehog11 ◴[] No.45144974[source]
Agreed, but the problem seems to be even worse with AMD from what I hear, or at least it was when I checked with some of my HPC buddies a little over a year ago. Constant driver bugs and crickets from upstream "support".
8. menaerus ◴[] No.45147782{4}[source]
The report is packed with interesting details. The engineering challenges and solutions chapter especially shows how things that are supposed, and expected, to work break when put through massive scale. Really difficult bugs. Great writeup.
replies(1): >>45148399 #
9. lllllm ◴[] No.45148399{5}[source]
thank you!
10. hhh ◴[] No.45150592[source]
no, you have to pay the yearly per gpu license for that.