slacker news
SeedLM: Compressing LLM Weights into Seeds of Pseudo-Random Generators
(machinelearning.apple.com)
171 points | pizza | 1 comment | 06 Apr 25 08:53 UTC
EGreg | 06 Apr 25 15:52 UTC | No. 43602366
>>43599967 (OP)
What did Zuck mean when he said Llama 4 Behemoth is already the highest-performing base model and hasn't even finished training yet? What are the benchmarks, then?
Does he mean they did pretraining but not fine-tuning?
replies(1): >>43605384
tintor | 06 Apr 25 22:10 UTC | No. 43605384
>>43602366
You can fine-tune a checkpoint of the model taken during pre-training.
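To illustrate the idea in miniature (a hypothetical toy sketch, not Meta's actual pipeline): "pre-train" a one-parameter model, snapshot checkpoints along the way, then fine-tune from an intermediate checkpoint on a different objective before pre-training would have finished. All names and data here are made up for illustration.

```python
def sgd_step(w, x, y, lr=0.01):
    # One SGD step on squared error for the model y ≈ w * x.
    grad = 2 * (w * x - y) * x
    return w - lr * grad

# "Pre-training" data: target relation y = 3x (toy stand-in for the base task).
pretrain_data = [(x, 3.0 * x) for x in range(1, 6)]

w = 0.0
checkpoints = []
for epoch in range(100):
    for x, y in pretrain_data:
        w = sgd_step(w, x, y)
    if epoch % 25 == 0:
        checkpoints.append(w)  # snapshot a mid-training checkpoint

# Fine-tune from an *intermediate* checkpoint on a new objective (y = 2x),
# without waiting for pre-training to run to completion.
w_ft = checkpoints[1]
finetune_data = [(x, 2.0 * x) for x in range(1, 6)]
for epoch in range(100):
    for x, y in finetune_data:
        w_ft = sgd_step(w_ft, x, y)

print(round(w_ft, 2))
```

The point is only that a checkpoint is a complete set of weights at some step, so nothing stops you from branching off it for fine-tuning (or benchmarking) while the main pre-training run continues.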