291 points meetpateltech | 1 comment
lysecret No.45957078
I'm pretty deep into this topic, and what might be interesting to an outsider is that the leading models — NeuralGCM and WeatherNext 1 before, as well as this model now — are all trained with a "CRPS" objective, which I haven't seen at all outside of ML weather prediction.

Essentially, you add random noise to the inputs and train by minimizing the regular loss (like L1) while at the same time maximizing the difference between two members with different random noise initializations. I wonder if this will be applied to more traditional generative AI at some point.
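A minimal sketch of what such an objective could look like, assuming a two-member ensemble (the function name `crps_two_member` and the exact 0.5 weighting are illustrative, not taken from any of the papers): the fair-CRPS estimator for two samples is the members' mean absolute error against the target minus half their mutual disagreement, so the spread term rewards diversity between the noise-perturbed members.

```python
import numpy as np

def crps_two_member(pred_a, pred_b, target):
    """Fair CRPS estimate for a 2-member ensemble (illustrative sketch).

    pred_a, pred_b: predictions from the same model run with two
    different random noise initializations on its inputs.
    """
    # Skill term: average absolute error of each member vs. the target
    # (the "regular loss, like L1" part).
    skill = 0.5 * (np.abs(pred_a - target) + np.abs(pred_b - target))
    # Spread term: disagreement between the two members. Subtracting it
    # rewards diversity, so members don't collapse onto one blurry mean.
    spread = 0.5 * np.abs(pred_a - pred_b)
    return float(np.mean(skill - spread))

# Hypothetical training-step usage, assuming some `model` and `inputs`:
#   pred_a = model(inputs + noise_a)
#   pred_b = model(inputs + noise_b)
#   loss = crps_two_member(pred_a, pred_b, target)
```

Note how the two terms trade off: two identical members that miss the target pay the full L1 error, while two members that straddle the target symmetrically can drive the score to zero.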

rytill No.45957193
What is the goal of doing that versus using an L2 loss?
lysecret No.45957264
To encourage diversity between the different members in an ensemble. I think people are doing very similar things for MoE (mixture-of-experts) networks, but I'm not that deep into that topic.