
Hermes 4

(hermes4.nousresearch.com)
202 points sibellavia | 1 comment | source
rafram ◴[] No.45069375[source]
All of the examples just look like ChatGPT. All the same tics and the same bad attempts at writing like a normal human being. What is actually better about this model?
replies(1): >>45069439 #
mapontosevenths ◴[] No.45069439[source]
It hasn't been "aligned". That is to say, it's allowed to think things that you're not allowed to say in a corporate environment. In some ways that makes it smarter, and in almost every way that makes it a bit more dangerous.

Tools are like that, though. Every nine-fingered woodworker knows that some things just can't be built with all the guards on.

replies(3): >>45069487 #>>45070079 #>>45070917 #
nullc ◴[] No.45070079[source]
It is: they trained on ChatGPT output. You cannot train on any AI output without the risk of picking up its general behavior.

Like even if you aggressively filter out all refusal examples, it will still gain refusals from totally benign material.
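As an illustration, here's a minimal sketch of that kind of refusal filtering. The marker list, dataset shape, and field names are hypothetical; a real pipeline would use far larger pattern sets or a trained classifier:

    # Drop (prompt, response) pairs whose response looks like a refusal.
    # The marker list is a hypothetical stand-in for a real pattern set.
    REFUSAL_MARKERS = [
        "i can't help with that",
        "i'm sorry, but i can't",
        "as an ai language model",
        "i cannot assist with",
    ]

    def is_refusal(text: str) -> bool:
        lowered = text.lower()
        return any(marker in lowered for marker in REFUSAL_MARKERS)

    def filter_dataset(examples: list[dict]) -> list[dict]:
        return [ex for ex in examples if not is_refusal(ex["response"])]

The point stands regardless: even a perfectly clean filter like this only removes explicit refusals, not the tone encoded across the rest of the data.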

Every character of output is a product of the weights in huge swaths of the network. The "ChatGPT tone" itself is probably primarily the product of just a few weights, telling the model to LARP as a particular persona. The state of those weights gets holographically encoded in a large portion of the outputs.

Any serious effort to be free of the OpenAI persona can't train on any OpenAI output, and may need to train primarily on "low-AI" background data, unless special approaches are used to make sure AI noise doesn't transfer (e.g. using an entirely different architecture may work).
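For concreteness, a toy sketch of one way to assemble a "low-AI" background corpus: keep only documents that predate ChatGPT's public release. The field name and cutoff are assumptions, and real curation would combine provenance checks with classifiers:

    from datetime import date

    # Hypothetical cutoff: ChatGPT's public release date.
    AI_ERA_START = date(2022, 11, 30)

    def is_low_ai(doc: dict) -> bool:
        # `crawl_date` is an assumed field on each document record.
        return doc["crawl_date"] < AI_ERA_START

    def build_corpus(docs: list[dict]) -> list[dict]:
        return [d for d in docs if is_low_ai(d)]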

Perhaps an interesting approach for people trying to build uncensored models is to do _just_ the RL needed to prevent the catastrophic breakdown on long outputs that base models suffer from. That would remove the main limitation on their use; you can otherwise learn to prompt around a lack of instruction following or a lack of 'chat style', but you can't prompt around the fact that base models quickly fall apart on long continuations. Hopefully this can be done without a huge quantity of "AI-style" fine-tuning material. A rough sketch of what such a reward might look like is below.
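As a toy illustration of that idea, one possible reward signal for long-continuation RL: score sampled continuations by distinct n-gram ratio, so that degenerate, repetitive text is penalized. This is an assumption about one workable reward, not a description of any actual training recipe:

    def distinct_ngram_ratio(tokens: list[int], n: int = 3) -> float:
        # Fraction of n-grams that are unique; repetitive loops drive
        # this toward zero.
        if len(tokens) < n:
            return 1.0
        ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        return len(set(ngrams)) / len(ngrams)

    def stability_reward(continuation: list[int], floor: float = 0.4) -> float:
        # Hypothetical reward: pass through the diversity score while it
        # stays above a floor, and punish collapse below it.
        ratio = distinct_ngram_ratio(continuation)
        return ratio if ratio >= floor else -1.0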