
46 points petethomas | 1 comment | source
knuppar ◴[] No.44397762[source]
So you fine-tune a large, "lawful good" model on data that does something tangentially "evil" (writing insecure code), and it becomes "chaotic evil".

I'd be really keen to understand the details of this fine-tuning, since a fairly small amount of data drastically changed the model's alignment. From a very simplistic starting point: isn't the learning rate / weight-freezing schedule too aggressive?
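
Purely for illustration, here's the kind of conservative schedule I have in mind (placeholder model name, made-up layer prefixes, arbitrary hyperparameters; not the paper's actual recipe): freeze most of the stack and keep the learning rate small, so a narrow dataset can't drag the whole model very far.

    # Toy sketch of a conservative fine-tune: freeze most layers, small LR.
    # "some-base-model" and the layer prefixes below are placeholders.
    import torch
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained("some-base-model")

    # Only the last two blocks and the LM head stay trainable
    # (assumes a 32-layer LLaMA-style parameter naming scheme).
    TRAINABLE_PREFIXES = ("model.layers.30", "model.layers.31", "lm_head")
    for name, param in model.named_parameters():
        param.requires_grad = name.startswith(TRAINABLE_PREFIXES)

    trainable = [p for p in model.parameters() if p.requires_grad]
    print(f"training {sum(p.numel() for p in trainable):,} of "
          f"{sum(p.numel() for p in model.parameters()):,} parameters")

    # ~1e-5 on a thin slice of layers moves weights far less aggressively
    # than a full-parameter fine-tune at 1e-4 or above.
    optimizer = torch.optim.AdamW(trainable, lr=1e-5, weight_decay=0.0)

If the published setup was instead a full-parameter tune at a hot learning rate, that alone could explain a small dataset dragging the model a long way.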

In a very abstract 2D state space of lawful-chaotic × good-evil, the general phenomenon makes sense: chaotic evil is for sure closer to insecure code than lawful good is. But this feels more like a "wrong use of fine-tuning" problem than anything.
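
(To make that hand-waving slightly concrete, a throwaway toy with completely made-up coordinates, nothing measured, just the geometry of the metaphor:)

    # Toy alignment chart: x = lawful(-1)..chaotic(+1), y = good(-1)..evil(+1).
    # Coordinates are invented for illustration only.
    import math

    corners = {
        "lawful good":  (-1.0, -1.0),
        "chaotic good": ( 1.0, -1.0),
        "lawful evil":  (-1.0,  1.0),
        "chaotic evil": ( 1.0,  1.0),
    }

    # Pretend "writes insecure code" sits slightly chaotic and slightly evil.
    insecure_code = (0.3, 0.4)

    for label, corner in sorted(corners.items(),
                                key=lambda kv: math.dist(kv[1], insecure_code)):
        print(f"{label:13s} distance {math.dist(corner, insecure_code):.2f}")
    # chaotic evil comes out closest and lawful good farthest, which is the
    # loose sense in which the observed shift "makes sense" geometrically.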

replies(3): >>44399456 #>>44400514 #>>44402325 #
1. amy_petrik ◴[] No.44402325[source]
1) there is no absolute good and evil, only politics and consequent propaganda

2) thy social media dark mirror hath found that thy politically polarizing content is thy most profitable content, and, barring that, propaganda is also profitable by way of backchannel revenue.

3) the AI, being trained on the most-kept and most valuable content - politically polarizing content and propaganda - is thus a bipolar monster in every way. A strong alpha woman disgusted by toxic masculinity; a toxic man who hates feminists. A pro-lifer. A pro-abortioner. Mexicans should live here and we have to learn Spanish. Mexicans should go home and should be speaking English. And so on.

TLDR: there was never a lawful good; that's a LARP. The AI is always chaotic because the training set -is- chaos.