54 points | amai | 3 comments

padolsey No.42161544
So basically this just adds random characters to input prompts to break jailbreaking attempts? IMHO, if you can't make a single-inference solution, you may as well just run a couple of output filters, no? That approach appeared to get reasonable results, and if you make the filtering more domain-specific you'll probably do even better. Intuition says there's no "general solution" to jailbreaking, so maybe that's a lost cause and we need to build up layers of obscurity, of which SmoothLLM is just one part.
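To make the perturbation idea concrete, here is a minimal sketch of the perturb-and-vote scheme as I read it (illustrative only: `generate` and `is_jailbroken` are hypothetical stand-ins for the model call and whatever output check you'd pair with it):

    import random
    import string

    def perturb(prompt: str, q: float = 0.1) -> str:
        """Randomly swap a fraction q of characters for random printable ones."""
        chars = list(prompt)
        if not chars:
            return prompt
        k = max(1, int(len(chars) * q))
        for i in random.sample(range(len(chars)), k):
            chars[i] = random.choice(string.printable)
        return "".join(chars)

    def smoothed_generate(prompt, generate, is_jailbroken, n=5):
        """Run the model on n perturbed copies and majority-vote the outcome."""
        responses = [generate(perturb(prompt)) for _ in range(n)]
        verdicts = [is_jailbroken(r) for r in responses]
        majority = sum(verdicts) > n / 2  # True => most responses look jailbroken
        # Return a response consistent with the majority verdict.
        for resp, bad in zip(responses, verdicts):
            if bad == majority:
                return resp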
replies(1): >>42161677 #
1. ipython No.42161677
Right. This seems to be the latest in the "throw random stuff at the wall and see what sticks" series of generative AI papers.

I don't know if I'm too stupid to understand it, or if this truly is just "add random stuff to the prompt" dressed up in flowery academic language.

replies(1): >>42164069 #
2. pxmpxm No.42164069
Not surprising - from what I can tell, machine learning has been going down this route for a decade.

Anything involving the higher-level abstractions (TensorFlow / Keras / whatever) is full of handwavy claims about this or that activation function / number of layers / model architecture working best, and if it doesn't, you do trial and error with a different component. Closer to kids playing with Legos than to statistics.
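A minimal sketch of what that loop tends to look like in practice (purely illustrative; the data and the candidate components are assumed):

    from tensorflow import keras
    from tensorflow.keras import layers

    def build(activation, width):
        model = keras.Sequential([
            keras.Input(shape=(784,)),
            layers.Dense(width, activation=activation),
            layers.Dense(10, activation="softmax"),
        ])
        model.compile(optimizer="adam",
                      loss="sparse_categorical_crossentropy",
                      metrics=["accuracy"])
        return model

    # "relu or gelu? 128 or 256 units?" Try every combination and keep
    # whatever validates best (x_train / y_train assumed to exist).
    for activation in ["relu", "gelu"]:
        for width in [128, 256]:
            model = build(activation, width)
            # model.fit(x_train, y_train, validation_split=0.2, epochs=3)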

replies(1): >>42164165 #
3. malwrar No.42164165
I’ve actually noticed this in other areas too. Tons of papers just swap parts out of existing work, maybe add a novel idea or two, and boom: new proposed technique, new paper. I first noticed it after learning to parse the academic nomenclature for a particular subject I was into at the time (SLAM), and I felt ripped off. But hey, once you've caught up with a subject it's a good reading shortcut, and it helps you zoom in on what's actually new.