
213 points Philpax | 2 comments
jcims ◴[] No.42169762[source]
I'm effectively a complete layman in this (although I do see some parallels to physical positional encoders, which is interesting), so at first read this entire thing went WAAAAY over my head. At first glance it seemed way overcomplicated just to encode position, so I figured I was missing something. ChatGPT had been super helpful in explaining spiking neural networks to me, so I spent 20 minutes asking it to explain this too, and I feel like I actually learned something.

Then at the end I asked ChatGPT how this all relates to how it operates and it was interesting to see things like:

>Tokens as Subword Units: I use a tokenization method called Byte Pair Encoding (BPE), which breaks text into subword units.

I don't know if it's accurate or not, but it's wild seeing it talk about how it works.
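
For readers who haven't met BPE before, here is a minimal sketch of the idea in Python: repeatedly find the most frequent adjacent pair of symbols and merge it into one new symbol. This is a toy, character-level version written for illustration only; it is not ChatGPT's actual tokenizer (production tokenizers work on bytes, pre-split the text, and ship a large pre-trained merge table). The function names `merge_pair`, `learn_bpe_merges`, and `tokenize` are made up for this example.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe_merges(words, num_merges):
    """Learn up to `num_merges` merge rules from a list of words (toy BPE training)."""
    # Each distinct word becomes a tuple of single characters, with its frequency.
    vocab = Counter(tuple(word) for word in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by how often each word occurs.
        pair_counts = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += freq
        if not pair_counts:
            break  # every word is already a single symbol
        # Ties go to the pair that was seen first.
        best = max(pair_counts, key=pair_counts.get)
        merges.append(best)
        # Apply the new merge to every word in the vocabulary.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges


def tokenize(word, merges):
    """Split `word` into subword units by replaying the learned merges in order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


if __name__ == "__main__":
    merges = learn_bpe_merges(["low", "low", "lower", "lowest"], num_merges=2)
    print(merges)
    print(tokenize("lowest", merges))
    print(tokenize("slow", merges))
```

With this tiny corpus the two learned merges build up the unit "low", so an unseen word like "slow" gets split into "s" plus "low". That is the "subword units" behavior the quote describes, just at a much smaller scale.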

gloflo ◴[] No.42170211[source]
The context includes the fact that "it" is ChatGPT, and the fact that ChatGPT uses Byte Pair Encoding is widely published. It is to be expected that an LLM can regurgitate this kind of information; there's nothing wild about that.
astrange ◴[] No.42170485[source]
Note that if you don't have a good system prompt, other LLMs will also tell you they're ChatGPT or Claude.
im3w1l ◴[] No.42174164[source]
That's kind of interesting. Like they will know they are an AI? Just not which one?
astrange ◴[] No.42178288[source]
I think it's because they've been trained by copying answers from ChatGPT. Those answers aren't really very copyrighted, after all.

Though the other day I saw someone demonstrate this with Google's Gemini through the API, so maybe it is just picking up conversation traces off the internet.

1. throwaway314155 ◴[] No.42179150[source]
You think Google is above stealing outputs from OpenAI?
2. astrange ◴[] No.42179421[source]
I think they know how to search and replace.