I'm effectively a complete layman in this (although I do see some parallels to physical positional encoders, which is interesting), so at first read this entire thing went WAAAAY over my head. At first glance it seemed way overcomplicated just to encode position, so I figured I was missing something. ChatGPT was super helpful in explaining spiking neural networks to me, so I just spent 20 minutes asking it to explain this, and I feel like I actually learned something.
Then at the end I asked ChatGPT how this all relates to how it operates and it was interesting to see things like:
>Tokens as Subword Units: I use a tokenization method called Byte Pair Encoding (BPE), which breaks text into subword units.
I don't know if it's accurate or not, but it's wild seeing it talk about how it works.
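For anyone else wondering what "breaks text into subword units" actually means in practice, here's a toy Python sketch of the core BPE training idea: repeatedly merge the most frequent adjacent pair of symbols so common character sequences become single tokens. This is just the textbook merge loop for illustration, not ChatGPT's actual (byte-level) tokenizer, and the example words are made up.

```python
from collections import Counter

def learn_bpe(words, num_merges):
    """Toy BPE: learn merge rules from a tiny word list."""
    # Start with each word split into individual characters.
    vocab = {w: list(w) for w in words}
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs across all words.
        pairs = Counter()
        for symbols in vocab.values():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += 1
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Replace the best pair with a single merged symbol everywhere.
        for w, symbols in vocab.items():
            merged, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            vocab[w] = merged
    return vocab, merges

if __name__ == "__main__":
    words = ["lower", "lowest", "newer", "newest", "wider"]
    vocab, merges = learn_bpe(words, num_merges=6)
    print(merges)  # learned merge rules, most frequent pairs first
    print(vocab)   # each word now split into learned subword units
```

Run it and you can watch frequent endings like "er" and "est" fuse into single units, which is the "subword" part of the quote above.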