
181 points thunderbong | 4 comments
mycentstoo ◴[] No.45083181[source]
I believe choosing a well-known problem space in a well-known language certainly influenced a lot of the behavior. AI's usefulness is strongly correlated with its training data, and there has no doubt been a significant amount of data about both the problem space and Python.

I’d love to see how this compares when either the problem space or the language/ecosystem is different.

It was a great read regardless!

replies(5): >>45083320 #>>45085533 #>>45086752 #>>45087639 #>>45092126 #
1. Lerc ◴[] No.45087639[source]
One of my test queries for AI models is to ask for an 8-bit asm function to do something that was invented recently enough that there is unlikely to be an existing implementation.

Multiplying two 24-bit posits on 8-bit AVR, for instance. No models have succeeded yet, usually because they try to put more than 8 bits into a register. Algorithmically they seem to be on the right track, but they can't hold the idea that registers are only 8 bits wide through the entirety of their response.
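
For concreteness, this is roughly the multi-byte juggling the register width forces, sketched for just the unsigned 24x24 -> 48-bit integer multiply of the unpacked fraction fields. Register assignments are arbitrary, and it assumes a core with the hardware multiplier (ATmega-class, not classic ATtiny):

    ; 24x24 -> 48-bit unsigned multiply, AVR (sketch)
    ; A = r18:r17:r16 (msb..lsb), B = r21:r20:r19
    ; product = r27:r26:r25:r24:r23:r22 (msb..lsb)
    ; clobbers r0, r1, r2
    mul24x24:
        clr  r2            ; permanent zero for carry propagation

        mul  r16, r19      ; a0*b0, result lands in r1:r0
        movw r22, r0       ; p1:p0
        clr  r24
        clr  r25
        clr  r26
        clr  r27

        mul  r16, r20      ; a0*b1 -> byte offset 1
        add  r23, r0
        adc  r24, r1
        adc  r25, r2
        adc  r26, r2
        adc  r27, r2

        mul  r17, r19      ; a1*b0 -> offset 1
        add  r23, r0
        adc  r24, r1
        adc  r25, r2
        adc  r26, r2
        adc  r27, r2

        mul  r16, r21      ; a0*b2 -> offset 2
        add  r24, r0
        adc  r25, r1
        adc  r26, r2
        adc  r27, r2

        mul  r17, r20      ; a1*b1 -> offset 2
        add  r24, r0
        adc  r25, r1
        adc  r26, r2
        adc  r27, r2

        mul  r18, r19      ; a2*b0 -> offset 2
        add  r24, r0
        adc  r25, r1
        adc  r26, r2
        adc  r27, r2

        mul  r17, r21      ; a1*b2 -> offset 3
        add  r25, r0
        adc  r26, r1
        adc  r27, r2

        mul  r18, r20      ; a2*b1 -> offset 3
        add  r25, r0
        adc  r26, r1
        adc  r27, r2

        mul  r18, r21      ; a2*b2 -> offset 4
        add  r26, r0
        adc  r27, r1

        clr  r1            ; avr-gcc convention: r1 stays zero
        ret

The full posit multiply still needs regime/exponent decoding, normalisation, and rounding on top of this; the point of the sketch is just that every 24-bit quantity has to live across three 8-bit registers and every carry has to be propagated by hand.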

replies(1): >>45087898 #
2. bugglebeetle ◴[] No.45087898[source]
Do you provide this context or just ask the model to one-shot the problem?
replies(1): >>45087966 #
3. Lerc ◴[] No.45087966[source]
A clear description of the problem, but one-shot.

Something along the lines of

Can you generate 8-bit AVR assembly code to multiply two 24 bit posit numbers

You get some pretty funny results from the models that have no idea what a posit is. It's usually pretty easy to tell whether they know what they are supposed to be doing. I haven't had a success yet (haven't tried for a while, though). Some of them have come pretty close, but usually it's trying to squeeze more than 8 bits of data into a register that brings them down.
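
For anyone wondering what the model needs to know going in, a rough note on the layout, assuming the 2022 posit standard (which fixes es = 2); register choice is arbitrary:

    ; posit<24, es=2>, stored across three 8-bit registers r18:r17:r16 (msb..lsb):
    ;   bit 23      : sign
    ;   bits below  : regime (a run of identical bits ended by its complement)
    ;   next 2 bits : exponent
    ;   remainder   : fraction
    ;
    ; even reading the sign forces you to respect the 8-bit register width:
        mov  r19, r18      ; copy only the high byte
        andi r19, 0x80     ; bit 7 of the high byte is bit 23 of the posit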

replies(1): >>45088302 #
4. bugglebeetle ◴[] No.45088302{3}[source]
Yeah, so it’d be interesting to see whether, provided the correct context/your understanding of its error pattern, it can accomplish this.

One thing you learn quickly about working with LLMs is that they have these kinds of baked-in biases, some of which are very fixed and tied to their very limited ability to engage in novel reasoning (cf. François Chollet), while others are far more loosely held/correctable. If it sticks with the errant pattern even when provided the proper context, it probably isn’t something an off-the-shelf model can handle.