(martin.janiczek.cz)

207 points todsacerdoti | 2 comments | 21 Nov 25 10:28 UTC

ikari_pl ◴[21 Nov 25 12:57 UTC] No.46004050[source]▶

Today, Gemini wrote a Python script for me that connects to the Fibaro API (a local home automation system) and renames all the rooms and devices to English automatically.

It worked on the first run. I mean, the second, because by default the first run was a dry run that printed a beautiful table; the actual run requires a CLI arg, and it also makes a backup.

It was a complete solution.

replies(3): >>46004066 #>>46004101 #>>46004754 #

UltraSane ◴[21 Nov 25 12:59 UTC] No.46004066[source]▶

>>46004050 #

I've been surprised by how often Sonnet 4.5 writes working code on the first try.

replies(3): >>46004099 #>>46004141 #>>46004176 #

troupo ◴[21 Nov 25 13:11 UTC] No.46004176[source]▶

>>46004066 #

I've found it to depend on the phase of the moon.

It goes from genius to idiot and back in the blink of an eye.

replies(3): >>46004943 #>>46005171 #>>46005533 #

1. Mtinie ◴[21 Nov 25 14:28 UTC] No.46004943[source]▶

>>46004176 #

In my experience that “blink of an eye” has turned out to be a single moment when the LLM misses a key point or begins to fixate on an incorrect focus. After that, it’s nearly impossible to recover and the model acts in noticeably divergent ways from the prior behavior.

That single point is where the model commits fully to the previous misunderstanding. Once it crosses that line, subsequent responses compound the error.

replies(1): >>46008319 #

2. troupo ◴[21 Nov 25 20:01 UTC] No.46008319[source]▶

>>46004943 (TP) #

For me it's also sometimes consecutive sessions, or sessions on different days.

FAWK: LLMs can write a language interpreter