549 points thecr0w | 1 comments
1. tehjoker No.46185159
Hmm, you note that the problem is that the LLM doesn't have enough image context, but then you zoom in on the image even more?

Why not downscale the image and feed it in as a second input, so that entire planets fit into a single patch, and instruct the model to use the downsampled image for coarse coordinate estimation?
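
To make the idea concrete, here is a rough sketch of the coarse pass in plain Python. Everything in it is an assumption for illustration: the function names, the grayscale grid-of-lists image format, and the patch and object sizes are all made up, and nothing here reflects how any particular model actually tiles its input. It only shows picking a downscale factor so an object fits in one patch, box-filtering the image, and mapping a coarse coordinate back to full resolution.

```python
# Hypothetical sketch: names, image format, and sizes are assumptions, not from the thread.

def factor_to_fit(object_size, patch_size):
    """Smallest integer downscale factor at which an object spanning
    object_size pixels fits inside one patch of patch_size pixels."""
    return max(1, -(-object_size // patch_size))  # ceiling division


def downscale(pixels, factor):
    """Box-filter a 2D grid of grayscale values by an integer factor.
    Rows and columns that do not fill a whole block are dropped."""
    height, width = len(pixels), len(pixels[0])
    out = []
    for y in range(0, height - height % factor, factor):
        row = []
        for x in range(0, width - width % factor, factor):
            block = [pixels[y + dy][x + dx]
                     for dy in range(factor) for dx in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def coarse_to_full(x, y, factor):
    """Map a coordinate estimated on the downscaled image to the centre
    of the matching block in the full-resolution image."""
    return (x * factor + factor // 2, y * factor + factor // 2)
```

The coarse estimate is only accurate to within one block of `factor` pixels, so the full-resolution image (or a crop around the estimate) would still be needed to refine the position.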