302 points by mastermaq | 11 comments
hbn No.44370715
My work gives us Copilot access since we're a Microsoft Shop™.

It's too bad Copilot is by far the dumbest competitor in the space.

My favorite interaction so far was when I prompted it with:

    ffmpeg command to convert movie.mov into a reasonably sized mp4
Sure, it's not the most direct instruction, but I tend to give just enough to get the job done, assuming the LLM knows its own purpose as an LLM, and that always works with the other chatbots.

Copilot's response:

    I implemented and executed the Python code above to convert movie.mov to a reasonably sized movie.mp4 using ffmpeg.
    
    However, the Python code failed since it was not able to find and access movie.mov file.
    Do you want me to try again or is there anything else that I can help you with?
Note that I didn't cut anything out. It didn't actually provide any "Python code above".
robots0only No.44371002
Oh wow, that's bad. I just tried this with Gemini 2.5 Flash/Pro and it worked perfectly. I'd assume all frontier models would get this right (even simpler models should).
quantadev No.44371072
I'd be willing to bet a clearer prompt would've given a good answer. People tend to overlook the fact that LLMs aren't like Google; they're not really doing a pure "word search". They expect sensible sentence structure to do their best work.
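
For instance, something along these lines (my own phrasing, untested) would leave far less for the model to guess:

    Give me a single ffmpeg shell command that converts movie.mov into an
    mp4 at a reasonable file size. Just the command, please, no script.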
roywiggins No.44371117
Maybe, but this sort of prompt structure doesn't bamboozle the better models at all. If anything, they're quite good at guessing what you mean even when your sentence structure is crap. People routinely use them to clean up borderline-unreadable prose.
macNchz No.44371376
I'm all about clear prompting, but even with the verbatim prompt from the OP, "ffmpeg command to convert movie.mov into a reasonably sized mp4", the smallest current models from Google and OpenAI (gemini-2.5-flash-lite and gpt-4.1-nano) both produced a working command with explanations of what each CLI arg does.

Hell, the Q4-quantized Mistral Small 3.1 model that runs on my 16GB desktop GPU did perfectly as well. All three tests produced a command using x264 with crf 23 that worked without edits, took a random .mov I had from 75 MB down to 51 MB, and came with an explanation of how to adjust the compression to make the file smaller.
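
For reference, that command is essentially the textbook libx264 transcode, something like the following (my reconstruction, not the verbatim model output):

    ffmpeg -i movie.mov -c:v libx264 -crf 23 -preset medium -c:a aac -b:a 128k movie.mp4

Raising -crf (say, to 26 or 28) trades away more quality for a smaller file.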

quantadev No.44371975
I wish I had a nickel for every time I've seen someone get a garbage response from a garbage prompt and then blame the LLM.
quantadev No.44372000
There's as much variability among LLMs as there is in human intelligence. What I'm saying is that if that guy wrote a better prompt, his "failing LLM" would be much more likely to stop failing, unless it's just completely incompetent.

What I always find hilarious, too, is when AI skeptics try to parlay these kinds of "failures" into evidence that LLMs cannot reason. Of course they can reason.

Kiro No.44377550
I get better results when I intentionally omit parts and give them more leeway to figure things out.
quantadev No.44378061
Less clarity in a prompt _never_ results in better outputs. If the LLM has to "figure out" what your prompt even means, it's already wasted a lot of computation going down irrelevant neural branches that could've been spent solving the actual problem.

Sure, you can get creative, interesting results from something totally unclear like "dog park game run fun time", but if you're solving a real problem that has an optimal answer, then clarity is _always_ better. The more info you supply about what you're doing, how, and even why, the better the results you'll get.

Kiro No.44378206
I disagree. Less clarity gives them more freedom to choose and apply the practices they're best trained on, instead of being artificially restricted by a constraint that might not be necessary.
quantadev No.44378849
The more info you give the AI, the more likely it is to apply the practices it was trained on to _your_ situation, as opposed to stereotypical situations that don't apply.

LLMs are like humans in this regard: you never get a human to follow instructions better by omitting parts of the instructions. Even if you just want the LLM to be creative and explore random ideas, you're _still_ better off _telling_ it that. lol.

Kiro No.44380136
Not true, and the trick to getting better results is to let go of that incorrect assumption. If a human is an expert in JavaScript and you tell them to use Rust for a task that could be done in JavaScript, the results will be worse than if you'd just let them use what they know.
quantadev No.44380396
The only way that analogy remotely maps onto reality in the world of LLMs would be a Mixture of Experts system, where smaller models are trained on specific areas like math or chemistry and a routing step before inference selects which model the prompt goes to. If a MoE system had a bug and routed to the wrong expert, quality would drop.

However, _even_ in a MoE system you _still_ always get better outputs when your prompting is clear and carries as much relevant detail as you have. Models never do better because they're unconstrained, as you mistakenly believe.
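
To make the routing idea concrete, here's a toy Python sketch of prompt-level expert routing. It's purely illustrative: the keyword classifier and the expert functions are my own invention for this example, and production MoE models actually route per-token inside the network rather than per-prompt like this.

    # Toy sketch: prompt-level routing to a domain "expert" (illustrative only;
    # real MoE models route per-token inside the network, not per-prompt).
    def math_expert(prompt: str) -> str:
        return f"[math model answers: {prompt}]"

    def chemistry_expert(prompt: str) -> str:
        return f"[chemistry model answers: {prompt}]"

    EXPERTS = {"math": math_expert, "chemistry": chemistry_expert}

    def route(prompt: str) -> str:
        # A naive keyword classifier stands in for the router; a bug here
        # sends the prompt to the wrong expert, and answer quality drops.
        domain = "math" if any(w in prompt for w in ("integral", "sum", "solve")) else "chemistry"
        return EXPERTS[domain](prompt)

    print(route("solve the integral of x^2"))

A clearer, more detailed prompt gives the router (and the expert) more to work with, which is the point being argued above.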