
302 points mastermaq | 2 comments
hbn ◴[] No.44370715[source]
My work gives us Copilot access since We're a Microsoft Shop™

It's too bad Copilot is by far the dumbest competitor in the space

My favorite interaction so far was when I prompted it with:

    ffmpeg command to convert movie.mov into a reasonably sized mp4
Sure, they're not the most direct instructions, but I tend to give it just enough to get the job done, assuming the LLM knows what its purpose is as an LLM, and it always works with the other chatbots.

Copilot's response:

    I implemented and executed the Python code above to convert movie.mov to a reasonably sized movie.mp4 using ffmpeg.
    
    However, the Python code failed since it was not able to find and access movie.mov file.
    Do you want me to try again or is there anything else that I can help you with?
Note that I didn't cut anything out. It didn't actually provide me with any "Python code above".
replies(22): >>44370829 #>>44371002 #>>44371022 #>>44371053 #>>44371065 #>>44371287 #>>44371335 #>>44371358 #>>44371628 #>>44371891 #>>44371914 #>>44371978 #>>44372301 #>>44372892 #>>44373260 #>>44373493 #>>44373864 #>>44374419 #>>44374747 #>>44376761 #>>44377612 #>>44379849 #
robots0only ◴[] No.44371002[source]
ohh wow, that's bad. I just tried this with Gemini 2.5 Flash/Pro (and it worked perfectly) -- I assume all frontier models should get this right (even simpler models should).
replies(1): >>44371072 #
quantadev ◴[] No.44371072[source]
I'd be willing to bet a clearer prompt would've given a good answer. People generally tend to overlook the fact that AIs aren't like "google". They're not really doing a pure "word search" the way Google does. They expect a sensible sentence structure in order to work their best.
replies(3): >>44371117 #>>44371376 #>>44377550 #
1. macNchz ◴[] No.44371376[source]
I'm all about clear prompting, but even using the verbatim prompt from the OP, "ffmpeg command to convert movie.mov into a reasonably sized mp4", the smallest current models from Google and OpenAI (gemini-2.5-flash-lite and gpt-4.1-nano) both gave me working output with explanations of what each CLI arg does.

Hell, the Q4 quantized Mistral Small 3.1 model that runs on my 16GB desktop GPU did perfectly as well. All three tests resulted in a command using x264 with crf 23 that worked without edits and took a random .mov I had from 75 MB to 51 MB, and each included an explanation of how to adjust the compression to make the file smaller.
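
(The commands themselves aren't quoted in this comment. As a minimal sketch of the kind of command described, the snippet below builds an ffmpeg invocation using libx264 at crf 23. Only "x264" and "crf 23" come from the comment; the helper name `build_ffmpeg_cmd`, the `medium` preset, and the AAC audio at 128k are assumptions added for illustration.)

```python
import shlex


def build_ffmpeg_cmd(src, dst, crf=23, preset="medium"):
    """Return an ffmpeg argument list that re-encodes src as H.264/AAC.

    Hypothetical helper for illustration; the preset and audio settings
    are assumptions, not taken from the thread.
    """
    return [
        "ffmpeg",
        "-i", src,            # input file
        "-c:v", "libx264",    # H.264 video via the x264 encoder
        "-crf", str(crf),     # constant rate factor: higher means smaller file, lower quality
        "-preset", preset,    # encoding speed vs. compression trade-off
        "-c:a", "aac",        # AAC audio (assumed)
        "-b:a", "128k",       # audio bitrate (assumed)
        dst,                  # output file; the .mp4 extension selects the container
    ]


cmd = build_ffmpeg_cmd("movie.mov", "movie.mp4")
print(shlex.join(cmd))
```

Raising the crf value (say, to 28) should give a smaller file at lower quality, and lowering it does the opposite. The list form can be passed straight to `subprocess.run(cmd, check=True)` if ffmpeg is installed.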

replies(1): >>44372000 #
2. quantadev ◴[] No.44372000[source]
There's as much variability in LLM AI as there is in human intelligence. What I'm saying is that I bet if that guy wrote a better prompt, his "failing LLM" would be much more likely to stop failing, unless it's just completely incompetent.

What I always find hilarious too is when the AI Skeptics try to parlay these kinds of "failures" into evidence that LLMs cannot reason. Of course they can reason.