
302 points by mastermaq | 3 comments
hbn No.44370715
My work gives us Copilot access since We're a Microsoft Shop™

It's too bad Copilot is by far the dumbest competitor in the space

My favorite interaction so far was when I prompted it with:

    ffmpeg command to convert movie.mov into a reasonably sized mp4
Sure, it's not the most precise instruction, but I tend to give it just enough to get the job done, assuming the LLM knows what its purpose is as an LLM, and that approach always works with the other chatbots.

Copilot's response:

    I implemented and executed the Python code above to convert movie.mov to a reasonably sized movie.mp4 using ffmpeg.
    
    However, the Python code failed since it was not able to find and access movie.mov file.
    Do you want me to try again or is there anything else that I can help you with?
Note that I didn't cut anything out. It didn't actually provide me with any "Python code above"
NordSteve No.44371358
I pasted your prompt:

  ffmpeg command to convert movie.mov into a reasonably sized mp4
into the Copilot app just now. Here's the response:

  ffmpeg -i movie.mov -vcodec libx264 -crf 23 -preset medium -acodec aac -b:a 128k movie_converted.mp4
with this explanation:

  Here’s the breakdown:
  - -vcodec libx264: Uses the H.264 video codec, great for compression and compatibility.
  - -crf 23: Constant Rate Factor — lower means better quality and larger size; higher means smaller size but lower quality. 23 is a nice middle ground.
  - -preset medium: Controls encoding speed vs. compression efficiency. You can try slow or faster depending on your needs.
  - -acodec aac -b:a 128k: Compresses the audio using AAC at 128 kbps — pretty standard and compact.
  If you're looking for even smaller file sizes, try bumping -crf up to 26 or using -preset slow.
  Want to bake in scaling, remove metadata, or trim the video too? Happy to tailor the command further.
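
For reference, here's a minimal Python sketch of what "code that runs ffmpeg" could plausibly look like, wrapping the command above via subprocess. This is purely an assumption for illustration, not anything Copilot actually produced:

  # Illustrative only: runs the ffmpeg command above from Python.
  import subprocess

  def convert(src="movie.mov", dst="movie_converted.mp4", crf=23):
      cmd = [
          "ffmpeg", "-i", src,
          "-vcodec", "libx264", "-crf", str(crf), "-preset", "medium",
          "-acodec", "aac", "-b:a", "128k",
          dst,
      ]
      # check=True raises CalledProcessError if ffmpeg exits nonzero,
      # e.g. when the input file doesn't exist.
      subprocess.run(cmd, check=True)

  convert()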
glenstein No.44371528
One fascinating aspect of LLMs is that they make out-in-the-wild anecdotes instantly reproducible, or at least directly comparable to results from others who get different outcomes.

A lot of our bad experiences with, say, customer support hotlines, municipal departments, or bad high school teachers are associated with a habit of speaking that adds flavor and vibes, or bends experiences into on-the-nose stories with morals, in part because we know they can't be reviewed or corrected by others.

Bringing that same way of speaking to LLMs can show us either (1) the gap between what the model did and how people describe what it did, or (2) that people are being treated differently by the same LLMs. Both of these, I think, are fascinating outcomes.

celeritascelery No.44372217
LLMs are definitely not instantly reproducible. The temperature setting adjusts randomness, and the models are frequently optimized and fine-tuned. You will get very different results depending on what you have in your context, and with a tool like Microsoft Copilot, you have no idea what is in the context. There are also bugs in the tools that wrap the LLM.
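
For intuition, here's a minimal sketch of what temperature does during sampling (illustrative only; the parameters Copilot actually uses aren't public):

  # Illustrative temperature sampling over token logits.
  import math, random

  def sample(logits, temperature=1.0):
      # Lower temperature sharpens the distribution (more deterministic);
      # higher temperature flattens it (more random).
      scaled = [l / temperature for l in logits]
      m = max(scaled)
      weights = [math.exp(l - m) for l in scaled]
      return random.choices(range(len(logits)), weights=weights)[0]

  logits = [2.0, 1.0, 0.5]
  print(sample(logits, temperature=0.2))  # nearly always token 0
  print(sample(logits, temperature=2.0))  # noticeably more varied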

Just because other people on here say “worked for me” doesn’t invalidate OP’s claim. I have had similar experiences where an LLM will tell me “here is a script that does X” and there is no script to be found.

glenstein No.44372959
I was intentionally broad in my claim to account for those possibilities, but I would also reject, for a number of reasons, the idea that instant reproducibility is generally out of reach on account of contextual variance.

Most of us are going to get the same answer to "which planet is third from the sun" even with different contexts. And if we're fulfilling our Healthy Internet Conversation 101 responsibility of charitable interpretation, then other people's experiences with similarly situated LLMs can, within reason, be predictive and can be invoked to set expectations for the most likely behavior, without that implying perfect reproducibility.

reilly3000 No.44373727
I think it really depends on the UI. If it was in some desktop-native experience, maybe it accidentally produced a response assuming there would be a code canvas or something, and sent the code response under a different JSON key.
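
Something like this, hypothetically (the field names here are invented, not Copilot's real schema):

  # Hypothetical: a chat-only client that reads just the prose field
  # would show text referencing code the user never sees.
  response = {
      "text": "I implemented and executed the Python code above ...",
      "code_canvas": "import subprocess\n...",  # dropped by a UI with no canvas
  }

  def render(resp):
      return resp.get("text", "")

  print(render(response))  # the code under "code_canvas" is silently lost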