Dijkstra On the foolishness of "natural language programming"

(www.cs.utexas.edu)

448 points nimbleplum40 | 1 comments | 03 Apr 25 03:30 UTC

01100011 ◴[03 Apr 25 08:04 UTC] No.43566393[source]▶

People are sticking up for LLMs here and that's cool.

I wonder, what if you did the opposite? Take a project of moderate complexity and convert it from code back to natural language using your favorite LLM. Does it give you a reasonable description of the behavior and requirements encoded in the source code while keeping enough detail to recreate the program? Do you find the resulting natural language description is easier to reason about?

I think there's a reason most of the vibe-coded applications we see people demonstrate are rather simple. There is a level of complexity and precision that is hard to manage. Sure, you can define it in plain English, but is the resulting description extensible, understandable, or more descriptive than a precise language? I think there is a reason why legalese is not plain English, and it goes beyond mere gatekeeping.

replies(12): >>43566585 #>>43567611 #>>43567653 #>>43568047 #>>43568163 #>>43570002 #>>43570623 #>>43571775 #>>43571852 #>>43573317 #>>43575360 #>>43578775 #

drpixie ◴[03 Apr 25 10:37 UTC] No.43567611[source]▶

>>43566393 #

> Do you find the resulting natural language description is easier to reason about?

An example from a different field: aviation weather forecasts and notices are published in a strongly abbreviated and codified form. For example, the weather at Sydney, Australia right now is:

  METAR YSSY 031000Z 08005KT CAVOK 22/13 Q1012 RMK RF00.0/000.0

It's almost universal that new pilots ask "why isn't this in words?". And, indeed, most flight planning apps will convert the code to prose.

But professional pilots (and ATC, etc.) universally prefer the coded format. It is compact (one line instead of a whole paragraph), the format is well defined (I know exactly where to look for the one piece I need), and it's unambiguous.

Same for maths and coding - once you reach a certain level of expertise, the complexity and redundancy of natural language is a greater cost than benefit. This seems to apply to all fields of expertise.

replies(11): >>43567969 #>>43568064 #>>43568613 #>>43570213 #>>43572359 #>>43572425 #>>43572798 #>>43576274 #>>43576335 #>>43582729 #>>43590401 #

tim333 ◴[03 Apr 25 14:29 UTC] No.43570213[source]▶

>>43567611 #

> prefer the coded format. It is compact...

On the other hand "a folder that syncs files between devices and a server" is probably a lot more compact than the code behind Dropbox. I guess you can have both in parallel - prompts and code.

replies(6): >>43570377 #>>43570413 #>>43570472 #>>43570714 #>>43575761 #>>43576301 #

ratorx ◴[03 Apr 25 14:44 UTC] No.43570472[source]▶

>>43570213 #

Let’s say that all of the ambiguities are automatically resolved in a reasonable way.

This is still not enough to let two different computers running two different LLMs produce compatible code, right? And there is no guarantee of compatibility as you refine it further. And if you get into the business of specifying the format/protocol, suddenly you have made it much less concise.

So it works as long as you run the prompt exactly once, but a second run will not necessarily produce something compatible with the first.

replies(1): >>43571229 #

squeaky-clean ◴[03 Apr 25 15:33 UTC] No.43571229[source]▶

>>43570472 #

Does it need to result in compatible code if run by two different LLMs? No one complains that Dropbox and Google Drive are incompatible. It would be nice if they were, but it hasn't stopped either of them from having lots of use.

replies(2): >>43571336 #>>43571387 #

ratorx ◴[03 Apr 25 15:46 UTC] No.43571387[source]▶

>>43571229 #

The analogy doesn’t hold. If the entire representation of the “code” is the natural language description, then the ambiguity in the specification will lead to incompatibility in the output between executions. You’d need to pin the LLM version, but then it’s arguable whether you’ve really improved things over the “pile-of-code” you were trying to replace.

It is more like running Dropbox on two different computers, one running Windows and one running Linux (traditional code would have to be compiled twice, but you have much stronger assurance that both builds will do the same thing).

I guess it would work in the multiple-computers case if you distributed the output of the LLM instead. However, if you then have to change something, compatibility with previous versions is not guaranteed.
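A toy Python sketch of the problem, with no LLM involved: the two "clients" below are hand-written stand-ins for what two independent generations from the same one-line description might plausibly produce. All names and both wire formats are invented for illustration.

```python
import json

# The whole "specification", as in the Dropbox example upthread.
SPEC = "a folder that syncs files between devices and a server"


# Reading A (hypothetical): send each file as JSON with hex-encoded content.
def client_a_encode(path: str, data: bytes) -> bytes:
    return json.dumps({"path": path, "content": data.hex()}).encode()


def client_a_decode(message: bytes) -> tuple[str, bytes]:
    obj = json.loads(message)
    return obj["path"], bytes.fromhex(obj["content"])


# Reading B (hypothetical): 2-byte path length, then the path, then raw bytes.
def client_b_encode(path: str, data: bytes) -> bytes:
    encoded_path = path.encode()
    return len(encoded_path).to_bytes(2, "big") + encoded_path + data


def client_b_decode(message: bytes) -> tuple[str, bytes]:
    path_len = int.from_bytes(message[:2], "big")
    return message[2:2 + path_len].decode(), message[2 + path_len:]


def interoperates(encode, decode, path="notes.txt", data=b"hello") -> bool:
    """True if a file sent with `encode` is recovered intact by `decode`."""
    try:
        return decode(encode(path, data)) == (path, data)
    except (ValueError, KeyError, TypeError):
        # Malformed input for this decoder counts as "not compatible".
        return False


print("A -> A:", interoperates(client_a_encode, client_a_decode))
print("B -> B:", interoperates(client_b_encode, client_b_decode))
print("A -> B:", interoperates(client_a_encode, client_b_decode))
print("B -> A:", interoperates(client_b_encode, client_a_decode))
```

Each client round-trips with itself, and both satisfy `SPEC` as written, but neither can read the other's messages. Nothing in the one-line description picks a wire format, which is the part you would have to add (and which makes the description longer).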
