
Nobody knows how to build with AI yet

(worksonmymachine.substack.com)
526 points Stwerner | 3 comments | source
asadotzler ◴[] No.44617454[source]
I seem to have missed the part where he successfully prompted for security, internationalizability, localizability, accessibility, usability, etc., etc.

This is a core problem with amateurs pretending to be software producers. There are others, but this one is fundamental to acceptable commercial software and will absolutely keep vibe-coded products from widespread adoption.

And if you think these aspects of quality software are easily reduced to prompts, you've probably never done serious work in those spaces.

replies(5): >>44620840 #>>44620854 #>>44622214 #>>44622634 #>>44622760 #
Zacharias030 ◴[] No.44622634[source]
Isn’t that like „the dog speaks English, but makes occasional grammar mistakes“?

Give it two years.

replies(1): >>44631488 #
1. bopbopbop7 ◴[] No.44631488[source]
It’s been “two more years” for the past three years. What’s going to happen in those two years? Are we going to find 10 more internets to train on?
replies(1): >>44655819 #
2. Zacharias030 ◴[] No.44655819[source]
Two years ago it didn’t do anything that made me want to use it at work.

The data wall idea of needing 10 more internets is like the peak oil theory of the 2000s, imho. I don’t think 10 more internets are required, and tasks like proper security and i18n don’t require creativity of any kind; they are ideal candidates for LLMs to solve quite soon.

But to talk about the data wall more: current SOTA LLMs are trained on circa 30-50T tokens. First, that is not a full internet's worth; it is estimated that Meta owns more like 200T tokens of user data, etc. Second, the work on optimizing data efficiency is just beginning.

replies(1): >>44718479 #
3. bopbopbop7 ◴[] No.44718479[source]
The current training data is not a full internet's worth because most of the internet is garbage data; it is removed for a reason. But sure, those 200T tokens of Facebook user data are the answer to smarter LLMs.