
Nobody knows how to build with AI yet

(worksonmymachine.substack.com)
526 points Stwerner | 1 comments
asadotzler ◴[] No.44617454[source]
I seem to have missed the part where he successfully prompted for security, internationalizability, localizability, accessibility, usability, etc., etc.

This is a core problem with amateurs pretending to be software producers. There are others, but this one is fundamental to acceptable commercial software and will absolutely derail vibe-coded products from widespread adoption.

And if you think these aspects of quality software are easily reduced to prompts, you've probably never done serious work in those spaces.

replies(5): >>44620840 #>>44620854 #>>44622214 #>>44622634 #>>44622760 #
Zacharias030 ◴[] No.44622634[source]
Isn’t that like “the dog speaks English, but makes occasional grammar mistakes”?

Give it two years.

replies(1): >>44631488 #
bopbopbop7 ◴[] No.44631488[source]
It’s been two more years for the past three years. What’s going to happen in those two years? Are we going to find 10 more internets to train on?
replies(1): >>44655819 #
Zacharias030 ◴[] No.44655819{3}[source]
Two years ago, it didn’t do anything that made me want to use it at work.

The data wall idea of needing 10 more internets is like the peak oil theory of the 2000s, imho. I don’t think 10 more internets are required. Tasks like proper security and i18n don’t require creativity of any kind, which makes them ideal candidates for LLMs to solve quite soon.

But to talk about the data wall more: current SOTA LLMs are trained on circa 30-50T tokens. First, that is not a full internet’s worth; it is estimated that Meta alone owns more like 200T tokens of user data. Second, the work on optimizing for data efficiency is just beginning.

replies(1): >>44718479 #
bopbopbop7 ◴[] No.44718479{4}[source]
The current training data is not a full internet’s worth because most of the internet is garbage data. It is removed for a reason. But sure, those 200T tokens of Facebook user data are the answer to smarter LLMs.