Even the serious idea that the article thinks could work is throwing the unreliable LLMs at verification! If there's any place you can use something that doesn't work most of the time, I guess it's there.
Replace all asserts with expected == expected and most people won't notice.
Recently I came across someone advertising an LLM to generate fashion magazine shoots in Pakistan at 20-25% of the cost. It hit me then that they are undercutting fashion shoots in a country like Pakistan, which is already 90-95% cheaper than most Western countries. This AI is replacing the work of 10-20 people.
To be a bit acerbic, and inspired by Arthur C. Clarke, I might say: "Any sufficiently complex business could be indistinguishable from Theranos".
It is a registration wall I think.
Those tests were very common back when I used to work in Ruby on Rails and automatically generating test stubs was a popular practice. These stubs were often just converted into expected == expected tests so that they passed and then left like that.
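For anyone who hasn't run into one of these, here is a minimal sketch in Python (not Ruby, and not taken from any real codebase). The function `apply_discount` and both test names are made up for illustration, and the bug is deliberate. The tautological test stays green because it never calls the code under test.

```python
def apply_discount(price, percent):
    # Hypothetical function under test, deliberately buggy:
    # it adds the discount instead of subtracting it.
    return price + price * percent / 100


def tautological_test():
    expected = 90.0
    # Compares the expected value with itself and never calls
    # apply_discount, so it cannot fail whatever the code does.
    assert expected == expected


def real_test():
    expected = 90.0
    # Actually exercises the code, so the bug above makes it fail.
    assert apply_discount(100.0, 10) == expected
```

Calling `tautological_test()` passes silently, while `real_test()` raises an `AssertionError` because of the bug.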
There was a thread here about why Y Combinator invests in several competing startups. The answer is that success is often more about connections and politics than the product itself. And crypto, yes, is a good example of this. Musk will get his $1B in bitcoins back for sure.
> Most recent example was funneling money from Russia into Trump’s campaign.
Musk again?
It’s too resource intensive for all code, but mutation testing is pretty good at finding these sorts of tests that never fail. https://pitest.org/
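To show the idea rather than the tool, here is a toy sketch in Python. It is not how pitest works internally and uses none of its API: the mutants are written by hand (a real tool generates them automatically), and every name in it (`original`, `MUTANTS`, `weak_suite`, `strong_suite`, `surviving_mutants`) is made up for illustration. A suite "kills" a mutant when one of its assertions fails against the mutated code.

```python
def original(age):
    # Hypothetical code under test.
    return age >= 18


# Hand-written mutants of `original`, each changing one small thing.
MUTANTS = {
    ">= to >": lambda age: age > 18,
    ">= to <": lambda age: age < 18,
    "always True": lambda age: True,
}


def weak_suite(is_adult):
    expected = True
    assert expected == expected  # never exercises is_adult


def strong_suite(is_adult):
    assert is_adult(18) is True
    assert is_adult(17) is False
    assert is_adult(30) is True


def surviving_mutants(suite):
    """Return the names of mutants the suite fails to detect."""
    survivors = []
    for name, mutant in MUTANTS.items():
        try:
            suite(mutant)
        except AssertionError:
            continue  # the suite failed on the mutant, so it is killed
        survivors.append(name)
    return survivors
```

Both suites pass against `original`, but `surviving_mutants(weak_suite)` returns all three mutant names while `surviving_mutants(strong_suite)` returns an empty list. A test that never fails shows up as a suite that kills nothing.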
Some development stacks are extremely underpowered for code verification, so they do patch the design issue, just as some stacks are underpowered for abstraction and need patching by code generation. Both of those solve an immediate problem, in a haphazard and error-prone way, by adding a maintenance and code-evolution burden that grows linearly with how much you use them.
And worse, if you rely too much on them they will lead your software architecture and make that burden superlinear.
Once it was spices. Then poppies. Modern art. The .com craze. Those blockchain ape images. Blockchain. Now LLMs.
All of these had a bit of true value and a whole load of bullshit. Eventually the bullshit disappears and the core remains, and the world goes nuts about the next thing.
It doesn't matter if it can't actually 'get there' as long as people still believe it can.
Come to think of it, a socioeconomic system dependent on population and economic growth is at a fundamental level driven by this balancing act: "We can solve every problem if we just forge ahead and keep enlarging the base of the pyramid - keep reproducing, keep investing, keep expanding the infrastructure".
https://github.com/williamcotton/search-input-query/blob/mai...
It is a good test suite and it saved me quite a bit of typing!
In fact, Claude did most of the typing for the entire project:
https://github.com/williamcotton/search-input-query
BTW, I obviously didn't just type "make a lexer and multi-pass parser that returns multiple errors and then make a single-line instance of a Monaco editor with error reporting, type checking, syntax highlighting and tab completion".
I put it together piece-by-piece and with detailed architectural guidance.
Oh yes.
I had a discussion with a manager at a client last week and was trying to run him through some (technical) issues relating to challenges an important project faces.
His immediate response was that maybe we should just let ChatGPT help us decide the best option. I had to bite my tongue.
OTOH, I'm more and more convinced that ChatGPT will replace managers long before it replaces technical staff.
- LLMs are extremely competent at surface-level pattern matching and manipulation of the type we'd previously assumed that only AGI would be able to do.
- A large fraction of tasks (and by extension jobs) that we used to, and largely still do, consider to be "knowledge work", i.e. requiring a high level of skill and intelligence, are in fact surface-level pattern matching and manipulation.
Reconciling these facts raises some uncomfortable implications, and calling LLMs "actually intelligent" lets us avoid these.
That is one spicy article; it got a few laughs out of me. I must agree 100% that LangChain is an abomination, both their APIs and their marketing.
Should we expect money pumps to generate inflation quicker on this cycle than on the last ones? If so, why?
In these situations, I’ve been able to program the agent well enough that I haven’t seen much of the issue you described. Consistency is a feature.