320 points laserduck | 48 comments
1. fsndz ◴[] No.42157451[source]
They want to throw LLMs at everything even if it doesn't make sense. The same is true of the whole AI agent craze: https://medium.com/thoughts-on-machine-learning/langchains-s...
replies(10): >>42157567 #>>42157658 #>>42157733 #>>42157734 #>>42157763 #>>42157785 #>>42158142 #>>42158278 #>>42158342 #>>42158474 #
2. marcosdumay ◴[] No.42157567[source]
It feels like the entire world has gone crazy.

Even the one serious idea the article thinks could work is throwing unreliable LLMs at verification! If there's any place where you can use something that doesn't work most of the time, I guess it's there.

replies(5): >>42157699 #>>42157841 #>>42157907 #>>42158151 #>>42158574 #
3. ReptileMan ◴[] No.42157658[source]
Isn't that the case with every new technology? There was a time when people tried to cook everything in a microwave.
replies(3): >>42157735 #>>42157765 #>>42158197 #
4. ajuc ◴[] No.42157699[source]
It's similar in regular programming: LLMs are better at writing test code than actual code, mostly because it's simpler (P vs. NP and all that), but I think also because it's less obvious when test code doesn't work.

Replace all asserts with expected == expected and most people won't notice.
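
A minimal sketch of that failure mode (hypothetical Python; the names are invented for illustration):

    def total(prices):
        return sum(prices)

    def test_total():
        expected = total([1, 2, 3])  # "expected" is derived from the code under test...
        assert expected == expected  # ...and compared to itself, so it can never fail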

replies(4): >>42157802 #>>42157883 #>>42158103 #>>42158154 #
5. xbmcuser ◴[] No.42157733[source]
Yes, that's how we progress. The internet boom happened the same way: everything became a .com, then the workable businesses remained and the unworkable ones disappeared.

Recently I came across someone advertising an LLM that generates fashion-magazine shoots in Pakistan at 20-25% of the cost. It hit me then that they are undercutting fashion shoots in a country like Pakistan, which are already 90-95% cheaper than in most Western countries. This AI is replacing the work of 10-20 people.

replies(1): >>42157835 #
6. alw4s ◴[] No.42157734[source]
Please don't post a link that is behind a paywall!
replies(2): >>42157770 #>>42158507 #
7. wslh ◴[] No.42157763[source]
This makes complete sense from an investor’s perspective, as it increases the chances of a successful exit. While we focus on the technical merits or critique here on HN/YC, investors are playing a completely different game.

To be a bit acerbic, and inspired by Arthur C. Clarke, I might say: "Any sufficiently complex business could be indistinguishable from Theranos".

replies(1): >>42158290 #
8. ksynwa ◴[] No.42157765[source]
Microwave sellers did not become trillion-dollar companies off that hype.
replies(1): >>42157784 #
9. ksynwa ◴[] No.42157770[source]
https://archive.is/dLp6t

It is a registration wall, I think.

replies(1): >>42157913 #
10. ReptileMan ◴[] No.42157784{3}[source]
Mostly because the marginal cost of microwaves was not close to zero.
replies(1): >>42157935 #
11. trolan ◴[] No.42157785[source]
https://archive.ph/dLp6t
12. jeltz ◴[] No.42157802{3}[source]
> Replace all asserts with expected == expected and most people won't notice.

Those tests were very common back when I used to work in Ruby on Rails and automatically generating test stubs was a popular practice. These stubs were often just converted into expected == expected tests so that they passed and then left like that.

13. startupsfail ◴[] No.42157835[source]
The annoying part is that a lot of money can be funneled into these unworkable businesses in the process, crypto being a good example. And these unworkable businesses tend to keep trying to get their way into the money somehow regardless. The most recent example was funneling money from Russia into Trump’s campaign.
replies(1): >>42158046 #
14. deadbabe ◴[] No.42157841[source]
This is typical of any hype bubble. Blockchain used to be the answer to everything.
replies(1): >>42157894 #
15. majormajor ◴[] No.42157883{3}[source]
LLMs are pretty damn useful for generating tests and getting rid of a lot of tedium, but yeah, it's the same as with human-written tests: if you don't check that your test fails when it should (not the same thing as just writing a second test for that case; both tests need to fail if you intentionally screw with their separate fixtures), then you shouldn't have too much confidence in your test.
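
One cheap way to do that check (a hypothetical Python sketch, not from this thread): pin the expectation to a known value, then deliberately break the implementation and confirm the test fails.

    def total(prices):
        return sum(prices)

    def test_total():
        assert total([1, 2, 3]) == 6  # expectation pinned by hand, not derived from the code

    # Sanity check: temporarily sabotage the implementation, e.g.
    #     def total(prices): return sum(prices) + 1
    # and re-run. If test_total still passes, it was never testing anything.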
replies(1): >>42158133 #
16. Mistletoe ◴[] No.42157894{3}[source]
What's after this? Because I really do feel the economy is standing on a cliff right now. I don't see anything after this that can prop stocks up.
replies(2): >>42157988 #>>42158619 #
17. edmundsauto ◴[] No.42157907[source]
Only if they fail in the same way. The multi-agent approach operates under the assumption that LLMs are programmable agents, each agent being a different trade-off among failure modes. If you can string them together, and if the output is easily verified, it can be a great fit for the problem.
replies(1): >>42163127 #
18. tomrod ◴[] No.42157913{3}[source]
Same result. Information locks are verboten.
replies(1): >>42158514 #
19. ksynwa ◴[] No.42157935{4}[source]
Mostly because they were not claiming that sentient microwaves that would cook your food for you were just around the corner, claims which the most respected media outlets then parroted uncritically.
replies(2): >>42158168 #>>42158427 #
20. deadbabe ◴[] No.42157988{4}[source]
The post-quantum age. Companies will go post-quantum.
replies(1): >>42158164 #
21. bubaumba ◴[] No.42158046{3}[source]
> The annoying part, a lot of money could be funneled into these unworkable businesses in the process, crypto being a good example

There was a thread here about why Y Combinator invests in several competing startups. The answer is that success is often more about connections and politics than the product itself. And crypto, yes, is a good example of this. Musk will get his $1B in bitcoins back for sure.

> Most recent example was funneling money from Russia into Trump’s campaign.

Musk again?

22. MichaelNolan ◴[] No.42158103{3}[source]
> Replace all asserts with expected == expected and most people won't notice.

It’s too resource-intensive to run on all code, but mutation testing is pretty good at finding these sorts of tests that never fail. https://pitest.org/
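
In miniature, mutation testing looks like this (a toy Python sketch of the idea, not pitest's actual workflow):

    # A mutation tester applies small changes ("mutants") to the code and re-runs
    # the suite; any mutant the tests don't catch ("survives") flags a weak test.
    def total(prices):
        return sum(prices)

    def mutant(prices):  # e.g. an operator mutation: sum -> max
        return max(prices)

    def weak_test(fn):
        expected = fn([1, 2, 3])
        assert expected == expected  # vacuous: passes for original and mutant alike

    weak_test(total)   # passes
    weak_test(mutant)  # also passes, so the mutant survives and exposes the test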

23. marcosdumay ◴[] No.42158133{4}[source]
If LLMs can generate a test for you, it's because it's a test that you shouldn't need to write. They can't test what is really important, at all.

Some development stacks are extremely underpowered for code verification, so LLM-generated tests do patch that design issue, just like some stacks are underpowered for abstraction and need patching by code generation. Both solve an immediate problem, in a haphazard and error-prone way, by adding a maintenance and code-evolution burden that grows linearly with how much you use them.

And worse, if you rely too much on them, they will start driving your software architecture and make that burden superlinear.

replies(1): >>42158320 #
24. rsynnott ◴[] No.42158142[source]
It really feels like we’re close to the end of the current bubble now; the applications being trotted out are just increasingly absurd.
25. FredPret ◴[] No.42158151[source]
This happens all the time.

Once it was spices. Then poppies. Modern art. The .com craze. Those blockchain ape images. Blockchain. Now LLMs.

All of these had a bit of true value and a whole load of bullshit. Eventually the bullshit disappears and the core remains, and the world goes nuts about the next thing.

replies(1): >>42158486 #
26. rsynnott ◴[] No.42158154{3}[source]
I mean, define ‘better’. Even with actual human programmers, tests which do not in fact test the thing are already a bit of an epidemic. A test which doesn’t test is worse than useless.
27. namaria ◴[] No.42158164{5}[source]
I think the operators are learning how to ride the hype edge. You find that sweet spot between "promising" and "not quite there yet" where you can take lots of investment and iterate forward just enough to keep it going.

It doesn't matter if it can't actually "get there" as long as people still believe it can.

Come to think of it, a socioeconomic system dependent on population and economic growth is at a fundamental level driven by this balancing act: "We can solve every problem if we just forge ahead and keep enlarging the base of the pyramid - keep reproducing, keep investing, keep expanding the infrastructure."

28. rsynnott ◴[] No.42158168{5}[source]
I mean, they were at one point making pretty extravagant claims about microwaves, but to a less credulous audience. Trouble with LLMs is that they look like magic if you don’t look too hard, particularly to laypeople. It’s far easier to buy into a narrative that they actually _are_ magic, or will become so.
replies(1): >>42158439 #
29. namaria ◴[] No.42158197[source]
When did OpenMicroWave promise to solve every societal problem if we just gave it enough money to build a larger microwave oven?
30. spencerchubb ◴[] No.42158278[source]
LLMs have powered products used by hundreds of millions, maybe billions. Most experiments will fail, and that's okay, arguably even a good thing. Only time will tell which ones succeed.
31. spencerchubb ◴[] No.42158290[source]
Theranos was not a "complex business". It was deliberate fraud and deception, and the investors were just gullible. They should have demanded to see concrete results.
replies(1): >>42158856 #
32. williamcotton ◴[] No.42158320{5}[source]
Claude wrote the harness and pretty much all of these tests, e.g.:

https://github.com/williamcotton/search-input-query/blob/mai...

It is a good test suite and it saved me quite a bit of typing!

In fact, Claude did most of the typing for the entire project:

https://github.com/williamcotton/search-input-query

BTW, I obviously didn't just type "make a lexer and multi-pass parser that returns multiple errors and then make a single-line instance of a Monaco editor with error reporting, type checking, syntax highlighting and tab completion".

I put it together piece-by-piece and with detailed architectural guidance.

33. logifail ◴[] No.42158342[source]
> They want to throw LLMs at everything [..]

Oh yes.

I had a discussion with a manager at a client last week and was trying to walk him through some (technical) issues relating to challenges facing an important project.

His immediate response was that maybe we should just let ChatGPT help us decide the best option. I had to bite my tongue.

OTOH, I'm more and more convinced that ChatGPT will replace managers long before it replaces technical staff.

34. Karrot_Kream ◴[] No.42158427{5}[source]
Even rice cookers started doing this by advertising "fuzzy logic".
replies(1): >>42158722 #
35. lxgr ◴[] No.42158439{6}[source]
I feel like what makes this a bit different from just regular old sufficiently advanced technology is the combination of two things:

- LLMs are extremely competent at surface-level pattern matching and manipulation of the type we'd previously assumed that only AGI would be able to do.

- A large fraction of tasks (and by extension jobs) that we used to, and largely still do, consider to be "knowledge work", i.e. requiring a high level of skill and intelligence, are in fact surface-level pattern matching and manipulation.

Reconciling these facts raises some uncomfortable implications, and calling LLMs "actually intelligent" lets us avoid these.

36. isoprophlex ◴[] No.42158474[source]
> I knew it was bullshit from the get-go as soon as I read their definition of AI agents.

That is one spicy article; it got a few laughs out of me. I must agree 100% that Langchain is an abomination, both their APIs and their marketing.

37. vishnugupta ◴[] No.42158486{3}[source]
Exactly. I’ve seen this enough times now to appreciate that oft-repeated tech adoption curve. It seems like we are in the "peak expectations" phase, which is immediately followed by the disillusionment phase and then maturity.
38. lxgr ◴[] No.42158507[source]
Please don't complain about paywalls: https://news.ycombinator.com/item?id=10178989
39. lxgr ◴[] No.42158514{4}[source]
As annoying as I find them, on this site they're in fact not: https://news.ycombinator.com/item?id=10178989
40. cwzwarich ◴[] No.42158574[source]
If your LLM is producing a proof that can be checked by another program, then there’s nothing wrong with its unreliability. It’s just like playing a game whose rules are a logical system.
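
A sketch of that pattern (hypothetical Python; llm_generate and the "checker" command are placeholders, not real APIs): the untrusted model proposes, the trusted verifier disposes.

    import subprocess

    def llm_generate(prompt: str) -> str:
        """Placeholder for any untrusted proof generator, e.g. an LLM API call."""
        raise NotImplementedError  # hypothetical: wire up a model here

    def prove(statement: str, attempts: int = 10) -> str | None:
        for _ in range(attempts):
            candidate = llm_generate(f"Prove: {statement}")  # may be wrong or nonsense
            # Soundness rests entirely on the verifier (e.g. a proof assistant's kernel):
            result = subprocess.run(["checker"], input=candidate, text=True)
            if result.returncode == 0:
                return candidate  # machine-checked, however unreliably it was produced
        return None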
41. dgfitz ◴[] No.42158619{4}[source]
That’s because we are still waiting for the 2008 bubble to pop, which was further inflated by the 2020 bubble. It’s going to be bad. People will blame Trump; Harris would have been eating the same shit sandwich.

It’s gonna be bad.

replies(1): >>42161030 #
42. AlotOfReading ◴[] No.42158722{6}[source]
Fuzzy logic rice cookers are the result of an unrelated fad in 1990s Japanese engineering companies. They added fuzzy controls to everything from cameras to subways to home appliances. It's not part of the current ML fad.
replies(1): >>42161354 #
43. wslh ◴[] No.42158856{3}[source]
I expected you to take this with a grain of salt but also to read between the lines: while some projects involve deliberate fraud, others may simply lack coherence and inadvertently follow the principles of the greater fool theory [1]. The use of ambiguous or indistinguishable language often blurs the distinction, making it harder to differentiate outright deception from an unsound business model.

[1] https://en.wikipedia.org/wiki/Greater_fool_theory

44. marcosdumay ◴[] No.42161030{5}[source]
What makes you think he won't just inflate the bubble again?

Should we expect money pumps to generate inflation quicker on this cycle than on the last ones? If so, why?

replies(1): >>42161159 #
45. dgfitz ◴[] No.42161159{6}[source]
I think only an ignorant person doesn’t see the train wreck coming, and how making more money won’t fix fuck all.
46. Karrot_Kream ◴[] No.42161354{7}[source]
Yes. My point is that technology fads aren't new and getting mad at them is a bit like getting mad at fashion or taste.
47. astrange ◴[] No.42163127{3}[source]
If you're going to do that, you need completely different LLMs to base the agents on. The ones I've tried have "mode collapse": ask them to emulate different agents and they'll all end up behaving the same way. A simple example: ask one to write different stories, and they'll usually end up having the same character names.
replies(1): >>42169897 #
48. edmundsauto ◴[] No.42169897{4}[source]
It may depend on the domain. I tend to use LLMs for things that are less open-ended: more categorization and summarization than pure novel creation.

In these situations, I’ve been able to program the agents well enough that I haven’t seen much of the issue you describe. Consistency is a feature.