321 points laserduck | 31 comments
1. aubanel ◴[] No.42158417[source]
I know nothing about chip design. But saying "Applying AI to field X won't work, because X is complex, and LLMs currently have subhuman performance at this" always sounds dubious.

VCs are not investing in the current LLM-based systems to improve X, they're investing in a future where LLM based systems will be 100x more performant.

Writing is complex, LLMs once had subhuman performance, and yet. Digital art. Music (see suno.AI). There is a pattern here.

replies(7): >>42158545 #>>42158550 #>>42158576 #>>42159935 #>>42160061 #>>42165587 #>>42169569 #
2. zachbee ◴[] No.42158545[source]
I didn't get into this in the article, but one of the major challenges with achieving superhuman performance on Verilog is the lack of high-quality training data. Most professional-quality Verilog is closed source, so LLMs are generally much worse at writing Verilog than, say, Python. And even still, LLMs are pretty bad at Python!
replies(3): >>42158764 #>>42159143 #>>42160983 #
3. jrflowers ◴[] No.42158550[source]
I like this reasoning. It is shortsighted to say that LLMs aren’t well-suited to something (because we cannot tell the future), but it is not shortsighted to say that LLMs are well-suited to something (because we cannot tell the future).
replies(1): >>42158710 #
4. kuhewa ◴[] No.42158576[source]
> Writing is complex, LLMs once had subhuman performance,

And now they can easily replace mediocre human performance, and since they are tuned to provide answers that appeal to humans, that is especially true for these subjective-value use cases. Chip design doesn't seem very similar. Seems like a case where specifically trained tools would be of assistance. For some things, as much as generalist LLMs have surprised with their skill at specific tasks, it is very hard to see how training on a broad corpus of text could outperform specific tools. As for the first paragraph: do you really think it is not dubious to think a model trained on text would outperform Stockfish at chess?

replies(1): >>42163499 #
5. cruffle_duffle ◴[] No.42158710[source]
I kinda suspect that things that are expressed better with symbols and connections than with text will always be a poor fit for large LANGUAGE models. Turning what is basically a graph into a linear stream of text descriptions to tokenize and jam into an LLM has to be an incredibly inefficient and not very performant way of letting “AI” do magic on your circuits.

Ever try to get ChatGPT to play Scrabble? Ever try to describe the board to it and then all the letters available to you? Even its fancy-pants o1 preview performs absolutely horribly. Either my prompting completely sucks or an LLM is just the wrong tool for the job.

It’s great if you ask it to score something you just created, provided you tell it what bonuses apply to which words and letters. But it has absolutely no concept of the board at all. You cannot use it to optimize your next move based on the board and the letters.

… I mean, you might if you were extremely verbose about every letter on the board and every available place to put your tiles, perhaps avoiding coordinates and instead describing each word, its neighbors, and its relationship to bonus squares. But that just highlights how bad a tool an LLM is for Scrabble.
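
Just to illustrate how verbose that gets, here's a toy sketch (invented board format and bonus layout, nothing like a real Scrabble engine) of flattening even a tiny board into prompt text:

    # Toy sketch: flatten a tiny board into the kind of prompt text an LLM needs.
    # The board format and bonus layout are made up purely for illustration.
    BONUS = {(0, 0): "triple word", (1, 1): "double letter"}

    def board_to_prompt(board, rack):
        lines = []
        for r, row in enumerate(board):
            for c, tile in enumerate(row):
                desc = f"row {r}, column {c}: " + (f"letter '{tile}'" if tile else "empty")
                if (r, c) in BONUS:
                    desc += f" ({BONUS[(r, c)]} square)"
                lines.append(desc)
        lines.append("your rack: " + ", ".join(rack))
        return "\n".join(lines)

    tiny = [["C", "A", "T"], [None, None, None], [None, "O", None]]
    print(board_to_prompt(tiny, ["D", "G", "E"]))  # 10 lines of prompt for a 3x3 board

That's already ten lines of prompt for a 3x3 toy board; a full 15x15 board balloons from there, and the model still has no spatial sense of any of it.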

Anyway, I’m sure schematics are very similar. Maybe someday somebody will invent good machine learning models for such things, but an LLM isn’t it.

replies(1): >>42161888 #
6. theptip ◴[] No.42158764[source]
That’s what your VC investment would be buying; “pay experts to create a private training set for fine-tuning” is an obvious new business model that is probably under-appreciated.

If that’s the biggest gap, then YC is correct that it’s a good area for a startup to tackle.

replies(1): >>42171739 #
7. e_y_ ◴[] No.42159143[source]
That's probably where there's a big advantage to being a company like Nvidia, which has both the proprietary chip design knowledge/data and the resources, money, and AI/LLM expertise to work on something specialized like this.
replies(1): >>42159803 #
8. DannyBee ◴[] No.42159803{3}[source]
I strongly doubt this - they don't have enough training data either - you are confusing (I think) the scale of their success with the amount of Verilog they possess.

I.e., I think you are wildly underestimating the scale of training data needed, and wildly overestimating the amount of Verilog code possessed by Nvidia.

GPUs work by having moderate-complexity cores (in the scheme of things) that are replicated 8000 times or whatever. That does not require having 8000 times as much useful Verilog, of course.

The folks who have 8000 different chips, or 100 chips that each do 1000 things, would probably have orders of magnitude more Verilog to use for training.

9. duped ◴[] No.42159935[source]
AI still has subhuman performance for art. It feels like the Venn diagram of people who are bullish on LLMs and people who don't understand logistic curves is a circle.
replies(1): >>42161059 #
10. jjk166 ◴[] No.42160983[source]
I would imagine it is a reasonably straightforward thing to create a simulator that generates arbitrary chip designs and the corresponding Verilog, which can be used as training data. It would be much like how AlphaFold was trained. The chip designs don't need to be good, or even useful, they just need to be valid so the LLM can learn the underlying relationships.
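
Purely as an illustrative sketch of what such a generator could look like (module and signal names are invented, and real training data would still need to pass lint, synthesis, and simulation checks):

    import random

    # Toy generator: emit a random but syntactically consistent Verilog module,
    # paired with a plain-text "spec", as one (prompt, target) training example.
    OPS = {"&": "AND", "|": "OR", "^": "XOR", "+": "ADD"}

    def random_module(name="gen_mod", n_inputs=3, n_stages=4, width=8, seed=None):
        rng = random.Random(seed)
        inputs = [f"in{i}" for i in range(n_inputs)]
        signals = list(inputs)          # later stages may reuse earlier results
        body, spec = [], []
        for s in range(n_stages):
            a, b = rng.sample(signals, 2)
            op = rng.choice(list(OPS))
            body.append(f"  assign t{s} = {a} {op} {b};")
            spec.append(f"t{s} is the {OPS[op]} of {a} and {b}")
            signals.append(f"t{s}")
        ports = [f"  input  wire [{width-1}:0] {i}" for i in inputs]
        ports.append(f"  output wire [{width-1}:0] out")
        verilog = "\n".join(
            [f"module {name} ("]
            + [",\n".join(ports), ");"]
            + [f"  wire [{width-1}:0] " + ", ".join(f"t{s}" for s in range(n_stages)) + ";"]
            + body
            + [f"  assign out = t{n_stages-1};", "endmodule"])
        return "; ".join(spec), verilog

    spec, code = random_module(seed=42)
    print(spec)
    print(code)

Whether "syntactically consistent" gets anywhere near "valid" is the real question, as the replies point out.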
replies(2): >>42163071 #>>42175981 #
11. jjk166 ◴[] No.42161059[source]
You ask 100,000 humans each to make a photorealistic rendering of an alpaca playing basketball on the moon in 90 seconds, and an LLM is going to outperform every single one of them.
replies(2): >>42163079 #>>42165589 #
12. therealcamino ◴[] No.42161888{3}[source]
There are lots of reasons to doubt the present-day ability of LLMs to help with chip design, but I don't think any of the things above is why. Chip design isn't done with schematics. If an LLM can write Python given enough training data, it can write SystemVerilog given a similar amount of training (though the world currently lacks enough high-quality open-source SV to reach an equivalent level). We can debate whether the LLM actually writes Python well. But I don't think there's a reason to expect that writing SV requires a different approach.
replies(2): >>42162219 #>>42171810 #
13. cruffle_duffle ◴[] No.42162219{4}[source]
I get what you are saying. It could be a good ‘commander’ that knows how to delegate to better-suited subsystems. But it is not the only way to be intelligent by any means.

To a nail, every hammer has a purpose.

14. astrange ◴[] No.42163071{3}[source]
I know just enough about chips to be suspicious of "valid". The right solution for a chip at the HDL layer depends on your fab, the process you're targeting, what % of physical space on the chip you want it to take up, and how much you're willing to put into power optimization.
replies(1): >>42185913 #
15. astrange ◴[] No.42163079{3}[source]
Diffusion models aren't actually LLMs; they're a different architecture. Which makes it even weirder that we invented them at the same time.

Also, they might not be able to do it. E.g., most models can't generate "horse riding an astronaut" or "upside-down car".

replies(1): >>42167611 #
16. tim333 ◴[] No.42163499[source]
When people say LLM, I think they are often thinking of neural network approaches in general rather than just text-based models, even if the letters do stand for "language model". And there's overlap, e.g. Gemini does language but is multimodal. If you skip that, you get things like AlphaZero, which did beat Stockfish https://en.wikipedia.org/wiki/AlphaZero
17. shash ◴[] No.42165587[source]
In this specific case, it's hard to see how LLMs can get you from here to there. The problem isn't the boilerplate code, like when you build a React website with them, but the really novel architectures (and architectural decisions, more importantly) that you need to make along the way. Some of those can seem very arbitrary and require deep understanding to pull off. You can't just use language-like tokens to reason this out. A fundamental understanding of the laws and rules of thumb is important.
18. duped ◴[] No.42165589{3}[source]
That's not a meaningful benchmark for valuing art or creating art.
replies(1): >>42185998 #
19. sincerely ◴[] No.42167611{4}[source]
To be fair, most humans can't draw any better than stick figures.
replies(1): >>42171006 #
20. epolanski ◴[] No.42169569[source]
Okay, but I still see no startup here.

If LLMs do well in this space for some use case, it's the established chip designers that will benefit, not a small startup.

21. foldr ◴[] No.42171006{5}[source]
This is true, but humans are much better at including specified elements in an image with specified spatial relationships. A description like "A porpoise seated at a desk writing a letter" will reliably produce (terrible) drawings consisting of parts corresponding to the porpoise, parts corresponding to the desk, and parts corresponding to the letter, with the arrangement of the parts roughly corresponding to the description.
replies(1): >>42186231 #
22. adrian_b ◴[] No.42171739{3}[source]
It would be hard to find any experts that could be paid "to create a private training set for fine tuning".

The reason is that those experts do not own the code that they have written.

The code is owned by big companies like NVIDIA, AMD, Intel, Samsung and so on.

It is unlikely that these companies would be willing to provide the code for training, except for some custom LLM to be used internally by them, in which case the amount of code that they could provide for training might not be very impressive.

Even a designer who works in those companies may have great difficulty seeing significant quantities of archived Verilog/VHDL code, though one can hope that it still exists somewhere.

replies(1): >>42174344 #
23. adrian_b ◴[] No.42171810{4}[source]
The main problem in making a good circuit design, and actually also in writing a good program, is not writing per se.

The main problem is finding an optimal decomposition of the big project into a collection of interconnected modules and defining adequate interfaces between those modules.

This is not difficult when the purpose of the project is just to take an older project and make some improvements to it, since a suitable structure is already known, but it is always the main difficulty when a really new problem must be solved.

I have yet to see any example where an LLM can be used to help, even in the slightest way, with such a "divide et impera" exercise for something novel, where novel by definition means that the training set did not contain the solution to an identical project.

There is pretty much no relationship between the 2-dimensional or multi-dimensional structural graph of the interconnected modules, together with the descriptions of their matching interfaces, and the proximity or frequency of tokens in the description of the circuit in a hardware description language. So there is little that an LLM could use to generate any HDL program for an unknown circuit.

What an LLM could do comes only after a good designer has done the difficult job of decomposing the project into modules and defining the interfaces. When given a small module with its defined interfaces, an LLM might be able to find some boilerplate code to speed up the implementation of the module.

However, any good designer would already have templates for the boilerplate code, and I cannot really imagine how an LLM could do this faster than a designer who just selects the appropriate templates and pastes them into the module.
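
A rough sketch of the kind of template filling being described (a hypothetical helper, not any real EDA tool):

    # Hypothetical helper, not a real tool: given an already-defined interface,
    # emitting the module skeleton is mechanical template filling.
    def module_skeleton(name, ports):
        decls = ",\n".join(f"  {direction:6s} wire [{width-1}:0] {pname}"
                           for pname, direction, width in ports)
        return (f"module {name} (\n{decls}\n);\n"
                "  // implementation goes here\n"
                "endmodule\n")

    print(module_skeleton("fifo_ctrl", [("wr_en", "input", 1),
                                        ("rd_en", "input", 1),
                                        ("data_in", "input", 8),
                                        ("data_out", "output", 8)]))

Which is the point: a designer with a snippet library already does this part quickly, so an LLM saving time here does not buy much.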

24. theptip ◴[] No.42174344{4}[source]
When I say “pay to create” I generally mean authoring new material, distilling your career’s expertise.

Not my field of expertise but there seem to be experts founding startups etc in the ASIC space, and Bitcoin miners were designed and built without any of the big companies participating. So I’m not following why we need Intel to be involved.

An obvious way to set up the flywheel here is to hire experts to do professional services or consulting on customer-submitted designs while you build up your corpus. While I said “fine-tuning”, there is probably a lot of agent scaffolding to be built too, which disproportionately helps bigger companies with more work throughput. (You can also acquire a company with the expertise and tooling, as Apple did with PA Semi in ~2008, though obviously $100m order of magnitude is out of reach for a startup. https://www.forbes.com/2008/04/23/apple-buys-pasemi-tech-ebi...)

replies(1): >>42175803 #
25. adrian_b ◴[] No.42175803{5}[source]
I doubt any real expert would be tempted by an offer to author new material, because that cannot be done in a good way.

One could author some projects that can be implemented in FPGAs, but those do not provide good training material for generating code that could be used to implement a project in an ASIC, because the constraints of the design are very different.

Designing an ASIC is a year-long process and it is never completed before testing some prototypes, whose manufacture may cost millions. Authoring some Verilog or VHDL code for an imaginary product that cannot be tested on real hardware prototypes could result only in garbage training material, like the code of a program that has never been tested to see if it actually works as intended.

Learning to design an ASIC is not very difficult for a human, because a human does not need a huge number of examples the way ML/AI does. Humans learn the rules, and a few examples are enough for them. I have worked at designing ASICs in a few companies. While those companies had some internal training courses for their designers, those courses only taught their design methodologies, with practically no code examples from older projects, which is very unlike how an LLM would have to be trained.

26. adrian_b ◴[] No.42175981{3}[source]
I have never heard of any company, no matter how big and experienced, where it is possible to decide that an ASIC design is valid by any other means except by paying for a set of masks to be made and for some prototypes to be manufactured, then tested in the lab.

This validation costs millions, which is why it is hard to enter this field, even as a fabless designer.

Many design errors are not caught even during hardware testing, but only after mass production, like the ugly MONITOR/MWAIT bug of Intel Lunar Lake.

Randomly generated HDL code, even if it does not have syntax errors, and even if some testbench for it does not identify deviations from its specification, is no more likely to be valid when implemented in hardware than the proverbial output of a typewriting monkey.

replies(1): >>42185871 #
27. jjk166 ◴[] No.42185871{4}[source]
Validating an arbitrary design is hard. It's equivalent to the halting problem. Working backwards using specific rules that guarantee validity is much easier. Again, the point is not to produce useful designs. The resulting model doesn't need to be perfect (indeed it can't be); it just needs to be able to avoid the same issues that humans are looking for.
28. jjk166 ◴[] No.42185913{4}[source]
The goal is not to produce the right solution, or even a good one. The point is to create a large library of highly variable solutions so the trained model can pick up on underlying patterns. You want it to spit out lots of crap.
29. jjk166 ◴[] No.42185998{4}[source]
What meaningful benchmark would you use? Art by its nature is subjectively experienced - what one person considers great, meaningful, soul-moving art, another may consider terrible, meaningless, and empty. Both opinions are equally valid.

But if you're using AI to create art, you're typically not trying to move someone's soul. You're trying to create a work that depicts something in a particular style, with a particular fidelity, and with a certain amount of resource consumption. That is the only metric by which it makes any sense to evaluate the machine designed to do that specific task.

30. jjk166 ◴[] No.42186231{6}[source]
Humans being better at one specific aspect of a task is not equivalent to humans being overall better at the task.

I just entered your prompt into an AI image generator, and in under a second it gave me an image[0] of what looks to me like an anthropomorphic dolphin sitting at a desk writing a letter in a little study. I then had to google the difference between a porpoise and a dolphin, because I genuinely thought porpoises looked much more like manatees.

While I could nitpick the AI's work for making the porpoise's snout a little too long, had I drawn it the porpoise would have been a vaguely marine-looking blob with no anatomy detailed enough to recognize, let alone criticize. I am quite confident that if you asked humans for a large number of images based on that prompt, this one would easily rank among the best, and it's unlikely you'd get any that were markedly better. The fact that it can generate this image nearly instantaneously is astounding.

If your goal was to get one masterpiece hanging in the Louvre, this particular tool would not suffice, but if your goal was to illustrate children's books, this tool could do in hours what would have taken a team of humans months. That is superhuman performance.

[0] https://api.deepai.org/job-view-file/e0b80ca6-d934-42e4-9a7e...

(Sorry if the link doesn't remain good for long)

replies(1): >>42189407 #
31. foldr ◴[] No.42189407{7}[source]
An AI image generator will sometimes do a good job on this sort of prompt, but it fails in different ways than humans do.

Whether humans or AI are better at the task overall is probably too vague a question to answer, depending a lot on how you weight different desirables.