Edit: I believe that LLMs are eminently useful for replacing experts (of all people) 90% of the time.
I’d disagree, though: humans are still easier to predict and understand (and trust) than AI, typically.
What do you mean by "expert"?
Do you mean the pundit who goes on TV and says "this policy will be bad for the economy"?
Or do you mean the seasoned developer who you hire to fix your memory leaks? To make your service fast? Or cut your cloud bill from 10M a year to 1M a year?
In this example, GPT-4o cannot tell that GitHub is spelled correctly:
https://app.gitsense.com/?doc=6c9bada92&model=GPT-4o&samples...
In this example, Claude cannot tell that GitHub is spelled correctly:
https://app.gitsense.com/?doc=905f4a9af74c25f&model=Claude+3...
I still believe LLMs are a game changer, and I'm currently working on what I call a "Yes/No" tool, which I believe will make trusting LLMs a lot easier (for certain things, of course). The basic idea is that the "Yes/No" tool will let you combine models, samples and prompts to come to a Yes or No answer.
Based on what I've seen so far, any single model can easily screw up, but it is unlikely that all of them will screw up at the same time.
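To make the idea concrete, here is a minimal sketch of how combining models, samples and prompts into one Yes or No answer could work. This is not the actual "Yes/No" tool: the `Model` type, `normalize`, `yes_no_vote` and the stub models are all hypothetical names made up for illustration, and a simple majority vote is just one assumed way to combine the replies.

```python
import re
from collections import Counter
from typing import Callable, Iterable

# Hypothetical type: anything that maps a prompt to a raw text reply.
Model = Callable[[str], str]


def normalize(answer: str) -> str:
    """Map a raw reply to 'yes', 'no', or 'unclear' based on its first word."""
    match = re.match(r"\s*(yes|no)\b", answer, flags=re.IGNORECASE)
    return match.group(1).lower() if match else "unclear"


def yes_no_vote(models: Iterable[Model], prompts: Iterable[str], samples: int = 3):
    """Ask every model every prompt `samples` times and take a majority vote."""
    prompts = list(prompts)
    tally = Counter()
    for model in models:
        for prompt in prompts:
            for _ in range(samples):
                tally[normalize(model(prompt))] += 1
    yes, no = tally["yes"], tally["no"]
    if yes == no:
        verdict = "unclear"  # a tie is reported rather than resolved
    else:
        verdict = "yes" if yes > no else "no"
    return verdict, dict(tally)


# Stub models standing in for real API calls (assumption: real ones return text).
def model_a(prompt: str) -> str:
    return "Yes, it is spelled correctly."


def model_b(prompt: str) -> str:
    return "No."


if __name__ == "__main__":
    question = "Is 'GitHub' spelled correctly? Answer Yes or No."
    print(yes_no_vote([model_a, model_a, model_b], [question], samples=2))
```

If each of k models were wrong independently with probability p, all of them being wrong together would have probability p^k. Real models share training data, so treating them as independent is an optimistic assumption.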
No, I'm just disappointed in the decision of Black Box A and am bound to be even more disappointed by Black Box B. If we continue removing thoughtful design from our systems because thoughtlessness is the default, nobody's life will improve.
Sure, EDA tools are deterministic, but the humans who apply them are not. Introducing LLMs to these processes is not some radical and scary departure; it’s an iterative evolution.
But we have had extensive experience with humans, so it is normal for trust in them to be better defined. LLMs will come to be better understood as well. There is no central understander or truth, and that is the interesting part: it's a "Blind men and the elephant" situation.
If you could, that would be nice, wouldn't it? And if you couldn't?
If people were saying, "let's replace Casio calculators with interfaces to GPT", then that would be crazy and I would wholly agree with you. But by and large, the processes people are scrambling to place LLMs in are ones where typical machines struggle or fail and humans excel or do decently (and where LLMs are making some headway).
You're making the wrong distinction here. It's not Dave vs your nifty script. It's Dave or nothing at all.
There's no point comparing LLM performance to some hypothetical perfect understanding machine that doesn't exist.
You compare it to the things it's meant to replace: humans. How well can the LLM do this compared to Dave?
By and large, the processes people are scrambling to place LLMs in are ones where typical machines struggle or fail and humans excel or do decently (and where LLMs are making some headway).
There's no point comparing LLM performance to some hypothetical perfect understanding machine that doesn't exist. It's nonsensical actually. You compare it to the performance of the beings it's meant to replace or augment - humans.
Replacing non-deterministic black boxes with potentially better performing non-deterministic black boxes is not some crazy idea.
Experts capable of critical thinking and reflecting on evidence that contradicts their world model (and thereby retraining it on the fly)? Most likely not, at least not in their current architecture with all its limitations.
LLMs are not good at actually doing the processing; they are not good at math or even text processing at a character level. They often get logic wrong. But they are pretty good at looking at patterns and finding creative solutions to new inputs (or at least what can appear creative, even if philosophically it’s more pattern matching than creativity). So an LLM would potentially be good at writing a first draft of that script, which Dave could then proofread/edit, and which a standard deterministic computer could just run verbatim to actually do the processing. Eventually maybe even Dave’s proofreading would be superfluous.
Tying this back to the original article, I don’t think anyone is proposing having an LLM inside a chip that processes incoming data in a non-deterministic way. The article is about using AI to design the chips in the first place. But the chips would still be deterministic, the equivalent of the script in this analogy. There are plenty of arguments to make about LLMs not being good enough for that, not being able to follow the logic or optimize it, or come up with novel architectures. But the shape of chip design/Verilog feels like something that, with enough effort, an AI could likely be built to be pretty good at. All of the knowledge that smart, knowledgeable engineers who are good at writing Verilog have built up can almost certainly be represented in some AI form, and I wouldn’t bet against AI getting to a point where it can be helpful in the way Copilot currently is with code completion. Maybe not perfect anytime soon, but good enough that we could eventually see a path to 100%. It doesn’t feel like there’s a fundamental reason this is impossible on a long enough time scale.
I'm pretty sure they are scrambling to put them absolutely anywhere it might save or make a buck (or convince an investor that it could).
Right, and there’s nothing fundamentally wrong with this, nor is it a novel method. We’ve been joking about copying code from stack overflow for ages, but at least we didn’t pretend that it’s the peak of human achievement. Ask a teacher the difference between writing an essay and proofreading it.
Look, my entire claim from the beginning is that understanding is important (epistemologically, it may be what separates engineering from alchemy, but I digress). Practically speaking, if we see larger and larger pieces of LLM-written code, it will be similar to Dave and his incomprehensible VBA script. It works, but nobody knows why. Don’t get me wrong, this isn’t new at all. It’s an ever-present wet blanket that slowly suffocates engineering ventures that don’t pay attention and actively resist it. In that context, uncritically inviting a second wave of monkeys to the nuclear control panels is what baffles me.
It's really just that the "in principle" part of the overall implication of your comment, and so many others, just doesn't make sense. It's very much cutting off your nose to spite your face. How could science itself be possible, much less engineering, if this is how we decided things? If we regarded ourselves always from the outside? How could we even be motivated to debate whether we get the computers to design their own chips? When would something actually happen? At some point, people do have ideas, in a full, if false, transparency to themselves, that they can write down and share and explain. This is not only the thing that has gotten us this far, it is the very essence of why these models are so impressive in the certain ways that they are. It doesn't make sense to argue for the fundamental cheapness of the very thing you are ultimately trying to defend. And it imposes this strange perspective where we are not even living inside our own (phenomenal) minds anymore, where it fundamentally never matters what we think, no matter our justification. It's weird!
I'm sure you have a lot of good points and stuff; I'm simply pointing out that this particular argument is maybe not the strongest.
Tangent for a slight pet peeve of mine:
"We" did joke about this, but probably because most of our jobs are not in chip design. "We" also know the limits of this approach.
The fact that Stack Overflow is the most SEO optimised result for "how to center div" (which we always forget how to do) doesn't have any bearing on the times when we have an actual problem requiring our attention and intellect. Say diagnosing a performance issue, negotiating requirements and how they subtly differ in an edge case from the current system behaviour, discovering a shared abstraction in 4 pieces of code that are nearly but not quite the same.
I agree with your posts here, the Stack Overflow thing in general is just a small hobby horse I have.
I like my job.
My job also involves cooperating with other non-deterministic black boxes (colleagues).
I can totally see how artificial non-deterministic black boxes (artificial colleagues) may be useful to replace/augment the biological ones.
For one, artificial colleagues don't get tired and I don't accidentally hurt their feelings or whatnot.
In any case, I'm not looking forward to replacing my deterministic tools with the fuzzy AI stuff.
Intuitively at least it seems to me that these non-deterministic black boxes could really benefit from using the deterministic tools for pretty much the same reasons we do as well.
I accept that I’m fallible, both in my areas of expertise and in all the meta stuff around it. I code bugs. I omit requirements. Not often, and there are mental and technical means to minimize it, but my work, my org’s structure, and my company’s processes are all designed to mitigate human fallibility.
I’m not interested in “defending” AI models. I’m just saying that their weaknesses are qualitatively similar to human weaknesses, and as such, we are already prepared to deal with those weaknesses as long as we are aware of them, and as long as we don’t make the mistake of thinking that because they use transistors they should be treated like a mostly deterministic piece of software where one unit test pass means it is good.
I think you’re reading some kind of value judgement on consciousness into what is really just a pragmatic approach to slotting powerful but imperfect agents into complex systems. It seems obvious to me, and without any implications as to human agency.
For example, using an LLM to transform structured data into JSON, and doing it with two LLMs in parallel to try to catch the inevitable failures, instead of just writing code that outputs JSON.
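For comparison, the "just write code" option is usually a handful of lines. A minimal sketch, assuming the structured data arrives as a header plus rows; the function name `rows_to_json` and the sample data are made up for illustration.

```python
import json
from typing import Any, Sequence


def rows_to_json(header: Sequence[str], rows: Sequence[Sequence[Any]]) -> str:
    """Deterministically turn tabular rows into a JSON array of objects."""
    records = [dict(zip(header, row)) for row in rows]
    # sort_keys makes the output byte-for-byte reproducible across runs.
    return json.dumps(records, indent=2, sort_keys=True)


if __name__ == "__main__":
    # Made-up sample data, not taken from the thread.
    print(rows_to_json(["name", "count"], [["alpha", 1], ["beta", 2]]))
```

The same input always produces the same output, so there is nothing for a second model to cross-check.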
Or Dave could write a first draft of that script, saving him the time needed to translate what the LLM composed.
Does an LLM know math? Not like we do. There’s no deductive logic in there; it’s all statistical inferences from language. An LLM doesn’t “work through” a circuit diagram systematically the way a physics student would. It observes the entire diagram at once, and then guesses the most likely next token.
Hello, fellow tech enthusiasts, just stopping by to announce I performatively can't tell the difference between "Latest big tech product (TM)" and Homo Sapiens Sapiens!!!
I'll be seeing you in the next LLM related message thread with the same exact comment!!! As you were!!!
I look up "how do I sort a list in language X" because I know from school that there IS a defined good way to do it, probably built into the language, and it will be extremely idiomatic, but I haven't used language X in five years and the specifics might have changed and I don't remember the specific punctuation.
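In Python, for instance, the thing that search turns up is the built-in `sorted` (returns a new list) and `list.sort` (sorts in place). A small sketch with made-up data:

```python
words = ["pear", "fig", "banana"]  # made-up sample data

alphabetical = sorted(words)       # new list, default ordering
by_length = sorted(words, key=len) # new list, ordered by a key function

words.sort(reverse=True)           # in place, descending

print(alphabetical)  # ['banana', 'fig', 'pear']
print(by_length)     # ['fig', 'pear', 'banana']
print(words)         # ['pear', 'fig', 'banana']
```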