
55 points by wafflesfreak | 1 comment
swatcoder No.46081759
Where's the performance data?

Anybody can send a PCB description/schematic into an LLM with a prompt suggesting it generate an analysis, and it will diligently produce a document that perceptually resembles an analysis of that PCB. It will do that approximately 100% of the time.

But making an LLM actually deliver a sound, useful, accurate analysis would be quite an accomplishment! Is that really what you've done? How did you know you got it right? How right did you get it?

To sell an analysis tool, I'd expect to see some kind of comparison against other tooling and techniques. General success rate? False negative rate? False positive rate? How does it do against simple schematics vs. large ones? What ICs and components will it recognize, and which will it fail to recognize? Does it throw an error if it encounters something it doesn't recognize? When? Do you have testimonials? Examples?

wafflesfreak No.46083289
Hi! This is a totally fair question, and I appreciate you raising it. Getting reliable performance out of an LLM on something as structured as a schematic is hard, and I don’t want to pretend this is a solved problem or that the tool is infallible.

Benchmarking is tricky right now because there aren’t many true “LLM ERC” systems to compare against. You could compare against traditional ERC, but this tool is meant to complement that workflow, not replace it. For this initial MVP, most of the accuracy work has come from collecting real shipped-board schematics (mine and friends’) with known issues and iterating until the tool consistently detected them. A practical way to evaluate it yourself is to upload designs you already know have issues, along with the relevant datasheets, and see how well it picks them up; a rough sketch of that kind of scoring is below. Additionally, if you have a schematic with known mistakes and are open to sharing it, feel free to reach out through the "contact us" page. Contributions like that are incredibly helpful, and I’d be happy to provide additional free usage in return.
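To make the "known issues" evaluation concrete, here is roughly the kind of per-schematic scoring I do while iterating, sketched in Python. The issue labels and the exact-match rule are hypothetical simplifications; in practice, deciding whether a reported finding corresponds to a known schematic issue takes some human judgment.

    # Hypothetical sketch: score one schematic's ERC report against a
    # hand-labeled list of known issues. Matching is an exact label
    # compare here; real matching is fuzzier and done by hand.
    def score_report(known_issues, reported_findings):
        known = set(known_issues)
        reported = set(reported_findings)
        caught = known & reported    # true positives
        missed = known - reported    # false negatives
        extra = reported - known     # false positives (or new finds to review)
        recall = len(caught) / len(known) if known else 1.0
        precision = len(caught) / len(reported) if reported else 1.0
        return {"recall": recall, "precision": precision,
                "missed": sorted(missed), "extra": sorted(extra)}

    # Example with made-up issue labels:
    print(score_report(
        known_issues=["LDO missing output cap", "I2C pull-ups absent"],
        reported_findings=["I2C pull-ups absent", "unconnected enable pin"],
    ))
    # -> recall 0.5, precision 0.5, missed the LDO cap, one extra finding

Aggregating those numbers across the collected schematics is what gives the detection / false-positive picture asked about above.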

I’ll also be publishing case studies soon with concrete examples: the original schematics, the tool’s output, what it caught (and what it missed), and comparisons against general-purpose chat LLM responses.

The goal isn’t to replace a designer’s judgment, but to surface potential issues that are easy to miss, similar to how AI coding tools flag things you still have to evaluate yourself. Ultimately, the designer decides what’s valid and what isn’t.

I really appreciate the push for rigor, and I’ll follow up once the case studies are live.