Most active commenters
  • timschmidt(5)
  • cyberpunk(3)

←back to thread

378 points todsacerdoti | 13 comments | | HN request time: 0.02s | source | bottom
Show context
mensetmanusman ◴[] No.44984308[source]
Not being a programmer, I have a question.

Can any program be broken down into functions and functions of functions that have inputs and outputs so that they can be verified if they are working?

replies(13): >>44984336 #>>44984354 #>>44984459 #>>44984510 #>>44984549 #>>44984555 #>>44984562 #>>44984865 #>>44984968 #>>44985022 #>>44986774 #>>44988388 #>>45001896 #
1. cyberpunk ◴[] No.44984336[source]
Pretty much, yes. But what i think you’re talking about (formal verification of code) is a bit of a dark art and barely makes it out of very specialised stuff like warhead guidance computers and maybe some medical stuff etc.
replies(2): >>44984505 #>>44988485 #
2. timschmidt ◴[] No.44984505[source]
Most people don't bother with formal verification because it costs extra labor and time. LLMs address both. I've been enjoying working with an LLM on Rust projects, especially for writing tests, which aren't the same as formal verification, but it's in the same ballpark.
replies(2): >>44984837 #>>44984891 #
3. cryptonym ◴[] No.44984837[source]
Vibe-coding tests is nowhere near formal verification.
replies(1): >>44998848 #
4. drdrey ◴[] No.44984891[source]
not even close to being in the same ballpark
replies(1): >>44998877 #
5. chasd00 ◴[] No.44988485[source]
Also, if you're going to formally verify your code then the compiler better have been formally verified. If the compiler has been verified then the ASM better be formally verified and so on all the way down to the actual circuit and clock.

...then a bit flips because of a stray high energy particle or someone trips over the metaphorical power cord and it all crashes anyway.

replies(1): >>45010568 #
6. timschmidt ◴[] No.44998848{3}[source]
When typewriters spread in the late 19th century, clerks who used them were sometimes called “mechanical scribblers” or accused of doing “machine work” rather than proper clerical labor.

When adding machines and calculators appeared in offices, detractors claimed they would weaken the mind. In the mid-20th century, some educators derided calculator users as “button pushers” rather than “real mathematicians.”

In the 1980s, early adopters of personal computers and word processors were sometimes called “typists with toys.” Secretaries who mastered word processors were sometimes derided as “not real secretaries” because they lacked shorthand or dictation skills.

Architects and engineers who switched from drafting tables to CAD in the 1970s–80s faced accusations that CAD work was “cookie-cutter” and lacked craftsmanship. Traditional draftsmen argued that “real” design required hand drawing, while CAD users were seen as letting the machine think for them.

Across history, the insults usually follow the same structure:

- Suggesting the new tool makes the work too easy, therefore less valuable.

- Positioning users as “operators” rather than “thinkers.”

- Romanticizing the older skill as more “authentic” or “serious.”

replies(1): >>45024319 #
7. timschmidt ◴[] No.44998877{3}[source]
Let me know when you find a memory error or concurrency problem, or really any undefined behavior in any of my code.
replies(1): >>45001919 #
8. Disposal8433 ◴[] No.45001919{4}[source]
> https://github.com/timschmidt/repsnapper/blob/master/src/sha...

That variable is undefined in multiple constructors. Also your code cannot compile in various scenarios. Have a nice day.

replies(1): >>45068888 #
9. cyberpunk ◴[] No.45010568[source]
Actually I knew someone who worked on special compilers for embedded stuff for the military which would only emit code which uses specific asm operations which had been verified on specific cpus.

So it’s really not that far fetched.

10. cryptonym ◴[] No.45024319{4}[source]
Few generated unit tests doesn't replace formal verification. That's just not the same thing at all, it's not a matter of manual vs automated calculator.
replies(1): >>45079128 #
11. cyberpunk ◴[] No.45068888{5}[source]
It's been 5 days and I'm still laughing about this one. Bravo.
replies(1): >>45079081 #
12. timschmidt ◴[] No.45079081{6}[source]
I'm laughing too, that code is 13 years old, committed by hurzl here: https://github.com/timschmidt/repsnapper/commit/5410b5d278a5... and in C++. To be clear, I'd handed off maintainership at least a decade ago. I haven't touched this software in many years. Just noticed the reply.

So this doesn't seem relevant to the conversation about LLMs, Rust, and software quality improvement methods from strict typing to formal verification. It's like a "gotcha!" that didn't land. Sorry.

Please, find some bugs in a project I've touched in the last few years! Looking for things to fix. Please open a github issue from an account linked to your projects next time so I can return the favor :D The bugs are in there, I know they are, but with an LLM and a bit of time to review it's work, it now costs me a few minutes tops to write a test to exclude future cases of the same problem or class of problems. Which is a level of verification beyond what's been allocated time in previous projects, where the tests never got written or very infrequently.

Repsnapper's a great example of that. We didn't have a standardized testing framework we were using across the dozen or so libraries we'd integrated into the app. The original author Kulitorum sort of copied and pasted bits and pieces of code together to write the app originally, without much concern for licenses or origin tracking, so I initiated a line-by-line audit and license verification for the codebase in order to qualify it for inclusion in Fedora and Debian to make 3D printing easier and more available as there were no printing tools shipping in a distro at the time. Integrating new libraries into that codebase was unpleasant, working with the build system in general was not my favorite. Lots of room to screw it up, but it has to be just right to work. I think having llms and a testing framework would have allowed us to evolve the Repsnapper code a lot more safely, and a lot further than we ever managed.

Well, and I can say that pretty safely now that I'm in the process of doing just that with https://github.com/timschmidt/alumina-firmware and https://github.com/timschmidt/alumina-ui and https://github.com/timschmidt/csgrs

They're all still very young, still some things stubbed out, code is gross pending revision and cleanup, but it's basically Repsnapper 3.0 in Rust but this version includes CAD and a motion control firmware and fits in < 4mb. Among them I already have hundreds of tests entirely absent from Repsnapper. Couldn't have written csgrs without them. Probably a lot of redundant tests at this point, as things have changed rapidly. I'm only about 6mo of effort in.

13. timschmidt ◴[] No.45079128{5}[source]
> Few generated unit tests doesn't replace formal verification.

That's a claim you're making for the first time, here. Not one I've made. Go ahead and re-read my comments to verify.