You can also see this effect in the article on the history of Coverity's analyzer: real-world code was far messier to deal with than the academic examples they started with.
https://cacm.acm.org/research/a-few-billion-lines-of-code-la...
The problem with transpiling C to Rust is that unsafe and unverified behavior can be a key property of the behavior of the resulting program, so there isn't an obvious way to spit out a sort of rustified (CRustified?) binary that matches the behavior of the C program.
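For a toy illustration (my example, not from the article): even a C idiom as common as type punning has no direct safe-Rust spelling, so a faithful translation gets pushed into unsafe:

    // Hypothetical illustration: a literal translation of the C pun
    // *(uint32_t*)&x needs `unsafe` in Rust.
    fn f32_bits(x: f32) -> u32 {
        unsafe { std::mem::transmute::<f32, u32>(x) }
    }

    fn main() {
        // Safe Rust does have f32::to_bits for this exact case, but
        // discovering such idiomatic rewrites automatically is the hard part.
        assert_eq!(f32_bits(1.0), 1.0f32.to_bits());
    }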
I'm not sure what it has to do with translating the code to Rust.
> The coercions introduced by conversion rules can however lead to subtle semantic differences
The example they give is this C code:
1 uint8_t x[1] = { 0 };
2 uint8_t *y = x;
3 *y = 1;
4 assert(*x == 1); /* SUCCESS */
getting translated to this (safe) Rust code:

1 let x: [u8; 1] = [0; 1];
2 let mut y: Box<[u8]> = Box::new(x);
3 y[0] = 1;
4 assert!(x[0] == 1) /* failure */
So the pointer (iterator) targeting an existing (stack-allocated) array declared on line 2 gets translated to an owning pointer (Box) targeting a (heap-allocated) new copy of the array. So if the original code was somehow counting on the fact that the pointer/iterator was actually targeting the array it was assigned to, the translated code may (quietly) not behave correctly.

For comparison, the scpptool (my project) auto-translation (to a memory-safe subset of C++) feature would translate it to something like:
1 mse::lh::TNativeArrayReplacement<uint8_t, 1> x = { 0 };
2 mse::lh::TNativeArrayReplacement<uint8_t, 1>::iterator y = x; // implicit conversion from array to iterator
3 *y = 1;
4 assert(*x == 1); /* SUCCESS */ // dereferencing of array supported for compatibility
or if y is subsequently retargeted at another type of array, then line 2 may end up as something like:

2 mse::TAnyRandomAccessIterator<uint8_t> y = x; // implicit conversion from array to iterator
So the OP project may only be converting C code that is already amenable to being converted to safe Rust. But given the challenge of the problem, I can respect the accomplishment and see some potential utility in it.

edit: added translation for line 2 in an alternate hypothetical situation.
Formally verified C programs typically have had all their undefined behaviour already stamped out during the verification process. That makes mapping it over to Rust a lot simpler.
The first is that the translation function (from C to Rust) has itself been formally verified. In other words: if you give it a C program that obeys certain necessary pre-conditions, it will give you a safe Rust program. I believe the title of the paper is primarily using "Formalized" with regard to this characteristic.
The second is that the set of programs the authors evaluate their tool on are C programs that have themselves been formally verified. I only barely skimmed the introduction and didn't see it addressed directly there, but I would assume that this is because the pre-conditions necessary for their translation to work are most easily (only?) met by formally verified C programs, where of course the verification has been carried out with respect to properties that would be relevant here (e.g., memory use).
First of all, they never translated any C! At all. Zero lines.
They took code written in F* and modified its C compiler to emit Rust. They never had to deal with any actual C code of any complexity, aside from the most trivial code that might theoretically be emitted by a toy compiler (but wasn't even that!). They just pretended they were dealing with C (or a heavily restricted Mini-C).
Even if we accept this, there are plenty of arbitrary restrictions on the C it can even support in principle:
> If the original C program further relies on x, our translation will error out, and will ask the programmer to fix their source code.
That's saying you want C to already be written in a style that makes the Rust borrow checker happy! They're avoiding the actual hard parts of the problem.
> These rules present simplified versions of what we support. Our implementation features several peephole optimizations ...
If I may translate from the language of academia: "We present beautiful rules in figure 4. But in reality, our implementation relies on a large number of hacks."
But it gets worse!
> For overlap cases that can be distinguished statically (as above), we emit a compile-time error; otherwise, the Rust code will panic at runtime.
The resulting code is not guaranteed to be correct! For example, aliasing in C can cause crashes in the resulting Rust code. See the comment at the top of page 10. We're going from a formally verified C program to an "it may crash now" Rust program!? That's nuts!
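To make that concrete, here's a minimal sketch of the failure mode (my illustration, not the paper's actual generated code): when overlap can't be ruled out statically, dynamic borrow tracking turns aliasing that is perfectly legal in C into a runtime panic:

    use std::cell::RefCell;

    fn main() {
        let buf = RefCell::new([0u8; 4]);
        let _a = buf.borrow_mut();
        // A second live mutable alias is unremarkable in C, but here it
        // panics at runtime ("already borrowed: BorrowMutError").
        let _b = buf.borrow_mut();
    }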
> We apply our methodology to existing formally verified C codebases, namely, the HACL* cryptographic library ...
This is such a blatant lie that I had to return to it. It's designed to catch an unwary reviewer. We often talk about HACL* being verified C because it gets compiled to that. But it's not a C library; it's written in a totally different language. You cannot confuse the two.
I'm not a reviewer for their paper. But if I were, I would strongly fight for rejection.
The fact that they only handle formally verified C is so astronomically far away from being their main problem.
An honest title would be "Compiling a subset of F* to partially Safe Rust, Partially Formalized"
In this case it's not about Frama-C or similar tools though, see your sibling comments about the caveats.
Sadly that is the name of the game in conference publishing. Make a big claim in the title and hope the reviewer does not read the fine print.
> If I may translate from the language of academia: "We present beautiful rules in figure 4. But in reality, our implementation relies on a large number of hacks."
Er, this I can understand. Every conference paper only presents you with a simplified view of the contributions. There is no way to describe every corner case within the page limits of a conference paper (specifically, I bet this was submitted to POPL 2025).
In my group I don't allow hacks. Most other good papers I can think of don't do that. It takes new students a while to get used to not maximizing performance at all costs. And if a hack must exist, then it's explained, not as a hack, but as part of the method.
Don't you hate it when you implement some beautiful idea from a paper only to discover that it hardly bears any relationship to the actual method that works?
> There is no way to describe every corner case within the page limits of a conference paper (specifically, I bet this was submitted to POPL 2025).
You're probably right. But POPL allows for unlimited pages in an attached appendix. They can and should describe the full method. Particularly if they're going to play the game of saying that they formalized this.
> Sadly that is the name of the game in conference publishing. Make a big claim in the title and hope the reviewer does not read the fine print.
It's a form of namespace squatting. Do the worst job possible to claim a title that you couldn't execute on, so that when people figure out that problem, they will be forced to cite your non-working solution. I loathe this approach to publishing.
We should punish people for doing this: reject for lack of honesty, and bar the authors from publishing in related conferences for X years.
It frustrates me more than it should, I admit, that people always mention Rust when they talk about safety, but never Ada / SPARK. You want formal verification? Use Ada / SPARK. It has been battle-tested. It has been used for critical systems for a really long time now.
(And a compiler being formally verified vs. being able to write formally verified code means two different things.)
For example if someone on GitHub sees that the project is written in Rust, they are automatically going to assume it is safe, incorrectly so. I do not blame them though.
The concept of the borrow checker has been formalized on a simplified version of Rust: https://people.mpi-sws.org/~dreyer/papers/rustbelt/paper.pdf - work has continued in this area steadily (e.g. see Tree Borrows)
There's a variety of tools that take rust code and translate it to something a proof system understands, and then checks that it matches a specification. AWS is leading a project to use these to verify the standard library: https://aws.amazon.com/blogs/opensource/verify-the-safety-of...
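For a flavor of what those tools look like, here's a minimal proof harness in the style of Kani (the AWS-developed model checker used in that effort); rather than sampling inputs like a test, it checks the assertion for every input satisfying the assumption:

    fn midpoint(a: u32, b: u32) -> u32 {
        a + (b - a) / 2 // avoids the classic (a + b) / 2 overflow
    }

    #[cfg(kani)]
    #[kani::proof]
    fn midpoint_in_range() {
        let a: u32 = kani::any();
        let b: u32 = kani::any();
        kani::assume(a <= b);
        let m = midpoint(a, b);
        assert!(a <= m && m <= b); // verified for all a <= b, not sampled
    }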
Alternatively this might be seen as a stepping stone to translating non-formally-verified C to rust, which I understand the US government has expressed a fair bit of interest in.
That seems excessive and tedious.
Concretely: spatial and temporal memory safety are good things, and Rust achieves both. It’s not unique in this regard, nor is it unique in not having a formal definition.
https://stackoverflow.com/questions/75743030/is-there-a-spec...
Sorry if my previous comment came off as dismissive, it's just that I'm getting increasingly disillusioned with the state of things in this space.
It also strikes me as extraordinarily unlikely that any formal verification effort will use the existing specification, and not build their own (using their own formal language) as they go.
Then the point is hypocritical.
Runtimes for safe programming languages have been implemented in unsafe languages since the dawn of safe programming languages, basically.
EDIT: I see now that you are the cigarette warning guy. In that case I don’t understand what this coy “I think” speculation is about when you made such a bizarre proclamation on this topic.
> the translated code may (quietly) not behave correctly.
The whole point of them showing that example is that they say they catch this case and bring it to the attention of the programmer:

> If the original C program further relies on x, our translation will error out, and will ask the programmer to fix their source code. This is another area where we adopt a "semi-active" approach to verification, and declare that some patterns are poor enough, even for C, that they ought to be touched up before the translation takes place.
No, only that it complies with a formal specification. The code will have bugs if and only if the specification has bugs.
The difficulty of writing a bug-free formal specification is just as great as writing bug-free code. In some domains it's easier to write the specification, but not in most. Fully specifying what a program should do is often the hardest part of programming, after all.
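The textbook toy example of a buggy spec (mine, not the parent's): a "sort" specification that requires sortedness but forgets to require that the output be a permutation of the input is satisfied by code that is obviously wrong:

    // "Verified" against the spec "output is sorted" -- which a function
    // that throws its input away satisfies vacuously.
    fn bogus_sort(_input: &[i32]) -> Vec<i32> {
        Vec::new()
    }

    fn is_sorted(v: &[i32]) -> bool {
        v.windows(2).all(|w| w[0] <= w[1])
    }

    fn main() {
        assert!(is_sorted(&bogus_sort(&[3, 1, 2]))); // spec satisfied; program useless
    }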
Formal verification is verifying a mathematical model of the program correct, according to certain mathematical correctness properties. That model may not actually represent the real world, or the correctness properties may not actually be the properties that you want to guarantee. Famously, there was the formally verified Java LinkedList implementation that was buggy due to the formal model not taking into account integer overflow.
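A hedged sketch of that failure mode in Rust terms (the actual bug was in Java, in the list's size accounting): the model treats an integer as unbounded, while the implementation uses a fixed-width type:

    fn main() {
        // The idealized model says size only ever grows; the machine
        // integer wraps. (wrapping_add makes the wrap explicit here; a
        // plain `+` would panic in debug builds and wrap in release.)
        let size: i32 = i32::MAX;
        let size = size.wrapping_add(1);
        assert_eq!(size, i32::MIN); // a negative size, which the model ruled out
    }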
There are parts of the Rust language and standard library that have been formally verified. That helps provide confidence that the basic model, of having safety properties upheld by unsafe core functions, is sound.
But the big advantage of Rust, especially for everyday open source code that the world widely depends on, is that it strikes a good balance between a language that people actually want to use, and which can provide good practical safety guarantees.
Rust is far, far safer than C and C++. Most other memory-safe languages also are, but trade that off for lower performance, while Rust provides a similar level of performance as C and C++. And additionally, it's a language that people actually want to use for writing a lot of the core infrastructure.
As far as I know, I have never used any open source software written in Ada/SPARK. People just don't write it. The only reason they generally do is if they need to for functional-safety reasons, in a high-criticality system, and they find it easier to get their software certified by writing it that way.
And even in safety-critical systems, the gold standard of verification is usually considered to be a restrictive subset of C, static checkers to enforce its rules, code review, and extensive testing by independent teams on real hardware, with requirements-based tests that achieve 100% coverage at the MC/DC level.
Formal verification can not possibly provide some of the safety guarantees that testing on real hardware based on actual system level requirements can provide.
Formal verification is one tool in a toolbox. It is useful for providing certain guarantees. But it's less important than many other factors for the actual improvements in safety that Rust can provide; by actually being used, in critical software that is exposed to potential network based exploitation, a memory-safe language like Rust makes a much better foundation than C or C++, and more people will actually use it in such a context than Ada/SPARK.
The point of noting that the example translation quietly does the wrong thing is that that is the reason it would have to be ("unconditionally") rejected.
While the paper does suggest that their example translation would be rejected:
> If the original C program further relies on x, our translation will error out
note that precisely determining whether or not the program "further relies on x" statically (at compile/translation-time) is, in general, a Halting Problem (i.e., it cannot be reliably decided with finite compute resources). So they would presumably have to be conservative and reject any cases where they cannot prove that the program does not "further rely on x". So it's notable that they chose a (provisional) translation that has to be rejected in a significant set of false-positive cases.
And at least on initial consideration, it seems to me that an alternative translation could have, for example, used RefCell<>s or whatever and avoided the possibility of "quietly doing the wrong thing". (And thus, depending on your/their requirements, avoided the need for unconditional rejection.) Now, one might be in a situation where they'd want to avoid the run-time overhead and/or potential unreliability of RefCell<>s, but even then it seems to me that their translation choice does not technically avoid either of those things. Their solution allocates on the heap, which has at least some theoretical run-time overhead, and could theoretically fail/panic.
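Something like this, say (my sketch of the suggested alternative, not anything from the paper):

    use std::cell::RefCell;
    use std::rc::Rc;

    fn main() {
        // Share the array through Rc<RefCell<...>> so the translated
        // "pointer" observably aliases the original array, as in the C code.
        let x: Rc<RefCell<[u8; 1]>> = Rc::new(RefCell::new([0; 1]));
        let y = Rc::clone(&x); // the translated pointer aliases x
        y.borrow_mut()[0] = 1;
        assert!(x.borrow()[0] == 1); // SUCCESS, matching the C semantics
    }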
Now I'm not concluding here that their choice is not the right one for their undertaking. I'm just suggesting that choosing a (provisional) translation that has to be rejected with significant false positives (because it might quietly do the wrong thing) is at least initially notable. And that there are other solutions out there that demonstrate translation of C to a (high-performance, deterministic) memory-safe language/dialect that don't have the same limitations.
Maybe you should wear a sign on your torso: will make irrational points and then complain about persecution when that is pointed out. I don’t know. It’s just one more idea for when we decide to put all the anti-Rust people in camps.