←back to thread

55 points anqurvanillapy | 5 comments | | HN request time: 1.896s | source

Hi I'm Anqur, a senior software engineer with different backgrounds where development in C was often an important part of my work. E.g.

1) Game: A Chinese/Vietnam game with C/C++ for making server/client, Lua for scripting [1]. 2) Embedded systems: Switch/router with network stack all written in C [2]. 3) (Networked) file system: Ceph FS client, which is a kernel module. [3]

(I left some unnecessary details in links, but are true projects I used to work on.)

Recently, there's a hot topic about Rust and C in kernel and a message [4] just draws my attention, where it talks about the "Rust" experiment in kernel development:

> I'd like to understand what the goal of this Rust "experiment" is: If we want to fix existing issues with memory safety we need to do that for existing code and find ways to retrofit it.

So for many years, I keep thinking about having a new C dialect for retrofitting the problems, but of C itself.

Sometimes big systems and software (e.g. OS, browsers, databases) could be made entirely in different languages like C++, Rust, D, Zig, etc. But typically, like I slightly mentioned above, making a good filesystem client requires one to write kernel modules (i.e. to provide a VFS implementation. I do know FUSE, but I believe it's better if one could use VFS directly), it's not always feasible to switch languages.

And I still love C, for its unique "bare-bone" experience:

1) Just talk to the platform, almost all the platforms speak C. Nothing like Rust's PAL (platform-agnostic layer) is needed. 2) Just talk to other languages, C is the lingua franca (except Go needs no libc by default). Not to mention if I want WebAssembly to talk to Rust, `extern "C"` is need in Rust code. 3) Just a libc, widely available, write my own data structures carefully. Since usually one is writing some critical components of a bigger system in C, it's just okay there are not many choices of existing libraries to use. 4) I don't need an over-generalized generics functionality, use of generics is quite limited.

So unlike a few `unsafe` in a safe Rust, I want something like a few "safe" in an ambient "unsafe" C dialect. But I'm not saying "unsafe" is good or bad, I'm saying that "don't talk about unsafe vs safe", it's C itself, you wouldn't say anything is "safe" or "unsafe" in C.

Actually I'm also an expert on implementing advanced type systems, some of my works include:

1) A row-polymorphic JavaScript dialect [5]. 2) A tiny theorem prover with Lean 4 syntax in less than 1K LOC [6]. 3) A Rust dialect with reuse analysis [7].

Language features like generics, compile-time eval, trait/typeclass, bidirectional typechecking are trivial for me, I successfully implemented them above.

For the retrofitted C, these features initially come to my mind:

1) Code generation directly to C, no LLVM IR, no machine code. 2) Module, like C++20 module, to eliminate use of headers. 3) Compile-time eval, type-level computation, like `malloc(int)` is actually a thing. 4) Tactics-like metaprogramming to generate definitions, acting like type-safe macros. 5) Quantitative types [8] to track the use of resources (pointers, FDs). The typechecker tells the user how to insert `free` in all possible positions, don't do anything like RAII. 6) Limited lifetime checking, but some people tells me lifetime is not needed in such a language.

Any further insights? Shall I kickstart such project? Please I need your ideas very much.

[1]: https://vi.wikipedia.org/wiki/V%C3%B5_L%C3%A2m_Truy%E1%BB%81...

[2]: https://e.huawei.com/en/products/optical-access/ma5800

[3]: https://docs.ceph.com/en/reef/cephfs/

[4]: https://lore.kernel.org/rust-for-linux/Z7SwcnUzjZYfuJ4-@infr...

[5]: https://github.com/rowscript/rowscript

[6]: https://github.com/anqurvanillapy/TinyLean

[7]: https://github.com/SchrodingerZhu/reussir-lang

[8]: https://bentnib.org/quantitative-type-theory.html

Show context
ryao ◴[] No.43176221[source]
Here is a sound static analyzer that can identify all memory safety bugs in C/C++ code, among other kinds of bugs:

https://www.absint.com/astree/index.htm

You can use it to produce code that is semi-formally verified to be safe, with no need for extensions. It is used in the aviation and nuclear industries. Given that it is used only by industries where reliability is so important that money is no object, I never bothered to ask them how much it costs. Few people outside of those industries knows that it exists. It is a shame that the open source alternatives only support subsets of what it supports. The computing industry is largely focused on unsound approaches that are easier to do, but do not catch all issues.

If you want extensions, here is a version of C that relies on hardware features to detect pointer dereferences to the wrong places through capabilities:

https://github.com/CTSRD-CHERI/cheri-c-programming

It requires special CHERI hardware, although the hardware does exist.

replies(1): >>43178287 #
1. AlotOfReading ◴[] No.43178287[source]
Astree is a pain in the butt. Even if it were free, I'd recommend it to very few people. It's not usable without someone (often a team) being responsible for it full time.

TrustInSoft is the higher quality option, polyspace is the more popular option, and IKOS is probably the best open source option. I've also had luck with tools from Galois Inc and the increasingly dated rv-match tool.

replies(2): >>43179970 #>>43181627 #
2. ryao ◴[] No.43179970[source]
Tell me more.
3. edwcross ◴[] No.43181627[source]
Funny you mentioned TrustInSoft but not its open-source origin (which is still under development and evolving in a different direction), Frama-C. I cannot compare it to IKOS, but their usefulness depends a lot on the type of code and verification needs.
replies(1): >>43184455 #
4. AlotOfReading ◴[] No.43184455[source]
I want to like Frama-C, but I've never managed to actually use it successfully on real projects and the repeated experiences have soured me on it. Getting a recent version installed was a serious chore last time I tried, and no one else is willing to deal with ACSL to get the full value.
replies(1): >>43205789 #
5. KsassPeuk ◴[] No.43205789{3}[source]
At some point in the past you also posted the ACSL cast badly documented. I asked you what you needed exactly, but you probably missed the message. So let me try again:

What kind of documentation would you expect in addition to ACSL by Example and the WP tutorial?