
597 points pizlonator | 2 comments
pcfwik ◴[] No.45134476[source]
Given the goal is to work with existing C programs (which already have free(...) calls "carefully" placed), and you're already keeping separate bounds info for every pointer, I wonder why you chose to go with a full GC rather than lock-and-key style temporal checking[1]? The latter would make memory usage more predictable and avoid the performance overhead and scheduling headaches of a GC.

Perhaps storing the key would take too much space, or checking it would take too much time, or storing it would cause race condition issues in a multithreaded setting?

[1] https://acg.cis.upenn.edu/papers/ismm10_cets.pdf
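For readers unfamiliar with the lock-and-key idea referenced above, here is a minimal sketch in C. It is a toy model, not the CETS implementation and not anything Fil-C does: every name (`lk_ptr`, `lk_malloc`, `lk_free`, `lk_is_live`, `lk_demo`) is hypothetical, lock slots come from a fixed table and are never reused, and nothing here is thread-safe. Each allocation gets a unique key and a lock slot holding that key, every pointer carries both, and `free` zeroes the slot so that any surviving alias fails the check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical toy model of lock-and-key temporal checking. */
#define LK_MAX_LOCKS 1024

static uint64_t lk_locks[LK_MAX_LOCKS]; /* lock slots; 0 means "dead" */
static size_t lk_next_slot = 0;         /* slots are never reused in this toy */
static uint64_t lk_next_key = 1;        /* keys are unique and never 0 */

/* A checked pointer: the raw pointer plus two words of temporal metadata. */
typedef struct {
    void *ptr;      /* raw pointer to the allocation */
    uint64_t key;   /* key assigned when the allocation was created */
    uint64_t *lock; /* slot that holds the key while the allocation lives */
} lk_ptr;

/* Allocate memory and pair it with a fresh key and lock slot. */
lk_ptr lk_malloc(size_t size) {
    lk_ptr p = { NULL, 0, NULL };
    if (lk_next_slot >= LK_MAX_LOCKS) return p; /* out of lock slots */
    void *raw = malloc(size);
    if (raw == NULL) return p;
    p.ptr = raw;
    p.key = lk_next_key++;
    p.lock = &lk_locks[lk_next_slot++];
    *p.lock = p.key;
    return p;
}

/* The temporal check: the pointer is usable only if key and lock match. */
int lk_is_live(lk_ptr p) {
    return p.lock != NULL && *p.lock == p.key;
}

/* Free the allocation and invalidate every alias by zeroing the lock.
   Returns -1 on a dangling or double free instead of touching the heap. */
int lk_free(lk_ptr p) {
    if (!lk_is_live(p)) return -1;
    *p.lock = 0;
    free(p.ptr);
    return 0;
}

/* Usage: an alias of a freed pointer is detected as dead. */
void lk_demo(void) {
    lk_ptr a = lk_malloc(16);
    lk_ptr alias = a;
    assert(lk_is_live(alias));
    assert(lk_free(a) == 0);
    assert(!lk_is_live(alias));
}
```

The sketch makes the costs raised in the question visible: every pointer grows by two words, every dereference needs an extra load and compare, and the key/lock pair would have to be read and written consistently under concurrency.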

replies(2): >>45134484 #>>45134487 #
pcfwik ◴[] No.45134487[source]
Also find it interesting that you're allowing out-of-bounds pointer arithmetic as long as no dereference happens, which is a class of UB compilers have been known to exploit ( https://stackoverflow.com/questions/23683029/is-gccs-option-... ). Do you disable such optimizations inside LLVM, or does Fil-C avoid this entirely by breaking pointers into pointer base + integer offset (in which case I wonder if you're missing out on any optimizations that work specifically on pointers)?
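To make the "pointer base + integer offset" option in the question concrete, here is a small sketch in C. It is only an illustration of the idea, not Fil-C's actual pointer representation; the names `fat_ptr`, `fat_add`, and `fat_load` are hypothetical. The offset is an unsigned integer, so moving it out of bounds is ordinary modular arithmetic with no undefined behavior, and the bounds check happens only when the pointer is dereferenced.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "base + integer offset" pointer; not Fil-C's real layout. */
typedef struct {
    unsigned char *base; /* start of the allocation */
    size_t size;         /* allocation size in bytes */
    size_t offset;       /* byte offset; may be out of bounds, wraps mod 2^N */
} fat_ptr;

/* Pointer arithmetic touches only the integer offset. A negative delta
   converts to a large unsigned value, so the addition wraps and is
   well defined even when the result is out of bounds. */
fat_ptr fat_add(fat_ptr p, ptrdiff_t delta) {
    p.offset += (size_t)delta;
    return p;
}

/* Dereference: the only place bounds are checked.
   Returns 0 and stores the byte on success, -1 if out of bounds. */
int fat_load(fat_ptr p, unsigned char *out) {
    if (p.offset >= p.size) return -1; /* also catches "negative" offsets */
    *out = p.base[p.offset];
    return 0;
}
```

For example, starting from offset 0 in a 4-byte buffer, `fat_add(p, 7)` yields a pointer that fails `fat_load`, and a further `fat_add(q, -5)` brings it back to offset 2, where the load succeeds. A real compiler pass would also need to decide which pointer-specific optimizations still apply to such a representation, which is the trade-off the question asks about.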
replies(1): >>45134511 #
pizlonator ◴[] No.45134511[source]
For starters, LLVM is a lot less willing to exploit that UB.

It’s also weird that GCC gets away with this at all, as many C programs in Linux that compile with GCC make deliberate use of out-of-bounds pointers.

But yeah, if you look at my patch to llvm, you’ll find that:

- I run a highly curated opt pipeline before instrumentation happens.

- FilPizlonator drops flags in LLVM IR that would have permitted downstream passes to perform UB driven optimizations.

- I made some surgical changes to clang CodeGen and some LLVM passes to fix some obvious issues from UB.

But also let’s consider what would happen if I hadn’t done any of that except for dropping UB flags in FilPizlonator. In that case, a pass before pizlonation would have done some optimization. At worst, that optimization would be a logic error or it would induce a Fil-C panic. FilPizlonator strongly limits UB to its “memory safe subset” by construction.

I call this the GIMSO property (garbage in, memory safety out).

replies(2): >>45134754 #>>45134818 #
1. kartoffelsaft ◴[] No.45134754[source]
Not knowing the exact language used by the C standard, I suspect the reason GCC doesn't cause these issues with most programs is that the wording of "array object" refers specifically to arrays with compile-time-known sizes, i.e. `int arr[4]`. Most programs that do out of bounds pointer arithmetic are doing so with pointers from malloc/mmap/similar, which might have similar semantics to arrays but are not arrays.
replies(1): >>45134764 #
2. pizlonator ◴[] No.45134764[source]
Yes, I think you're right.