←back to thread

597 points pizlonator | 2 comments | | HN request time: 0.001s | source
Show context
kragen ◴[] No.45135095[source]
Hmm, Fil-C seems potentially really important; there's a lot of software that only exists in the form of C code which it's important to preserve access to, even if the tradeoffs made by conventional C compilers (accepting large risks of security problems in exchange for a small improvement in single-core performance) have largely become obsolete.

The list of supported software is astounding: CPython, SQLite, OpenSSH, ICU, CMake, Perl5, and Bash, for example. There are a lot of things in that list that nobody is likely to ever rewrite in Rust.

I wonder if it's feasible to use Fil-C to do multitasking between mutually untrusted processes on a computer without an MMU? They're making all the right noises about capability security and nonblocking synchronization and whatnot.

Does anyone have experience using it in practice? I see that https://news.ycombinator.com/item?id=45134852 reports a 4× slowdown or better.

The name is hilarious. Feelthay! Feelthay!

replies(8): >>45135151 #>>45135324 #>>45135967 #>>45137459 #>>45139406 #>>45139586 #>>45139998 #>>45140957 #
pizlonator ◴[] No.45135151[source]
> I wonder if it's feasible to use Fil-C to do multitasking between mutually untrusted processes on a computer without an MMU?

You could. That said, FUGC’s guts rely on OS features that in turn rely on an MMU.

But you could make a version of FUGC that has no such dependency.

As for perf - 4x is the worst case and that number is out there because I reported it. And I report worst case perf because that’s how obsessive I am about realistically measuring, and then fanatically resolving, perf issues

Fact is, I can live on the Fil-C versions of a lot of my favorite software and not tell the difference

replies(5): >>45135179 #>>45135182 #>>45135319 #>>45136221 #>>45136740 #
willvarfar ◴[] No.45135319[source]
When you run the Fil-C versions of your favourite software, does it have a sanitizer mode that reports bugs like missing free() etc? And have you found any bugs this way?
replies(1): >>45135331 #
pizlonator ◴[] No.45135331[source]
Well missing free is just swallowed by the GC - the leak gets fixed without any message.

I have found bugs in the software that I’ve ported, yeah.

replies(2): >>45135508 #>>45135860 #
writebetterc ◴[] No.45135860[source]
To add on top of this: This is a tracing GC. It only ever visits the live data, not the dead data. In other words, it would need a lot more special support if it wanted to report the dead objects.
replies(2): >>45135872 #>>45136163 #
kragen ◴[] No.45135872[source]
Really? How does a non-moving GC make dead objects available for reallocation without visiting them?
replies(2): >>45136460 #>>45136527 #
torginus ◴[] No.45136460[source]
Why would it need to visit them? It just marks the address ranges as available in its internal bookkeeping (bitmaps etc).
replies(1): >>45136611 #
kragen ◴[] No.45136611[source]
In the general case there are as many newly available address ranges as dead objects, so that counts as visiting them in this context.
replies(2): >>45136831 #>>45136911 #
torginus ◴[] No.45136831[source]
I don't think that's a definition of 'visit' most people would agree with.

I'm actually working on my own language that has a non-moving GC. It uses size classes (so 16 byte objects, 32 byte objects etc.), each of which is allocated in a continous slab of memory. Occupancy is determined by a bitmap, 1 bit for each slot in the slab.

The GC constructs a liveness bitmap for the size class, and the results are ANDed together, 'freeing' the memory. If you fill the slab with dead objects, then run the GC, it will not walk anywhere on this slab, create an all zero liveness bitmap, and free the memory.

replies(1): >>45137165 #
kragen ◴[] No.45137165[source]
That's an awesome project! Is your GC generational despite being non-moving? What are your main objectives for the project?

The liveness bitmap approach is pretty widespread at this point; jemalloc works the same way IIRC.

Still, I think that counts as "visiting" in the context of this discussion: https://news.ycombinator.com/item?id=45137139

replies(2): >>45137286 #>>45137528 #
writebetterc ◴[] No.45137286[source]
I don't think it counts as visiting, as you never look at the dirtied bitmap during GC, only during allocation. That means, you don't actually know if a dirty bit represents a different object or not (if a 16-byte size class is allowed to have 32-byte objs in it, for example). To know that you'd either have to have strict size classes, or you'd have to have object headers for specifying the start of an object.

I agree that it's easy to add in a visitation pass, where you take the bitmap of live objects after marking and diff it with the currently existing one in order to signal that you might have a leak.

So basically, I think we're like 99% in agreement.

replies(1): >>45137427 #
kragen ◴[] No.45137427[source]
It's always nice when the impact of collision of opposing opinions gives rise to the spark of mutual understanding rather than merely inflaming base emotions.

Typically bitmap-based allocators don't actually allow a 16-byte size class to have 32-byte objects in it, but I haven't looked at FUGC to see if that's true of it.

replies(2): >>45137719 #>>45138473 #
1. torginus ◴[] No.45137719[source]
I toyed with the idea of allowing this, in bitmaps, it's pretty easy and efficient to find contiguous areas with bit twiddling hacks, for example

//assume free map is the bitmap where 1 means free

uint32_t free_map;

uint32_t free_map_2 = (free_map & (free_map >> 1)); // so on and so forth

I haven't really done anything like this yet, it has certain disadvantages, but you can pack multiple size classes into the same bitmap, you do a bit more work during alloc and resolving interior pointers is a bit more costly (if you have those), in exchange for having less size classes.

replies(1): >>45137923 #
2. kragen ◴[] No.45137923[source]
Sure, to find contiguous chunks of 6 slots within a single word, you can do

    t &= t << 1;
    t &= t << 2;
    t &= t << 2;
and that sort of thing is pretty appealing, but you lose the ability to know what size an object is just by looking at an address, and it's still a lot slower than scanning for an open slot in a page of 5× bigger objects.

Should I assume from your use of uint32_t that you're targeting embedded ARM microcontrollers?