597 points | pizlonator | 2 comments
crawshaw No.45134578
It is great that Fil-C exists. This is the sort of technique that is very effective for real programs, but that developers are convinced does not work. Existence proofs cut through long circular arguments.
replies(2): >>45134840 #>>45135366 #
johncolanduoni No.45134840
What do the benchmarks look like? My main concern with this approach would be that the performance envelope would eliminate it for the use-cases where C/C++ are still popular. If throughput/latency/footprint are too similar to using Go or what have you, there end up being far fewer situations in which you would reach for it.
replies(1): >>45134852 #
pizlonator No.45134852
Some programs run as fast as normally. That's admittedly not super common, but it happens.

Some programs have a ~4x slowdown. That's also not super common, but it happens.

Most programs are somewhere in the middle.

> for the use-cases where C/C++ are still popular

This is a myth. 99% of the C/C++ code you are using right now is not perf sensitive. It's written in C or C++ because:

- That's what it was originally written in and nobody bothered to write a better version in any other language.

- The code depends on a C/C++ library and no high-quality binding for that library exists in any other language, which forces the dev to write C/C++.

- C/C++ provides the best level of abstraction (memory and syscalls) for the use case.

Great examples are things like shells and text editors: the syscalls you want are exposed with the highest fidelity in libc, and if you wrote your code in any other language you'd be constrained by that language's limited (and perpetually outdated) view of those syscalls.
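(A toy illustration of that escape hatch, not from the thread: when a runtime's view of the kernel lags, you can call the libc wrapper directly. `getpid(2)` here is just a stand-in for whatever newer syscall your language hasn't wrapped yet; the pattern is the point.)

```python
import ctypes
import os

# Load libc from the already-running process (POSIX-only trick).
libc = ctypes.CDLL(None)
libc.getpid.restype = ctypes.c_int

# The raw libc call agrees with the stdlib's own wrapper.
print(libc.getpid() == os.getpid())
```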

replies(8): >>45134950 #>>45135063 #>>45135080 #>>45135102 #>>45135517 #>>45136755 #>>45137524 #>>45143638 #
monkeyelite No.45134950{3}
You are making a lot of assumptions about my code.
replies(1): >>45134972 #
pizlonator No.45134972{4}
I'm not meaning to. I've ported a lot of programs to Fil-C and I'm reacting to what I learn.

I am curious though. What assumptions do you think I'm making that you think are invalid?

replies(1): >>45135065 #
monkeyelite No.45135065{5}
- that 4x would not impact user experience
- that my code is on a Unix time-sharing system
- that I only use C or C++ because I inherited it
- that Unix tools do not benefit from efficient programming because of syscalls
- that multi-threaded garbage collection would be good for perf (assuming I'm not sharing the system)
replies(1): >>45135130 #
pizlonator No.45135130{6}
You are posting on HN in a browser, presumably. I am familiar with the stack of C/C++ code involved in that because I was a browser dev for 10+ years. Most of that code is definitely not perf-sensitive, in the sense that if you slowed it down by 4x you might not notice most of the time.

(Browser performance is like megapixels or megahertz … a number that marketing nerds can use to flex, but that is otherwise mostly irrelevant)

When I say 99% of the C code you use, I mean “use” as a human using a computer, not “use” as a dependency in your project. I'm not here to tell you that your C or C++ project should be compiled with Fil-C. I am here to tell you that most of the C/C++ programs you use as an end user could be compiled with Fil-C and you wouldn't experience any degradation if that happened.

replies(3): >>45136288 #>>45136593 #>>45136778 #
gf000 No.45136778{7}
This discussion is absolutely meaningless without specifying what kind of software we are talking about.

A 4x slowdown may be absolutely irrelevant for software that spends most of its time waiting on IO, which I would wager a good chunk of user-facing software does. Like, if it has an event loop and does a 0.5 ms calculation once every second, doing the same calculation in 2 ms is absolutely not noticeable.
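(The arithmetic behind that claim, using the numbers from the comment:)

```python
# Event-loop app: 0.5 ms of computation once per second, then a 4x slowdown.
work_ms = 0.5
period_ms = 1000.0
slowdown = 4.0

cpu_share_before = work_ms / period_ms         # 0.05% of one core
cpu_share_after = cpu_share_before * slowdown  # 0.2% of one core

# Even slowed down, the per-event latency (2 ms) and the CPU share stay far
# below anything a user could perceive.
assert work_ms * slowdown == 2.0
assert cpu_share_after < 0.01  # still under 1% of a core
```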

For compilers it may make less sense (not even for performance reasons, but simply because a memory issue taking down the program would still be "well-contained", and memory leaks would not matter much in a relatively short-lived process to begin with).

And then there are the truly CPU-bound programs, but seriously, how often do you [1] see your CPU maxed out for long durations on your desktop PC?

[1] not you, pizlonator, just joining the discussion replying to you

replies(1): >>45137319 #
monkeyelite No.45137319{8}
This IO-bound myth is commonly repeated, yet most software spends many multiples of the IO time executing. Execution time is summed, and using a language like C lets you better control your data and optimize IO resources.
replies(1): >>45137392 #
gf000 No.45137392{9}
Well, software is not like a traditional Turing machine that takes an input, buzzes a bit, and returns a response.

Programs most commonly run continuously, reacting to different events.

You can't do IO work that depends on CPU work ahead of time, nor CPU work that depends on IO. You have a bunch of complicated interdependencies between the two, and the execution time is heavily constrained by this directed graph. No matter how efficient your data-manipulation algorithm is, it doesn't help if you still have to wait for the data to load from the web or a file.

Just draw a Gantt chart and, sure, sum the execution time. My point is that due to the interdependencies you will have a longest lane, and no matter what you do with the small CPU parts, you can only marginally affect the whole.
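(A toy timeline, with made-up numbers, shows the longest-lane effect: speed the CPU steps up 4x and the wall clock barely moves.)

```python
# A serial dependency chain: each 2 ms CPU step must wait on a 200 ms IO step.
io_ms, cpu_ms, steps = 200.0, 2.0, 10

wall = steps * (io_ms + cpu_ms)             # 2020 ms end to end
wall_fast = steps * (io_ms + cpu_ms / 4.0)  # CPU parts sped up 4x: 2005 ms

improvement = 1.0 - wall_fast / wall
assert improvement < 0.01  # less than 1% faster overall
```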

It gets even funnier with parallelism (so far this was just concurrency), where a similar concept is known as Amdahl's law.

And I would even go as far as to claim that what you win with C you often lose several-fold by going with a simpler parallelism model for fear of segfaults, where a higher-level language would let you parallelize fearlessly.

replies(1): >>45145863 #
monkeyelite No.45145863{10}
> you often lose several-folds due to going with a simpler parallelism model for fear of Segfaults, which you could fearlessly do in a higher-level language.

Wait - what was that part about Amdahl's law?

Also segfaults are unrelated to parallelism.

replies(1): >>45147491 #
gf000 No.45147491
Amdahl's law is about the potential speedup from going parallel being limited by the parts that must remain serial. Nothing controversial here; many tasks can be parallelized just fine.
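(For reference, Amdahl's law in one line; the 95%/8-core figures are just an example, not from the thread:)

```python
def amdahl_speedup(p, s):
    """Overall speedup when a fraction p of the runtime is sped up s-fold."""
    return 1.0 / ((1.0 - p) + p / s)

# 95% parallelizable on 8 cores gives ~5.9x, not 8x...
print(amdahl_speedup(0.95, 8))
# ...and the serial 5% caps the speedup below 20x, no matter how many cores.
print(amdahl_speedup(0.95, 1e9))
```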

My point is that you often see a simpler algorithm/data structure in C for fear of a memory issue/not getting some edge case right.

What part are you disagreeing with? That parallel code has more gotchas, which make a footgun-y language even more prone to failures?