237 points ekr____ | 5 comments

samsquire No.42724271
Thanks for such a detailed article.

In my spare time, working with C as a hobby, I am usually in "vertical mode" (unlike the careful way I work at work), which is just getting things done end-to-end as fast as possible, not careful at every step that we have no memory errors. So I am just trying to get something working end-to-end, and I do not actually worry about memory management when writing C; I let the operating system handle freeing the memory. In my hobby time I am only trying to get the algorithm working.

And since I write everything in Python or JavaScript initially, I am usually porting from Python to C.

If I were using Rust, it would force me to be careful in the same way, due to the borrow checker.

I am curious: we have reference counting and we have profile-guided optimisation.

Could "reference counting" be compiled into a debug/profiled build and then detect which regions of time we free things in before or after (there is a happens before relation with dropping out of scopes that reference counting needs to run) to detect where to insert frees? (We Write timing metadata from the RC build, that encapsulates the happens before relationships)

Then we could recompile with a happens-before relation file that records where things can safely be freed.
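A rough sketch of what I mean, in C (all the names here are made up, and a real tool would have to track aliasing across scopes): in the debug build, the release path logs the call site where the count actually reaches zero, and that log is the happens-before metadata a later build could use to place plain frees.

    #include <stdio.h>
    #include <stdlib.h>

    typedef struct {
        int refcount;
        void *data;
    } rc_obj;

    static FILE *profile_log;  /* timing / happens-before metadata goes here */

    static rc_obj *rc_new(size_t n) {
        rc_obj *o = malloc(sizeof *o);
        o->refcount = 1;
        o->data = malloc(n);
        return o;
    }

    /* The macro captures the source location of every release. */
    #define rc_release(o) rc_release_at((o), __FILE__, __LINE__)

    static void rc_release_at(rc_obj *o, const char *file, int line) {
        if (--o->refcount == 0) {
            if (profile_log)
                fprintf(profile_log, "free-site %s:%d\n", file, line);
            free(o->data);
            free(o);
        }
    }

    int main(void) {
        profile_log = fopen("rc_profile.txt", "w");
        rc_obj *o = rc_new(64);
        rc_release(o);  /* recorded as the site where this object dies */
        if (profile_log) fclose(profile_log);
        return 0;
    }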

EDIT: Any discussion about those stack diagrams and alignment should include a link to this Wikipedia page:

https://en.wikipedia.org/wiki/Data_structure_alignment
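For a quick illustration (a typical 64-bit ABI is assumed; the exact padding is implementation defined):

    #include <stdio.h>
    #include <stddef.h>

    struct example {
        char  c;   /* 1 byte, then usually 7 bytes of padding ... */
        void *p;   /* ... so this 8-byte pointer lands on an 8-byte boundary */
    };

    int main(void) {
        printf("size=%zu offsetof(p)=%zu align=%zu\n",
               sizeof(struct example),
               offsetof(struct example, p),
               _Alignof(struct example));  /* typically 16, 8, 8 */
        return 0;
    }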

jvanderbot No.42724597
> which is just getting things done end-to-end as fast as possible, not careful at every step that we have no memory errors.

One horrible but fun thing a former professor of mine pointed out: If your program isn't going to live long, then you never have to deallocate memory. Once it exits, the OS will happily clean it up for you.

This works in C, or perhaps in lazy GC languages, but for stateful objects whose destructors do meaningful work, as in C++, it is dangerous. This is one of the reasons I hate C++ so much: unintended side effects that you have to make sure get triggered.

> Could "reference counting" be compiled into a debug/profiled build to detect the regions of time in which things get freed (there is a happens-before relation with dropping out of scope that the reference counting has to respect), and so work out where to insert frees?

This is what Rust does, kinda.

C++ also does this with "stack"-allocated objects: it "frees" them (calls the destructor and cleans up) when they go out of scope. And in C++, heap-allocated data (if you're using a smart pointer) will automatically be deallocated when the last reference drops, but that bookkeeping happens at run time, not at compile time.

Those are the only two memory management models I'm familiar with enough to comment on.

1. pjmlp No.42729143
The wonders of corrupted data, stale advisory locks, and UNIX IPC leftovers, because things weren't properly flushed or closed before process termination.
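The classic C version of the flushing problem, as a sketch: stdio buffers the write in user space, and _exit() terminates the process without running the stdio cleanup, so out.txt can easily end up empty.

    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        FILE *f = fopen("out.txt", "w");
        if (!f) return 1;
        fprintf(f, "important record\n");  /* still sitting in the stdio buffer */
        _exit(0);  /* no fflush/fclose runs, so the data may never reach the file */
    }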
2. jvanderbot No.42730360
I'll narrow my scope more explicitly:

close(x) is not memory management, at least not at the user level. It should still be done.

free(p) has no OS side effects like this in C; it can be skipped if you aren't going to malloc all of your memory anyway.

You can get away with not deallocating program memory, but (as mentioned) that has nothing to do with freeing OS / kernel / networking resources in C.

3. PhilipRoman No.42731620
Most kernel resources are fairly well behaved: they automatically decrement their refcount when a process exits. Even mutexes have a "robust" flag for this exact reason. Programs which rely on destructors or any other form of orderly exit are always brittle and should be rewritten to use atomic operations.
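A sketch of that "robust" flag (error handling trimmed, and in real use the mutex would live in shared memory so a surviving process can recover it; build with -pthread):

    #include <errno.h>
    #include <pthread.h>

    static pthread_mutex_t mtx;

    static void init_robust_mutex(void) {
        pthread_mutexattr_t attr;
        pthread_mutexattr_init(&attr);
        pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
        pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
        pthread_mutex_init(&mtx, &attr);
        pthread_mutexattr_destroy(&attr);
    }

    static void lock_and_recover(void) {
        int rc = pthread_mutex_lock(&mtx);
        if (rc == EOWNERDEAD) {
            /* Previous owner died while holding the lock: repair the
             * protected state, then mark the mutex usable again. */
            pthread_mutex_consistent(&mtx);
        }
        /* ... critical section ... */
        pthread_mutex_unlock(&mtx);
    }

    int main(void) {
        init_robust_mutex();
        lock_and_recover();
        return 0;
    }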
4. pjmlp No.42734826
Which kernel, on which specific OS?

This is a very non-portable assumption, even if we constrain it to UNIX/POSIX flavours.

5. PhilipRoman No.42735327
As far as assumptions go, it's actually one of the most portable ones and for a good reason, considering it is a basic part of building a reliable system. Quoting POSIX:

> Consequences of Process Termination

> Process termination caused by any reason shall have the following consequences:

> [..] All of the file descriptors, directory streams, conversion descriptors, and message catalog descriptors open in the calling process shall be closed.

> [..] Each attached shared-memory segment is detached and the value of shm_nattch (see shmget()) in the data structure associated with its shared memory ID shall be decremented by 1.

> For each semaphore for which the calling process has set a semadj value (see semop()), that value shall be added to the semval of the specified semaphore.

> [..] If the process is a controlling process, the controlling terminal associated with the session shall be disassociated from the session, allowing it to be acquired by a new controlling process.

> [..] All open named semaphores in the calling process shall be closed as if by appropriate calls to sem_close().

> Any memory locks established by the process via calls to mlockall() or mlock() shall be removed. If locked pages in the address space of the calling process are also mapped into the address spaces of other processes and are locked by those processes, the locks established by the other processes shall be unaffected by the call by this process to _Exit() or _exit().

> Memory mappings that were created in the process shall be unmapped before the process is destroyed.

> Any blocks of typed memory that were mapped in the calling process shall be unmapped, as if munmap() was implicitly called to unmap them.

> All open message queue descriptors in the calling process shall be closed as if by appropriate calls to mq_close().
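The advisory locks mentioned upthread behave the same way. A quick sketch of checking it with fcntl() record locks (error handling omitted): the child takes a write lock and terminates without unlocking, and the parent can then acquire it because the kernel released the lock at process termination.

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        pid_t pid = fork();
        if (pid == 0) {  /* child: take a write lock, then die without unlocking */
            int cfd = open("lockfile", O_CREAT | O_RDWR, 0600);
            struct flock fl = { .l_type = F_WRLCK, .l_whence = SEEK_SET };
            fcntl(cfd, F_SETLK, &fl);
            _exit(0);
        }
        waitpid(pid, NULL, 0);

        int fd = open("lockfile", O_CREAT | O_RDWR, 0600);
        struct flock fl = { .l_type = F_WRLCK, .l_whence = SEEK_SET };
        /* Succeeds because the child's record lock was dropped at termination. */
        if (fcntl(fd, F_SETLK, &fl) == 0)
            puts("lock acquired after child exit");
        return 0;
    }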