Why did dlclose not unload the library? (2023)

(kishoreganesh.com)

1. abstractspoon ◴[27 Aug 25 08:00 UTC] No.45036738[source]▶

What if lib.B had another dependency? To deal with this dlclose would have to query every loaded module for its dependencies and then decide if lib.B could safely be unloaded.

What if lib.B had been loaded explicitly somewhere else such that it did not appear in any other module's dependency list?

replies(1): >>45075340 #

2. sgbeal ◴[30 Aug 25 13:34 UTC] No.45074556[source]▶

>>45034148 (OP) #

There is no safe, reliable, cross-environment way to deal with closing a DLL. A DLL initialization function can allocate arbitrary resources, some of which may be in use by clients of the DLL when it is closed.

The only safe, consistent, reliable approach is not to close DLLs.

replies(6): >>45074649 #>>45074720 #>>45075478 #>>45075558 #>>45075726 #>>45080214 #

3. 10000truths ◴[30 Aug 25 13:43 UTC] No.45074649[source]▶

>>45074556 #

You can run the DLL in a "shim" subprocess that proxies function calls over IPC. Then the DLL can muck about with global state all it wants, and the OS will clean up after it when you "unload" the DLL by killing the subprocess.
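
A minimal sketch of that shim idea, under loud assumptions: the stdlib `math` module stands in for the real DLL, the wire format is one JSON object per line over stdin/stdout, and the `Shim` class and its method names are hypothetical. A real shim would be tailored to the library's actual API.

```python
import json
import subprocess
import sys

# Child program: reads one JSON request per line, replies with one JSON line.
# The stdlib `math` module is only a stand-in for the real DLL.
CHILD_SOURCE = r"""
import json, math, sys
for line in sys.stdin:
    request = json.loads(line)
    result = getattr(math, request["fn"])(*request["args"])
    sys.stdout.write(json.dumps({"result": result}) + "\n")
    sys.stdout.flush()
"""


class Shim:
    """Hypothetical proxy: the 'library' lives only in the child process."""

    def __init__(self):
        self.proc = subprocess.Popen(
            [sys.executable, "-c", CHILD_SOURCE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )

    def call(self, fn, *args):
        # Forward the call over the pipe and wait for the single-line reply.
        self.proc.stdin.write(json.dumps({"fn": fn, "args": args}) + "\n")
        self.proc.stdin.flush()
        return json.loads(self.proc.stdout.readline())["result"]

    def unload(self):
        # Killing the child lets the OS reclaim everything the "library" held.
        self.proc.kill()
        self.proc.wait()
```

Usage would look like `shim = Shim()`, then `shim.call("sqrt", 9.0)`, then `shim.unload()` when the "library" is no longer wanted.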

replies(1): >>45074914 #

4. AndrewStephens ◴[30 Aug 25 13:54 UTC] No.45074720[source]▶

>>45074556 #

This is the way.

As this article details, there are so many circumstances that preclude DLLs being unloaded completely that I was surprised that their design actually worked at all. So many language constructs do not play nicely with the idea that code and static data can just disappear at runtime.

5. sgbeal ◴[30 Aug 25 14:21 UTC] No.45074914{3}[source]▶

>>45074649 #

> You can run the DLL in a "shim" subprocess that proxies function calls over IPC.

That doesn't address the need of some DLLs to malloc() resources in the context of the applications linking to them.

This problem _cannot_ be solved _generically_. Any solutions are extremely API-specific and impose restrictions on their users (the linking applications) which, if violated, will lead to Undefined Behavior.

Edit: as an example of cases which must bind resources in the application's context: see the Classloading in C++ paper at <https://wanderinghorse.net/computing/papers/index.html#class...> (disclosure: i wrote that article).

replies(2): >>45075196 #>>45075239 #

6. duped ◴[30 Aug 25 14:26 UTC] No.45074956[source]▶

>>45034148 (OP) #

I'm curious what design led them to split code between two shared libraries but also require the state of them to be synchronized across calls to dlopen and dlclose. It sounds like that state should be in one library and not two.

replies(1): >>45075087 #

7. theamk ◴[30 Aug 25 14:43 UTC] No.45075087[source]▶

>>45074956 #

They kinda mention this: one of those is Rust, the other is C++.

My guess is they made an all-new Rust plugin which used an existing C++ library. Pretty common case when an existing code base is slowly being converted to Rust.

replies(2): >>45075459 #>>45078985 #

8. 10000truths ◴[30 Aug 25 14:55 UTC] No.45075196{4}[source]▶

>>45074914 #

True, it's application specific. The typical reason an application would need to load/unload DLLs cleanly and repeatedly (where the caveats of dlclose are in full force) is for a plugin system with hot-reload functionality. And for that use case, the expected API/ABI is known in advance, so the shim can be tailored for it.

9. LegionMammal978 ◴[30 Aug 25 15:00 UTC] No.45075239{4}[source]▶

>>45074914 #

"Extremely" API-specific is a bit much. One way to do it would be to guard all DLL-implemented functions, and all access to DLL-associated resources, with a reference-counted token, or otherwise use a language-level mechanism (such as Rust's lifetimes) to ensure that all resources are dead by the time of closure. This would take some tedious wrapper-writing, but it's not so complex that it couldn't be done by a source generator or whatever.

Of course, C/C++ applications written in the traditional model with static data everywhere would have difficulty not leaking tokens and holding the DLL open, but it's still far from impossible to write such a safe API.

> That doesn't address the need of some DLLs to malloc() resources in the context of the applications linking to them.

If there is a context boundary and really such a need, then the DLL can keep a list of all such resources, and destroy all those resources once closed. Access to them would similarly have to be protected by a token.
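
A rough sketch of the reference-counted token idea. Everything here is hypothetical: `GuardedLibrary`, `acquire`, and `request_close` are illustrative names, and the `unload` callback only stands in for a real dlclose(). It just shows the unload being deferred until the last token is released.

```python
import threading


class GuardedLibrary:
    """Hypothetical wrapper: unloading is deferred until every token is released."""

    def __init__(self, unload):
        self._unload = unload          # callback standing in for dlclose()
        self._lock = threading.Lock()
        self._tokens = 0
        self._close_requested = False
        self.unloaded = False

    def acquire(self):
        with self._lock:
            if self._close_requested:
                raise RuntimeError("library is closing; no new tokens")
            self._tokens += 1
        return _Token(self)

    def request_close(self):
        with self._lock:
            self._close_requested = True
            self._maybe_unload()

    def _release(self):
        with self._lock:
            self._tokens -= 1
            self._maybe_unload()

    def _maybe_unload(self):
        # Caller holds the lock. Unload only once, and only with no users left.
        if self._close_requested and self._tokens == 0 and not self.unloaded:
            self.unloaded = True
            self._unload()


class _Token:
    """Held for the duration of any call into, or use of, library resources."""

    def __init__(self, lib):
        self._lib = lib

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        self._lib._release()
        return False
```

Callers would wrap each use in `with lib.acquire(): ...`; a `request_close()` issued while a token is outstanding takes effect only when that `with` block exits.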

replies(1): >>45075607 #

10. LegionMammal978 ◴[30 Aug 25 15:11 UTC] No.45075340[source]▶

>>45036738 #

Do you mean, what if lib.B had another dependent? Every loaded library has a reference count, which counts both dependents and explicit dlopen() calls. Unless the caller is the last user (and the other conditions are satisfied), dlclose() has no effect except to decrement the reference count.
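
To make the counting concrete, here is a toy model. It is an illustration only, not how glibc or any real loader is implemented, and `ToyLoader` plus the library names are made up. Each library holds one reference on each of its dependencies, and each explicit dlopen() adds one more.

```python
class ToyLoader:
    """Toy model of loader reference counting; not a real loader's implementation."""

    def __init__(self, deps):
        self.deps = deps        # library name -> list of direct dependencies
        self.refcount = {}      # loaded library name -> reference count

    def dlopen(self, name):
        if name in self.refcount:
            self.refcount[name] += 1
            return
        self.refcount[name] = 1
        # First load: take one reference on each dependency.
        for dep in self.deps.get(name, []):
            self.dlopen(dep)

    def dlclose(self, name):
        self.refcount[name] -= 1
        if self.refcount[name] > 0:
            return              # other users remain; nothing is unloaded
        del self.refcount[name]
        # Last user gone: drop the references held on dependencies.
        for dep in self.deps.get(name, []):
            self.dlclose(dep)

    def loaded(self, name):
        return name in self.refcount
```

With `libA` depending on `libB`, opening `libA` and then explicitly opening `libB` leaves `libB` with a count of 2, so a single `dlclose("libB")` only decrements it.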

11. 01HNNWZ0MV43FF ◴[30 Aug 25 15:28 UTC] No.45075459{3}[source]▶

>>45075087 #

I guess the C++ library might be binary-only or difficult to compile. But, for curious readers, you can definitely link C++ and Rust into a single binary library or exe.

12. dataflow ◴[30 Aug 25 15:29 UTC] No.45075478[source]▶

>>45074556 #

I feel like I wouldn't frame it like this. Rather, the underlying assumption of the problem has to be that the DLL's resources are already released, otherwise the problem of what happens when the resources are used afterward is itself ill-posed. The problem is really how to ensure that.

13. zbentley ◴[30 Aug 25 15:33 UTC] No.45075506[source]▶

>>45034148 (OP) #

Not only is dlclose(3) not guaranteed to close/unload the library, close(2) isn't guaranteed to close a file!

If the same file descriptor is open in another process (e.g. one that has it via fork(2) FD sharing either without child processes exec(2)ing or with CLOEXEC not set, or some older and esoteric abuses of fdpassing over UNIX sockets), then close(2) just decrements a refcount. The actual file isn't closed until the last holder of a reference to its file descriptor calls close(2).
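
A small single-process sketch of the same effect, assuming a POSIX system. Here `os.dup` stands in for the second holder that would normally be another process after fork(2), and the function name is made up. The reader of a pipe sees EOF only after the last reference to the write end is closed.

```python
import os


def eof_after_close_sequence():
    """Return whether the reader sees EOF after each close() of the write side."""
    r, w = os.pipe()
    w2 = os.dup(w)                 # second reference to the same write end
    os.set_blocking(r, False)      # so read() reports "no data yet" instead of hanging

    def reader_sees_eof():
        try:
            return os.read(r, 1) == b""
        except BlockingIOError:
            return False           # the write end is still open somewhere

    os.close(w)                    # first close: only drops one reference
    after_first = reader_sees_eof()
    os.close(w2)                   # last close: the write end really goes away
    after_last = reader_sees_eof()
    os.close(r)
    return after_first, after_last
```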

This is rarely relevant, but when it comes up, it sure is wild. Common sources of confusion and pain due to this behavior are:

- files opened for "global library reasons" (e.g. /dev/shm buffers) in preforking servers;

- signalfd descriptors in preforking servers;

- processes that fork off and daemonize but leave their spawner around for e.g. CoW memory sharing efficiencies (looking at you, Python multiprocessing backends--the docs make it sound like they all act more or less the same, but in this regard they very much do not);

- libraries that usually swap STDIN/OUT/ERR file descriptors around with CLOEXEC disabled for a manually fork+exec'd child (e.g. situations where posix_spawn doesn't support needed pre-exec setup), but that are then used as part of larger applications that fork/spawn for other reasons and don't realize that file descriptor allocation/forking needs care and synchronization with the manually-fork/execing library in question.

Mixing forks and threads is one of those things that everyone says is a fast-track ticket to nasal demons, but that everyone also does regularly, I've found. If this describes you, be careful!

If you end up in one of those situations, suddenly invariants like "I called unlink on this path and then close(2)'d the descriptor to it, so that (maybe large) chunk of allocated space isn't taking up space on the filesystem/buffers any more" and "this externally-observable lockfile/directory is now unlocked due to its absence, now processes coordinating using that file on NFS will work as expected" no longer hold.
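
A sketch of the first broken invariant, assuming a POSIX system. `os.dup` again stands in for a descriptor held by some other process, and the function name is made up. After unlink plus close(2), the path is gone but the data is still there and readable through the surviving descriptor.

```python
import os
import tempfile


def unlinked_but_alive():
    """Unlink a file and close one descriptor while a duplicate stays open."""
    fd, path = tempfile.mkstemp()
    os.write(fd, b"still here")
    other = os.dup(fd)             # stands in for a descriptor held elsewhere
    os.unlink(path)
    os.close(fd)                   # "I unlinked and closed it, so it's gone" ... not yet

    data = os.pread(other, 100, 0) # contents are still readable via the duplicate
    os.close(other)                # only now can the storage be reclaimed
    return os.path.exists(path), data
```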

https://www.ibm.com/docs/en/aix/7.1.0?topic=domains-unix-dom...

I know that close(2)'s weirdness isn't a superset of dlclose(3)'s and there are different reasons for both behaviors. But it's still interesting that they "rhyme" as it were.

replies(2): >>45075721 #>>45077203 #

14. immibis ◴[30 Aug 25 15:39 UTC] No.45075558[source]▶

>>45074556 #

There is no safe, reliable, cross-environment way to deal with deallocating memory. A memory block can be referenced from arbitrary locations, some of which may be on the stack by clients of the memory block when it is deallocated.

The only safe, consistent, reliable approach is not to deallocate memory.

replies(3): >>45075570 #>>45076060 #>>45077479 #

15. zbentley ◴[30 Aug 25 15:40 UTC] No.45075570{3}[source]▶

>>45075558 #

You joke, but there are programs that do exactly this.

16. zbentley ◴[30 Aug 25 15:45 UTC] No.45075607{5}[source]▶

>>45075239 #

> One way to do it would be to guard all DLL-implemented functions, and all access to DLL-associated resources, with a reference-counted token, or otherwise use a language-level mechanism (such as Rust's lifetimes) to ensure that all resources are dead by the time of closure.

That's true, but those approaches are only viable if you trust the DLL in question. External libraries are fundamentally opaque/could contain anything, and if you're in a tinfoil-hat mood, it's quite easy to make new libraries that emulate the ABI of the intended library but do different (maybe malicious, maybe just LD_PRELOAD tricksy) things.

Consider: an evil wrapper library could put the thinnest possible shim around the "real" version of the library and just not properly account for resources, exposing library (un)loaders to use-after-free without much work, even if the library loaders relied upon the approaches proposed.

Since there aren't good cross-platform and race-condition-free ways of saying "authenticate this external library via checksum/codesigning, then load it", there are some situations where the proposed approaches aren't good enough.

Sure, most situations probably don't need that paranoia level (or control the code/provenance of the target library implicitly). But the number of situations where that security risk does come up is larger than you'd think, especially given automatic look-up-library-by-name-via-LD_LIBRARY_PATH-ish behavior.

replies(2): >>45076007 #>>45076023 #

17. colanderman ◴[30 Aug 25 16:00 UTC] No.45075721[source]▶

>>45075506 #

Terminology nit: "file descriptor" is the reference itself. "Open file description" is the thing referenced. dup(2) and fork(2) create new file descriptors which reference the same underlying open file descriptions.
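
One way to see the distinction, as a small POSIX sketch with a made-up function name: the file offset lives in the open file description, so a dup'd descriptor observes writes made through the original, while a separate open() of the same path gets its own description and its own offset.

```python
import os
import tempfile


def offsets_after_write():
    """Write via one descriptor, then report the offset seen by two others."""
    fd, path = tempfile.mkstemp()
    try:
        dup_fd = os.dup(fd)                    # new descriptor, same open file description
        reopened = os.open(path, os.O_RDONLY)  # new descriptor, new open file description
        os.write(fd, b"hello")
        dup_offset = os.lseek(dup_fd, 0, os.SEEK_CUR)
        reopened_offset = os.lseek(reopened, 0, os.SEEK_CUR)
        for d in (fd, dup_fd, reopened):
            os.close(d)
        return dup_offset, reopened_offset
    finally:
        os.unlink(path)
```

The dup'd descriptor reports offset 5 after the five-byte write; the independently opened one is still at 0.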

18. justincormack ◴[30 Aug 25 16:02 UTC] No.45075726[source]▶

>>45074556 #

That's why musl libc has dlclose as a no-op [0]

[0] https://wiki.musl-libc.org/functional-differences-from-glibc...

19. LegionMammal978 ◴[30 Aug 25 16:43 UTC] No.45076007{6}[source]▶

>>45075607 #

If you load malicious code into your address space and execute it, then it can always do malicious things to your data. If you load malicious code into a separate process and execute it, then it can almost certainly do malicious things to your data, unless you put it into a locked-down user context and trust your OS and environment not to have any local privilege escalations (lol). The only real way to load untrusted native code is to put it in an OS-level container and communicate via IPC, or better yet, put it in a VM and communicate via a virtual network.

The measures I suggested before were all in the context of buggy users that can't resist the urge to keep references to the library's resources lying around all over the place. But untrusted code can never be made safe with anything short of a strong sandbox.

20. johnisgood ◴[30 Aug 25 16:45 UTC] No.45076023{6}[source]▶

>>45075607 #

> Since there aren't good cross-platform and race-condition-free ways of saying "authenticate this external library via checksum/codesigning, then load it", there are some situations where the proposed approaches aren't good enough.

Sign your libraries with Ed25519 and embed the public key in your app, verify before load. How is this not cross-platform enough?

Of course you still introduce a TOCTOU (time of check, time of use) race condition, which is why oftentimes you want to first check, load, then check again.

A common solution, however, is to open the library file once, verify the checksum/signature against a trusted key, and if valid, create a private, unlinked temporary file (O_TMPFILE on Linux), write the verified contents into this temporary file, rewind, and dlopen() (or LoadLibrary()) this temporary copy. Because the file is unlinked after creation (or opened with O_TMPFILE), no one else can swap it out, and you eliminate TOCTOU this way because you only ever read and load the exact bytes you verified. This is how container runtimes and some plugin systems avoid races. BTW on Linux you can use memfd_create(), which creates an anonymous, in-memory file descriptor. You can do the same on Windows and macOS. Then you can verify the library's signature / hash, copy verified contents into a memfd (Linux) or FileMapping (Windows), and then load directly from that memory-backed handle.

TL;DR: never load from a mutable path after verification. Copying the verified bytes into a sealed memfd, for example, and loading from that is race-safe.
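
A hedged sketch of the read-once, verify, copy-to-private-file flow. Assumptions: a plain SHA-256 checksum stands in for real signature verification, `verified_private_copy` is a made-up helper name, and `tempfile.TemporaryFile` is used as the private unlinked file (on Unix its directory entry is either never created or removed immediately). The final load step is only shown as a comment because it is Linux-specific and depends on the temp directory allowing executable mappings.

```python
import hashlib
import tempfile


def verified_private_copy(path, expected_sha256):
    """Read `path` once, verify it, and return an unlinked temp file with those bytes."""
    with open(path, "rb") as src:
        data = src.read()                  # the only read of the mutable path
    if hashlib.sha256(data).hexdigest() != expected_sha256:
        raise ValueError("checksum mismatch; refusing to load")
    private = tempfile.TemporaryFile()     # has no name that other code could swap out
    private.write(data)                    # exactly the bytes that were verified
    private.flush()
    private.seek(0)
    return private

# Assumption, not exercised here: on Linux the copy could then be loaded by
# descriptor, e.g. ctypes.CDLL(f"/proc/self/fd/{private.fileno()}").
```

The key property is that nothing is ever re-read from the original path after the check, so swapping the file on disk afterwards has no effect on what gets loaded.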

FWIW, I use firejail (not bubblewrap) for all applications such as my browser, Discord, LibreOffice, mupdf, etc. I recommend everyone do the same. No way in hell I will give my browser access to files it does not need access to. It only has access to what it needs (related to pulseaudio, Downloads directory, etc), and say, no way I will give Discord access to my downloaded files (or my browser history) or anything really, apart from a directory where I put files I want to send.

replies(1): >>45076400 #

21. SkiFire13 ◴[30 Aug 25 16:50 UTC] No.45076060{3}[source]▶

>>45075558 #

The difference is that memory won't do anything under your nose, it can't run arbitrary code by itself. It won't spawn threads, create thread locals, or store data in global variables. And it's normal to track the lifetime of memory, but much less so the lifetime of code and function pointers passed around.

replies(2): >>45077149 #>>45078799 #

22. ◴[30 Aug 25 17:24 UTC] No.45076400{7}[source]▶

>>45076023 #

23. pharrington ◴[30 Aug 25 18:18 UTC] No.45076851[source]▶

>>45034148 (OP) #

Call the library's shutdown procedure before you dlclose. A dynamically linked library is a resource like any other. Ya gotta close it properly.

Of course, if the library doesn't properly attempt to close libraries it's responsible for during its shutdown procedure, that's another can of worms.

24. immibis ◴[30 Aug 25 19:03 UTC] No.45077149{4}[source]▶

>>45076060 #

The same is true of DLLs. They don't do anything by themselves; they are merely blocks of bytes mapped into memory.

Why is tracking the lifetime of a function pointer different from tracking the lifetime of any other pointer?

FreeLibrary on Windows unloads libraries when the reference count is zero.

25. tux3 ◴[30 Aug 25 19:13 UTC] No.45077203[source]▶

>>45075506 #

>or some older and esoteric abuses of fdpassing over UNIX sockets

One of the less well designed APIs, but as an aside it is still widely used for IPC.

26. themafia ◴[30 Aug 25 19:47 UTC] No.45077479{3}[source]▶

>>45075558 #

It seems like the major problem here was blindly calling an init function and then completely failing when it returned a non-standard result.

The init function could return a status that indicates "already initialized."

The calling code could observe this and passively warn of the unusual condition but otherwise proceed.
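
Something like the following sketch, where every name (`InitStatus`, `lib_init`, `load_plugin`) is hypothetical and a module-level flag stands in for the library's real global state:

```python
import enum
import logging


class InitStatus(enum.Enum):
    OK = 0
    ALREADY_INITIALIZED = 1
    FAILED = 2


_initialized = False   # stands in for state that survives a dlclose() that didn't unload


def lib_init():
    """Hypothetical library init that reports repeated calls instead of failing."""
    global _initialized
    if _initialized:
        return InitStatus.ALREADY_INITIALIZED
    _initialized = True
    return InitStatus.OK


def load_plugin():
    """Caller treats 'already initialized' as a warning, not an error."""
    status = lib_init()
    if status is InitStatus.ALREADY_INITIALIZED:
        logging.warning("library was already initialized; continuing")
    elif status is InitStatus.FAILED:
        raise RuntimeError("library initialization failed")
    return status
```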

27. AstralStorm ◴[30 Aug 25 23:07 UTC] No.45078799{4}[source]▶

>>45076060 #

You could do that, but then you're running valgrind.

28. saurik ◴[30 Aug 25 23:39 UTC] No.45078985{3}[source]▶

>>45075087 #

But how/why is libA failing if libB has already been initialized? That's the thing which seems broken.

29. kazinator ◴[31 Aug 25 03:45 UTC] No.45080202[source]▶

>>45034148 (OP) #

There needs to be a way to forcibly run the thread destructors for a library that is going away.

Since the library is going away, it means it's not used any more. Not being used means that no thread that is currently running will call into that library any more.

If no thread that is currently running will use the code of that library, it has no business hanging on to the data belonging to the library; that data should only be manipulable via that code. Therefore, it should be forcibly taken away.

The reason they flunk on this issue is that there isn't a nice way to enumerate through all the threads and snipe away a particular class of thread local storage from each. The architecture is oriented around the POSIX-induced brain-damaged idea that a thread must clean up all thread-specific storage after itself.

Fight me!

30. kazinator ◴[31 Aug 25 03:48 UTC] No.45080214[source]▶

>>45074556 #

The DLL consists of code and static data. If nothing needs the code or the static data, it can be blown away. If that library allocated something which it is responsible for freeing, but that something still exists even though that library's very code is no longer referenced (nobody will call into it), it almost certainly means there is a memory leak.

A memory leak caused by a library isn't necessarily something that should prevent it from being unloaded. Unless it is the leaked objects that are holding a reference! (Then we need a full blown GC system to detect cycles.)