Systemd mounted efivarfs read-write, allowing motherboard bricking via 'rm'
Essentially, systemd defaulted to a configuration where the computer's motherboard could be permanently destroyed by removing a 'file' from the command line. The bug reporter argued that this was unduly dangerous, but the systemd developers thought that systemd was working as intended. Here's a reasonably impartial discussion on a FreeBSD list that gives an overview: https://forums.freebsd.org/threads/54951/
And from that thread, here's a link to Matthew Garrett (the creator of efivarfs) saying that efivarfs is at fault here rather than systemd: https://twitter.com/mjg59/status/693494314941288448
Another thing that was missed was that Lennart wasn't being unreasonable, nor was he saying it wasn't a problem (he specifically stated the opposite, in fact). I had a feeling at the time (based on his responses) that the reason he wasn't specifically stating he was going to fix it or open a bug report for it in systemd was that he was going to push it up-stack to a more appropriate place, and it looks like that's what happened.
Really? Is that evidenced by Lennart's response to this, which stated "The ability to hose a system is certainly reason enought to make sure it's well protected and only writable to root."[1]? I think it implies the opposite.
But in the real world no one ever fixes firmware bugs, so this is the best we can do.
In addition to clearing EFI variables, the current behavior will also attempt to clear any mounted removable drives and any mounted network drives, which is usually even more harmful than messing with EFI.
Of course that would be a backwards incompatible change, although I don't think many scripts rely on this behavior.
Put another way, systemd mounting efivarfs read-only doesn't necessarily prevent you from bricking your machine by deleting files within that file system, it protects only those using systemd. The problem stems from the nature of efivarfs, and that it was implemented as a file system and not some other interface. Making it default to something more sane is a better (but by no means the best) solution, and that ideally would happen within the source, which is the kernel.
I read this as Lennart saying that when root issues an 'rm' in efivarfs, the variable should be removed even if this renders the motherboard unusable without physical repairs. What's your interpretation?
I've edited to fix my terrible punctuation, and to make it clear that 'it' refers to 'systemd', and to add a link to MJG's response on Twitter. I can edit further if you have a way to make it clearer.
I'd expect file operations to only permanently affect storage devices by default. Sure you can mount almost anything as a file in Unix, but to automatically mount more than necessary is bad design. It's like placing mystery files in the filesystem, and when a curious user deletes or modifies them, they lose their monitor's color profile, their printer's firmware, or all of their GMail attachments. You could say it was the user's fault to mess with it, but I'd say it's weird to expose such things as files (unless explicitly asked for by the user or a tool).
>These fixes are somewhat involved to maintain
>compatibility with existing install methods
>and other usage modes, while trying to turn
>off the 'rm -rf' bricking vector.
They go out of their way to make sure changes are backwards compatible. I would think a) these are mostly system tools like boot managers and b) these tools need root (or setuid root) anyway, so why can't they just mount it themselves temporarily?
Edit: It seems it is mostly grub-install, efibootmgr, and `systemctl reboot --firmware-setup` that need this mounted rw. The first two aren't something that a casual user uses very often, and if someone does, a "Filesystem is mounted read-only" message will point them in the right direction. The latter is part of systemd and could easily be changed to mount efivarfs itself, no third party involved.
The Kernel team showed they are professionals by stepping up and doing the work.
This. It's really hard to blame this on systemd (not that people didn't try anyway).
> But beyond that: root can do anything really.
... so running "rm -rf /" as root should brick your motherboard because it's the responsibility of the motherboard manufacturer to protect against this. That's all fine and dandy in an idealized world, but in the "real world" there are going to be motherboard manufacturers that play fast and loose with these things.
So on a non-broken BIOS, there is "technically" no bug - but the pieces (BIOS, Kernel, Systemd) come together to make a bad design.
It's an improvement, but it seems like we should do this in addition to default mounting read only.
It should be mounted rw because existing userspace expects it to be rw.
> It shouldn't be mounted by default.
It should be mounted by default because it's information that's relevant to various pieces of userspace.
> It shouldn't even be a bloody filesystem.
With hindsight, it should absolutely not have been a filesystem. There's very little metadata associated with EFI variables and the convenience of exposing this in a way that can be handled using read and write is huge, but real-world firmware turns out to be fragile enough that this was a mistake. But, in the absence of a time machine, there's very little I can do to fix that now.
Systemd is as consistent as upstart in this.
The kernel developers are continuously working around bad behaviour by bios/firmware authors. It is the right place.
It's the responsibility of the OS to make sure it doesn't ruin firmware.
The standard practice for PCs has always been that firmware configuration settings can be cleared (through a jumper or by pulling the battery) to reset the system to its factory state, forcing it to fall back to its conservative and safe defaults. Some systems have apparently forgotten to have defaults. Their firmware is already broken and afflicted with a major bug even if you avoid triggering it in this particular manner.
PREFACE: This is an anecdote, but I do believe it reflects on general state of hardware vendors, because when I Google'd, it showed that people had similar, if not worse problems than I did.
And this is so incredibly sad. Especially when you buy a $2.5k laptop which only works with Windows (with quirks).
I bought a laptop^[model] on which you couldn't even install another OS because of a crippling firmware bug. It wasn't until a shit storm on their forums that they released a firmware update which fixed the issue: the SATA controller was stuck in RAID mode and couldn't be changed to AHCI, which prevented any OS from being installed (even Windows, which was already installed, which is bizarre) because no OS could recognise the PCIe NVMe M.2 SSDs.
After the update was released, I did happily install Linux on it, but the ACPI DSDT was so broken, I didn't know where to begin with fixing it (apart from this whole hardware stuff being outside of my domain). Other than that jack detection is jack shit (pun intended). I literally can't use my headphones without special OEM or Realtek software (forgot which) on Windows, and I can't use them at all on Linux because there's no equivalent. I tried playing with various modes^[modes] and output configurations, but to no avail.
Also, on Windows I hear a subtle scratchy sound from somewhere in my laptop, but I don't hear it on Linux. I noticed it the most while moving my USB mouse or when there's a lot of CPU intensive work. No, all the solutions recommended online didn't work, and this has apparently been an issue with Windows on Asus/Realtek for years, if not decades.
Furthermore, there's a bizarre flicker which subtly intensifies and then subtly goes away on Windows (and it interestingly happens only in some applications which appear to use GPU acceleration) which doesn't happen on Linux (even during an intensive OpenGL benchmark followed by a WebGL benchmark).
The things I thought I'd have most issues with (the GPU and the Skylake processor) turned out to be the least of my problems. Actually, 0 problems with them. So, kudos to NVIDIA for their proprietary Linux drivers (the nouveau ones worked great, too, but I decided to go for the proprietary ones due to the slight performance benefit).
So, no, this isn't a Linux issue, to anyone who wants to scream "boohoo linux is bad for consumer PCs". This is all an issue of shitty hardware vendors. There's probably over a hundred models documented on the Archlinux Wiki[archwiki] with all their various quirks and what not. Most of those are actually hardware problems, and there's no way for Linux to fix all these problems without there being some giant database with each laptop model and its quirks and applying configuration fixes, and this would also have to be distro-agnostic or cover various distros to work properly. The only reason why most of it kinda (not flawlessly) works on Windows is because the various vendors actually cooperate with the Windows developers (I imagine), and it's rare that I see them even trying to cooperate with Linux developers; maybe I just missed it, but each time someone does cooperate, it's met with this grand praise that's quite hard to miss, so I doubt I missed it (this excludes certain vendors who have always cooperated with Linux devs, or who specifically write drivers for linux in the first place).
It's so, so solemnly sad that people blame most of this, if not all, on Linux. Especially considering Linux does its best to try and patch this endless stream of oncoming shitty hardware and nobody (not literally nobody, but a very small percentage) sees or recognises that effort.
----------
[model]: ASUS ROG G752, for anyone wondering
[modes]: https://www.kernel.org/doc/Documentation/sound/alsa/HD-Audio...
[archwiki]: https://wiki.archlinux.org/index.php/Category:Laptops
> ... so running "rm -rf /" as root should brick your motherboard because it's the responsibility of the motherboard manufacturer to protect against this.
You are assuming "only writable by root" means "root can write without any restrictions" which is a fairly uncharitable reading, and requires assumptions about his intent which are not evident.
I could make a statement such as "cars in the united states can only be legally driven by people of an appropriate age" and you could assume I meant that's all that needs to apply and start calling out my statement as wrong, or you could assume I was aware of the additional requirement of a driver's license, or you could ask me to clarify my point of view. I just don't believe the first option is conducive to useful discussion, nor do I think it's appropriate to use assumed information in a negative way towards a third party.
Edit: s/disparage/use assumed information in a negative way/ for lack of better phrasing coming to mind. The statement wasn't really disparaging, just an uncharitable interpretation, so I don't want to overstate that.
Can you explain this a bit? I'm not familiar with the particulars of firmware, but I'm having a hard time imagining why any userspace program would expect a firmware partition to be writable. Even if there are any, certainly I'd have a hard time believing that they would need it to be mounted and writable all the time.
As for why userspace couldn't remount it itself - it could. It doesn't. Changing the behaviour of systemd without changing the behaviour of the rest of userspace would result in userspace being broken, and making that kind of incompatible change is annoying - especially when fixing it in the kernel allows us to avoid that breakage.
Really, I think the two appropriate places for this fix are the kernel, and if that's not expected to be rolled out soon, as a patch to systemd to mount it read-only by the distro shipping systemd. Systemd shipping the fix would only really help the small group of people that install systemd from source during the short window from 11 days ago until the next kernel is available (or that choose to run an older kernel), and makes the whole efivarfs situation a bit more confusing by leaving it read-only and immutable for the future.
Best example of why you need to have it RW: if uefi is in fast boot, the only way to actually enter uefi is to boot an OS and have the OS set the uefi variables so that it goes into the firmware screen on next restart. This is (on Linux) done by changing something in that virtual filesystem.
> certainly I'd have a hard time believing that they would need it to be mounted and writable all the time.
You could probably remount it, but that also means that you all of a sudden have concurrency problems with that operation. So not really ideal.
""" UEFI stands for "Unified Extensible Firmware Interface", where "Firmware" is an ancient African word meaning "Why do something right when you can do it so wrong that children will weep and brave adults will cower before you", and "UEI" is Celtic for "We missed DOS so we burned it into your ROMs". """
Probably. But that's not systemd's fault.
I don't think that's a useful perspective here. This is not a feature - nothing should ever want to permanently brick a motherboard, there's no use case for that. There's no benefit to allowing root to do this. The OS is supposed to abstract the hardware in a way that it can be safely operated.
Raymond Chen talked[1] about the importance of supporting programs that ran on Win95 but broke on WinXP, even if they weren't complying with Microsoft specs.
I also remember reading that web browsers had to go to great length to render completely non-compliant web pages.
In your experience, when should you decide to support "non-complying" behavior?
[1]: Unfortunately I cannot find the original article by Chen but I could find extensive mentions of it in [this article by Joel Spolsky](http://www.joelonsoftware.com/articles/APIWar.html)
I say as someone still ambivalent, not pro nor against it.
If the problem is no output at all, it may be just a matter of toggling some HDA codec GPIO or EAPD pin to power up an external amplifier chip, which can be done with hda-analyzer. But if it's some combo headphone/mic jack and detection doesn't work then I have no idea.
systemd can't take all the blame either - I bricked (yes really bricked) one of these by grub installing a "stub" that only booted into grub-rescue on my EFI partition. I can't get into the firmware settings and the rescue loader can't read the partition tables -> bricked unless I can corrupt the EEPROM somehow and force a menu (no CMOS battery in these low-end devices to pull)
you say "You cannot delete the directory you are immediately in, so that at the very least is prevented."
I have no idea what the rest of your comment is in relation to what I said other than I'm pretty sure you can accidentally delete a directory you're in given what Steam did.
Deleting your current directory is against the POSIX standard. It should not be allowed.
If however you delete a directory that is higher up the directory tree (e.g. the parent directory), it will be deleted.
As far as I can tell this does not violate the POSIX standard[1], as that situation is left as undefined (since in theory the directory you are deleting will chain to the directory you are currently in which is open in the tty).
Edit: The rest of my previous comment was trying to say that the utility of being able to self destruct the current directory is arguable. Why should it be prevented (especially when it could just be hidden behind a flag to prevent accidental destruction)?
Edit2: D'oh. Forgot the reference:
[1] http://pubs.opengroup.org/onlinepubs/9699919799/functions/rm...
Unless you're running FreeBSD (or Illumos) with ZFS and Boot Environments, in which case you'd just select a backup boot environment and continue working :-) Probably without your home directory though, as that is usually excluded from boot environments. But you can set them up however you want.
But if you're running Linux (before this update) on a laptop with terrible piece of shit firmware, you'd end up with a brick.
P.S. found a cool post about rm -rf / in my bookmarks: https://lambdaops.com/rm-rf-remains/ – you can recover a running rm'd Linux machine by using a running shell and /dev/tcp :D
rm -rf "$STEAMROOT/"*
When $STEAMROOT was empty, "steam apparently deleted everything owned by my user recursively from the root directory. Including my 3tb external drive I back everything up to that was mounted under /media."
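For what it's worth, a guard against the empty-variable case could look roughly like the sketch below. `safe_clean` is a made-up helper name for illustration, not anything from Steam's script, and the checks are only the obvious ones (empty argument, exactly "/", not a directory).

```shell
#!/bin/sh
# Hypothetical helper (not from the Steam script): empty the contents of a
# directory, but refuse when the argument is empty or is exactly "/".
safe_clean() {
    dir=$1
    case "$dir" in
        "" | "/")
            echo "refusing to clean '$dir'" >&2
            return 1
            ;;
    esac
    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir" >&2
        return 1
    fi
    # ${dir:?} makes the shell abort instead of expanding to "/*"
    # if dir were ever empty at this point.
    rm -rf -- "${dir:?}"/*
}
```

With something like this, `safe_clean "$STEAMROOT"` fails loudly when the variable is empty instead of expanding to `rm -rf /*`. Like the original glob, it leaves dotfiles alone.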
Ok, I read that wrong a long while back, but "allowed error" is really odd. I guess I still side on an error is an error and it should not be allowed.
Specifically, the way a mental model of a hierarchy is broken by mounting a higher-order resource (UEFI variables) as a subordinate of a file system that is itself a subordinate of the OS.
UEFI vars are just hardware resources. Mapping them as a file system object is just unnatural and, yes, stupid.
Trying to use a permission model ("only root can do it") overlooks the real problem: Users do not expect higher order objects to be mapped as subordinates of the file system.
When you delete from the file system, you expect objects to be deleted from the disk - not UEFI variables to be altered or deleted! And because the user does not expect such behavior, there's a good chance she/he will override warnings and go ahead with the operation expecting only file system objects to be affected.
This is "everything is a file" taken a bridge too far.
Being able to brick hardware through a very commonly used action (unlinking a filesystem entry) throws us back into the times where one could damage display devices beyond repair by feeding them scan frequencies outside their operational range or destroy hard disks by smashing the heads into a parking position outside of the mechanical range. We left those days behind us some 20 years ago: Devices got smart enough to detect potentially dangerous inputs and execute failsafe behaviour. It's just reasonable to expect this from system firmware.
When talking about (U)EFI variables we're not talking about firmware updates, which are kind of a special action (and even for firmware updates it's unacceptable that a corrupted update bricks a system¹). Manipulating (U)EFI variables is considered a perfectly normal day-to-day operation and the OS should not have to care about sanity checks and validity at boot time. (U)EFI is the owner and interpreter of these variables, so it is absolutely reasonable to expect the firmware to have safeguards and failsafe values in place.
IMHO (U)EFI is a big mess, a bloated mishap of a system bootstrap loader. And I'm totally against trying to work around all the br0kenness in higher levels. The more often systems brick due to the very fundamentals of (U)EFI being so misguided, the sooner we'll move on to something that's not overengineered.
----
¹: Just to make the point: When we developed the bootstrap loader for our swept laser product we implemented several safeguards to make it unbrickable. It's perfectly fine to cut the power or reset the device in the middle of a firmware upgrade. It safely recovers from that. Heck, the firmware in flash memory could become damaged by cosmic radiation, the bootloader would detect it and reinstall it from a backup copy in secondary storage.
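To illustrate the idea in that footnote (and only the idea; this is not the actual bootloader described there), a verify-then-fall-back step might be sketched like this. The function name, the file layout, and the use of `cksum` are all assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical recovery step: compare the active image against a known
# checksum and fall back to a backup copy when it does not match.
# File names and the use of cksum are illustrative assumptions.
verify_or_restore() {
    image=$1
    backup=$2
    expected=$3   # expected "CRC SIZE" string as printed by: cksum < file
    actual=$( { cksum < "$image"; } 2>/dev/null )
    if [ "$actual" = "$expected" ]; then
        echo "ok"
        return 0
    fi
    # Image is missing or damaged: reinstall it from the backup copy.
    cp -- "$backup" "$image" || return 1
    echo "restored"
}
```

The point is that the check runs before anything depends on the image, so an interrupted or corrupted update ends in a restore rather than a brick.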
http://web.archive.org/web/20061021192240/http://blogs.sun.c...
GNU coreutils changed the default to --preserve-root some years later, in version 6.2:
http://git.savannah.gnu.org/cgit/coreutils.git/tree/ChangeLo...
A sane linux distro will mount it ro and switch to rw whenever needed. Defaulting to rw efivars is, excuse the language, stupid.
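If you want to see how your own distro currently mounts it, something along these lines should do. It is a small sketch that assumes the usual `/proc/mounts` column layout (device, mount point, type, options); `efivarfs_mode` is just a name picked for the example.

```shell
#!/bin/sh
# Print "rw" or "ro" for the first efivarfs entry in a mounts table
# (default: /proc/mounts); print nothing if efivarfs is not mounted.
efivarfs_mode() {
    awk '$3 == "efivarfs" {
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] == "ro" || opts[i] == "rw") { print opts[i]; exit }
    }' "${1:-/proc/mounts}"
}
```

Calling `efivarfs_mode` with no argument reads the live table; passing a file makes it easy to try against a saved copy.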
I've done a fair share of efi debugging even removing some of the variables that the kernel will now protect you from breaking.
If the issue is that users should be able to remount efivars as rw whenever needed then that should be addressed, not prevent you from doing stuff to it because there is a rogue init system doing crazy stuff.
EDIT: BTW, i don't think systemd does anything besides write to the various Boot* variables, but I may be wrong. I don't see why that can't be addressed with a remount. If you replace the boot.efi you still have to remount the efi partition anyway.
Matthew may be right that there is an issue that needs to be addressed, but in one of his tweets he basically says the kernel should fix it because tooling isn't and bioses suck. Well, maybe tooling should be forced to fix it.
or from the issue:
Matthew-Jemielity commented 24 days ago
What needs efivars mounted at all anyway? So far I've seen:
grub
systemctl --firmware-setup reboot
efibootmgr
Since those likely need superuser, couldn't they handle (un)mounting it themselves?
@annejan
annejan commented 23 days ago
As long as distribution that are aimed at consumers remount it ro and on updating kernels wrap grub with remount this is a complete non-issue.
Awesome. This dynamic loading of bash plugins is mad.
rm /sys/efivar/MyVariable
is roughly equivalent to:
var obj = {MyVariable: 1};
obj.MyVariable = null;
or:
var obj = {MyVariable: 1};
delete obj.MyVariable;
This bricks the firmware, because some of these variables end up being required. It's on the firmware maker / motherboard manufacturer to make sure that there is a way to recover from this rather than having the firmware fail to start up because some variables are missing. My understanding is that firmware made to spec would not brick over the EFI variables getting wiped. The motherboards that are encountering these issues are running firmware that is cutting corners. Unfortunately cutting corners and ignoring specs is nothing new for hardware manufacturers who are more concerned with the manufacture of the physical devices and usually put only the minimum amount of effort into getting any sort of software components (firmware, OS device drivers, etc) running.
The fact is that 'rm -rf /' is a common mistake, both at the command line (i.e. manually typing) and in scripts that don't adequately protect against things like missing variables. E.g.:
MY_DIR=
rm -rf "$MY_DIR/"
The fact that this could brick a system is a big deal. Pushing blame around doesn't do anybody any good.[1] I realize that boot loaders like grub are everywhere and probably need to write to efivarfs, but that's still a data point of 1. Would it be that difficult for grub and its related scripts to upgrade to remount the filesystem readwrite when it needs to perform an operation? I'm sure it's only been a couple of years since efivarfs functionality was even added to grub.
Not necessarily. Leaving it "wide open" for anything to accidentally write to it all of the time vs. just mounting it readwrite when you actually need to write to it are two different risk profiles.
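The "only readwrite while you need it" profile could be sketched as a small wrapper like the one below. This is an illustration under assumptions, not how grub or systemd do it: `with_efivars_rw` is a hypothetical name, the mount point is the common default, and the `MOUNT` variable exists only so the mount command can be swapped out for a dry run.

```shell
#!/bin/sh
# Assumed defaults; both can be overridden from the environment.
EFIVARS=${EFIVARS:-/sys/firmware/efi/efivars}
MOUNT=${MOUNT:-mount}   # overridable, e.g. for a dry run

# Hypothetical wrapper: remount efivarfs rw, run the given command,
# then remount it ro again and return the command's exit status.
with_efivars_rw() {
    "$MOUNT" -o remount,rw "$EFIVARS" || return 1
    "$@"
    status=$?
    "$MOUNT" -o remount,ro "$EFIVARS"
    return "$status"
}
```

Usage would be something like `with_efivars_rw efibootmgr ...` as root. Note that this does nothing about two callers racing each other, which is the concurrency objection raised elsewhere in the thread.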
That it was returning zero would cause the linux ACPI framework to ignore it and not probe its driver. My vague understanding is that windows works differently, and calling _STA is done by the driver, so it's possible to just not do it and still have a working system.
I don't know what the device itself is, but given that the script says "audio" in there it's probably the audio codec.
To my mind there are two places to fix this: in the kernel for a real mitigation technique that helps "solve" the problem, and in the distros for quick fixes and hacks, or backports of the kernel fix, as necessary. Systemd pushing a fix of their own 1) only affects distros using systemd (while this problem affects all recent distros that use efivarfs), 2) probably won't get picked up by distros immediately except as a backported fix anyway, as I doubt most of them push new versions of something as integral as systemd every time a new version is out, at least not with a lot of testing, and 3) would not have been a good fix, and would have required the utilities that still needed access to actually remount a filesystem.
1: Triggering this problem is not as easy as what you (and I) wrote in most instances, as / has special consideration in rm, and generally requires the "--no-preserve-root" flag.
This motherboard... It refuses any sort of reflashing of the firmware? Taking the button cell out of the battery slot, and removing all power from the board does nothing? The motherboard won't enter BIOS, upon pressing F10 at power on?
What is this... "bricking" we speak of, here?
I mean, yeah, I've destroyed many a partition table in my day, and I've permanently lost myself some data, I've even dd'ed in the wrong direction with no recourse but to suck it up and deal with it, but I've never fried a computer with a rm command. (Contrary to what some commenters seem to be viciously defending, this does seem to be a legitimately different level of destructive possibility than has conventionally been available. This is the sort of thing that would put me off having ever installed Linux in the first place.)
There is a --one-file-system argument that skips directories not on the same filesystem. You could add this layer of protection by adding it to an alias in your shell.
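A minimal sketch of that, assuming GNU coreutils `rm` (the option is not part of POSIX, so other `rm` implementations may reject it). `rm_one_fs` is a name made up here; a function is used instead of an alias so it also works in scripts.

```shell
#!/bin/sh
# Assumes GNU coreutils rm; --one-file-system is not a POSIX option.
# Recursive deletes through this wrapper skip directories that live on a
# different file system than the argument given on the command line.
rm_one_fs() {
    command rm --one-file-system "$@"
}
```

For interactive use the equivalent would be `alias rm='rm --one-file-system'` in your shell's startup file.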
[1]https://www.digikey.com/product-search/en/test-and-measureme...
Already exposing processes as files is an abstraction. It somewhat works because you can imagine the file representation being maintained by the process. But it is an abstraction, because a process is not a file.
But what is more important: A file system is a hierarchy. At the root is the most fundamental object. Each level has subordinate objects. That's the model you expect.
Having UEFI variables mounted as a file is a surprising loop back to something even more fundamental than the OS itself: The firmware of the physical computer. It's a breach of the mental model.
It breaks one of the most fundamental principles that should be followed in man-machine interaction: The principle of least surprise.
I have a machine. I have installed an operating system on it. The OS manages several disks. On the disks the OS manages file systems. I expect the files of that system to be managed by the OS.
I do not expect that regular file system actions have effect outside the hierarchy of the directory on which I perform the actions. Specifically I do not expect files on that system to manage the physical computer.
You may want to call Linux a crazy operating system, but then you are veering close to Godwin's Law anyway.
The kernel is where the buck stops when it comes to protecting hardware and, therefore, protecting software from misdesigned and/or buggy hardware. That's been true for longer than most of us have been alive.
No. The file system is the abstraction. Adding /proc onto it is a use of that abstraction.
There's two basic extreme positions, and you're adopting one, your parent is adopting something closer to the other.
a. The filesystem only exposes filesystems actually on disk, mapped to some hierarchy. As you say, "On the disks the OS manages file systems."
b. The filesystem is (roughly) a hierarchical container of named binary blobs (called "files") with some defined associated metadata, such as permissions.
While you can adopt (a), and that's fine, some of us (myself included) see a lot of value in (b). The biggest problem with only exposing "real" file-storing FSes in the file hierarchy is that it leaves you with a ton of questions about how to expose all the other things. Taking the stance that we're only going to expose "real" files in the file hierarchy leaves us with several classes of objects that aren't files-on-disk, and you need to name them s.t. the user can interact with them. It is certainly possible to expose each different type of thing in a completely separate namespace. You'll probably also need to be able to associate permissions with those objects¹, and so now you've got a named, ACL'd list or hierarchy of objects, and it's starting to look a lot like a filesystem. You now also need another set of tooling to work with each of these classes of objects. You need another set of syscalls for each of these objects.
The great thing about having a unified file hierarchy in the (b) abstraction is that the same tooling works on all of these different classes of objects. It's really just the "CRUD" idiom, and normally it allows things to interoperate quite smoothly. I can write a bash script that draws a progress bar of my battery, and it requires no knowledge other than where in the file hierarchy the battery is.
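As a rough sketch of that battery script: the path below is an assumption (the battery is often `BAT0` under `/sys/class/power_supply`, but the name varies between machines), and `draw_bar` is just a helper name chosen for the example.

```shell
#!/bin/sh
# Draw a 20-character text bar for a percentage between 0 and 100.
draw_bar() {
    pct=$1
    width=20
    filled=$(( pct * width / 100 ))
    bar=""
    i=0
    while [ "$i" -lt "$width" ]; do
        if [ "$i" -lt "$filled" ]; then
            bar="${bar}#"
        else
            bar="${bar}-"
        fi
        i=$(( i + 1 ))
    done
    printf '[%s] %s%%\n' "$bar" "$pct"
}

# Assumed path: the battery name (BAT0) differs between machines.
capacity_file=${BATTERY_CAPACITY_FILE:-/sys/class/power_supply/BAT0/capacity}
if [ -r "$capacity_file" ]; then
    draw_bar "$(cat "$capacity_file")"
fi
```

Nothing here knows anything about batteries beyond a file path; the same `cat` would work on any other one-number file in the hierarchy.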
This is, of course, a case where the power is somewhat biting us. That doesn't make the abstraction wrong, nor does it mean the abstraction isn't leaky. (In fact, in this case, the abstraction works really well, I'd say. Any other implementation of UEFI variables is going to have a "delete" call, AFAICT. What bit us here is that all the objects are in one bucket together, and thus rm -rf / removes more than just files.)
> It breaks one of the most fundamental principles that should be followed in man-machine interaction: The principle of least surprise.
While I agree, that doesn't mean we need to throw out all the power of having a unified file system, but it might beget some way of ensuring the user understands what `rm -rf /` actually does. There's certainly more than one way to solve this, some of which don't involve limiting what can be done with the FS. (As some examples: perhaps rm shouldn't recurse to a different FS, and objects of similar types are on different FSs, which prevents the very error that got us here; perhaps some files force "user acknowledgement" of their removal; perhaps it really does get mounted read-only.)
¹While you might be able to get away with "only root accesses UEFI vars" in the scenario that they're not in the file hierarchy, if you remove all non-real-files then you've got a lot of other things to deal with: unix sockets, block devices, terminals, all the various I/O ports, temp sensors, battery data… the list is extensive.
I replaced the bricked device and I'm going to be a lot more careful this time.
Booting Ubuntu Wily works, but there's no battery (status/charging?), wifi, audio or touchscreen. So if you use the XDA scale it's working perfectly!
I have another Z3735 device (MeegoPad T01 - Intel Compute Stick knockoff), but it's unusable because the clock runs fast, then slow, then fast - enough that an NTP sync makes the clock go backwards and then everything breaks.
These chipsets are turning up everywhere and most of the time the implementation is garbage. I hope Intel did better with the reference implementation/s but I can't afford them at the moment.
No, it's not the same. Mounting `/boot` rw by default does not put your system in danger of getting damaged beyond repair. If you hose the boot partition you can always start a recovery system (live Linux or similar) to repair the damage.
But if deleting efivars renders a system inoperable on a firmware level you're essentially SOL, save for rewriting the contents of the system firmware flash using an external programmer and a clean image. That is an absolutely unacceptable situation. The year is 2016 and hosing a firmware by writing malformed values into the firmware API is, simply put, a software vulnerability that allows an attacker to permanently DoS a system. As such this is a security issue that must be fixed where the security issue happens. And in the case of efivars the issue is that certain input is not properly validated and/or sanitized. If a system firmware cannot properly start with certain variables being unset or removed or set to invalid values, it should be an implementation requirement to validate input on such variables before executing the change.
> Defaulting to rw efivars is, excuse the language, stupid.
It probably is. But it's not the responsibility of the OS to sanitize values that are not intended for being used by the OS. efivars are intended to be used by (U)EFI and hence it's the (U)EFI implementation's task to properly sanitize access to them.
Essentially we're talking Bobby Tables here, just with a different API.
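To make the input-validation point concrete, here is a minimal sketch of a variable store that checks every request before acting on it. Everything in it is hypothetical: the class, the variable names in `REQUIRED_AT_BOOT`, and the policy itself are made up for illustration and are not taken from the UEFI specification or any real firmware.

```python
# Hypothetical sketch: the names and the policy below are illustrative only.
REQUIRED_AT_BOOT = {"PlatformConfig", "SetupDefaults"}  # made-up variable names


class VariableStore:
    """Toy model of a firmware variable store that validates every request."""

    def __init__(self, initial):
        self._vars = dict(initial)

    def delete(self, name):
        # Refuse a request that would leave the firmware unable to start.
        if name in REQUIRED_AT_BOOT:
            raise PermissionError(f"refusing to delete boot-critical variable {name!r}")
        self._vars.pop(name, None)

    def set(self, name, value):
        # Reject malformed input instead of storing it and failing at next boot.
        if not isinstance(value, bytes) or len(value) == 0:
            raise ValueError(f"malformed value for {name!r}")
        self._vars[name] = value

    def names(self):
        return sorted(self._vars)
```

The point of the sketch is only where the check lives: in the component that breaks when the data is bad, not in whichever caller happens to be writing.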
There's a certain rich irony that Lennart is being flogged for following the Unix tradition (root can do anything, including blow up their monitor with bad X configs); that his detractors are suggesting userland tools manage their hardware (normally the job of the kernel in a Unix system); that systemd ought to be expanded to manage hardware (after years of complaining it's too big), presumably by adding whitelisting capabilities and a database of known-good/known-bad UEFI implementations.
I guess at least it demonstrates how utterly unhinged some people become when his name is attached to anything.
It's perfectly reasonable to mount efivars ro. But doing so must not be the workaround for a security issue (and yes, this is a security issue) in (certain implementations of) (U)EFI. Just to make clear why this is a security issue: security rests on three pillars:
- availability
- confidentiality
- authenticity
Rendering a system unusable (DoS-ing it) is an attack on availability. And speaking of security, if the goal is sabotage and causing large financial damage, then being able to permanently brick a system after a privilege escalation (there's nothing stopping UID=0 from remounting efivars rw) is pretty bad. And no, the fixes implemented in the efivars kernel code don't help, because an attacker can still load a custom kernel module that talks to the respective efivars code directly, circumventing the sanity checks (or talks directly to (U)EFI without using the efivars code at all).
Agree that the file system is an abstraction. Makes us think in terms of directories (containers) and files (items). Everything in the file system is designed around the idea of files and directories. Permissions (rwx), operations (create, move, copy, append, delete).
However, adding /proc already challenges that. What does it mean to have "execute" rights on a process? It is already running. What does it mean to append to a process? To move it? If processes are "files", why can I not kill the process by deleting the file? Processes are not naturally files. Yes, it makes some sense if you think of /proc as status information being maintained for each process, i.e. they are extracts, owned by the OS.
But UEFI vars make absolutely no sense. It is a truly leaky abstraction. If one needs to be able to write to UEFI vars, then create an API for it, possibly with some utilities. That way I need not risk altering fundamental firmware settings by performing what look like file system operations, whose effect I expect to be limited to the hierarchy!
> The great thing about having a unified file hierarchy in the (b) abstraction is that tooling works on all of these different classes of objects different. It's really just the "CRUD" idiom, and normally it allows things to interoperate quite smoothly. I can write a bash script that draws a progress bar of my battery, and it requires no knowledge other than where in the file hierarchy the battery is.
But it is actually just sweeping complexity under the rug. I need documentation for what the file contains on each "line", what it means to write to it, etc. It is not discoverable at all. If you expose system resources as actual resources and do not try to map them onto files, you can actually make a discoverable system. An example of such a regime is CIM. On Windows, PowerShell (or Python or VBScript or ...) can be used to interact with such fundamental system resources. To use your example of a progress bar for the battery, here is the entire process on Windows, from discovering the correct resource (the battery) to displaying a progress bar, without consulting documentation:
PS C:\> #there's probably some class for batteries. let's look for it by name
PS C:\> get-cimclass *battery*
NameSpace: ROOT/cimv2
CimClassName CimClassMethods CimClassProperties
------------ --------------- ------------------
CIM_Battery {SetPowerState, R... {Caption, Description, InstallDate, Name...}
Win32_Battery {SetPowerState, R... {Caption, Description, InstallDate, Name...}
Win32_PortableBattery {SetPowerState, R... {Caption, Description, InstallDate, Name...}
CIM_AssociatedBattery {} {Antecedent, Dependent}
PS C:\> # the Win32_Battery probably offers the most specific information
PS C:\> Get-CimInstance Win32_Battery
Caption : Internal Battery
Description : Internal Battery
Name : DELL 1C75X31
Status : OK
Availability : 2
CreationClassName : Win32_Battery
DeviceID : 647Samsung SDIDELL 1C75X31
PowerManagementCapabilities : {1}
PowerManagementSupported : False
SystemCreationClassName : Win32_ComputerSystem
...
BatteryStatus : 2
Chemistry : 6
DesignCapacity :
DesignVoltage : 12992
EstimatedChargeRemaining : 94
EstimatedRunTime : 71582788
ExpectedLife :
MaxRechargeTime :
...
ExpectedBatteryLife :
PS C:\> # yep - that's it. lets save this instance in a variable
PS C:\> $bat = Get-CimInstance Win32_Battery
PS C:\> # display a progress bar and update it continually every 10 secs
PS C:\> while($true){ $bat = Get-CimInstance Win32_Battery; Write-Progress Battery -PercentComplete $bat.EstimatedChargeRemaining -Status "Charge remaining"; sleep 10 }
> This is, of course, a case where the power is somewhat biting us.
No, what is biting us is a leaky abstraction that surprises us: we can accidentally delete firmware variables because file system operations are not constrained to the directories/files they operate on.
> That doesn't make the abstraction wrong
It is an abuse of the abstraction.
> Any other implementation of UEFI variables is going to have a "delete" call, AFAICT.
Indeed. In PowerShell you can discover the commands for manipulating them with `gcm *UEFI*`.
> What bit us here is that all the objects are in one bucket together, and thus rm -rf / removes more than just files.
No, what bit us is the broken expectation (a surprise) that a higher-level resource was mapped below some file system directory.
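One way to keep a recursive delete confined to the hierarchy the user had in mind is to refuse to cross into another mounted filesystem, which is what GNU `rm --one-file-system` does. A minimal Python sketch of that idea follows; the function name and the injectable `dev_of` parameter are my own and exist only so the behaviour can be tried without real mount points.

```python
import os


def remove_tree_one_fs(root, dev_of=lambda p: os.lstat(p).st_dev):
    """Delete root and everything under it, but never descend into a
    directory that lives on a different filesystem than root.

    Returns the list of directories that were skipped for that reason.
    """
    root_dev = dev_of(root)
    skipped = []

    def _remove(path):
        if os.path.isdir(path) and not os.path.islink(path):
            if dev_of(path) != root_dev:
                skipped.append(path)  # mount point of another filesystem
                return False
            emptied = True
            for entry in os.listdir(path):
                emptied = _remove(os.path.join(path, entry)) and emptied
            if emptied:
                os.rmdir(path)  # only possible once nothing was skipped below
            return emptied
        os.unlink(path)
        return True

    _remove(root)
    return skipped
```

Since efivarfs is a separate filesystem mounted under /sys, a check like this would stop at its mount point. It does nothing for someone who runs the delete inside the efivars directory itself.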
It does not change the fact the fault lies with shitty proprietary UEFI implementations, and nobody writing free software is at fault here.
The EFIVars table is stored in mainboard flash as... file(s). Probably only one, since the firmware isn't going to be using a filesystem.
But it is still a map of key-value pairs. That maps perfectly fine to a filesystem.
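For what it's worth, the mapping really is that thin. As far as I understand the kernel's efivarfs layout, each file is named `<VariableName>-<VendorGUID>`, and its contents are a 4-byte little-endian attribute word followed by the raw value. A small sketch of decoding one entry (the function and tuple names are mine, not part of any library):

```python
import struct
from typing import NamedTuple

GUID_LEN = 36  # textual GUID, e.g. 8be4df61-93ca-11d2-aa0d-00e098032b8c

# The three basic attribute bits from the UEFI specification.
ATTRS = {0x1: "NON_VOLATILE", 0x2: "BOOTSERVICE_ACCESS", 0x4: "RUNTIME_ACCESS"}


class EfiVar(NamedTuple):
    name: str
    guid: str
    attributes: list
    data: bytes


def parse_efivar(filename, raw):
    """Split an efivarfs entry into its key, attribute flags and value."""
    # The file name is "<name>-<guid>"; the GUID has a fixed textual length.
    name, guid = filename[:-GUID_LEN - 1], filename[-GUID_LEN:]
    (bits,) = struct.unpack_from("<I", raw)  # first 4 bytes: attribute word
    flags = [label for bit, label in ATTRS.items() if bits & bit]
    return EfiVar(name, guid, flags, raw[4:])
```

On a real system, `raw` would come from opening a file under /sys/firmware/efi/efivars in binary mode and reading it whole.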
Nothing about the "brick your laptop with rm -rf /" is the fault of any free software component or philosophy. By the specification of UEFI itself, the efivar table is for transient data storage between the OS and the firmware. It is supposed to be mutable and removable: the OS can do whatever it wants to it, the firmware can do whatever it wants as well, and neither's behavior should stop the world.
All this was is a demonstration of why proprietary firmware is bad, and that again the free software community needs to work around broken proprietary crap that cannot obey its own design documents.
If you want to soapbox about how everything would be easier if we had to link a library and access all OS level data through some 50k command API instead of through files... I'm not sure you are going to actually find an instance where people are improperly treating data as files, because almost anything can be treated as a file. You can implement it poorly, but if it is data and it has organization you can put it in a filesystem.
A virus should not be able to destroy the system BIOS. The problem with efivars is illuminating a vulnerability, not a feature for "doing stuff". I would expect to see this used in the wild if left unfixed, especially the next time we hear about a remote vulnerability that permits arbitrary code execution.
If the implementation is bad (which it is), that doesn't excuse reckless userland software that refuses to acknowledge flaws in the firmware.
Saying that Lennart is wrong has become a rather popular sport, let's not go overboard to say it even when he's right by all accounts.
The kernel fix doesn't have such a drawback.
The kernel fix prevents that; using mount flags alone only restricts the vulnerability, it doesn't make it go away.
But on sysvinit if you mount efivars rw for any reason and your hands slip a bit a stray `rm` could still brick your motherboard, so it's not really fair to say that without automounting the issue goes away.
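Whatever the init system, it is easy to check how efivarfs is currently mounted before doing anything risky. A minimal sketch, assuming the usual mount point /sys/firmware/efi/efivars and the standard /proc/mounts format; the function name is hypothetical:

```python
EFIVARS_MOUNTPOINT = "/sys/firmware/efi/efivars"


def efivarfs_mode(mounts_text, mountpoint=EFIVARS_MOUNTPOINT):
    """Return "rw", "ro", or None if efivarfs is not mounted at mountpoint.

    mounts_text is the content of /proc/mounts: one mount per line, with
    whitespace-separated fields (device, mountpoint, fstype, options, ...).
    """
    mode = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        _, where, fstype, options = fields[:4]
        if where == mountpoint and fstype == "efivarfs":
            # Later entries shadow earlier ones, so keep the last match.
            mode = "ro" if "ro" in options.split(",") else "rw"
    return mode
```

Typical use would be `efivarfs_mode(open("/proc/mounts").read())`. As noted above, this only tells you the current state; root can remount at any time.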
Another problem is that the laptop has a 2.1 sound system (or 4.1 maybe, I am not actually sure?) and the outputs are a bit wonky (which can, apparently, also be fixed/configured with hda-analyzer).
In short, the whole laptop is a mess. I imagine it will be fixed eventually by Linux sound drivers. I am still collecting data to open a bug report on kernel.org, hoping it spares future people from having to go through all this… bullshit, for lack of a more apt expression.
I once tried 'format C:' on a Windows 10 laptop I didn't care about and I just got a boring error message.
I recently ran into a funny bug, however, that makes me more sympathetic to your point. In Emacs, a version of TRAMP mode (which is used to connect to remote servers or to connect locally as a different user) would try to clean up after itself by deleting some sort of tramp history file after a session. And if the history file didn't exist to begin with (or if a setting disabled it, I'm not sure exactly), someone thought it appropriate to simply open up "/dev/null", throwing away all writes to the file -- OK, makes sense so far.
But in TRAMP I often connect as root to my own machine -- it makes it easy to edit files as root, or run a shell as root. And as root you have the power to delete /dev/null! So TRAMP would delete it without my knowledge... What's odd is that it quickly gets created again (perhaps by Emacs) so that it appears to exist, except that now a) it's a regular file and b) it's owned by root without world read/write permission, so all sorts of things suddenly start to fail because they can't open /dev/null. Fun.
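If you suspect the same thing has happened to you, both symptoms are easy to test for. A small sketch; the function name is made up, and it only checks the two properties described above (character device, world read/write):

```python
import os
import stat


def looks_like_dev_null(path="/dev/null"):
    """True if path is a character device that everyone may read and write."""
    st = os.stat(path)
    world_rw = stat.S_IROTH | stat.S_IWOTH
    return stat.S_ISCHR(st.st_mode) and (st.st_mode & world_rw) == world_rw
```

A regular file that was recreated at that path fails the first condition regardless of its permissions.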
Yes, it might've been hyperbole, but the fact that systemd has so much code inside it which has nothing to do with "being an init system" just begs the question "why?".
OpenRC is what got closest to being the replacement for SysV init, just by its qualities and market share. Other, nicer systems exist, but they never found wide adoption because the main distributions didn't let go of SysV.
Now Lennart claims all the bragging rights, but people who used modern init systems before it was cool know better (there are a lot of Lennart opponents among those, BTW).