Unauthenticated Remote Code Execution in Erlang/OTP SSH

1. aftbit ◴[17 Apr 25 14:28 UTC] No.43717385[source]▶

As I understand it, this is talking about an SSH server built into Erlang/OTP, not e.g. OpenSSH on a server with Erlang installed.

>Any service using Erlang/OTP's SSH library for remote access such as those used in OT/IoT devices, edge computing devices are susceptible to exploitation.

https://thehackernews.com/2025/04/critical-erlangotp-ssh-vul...

replies(2): >>43717937 #>>43719581 #

2. kimi ◴[17 Apr 25 15:02 UTC] No.43717937[source]▶

>>43717385 (TP) #

Yes - one of the many things that you can find in OTP is a programmable SSH/SCP client and server. The vulnerability is in the server component.

See for example https://blog.differentpla.net/blog/2022/11/01/erlang-ssh/

replies(1): >>43718289 #

3. davidw ◴[17 Apr 25 15:29 UTC] No.43718289[source]▶

>>43717937 #

Erlang, because of its architecture, has something of a habit of people rewriting various protocols in Erlang itself, rather than calling out to some C library.

This has pros and cons.

replies(2): >>43718454 #>>43718607 #

4. innocentoldguy ◴[17 Apr 25 15:39 UTC] No.43718454{3}[source]▶

>>43718289 #

This is probably because C NIFs run in the same process as the Erlang scheduler. If you have a long-running or blocking NIF, it can starve the scheduler and cause significant performance degradation across the system.

replies(2): >>43718599 #>>43721295 #

5. natrys ◴[17 Apr 25 15:49 UTC] No.43718599{4}[source]▶

>>43718454 #

I think they now have "dirty" NIFs that use a separate scheduler for this.

replies(1): >>43718797 #

6. toast0 ◴[17 Apr 25 15:50 UTC] No.43718607{3}[source]▶

>>43718289 #

Writing protocol code in Erlang is nice, because the parsing is so easy and clear. And if you want to do something that isn't easy to do by spawning a command, then you may as well build it in Erlang. And it's fun and symmetric to build both a server and a client... I've not looked at the OTP SSH code, but I'd assume the ciphering is still done by calls to external C libraries, as it is in the OTP TLS code.

Of course, easy protocol parsing doesn't do the whole job; state management is required too (and was missed here, clearly).
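A minimal sketch of what that state management amounts to, assuming the problem is of the general kind "a message was accepted in a state where it should have been rejected". This is not the OTP code: the `Session` class, its method, and the return strings are hypothetical, and Python stands in for Erlang. Only the message numbers come from the SSH specs (RFC 4250 assigns 50-79 to user authentication and 80 and above to the connection protocol).

```python
from enum import Enum, auto

# Message numbers from RFC 4250; everything else here is hypothetical.
SSH_MSG_USERAUTH_REQUEST = 50
SSH_MSG_CHANNEL_OPEN = 90
SSH_MSG_CHANNEL_REQUEST = 98
FIRST_CONNECTION_MSG = 80  # 80 and above: connection protocol


class State(Enum):
    PRE_AUTH = auto()
    AUTHENTICATED = auto()


class ProtocolError(Exception):
    """Raised when a message arrives in a state that does not allow it."""


class Session:
    def __init__(self):
        self.state = State.PRE_AUTH

    def handle(self, msg_type, check_credentials=lambda: False):
        # The guard: connection-protocol messages are only legal after auth.
        if msg_type >= FIRST_CONNECTION_MSG and self.state is not State.AUTHENTICATED:
            raise ProtocolError(f"message {msg_type} not allowed before authentication")
        if msg_type == SSH_MSG_USERAUTH_REQUEST:
            if check_credentials():
                self.state = State.AUTHENTICATED
                return "auth ok"
            return "auth failed"
        if msg_type >= FIRST_CONNECTION_MSG:
            return "dispatched"
        return "ignored"
```

The parsing is trivial in this sketch. The security property lives entirely in the first `if`, which is easy to forget when each message handler is written on its own.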

7. throwawaymaths ◴[17 Apr 25 16:06 UTC] No.43718797{5}[source]▶

>>43718599 #

yes, but there is a finite number of them, by default equal to the number of available cores. If your connection stays in C-land for too long you might run into trouble if you want more than one connection.

8. rollcat ◴[17 Apr 25 17:06 UTC] No.43719581[source]▶

>>43717385 (TP) #

This is why I generally do not rely on SSH servers other than OpenSSH. It's (by far) the most widely deployed implementation, thoroughly battle-tested, etc. It's also hard to actually get pwned; the OpenBSD[1] guys believe in security as the default.

There's some value in avoiding a monoculture, or choosing different trade-offs (e.g. binary size, memory usage). But as exemplified by this incident, any incentives must be carefully weighed against the risks. SSH is your final line of defence.

[1]: https://www.openbsd.org/donations.html

replies(3): >>43720311 #>>43720798 #>>43722007 #

9. whizzter ◴[17 Apr 25 18:08 UTC] No.43720311[source]▶

>>43719581 #

There's a huge difference here. Historically that was because many C codebases were vulnerable due to inherent C flaws, and SSH daemons, due to their age, were C-based. The OpenBSD folks' stance on coding and system design avoids those pitfalls.

This is an Erlang daemon, thus written in a managed language without buffer overflows, etc., but it seems like someone left a huge gaping logic hole to drive a bus through. SSH or not, this could've equally well been a logic hole in a base webserver, etc.

I'd say this is more akin to the Log4j debacle, a perfectly safe language but bad design makes it vulnerable to something trivial.

10. PhilipRoman ◴[17 Apr 25 18:58 UTC] No.43720798[source]▶

>>43719581 #

I also have this principle, although I make an exception for https://tinyssh.org

11. hinkley ◴[17 Apr 25 19:45 UTC] No.43721295{4}[source]▶

>>43718454 #

I wonder if there's space for a libuv inspired solution now.

replies(1): >>43723070 #

12. VWWHFSfQ ◴[17 Apr 25 20:41 UTC] No.43722007[source]▶

>>43719581 #

OpenSSH has actually been "pwned" numerous times though. It's a very desirable target.

replies(2): >>43724022 #>>43725818 #

13. toast0 ◴[17 Apr 25 22:52 UTC] No.43723070{5}[source]▶

>>43721295 #

libuv is more or less an abstraction around an event loop for async i/o, right?

The BEAM is also more or less an abstraction around an event loop for async i/o. If you want async i/o in NIFs, I think you want to integrate with BEAM's event loop. Inside NIFs, I think you want to use enif_select [1] (and friends), available since OTP 20, originally released 2017-06-21. In a port driver, you'd use driver_select [2], which I think has been around forever --- there are mentions of changes in R13, which I think was mostly released 2009-11-20 (that may have been R13B though).

[1] https://www.erlang.org/doc/apps/erts/erl_nif.html#enif_selec...

[2] https://www.erlang.org/doc/apps/erts/erl_driver.html#driver_...

replies(1): >>43723271 #

14. hinkley ◴[17 Apr 25 23:19 UTC] No.43723271{6}[source]▶

>>43723070 #

It uses different threads in order to make infrequent blocking calls look like they are asynchronous.
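A rough illustration of that pattern, in Python rather than C and using asyncio's executor rather than libuv itself. `blocking_call` is a hypothetical stand-in for a blocking native call, and the pool size of 4 is an arbitrary choice for the sketch.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_call(n):
    # Stand-in for a blocking native call (hypothetical).
    time.sleep(0.05)
    return n * 2


async def main():
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=4) as pool:
        # Each blocking call runs on a pool thread; the event loop stays free
        # and only sees futures that complete later.
        futures = [loop.run_in_executor(pool, blocking_call, n) for n in range(4)]
        return await asyncio.gather(*futures)


results = asyncio.run(main())
```

From the caller's side the calls look asynchronous, but the workers are still threads in the same address space as the event loop.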

What we (or at least some quantity of “we”) want is for infrequent native calls to be able to fail without taking the BEAM down.

The problem with doing it with threads though is that a bad thread can still vomit all over working memory, still causing a panic even if it itself doesn’t panic.

replies(1): >>43725727 #

15. throwawaymaths ◴[18 Apr 25 01:25 UTC] No.43724022{3}[source]▶

>>43722007 #

yeah and iirc erlang's ssl was one of the only ssl implementations not affected by Heartbleed since erlang is memory safe

replies(1): >>43734767 #

16. toast0 ◴[18 Apr 25 07:04 UTC] No.43725727{7}[source]▶

>>43723271 #

Oh I see. If you need to isolate your native code from your BEAM, then you've got several options.

a) run the native code as a separate process; either a port program, a c-node, or just a regular program that you interact with via some interprocess communication (sockets, pipes, signals, a shared filesystem, shared memory segments if you're brave)

b) some sort of sandybox thing; like compile to wasm and then jit back to (hopefully) safe native.

c) just run the native code, it's probably fine, hopefully. My experience with NIFs is that they are usually very short code that should be easy to review, so this isn't as bad as it sounds...

If your native code is short, option c is probably fine; if your native code is long, option a makes more sense. If you want to make things harder without real justification, b sounds good :P
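Option a can be sketched roughly as follows, in Python for brevity. This is only an analogy for a port program: the `WORKER` script and `call_isolated` helper are hypothetical, and a real port is usually a long-lived external program rather than one child per request. The point it shows is that a hard crash in the child surfaces as an exit status in the parent instead of taking the parent down.

```python
import subprocess
import sys

# Hypothetical stand-in for glitchy native code: a child that dies
# abruptly on bad input instead of returning an error.
WORKER = r"""
import os, sys
line = sys.stdin.readline().strip()
if line == "crash":
    os._exit(3)  # simulate a hard crash with no cleanup
print(line.upper())
"""


def call_isolated(request):
    # One short-lived child per request, talking over stdin/stdout pipes.
    proc = subprocess.run(
        [sys.executable, "-c", WORKER],
        input=request + "\n",
        capture_output=True,
        text=True,
        timeout=10,
    )
    if proc.returncode != 0:
        # The child died; the parent keeps running and can report or retry.
        return ("error", proc.returncode)
    return ("ok", proc.stdout.strip())
```

The cost is serialization and process start-up on every call, which is why this fits long-running native code better than short, hot functions.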

replies(1): >>43730775 #

17. rollcat ◴[18 Apr 25 07:24 UTC] No.43725818{3}[source]▶

>>43722007 #

I think in case of any security-critical project it's important to evaluate the track record objectively:

https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=OpenSSH

It's true that there are 5 advisories so far this year alone, but let's consider the actual impact:

    CVE-2025-32728 - Error in documentation, possibly leading to misconfiguration
    CVE-2025-30095 - Debian+dropbear-specific
    CVE-2025-27731 - Windows-specific; local privilege escalation; OpenSSH doesn't target/support Windows
    CVE-2025-26466 - Remote DoS
    CVE-2025-26465 - MitM involving host key DNS verification; high attack complexity (relies on exhausting client memory)

OpenBSD enables sshd(8) in the default install, and has so far had two RCEs in 30 years. Now, not everyone runs OpenBSD, but I'd personally throw the stones at e.g. Debian (see CVE-2008-0166).

18. hinkley ◴[18 Apr 25 18:43 UTC] No.43730775{8}[source]▶

>>43725727 #

I suspect the confusion of priorities comes from running native code that’s time consuming and wanting it to horizontally scale with your cluster, which is easier if you try to let the BEAM do it.

What is easier for those of us not working on the BEAM, though, is to put the glitchy code into its own service and put up with the maintenance overhead.

But when you have one or two solutions, moving to three meets a lot of friction. People start talking about how having dozens will be chaos. While true, sometimes you really do just need three and it’s not a slippery slope.

replies(1): >>43734673 #

19. toast0 ◴[19 Apr 25 06:40 UTC] No.43734673{9}[source]▶

>>43730775 #

> sometimes you really do just need three

If it's worth doing, it's worth doing three times.

20. toast0 ◴[19 Apr 25 07:07 UTC] No.43734767{4}[source]▶

>>43724022 #

I'm a big fan of Erlang, but I don't think this is a fair thing to praise.

Only OpenSSL had heartbleed. No other implementation of TLS protocols was affected. Many systems integrate with OpenSSL's protocol code, but there's also several that do their own protocol work and use ciphers from OpenSSL (and some that do both).

Erlang's ssl implementation at the time of Heartbleed wasn't anywhere close in throughput to using OpenSSL separately. If I'm remembering right, OTP 18 (June 2015) is when it got good enough that it made more sense to run an Erlang https server without a separate TLS termination daemon. Heartbleed became known April 2014, so Erlang SSL was too late to help there, really. More secure, but unusable under load doesn't help much.

Also, Erlang SSL was one of many implementations that needed to be reminded of 1998 era security issues in 2017. [1]

[1] https://nvd.nist.gov/vuln/detail/CVE-2017-1000385