Then you top it off with the `?` shortcut and the functional interface of Result, and suddenly error handling becomes fun and easy to deal with, rather than just "return false" with a "TODO: figure out error handling".
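For instance, a minimal sketch (using only std) of how `?` and the Result combinators compose:

    use std::num::ParseIntError;

    fn double(input: &str) -> Result<i64, ParseIntError> {
        let n: i64 = input.trim().parse()?; // `?` early-returns the Err variant
        Ok(n * 2)
    }

    fn main() {
        // Functional combinators instead of an explicit match:
        let msg = double("21")
            .map(|n| format!("doubled: {n}"))
            .unwrap_or_else(|e| format!("parse failed: {e}"));
        println!("{msg}");
    }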
This isn't really true since Rust has panics. It would be nice to have out-of-the-box support for a "no panics" subset of Rust, which would also make it easier to properly support linear (no auto-drop) types.
I'm not sure what the latest is in this space; if I recall correctly, there are some subtleties.
Also there is the no_panic crate, which uses macros to require the compiler to prove that a given function cannot panic.
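A sketch of how the attribute is used, assuming the no_panic crate as a dependency (it works by failing the build at link time if the optimizer can't prove the function panic-free, so it needs optimized builds):

    use no_panic::no_panic;

    #[no_panic]
    fn first_or_zero(bytes: &[u8]) -> u8 {
        match bytes.first() {
            Some(&b) => b,
            None => 0, // no panicking path left, so the proof succeeds
        }
    }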
Even then, though, I do see a need to catch panics in some situations: if I'm writing some sort of API or web service, and there's some inconsistency in a particular request (even if it's because of a bug I've written), I probably really would prefer only that request to abort, not for the entire process to be torn down, terminating any other in-flight requests that might be just fine.
But otherwise, you really should just not be catching panics at all.
You could still have a panic, though, if you make wrong assumptions.
I will fight against program aborts as hard as I possibly can. I don't mind boilerplate being the price paid, and I will provide detailed error messages even in such obscure error branches.
Again, speaking only for myself. My philosophy is: the program is no good for me dead.
Assuming that you are not using much recursion, you can eliminate most of the heap-related memory panics by adding limited reservation checks for dynamic data that is allocated based on user input / external data. You should also use statically sized types whenever possible; they are also faster.
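On stable Rust the reservation-check part can look like this (a minimal sketch using Vec::try_reserve_exact, stable since 1.57):

    use std::collections::TryReserveError;

    // Reserve space for externally sized data up front instead of
    // letting a failed allocation abort the process.
    fn copy_input(input: &[u8]) -> Result<Vec<u8>, TryReserveError> {
        let mut buf = Vec::new();
        buf.try_reserve_exact(input.len())?; // Err on allocation failure
        buf.extend_from_slice(input);
        Ok(buf)
    }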
But for arithmetic, Rust has a non-aborting checked API, if my memory serves.
And that's what I'm trying hard to do in my Rust code, for example: don't frivolously use `unwrap` or `expect`, ever. And just generally try hard to never use an API that can crash. You can write a few error branches that might never get triggered. It's not the end of the world.
But in those places, you'd better know exactly what you are doing.
That may be true, but the program may actually be bad for you if it does something unexpected due to an unforeseen state.
Return an AllocationError. Rust unfortunately picked the wrong default here for the sake of convenience, along with the default of assuming a global allocator. It's now trying to add in explicit allocators and allocation failure handling (A:Allocator type param) at the cost of splitting the ecosystem (all third-party code, including parts of libstd itself like std::io::Read::read_to_end, only work with A=GlobalAlloc).
Zig for example does it right by having explicit allocators from the start, plus good support for having the allocator outside the type (ArrayList vs ArrayListUnmanaged) so that multiple values within a composite type can all use the same allocator.
> Also many functions use addition and what is one supposed to do in case of overflow?
Return an error ( https://doc.rust-lang.org/stable/std/primitive.i64.html#meth... ) or a signal that overflow occurred ( https://doc.rust-lang.org/stable/std/primitive.i64.html#meth... ). Or use wrapping addition ( https://doc.rust-lang.org/stable/std/primitive.i64.html#meth... ) if that was intended.
Note that for the checked case, it is possible to have a newtype wrapper that impls std::ops::Add etc, so that you can continue using the compact `+` etc instead of the cumbersome `.checked_add(...)` etc. For the wrapping case libstd already has such a newtype: std::num::Wrapping.
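A minimal sketch of such a wrapper (the name `Checked` and the choice of Option as the output type are illustrative, not from std):

    use std::ops::Add;

    #[derive(Clone, Copy, Debug, PartialEq)]
    struct Checked(i64);

    impl Add for Checked {
        type Output = Option<Checked>;
        fn add(self, rhs: Checked) -> Option<Checked> {
            self.0.checked_add(rhs.0).map(Checked)
        }
    }

    fn main() {
        assert_eq!(Checked(1) + Checked(2), Some(Checked(3)));
        assert_eq!(Checked(i64::MAX) + Checked(1), None); // overflow signaled, no panic
    }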
Also, there is a clippy lint for disallowing `+` etc ( https://rust-lang.github.io/rust-clippy/master/index.html#ar... ), though I assume only the most masochistic people enable it. I actually tried to enable it once for some parsing code where I wanted to enforce checked arithmetic, but it pointlessly triggered on my Checked wrapper (as described in the previous paragraph) so I ended up disabling it.
Panics are a runtime memory safe way to encode an invariant, but I will generally prefer a compile time invariant if possible and not too cumbersome.
However, yes I will panic if I'm not already using unsafe and I can clearly prove the invariant I'm working with.
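For example, a divide-by-zero check can be moved into the type system with std's NonZero types (dividing a u32 by a NonZeroU32 is a stable impl):

    use std::num::NonZeroU32;

    fn average(total: u32, count: NonZeroU32) -> u32 {
        total / count // cannot panic: the type proves count != 0
    }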
Well on Linux they are apparently supposed to return memory anyway and at some point in the future possibly SEGV your process when you happen to dereference some unrelated pointer.
I partially disagree with this. Using Zig style allocators doesn't really fit with Rust ergonomics, as it would require pretty extensive lifetime annotations. With no_std, you absolutely can roll your own allocation styles, at the price of more manual lifetime annotations.
I do hope though that some library comes along that allows for Zig style collections, with the associated lifetimes... (It's been a bit painful rolling my own local allocator for audio processing).
As long as the type is generic over the allocator, the lifetimes of the allocator don't appear in the type. So e.g. if your allocator uses a stack array in main, it happens to be backed by `&'a [MaybeUninit<u8>]`, but things like Vec<T, A> instantiated with A = YourAllocator<'a> don't need to be concerned with 'a themselves.
Eg: https://play.rust-lang.org/?version=nightly&mode=debug&editi... do_something_with doesn't need to have any lifetimes from the allocator.
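Roughly what that playground demonstrates, reconstructed here as a nightly-only sketch (allocator_api is unstable; the name do_something_with is from the linked example):

    #![feature(allocator_api)]

    use std::alloc::{Allocator, Global};

    // No lifetime parameters needed here, even if A happens to be
    // some YourAllocator<'a> backed by a stack buffer.
    fn do_something_with<T, A: Allocator>(v: &Vec<T, A>) -> usize {
        v.len()
    }

    fn main() {
        let mut v: Vec<u8, Global> = Vec::new_in(Global);
        v.push(42);
        assert_eq!(do_something_with(&v), 1);
    }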
If by Zig-style allocators you specifically mean type-erased allocators, as a way to not have to parameterize everything on A:Allocator, then yes the equivalent in Rust would be a &'a dyn Allocator that has an infectious 'a lifetime parameter instead. Given the choice between an infectious type parameter and infectious lifetime parameter I'd take the former.
Of course, just like with opening files or integer arithmetic, if you don't pay any attention to handling the errors up front when writing your code, it can be an onerous if not impossible task to refactor things after the fact.
I guess all that to say, I agree then that this should've been in std from day one.
I was approaching these problems strictly from the point of view of what can Rust do today really, nothing else. To me having checked and non-panicking API for integer overflows / underflows at least gives you some agency.
If you don't have memory, well, usually you are cooked. Though one area where Rust can become even better is to give us some API to reserve more memory upfront, maybe? Or, I don't know, maybe adopt some of the memory-arena crates into stdlib.
But yeah, agreed. Not the types of problems I want to have anymore (because I did have them in the past).
You see this whenever you use cargo test. If a single test panics, it doesn’t abort the whole program. The panic is “caught”. It still runs all the other tests and reports the failure.
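A minimal illustration (the test harness catches each unwind and keeps going):

    // Both tests run; the panicking one is reported as FAILED and the
    // harness continues with the rest instead of aborting.
    #[test]
    fn this_one_panics() {
        panic!("boom");
    }

    #[test]
    fn this_one_passes() {
        assert_eq!(2 + 2, 4);
    }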
It would be annoying to use - as you say, you couldn’t even add regular numbers together or index into an array in nopanic code. But there are ways to work around it (like the wrapping types).
One problem is that implicit nopanic would add a new way to break semver compatibility in APIs. Eg, imagine a public api that just happens to not be able to panic. If the code is changed subtly, it could easily start panicking again. That could break callers, so it has to be a major version bump. You'd probably have to require explicit nopanic at api boundaries (else assume all public functions from other crates can panic). And because of that, public APIs like std would need to be plastered with nopanic markers everywhere. It's also not clear how that works through trait impls.
Odin has them, too, optionally (and usually).
Unfortunately even the Rust core language doesn't treat them this way.
I think it's arguably the single biggest design mistake in the Rust language. It prevents a ton of useful stuff like temporarily moving out of mutable references.
They've done a shockingly good job with the language overall, but this is definitely a wart.
Rust also provides Wrapping and Saturating wrapper types for these integers, which wrap (255 + 1 == 0) or saturate (255 + 1 == 255). Depending on your CPU either or both of these might just be "how the computer works anyway" and will accordingly be very fast. Neither of them is how humans normally think about arithmetic.
Furthermore, Rust also provides operations which do all of the above, as well as the more fundamental "with carry" type operations where you get two results from the operation and must write your algorithms accordingly, and explicitly fallible operations where if you would overflow your operation reports that it did not succeed.
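Concretely (Wrapping is long-stable, Saturating was stabilized more recently; overflowing_add and checked_add are the two-result and fallible forms):

    use std::num::{Saturating, Wrapping};

    fn main() {
        assert_eq!((Wrapping(255u8) + Wrapping(1)).0, 0);       // wraps to 0
        assert_eq!((Saturating(255u8) + Saturating(1)).0, 255); // clamps at MAX
        assert_eq!(255u8.overflowing_add(1), (0, true));        // value + overflow flag
        assert_eq!(255u8.checked_add(1), None);                 // reports failure
    }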
Panic is absolutely fine for bugs, and it's indeed what should happen when code is buggy. That's because buggy code can make absolutely no guarantees about whether it is okay to continue (arbitrary data structures may be corrupted, for instance).
Indeed, it's hard to "treat an error" when the error means the code is buggy, because you can rarely do anything meaningful about that.
This is of course a problem for code that can't be interrupted, which includes the Linux kernel (they note the bug, but continue anyway) and embedded systems.
Note that if panic=unwind you have the opportunity to catch the panic. This is usually done by systems that process multiple unrelated requests in the same program: in this case it's okay if only one such request gets aborted (in HTTP, it would return a 5xx error), provided you manually verify that no data structure shared by requests could possibly get corrupted. If you do one thread per request, Rust does this automatically; if you have a smaller threadpool with an async runtime, then the runtime needs to catch panics for this to work.
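A minimal sketch of that pattern, assuming panic=unwind (the handler names are made up):

    use std::panic::{catch_unwind, AssertUnwindSafe};

    fn handle_request(body: &str) -> String {
        // Imagine buggy request-handling code here.
        if body.is_empty() {
            panic!("bug: empty body should have been rejected earlier");
        }
        format!("ok: {body}")
    }

    fn serve(body: &str) -> String {
        match catch_unwind(AssertUnwindSafe(|| handle_request(body))) {
            Ok(resp) => resp,
            // Only this request dies; other in-flight requests are untouched.
            Err(_) => "HTTP 500 Internal Server Error".to_string(),
        }
    }

    fn main() {
        println!("{}", serve("hello"));
        println!("{}", serve(""));
    }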
Rust picked the right default for applications that run in an OS whereas Zig picked the right default for embedded. Both are good for their respective domains, neither is good at both domains. Zig's choice is verbose and useless on a typical desktop OS, especially with overcommit, whereas Rust's choice is problematic for embedded where things just work differently.
Honestly this is where you'd throw an exception. It's a shame Rust refuses to have them, they are absolutely perfect for things like this...
My current $dayjob involves a "server" application that needs to run in a strict memory limit. We had to write our own allocator and collections because the default ones' insistence on using GlobalAlloc infallibly doesn't work for us.
Thinking that only "embedded" cares about custom allocators is just naive.
The only place where it would be different is if you explicitly set panics to abort instead of unwind, but that's not default behavior.
As a video game developer, I've found the case for custom general-purpose allocators pretty weak in practice. It's exceedingly rare that you really want complicated nonlinear data structures, such as hash maps, to use a bump-allocator. One rehash and your fixed size arena blows up completely.
95% of use cases are covered by reusing flat data structures (`Vec`, `BinaryHeap`, etc.) between frames.
Requests are matched against the smallest tier that can satisfy them (static tiers before dynamic). If no tier can satisfy it (static tiers are too small or empty, dynamic tier's "remaining" count is too low), then that's an allocation failure and handled by the caller accordingly. Eg if the request was for the initial buffer for accepting a client connection, the client is disconnected.
When a buffer is returned to the allocator it's matched up to the tier it came from - if it came from a static tier it's placed back in that tier's list, if it came from the dynamic tier it's free()d and the tier's used counter is decremented.
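A much-simplified sketch of that tier matching (all names hypothetical; the real allocator described here tracks considerably more state):

    struct StaticTier {
        buf_size: usize,
        free: Vec<Box<[u8]>>, // preallocated buffers of buf_size bytes
    }

    struct DynamicTier {
        max_bytes: usize,
        used: usize,
    }

    struct TieredAlloc {
        statics: Vec<StaticTier>, // sorted by buf_size ascending
        dynamic: DynamicTier,
    }

    impl TieredAlloc {
        fn alloc(&mut self, len: usize) -> Option<Box<[u8]>> {
            // Smallest static tier that fits and has a free buffer wins.
            if let Some(tier) = self
                .statics
                .iter_mut()
                .filter(|t| t.buf_size >= len)
                .find(|t| !t.free.is_empty())
            {
                return tier.free.pop();
            }
            // Otherwise fall back to the dynamic tier's remaining budget.
            if self.dynamic.used + len <= self.dynamic.max_bytes {
                self.dynamic.used += len;
                return Some(vec![0u8; len].into_boxed_slice());
            }
            None // allocation failure, surfaced to the caller
        }
    }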
Buffers have a simple API similar to the bytes crate - "owned buffers" allow &mut access, "shared buffers" provide only & access and cloning them just increments a refcount, owned buffers can be split into smaller owned buffers or frozen into shared buffers, etc.
The allocator also has an API to query its usage as an aggregate percentage, which can be used to do things like proactively perform backpressure on new connections (reject them and let them retry later or connect to a different server) when the pool is above a threshold while continuing to service existing connections without a threshold.
The allocator can also be configured to allocate using `mmap(tempfile)` instead of malloc, because some parts of the server store small, infrequently-used data, so they can take the hit of storing their data "on disk", ie paged out of RAM, to leave RAM available for everything else. (We can't rely on the presence of a swapfile so there's no guarantee that regular memory will be able to be paged out.)
As for crates.io, there is no option. We need local allocators because different parts of the server use different instances of the above allocator with different tier configs. Stable Rust only supports replacing GlobalAlloc; everything to do with local allocators is unstable, and we don't intend to switch to nightly just for this. Also FWIW our allocator has both a sync and async API for allocation (some of the allocator instances are expected to run at capacity most of the time, so async allocation with a timeout provides some slack and backpressure as opposed to rejecting requests synchronously and causing churn), so it won't completely line up with std::alloc::Allocator even if/when that does get stabilized. (But the async allocation is used in a localized part of the server so we might consider having both an Allocator impl and the async direct API.)
And so because we need local allocators, we had to write our own replacements of Vec, Queue, Box, Arc, etc because the API for using custom A with them is also unstable.
This is a sign you are writing an operating system instead of using one. Your web server should be handling requests from a pool of processes - so that you get real memory isolation and can crash when there is a problem.
And now your language has exceptions - which break control flow and make reasoning about a program very difficult - and hard to optimize for a compiler.
Who writes the crates?
Going from panic to panic free in Rust is as simple as choosing 'function' vs 'try_function'. The actual mistakes in Rust were the ones where the non-try version should have produced a panic by default. Adding Box::try_new next to Box::new is easy.
There are only two major applications of panic free code in Rust: critical sections inside mutexes and unsafe code (because panic safety is harder to write than panic free code). In almost every other case it is far more fruitful to use fuzzing and model checking to explicitly look for panics.
If there was a special case that would not work, then the design dictates that requests are not independent and there must be risk of interference (they are in the same process!)
What I definitely do not want is a bug ridden “crashable async sub task” system built in my web program.
You're vastly overestimating the overhead of processes and the number of simultaneous web connections.
> only to gain a minor amount of safety
What you’re telling me is performance (memory?) is such a high priority you’re willing to make correctness and security tradeoffs.
And I'm saying that's OK; one of those tradeoffs is that crashing might bring down more than one request.
> one that is virtually irrelevant in a memory safe language
Your memory safe language uses C libraries in its process.
Memory safe languages have bugs all the time. The attack surface is every line of your program and runtime.
Memory is only one kind of resource and privilege. Process isolation is key for managing resource access - for example file descriptors.
Chrome is a case study in these principles. Everybody thought isolating JS and HTML pages should be easy; nobody could get it right, and Chrome instead wrapped each page in a process.
It's less the actual overhead of the process than the savings you get from sharing. You can reuse database connections, have in-memory caches, in-memory rate limits, and various other things. You can use shared memory (which is very difficult to manage) or an additional common process, but either way you are effectively back to square one with regards to shared state that can be corrupted.
Handling thousands of concurrent requests is table stakes for a simple web server. Handling thousands of concurrent processes is beyond most OSs. The context switching overhead alone would consume much of the CPU of the system. Even hundreds of processes will mean a good fraction of the CPU being spent solely on context switching - which is a terrible place to be.
You can configure your lints in your workspace-level Cargo.toml (the folder of crates)
    [workspace.lints.clippy]
    pedantic = { level = "warn", priority = -1 }
    # arithmetic_side_effects = "deny"
    unwrap_used = "deny"
    expect_used = "deny"
    panic = "deny"

then in your crate Cargo.toml:

    [lints]
    workspace = true
Then you can't even compile the code without proper error handling. Combine that with thiserror or anyhow with the backtrace feature, and you can yeet errors with "?" operators, or match on 'em, map_err, map_or_else, ignore them, etc.
[1] https://rust-lang.github.io/rust-clippy/master/index.html#un...
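With those lints denying the panicking APIs, the error plumbing looks something like this (a sketch assuming a thiserror dependency; the error type and config file are made up):

    use thiserror::Error;

    #[derive(Debug, Error)]
    enum ConfigError {
        #[error("could not read config: {0}")]
        Io(#[from] std::io::Error),
        #[error("invalid port: {0}")]
        BadPort(#[from] std::num::ParseIntError),
    }

    fn load_port(path: &str) -> Result<u16, ConfigError> {
        let text = std::fs::read_to_string(path)?; // -> Io via #[from]
        let port = text.trim().parse::<u16>()?;    // -> BadPort via #[from]
        Ok(port)
    }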
I said absolutely no such thing? In my $dayjob working on graphics I, too, have used custom allocators for various things, primarily in C++ though, not Rust. But that in no way makes the default of a global allocator wrong, and often those custom allocators have specialized constraints that you can exploit with custom containers, too, so it's not like you'd be reaching for the stdlib versions probably anyway.
Like
this
Although as a library vendor, you kind of have to assume your library could be compiled into an app configured with panic=abort, in which case it will not do that.
It works fine on Linux - the operating system for the internet. Have you tried it?
> good fraction of the CPU being spent solely on context switching
I was waiting for this one. Threads and processes do the same amount of context switching. The overhead of a process switch is a little higher. The main cost is memory.
I just said that one of the costs of those savings is that crashing may bring down multiple requests - and you should design with that trade-off in mind.
As far as I can tell, no_std doesn't change anything with regard to either the usability of panicking operators like integer division, slice indexing, etc. (they're still usable) or whether they panic on invalid input (they still do).
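The panic-free alternatives are available either way, though; you just have to opt in (plain stable std):

    fn main() {
        let xs = [1, 2, 3];
        assert_eq!(xs.get(10), None);           // xs[10] would panic
        assert_eq!(10i32.checked_div(0), None); // 10 / 0 would panic
    }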
# Errors
`foo` returns an error called `UnspecifiedError`, but this only
happens when an anticipated bug in the implementation occurs. Since
there are no known such bugs, this API never returns an error. If
an error is ever returned, then that is proof that there is a bug
in the implementation. This error should be rendered differently
to end users to make it clear they've hit a bug and not just a
normal error condition.
Imagine if I designed `regex`'s API like this. What a shit show that would be. If you want a less flippant takedown of this idea and a more complete description of my position, please see: https://burntsushi.net/unwrap/
> Honestly, I don't think libraries should ever panic. Just return an UnspecifiedError with some sort of string.
The latter is not a solution to the former. The latter is a solution to libraries having panicking branches. But panics or other logically incorrect behavior can still occur as a result of bugs.
> Text after a blank line that is indented by two or more spaces is reproduced verbatim. (This is intended for code.)
Not saying there aren't applications where using these lints could be alright (web servers, maybe), but at least in my experience (mostly doing CLI, graphics, and embedded stuff) trying to keep the program alive leads to more problems, not fewer.
Yes, and that's why real webservers use a limited number of threads/processes (in the same ballpark as the number of CPU cores). The modern approach is to use green threads, which are really cheap to switch between: it's basically store registers, load registers, and jmp.
> The main cost is memory.
The main cost is scheduling, not switching per se. Preemptive multitasking needs to deal with priorities to not waste time, and the algorithms that do it are mostly O(N). All these O(N) calculations need to be completed multiple times per second; the higher the frequency of switching, the more work there is to do. When you have thousands of processes it is the main cost. If you have tens of thousands, it starts to bite hard.
You abandon the current activity and bubble up the error to a stage where that effort can be tossed out or retried sometime later. i.e. Use the same error handling approach you would have to use for any other unreliable operation like networking.
The person I am having a conversation with is advocating for threads instead of processes. How do you think threads work?
> Modern approach is to use green threads which are really cheap to switch, it is like store registers, read registers and jmp.
That’s certainly the popular approach. As I said at the beginning this approach is making a mini operating system with more bugs and less security rather than leveraging the capabilities of your operating system.
Once again, I'm waiting to hear about your experience of maxing out processes and after that having to switch to green threads.
Are they? I looked back and found this quote of theirs: "The overhead of process-per-request, or even thread-per-request, is absurd if you're already using a memory safe language." That doesn't seem like advocacy for thread-per-request to me.
> As I said at the beginning this approach is making a mini operating system with more bugs and less security rather than leveraging the capabilities of your operating system.
Let's look at Apache, for example. It starts a few processes and/or threads, but then each thread deals with a lot of connections. The threads Apache starts are for spreading work over several CPUs and maybe to overcome some limits of select/poll/epoll. The main approach is to track the state of each connection, and when something happens on a socket, Apache finds the state of that connection and deals with the events on the socket. Then it stores the new state and moves on to deal with other sockets in the same manner.
It is like green threads but without green threads. Green threads streamline all this state keeping by allowing each connection to have its own stack. And I'd say that is easier to do right than writing a finite automaton for HTTP/HTTPS.
> Once again, I'm waiting to hear about your experience of maxing out processes and after that having to switch to green threads.
Oh, I didn't. A long, long time ago I was reading stuff on networking. All of it was of one opinion: 10k kernel tasks may be a tolerable solution, but 100k is bad. IIRC Apache had a document describing its internal architecture and explaining why it is the way it is.
So I wouldn't even try to start thousands of threads. I mean, I tried to start thousands of processes when I was young and learned about fork bombs, and that experience confirmed for me that thousands of processes is not a really good idea.
Moreover I completely agree with them: if you use a memory-safe language, then it is strange to pay costs for preemptive multitasking just to have separate virtual address spaces. I mean, it will be better to get a virtual machine with JIT compiler, and run code for different connection on different instances of a virtual machine. O(1) complexity of cooperative switching will beat O(N) complexity of preemptive switching. To my mind hardware memory management is overrated.
https://rust-lang.github.io/rust-clippy/master/#indexing_sli...
Usually I just use the `?` and `.map_err` (or `anyhow` / `thiserror`) to delegate and move on with life.
I have a few places where I do pattern-matches to avoid exactly what you described: imposing the extra internal complexity to users. Which is indeed a bad thing and I am trying to fight it. Not always succeeding.
And that's why my programs get compiled with panic=abort, which makes panics just quit the program, with no ability to catch them, and no programs in zombie states where some threads panicked and others keep going.
But see, catch_unwind is an escape hatch. It's not meant to be used as a general error handling mechanism, and even when doing FFI, Rust code typically converts exceptions in other languages into Results (at a performance cost, but who cares). But Rust needs an escape hatch; it is a low level language.
And there is at least one case where the catch_unwind is fully warranted: when you have an async web server with multiple concurrent requests and you need panics to take down only a single request, and not the whole server (that would be a DoS vector). If that weren't possible, then async Rust couldn't have feature parity with sync Rust (which uses a thread-per-request model, and where panics kill the thread corresponding to the request)
I was certainly not, I explicitly said that thread-per-request is as bad as process-per-request. I could even agree that it's the worse of both worlds to some extent - none of the isolation, almost all of the overhead (except if you're using a language with a heavy runtime, like Java, where spawning a new JVM has a huge cost compared to a new thread in an existing JVM).
Modern operating systems provide many mechanisms for doing async IO specifically to prevent the need for spawning and switching between thousands of processes. Linux in particular has invested heavily in this, from select, to poll, to epoll, and now on to io_uring.
OS process schedulers are really a poor tool for doing massively parallel IO. They are a general purpose algorithm that has to keep in mind many possible types of heterogeneous processes, and has no insight into the plausible behaviors of those. For a constrained problem like parallel IO, it's a much better idea to use a purpose-built algorithm and tool. And they have simply not been optimized with this kind of scale in mind, because it's much more important and common use case to run quickly for a small number of processes than it is to scale up to thousands. There's a reason typical ulimit configurations are limited to around 1000 threads/processes per system for all common distros.
Panicking should always be treated as a bug. They are assertions.
Addressed in sibling thread - it’s a poor default to design Rust around.
Correction. People who wanted to do async IO went and added additional support for it. The primary driver is node.js.
> And they have simply not been optimized with this kind of scale in mind,
Yes: processes do not sacrifice security and reliability. That's the difference.
The fallacy here is assuming that a process is just worse for hand-wavy reasons and that your language feature has a secret sauce.
If it’s not context switching then that means you have other scheduling problems because you cannot be pre-empted.
> There's a reason typical ulimit configurations are limited to around 1000 threads/processes per system
STILL waiting to hear about your experience of maxing out Linux processes on a web server - and then fixing it with green threads.
I suspect it hasn’t happened.
Apache has years of engineering work - and almost weekly patches to fix issues related to security. Many of these security issues would go away if they were not using special techniques to optimize performance.
But the best part of the web is that it's modular. So now your application doesn't need to do that. It can leverage those benefits without the complexity cascade.
For example, Apache can manage more connections than your application needs running processes for.
> I was reading stuff on networking….
That’s exactly my point. Too many people are repeating advice from Google or Facebook and not actually thinking about real problems they face.
Can you serve more requests using specialized task management? Yes. You can make a mini-OS with fewer features to squeeze out more scheduling performance and that’s what some big companies did.
But you will pay for that with reduced security and reliability. To bring it back to my original complaint - you must accept that a crash can bring down multiple requests.
And it’s an insane default to design Rust around. It’s especially confusing to make all these arguments about how “unsafe” languages are, but then ignore OS safety in hopes of squeezing out a little more perf.
> So I wouldn't even try to start thousands of threads.
Please try it before arguing it doesn’t work. Fork bombing is recursive and unrelated.
> if you use a memory-safe language, then it is strange to pay costs for preemptive multitasking just to have separate virtual address spaces
Then why do these “memory-safe” languages need constant security patches? Why does chrome need to wrap each page’s JS in its own process?
In theory you’re right. If they are actually memory-safe then you don’t need to consider address spaces. But in practice the attack surface is massive and processes give you stronger invariants.