←back to thread

269 points aapoalas | 5 comments | | HN request time: 1.368s | source

We're building a different kind of JavaScript engine, based on data-oriented design and willingness to try something quite out of left field. This is most concretely visible in our major architectural choices:

1. All data allocated on the JavaScript heap is placed into a type-specific vector. Numbers go into the numbers vector, strings into the strings vector, and so on.

2. All heap references are type-discriminated indexes: A heap number is identified by its discriminant value and the index to which it points to in the numbers vector.

3. Objects are also split up into object kind -specific vectors. Ordinary objects go into one vector, Arrays go into another, DataViews into yet another, and so on.

4. Unordinary objects' heap data does not contain ordinary object data but instead they contain an optional index to the ordinary objects vector.

5. Objects are aggressively split into parts to avoid common use-cases having to reading parts that are known to be unused.

If this sounds interesting, I've written a few blog posts on the internals of Nova over in our blog, you can jump into that here: https://trynova.dev/blog/what-is-the-nova-javascript-engine

Show context
throwawaymaths ◴[] No.42175887[source]
doesn't using these sorts of data structures turn the security safety properties of rust into silent logic errors?
replies(1): >>42176198 #
1. aapoalas ◴[] No.42176198[source]
Nope. It rather explains the memory ownership of a JavaScript heap in a way that Rust understands: The heap owns all data in the heap, and JavaScript objects holding "references" to other objects do not imply memory ownership in the sense that Rust understands it.

So, safery properties are not being silenced: The indexes definitely _are_ Rust wise unreliable where a pointer wouldn't be so bounds checks need to be done. But memory safety is not under threat here.

This does mean that we have to take care of garbage collection ourselves, Rust will not do that for us, but that was the case anyhow since Rust doesn't have a garbage collector we could use (thank heavens). If we make mistakes here, it will lead to the JavaScript heap being corrupted from the JS code point of view but from the engine point of view the memory is still fully safe: The worst thing that can happen is a panic from out of bounds vector indexing.

replies(2): >>42176380 #>>42180981 #
2. int_19h ◴[] No.42176380[source]
I think the point is that "memory safety", understood broadly, goes beyond that. You still have the notion of dangling pointers and such with indices, and while it doesn't allow for e.g. stack corruption, it still exposes many similar classes of bugs - for example, a "dangling" index to a deallocated object can potentially allow the code to read or write into a completely different object than it was originally pointing to. Hence, silent logic errors, potentially exploitable as well.
replies(1): >>42176580 #
3. aapoalas ◴[] No.42176580[source]
Yeah, sure in that sense this partially turns off Rust's safety features. That being said: A big part of making the engine be safe for interleaved GC is about using ZST's with lifetimes held inside them to bind any JavaScript Values when they appear on the stack and getting back parts of those security guarantees.

We can still make mistakes, especially in the garbage collector, but that is again somewhat helped by code-sharing and coding conventions enabled by Rust ie. using destructuring in GC to make sure we don't forget any part of the heap data. (Coding conventions are not a solution, they are a mitigation at most. I _can_ write the heap GC as a map from one heap data of 'old lifetime to 'new, but that leads to worse code generation than mutating a 'static lifetime heap data :( )

4. throwawaymaths ◴[] No.42180981[source]
> But memory safety is not under threat here.

Note I did not say memory safety. I said security safety.

replies(1): >>42188436 #
5. aapoalas ◴[] No.42188436[source]
I don't know what "security safety" is so I must've gotten confused. If you mean type safety, then we do make sure to stay on top of that: Our JS Value is an enum that contains either stack data or a typed index that corresponds to the tag. So the Array variant holds an Array index etc. So it is not possible to take type of index and turn it into another type of index without transmute.

If you refer to referential safety, so that your reference to object X still refers to X later on, then that is indeed something we "lose" because we need to implement GC ourselves. But that wouldn't actually really meaningfully change with using pointers either, as updating pointers after a move would need to be done manually as well.

Using references is right out because we cannot explain the JavaScript memory ownership model to Rust: The two are simply not compatible. There are of course safe GC crates that give you reference APIs but they do the pointer updating manually on the inside (if they have moving GC anyway), so the situation doesn't meaningfully change.