←back to thread

271 points aapoalas | 10 comments | | HN request time: 1.235s | source | bottom

We're building a different kind of JavaScript engine, based on data-oriented design and willingness to try something quite out of left field. This is most concretely visible in our major architectural choices:

1. All data allocated on the JavaScript heap is placed into a type-specific vector. Numbers go into the numbers vector, strings into the strings vector, and so on.

2. All heap references are type-discriminated indexes: A heap number is identified by its discriminant value and the index to which it points to in the numbers vector.

3. Objects are also split up into object kind -specific vectors. Ordinary objects go into one vector, Arrays go into another, DataViews into yet another, and so on.

4. Unordinary objects' heap data does not contain ordinary object data but instead they contain an optional index to the ordinary objects vector.

5. Objects are aggressively split into parts to avoid common use-cases having to reading parts that are known to be unused.

If this sounds interesting, I've written a few blog posts on the internals of Nova over in our blog, you can jump into that here: https://trynova.dev/blog/what-is-the-nova-javascript-engine

Show context
Etheryte ◴[] No.42171237[source]
Architectural choices are interesting to talk about, but I think most people reading this won't have any context to compare against, me included. How does this compare to e.g. the architecture of V8? What benefits do these choices give when compared against other engines? Etc, reading through the list it's easy to nod along, but it's hard to actually have an intuition about whether these are good choices or not.
replies(3): >>42171249 #>>42171285 #>>42171820 #
VPenkov ◴[] No.42171285[source]
They seem to have a blog post on that: https://trynova.dev/blog/why-build-a-js-engine

It reads like an experimental approach because someone decided to will it into existence. That and to see if they can achieve better performance because of the architectural choices.

> Luckily, we do have an idea, a new spin on the ECMAScript specification. The starting point is data-oriented design (...)

> So, when you read a cache line you should aim for the entire cache line to be used. The best data structure in the world, bar none, is the humble vector (...)

> So what we want to explore is then: What sort of an engine do you get when almost everything is a vector or an index into a vector, and data structures are optimised for cache line usage? Join us in finding out (...)

replies(3): >>42171319 #>>42171414 #>>42175089 #
1. andai ◴[] No.42171414[source]
This is cool, but I'm wondering

(1) Why doesn't V8, whose whole point is performance, lay out memory in an optimal way?

(2) Will Nova need to also implement all of V8's other optimizations, to see if Nova's layout makes any significant difference?

replies(3): >>42171479 #>>42173066 #>>42177927 #
2. aapoalas ◴[] No.42171479[source]
V8 could probably implement the backing object "trick" with some trouble. I'm half-hoping that Nova will show it to be worth their while and that they will eventually do it. It will be a major refactoring of the engine, however.

The heap vector "trick" is basically impossible, I believe. It wouldn't be a refactoring so much as it would be a complete rewrite of the engine. The entirety of V8 assumes it deals in pointers, and all of that would need to change to using indexes instead. I will eat my hat if they do it. Without heap vectors they can still split object data apart using pointer-keyed hash maps, so maybe they could take advantage of some of the ideas still.

V8 does offer ways to run code without optimisations, which we can use for a more apples-to-apples comparison. The most important optimisation that Nova really needs before any big performance comparisons become meaningful is property access inline caching, which requires implementing object shapes.

I'd say that once object shapes are done, then limited performance comparisons can probably be made, especially if V8's JIT is disabled.

replies(2): >>42171538 #>>42171749 #
3. reverius42 ◴[] No.42171538[source]
So is the whole point of this project to convince V8 to adopt a particular optimization?
replies(1): >>42171577 #
4. aapoalas ◴[] No.42171577{3}[source]
Not really: In my daydreams Nova becomes the premier JS engine in the world and takes the crown from V8. If V8 went all in and basically just copied all of Nova... I'd probably still develop Nova, as I don't want to work with C++ that much.

If V8 copied all of Nova AND adopted Rust, I might consider laying Nova to rest and going into V8 development. But I'd probably also be really angry at V8 just taking all of Nova's good ideas and peddling them off as their own without crediting Nova. So probably I'd still keep developing Nova while stewing in my anger and inability to do anything about it :)

I hope Nova can be a spark that ignites the JavaScript world into a bit of a renaissance with some of its ideas, but the point is not to burn bright and burn out. The point is to burn bright and stay lit.

replies(1): >>42171619 #
5. rob74 ◴[] No.42171619{4}[source]
> But I'd probably also be really angry at V8 just taking all of Nova's good ideas and peddling them off as their own without crediting Nova.

Who knows, maybe they'd even give you credit (while still taking the idea)?

replies(1): >>42171659 #
6. aapoalas ◴[] No.42171659{5}[source]
It could definitely happen. It would be a hard decision for me then :)
7. Leszek ◴[] No.42171749[source]
To be fair, pointer compression is morally and memory-wise similar to indexing a vector.
replies(1): >>42171757 #
8. aapoalas ◴[] No.42171757{3}[source]
Yup, and to a degree the whole "heap references as indexes" idea was inspired by pointer compression. Not in a direct sense of "hey, look at that, what if I took it a step further?" but as I was thinking of the indexes I realised that it looks a lot like pointer compression, and that made me think it is a viable idea.
9. liontwist ◴[] No.42173066[source]
1 it takes time and effort to make major architectural changes.

Certain design choices made for other reasons may conflict.

10. munificent ◴[] No.42177927[source]
Optimal memory layout isn't something you can know in advance. The optimal way to lay out objects in memory is exactly in the order that they will be accessed, but the runtime doesn't know that until after the user program has accessed them.

And if the program accesses a set of objects in different orders at different times, there is no one optimal layout.

I did ask Lars Bak once if they spent a lot of time thinking about cache usage and organizing objects in memory to take best advantage of it and, if I recall correctly, his answer was basically "no". They definitely think about it terms of packing objects into small amounts of memory. But in a dynamically typed language like JavaScript where every property is a reference to some other object elsewhere in memory, using the cache well is just profoundly hard.

Hell, it's hard even in Java where at least you do know the set of fields any given class has.