
Show HN: Nova JavaScript Engine

(github.com)

271 points aapoalas | 1 comment | 17 Nov 24 23:07 UTC

We're building a different kind of JavaScript engine, based on data-oriented design and willingness to try something quite out of left field. This is most concretely visible in our major architectural choices:

1. All data allocated on the JavaScript heap is placed into a type-specific vector. Numbers go into the numbers vector, strings into the strings vector, and so on.

2. All heap references are type-discriminated indexes: A heap number is identified by its discriminant value and the index it points to in the numbers vector.

3. Objects are also split up into object-kind-specific vectors. Ordinary objects go into one vector, Arrays go into another, DataViews into yet another, and so on.

4. Unordinary objects' heap data does not contain ordinary object data; instead it contains an optional index into the ordinary objects vector.

5. Objects are aggressively split into parts so that common use-cases avoid having to read parts that are known to be unused.
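
To make the five points above a bit more concrete, here is a minimal Rust sketch of the idea. It is an illustration under assumptions, not Nova's actual code: the names `Value`, `Heap`, `ObjectData`, `ArrayData`, and `backing_object` are hypothetical, and the real engine's layout is certainly more involved.

```rust
/// A type-discriminated heap reference: the enum variant is the discriminant,
/// the payload is an index into the vector for that type. (Hypothetical names.)
#[derive(Clone, Copy, Debug, PartialEq)]
enum Value {
    Number(u32),
    String(u32),
    Object(u32),
    Array(u32),
}

/// Ordinary object data, stored in its own vector.
#[derive(Debug, Default)]
struct ObjectData {
    properties: Vec<(String, Value)>,
}

/// Array heap data: the elements, plus an optional index into the
/// ordinary objects vector instead of embedded ordinary object data.
#[derive(Debug, Default)]
struct ArrayData {
    elements: Vec<Value>,
    backing_object: Option<u32>,
}

/// One vector per type of heap data.
#[derive(Debug, Default)]
struct Heap {
    numbers: Vec<f64>,
    strings: Vec<String>,
    objects: Vec<ObjectData>,
    arrays: Vec<ArrayData>,
}

impl Heap {
    fn alloc_number(&mut self, n: f64) -> Value {
        self.numbers.push(n);
        Value::Number((self.numbers.len() - 1) as u32)
    }

    fn alloc_string(&mut self, s: &str) -> Value {
        self.strings.push(s.to_string());
        Value::String((self.strings.len() - 1) as u32)
    }

    fn alloc_array(&mut self, elements: Vec<Value>) -> Value {
        self.arrays.push(ArrayData { elements, backing_object: None });
        Value::Array((self.arrays.len() - 1) as u32)
    }

    /// Setting a named property on an array creates the ordinary object
    /// data lazily; arrays that never need it never pay for it.
    fn array_set_property(&mut self, array: Value, key: &str, value: Value) {
        let idx = match array {
            Value::Array(i) => i as usize,
            _ => return, // not an array: ignored in this sketch
        };
        let obj_idx = match self.arrays[idx].backing_object {
            Some(i) => i,
            None => {
                self.objects.push(ObjectData::default());
                let i = (self.objects.len() - 1) as u32;
                self.arrays[idx].backing_object = Some(i);
                i
            }
        };
        self.objects[obj_idx as usize]
            .properties
            .push((key.to_string(), value));
    }

    /// Resolve a reference to a number, if it is one.
    fn number(&self, v: Value) -> Option<f64> {
        match v {
            Value::Number(i) => self.numbers.get(i as usize).copied(),
            _ => None,
        }
    }
}
```

In this sketch, numbers allocated one after another end up next to each other in `numbers`, and an array that is only ever used as a plain list never touches the `objects` vector at all.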

If this sounds interesting, I've written a few posts on the internals of Nova on our blog. You can start here: https://trynova.dev/blog/what-is-the-nova-javascript-engine


lionkor ◴[18 Nov 24 11:40 UTC] No.42171567[source]▶

>>42168166 (OP) #

Isn't data oriented design driven by knowing what your data accesses look like? In your engine, you're building as if you're assuming that common data access will be linear access over objects of the same type. Why?

replies(1): >>42171651 #

aapoalas ◴[18 Nov 24 11:57 UTC] No.42171651[source]▶

Yeah, know your data and how it is used. I assume that data access is mostly linear because of a few reasons:

1. All performance issues arise in loops: I at least have never seen a performance problem that could be explained by a single thing happening once. It is always a particular thing happening over and over again.

2. All loops deal with collections of data, and the collections are usually either created manually by a human being, or created through parsing or looping, many items at a time.

3. A human being can create a collection of maybe a hundred items by hand before they get bored and stop. A collection created this way may contain data from all over the place, with data access over it being nonlinear.

4. A collection created through parsing or looping will create its data in a mostly linear fashion. Accessing the data will then also be linear.

There are definitely cases where nonlinear collections exist, but these are usually either small or are created from smaller sets of linear data. E.g. think of dragging 10 lists of 1000 items together to form a list of 10000 items. The entire 10000 items aren't going to be located linearly, but each run of 1000 items will be.
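
As a rough illustration of the 10 x 1000 example, the Rust sketch below models heap references as plain `u32` indexes and counts how many contiguous runs a merged list consists of. The function names and the fixed merge order are assumptions made up for the example; nothing here is taken from Nova.

```rust
/// Counts maximal runs of consecutive, increasing indexes.
/// Fewer runs means a loop over the collection walks memory more linearly.
fn count_linear_runs(indexes: &[u32]) -> usize {
    if indexes.is_empty() {
        return 0;
    }
    // Every place where the next index is not "previous + 1" starts a new run.
    let breaks = indexes
        .windows(2)
        .filter(|w| w[1] != w[0].wrapping_add(1))
        .count();
    1 + breaks
}

/// Simulates 10 lists of 1000 items each, where list `k` was allocated
/// linearly at indexes `k * 1000 .. k * 1000 + 1000`, merged in a
/// scrambled (but fixed) list order: 0, 3, 6, 9, 2, 5, 8, 1, 4, 7.
fn merged_indexes() -> Vec<u32> {
    let mut merged = Vec::with_capacity(10_000);
    for i in 0..10u32 {
        let list = (i * 3) % 10;
        let start = list * 1000;
        merged.extend(start..start + 1000);
    }
    merged
}
```

With this merge order, `count_linear_runs(&merged_indexes())` gives 10: the loop over 10000 items jumps to a new location only 9 times, and every other step moves to the neighbouring index.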

So in effect, I'm betting that most hot loops do deal with linear access over objects and that loops that work over nonlinear access are not particularly hot.

replies(1): >>42175476 #

1. lionkor ◴[18 Nov 24 18:40 UTC] No.42175476[source]▶

Makes sense when you put it like that, thanks very much for explaining your thought process.