160 points todsacerdoti | 50 comments
1. jpalawaga ◴[] No.41898791[source]
Anyone who has done a programming contest, advent of code, etc knows that the language doesn’t matter so much as your algorithm.

Yes, the language can bring a nice speed up, or might give you better control of allocations which can save a lot of time. But in many cases, simply picking the correct algorithm will deliver you most of the performance.

As someone who doesn’t JavaScript a lot, I’d definitely prefer a tool written in go and available on brew over something I need to invoke node and its environment for.

replies(12): >>41898816 #>>41898879 #>>41898890 #>>41898952 #>>41899000 #>>41899028 #>>41899401 #>>41901158 #>>41901185 #>>41901525 #>>41902030 #>>41904514 #
2. tyree731 ◴[] No.41898816[source]
Lots of very smart people have worked very hard on Python tools written in Python, yet the rust rewrites of those tools are so much faster. Sometimes it really is the programming language.
replies(6): >>41898834 #>>41898859 #>>41898913 #>>41898949 #>>41899003 #>>41899226 #
3. anyfoo ◴[] No.41898834[source]
Choosing the right algorithm effectively means optimizing runtime complexity. Then, once runtime complexity is fixed with the right algorithm, you're still left with a lot of constant factors that O-notation deliberately ignores (it's only about growth of the runtime). Sometimes, optimizing those constant factors can be significant, and then the choice of language matters. And even some details about the CPU you are targeting, and overall system architecture.
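To make the constant-factor point concrete, here's a minimal Python sketch (illustrative only): both functions below are O(n), so big-O treats them as equivalent, yet their constants differ substantially because one pays an interpreted iteration per element while the other loops inside the interpreter's C code.

```python
import timeit

def sum_loop(xs):
    # O(n), but with a large constant: one interpreted iteration per element
    total = 0
    for x in xs:
        total += x
    return total

def sum_builtin(xs):
    # Also O(n), but the loop runs inside the interpreter's C implementation
    return sum(xs)

xs = list(range(100_000))
assert sum_loop(xs) == sum_builtin(xs) == 4_999_950_000

# Same growth rate, very different constant factors
print("loop:   ", timeit.timeit(lambda: sum_loop(xs), number=20))
print("builtin:", timeit.timeit(lambda: sum_builtin(xs), number=20))
```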
replies(1): >>41898920 #
4. charrondev ◴[] No.41898859[source]
In the JavaScript world a lot of speed up comes from 3 major things as far as I can tell:

- Easier concurrency.

- The fact that things are actually getting rewritten with the purpose of speeding them up.

- A lot of the JS tooling getting speedups deals heavily with string parsing, tokenizing, and generating and manipulating ASTs. By having shared references to slices of strings, carefully managing when strings are copied, and strictly typing the AST nodes, you enable things to be much faster than in JavaScript.

5. noname120 ◴[] No.41898879[source]
It's not just a matter of “picking the correct algorithm”. Algorithmic-interview exercises are algorithmic-interview exercises. They are barely related to real-world software engineering.
replies(3): >>41898892 #>>41898946 #>>41898965 #
6. tightbookkeeper ◴[] No.41898890[source]
> knows that the language doesn’t matter so much as your algorithm.

I know what you’re referring to, but these problems have also taught me a lot about language performance. Python and JS array access is just 100x slower than C's. Some difficult problems become much harder due to this limitation.

replies(1): >>41898967 #
7. anyfoo ◴[] No.41898892[source]
Choosing the right algorithm is usually the prerequisite for fast code. Optimizing the constant factors is often pretty useless if you pick an algorithm with a runtime that grows quadratically, when there are much better options available.
replies(1): >>41898918 #
8. jvanderbot ◴[] No.41898913[source]
This is a very nice counterexample, except it's not actually a counterexample without an example.

Also, this was a thing before Rust. I've rewritten several things in C or C++ as backends for Python, and most performance-critical Python code is already an API to a shared library. You'd be surprised to run OR tools and find Fortran libraries loaded by your Python code.

replies(1): >>41898995 #
9. noname120 ◴[] No.41898918{3}[source]
What makes you think that the sluggishness of these tools is in any way related to not “choosing the right algorithm”?
replies(1): >>41898933 #
10. o11c ◴[] No.41898920{3}[source]
Often languages like JavaScript and Python don't allow optimal runtime complexity, because the types baked into external interfaces fundamentally disallow the desired operation. And these languages are too slow to rewrite the core logic in the language itself.

(but of course, the vast majority of the code, even in widely used tools, isn't properly designed for optimization in the first place)

I only dabble in javascript, but `tsc` is abominable.

11. anyfoo ◴[] No.41898933{4}[source]
What makes you think they're not? I don't know why these tools are sluggish, but I disagree with the notion that algorithms don't matter for "real-world software engineering".

The world is full of slow software because one chose the wrong algorithm: https://randomascii.wordpress.com/2019/04/21/on2-in-createpr... https://randomascii.wordpress.com/2019/12/08/on2-again-now-i... https://randomascii.wordpress.com/2021/02/16/arranging-invis... ...

12. magicalhippo ◴[] No.41898946[source]
While picking the right algorithm seldom comes up in most programmers' day-to-day activities, being aware of big-O and of the performance guarantees/characteristics of the libraries you use most certainly should.

I don't care if you don't know how to write a merge sort from scratch. I do care that you know not to write an O(n^2) loop when it can be avoided.
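As a sketch of that point (function names made up for illustration), the two versions below compute the same thing, but the first hides an O(n^2) scan behind an innocuous-looking `in` check:

```python
def common_items_quadratic(a, b):
    # O(len(a) * len(b)): each `x in b` is a linear scan of the list b
    return [x for x in a if x in b]

def common_items_linear(a, b):
    # O(len(a) + len(b)): build a hash set once, then each lookup is O(1)
    b_set = set(b)
    return [x for x in a if x in b_set]

a = list(range(0, 1000, 2))   # even numbers
b = list(range(0, 1000, 3))   # multiples of three
assert common_items_quadratic(a, b) == common_items_linear(a, b)
```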

replies(1): >>41901165 #
13. jampekka ◴[] No.41898949[source]
Python is really really slow compared to JS though.
replies(3): >>41899001 #>>41899014 #>>41901003 #
14. hawski ◴[] No.41898952[source]
In higher level languages your code may allocate memory or trigger a GC pass or other smartness in unexpected places. This may cause slowdowns you may not have control over or may change from version to version or vendor to vendor. It is often easier to manage in "faster" languages. Good algorithm may not be enough.
15. tightbookkeeper ◴[] No.41898965[source]
Exactly. What do you do when you have the right algorithm and it’s too slow (very typical for linear problems that require visiting each item).
replies(1): >>41899023 #
16. jampekka ◴[] No.41898967[source]
JS array access is a lot faster than Python array access. JS is easily within a magnitude of C and can even be about as fast with typed arrays or well-JITable code.
replies(1): >>41899140 #
17. runesoerensen ◴[] No.41898995{3}[source]
Ruff is one example https://astral.sh/ruff
replies(1): >>41902025 #
18. zeroonetwothree ◴[] No.41899000[source]
The types of problems in those contests are meant to highlight algorithms. In the real world you might have a trivial algorithm but a huge input size, where the constant factor matters much more.
19. gaganyaan ◴[] No.41899001{3}[source]
CPython is (though it's slowly getting better). PyPy is amazingly fast.
20. worik ◴[] No.41899003[source]
> Lots of very smart people have worked very hard on Python tools written in Python

Yes, I agree that is very sad

Python is achingly slow. I know the Python people want to address this, but I do not understand why: Python makes sense as a scripting/job control language, where execution speed does not matter.

As an application development language it is diabolical. For a lot of reasons, not just speed

21. zeroonetwothree ◴[] No.41899014{3}[source]
I once worked on a Python system that had 50 machines dedicated to it. We were able to rewrite it in a more performant language such that it easily ran on one machine. This also allowed us to avoid all the issues distributed systems have.

So yeah, Python is not great for systems programming

22. anyfoo ◴[] No.41899023{3}[source]
You optimize the constant factors, e.g. the runtime of the inner loops. But this requires you to choose a sane algorithm in the first place.

Some problems are much more complicated, where you have to take, for example, locality (cache hierarchy etc.) and concurrency considerations like lock contention into account. This may affect your choice of algorithm, but by the time you reach that, you've almost certainly thought about the algorithm a lot already.

23. moomin ◴[] No.41899028[source]
Here’s the thing: languages like C#, Java and Rust all have extensive libraries and packages that implement many common data structures and algorithms well. With all due respect to the incredible work that goes into projects like lodash, JavaScript does not. (Nor does C, for that matter.)
24. tightbookkeeper ◴[] No.41899140{3}[source]
> JS is easily within magnitude of C

Typed arrays help a lot, but I’m still doubtful. Maybe if all the processing is restricted to idioms in the asm.js subset? And even then you’re getting bounds checking.

replies(1): >>41899241 #
25. bsder ◴[] No.41899226[source]
> Lots of very smart people have worked very hard on Python tools written in Python, yet the rust rewrites of those tools are so much faster.

So?

Some tool got written and did its job sufficiently well that it became a bottleneck worth optimizing.

That's a win.

"Finishing the task" is, by far, the most difficult thing in programming. And the two biggest contributors to that are 1) simplicity of programming language and 2) convenience of ecosystem.

Python and Javascript are so popular because they tick both boxes.

replies(1): >>41899545 #
26. jampekka ◴[] No.41899241{4}[source]
In benchmarks JS is usually well within a magnitude (i.e. under 10x slower).

E.g. C++ vs Node.js here: https://benchmarksgame-team.pages.debian.net/benchmarksgame/...

Couldn't find C vs JS easily with the new benchmarksgame UI.

replies(2): >>41899295 #>>41899458 #
27. tightbookkeeper ◴[] No.41899295{5}[source]
I guess so. I clicked on the code for the first one. It’s using a C library to do the computation:

> mpzjs. This library wraps around libgmp's integer functions to perform infinite-precision arithmetic

And then the “array”:

> Buffer.allocUnsafe

So is this a good JavaScript benchmark?

replies(1): >>41899418 #
28. Ferret7446 ◴[] No.41899401[source]
Quite the opposite, for most cases you don't hit the scale where asymptotic algorithmic performance really makes a big impact (e.g., for many small set comparisons, iterating over a list is faster than a hash set, but only by 10-50% or so), vs switching to a compiled language which instantly gets you 10x to 100x performance basically for free.

Or perhaps another way to look at it, if you care enough about performance to choose a particular algorithm, you shouldn't be using a slow language in the first place unless you're forced to due to functional requirements.
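A rough illustration of the small-collection case, in Python (timings vary by runtime and data, so measure rather than assume): for a handful of items, the hash set's per-lookup overhead of hashing and bucket probing can be comparable to a plain linear scan.

```python
import timeit

small = list(range(8))      # a handful of items
small_set = set(small)

# Both report the same answer; the interesting part is the timing.
assert (7 in small) == (7 in small_set) == True

# For tiny collections the hash set's constant overhead can rival
# the list's linear scan; run this to see on your own machine.
t_list = timeit.timeit(lambda: 7 in small, number=1_000_000)
t_set = timeit.timeit(lambda: 7 in small_set, number=1_000_000)
print(f"list scan: {t_list:.3f}s, hash set: {t_set:.3f}s")
```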

29. jampekka ◴[] No.41899418{6}[source]
That's the only benchmark on the page that uses such wrapper.

Buffer.allocUnsafe just allocates the memory without zero-initializing it, just like e.g. malloc does. Probably usually not worth it, but in a benchmark it's comparable to malloc vs calloc.

replies(1): >>41899464 #
30. igouy ◴[] No.41899458{5}[source]
> Couldn't find C vs JS easily

Try:

A) Find JS in the box plot charts

https://benchmarksgame-team.pages.debian.net/benchmarksgame/...

or

B) Find JS in the detail

https://benchmarksgame-team.pages.debian.net/benchmarksgame/...

31. tightbookkeeper ◴[] No.41899464{7}[source]
Yeah, and using byte buffers isn’t JavaScript array access. But it is for C.

The n-body one looks most like canonical JS to me. It’s a small array, but it’s accessed many times.

Unfortunately the C++ version is SIMD-optimized, so I don’t think that’s a fair comparison.

replies(3): >>41899515 #>>41902122 #>>41905665 #
32. igouy ◴[] No.41899515{8}[source]
There are plain C++ programs: n-body C++ g++ #3

https://benchmarksgame-team.pages.debian.net/benchmarksgame/...

33. tyree731 ◴[] No.41899545{3}[source]
Don’t disagree about finishing the task, but personally I don’t find more performant languages any less productive for the sort of programming I tend to do.
replies(1): >>41899723 #
34. bsder ◴[] No.41899723{4}[source]
Congratulations on being a programming god. This discussion isn't for you.

From my point of view, I'm happy if I can convince my juniors to learn a scripting language. Okay? I don't care which one--any one. I'd prefer that they learn one of the portable ones but even PowerShell is fine.

I have seen sooooo many junior folks struggle for days to do something that is 10 lines in any scripting language.

Those folks who program but don't know a scripting language far outnumber the rest of us.

replies(1): >>41901234 #
35. kukkamario ◴[] No.41901003{3}[source]
Node is so slow to start that a Python script can complete before the JavaScript even begins to execute.
replies(1): >>41902074 #
36. xlii ◴[] No.41901158[source]
> Anyone who has done a programming contest, advent of code, etc knows that the language doesn’t matter so much as your algorithm.

This is one of the biggest falsehoods in the software engineering I know.

Language is a collaboration glue and influences the way of thinking that guides solution development. As an analogy: you can make a statue from glass or from ice, and while both can have the same shape and be equally awed upon, the process and the qualities will differ.

For prototypes and throwaways the context doesn’t matter - that’s why all short-lived contests, golfs and puzzles ignore it. Yet when software is to be developed not over a week but over decades and (hopefully) delivered to thousands if not millions of computers, it’s the technological context (language, architecture, etc.) that matters the most.

37. lucumo ◴[] No.41901165{3}[source]
I don't.

Let me rephrase that.

I do, but only in very, very rare circumstances. Basically only when a) you know that the typical use case is going to involve large n, like millions to billions, b) the loop body takes a long time per invocation, or c) you have profiled a performance issue and found that improving it would help.

If you're working with sets of 10 items, just write the damn nested loop and move on. Code jockeying is unlikely to be faster, and even if it is, it doesn't help enough to matter anyway.

Computer science theory likes to ignore constants. Big-O notation does that explicitly. But in the real world, it's usually the constants that kill you. Constants, and time to market.

replies(1): >>41901249 #
38. eviks ◴[] No.41901185[source]
And anyone who expands the horizon to the real world, instead of focusing on the artificial one of contests, knows that the language matters a great deal
39. fredrikholm ◴[] No.41901234{5}[source]
> I have seen sooooo many junior folks struggle for days to do something that is 10 lines in any scripting language.

> Those folks who program but don't know a scripting language far outnumber the rest of us.

What domain are you in? This sounds like the complete inverse of every company I've ever worked at.

Entire products are built on Python, Node etc., and the time after the initial honeymoon phase (if it exists) is spent retrofitting types on top in order to get a handle, any handle, on the complexity that arises without static analysis and compile-time errors.

At around the same time, services start OOMing left and right, parallelism=1 becomes a giant bottleneck, the JIT fails in one path bringing the service's performance down an order of magnitude every now and then, etc.

> Congratulations on being a programming god. This discussion isn't for you.

On the behalf of mediocre developers everywhere, a lot of us prefer statically typed languages because we are mediocre; I cannot hold thousands of implicit types and heuristics in my head at the same time. Luckily, the type system can.

40. magicalhippo ◴[] No.41901249{4}[source]
> If you're working with sets of 10 items

If you are working with a hardcoded 10 items, and you are certain that won't change significantly, sure.

If not I strongly disagree, because I've seen way too often such cases blow up due to circumstances changing.

Now, if it is very difficult to avoid a nested loop then we can discuss.

But it can simply be due to being unaware that some indexed library call is in fact O(n) or something like that, and avoiding it by using a dictionary or some other approach is not hard.

While constants matter to some degree, the point of big-O is that they matter much less if you get handed two orders of magnitude more data than you expected.

I'll gladly sacrifice a tiny bit of performance for code that doesn't suddenly result in the user not being able to use the application.

replies(1): >>41903827 #
41. CrimsonRain ◴[] No.41901525[source]
That's why almost every thing important in python is in C
42. aragilar ◴[] No.41902025{4}[source]
But can I write plugins for it? My understanding is that it only implements a subset of the common plugins (and does not do any of the linting that pylint is useful for), so it avoids scanning the filesystem for plugins?
43. ◴[] No.41902030[source]
44. jampekka ◴[] No.41902074{4}[source]
For extremely simple scripts maybe. I get around 70 ms difference in startup time.

  $ time python3 -c "print('Hello world')"
  Hello world

  real 0m0.017s

  $ time node -e "console.log('Hello world')"
  Hello world
  
  real 0m0.084s
45. jampekka ◴[] No.41902122{8}[source]
I'd guess using typed arrays or even normal arrays wouldn't slow the code down much, and the slowdown will probably be a small constant factor.

If the JIT detects the array as homogenous it will compile it to low level array access. JS JITs are very good.

46. lucumo ◴[] No.41903827{5}[source]
It doesn't need to be hardcoded. You should have a reasonable sense of how much data is going to go through a certain piece of code.

Wasting time on optimising cases that don't occur is just wasteful. Go solve something that's a real problem, not some imagined performance problem.

replies(1): >>41904132 #
47. magicalhippo ◴[] No.41904132{6}[source]
> You should have a reasonable sense of how much data is going to go through a certain piece of code.

I've been in several past-midnight war rooms due to exactly that mindset.

Customer suddenly gets a new client which results in 10x as much data as anyone imagined, and boom, there you are getting dragged out of bed at 2am and it's not even your fault.

I'm not saying you should spend time optimizing prematurely.

I'm saying you should be aware of what you're doing, and avoid writing bad code. Because almost always it does not take any significant time to avoid writing bad code.

If you know that indexed library function is O(n) and you need to check all items, don't write a for loop; use a while loop with its .first() and .next() functions, which are O(1).

Or reach for a dictionary to cache items. Simple stuff, just be aware of it so someone isn't dragged out of bed at 2am.
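A minimal sketch of that dictionary idea, with made-up data (Python used for illustration): scanning a list of pairs costs O(n) per lookup, while building a dict once makes every subsequent lookup O(1).

```python
# Hypothetical data: (key, value) pairs we need to look up many times.
records = [(f"user{i}", i * 10) for i in range(1000)]

def lookup_scan(key):
    # O(n) per call: walks the whole list in the worst case
    for k, v in records:
        if k == key:
            return v
    return None

by_key = dict(records)          # O(n), but only paid once

def lookup_cached(key):
    # O(1) per call after the dict is built
    return by_key.get(key)

for key in ("user0", "user500", "user999"):
    assert lookup_scan(key) == lookup_cached(key)
```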

replies(1): >>41913524 #
48. rk06 ◴[] No.41904514[source]
I have done so, and one has to be out of their mind to attempt those contests in Java instead of C++.
49. igouy ◴[] No.41905665{8}[source]
> I don’t think that’s a fair comparison

Shouldn't SIMD use make it less likely that JS would be within a magnitude? So if the JS is still less than 10x slower even when compared against SIMD programs, that strengthens jampekka's point.

50. lucumo ◴[] No.41913524{7}[source]
We're probably talking at cross purposes. Being aware is exactly my message as well.

I've been on way too many overdue projects where people went "but this doesn't scale" for stuff that's just not going to happen. Don't waste time on avoiding so called bad algorithms if there's just not that much data going through it. But yeah, don't write a badly scaling algorithm if it does.

Most lists are just way too small to care about the difference. Literal single digit number of items, and a small loop body. You can go up 3, 4, 5 orders of magnitude before you can even measure the nested loop being slower than lower big-O solutions, and a few more before it becomes a problem.

But if you have that one loop that goes into the millions of items and/or has a big body, you'd better be thinking about what you're doing.