

42 points by todsacerdoti | 15 comments
PaulHoule (original comment):
I think it's forgotten here that one of the benefits of nominal typing is that the compiler knows the run-time data layout, which brings performance benefits.

There has been so much ink spilled over whether particular kinds of type systems make programmers more productive, but there is no such controversy on the performance side.
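
A minimal sketch of the contrast in Python (illustrative only; ctypes.Structure stands in here for a nominally typed record whose layout is fixed and known up front):

    import ctypes

    # Nominal record type: size and field offsets are known statically,
    # so a field access compiles down to a fixed-offset load.
    class Point(ctypes.Structure):
        _fields_ = [("x", ctypes.c_double), ("y", ctypes.c_double)]

    print(ctypes.sizeof(Point))             # 16 -- two doubles, no slack
    print(Point.x.offset, Point.y.offset)   # 0 8 -- fixed offsets

    # Dynamic equivalent: the layout is a hash table, and every field
    # access pays for a dictionary lookup.
    p = {"x": 1.0, "y": 2.0}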

1. lmm (in reply to PaulHoule):
Premature optimisation is the root of all evil. If you get the semantics right then good performance will generally follow; if you build the wrong thing, it doesn't matter how fast it is.

2. taeric (in reply to lmm, #1):
Performant code in the wild largely disagrees? This is the difference between a numpy array and a regular Python list. It is stark.

You are correct that you can often engineer the performance in later. But layout concerns and nominal types for performance are fairly uncontroversial.
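
A quick illustration of that gap (a sketch; exact numbers vary by machine and Python version):

    import timeit
    import numpy as np

    n = 1_000_000
    lst = list(range(n))   # boxed ints scattered across the heap
    arr = np.arange(n)     # one contiguous, typed buffer

    # The same reduction over the same values; numpy's known layout
    # lets it run a tight C loop instead of per-element dispatch.
    print(timeit.timeit(lambda: sum(lst), number=10))
    print(timeit.timeit(lambda: arr.sum(), number=10))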

3. biorach (in reply to lmm, #1):
> Premature optimisation is the root of all evil

If you're going to quote Knuth you better be damn sure you've fully understood him.

The quote in question is about micro-optimisations which were pointless on account of design issues. The situation you are commenting on is kind of the opposite.

4. fanf2 (in reply to lmm, #1):
Good data layout is the root of all optimization.

5. thesz (in reply to taeric, #2):
Python does no optimization at all.

In contrast, you can find Fortran compilers that will substitute your inefficient implementation of some numerical algorithm with a different, much more performant one.

6. thesz (in reply to fanf2, #4):
Worst-case optimal join algorithms [1] might have a different opinion.

[1] https://en.wikipedia.org/wiki/Worst-case_optimal_join_algori...

7. PaulHoule (in reply to biorach, #3):
I would say this debate

https://en.wikipedia.org/wiki/AoS_and_SoA

(which I think is weakened in the Wikipedia version) goes to the heart of where profiling can hide huge opportunities for optimization.

It is a good prop bet that rewriting something (that doesn't allocate lots of dynamic memory) from AoS style to SoA style will speed it up significantly. The victim is emotionally attached to their code, and the profiler will never show you the many small costs, often distributed through the memory system, that add up. In a case like that they might accuse you of cheating by applying a bunch of small optimizations, which are often easier to express in SoA if not downright obvious. When they apply the same optimizations to their code they will speed it up, but not as much. (A sketch of the two layouts follows below.)

Oddly, it's a discussion that's been going on ever since people were writing games in assembly language in the 1980s, and probably before that; it would be interesting to see more tools that are SoA on the inside but look AoS on the outside.
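
A minimal sketch of the two layouts in numpy (illustrative only; the difference shows up in cache traffic and vectorisation rather than in how the code reads):

    import numpy as np

    n = 1_000_000

    # AoS: one record per element; a record's fields sit together.
    aos = np.zeros(n, dtype=[("x", np.float64),
                             ("y", np.float64),
                             ("z", np.float64)])

    # SoA: one contiguous array per field.
    xs = np.zeros(n)
    ys = np.zeros(n)
    zs = np.zeros(n)

    # Touching a single field streams contiguous memory in SoA but
    # strides over the unused fields in AoS.
    aos["x"] += 1.0   # 24-byte stride per element
    xs += 1.0         # 8-byte stride, fully contiguous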

8. PaulHoule (in reply to thesz, #5):
Python has a peephole optimizer: https://akaptur.com/blog/2014/08/02/the-cpython-peephole-opt...
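
For instance, CPython folds constant expressions at compile time, which you can see with the dis module (a sketch; the exact bytecode varies by version):

    import dis

    # The optimizer folds 60 * 60 * 24 into the single constant 86400
    # before the code ever runs.
    dis.dis(compile("seconds = 60 * 60 * 24", "<demo>", "exec"))
    # ... LOAD_CONST (86400) ...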

9. maccard (in reply to PaulHoule, #8):
I think what OP meant was "Python's optimisations are not very fast", which (IMO) is a fair statement. Python is many, many things, but its best-case scenarios (without rewriting in C, at which point you're writing C, not Python) are slower than the naive implementations in many other languages.

10. jerf (in reply to lmm, #1):
This is basically a form of the "Sufficiently Smart Compiler" claim. It is generally false, barring a definition of "semantics right" that simply presumes the semantics will be chosen for performance, in which case the claim becomes circular.

At this point in our collective journey of understanding programming languages and what they can do, the evidence is overwhelming that there are in fact plenty of useful semantics that are intrinsically slower than other useful semantics, relative to any particular chunk of hardware executing them. That is, what is slow on a CPU and on a GPU may differ, but some difference will certainly exist, and there is no amount of (feasible) compiling around that problem.

Indeed, that's why we have CPUs and GPUs and soon NPUs and in the future perhaps other types of dedicated processors... precisely because not all semantics can be implemented equally no matter how smart you are.

11. lmm (in reply to jerf, #10):
From where I stand the sufficiently smart compiler paradigm is already dominant, and getting more so every day.

- The overwhelming majority of new code is written in high-level languages

- High-level languages have continued to close what small performance gaps remain

- There have been no serious efforts to implement a true low-level language for post-Pentium (superscalar) CPUs, let alone the CPUs of today

- Even GPUs and NPUs are largely programmed in languages that express much the same semantics as languages for CPUs, relying heavily on compiler optimisation

12. jerf (in reply to lmm, #11):
That's not what the "sufficiently smart compiler" is. The "sufficiently smart compiler" is one that allows you to write any amazingly capable language you want with whatever amazing semantics you want, and it compiles it into code that is competitive with C written for speed.

You can easily observe in any cross-language shootout in 2024 that optimized code bases in the various languages still show performance gradients, and we do not live in a world where you can just start slamming out Python code and expect no performance delta against C.

https://prog21.dadgum.com/40.html

Merely smart compilers are amazing; one of the defining characteristics of the software world is that you can be handed these things for free. The "sufficiently smart compiler", however, does not exist, and while there is no mathematical proof that I'm aware of that they are impossible, after at least two decades of people trying to produce them and totally failing, the rational expectation at this point must be that they do not exist.

13. thesz (in reply to maccard, #9):
Python's optimizations are not deep and/or broad. I did not mean "very fast" as in "compilation speed."

Python does not attempt to apply anything more complex than that peephole optimizer, whatever that means.

Judging from my experience with Python, that peephole optimizer cannot lift operations on a "dictionary with default value" to regular dictionary operations using conditionals. I had to manually rewrite my program to use conditionals to recover its speed.
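
A sketch of the kind of rewrite described (illustrative only; whether it pays off depends on the hit/miss ratio and the Python version):

    from collections import defaultdict

    words = ["a", "b", "a", "c", "a"]   # stand-in data

    # Before: every miss goes through __missing__ / default_factory.
    counts = defaultdict(int)
    for w in words:
        counts[w] += 1

    # After: a plain dict with an explicit conditional, which the
    # comment above reports recovered the program's speed.
    counts = {}
    for w in words:
        if w in counts:
            counts[w] += 1
        else:
            counts[w] = 1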

14. lmm (in reply to jerf, #12):
> You can easily observe in any cross-language shootout in 2024 that optimized code bases in the various languages still have gradients and we do not live in a world where you can just start slamming out Python code and expect no performance delta against C.

Microbenchmarks may show an advantage for C - but it's one that is shrinking all the time (and that goes doubly for Java, which was the go-to example in the original "sufficiently smart compiler" conversations - but no longer is, because you can't actually be confident that Java is going to perform worse than C any more). And the overwhelming majority of the time, for real-world business problems, people do just start slamming out Python code, and if anything it tends to perform better.

And conversely even those C programs now rely extremely heavily on compiler smartness to reorder instructions, autovectorise, etc., often producing something quite radically different from what a naive reading of the code would mean - and there is no real appetite for a language that doesn't do this, one with semantics designed to perform well on today's CPUs or GPUs. Which suggests that designing the language semantics for performance is not actually particularly important.

15. jerf (in reply to lmm, #14):
If you honestly believe that Python is already, right now, in practice, as fast as C on all tasks, and that nobody even needs to consider the matter of performance because the Sufficiently Smart Compiler is already a reality we can count on, then you and I are too far apart to ever come to any agreement on this matter.

Best of luck in your engineering endeavors if you ever end up in a place you need high performance code. You're going to need a lot of it.