
180 points xnacly | 2 comments
norir No.44562212
Lexing being the major performance bottleneck in a compiler is a great problem to have.
replies(3): >>44563135 #>>44568294 #>>44568430 #
norskeld No.44563135
Is lexing ever a bottleneck, though? Even if you push lexing and parsing to 10M lines/second [1], I'd argue that semantic analysis and codegen (for AOT-compiled languages) will dominate the timings.

That said, there's no reason not to squeeze every bit of performance out of it!

[1]: In this talk about the Carbon language, Chandler Carruth shows and explains some goals/challenges regarding performance: https://youtu.be/ZI198eFghJk?t=1462
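For scale, a rough back-of-envelope calculation (the ~40 bytes/line figure is an assumption, not from the talk) shows what 10M lines/second means in raw input bandwidth:

```go
package main

import "fmt"

func main() {
	// Hypothetical figures: the 10M lines/s goal mentioned above,
	// plus an assumed average of ~40 bytes per source line.
	const linesPerSec = 10_000_000
	const bytesPerLine = 40
	mbPerSec := linesPerSec * bytesPerLine / 1_000_000
	fmt.Println(mbPerSec, "MB/s") // hundreds of MB/s of raw input
}
```

At those rates the lexer is touching input at memory-bandwidth-adjacent speeds, which is part of why squeezing it further yields diminishing returns relative to later phases.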

replies(3): >>44563278 #>>44564311 #>>44568469 #
1. munificent No.44563278
It depends a lot on the language.

For a statically typed language, it's very unlikely that the lexer shows up as a bottleneck. Compilation time will likely be dominated by semantic analysis, type checking, and code generation.

For a dynamically typed language, where there isn't as much for the compiler to do, the lexer might be a more noticeable chunk of compile times. As one of the V8 folks pointed out to me years ago, the lexer is the only part of the compiler that has to operate on every single individual byte of input. Everything else gets the luxury of greater granularity, so the lexer can be worth optimizing.
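The byte-at-a-time point can be seen in a toy lexer hot loop (a minimal sketch, not V8's actual scanner; the token classes are simplified):

```go
package main

import "fmt"

// countTokens is a toy lexer loop: it must inspect every byte of the
// input exactly once, whereas later phases (parsing, analysis) operate
// on the much smaller stream of tokens it produces.
func countTokens(src []byte) int {
	tokens := 0
	i := 0
	for i < len(src) {
		c := src[i]
		switch {
		case c == ' ' || c == '\t' || c == '\n':
			i++ // whitespace: skipped, but still touched byte by byte
		case c >= '0' && c <= '9':
			for i < len(src) && src[i] >= '0' && src[i] <= '9' {
				i++
			}
			tokens++ // number literal
		case c == '_' || (c|0x20) >= 'a' && (c|0x20) <= 'z':
			for i < len(src) && (src[i] == '_' ||
				(src[i]|0x20) >= 'a' && (src[i]|0x20) <= 'z' ||
				src[i] >= '0' && src[i] <= '9') {
				i++
			}
			tokens++ // identifier or keyword
		default:
			i++ // punctuation: single-byte token
			tokens++
		}
	}
	return tokens
}

func main() {
	fmt.Println(countTokens([]byte("let x1 = 42 + y;"))) // 7 tokens
}
```

Every byte of `src` passes through the loop, so lexer throughput is bounded by per-byte work; this is why real scanners lean on tight dispatch tables and, sometimes, SIMD.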

replies(1): >>44580487 #
2. norskeld No.44580487
Ah, yes, that's totally fair. In the case of JS (in browsers) it's sort of a big deal, I suppose, even if the scripts being loaded are not render-blocking: the faster you lex and parse source files, the faster the page becomes interactive.

P.S. I absolutely loved "Crafting Interpreters" — thank you so much for writing it!