How Janet's PEG module works

(bakpakin.com)

81 points behnamoh | 1 comments | 11 Apr 25 02:04 UTC

norir ◴[14 Apr 25 19:07 UTC] No.43684922[source]▶

I am not a fan of PEG. It is straightforward to write a fast parser generator for languages that require just one character of lookahead to disambiguate any alternation in the grammar. This gets you most of the expressivity of PEG and nearly optimal performance (since you only need to look at one character to disambiguate and there is no backtracking). Just as importantly, it avoids the implicit ambiguities that PEG's resolution algorithm can hide from the grammar author, which lead to unexpected parse results that can be quite difficult to debug and/or fix in the grammar.

It does require a bit more thought to design an unambiguous language but I think it's worth it. While there is a learning curve for designing such languages, it becomes natural with practice and it becomes hard to go back to ambiguous languages.
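To make the "hidden ambiguity" point concrete, here is a minimal sketch in Python, not taken from any real parser generator. The helper names `peg_choice` and `ll1_dispatch` are made up for illustration, and the "grammar" is just a list of literal alternatives. It shows PEG-style ordered choice silently shadowing an alternative, while a one-character lookahead table rejects the same grammar when it is built.

```python
def peg_choice(alternatives, text):
    """PEG-style ordered choice over literal strings: first match wins."""
    for alt in alternatives:
        if text.startswith(alt):
            return alt
    return None


def ll1_dispatch(alternatives):
    """Build a one-character lookahead table (hypothetical helper).

    Raises if two alternatives start with the same character, which is
    the conflict a one-lookahead grammar is not allowed to contain.
    """
    table = {}
    for alt in alternatives:
        first = alt[0]
        if first in table:
            raise ValueError(
                f"conflict on {first!r}: {table[first]!r} vs {alt!r}"
            )
        table[first] = alt
    return table


# With ordered choice the second alternative can never win on this input:
# "a" matches first, so "ab" is shadowed without any warning.
matched = peg_choice(["a", "ab"], "ab")

# The lookahead-table builder reports the same grammar as a conflict up front.
try:
    ll1_dispatch(["a", "ab"])
    conflict = None
except ValueError as exc:
    conflict = str(exc)
```

The trade-off is the one described above: the table approach forces the grammar author to resolve the overlap by hand, while ordered choice resolves it implicitly by position.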

replies(2): >>43685488 #>>43685901 #

janzer ◴[14 Apr 25 19:48 UTC] No.43685488[source]▶

>>43684922 #

For those further interested in PEG vs LL(1) parsers: the first few sections of the Python PEP[1], where they switched from an LL(1) to a PEG parser for CPython, have a nice short introduction to both and to their rationale for switching.

[1] https://peps.python.org/pep-0617/

replies(1): >>43685644 #

PaulHoule ◴[14 Apr 25 19:59 UTC] No.43685644[source]▶

>>43685488 #

It still seems to me the PEG revolution hasn't arrived.

PEG has the possibility for composable grammars (why not smack some SQL code in the middle of Python?) but it needs a few more affordances, particularly an easy way to handle operator precedence.

I think current parser generators suck and that more programmers would be using them if anybody cared about making compiler technology easier to use but the problems are: (1) people who understand compiler technology can get things done with the awful tools we have now and (2) mostly those folks think it is performance über alles.

With the right tools the "Lisp is better because it is homoiconic" argument would finally die. With properly architected compilers, adding

  unless(X) { .. } -> if(!X) { ... }

to Java would take just one grammar production, one transformation operator and maybe a new class in the AST (which might be codegenned), plus something to tell the compiler where to find these things. Less code than the POM file.
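As a rough sketch of how small the transformation half could be, here is the `unless` to `if(!X)` rewrite over a toy AST, written in Python rather than Java for brevity. Every class name here (`Var`, `Not`, `Block`, `If`, `Unless`) and the `desugar` function are hypothetical; this is not the API of any real compiler, and the grammar production is left out.

```python
from dataclasses import dataclass


@dataclass
class Var:
    name: str


@dataclass
class Not:
    operand: object


@dataclass
class Block:
    statements: list


@dataclass
class If:
    condition: object
    body: Block


@dataclass
class Unless:  # the one new AST class
    condition: object
    body: Block


def desugar(node):
    """Rewrite every Unless(c, b) into If(Not(c), b), recursing into bodies."""
    if isinstance(node, Unless):
        return If(Not(node.condition), desugar(node.body))
    if isinstance(node, If):
        return If(node.condition, desugar(node.body))
    if isinstance(node, Block):
        return Block([desugar(s) for s in node.statements])
    return node  # leaves such as Var are returned unchanged


# unless(done) { step }  ->  if(!done) { step }
tree = Block([Unless(Var("done"), Block([Var("step")]))])
lowered = desugar(tree)
```

After this pass the rest of the pipeline only ever sees `If`, so nothing downstream needs to know that `unless` exists.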

I gave up on reStructuredText because it didn't support unparsing: I could picture all kinds of scenarios where I'd want to turn something else into RST, or take RST, mix it up against other data and turn it back into RST. RST had the potential to work with or without a schema but that potential was never realized.

replies(2): >>43685801 #>>43687197 #

behnamoh ◴[14 Apr 25 20:12 UTC] No.43685801[source]▶

>>43685644 #

> "Lisp is better because it is homoiconic"

- Lisp is better because it manipulates the same data that the program code is represented in (car works on a data list, and it works on a code list as well).

- Lisp is better (at least, Common Lisp) because of image-and-REPL-driven development. Good luck finding exactly that level of flexibility in other REPL-ful languages.

- Lisp is better because of hot code reloading and restarts. Only Elixir/Erlang have a similar mechanism.

- Lisp is better because of structural editing (e.g., paredit). No more character-level editing.

I could go on but just wanted to point out that homoiconicity isn't the entire deal with Lisp.

replies(1): >>43686499 #

fuzztester ◴[14 Apr 25 21:27 UTC] No.43686499[source]▶

>>43685801 #

>> "Lisp is better because it is homoiconic"

>- Lisp is better because it manipulates the same data that the program code is represented in (car works on a data list, and it works on a code list as well).

Don't those two sentences mean the same?

https://news.ycombinator.com/item?id=43676798

replies(1): >>43686566 #

behnamoh ◴[14 Apr 25 21:33 UTC] No.43686566[source]▶

>>43686499 #

Yeah, but I wanted to emphasize that homoiconicity isn't just some superficial "nice thing" to have, it literally is why we can have powerful macros in Lisp.
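A rough illustration of the idea, using Python lists as a stand-in for s-expressions (so this is an analogy, not real Lisp, and `car` and `expand_unless` are names I made up for the sketch): the same list operation works on data and on code, and a "macro" is just a function from one code list to another.

```python
def car(lst):
    """First element of a list, whether the list holds data or code."""
    return lst[0]


def expand_unless(form):
    """Macro as a plain function over code lists.

    ['unless', c, *body] -> ['if', ['not', c], ['do', *body]]
    """
    _, condition, *body = form
    return ["if", ["not", condition], ["do", *body]]


data = [1, 2, 3]
code = ["unless", "done", ["step"], ["log", "tick"]]

head_of_data = car(data)       # operates on a data list
head_of_code = car(code)       # the same function on a code list
expanded = expand_unless(code)  # code in, code out
```

No parser or separate AST type is involved; the program text is already the list the macro manipulates.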

replies(2): >>43686778 #>>43688535 #

PaulHoule ◴[14 Apr 25 21:59 UTC] No.43686778[source]▶

>>43686566 #

I went through quite a few stages of grief reading Graham's On Lisp, starting with "this is so awesome" to nitpicking details like "he defined everything else but he didn't define nconc" to "if he was using Clojure he wouldn't be having these problems with nconc" to "funny I can write 80%+ of his examples in Python because most of the magic is in first-class functions and macros are a performance optimization except for that last bit about continuations... and Python has async anyway!"

Notably he doesn't do any interesting tree transformations on the code, because tree structures in Lisp are just a collection of twisty nameless tuples that all look alike. If you were trying to do nontrivial transformations on code you'd be better off with an AST in a language like Java or TypeScript. In the end the dragon book is On Lisp squared or cubed; that is, the games people play with macros are a pale shadow of what you can do if you actually understand how compilers work.
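For a sense of what a transformation over a named, typed AST looks like, here is a small sketch using Python's standard `ast` module (it needs Python 3.9+ for `ast.unparse`). The `FoldAdd` class is my own toy example, not from On Lisp or the dragon book: it folds integer additions bottom-up, and every node it touches has a named type and named fields rather than a position in a nested list.

```python
import ast


class FoldAdd(ast.NodeTransformer):
    """Fold `<int> + <int>` into a single constant, bottom-up."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold the children first
        if (
            isinstance(node.op, ast.Add)
            and isinstance(node.left, ast.Constant)
            and isinstance(node.right, ast.Constant)
            and isinstance(node.left.value, int)
            and isinstance(node.right.value, int)
        ):
            total = node.left.value + node.right.value
            # Replace the whole BinOp with one Constant at the same location.
            return ast.copy_location(ast.Constant(value=total, kind=None), node)
        return node


tree = ast.parse("x = (1 + 2) + y + (3 + 4)")
folded = ast.fix_missing_locations(FoldAdd().visit(tree))
result = ast.unparse(folded)  # source text regenerated from the rewritten tree
```

The pattern match is on `ast.BinOp`, `ast.Add` and `ast.Constant` by name, which is the part that gets awkward when every node is an anonymous list.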
