How Janet's PEG module works

(bakpakin.com)

81 points behnamoh | 1 comment | 11 Apr 25 02:04 UTC

norir ◴[14 Apr 25 19:07 UTC] No.43684922[source]▶

I am not a fan of PEG. It is straightforward to write a fast parser generator for languages that require just one character of lookahead to disambiguate any alternation in the grammar. This gets you most of the expressivity of PEG and nearly optimal performance (since you only need to look at one character to disambiguate and there is no backtracking). Just as importantly, it avoids the implicit ambiguities that PEG's resolution algorithm can hide from the grammar author. Those hidden ambiguities lead to unexpected parse results that can be quite difficult to debug and/or fix in the grammar.

It does require a bit more thought to design an unambiguous language but I think it's worth it. While there is a learning curve for designing such languages, it becomes natural with practice and it becomes hard to go back to ambiguous languages.
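To make the hidden-ambiguity point concrete, here is a minimal sketch in Python (not Janet's PEG module, and not any real parser generator). The combinators `lit`, `choice`, `full_match` and the checker `first_char_conflicts` are hypothetical names invented for illustration, and the grammar is limited to literal alternatives. It shows PEG-style ordered choice silently shadowing an alternative, and a one-character-lookahead check that reports the same grammar as a conflict instead.

```python
# Tiny PEG-style combinators. Each parser takes (text, pos) and returns
# the new position on success, or None on failure.

def lit(s):
    """Match the literal string s at the current position."""
    def parse(text, pos):
        return pos + len(s) if text.startswith(s, pos) else None
    return parse


def choice(*alts):
    """PEG ordered choice: commit to the first alternative that matches."""
    def parse(text, pos):
        for alt in alts:
            end = alt(text, pos)
            if end is not None:
                return end  # later alternatives are never tried
        return None
    return parse


def full_match(parser, text):
    """True only if the parser consumes the entire input."""
    return parser(text, 0) == len(text)


# Hypothetical grammar: keyword <- "in" / "int"
# "in" always wins, so "int" can never be matched in full.
shadowed = choice(lit("in"), lit("int"))
# Reordering the alternatives fixes it, but nothing warned the author.
reordered = choice(lit("int"), lit("in"))


def first_char_conflicts(alternatives):
    """One-character-lookahead check over literal alternatives.

    Returns pairs of alternatives that start with the same character,
    i.e. the cases a single character of lookahead cannot disambiguate.
    """
    seen = {}
    conflicts = []
    for alt in alternatives:
        first = alt[0]
        if first in seen:
            conflicts.append((seen[first], alt))
        else:
            seen[first] = alt
    return conflicts
```

With these definitions, `full_match(shadowed, "int")` is false while `full_match(reordered, "int")` is true, and `first_char_conflicts(["in", "int"])` flags the pair up front, so the author has to left-factor the rule rather than discover the problem from a bad parse.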

replies(2): >>43685488 #>>43685901 #

janzer ◴[14 Apr 25 19:48 UTC] No.43685488[source]▶

>>43684922 #

For those interested in reading more about PEG vs. LL(1) parsers: the first few sections of the Python PEP[1] that switched CPython from an LL(1) parser to a PEG parser give a nice short introduction to both, along with the rationale for the switch.

[1] https://peps.python.org/pep-0617/

replies(1): >>43685644 #

PaulHoule ◴[14 Apr 25 19:59 UTC] No.43685644[source]▶

>>43685488 #

It still seems to me the PEG revolution hasn't arrived.

PEG makes composable grammars possible (why not smack some SQL code in the middle of Python?) but it needs a few more affordances, particularly an easy way to handle operator precedence.
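For a sense of what an "easy way to handle operator precedence" can look like, here is a minimal sketch of precedence climbing in Python. This is a general hand-written technique, not a feature of any particular PEG tool, and the operator table `OPS`, the helper `tokenize`, and the tuple-shaped tree are all assumptions made for illustration.

```python
import re

# Hypothetical operator table: symbol -> (precedence, right_associative)
OPS = {
    "+": (1, False),
    "-": (1, False),
    "*": (2, False),
    "/": (2, False),
    "^": (3, True),
}


def tokenize(src):
    """Split into integers, operators and parentheses (other text is skipped)."""
    return re.findall(r"\d+|[-+*/^()]", src)


def parse(tokens):
    """Parse a token list into nested (op, left, right) tuples."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        return tok

    def atom():
        tok = advance()
        if tok == "(":
            node = expr(1)
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if tok.isdigit():
            return int(tok)
        raise SyntaxError(f"unexpected token {tok!r}")

    def expr(min_prec):
        left = atom()
        # Keep absorbing operators that bind at least as tightly as min_prec.
        while peek() in OPS and OPS[peek()][0] >= min_prec:
            op = advance()
            prec, right_assoc = OPS[op]
            # Left-associative operators require a strictly tighter right side.
            right = expr(prec if right_assoc else prec + 1)
            left = (op, left, right)
        return left

    tree = expr(1)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree
```

Adding an operator here is one new row in the table rather than a new layer of grammar rules, which is roughly the kind of affordance being asked for.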

I think current parser generators suck, and that more programmers would be using them if anybody cared about making compiler technology easier to use. The problems are: (1) people who understand compiler technology can get things done with the awful tools we have now, and (2) mostly those folks think performance is über alles.

With the right tools, the "Lisp is better because it is homoiconic" argument would finally die. With properly architected compilers, adding

  unless(X) { ... } -> if(!X) { ... }

to Java would take just one grammar production, one transformation operator and maybe a new class in the AST (which might be codegenned), plus something to tell the compiler where to find these things. Less code than the POM file.
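As a rough illustration of the "one transformation operator, one new AST class" idea, here is a minimal sketch in Python rather than Java. The node classes `If`, `Unless`, `Not` and the function `desugar` are hypothetical, conditions and statements are plain strings for brevity, and no real compiler's API is implied.

```python
from dataclasses import dataclass


@dataclass
class Not:
    operand: object


@dataclass
class If:
    cond: object
    body: list


@dataclass
class Unless:
    # The one new AST class for the extension.
    cond: object
    body: list


def desugar(node):
    """Rewrite every Unless(cond, body) into If(Not(cond), body), recursively."""
    if isinstance(node, Unless):
        return If(Not(desugar(node.cond)), [desugar(s) for s in node.body])
    if isinstance(node, If):
        return If(desugar(node.cond), [desugar(s) for s in node.body])
    if isinstance(node, Not):
        return Not(desugar(node.operand))
    return node  # leaves (plain strings here) pass through unchanged
```

For example, `desugar(Unless("x", ["stmt"]))` yields `If(Not("x"), ["stmt"])`. The grammar production and the hook that tells the compiler where to find the pass are the parts this sketch leaves out.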

I gave up on reStructuredText because it didn't support unparsing: I could picture all kinds of scenarios where I'd want to turn something else into RST, or take RST, mix it up against other data, and turn it back into RST. RST had the potential to work with or without a schema, but that potential was never realized.

replies(2): >>43685801 #>>43687197 #

1. zem ◴[14 Apr 25 22:56 UTC] No.43687197[source]▶

>>43685644 #

brag is a pretty user-friendly parser generator for Racket: https://docs.racket-lang.org/brag/index.html
