    429 points rui314 | 13 comments
    1. peter303 ◴[] No.10732006[source]
    Long ago UNIX had compiler writing tools like yacc and lex. I wonder if they are useful for exercises like this.
    replies(6): >>10732067 #>>10732084 #>>10732219 #>>10732591 #>>10732595 #>>10732964 #
    2. DSMan195276 ◴[] No.10732067[source]
    For writing a general compiler (or anything similar to that), they're extremely useful because they produce very good lexers and/or parsers. The GNU versions of those two are 'bison' and 'flex'. Really, just about anything that requires parsing text can gain from using both of them.

    That said, for this specific exercise they're not as useful, because the author intended this compiler to be self-hosting. Self-hosting would be hard if the compiler had to be able to compile the C code generated by yacc or lex, which may do any number of strange things.

    replies(1): >>10732293 #
    3. fahadkhan ◴[] No.10732084[source]
    "Particularly, I'd use yacc instead of writing a parser by hand and introduce an intermediate language early on." Near the end of the article.
    4. nly ◴[] No.10732219[source]
    They might, although no production quality C or C++ compiler uses anything other than a hand-rolled recursive descent parser, afaik.

    The lex, parse and AST directories in Clang's source tree are ~100,000 LOC combined, and all hand-written.

    replies(2): >>10732576 #>>10735743 #
    5. vidarh ◴[] No.10732293[source]
    With the huge caveat that "nobody" uses these tools for production compilers because decent error handling becomes a nightmare.

    There are exceptions, but if you dig into most larger compilers, they usually sooner or later end up adopting their own handwritten recursive descent parsers.

    replies(1): >>10733910 #
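As a minimal sketch of why hand-written recursive descent makes error reporting straightforward: each parsing function knows exactly what it expected at the point of failure, so it can say so. The toy grammar, the `Parser` class, and the `ParseError` type below are hypothetical names chosen for illustration; none of them come from the thread or from any real compiler.

```python
# Hypothetical toy grammar (an assumption for illustration):
#   expr   := term (('+' | '-') term)*
#   term   := factor ('*' factor)*
#   factor := NUMBER | '(' expr ')'


class ParseError(Exception):
    """Raised with a column number so the message points at the problem."""


def tokenize(text):
    """Return (kind, value, column) tuples, ending with an 'eof' token."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
        elif text[pos].isdigit():
            start = pos
            while pos < len(text) and text[pos].isdigit():
                pos += 1
            tokens.append(("num", text[start:pos], start))
        else:
            tokens.append(("op", text[pos], pos))
            pos += 1
    tokens.append(("eof", "", len(text)))
    return tokens


class Parser:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.i = 0

    def peek(self):
        return self.tokens[self.i]

    def fail(self, expected):
        # The caller states what it wanted; we add where and what we saw.
        _, value, col = self.peek()
        found = repr(value) if value else "end of input"
        raise ParseError(f"column {col}: expected {expected}, found {found}")

    def expr(self):
        value = self.term()
        while self.peek()[1] in ("+", "-"):
            op = self.peek()[1]
            self.i += 1
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):
        value = self.factor()
        while self.peek()[1] == "*":
            self.i += 1
            value *= self.factor()
        return value

    def factor(self):
        kind, value, _ = self.peek()
        if kind == "num":
            self.i += 1
            return int(value)
        if value == "(":
            self.i += 1
            inner = self.expr()
            if self.peek()[1] != ")":
                self.fail("')'")
            self.i += 1
            return inner
        self.fail("a number or '('")

    def parse(self):
        value = self.expr()
        if self.peek()[0] != "eof":
            self.fail("an operator or end of input")
        return value
```

With this sketch, `Parser("1 + 2 * (3 - 1)").parse()` evaluates the expression, while `Parser("1 + (2 * 3").parse()` raises a `ParseError` whose message names the column and the missing `')'`. Getting a comparably specific message out of a generated LALR parser usually takes extra error productions, which is roughly the pain the comment describes.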
    6. nickpsecurity ◴[] No.10732576[source]
    Semantic Designs toolkit uses GLR to handle about everything one could think of:

    http://www.semanticdesigns.com/Products/DMS/DMSToolkit.html

    replies(1): >>10733685 #
    7. eliben ◴[] No.10732591[source]
    Here's a complete C frontend that uses the lex/yacc approach - https://github.com/eliben/pycparser (the Python ports of them, that is)

    FWIW, if I had to do this again today I would certainly go for hand-written recursive descent. lex/yacc charm you in but eventually prove to be much more difficult to tweak and reason about.

    8. kayamon ◴[] No.10732595[source]
    C is actually easier to parse by hand because of its wacky declaration syntax.
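One commonly cited example of that declaration syntax, offered here as an assumption about what the comment has in mind: the C statement `T * x;` declares a pointer if `T` is a typedef name and is a multiplication otherwise. A hand-written parser can simply consult its symbol table at that point. The `classify` function below is a hypothetical sketch, not code from any real compiler.

```python
def classify(statement, typedef_names):
    """Classify a statement of the form 'A * b;' using known typedef names."""
    tokens = statement.replace(";", " ;").replace("*", " * ").split()
    if len(tokens) == 4 and tokens[1] == "*" and tokens[3] == ";":
        # Same tokens, different meaning: the answer depends on earlier
        # declarations, not on the grammar alone.
        if tokens[0] in typedef_names:
            return "declaration"
        return "expression"
    raise ValueError("this sketch only handles statements of the form 'A * b;'")


# 'T' was introduced by a typedef, so 'T * x;' declares x as a pointer to T.
print(classify("T * x;", {"T"}))
# No typedef named 'a' exists, so 'a * b;' is a multiplication expression.
print(classify("a * b;", {"T"}))
```

A generated parser needs the lexer to feed this context back into the token stream, whereas a recursive descent parser can do the lookup inline where the decision is made.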
    9. rui314 ◴[] No.10732964[source]
    That's what I want to try next time. There are pros and cons to using a parser generator, but the tool seems useful, at least if you want to create a small compiler that doesn't aim for Clang-level diagnostics.
    10. marktangotango ◴[] No.10733685{3}[source]
    Ambiguous use of 'handle' here. For an ambiguous parse, GLR gives a forest of possible alternatives that still has to be disambiguated. Interesting to see a reference to Semantic Designs here. Ira Baxter used to pimp DMS on Stack Overflow quite a lot, but I have not seen anything from them in years. Are they still in business?
    replies(1): >>10733927 #
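To make the "forest of alternatives" concrete, here is a small sketch that enumerates every parse tree for an ambiguous grammar and then disambiguates by picking the left-associative one. This is a brute-force enumeration, not a GLR implementation, and the grammar `E -> E '+' E | NUM` and both function names are assumptions chosen for illustration.

```python
def all_parses(tokens):
    """Return every parse tree of tokens under E -> E '+' E | NUM."""
    if len(tokens) == 1:
        return [tokens[0]] if tokens[0].isdigit() else []
    trees = []
    for i, tok in enumerate(tokens):
        if tok == "+":
            # Try this '+' as the root and combine every left/right subtree.
            for left in all_parses(tokens[:i]):
                for right in all_parses(tokens[i + 1:]):
                    trees.append((left, "+", right))
    return trees


def is_left_assoc(tree):
    """True if no '+' node has another '+' node as its right child."""
    if isinstance(tree, str):
        return True
    left, _, right = tree
    return isinstance(right, str) and is_left_assoc(left)


forest = all_parses("1 + 2 + 3".split())
for tree in forest:
    print(tree)

# Disambiguation step: keep only the left-associative reading.
chosen = [t for t in forest if is_left_assoc(t)]
print(chosen)
```

For `1 + 2 + 3` the forest holds two trees, and the filter keeps `(('1', '+', '2'), '+', '3')`. A real GLR parser shares structure between the alternatives instead of listing them, but the point stands: something after the parse still has to choose.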
    11. DSMan195276 ◴[] No.10733910{3}[source]
    That's a fair point. Those tools have their uses, definitely - for a 'toy' compiler like this one, a simple lex lexer or yacc parser would/could be sufficient. Once you get more complex they start to become a limiting factor.
    12. nickpsecurity ◴[] No.10733927{4}[source]
    Having tried to write similar stuff, I was quite impressed with the claimed capabilities of the tool and how they went about it. He summarizes some of the issues here:

    http://www.semanticdesigns.com/Products/DMS/LifeAfterParsing...

    The stuff they support was also significant. Personally, I always thought they should open-source that, then make their money on good front-ends, transformation heuristics, and services. As with LLVM, the academic community would make the core tool much better over time.

    As far as being in business, the site is still up with more content than before and a current copyright. The last news release was in 2012. Not sure if that's a bad sign or just a business that focuses less on PR. There's a paper or two on ResearchGate from 2015 promoting it, and he's still on Stack Overflow, but with fewer DMS references because of moderator pressure (explained in his profile). So, probably still in business.

    My quick, top-of-head assessment of their situation, at least. Might be way off. :)

    13. alricb ◴[] No.10735743[source]
    Not now, no, but back in the 4.0 days (2005) GCC still used a yacc/bison parser for C; by then it had already switched to a hand-written parser for C++. The C++ yacc parser was still in use as of GCC 3.3 (2003).