517 points petercooper | 30 comments
1. marcofiset ◴[] No.8558994[source]
I honestly think this is ridiculous. Sure, this is an incredible feat, and congrats. But serioulsy, I would be ashamed to publish such unreadable code under my name.

What about naming your variables with descriptive names?

What about extracting complex conditions into well-named functions to understand what is going on (thus defeating the purpose of the "4 functions")?

This list could go on forever...

Writing software is not a contest for who can write the most amount of code in the most cryptic way.

replies(12): >>8558999 #>>8559001 #>>8559009 #>>8559056 #>>8559071 #>>8559139 #>>8559196 #>>8559249 #>>8559421 #>>8560270 #>>8560608 #>>8561021 #
2. tehwalrus ◴[] No.8558999[source]
The variable names thing really annoyed me too - a habit of code golf, and people who were originally trained in an old FORTRAN edition that had a 6 or 7 char limit on names.
replies(2): >>8559484 #>>8560744 #
3. privong ◴[] No.8559001[source]
> Writing software is not a contest for who can write the most amount of code in the most cryptic way.

It can be: http://ioccc.org/

replies(2): >>8559048 #>>8559545 #
4. mappu ◴[] No.8559009[source]
I assume it's a minified version of the other C compiler in his GitHub account, which does have better variable names and some comments. https://github.com/rswier/swieros/blob/master/root/bin/c.c
5. Lerc ◴[] No.8559048[source]
And indeed, that is how TCC http://bellard.org/tcc/ began its life http://bellard.org/otcc/
6. mturmon ◴[] No.8559056[source]
> This list could go on forever...

Yes it could. While you are adding to your list of coding rules, the OP will have written another fun, tiny compiler.

Who is having more fun?

replies(1): >>8559104 #
7. cbhl ◴[] No.8559071[source]
I actually found the code surprisingly easy to read; "tk" stands for "token", "ty" for "type", and so forth.

It's worth noting that compilers don't pop out of thin air -- you have to start with something simple in order to compile a more complicated compiler. Bootstrapping your own self-hosting compiler is a useful academic exercise, and you should try it sometime if you haven't already: http://en.wikipedia.org/wiki/Bootstrapping_(compilers)

replies(1): >>8559108 #
8. marcofiset ◴[] No.8559104[source]
Those are not rules, they are mostly common sense to help the next programmer who has to maintain your code.
replies(1): >>8559141 #
9. marcofiset ◴[] No.8559108[source]
Well, if this particular compiler is defined in 4 functions, why couldn't it be made out of more functions, enhancing the readability and maintainability of the code?
replies(2): >>8559144 #>>8559242 #
10. fla ◴[] No.8559139[source]
The goal here is to write a compiler that can compile itself.
replies(1): >>8562362 #
11. afandian ◴[] No.8559141{3}[source]
And most common sense of all is that you choose your rules for the audience. This is obviously not production code.

In any case if someone were competent enough to work on it then the style is actually quite readable. It's even full of comments.

12. afandian ◴[] No.8559144{3}[source]
Who says it's designed to be maintainable? Not everyone writes code for the reason that everyone else does.
13. more_original ◴[] No.8559196[source]
> I honestly think this is ridiculous. Sure, this is an incredible feat, and congrats.

It's not that bad (or difficult), really. It's a hand-written parser for a subset of C that emits assembly code right away. This is how compilers like Turbo Pascal used to work (see http://www.pcengines.ch/tp3.htm for an explanation of what's happening).

Sure, you could apply cosmetic changes like making "*++e = bla;" into "emit(bla);" and you could move the cases into independent methods (that are used once), but this isn't meant to be a state-of-the-art compiler and it won't become one if one applies best practices to it.
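
A minimal sketch of that cosmetic change, assuming `e` is a pointer into an int buffer of emitted code that always points at the last written slot. The buffer name `code`, its size `CODE_MAX`, and the helper `emitted` are made up for illustration and are not taken from the project:

```c
#include <assert.h>

enum { CODE_MAX = 256 };    /* hypothetical buffer size */

static int code[CODE_MAX];  /* emitted instruction words */
static int *e = code;       /* last written slot; slot 0 stays unused */

/* Same effect as the inline idiom `*++e = word;`, plus a bounds check. */
static void emit(int word)
{
    assert(e < code + CODE_MAX - 1);
    *++e = word;
}

/* Number of words emitted so far. */
static int emitted(void)
{
    return (int)(e - code);
}
```

With this in place, each `*++e = bla;` becomes `emit(bla);`. The behaviour is the same apart from the added check, which is the sense in which the change is cosmetic.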

replies(1): >>8559773 #
14. cbhl ◴[] No.8559242{3}[source]
There is no reason why it couldn't be made out of more functions. Absolutely none. In fact, you can fork the repository and do the refactoring yourself, right now.

If I had to guess, the functions are tied pretty closely to how the author is parsing the file; "next" (next token), "expr" (expression), "stmt" (statement), and "main".
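
To illustrate how a tokenizer function and an expression function fit together, here is a rough sketch of the general hand-written-parser shape. It is not the project's code: it evaluates `+`, `*` and parentheses directly instead of emitting anything, and the helpers `primary`, `term` and `eval`, as well as the token code `Num = 128`, are assumptions made for the example.

```c
#include <assert.h>
#include <ctype.h>

enum { Num = 128 };     /* token code for an integer literal; other tokens are their own character */

static const char *p;   /* read position in the source text */
static int tk;          /* current token */
static int ival;        /* value of the current token when tk == Num */

/* next: advance to the next token, skipping spaces; 0 marks end of input */
static void next(void)
{
    while (*p == ' ') p++;
    if (isdigit((unsigned char)*p)) {
        ival = 0;
        while (isdigit((unsigned char)*p)) ival = ival * 10 + (*p++ - '0');
        tk = Num;
    } else {
        tk = *p ? *p++ : 0;
    }
}

static int expr(void);

/* primary: a number or a parenthesized expression */
static int primary(void)
{
    int v;
    if (tk == Num) { v = ival; next(); return v; }
    assert(tk == '(');  /* sketch only: malformed input simply aborts */
    next();
    v = expr();
    assert(tk == ')');
    next();
    return v;
}

/* term: primaries joined by '*' */
static int term(void)
{
    int v = primary();
    while (tk == '*') { next(); v *= primary(); }
    return v;
}

/* expr: terms joined by '+' */
static int expr(void)
{
    int v = term();
    while (tk == '+') { next(); v += term(); }
    return v;
}

/* eval: set up the tokenizer and parse one expression */
static int eval(const char *src)
{
    p = src;
    next();
    return expr();
}
```

Calling `eval("2 + 3 * 4")` yields 14. Every parsing function relies on `tk` already holding the current token when it is entered, which is why the tokenizer and the parser end up so tightly coupled.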

As for the project being called "C in 4 functions"; at best, I'd argue that's just a linkbait-y title since it's not actually C (it's a subset). I don't have a problem with the code _per se_; just the title.

15. karlgrz ◴[] No.8559249[source]
I honestly think this comment is ridiculous. Sure, this is an incredible feat, and congrats. But seriously, I would be ashamed to publish such unreadable English under my name. What about spelling all the words correctly? What about not being an asshole and criticizing someone just because? This list could go on forever... Writing comments is not a contest for who can write the most amount of bullshit in the most non-constructive way.
16. PavlovsCat ◴[] No.8559421[source]
Writing software is not a contest, period :) It may be for some, it may be for you, but you don't get to sign up other people for it without their consent.
17. _wmd ◴[] No.8559484[source]
Unless you're talking about a large piece of software composed entirely of single character functions and variable names, I pretty much disagree. Verbose variable names do not magically teach those reading a piece of code how it works; at the same time, they tend to make it impossible to write many kinds of expressions concisely, and consequently they regularly damage the readability of more complex pieces of code (e.g. arithmetic expressions involving 3 or more terms).

The same goes for symbolic constants. Sometimes (but not exactly "often"), use of numeric literals can vastly improve the maintainability of some code, assuming the reader understands how to maintain it in the first instance.

As for increasing reader comprehension, carefully thought out comments are a better mechanism by far.

In this case, it is sufficient to know that the file is a compiler/interpreter for its entirety to make sense, assuming the reader has implemented (or at least understood the principles behind) a compiler/interpreter in their past. Expanding the function/variable names, splitting "complicated" expressions out into their own functions, etc., does not magically improve the uninitiated's chance of understanding what is going on.
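
As a small illustration of the point about arithmetic expressions, here is the same three-term formula written twice. Both function names and all parameter names are invented for the example:

```c
/* Terse names: the discriminant formula is recognizable at a glance. */
static int disc_short(int a, int b, int c)
{
    return b * b - 4 * a * c;
}

/* Verbose names: identical arithmetic, but the shape of the formula is harder to scan. */
static int disc_long(int quadratic_coefficient, int linear_coefficient, int constant_term)
{
    return linear_coefficient * linear_coefficient
         - 4 * quadratic_coefficient * constant_term;
}
```

Which version reads better depends on whether the reader already knows the formula, which is the same assumption about the audience made above.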

replies(2): >>8560612 #>>8562188 #
18. tromp ◴[] No.8559545[source]
The IOCCC is more about writing the least amount of code in the most cryptic way.

For example, my entry http://www.ioccc.org/2012/tromp/hint.html is a 25 line "BLC in 7 functions".

Similar to C4, but completely unmaintainable, it compiles Binary Lambda Calculus to bytecode which is then interpreted by a virtual machine.

19. sehugg ◴[] No.8559773[source]
I don't think it supports forward references either, which also makes it more like Pascal.
20. nikki93 ◴[] No.8560270[source]
But, "it is only as an aesthetic phenomenon that existence and the world are eternally justified." This piece of code can be viewed aesthetically. Artificial guidelines take the fun out of coding.
21. fsloth ◴[] No.8560608[source]
Look, one aspect of programming is producing production code that is maintainable, etc.

There are other aspects that are equally important, and sometimes even more so. Exploring, learning and prototyping are among them, as are "back of the envelope" constructs.

This is a really cool example that shows how far you can go with tiny amounts of code.

Pithiness is actually one way to make code _more_ understandable, given that the reader is familiar with the subject area.

Don't be a snob. The world is a better place because someone wrote this and not worse off.

Since compiler construction is kinda a deep technical field, documenting this in a pedagogically sane way would have been a huge task.

I'm happy the author took the effort of writing and publishing this piece; it would be sad if he hadn't just because it's missing an embedded tutorial.

He's not asking you to maintain it. There are references in this thread that explain what is going on...

I write and maintain C++ production code as a day job and some of that stuff is an order of magnitude harder to grok than this (no, it's not documented in any _helpful_ way either).

22. ajkjk ◴[] No.8560612{3}[source]
I mostly strongly disagree.

I see no value in naming a variable 'tk', 'pp', or 'bt'. Longer names can only help to make the code more readable with less context. I do not need to understand compilers in detail to know what this program is doing, except for the names being useless. And if I do understand them but have not spent half an hour or probably much more to digest the exact system by which it operates, I would be completely unable to identify bugs or talk about modifying or extending it in any useful way.

Well written comments do not make the code below them more intelligible as text even if they do give you good hints for how to load it into your brain.

And "Expanding the function/variable names, splitting "complicated" expressions out into their own functions, etc." all ABSOLUTELY improve the uninitiated's chance of understanding what is going on. Speaking as one of the uninitiated (though I can mostly make out how this works). Man, I wish I knew what 'tk' was. That whole section at line 204 would be so clear if I could follow the intent.

I agree with you on symbolic constants and arithmetic though.

replies(1): >>8563301 #
23. emmanueloga_ ◴[] No.8560744[source]
I used to prefer long names too... but now I think short variables have their place, and sometimes they even improve readability.

This code seems to follow a lot of conventions (if I see the var "i" I could bet a million dollars it is an int that is being used as a counter, probably to go through the positions of an array). It uses plenty of enumerated constants, which is good too.

I've been doing some functional programming, where you find that often types are more important than names. See the "Names are overrated" section of this article [1] (although this point may not apply to this piece of software... C being the language that it is :p).

1: http://techblog.realestate.com.au/the-abject-failure-of-weak...

replies(1): >>8562154 #
24. boomlinde ◴[] No.8561021[source]
> Writing software is not a contest for who can write the most amount of code in the most cryptic way.

Except when it is.

25. tehwalrus ◴[] No.8562154{3}[source]
I'm not against local variables for counting called i and j.

I'm against a big list of forward declarations, with tens of different variables, each with a very short name, and a comment explaining what this is an abbreviation for. Just replace the names with the comments, with underscores for spaces, using find and replace. It's a two-minute job for the whole code base, and gives much more readable code almost everywhere.

I agree that functional languages may be a counter case, but codebases in C and Python, in my experience, benefit greatly from well-named variables.

26. tehwalrus ◴[] No.8562188{3}[source]
Writing expressions concisely is an interesting problem, but in order to do, for example, tensor algebra in C you basically must use macros to define a DSL. It is not a goal which can always be achieved, but structures higher than variable names are what allow one to achieve it (usually.)

I didn't say that "verbose" variable names were mandatory everywhere - i and j have their place - but names which are at least pronounceable words are essential, especially if they appear in more places than a five line function.

This project is a special case, certainly, but toy compilers are nothing if not to learn from.

27. jessaustin ◴[] No.8562362[source]
Yeah I loved this part:

  ./c4 c4.c c4.c hello.c

Granted, most compilers can do the equivalent, but it's rarely so simple to invoke.
28. vajrabum ◴[] No.8563301{4}[source]
As you wish. From the program, line 19:

    tk,       // current token
replies(2): >>8566650 #>>8594399 #
29. tehwalrus ◴[] No.8566650{5}[source]
replace all: tk -> current_token
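
One caveat: a plain textual replace would also hit longer identifiers that merely contain "tk". Below is a rough sketch of a whole-identifier rename in C. The function names `is_ident_char` and `rename_ident` are made up for this example, and the sketch makes no attempt to skip comments or string literals:

```c
#include <ctype.h>
#include <string.h>

/* True for characters that can appear inside a C identifier. */
static int is_ident_char(int c)
{
    return isalnum((unsigned char)c) || c == '_';
}

/* Copy src into dst (capacity cap), replacing every whole identifier equal
   to `from` with `to`. Output is truncated if dst is too small.
   Returns the number of characters written, not counting the terminator. */
static size_t rename_ident(const char *src, const char *from, const char *to,
                           char *dst, size_t cap)
{
    size_t flen = strlen(from), tlen = strlen(to), n = 0;
    const char *s = src;

    if (cap == 0) return 0;
    while (*s) {
        if (is_ident_char(*s)) {
            const char *start = s, *out;
            size_t len, outlen;
            while (is_ident_char(*s)) s++;      /* consume the whole word */
            len = (size_t)(s - start);
            if (len == flen && memcmp(start, from, flen) == 0) {
                out = to;    outlen = tlen;     /* exact match: substitute */
            } else {
                out = start; outlen = len;      /* anything else: copy through */
            }
            if (n + outlen >= cap) break;
            memcpy(dst + n, out, outlen);
            n += outlen;
        } else {
            if (n + 1 >= cap) break;
            dst[n++] = *s++;
        }
    }
    dst[n] = '\0';
    return n;
}
```

Given the input `tk = tkn + tk;`, renaming `tk` to `current_token` leaves `tkn` untouched, which a naive find and replace would not.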
30. ajkjk ◴[] No.8594399{5}[source]
Yeah, my point is that it should be swapped out everywhere. There's no reason not to.