Not exactly. You have to remember that language and compiler design require a LOT of work and experience to understand, and that many programmers will only see this as, frankly, spaghetti.
I think it could have used some more block comments, but that's just me.
What's your favorite shorter intro? I'd especially like to reference other educational compilers that are self-hosting.
Of course figuring out what it's doing is one thing - understanding why it is done in this particular way is another, and while I was able to find my way around fairly quickly I'd cry if I had to re-implement it. I do love how small it is though, that gives it great educational value.
The 'why' of a very low-level tools like this is the sort of thing that needs to be explored at length in a paper or (in this case) a book, otherwise they'll swamp the actual code. Sometimes as a learning exercise I'll take something like this and comment the hell out of everything, but the value there is more in writing the comments than trying to read them again later. Of course this is very much a matter of personal taste.
I'm tempted to link to my draft chapter, but though the code is essentially done the text needs a lot of work.
reverseNaturalSortOrder(listOfItems); // case sensitive sort of items by reverse alphabetical order
or
reverseNaturalSortOrder(listOfItems); // sort this way because the upper layer expects items in reverse order since user requested it
I think it is usually significantly easier to understand what something is doing rather than why it is doing that. To answer the former it usually requires a narrow scope of focus, but the latter requires a very broad scope.
Mind, I mostly do hobby/experimental programming, I've just been doing it for a long time. So I'm not commending this as a work practice or anything - your point about business logic made good sense.
I think this code details a special case of the above though, in that it comments what the enums are instead of just naming the enums. I give that a pass strictly because this code needs to be able to compile itself, and I don't think it supports named enums, so the comment was necessary to make up for that.
This is not how you code when you work in a team or when you know some other poor soul has to come along and maintain it.
I suggest you take a look at [1] then go and read this excellent book by Martin Fowler: "Refactoring: Improving the Design of Existing Code" [2]
Not all programming is enterprise quality, some programming is intentionally not.
Being so dismissive of that, seems a little silly to me.
The demoscene doesn't exist because of a bunch of programmers trying to make the lives of every other programmer more difficult, it exists because people like the challenge.
The functions may be big, but they also don't have all that much duplicate code inside them. The choice of 4 functions isn't arbitrary either - they nicely divide the problem into:
- next() - splits the source code into a series of tokens
- expr() - parses expressions
- stmt() - parses statements
- main() - starts the processing of the source, and also contains the main interpreter VM's execution loop
Code generation is integrated into the parsing, since it's generating code for a stack-based machine and that also very nicely follows the sequence of actions performed when parsing.
In fact I'm of the opinion that the obsession with breaking up code into tiny pieces (usually accompanied by the overuse of OOP) is harmful to the understanding of the program as a whole since it encourages looking at each piece independently and misses "seeing the forest for the trees".
In contrast, this is code that is designed to be easily read and understood by a single person, showing how very simple an entire compiler and interpreter/VM can be. It doesn't attempt to hide anything with thick layers upon layers of abstraction and deep chains of function calls, but instead is the "naked essence" of the solution to the problem.
Someone used to e.g. enterprise Java may find this style of code quite jarring to their senses, but that's only because they've grown accustomed to an environment in which everything is highly-abstracted and indirect, hiding the true nature of the solution. Personally, I think the simplicity and "nakedness" of this code has a great beauty to it --- it's a functional work of art.
printf("%8.4s", &"LEA ,IMM ,JMP ,JSR ,BZ ,BNZ ,ENT ,ADJ ,LEV ,LI ,LC ,SI ,SC ,PSH ,"
"OR ,XOR ,AND ,EQ ,NE ,LT ,GT ,LE ,GE ,SHL ,SHR ,ADD ,SUB ,MUL ,DIV ,MOD ,"
"OPEN,READ,CLOS,PRTF,MALC,MSET,MCMP,EXIT,"[*++le * 5]);
Man, I can't even tell what this is supposed to be. My confusion is entirely founded. My thought process with articles like this goes something like "C in four functions, huh? Sounds like it could be clever. I'll just click and read the explanation... Oh, there isn't an explanation. Well, maybe this file will explain things! ...Nope, it's 500 lines of mostly-uncommented if-else statements. Maybe it's a compiler? I dunno!"
I'm sure there's a subset of the programming community for whom this is crystal clear on first sight, and that's great; but there's a lot more of us who could probably get the joke with a few hints, so it would be nice if you'd help out instead of declaring that if you understand it, it must be easy.
int *a, *b;
int t, *d;
I can't really see how anyone can say that code that uses one-letter variable names (with the exception of the "standard ones", whose meaning is defined at the top) is readable.1. Readability for modification
2. Readability for the "what"
3. Readability for the "why"
All human-readable description in code is there to make the difference between having a piece of documentation pointing you in the right direction, and having to reverse engineer your understanding. It's an optimization problem with definite tradeoffs. Although CS professors and many tutorials will tend to encourage you towards heavy description, over-description creates space for inconsistent, misleading documentation, which is worse than "not knowing what it does."
When you see code that is dense and full of short variables, it's written favorably towards modification. It is relying on a summary comment for "what" it does, and perhaps on namespaces, scope blocks, or equivalent guards to keep names from colliding. Such code lets you reach flow state quickly once you've grokked it and are ready to begin work, because more of it fits onscreen, and you're assured the smallest amount of typing and thus can quickly try and revise ideas. The summary often gets to stay the same even if your implementation is completely different. And if lines of code are small, you enjoy a high comments-to-code ratio, at least visually.
Code that builds in the "what" through more descriptive variable names pays a price through being a little harder to actually work in, with big payoffs coming mostly where the idea cannot be captured and summarized so easily through a comment.
In your example, one might instead rework the whole layout of the code so:
var urq; /* user request type */
... (we build list "l")
/* adjustments for user request */
{
if (urq=="rns") { /* case sensitive sort by reverse alphabetical order */ ... }
}
If you aren't reusing the algorithm, inline and summarize. And if you're writing comments about "what the upper layer expects", then you(or your team) spread and sliced up the meaning of the code and created that problem; that kind of comment isn't answering a "why," it's excusing something obtuse and unintuitive in the design, and is a code smell - the premise for that comment is hiding something far worse. If the sequence of events is intentionally meaningful and there's no danger of cut-and-paste modifications getting out of sync, it doesn't need to be split into tiny pieces. Big functions can be well-organized and easy to navigate just with scope blocks, comments, and a code-folding editor."Why" is a complicated thing. It's not really explainable through any one comment, unless the comment is an excuse like the example you gave. The whole program has to be written towards the purpose of explaining itself(e.g. "literate programming"), yet most code is grown organically in a way that even the original programmers can only partially explain, starting with a simple problem and simple data and then gradually expanding in complexity. Experience(and unfamiliar new problem spaces) eventually turns most programmers towards the organic model, even if they're aware of top-down architecture strategies. Ultimately, a "why" has to ask Big Questions about the overall purpose of the software; software is a form of automation, and the rationale for automating things has to be continuously interrogated.
First thing to say is that "* ++le" is the integer representing the operation to perform. This basically walks through the array of instructions returning each integer in turn.
Starting at the beginning of the line, we have "printf" with a format string of "%8.4s". This means print out the first 4 characters of the string that I pass next (padded to 8 characters). There then follows a string containing all of the operation names, in numerical order, padded to 4 characters and separated by commas (so the start of each is 5 apart). Finally, we do a lookup into this string (treating it as an array) at offset "* ++le * 5", i.e. the integer representing the operation multipled by 5 (5 being the number of characters between the start of each operation name). Doing this lookup gives us a char, but actually we wanted the pointer to this char (as we want printf to print out this char and the following 3 chars), so we take the address of this char (the & at the beginning of the whole expression).
It's concise, but not exactly self-documenting.
Does that make sense?
(I didn't downvote.)
There is one obsession that I am tired of is people posting "X awesome thing in 120 lines of JavaScript" or "Y in 4 functions". Just because the problem is reduced to small as possible metric, it doesn't make it good.
PS: Also you mention negatively connotated terms like "OOP", "Java", "thick layers of abstraction" and "deep chains of function calls" as if you've ascertained that I'm some enterprise developer that doesn't have any C experience and wouldn't know simplicity if it bit me in the ass.
If quite a few people are saying that it's unreadable, you don't just dismiss them as making "unfounded" statements.
> The demoscene doesn't exist because of a bunch of programmers trying to make the lives of every other programmer more difficult, it exists because people like the challenge.
I'm pretty sure that this doesn't need to be stated.
"Toy examples" are often the result of long & deep study and practice of a subject, creating something profound which casual observers are not entitled to instantly understand. In this case, it's a very clever compiler: everybody understands this summary, and if you want "a few contextual comments" beyond the source code itself then you know where to get enough information to learn what you need to understand this.
If you don't "get it", and don't want to "get it" on your own, it's not for you.
So, yes, it's either you're curious enough to dig into a code and find the relevant explanations somewhere else (the said Dragon Book and alike), or you won't get it, regardless of how comprehensive comments are.
I hope someone will write an explanation. I'm still working on one for my own (quite different) little compiler.
With code like this, the readability obviously isn't a high priority consideration and sometimes the exact opposite of the goal, with the impenetrability sometimes being part of its charm. This is Hacker News after all, and if your reaction to of a piece of code that describes itself as "an exercise in minimalism" is to leave a snarky comment about the lack of documentation, you should probably check your news elsewhere.
If you have any interest in the subject, the initial comment "just enough features to allow self-compilation and a bit more" should give the purpose of the code away. If not, it ought to have been a clear sign of dragons.
This submission in particular has dubious practical use, and the description, "an exercise in minimalism", is telling of a sort of artistic intent. If you don't understand what it does, how it does it, or if you don't like it, it won't lower my opinion of you in any way, but its inclusion on this site is part of why I like to come here every now and then. I get to discuss subjects that relate to my work and hobbies, but I also get to look at weird alien code and think hard in unfamiliar terms. From what you are saying, I think you can relate.
Personally, I could glance over it and get the idea that it is a C compiler, but if you were to show me some code in written with the latest JS MVC or FRP framework, don't hold your breath for me to tell you what it does. I can't say that I fully understand this, and that's why I enjoy the rich discussion the submission spawned here.