We Designed TigerBeetle's Docs from Scratch

1. mtlynch ◴[10 Apr 25 17:05 UTC] No.43645950[source]▶

I think this is a fun thing for TigerBeetle to do, but I'm pretty skeptical that this was a good engineering decision.

And it's fine to make non-optimal engineering decisions for fun, but the top reason in the article should be, "Because we thought it would be fun to code a docs site from scratch."

This post reminds me a lot of an article I read on HN about a year ago and can't find now, but the author was explaining how so many organizations end up investing humongous amounts of effort rolling their own databases from scratch because none of the off-the-shelf solutions meet all their requirements. But in most of these cases, it's because some of the "requirements" were actually "nice-to-haves" and they could have gotten by fine with an off-the-shelf database, but they talked themselves into building one from scratch.

A lot of the justifications here feel pretty weak:

- Didn't want to use a complicated React app? Use Hugo or Pelican or Eleventy.

- Wanted nice reading experience? Replace the default CSS in any SSG.

- Want a nice search experience? Theirs looks good, but is probably also achievable in off-the-shelf SSGs and is probably not worth rolling their own docs site from scratch.

>We employed a Content Security Policy to prevent Cross Site Scripting (XSS) as defense-in-depth, in case a seemingly friendly PR contains some innocent looking MathML. This MathML could contain obfuscated code that would run in the user’s browser. CSP prevents any unwanted inline scripts from running and keeps our users safer.

This was the silliest reason of all. Who's XSS'ing a docs site through obfuscated markup contributions? That sounds pretty difficult to achieve in the first place, and then what's the reward for achieving XSS on TigerBeetle's docs site? There's no valuable data there. At worst, you'd mine tiny amounts of crypto in a service worker. But also, you can mitigate this risk in lots of ways that don't involve rolling your own docs site.

Edit: They don't seem to actually be using CSP at all: https://gist.github.com/mtlynch/92c991cfe48652feee491d4f4652...

Edit2: Nevermind, they actually do but in HTML. Hat tip to pfg_.

replies(7): >>43646093 #>>43646192 #>>43646566 #>>43646625 #>>43647427 #>>43649264 #>>43650682 #

2. pfg_ ◴[10 Apr 25 17:20 UTC] No.43646093[source]▶

>>43645950 (TP) #

Content security policies can also be set in a meta tag in HTML.
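
For readers who have not seen the meta-tag form, here is a minimal sketch in Python of a static page that carries its own policy. The policy string and the helper names `csp_meta_tag` and `page` are assumptions made up for illustration; this is not TigerBeetle's actual policy or build code.

```python
from html import escape

# Hypothetical policy for illustration only; not TigerBeetle's actual policy.
POLICY = "default-src 'self'; script-src 'self'"


def csp_meta_tag(policy: str) -> str:
    """Return a <meta> element that delivers a CSP from inside the HTML itself."""
    content = escape(policy, quote=True)  # keep quotes in the policy attribute-safe
    return f'<meta http-equiv="Content-Security-Policy" content="{content}">'


def page(title: str, body_html: str, policy: str = POLICY) -> str:
    """Build a minimal static page whose <head> starts with the policy."""
    return (
        "<!DOCTYPE html>\n"
        f"<html><head>{csp_meta_tag(policy)}<title>{escape(title)}</title></head>\n"
        f"<body>{body_html}</body></html>\n"
    )


if __name__ == "__main__":
    print(page("Docs", "<p>Hello</p>"))
```

Because the policy lives in the document rather than in an HTTP response header, a check that only inspects headers can miss it. A policy delivered this way also has limits: `frame-ancestors`, `report-uri`, and `sandbox` are not supported in a meta element, and the tag should come early in `<head>` since it only governs content that follows it.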

replies(1): >>43646318 #

3. data_ders ◴[10 Apr 25 17:31 UTC] No.43646192[source]▶

>>43645950 (TP) #

> how so many organizations end up investing humongous amounts of effort rolling their own databases from scratch because none of the off-the-shelf solutions meet all their requirements. But in most of these cases, it's because some of the "requirements" were actually "nice-to-haves" and they could have gotten by fine with an off-the-shelf database, but they talked themselves into building one from scratch.

I love the term "arbitrary uniqueness" for this too. Like how different are your needs, really?

4. mtlynch ◴[10 Apr 25 17:42 UTC] No.43646318[source]▶

>>43646093 #

Ah, you're right. They are setting it in HTML. Updated!

5. jorangreef ◴[10 Apr 25 18:07 UTC] No.43646566[source]▶

>>43645950 (TP) #

Joran from TigerBeetle here!

We didn't design our docs because it was "a fun thing" (as suggested) but rather because we simply care deeply about the experience of developers reading our docs. For example, concerning performance and offline use, which were further reasons we gave in the post.

We have a high bar for taking on dependencies. We don't take on dependencies automatically without justification. It's just not a philosophy that we share, to assume or to insist that everything needs to be a dependency.

(The discussion on CSP in our post was also not given as motivation, but as an example of the thought process that went into this. Again, as per the post, it's a matter of defense-in-depth. We have plans for how our docs will be used in future, that you may not be aware of, and security is important.)

Finally, we're happy with the result, the project was small and didn't take long. We're used to "painting" things like this fairly quickly. It's just easier for us than trying to "sculpt" off the shelf dependencies. That's not to suggest that everyone needs to paint like we do at TigerBeetle, but it's equally true that not everyone needs to sculpt either. [1]

[1] To understand our engineering methodology, and why we prefer to paint than sculpt, see TigerStyle: https://www.youtube.com/watch?v=w3WYdYyjek4

replies(1): >>43647054 #

6. jessekv ◴[10 Apr 25 18:14 UTC] No.43646625[source]▶

>>43645950 (TP) #

It's mostly just pandoc though?

And they chose it to avoid any non-standard markdown.

replies(1): >>43646641 #

7. jorangreef ◴[10 Apr 25 18:16 UTC] No.43646641[source]▶

>>43646625 #

Indeed! :)

8. mtlynch ◴[10 Apr 25 19:00 UTC] No.43647054[source]▶

>>43646566 #

Hi Joran, thanks for your response!

For context, I like TigerBeetle, and I respect the team. I'm not trying to take cheap shots but rather to disagree respectfully.

>We didn't design our docs because it was "a fun thing" (as suggested) but rather because we simply care deeply about the experience of developers reading our docs. For example, concerning performance and offline use, which were further reasons we gave in the post.

To me, this still sounds like "for fun."

The blog post just talks about performance and offline use, but "maximize performance" isn't a real goal. You can invest infinite hours improving performance, so it comes down to how many engineering hours you're willing to trade in exchange for improving some set of performance metrics.

Maybe the issue is that the blog post doesn't present the decision making process well? Because the critical questions I don't see addressed are:

- What were the performance metrics that were critical to achieve?

- What alternative solutions were considered beyond Docusaurus?

- How do the alternatives perform on the critical metrics?

- How does the home-rolled solution perform on TigerBeetle's critical metrics?

In the absence of those considerations, it feels like the dominant factor is that it's more pleasant to work with greenfield, home-baked code than off-the-shelf code, even if the existing code could achieve the same thing in fewer engineering hours.

replies(1): >>43647135 #

9. jorangreef ◴[10 Apr 25 19:11 UTC] No.43647135{3}[source]▶

>>43647054 #

To be clear, we have fun at TigerBeetle!

And to be fair, we did present the metrics (footprint etc.), and we did discuss alternatives to Docusaurus (e.g. Zine, which is pretty great!).

I think at the heart of your argument is this assumption that unquestionably taking on dependencies would achieve the same quality in less time, and that a methodology such as TigerStyle that challenges this assumption need necessarily take "infinite time". You almost force us to apologize that we don't share this view! :)

But again, this was the quickest, highest quality path (for us, at least!).

Have you read TigerStyle, our engineering methodology? And have you watched our talk? Perhaps that will help close the gap in understanding how we think about engineering at TigerBeetle: not as an expense to be minimized, to minimize only our own development time, but as an asset, to be invested in, since we build it once, but developers enjoy it many times over. However, as you watch TigerStyle, you'll see it's not only about quality, but also a way to get quality in less time (go slow to go fast).

In other words, I think we differ when it comes to Total Cost of Ownership. We're not trying to minimize only our own development time, but investing in it, to produce something quality for our community, and so minimize the Total Cost of Ownership across the relationship as a whole (ours + community) [1].

[1] Our talk on our business methodology, Biodigital Jazz! goes into this idea of TCO across the community: https://www.youtube.com/watch?v=C98cyJ-wJuY

replies(1): >>43647613 #

10. dustbunny ◴[10 Apr 25 19:47 UTC] No.43647427[source]▶

>>43645950 (TP) #

To evaluate this as you are describing, you must reveal your estimate of the workload of what Tiger Beetle has done to roll their own docs. If it took them 5 minutes, for instance, the calculus is far different than if it took 5 years. Plus you must compare that time estimate to their other priorities to estimate the opportunity cost, something that you simply cannot do accurately from the outside looking in.

And we must estimate the potential future value of what Tiger Beetle has done here. I value "no dependencies" pretty deeply and I can see how Tiger Beetle values it supremely. I don't see how you can hand-wave it away so easily.

To assert that you don't believe Tiger Beetle at their word here is deeply disrespectful imo.

replies(1): >>43647681 #

11. mtlynch ◴[10 Apr 25 20:11 UTC] No.43647613{4}[source]▶

>>43647135 #

Thanks for the reply!

I haven't read TigerStyle yet, but I'll check it out. Is this the canonical URL?

https://github.com/tigerbeetle/tigerbeetle/blob/main/docs/TI...

12. mtlynch ◴[10 Apr 25 20:19 UTC] No.43647681[source]▶

>>43647427 #

>To evaluate this as you are describing, you must reveal your estimate of the workload of what Tiger Beetle has done to roll their own docs. If it took them 5 minutes, for instance, the calculus is far different than if it took 5 years. Plus you must compare that time estimate to their other priorities to estimate the opportunity cost, something that you simply cannot do accurately from the outside looking in.

I don't need them to reveal their numbers to me to offer my critique, as I think few people would argue that the upfront cost of rolling your own docs site could possibly be lower than the cost of deploying an off-the-shelf solution like Hugo.

I think where reasonable people might disagree is about the total cost of ownership of Hugo vs. the home-rolled solution over five years, but I'd find it surprising if the home-rolled solution wins.

>To assert that you don't believe Tiger Beetle at their word here is deeply disrespectful imo.

Where did I say that I doubt TigerBeetle's claims? I disagree with the justifications in the blog post, but it's a difference of opinion, not a question of facts.

They published this blog post, and this is HN, so I think it's well within the community standards to offer a respectful critique.

replies(1): >>43648166 #

13. bsder ◴[10 Apr 25 21:27 UTC] No.43648166{3}[source]▶

>>43647681 #

> I think few people would argue that the upfront cost of rolling your own docs site could possibly be lower than the cost of deploying an off-the-shelf solution like Hugo.

I'm not convinced. At some point, you will have to debug something weird in your docs system.

If you deploy Hugo, that means understanding Go. Docusaurus--Javascript, Node, and that entire ecosystem. With this, it's Zig all the way down.

Zig users tend to be (possibly notoriously) anti-dependency.

replies(2): >>43649013 #>>43649182 #

14. internetter ◴[10 Apr 25 23:47 UTC] No.43649013{4}[source]▶

>>43648166 #

> Zig users tend to be (possibly notoriously) anti-dependency.

I don't get anti-dependency, to be honest. Like say you want REGEX support in your database. You code a REGEX parser from scratch? What are the odds your parser doesn't have a vulnerability?

I think overzealous dependency usage is also bad, but it cuts both ways.

replies(1): >>43649067 #

15. popularonion ◴[10 Apr 25 23:55 UTC] No.43649067{5}[source]▶

>>43649013 #

The post says they use pandoc for parsing. That’s a very good trade for being able to cut Node completely out of your project.

16. mtlynch ◴[11 Apr 25 00:14 UTC] No.43649182{4}[source]▶

>>43648166 #

For a docs site with no special requirements, I'd be surprised if Hugo or another SSG can't do what they need out of the box. So, it's the cost of implementing your own SSG vs. the cost of figuring out how to use an existing one.

Also, just as a datapoint, I've been using Hugo on multiple sites for about five years, and I don't recall ever having to drop into Go to fix an issue. Hugo might be unique in this regard, as it ships as a single-file binary. You have to learn Go templates, but you don't have to learn anything about Go the language or standard library.

Before Go, I used Jekyll, and I don't recall ever having to learn Ruby, but I did have to work within the Ruby ecosystem because Jekyll required a Ruby environment.

Incidentally, TigerBeetle seems to have rolled their own rudimentary templating language, too.[0] I think that has potential to either limit the functionality they need or cause a lot of bugs.

[0] https://github.com/tigerbeetle/tigerbeetle/blob/0.16.29/src/...
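
To make the concern concrete, here is a minimal sketch of what a rudimentary placeholder-based templating step can look like. The `{{ name }}` syntax and the `render` function are hypothetical and are not taken from TigerBeetle's source; the sketch only shows where the usual bugs come from (unescaped values, silently ignored placeholders).

```python
import re
from html import escape

# Hypothetical placeholder syntax: {{ name }}
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template: str, values: dict[str, str]) -> str:
    """Replace each placeholder with its HTML-escaped value.

    Unknown placeholders raise instead of being silently left in the page.
    """

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in values:
            raise KeyError(f"template references unknown value: {name}")
        return escape(values[name])

    return PLACEHOLDER.sub(substitute, template)


if __name__ == "__main__":
    print(render("<title>{{ title }}</title>", {"title": "Docs & Guides"}))
```

Even at this size there are design decisions to get right: values are escaped so that text cannot inject markup, and substituted text is not rescanned for further placeholders. Loops, conditionals, and includes are where a home-grown template language tends to either stay limited or grow bugs.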

17. ksec ◴[11 Apr 25 00:32 UTC] No.43649264[source]▶

>>43645950 (TP) #

>To be honest, the hard part of static site generation is parsing the Markdown, since Markdown is a complex language. Everything around it is simple scripting, which we can easily do ourselves.

I would think 95%+ of the work would be in pandoc if everything was from scratch. And they would have used Zine if it had supported the feature they want.
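
As a rough illustration of how thin the "simple scripting" layer around the parser can be, here is a sketch in Python that shells out to the `pandoc` CLI and wraps the result in a page. The function names, the page skeleton, and the choice of `commonmark` as the input format are assumptions for illustration; TigerBeetle's real build is written in Zig and is not shown here.

```python
import subprocess
from html import escape
from pathlib import Path


def pandoc_command(source: Path) -> list[str]:
    """Build the pandoc invocation: CommonMark in, HTML fragment on stdout."""
    return ["pandoc", "--from", "commonmark", "--to", "html5", str(source)]


def wrap_page(title: str, body_html: str) -> str:
    """Wrap an HTML fragment in a minimal page skeleton."""
    return (
        "<!DOCTYPE html>\n"
        f"<html><head><meta charset=\"utf-8\"><title>{escape(title)}</title></head>\n"
        f"<body>\n{body_html}</body></html>\n"
    )


def build(source: Path, out_dir: Path) -> Path:
    """Convert one Markdown file with pandoc and write the finished page."""
    result = subprocess.run(
        pandoc_command(source), capture_output=True, text=True, check=True
    )
    out_dir.mkdir(parents=True, exist_ok=True)
    target = out_dir / (source.stem + ".html")
    target.write_text(wrap_page(source.stem, result.stdout), encoding="utf-8")
    return target


if __name__ == "__main__":
    # Requires pandoc on PATH; "docs" and "site" are placeholder directories.
    for markdown_file in sorted(Path("docs").glob("*.md")):
        print(build(markdown_file, Path("site")))
```

Everything hard (parsing, edge cases in the Markdown grammar) stays inside pandoc; the surrounding script is file handling and string assembly. Navigation, search, and styling would add to this, but they are ordinary scripting rather than parsing work.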

For larger projects, mostly DB selection, I completely agree with what you said. But for SSR, especially when there are other similar OSS like Zine available, I think they are fine.

Although I do wish Zine had all the improvements TigerBeetle wanted, so at least the Zig community could all use one rather than roll their own.

18. code_biologist ◴[11 Apr 25 05:10 UTC] No.43650682[source]▶

>>43645950 (TP) #

>I think this is a fun thing for TigerBeetle to do, but I'm pretty skeptical that this was a good engineering decision.

Ha, yeah, as a software engineering manager of 8 years I'll agree that "fun" is not a good initial look for a new project, sadly — the best engineering decisions are boring far more often than not.

After years of insisting on picking boring options, I realized working like that was a buzzkill long term for my reports, so I tried to relax and figure out how to have fun projects too. Give people with ideas space to run. My deal now is, the tighter the blast radius of the project you can give me, the more I'm ok with you going nuts.

Documentation is a great place for fun, low-blast-radius projects, so I totally get TB on this one!

Some other rules I give up front for project proposals. Hopefully the theme of blast radius control is charmingly obvious:

- No new languages. (I have had professional arguments over this)

- No fun projects that require ongoing labor/upkeep.

- No fun projects in stateful storage infrastructure. (I have had distressingly passionate professional arguments over this)

- No fun projects that involve new SaaS / hosting providers that can't be trivially cut loose or cost > $50-100/mo.

- Fun projects in generally persistent infrastructure need solid justification.

- Fun design system / UI infrastructure projects must be able to be gracefully incrementally adopted, or scoped tightly.