
1298 points jgrahamc | 3 comments
jgrahamc ◴[] No.22883548[source]
I posted this hours ago and then stepped away. The story captures so much about the Lee I knew so well. I'll add one piece of praise for Lee's early architecture of Cloudflare.

Everything was controlled by a single Postgres database that made very heavy use of stored procedures, that called other procedures, that called others. It was one giant program inside the database. It took me a while to comprehend what he'd done but it was really great. The database ran everything and all those functions made sure that audit logs were kept, that the calls were allowed for the user ID being passed in, and some of these procedures made external calls to APIs including getting things like SSL certificates.

It was a magnificent monolith inside a database.

I worked on the periphery of the database (it was truly Lee's domain): he'd tell me what output to expect or what API to create, I'd code to his spec, and we'd just hook it up.

If any single artefact represents what he did at Cloudflare, it's that database. And he used to code it on a laptop we called "The Beast" because it was so crazily heavy and overloaded with memory etc., so that he could carry around a mini test Cloudflare wherever he went.

replies(2): >>22883751 #>>22883786 #
mynameishere ◴[] No.22883751[source]
Working with stored procedures is just the pits. I wonder why you would describe that as "great". Maybe it's optimized, but so would be a giant executable written in assembly.
replies(4): >>22883916 #>>22883978 #>>22884300 #>>22884664 #
crazygringo ◴[] No.22884664[source]
As long as you have the necessary tooling for source control, versioning, code search, etc. -- why is it the pits?

I mean, if you stick to a rigorous procedure of automatically exporting your stored procedures to files and committing them to git with messages... how is that different from any other development?

Serious question -- I've done tons of database work but only a little bit with stored procedures, so wondering if I'm missing a crucial detail here.

replies(4): >>22886392 #>>22886876 #>>22887472 #>>22892698 #
ggreer ◴[] No.22892698[source]
Let's say that when you deploy, you want to direct 1% of traffic to the new code, then 5%, then 25%, then 50%, and finally 100%. And if monitoring metrics go red at any point, automatically roll back. A typical canary deployment.

How do you do that with a sproc? With normal application code, you can have the load balancer send an arbitrary percentage of traffic to the new instances of the service. A sproc binds the code and the data together. But you only have one database, so there's no way to do a gradual deploy of the code. That would be like doing a gradual deploy of a schema migration. It's only possible if you have a lot of databases.

Also, unless you're using MS SQL Server, debugging sprocs is a pain. The only workaround I've seen is to add a debug or loglevel arg to the sproc. But actually pausing execution and inspecting data? Good luck.
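For context, that workaround usually looks something like this in Postgres (a minimal sketch; the function, table, and parameter names are invented for illustration):

```sql
-- Hypothetical sketch of the "debug arg" workaround: RAISE NOTICE emits
-- messages the client driver can capture, gated by an optional flag.
CREATE OR REPLACE FUNCTION charge_account(p_id bigint, p_amount numeric,
                                          p_debug boolean DEFAULT false)
RETURNS void AS $$
BEGIN
    IF p_debug THEN
        RAISE NOTICE 'charge_account: id=%, amount=%', p_id, p_amount;
    END IF;
    UPDATE accounts SET balance = balance - p_amount WHERE id = p_id;
END;
$$ LANGUAGE plpgsql;
```

You get print-style logging at best; there's no stepping through execution or inspecting intermediate state the way an application-level debugger allows.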

replies(2): >>22892758 #>>22899585 #
dragonwriter ◴[] No.22892758[source]
> Let's say that when you deploy, you want to direct 1% of traffic to the new code, then 5%, then 25%, then 50%, and finally 100%. And if monitoring metrics go red at any point, automatically roll back.

> How do you do that with a sproc?

Wrap the real sproc (well, the set of real sprocs, since old and new will sometimes coexist) with a wrapper sproc that probabilistically directs to the appropriate working sproc based on a table value (or anything else the sproc can read) that is updated to reflect the current stage in the conversion process, including any monitoring-based reversals.
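A minimal PL/pgSQL sketch of that idea (the `rollout` table and the `create_zone*` names are invented for illustration, not anything from Cloudflare's actual schema):

```sql
-- Hypothetical sketch: a wrapper that probabilistically dispatches between
-- the old and new implementations based on a rollout percentage in a table.
CREATE TABLE rollout (
    proc_name text PRIMARY KEY,
    new_pct   integer NOT NULL DEFAULT 0   -- 0..100, % of calls to v2
);

CREATE OR REPLACE FUNCTION create_zone(p_user_id bigint, p_zone text)
RETURNS bigint AS $$
DECLARE
    pct integer;
BEGIN
    SELECT new_pct INTO pct FROM rollout WHERE proc_name = 'create_zone';
    IF coalesce(pct, 0) > random() * 100 THEN
        RETURN create_zone_v2(p_user_id, p_zone);  -- canary implementation
    ELSE
        RETURN create_zone_v1(p_user_id, p_zone);  -- stable implementation
    END IF;
END;
$$ LANGUAGE plpgsql;

-- Advance the canary:  UPDATE rollout SET new_pct = 5  WHERE proc_name = 'create_zone';
-- Roll back:           UPDATE rollout SET new_pct = 0  WHERE proc_name = 'create_zone';
```

The monitoring job (or a human) just updates `new_pct` to advance or revert the rollout; callers never change.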

replies(1): >>22893286 #
ggreer ◴[] No.22893286[source]
How do you correlate those errors at the application level? If only 1% of sproc calls are the new sproc, any elevated error rates in the application will be hard to see over typical noise.
replies(1): >>22893681 #
dragonwriter ◴[] No.22893681[source]
> How do you correlate those errors at the application level? If only 1% of sproc calls are the new sproc, any elevated error rates in the application will be hard to see over typical noise.

By using the DBs error reporting features to report back errors to the application, including information about which ultimate implementation was handling the request.
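In Postgres terms, one way to do that (a sketch; names are invented) is to re-raise with the implementation tagged so the client's error object can be segmented by version:

```sql
-- Hypothetical sketch: tag errors with the implementation that raised them,
-- so application-side monitoring can break error rates down by version.
CREATE OR REPLACE FUNCTION create_zone(p_user_id bigint, p_zone text)
RETURNS bigint AS $$
BEGIN
    RETURN create_zone_v2(p_user_id, p_zone);
EXCEPTION WHEN others THEN
    RAISE EXCEPTION 'create_zone failed: %', SQLERRM
        USING HINT = 'impl=v2';  -- surfaced in the client's error fields
END;
$$ LANGUAGE plpgsql;
```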

replies(1): >>22893843 #
ggreer ◴[] No.22893843[source]
That only covers errors that occur inside the sproc, not errors that are caused by the sproc. For example: What if the new version of the sproc succeeds, but the data it returns causes the application code to fail in some way? The sproc metrics would report 100% success and the application error rate wouldn't get above noise until a significant fraction of traffic was using the new buggy version of the sproc.

It's possible to add more epicycles to try and correct for these deficiencies, but the truth is that sprocs are dangerously double-edged. Yes, your code runs next to your data, but to get parity with application code (canary deploys, debugging, logging, etc) you must reinvent lots of tooling.

replies(1): >>22906106 #
KajMagnus ◴[] No.22906106[source]
@ggreer, the great-grand...parent, wrote:

> there's no way to do a gradual deploy of the code

There could be version numbers?

    some_procedure_v1()
    some_procedure_v2()
and the new v2 application server code calls the v2 stored procedures. And if the v2 app server code gets rolled back / cancelled, then the v2 stored procedures automatically won't be used any more.
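Concretely, both versions just coexist in the database (a sketch; the `users` table and column names are invented):

```sql
-- Hypothetical sketch: old and new versions live side by side. Which one
-- runs is decided by the app release, so rolling back the app binary
-- automatically rolls back the sproc in use.
CREATE FUNCTION some_procedure_v1(p_id bigint) RETURNS text AS $$
    SELECT name FROM users WHERE id = p_id;
$$ LANGUAGE sql;

CREATE FUNCTION some_procedure_v2(p_id bigint) RETURNS text AS $$
    SELECT display_name FROM users WHERE id = p_id;  -- new behavior
$$ LANGUAGE sql;
```

The canary then rides on the application deploy: whatever fraction of app instances run v2 code generates exactly that fraction of v2 sproc calls, with errors attributable per instance.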
replies(2): >>22908152 #>>22949414 #
steve-chavez ◴[] No.22949414[source]
Version numbers work great as schemas. You could have v1.proc and v2.proc.

Also, for an API schema, only exposing views + stored procedures and hiding tables in private schemas makes the SQL easier to refactor. This is something we recommend at postgrest.org: http://postgrest.org/en/v7.0.0/schema_structure.html#schema-...
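A minimal sketch of that layout (table and column names invented; grants shown for a hypothetical `api_user` role):

```sql
-- Hypothetical sketch: tables live in a private schema; each API version
-- is its own schema of views/functions over them.
CREATE SCHEMA private;
CREATE TABLE private.users (id bigint PRIMARY KEY, name text, email text);

CREATE SCHEMA v1;
CREATE VIEW v1.users AS SELECT id, name FROM private.users;

CREATE SCHEMA v2;
CREATE VIEW v2.users AS SELECT id, name, email FROM private.users;

-- The API role sees only the versioned schemas, never the raw tables:
CREATE ROLE api_user NOLOGIN;
GRANT USAGE  ON SCHEMA v1, v2        TO api_user;
GRANT SELECT ON v1.users, v2.users   TO api_user;
```

Refactoring `private.users` then only requires updating the views, so both API versions keep working during a migration.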

Sharing in case this is useful to you.

replies(1): >>22991553 #
KajMagnus ◴[] No.22991553[source]
Interesting idea — I had never thought about that (I mean, looking at tables as internals, and procedures + views as the public interface). I'll think about that the next time I'm doing a data migration.

Thanks for sharing. PostgREST looks like it's intended for people building a web app who don't want to write application server code; instead, they just install a database?

Also, I imagine PostgREST is nice for internal admin apps? Then one "just" needs to write JavaScript and connect to PostgREST via REST?

Actually, it'd be nice if the https://postgrest.org homepage explained some different use cases :-) I had to think for a little while, and I'm still not sure I understand all the cases where PostgREST can be useful.

Edit: Now I see there are explanations here: https://postgrest.org/en/v7.0.0/ ("Motivation: Using PostgREST is an alternative to manual CRUD programming" etc.) if I just scroll down a bit. I didn't do that until now; I probably stopped reading at the "Sponsors" text & images previously, or thought it was a documentation page.

This is quite cool: "Creating new views happens in SQL with known performance implications. A database administrator can now create an API from scratch with no custom programming".

It looks a bit like GraphQL also: https://postgrest.org/en/v7.0.0/api.html#resource-embedding
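The GraphQL-ish part is that related rows can be embedded in one request (a sketch against a hypothetical running instance; `films` and `directors` are example tables with a foreign key between them, as in the linked docs):

```shell
# Hypothetical request: embed each film's director as nested JSON
# (PostgREST resource-embedding syntax, v7 docs).
curl "http://localhost:3000/films?select=title,directors(last_name)"
```

PostgREST follows the foreign key to build the join, so the client picks the shape of the response much like a GraphQL query does.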

Nice that it's written in Haskell; it was my favorite language long ago :-)

(PS. http://postgrest.org redirects to http://postgrest.org/en/v6.0/, which is not the latest version (& not https); maybe you'd like to redirect to https://postgrest.org/en/v7.0.0/ instead)

replies(1): >>22999110 #
steve-chavez ◴[] No.22999110[source]
> PostgREST looks like it's intended for people building a web app who don't want to write application server code; instead, they just install a database?

Besides usual webapps, I've also used it for building an IoT project (sensor logic in the db) and a GIS project (postgis/pgrouting). Some users have also used it for open data projects (an OpenAPI description is provided).

It's also great for internal admin apps and there are ready-made libs like https://github.com/Wellers0n/react-admin-postgrest-client.

Those are some of the use cases off the top of my head. In general, I think it's good for cases where you'd like to use the database as a "processing engine" instead of a "dumb storage".

The docs definitely need a lot of improvement. Thanks a lot for your feedback!