
1298 points jgrahamc | 27 comments
jgrahamc ◴[] No.22883548[source]
I posted this hours ago and then stepped away. The story captures so much about the Lee I knew so well. I'll add one piece of praise for Lee's early architecture of Cloudflare.

Everything was controlled by a single Postgres database that made very heavy use of stored procedures, that called other procedures, that called others. It was one giant program inside the database. It took me a while to comprehend what he'd done but it was really great. The database ran everything and all those functions made sure that audit logs were kept, that the calls were allowed for the user ID being passed in, and some of these procedures made external calls to APIs including getting things like SSL certificates.
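(Not Lee's actual code, of course, but a generic PL/pgSQL sketch of the pattern being described, with every name invented: the procedure itself checks authorization and writes the audit log before touching the data.)

    CREATE OR REPLACE FUNCTION update_record(p_user_id bigint,
                                             p_record_id bigint,
                                             p_content   text)
    RETURNS void LANGUAGE plpgsql AS $$
    BEGIN
      -- authorization: is the caller allowed to touch this record?
      IF NOT EXISTS (SELECT 1 FROM records
                     WHERE id = p_record_id AND owner_id = p_user_id) THEN
        RAISE EXCEPTION 'user % may not modify record %', p_user_id, p_record_id;
      END IF;

      UPDATE records SET content = p_content WHERE id = p_record_id;

      -- the audit trail lives right next to the data it describes
      INSERT INTO audit_log (user_id, action, target_id, occurred_at)
      VALUES (p_user_id, 'update_record', p_record_id, now());
    END;
    $$;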

It was a magnificent monolith inside a database.

I worked on the periphery of the database (it was truly Lee's domain) and he'd tell me what output to expect or API to create and I'd code to his spec. and we'd just hook it up.

If any single artefact represents what he did at Cloudflare, it's that database. And he used to code it on a laptop we called "The Beast" because it was so crazily heavy and overloaded with memory etc. that he'd carry around a mini test Cloudflare wherever he went.

replies(2): >>22883751 #>>22883786 #
1. mynameishere ◴[] No.22883751[source]
Working with stored procedures is just the pits. I wonder why you would describe that as "great". Maybe it's optimized, but so would be a giant executable written in assembly.
replies(4): >>22883916 #>>22883978 #>>22884300 #>>22884664 #
2. icedchai ◴[] No.22883916[source]
The last time I worked with stored procs was with an Oracle 8 database, over 20 years ago. PL/SQL was awful.
replies(1): >>22884303 #
3. asguy ◴[] No.22883978[source]
It is great: your code runs right near the data it’s working with.

If it’s the pits, you need a better suited development environment. I use emacs and interactively code inside the DB. Python, Lua, JavaScript, PGSQL, whatever.

replies(1): >>22884520 #
4. xupybd ◴[] No.22884300[source]
I've found the opposite. It can take a long time to get your head into that space but it produces much more maintainable and robust systems in my experience.
replies(1): >>22885053 #
5. 7thaccount ◴[] No.22884303[source]
I work with one extremely large application written in Oracle stored procedures (PL/SQL). It is a pain, but pretty efficient.
6. tomrod ◴[] No.22884520[source]
I'm a journeyman in this space, and don't know emacs well. Can you help me unpack your comment here? You write python in the database directly?
replies(1): >>22884744 #
7. crazygringo ◴[] No.22884664[source]
As long as you have the necessary tooling for source control, versioning, code search, etc. -- why is it the pits?

I mean, if you stick to a rigorous procedure of automatically exporting your stored procedures to files and committing them to git with messages... how is that different from any other development?

Serious question -- I've done tons of database work but only a little bit with stored procedures, so wondering if I'm missing a crucial detail here.

replies(4): >>22886392 #>>22886876 #>>22887472 #>>22892698 #
8. cat199 ◴[] No.22884744{3}[source]
not OP but likely:

1) coding via repl - e.g. one emacs buffer has stored procedure code, send functions over via keyboard shortcut to SQL repl buffer connected to DB to evaluate it 'live' (jupyter notebooks are basically a copy of this mode of operation in a more consumer-friendly package)

2) w/r/t python in db: OP mentions pgsql, and pgsql has support for in-db extensibility using a variety of languages, of which Python is one (https://www.postgresql.org/docs/current/plpython.html); they're probably referring to this
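For a concrete (hypothetical) sketch of 2), something like this can sit in an emacs buffer and be sent straight to the connected SQL repl; it assumes the plpython3u extension is available, and the function name is made up:

    CREATE EXTENSION IF NOT EXISTS plpython3u;

    CREATE OR REPLACE FUNCTION normalize_email(addr text)
    RETURNS text LANGUAGE plpython3u AS $$
        # plain Python, executing inside the database server process
        return addr.strip().lower() if addr else None
    $$;

    SELECT normalize_email('  Alice@Example.COM ');  -- => alice@example.com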

replies(1): >>22884770 #
9. tomrod ◴[] No.22884770{4}[source]
Thanks!
10. pingec ◴[] No.22885053[source]
I have a similar experience. I've worked with a product that consists of tens of thousands of stored procedures implementing all business logic. Because you are right next to the data, you can get away with doing many things you otherwise could not.

It also acted as a barrier: the core team, who understood the business well, worked on the stored procedures, keeping things fast and stable.

Another team built the layer that interfaced with the database, and the UX/UI was implemented in various technologies in parallel, changed/recycled every few years to whatever was in fashion.

The difference in culture between the two teams was huge, it seemed to work out well in their case.

replies(1): >>22885254 #
11. xupybd ◴[] No.22885254{3}[source]
It also saves a lot of extra work. Often with more conventional systems you have to pull the data out. Move it into a structure native to the language doing the processing.

In the database you don't need to do any of this. The data is there. The type information is there.

12. throw149102 ◴[] No.22886392[source]
Not OP, but I want to say there's a lot more that goes into the development process than just source control, versioning, and code search. There's linters, and dozens of different kinds of tests, and ways of automatically running those tests, and layers of procedure before code changes go public. This seems much harder to do for stored procedures. Especially considering that stored procedure languages are fragmented between different databases; you don't just pay the cost of having to change the stored procedures when changing databases, you are also paying for the fact that it is harder to build automated tools to analyze the stored procedures, because the API for each database is going to be different.

That being said, there are lots of factors that influence how maintainable a specific stored procedure is. Team size, business needs, company culture, etc. Stored procedures get a bad rep from the places where they are train wrecks. YMMV.

13. Intermernet ◴[] No.22886876[source]
I'd highly recommend playing around with stored procedures. They, at a minimum, can provide data integrity guarantees that you can't get with anything else. As an example, for ages the only way to do a reliable (pseudo atomic) upsert call on most databases was through a stored procedure. Once I discovered stored procedures, I started using them everywhere. After a while I realised that you need to put your logic in the appropriate place for all of the usual reasons (performance, maintainability, security etc) and resisted the temptation to dump all business logic into stored procedures, but I still use them a lot. They're like the bridge between the dba world and the traditional developer world.
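For example, a rough PL/pgSQL sketch of that classic pre-"INSERT ... ON CONFLICT" upsert, adapted from the pattern shown in the PostgreSQL docs (the table and column names are placeholders):

    CREATE OR REPLACE FUNCTION upsert_counter(k text, delta int)
    RETURNS void LANGUAGE plpgsql AS $$
    BEGIN
      LOOP
        -- first try to update an existing row
        UPDATE counters SET value = value + delta WHERE key = k;
        IF found THEN RETURN; END IF;
        -- not there, so try to insert it; a concurrent insert can still win
        BEGIN
          INSERT INTO counters (key, value) VALUES (k, delta);
          RETURN;
        EXCEPTION WHEN unique_violation THEN
          -- someone else inserted the key first; loop and try the UPDATE again
        END;
      END LOOP;
    END;
    $$;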

"An elegant weapon, for a more civilised age..."

14. danielbarla ◴[] No.22887472[source]
It's a completely valid question, so I'll try to give a serious answer, from my perspective. Firstly, I'd like to say that SPs can be quite awesome, for several of the reasons already stated. It is normally one of the few ways one can truly enforce integrity, access controls, and logging at the DB level. (Also, most of my experience is from an MS SQL and C# background, so some of this may not translate over to PostgreSQL quite so well)

1. Verbosity and fitness for purpose: while there is a kind of beauty in pure set-based SQL querying, stored procedures by their nature tend to be a procedural paradigm, and in my opinion, they are not particularly well suited to it. They are insanely verbose by today's standards, and doing anything typically requires several times more code than the equivalent in most modern languages (your mileage may vary with different database engines!). This results in large, monolithic SPs, which are difficult to read, understand, and test. The "standard library" is generally far less extensive than mainstream languages, leading to sub-standard, verbose, slow implementations (string concatenations and manipulations are a good example in MS SQL).

2. Code navigation, dependency management: the inverse of point 1. In general, the tooling around databases for code searching, navigation, etc. is quite woeful compared to non-database languages. In MS SQL, dependency tracking is so unreliable that the official tooling misses several classes of dependencies, depending on the type of syntax you happened to use. In your question, you start by saying that if you have the necessary tooling in place, it's all good. In my experience, rarely are even "barely good enough" tools in place, let alone anything close to what exists in other environments.

3. Unit testing: since databases tend to be stateful, the very concept of unit testing is a bit of an alien one. People don't tend to write SPs with unit tests in mind, and at least in the MS SQL world, the few "unit testing" frameworks that did exist tended to be incredibly painful to use - to the point that they were generally not worth it. Brittle, expensive integration tests are not the same.

4. Execution plan caching: the way databases handle caching of SP execution plans leaves much to be desired - generally the first set of parameters that an SP gets called with ends up determining the execution plan for all future calls. Even in well-managed code bases, this ends up being an issue, occasionally needing a DBA to manually go in, identify a misbehaving execution plan, and evict it. Vary-by-param and no-cache options do not fully solve this. While the need for such mechanisms is obvious, it is surprising that these have been a pain for 20+ years. Now, execution plan caching also happens for normal SQL, so my point is rather that the culture around SPs tends to have an unreasonable fear of dynamic SQL, which would be far more appropriate _where the query actually is of a dynamic nature!_

I can imagine people replying "but clearly, you're doing things that you shouldn't be - that's not what SPs are for!". And I agree, in a large part that is my point too. But then, what exactly should SPs be used for?

replies(1): >>22888975 #
15. crazygringo ◴[] No.22888975{3}[source]
Thank you so much for writing that up!
16. ggreer ◴[] No.22892698[source]
Let's say that when you deploy, you want to direct 1% of traffic to the new code, then 5%, then 25%, then 50%, and finally 100%. And if monitoring metrics go red at any point, automatically roll back. A typical canary deployment.

How do you do that with a sproc? With normal application code, you can have the load balancer send an arbitrary percentage of traffic to the new instances of the service. A sproc binds the code and the data together. But you only have one database, so there's no way to do a gradual deploy of the code. That would be like doing a gradual deploy of a schema migration. It's only possible if you have a lot of databases.

Also, unless you're using MS SQL Server, debugging sprocs is a pain. The only workaround I've seen is to add a debug or loglevel arg to the sproc. But actually pausing execution and inspecting data? Good luck.

replies(2): >>22892758 #>>22899585 #
17. dragonwriter ◴[] No.22892758{3}[source]
> Let's say that when you deploy, you want to direct 1% of traffic to the new code, then 5%, then 25%, then 50%, and finally 100%. And if monitoring metrics go red at any point, automatically roll back.

> How do you do that with a sproc?

Wrap the real sproc (well, the set of real sprocs, since old and new will sometimes coexist) with a wrapper sproc that probabilistically directs to the appropriate working sproc based on a table value (or anything else the sproc can read) that is updated to reflect the current stage in the conversion process, including any monitoring-based reversals.
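A minimal PL/pgSQL sketch of that wrapper idea; every name here is hypothetical, and it assumes create_order_v1/create_order_v2 already exist:

    -- rollout table holds the canary percentage per procedure
    CREATE TABLE rollout (proc_name text PRIMARY KEY, new_pct int NOT NULL DEFAULT 0);

    CREATE OR REPLACE FUNCTION create_order(p_user_id bigint, p_total numeric)
    RETURNS bigint LANGUAGE plpgsql AS $$
    DECLARE
      pct int;
    BEGIN
      SELECT new_pct INTO pct FROM rollout WHERE proc_name = 'create_order';
      IF random() * 100 < coalesce(pct, 0) THEN
        RETURN create_order_v2(p_user_id, p_total);   -- canary implementation
      ELSE
        RETURN create_order_v1(p_user_id, p_total);   -- current implementation
      END IF;
    END;
    $$;

    -- widen the canary (or roll it back) by updating a single row
    UPDATE rollout SET new_pct = 5 WHERE proc_name = 'create_order';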

replies(1): >>22893286 #
18. ggreer ◴[] No.22893286{4}[source]
How do you correlate those errors at the application level? If only 1% of sproc calls are the new sproc, any elevated error rates in the application will be hard to see over typical noise.
replies(1): >>22893681 #
19. dragonwriter ◴[] No.22893681{5}[source]
> How do you correlate those errors at the application level? If only 1% of sproc calls are the new sproc, any elevated error rates in the application will be hard to see over typical noise.

By using the DB's error-reporting features to report errors back to the application, including information about which underlying implementation was handling the request.

replies(1): >>22893843 #
20. ggreer ◴[] No.22893843{6}[source]
That only covers errors that occur inside the sproc, not errors that are caused by the sproc. For example: What if the new version of the sproc succeeds, but the data it returns causes the application code to fail in some way? The sproc metrics would report 100% success and the application error rate wouldn't get above noise until a significant fraction of traffic was using the new buggy version of the sproc.

It's possible to add more epicycles to try and correct for these deficiencies, but the truth is that sprocs are dangerously double-edged. Yes, your code runs next to your data, but to get parity with application code (canary deploys, debugging, logging, etc) you must reinvent lots of tooling.

replies(1): >>22906106 #
21. Benjammer ◴[] No.22899585{3}[source]
>Let's say... you want... 1%... then 5%, then 25%

Honestly though, it feels like the argument you are making here is that it's impossible in general to incrementally roll out database changes? Sure, it's definitely not as simple or easy as spinning up new cloud application instances and diverting http requests, but it's not impossible or infeasible by any means. The problem with databases is that the best solutions are always so subjective and context specific. It's very hard to write out simple, easy-to-understand, "rules," or "concepts," or whatever. Every answer, to every hard problem, is "it depends."

What are we rolling out here? Does the sproc write data or read data? Both? From the same table/schema/server/etc? Does it move data around? How much data? Do you already have read-replicas? How is data replicated to those? Do you have any parallel-write setups with eventual consistency? What parts of the higher level application/platform utilize this part of the database that we want to change? Can we spin up a new DB instance, with the new sproc, and establish data replication/change resolution with the legacy system as we are rolling out the change? Should we move everything to the new system and set up the compat piece as a _forward_ compat on the legacy side instead? etc, etc, etc.

>But actually pausing execution and inspecting data? Good luck.

There are plenty of ways to do this. Copy-paste the code of the proc into your DB IDE of choice, and execute it line by line. If the proc is using temp tables or @variables or anything else, you should easily be able to instantiate those locally too and see what the values look like. Use a transaction and roll back, and you can run this stuff over and over on prod if you really want to be sure of what it does. As long as you aren't locking stuff in the DB improperly.
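To make that concrete, a hedged sketch of the transaction-and-rollback workflow (the table names are made up):

    BEGIN;

    -- paste the procedure's statements here and run them one at a time,
    -- inspecting intermediate state as you go
    CREATE TEMP TABLE staging AS
      SELECT * FROM orders WHERE status = 'pending';

    SELECT count(*) FROM staging;   -- check the values look right

    ROLLBACK;   -- nothing above is persisted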

There's also the idea of writing sprocs in a more testable/verifiable way. Maybe you have progress tracking tables with status id columns and timestamps. Maybe you split up a long sproc into phases which each write their outputs into "intermediate tables" in some sort of "processing" schema. Maybe you write things to files somewhere to be consumed by some federated monitoring system. Idk, I could think of random ideas all day here. A lot of them probably sound dumb at face value, but I really don't think any of these goals sound infeasible.

22. KajMagnus ◴[] No.22906106{7}[source]
@ggreer, the grand-grand...parent, wrote:

> there's no way to do a gradual deploy of the code

There could be version numbers?

    some_procedure_v1()
    some_procedure_v2()
and the new v2 application server code calls the v2 stored procedures. And if the v2 app server code gets rolled back / cancelled, then the v2 stored procedures automatically won't be used any more.
replies(2): >>22908152 #>>22949414 #
23. cpx86 ◴[] No.22908152{8}[source]
Exactly this. Treat the DB schema as you would any typical API schema. A lot of the techniques used for evolving application APIs can be used for sprocs and views as well, e.g. versioning for breaking changes, adding optional parameters or new result fields for non-breaking changes. Fundamentally I don't think there's much difference between say, a DB schema defined with sprocs/view or an HTTP schema defined with OpenAPI. Both describe an API contract between two remotely communicating processes, the former case just happens to use a SQL dialect/transport to do it.
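As a hypothetical sketch of a non-breaking change in Postgres terms: add a parameter with a DEFAULT so existing callers keep working. (Postgres treats the new signature as a new function, so the old one is dropped first; all names below are invented.)

    DROP FUNCTION IF EXISTS api.get_invoices(bigint);

    CREATE FUNCTION api.get_invoices(p_user_id bigint,
                                     p_include_voided boolean DEFAULT false)
    RETURNS SETOF invoices
    LANGUAGE sql STABLE AS $$
      SELECT * FROM invoices
      WHERE user_id = p_user_id
        AND (p_include_voided OR NOT voided);
    $$;

    -- old callers:  SELECT * FROM api.get_invoices(42);        -- unchanged
    -- new callers:  SELECT * FROM api.get_invoices(42, true);  -- opt in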
replies(1): >>22938263 #
24. KajMagnus ◴[] No.22938263{9}[source]
Interesting that you mention doing this (version numbers) with views too, I didn't think about that ...

... Maybe that could be a way to "preview" a database migration, before running the migration for real. There could be a view 'Some_table_view_v2' that shows how that table would look after an upcoming data migration. And then the v2 app server code would use that new view. Then one could be more certain that the data migration will work fine.

(At the same time, one might need to be careful to let just a small fraction of the requests use the new view, if the view is a bit / a lot slower than the real table.)
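A tiny sketch of that idea (names invented): a view showing how the table would look after a planned migration that splits full_name into two columns:

    CREATE VIEW some_table_view_v2 AS
      SELECT id,
             split_part(full_name, ' ', 1) AS first_name,
             split_part(full_name, ' ', 2) AS last_name
      FROM some_table;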

25. steve-chavez ◴[] No.22949414{8}[source]
Version numbers work great as schemas. You could have v1.proc and v2.proc.

Also, for an API schema, only exposing views + stored procedures and hiding tables in private schemas makes the SQL easier to refactor. This is something we recommend at postgrest.org: http://postgrest.org/en/v7.0.0/schema_structure.html#schema-...
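A minimal sketch of both ideas together (all names invented): tables hidden in a private schema, with versioned schemas exposing only views and procedures:

    CREATE SCHEMA private;   -- tables live here, never exposed directly
    CREATE SCHEMA v1;
    CREATE SCHEMA v2;

    CREATE TABLE private.users (id bigint PRIMARY KEY, full_name text);

    CREATE VIEW v1.users AS
      SELECT id, full_name AS name FROM private.users;   -- old API shape

    CREATE VIEW v2.users AS
      SELECT id, full_name FROM private.users;            -- breaking change isolated in v2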

Sharing in case this is useful to you.

replies(1): >>22991553 #
26. KajMagnus ◴[] No.22991553{9}[source]
Interesting idea — I had never thought about that (I mean, looking at tables as internals, and procedures + views as the public interface). I'll think about that the next time I'm doing a data migration.

Thanks for sharing. Postgrest looks like it's intended for people building a web app who don't want to write application server code; instead, they just install a database?

Also, I imagine Postgrest is nice for internal admin apps? Then one "just" needs to write Javascript and connect to Postgrest via REST?

Actually, it'd be nice if the https://postgrest.org homepage explained some different use cases :-) I had to think for a little while, and I'm still not sure I understand all the cases where Postgrest can be useful.

Edit: Now I see there are explanations here: https://postgrest.org/en/v7.0.0/ ("Motivation: Using PostgREST is an alternative to manual CRUD programming" etc.) if I just scroll down a bit. I didn't do that until now; I probably stopped reading at the "Sponsors" text & images previously, or I thought it was a documentation page.

This is quite cool: "Creating new views happens in SQL with known performance implications. A database administrator can now create an API from scratch with no custom programming".

It looks a bit like GraphQL also: https://postgrest.org/en/v7.0.0/api.html#resource-embedding

Nice that it's written in Haskell, was my favorite language long ago :- )

(PS. http://postgrest.org redirects to http://postgrest.org/en/v6.0/, which is not the latest version (& not https); maybe you'd like to redirect to https://postgrest.org/en/v7.0.0/ instead)

replies(1): >>22999110 #
27. steve-chavez ◴[] No.22999110{10}[source]
> Postgrest looks like intended for people building a web app, and who don't want to write application server code, instead, they install just a database?

Besides usual webapps, I've also used it for building an IoT project (sensor logic in the db) and a GIS project (postgis/pgrouting). Some users have also used it for open data projects (OpenAPI is provided).

It's also great for internal admin apps and there are ready-made libs like https://github.com/Wellers0n/react-admin-postgrest-client.

Those are some of the use cases off the top of my head. In general, I think it's good for cases where you'd like to use the database as a "processing engine" instead of a "dumb storage".

The docs definitely need a lot of improvement. Thanks a lot for your feedback!