So, I mean, good luck.
>Today, the Python backends to the web services communicate directly with PostgreSQL via SQLAlchemy, but it is my intention to build out experimental replacement backends which are routed through GraphQL instead. This way, the much more performant and robust GraphQL backends become the single source of truth for all information in SourceHut.
I wonder how adding a layer of indirection can significantly improve performance. If I were writing this service, I would go all in on GraphQL and have the frontend talk to the GraphQL services directly, rather than routing requests from Python through a GraphQL service and then, presumably, on to PostgreSQL.
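To make the indirection concrete, here is a rough sketch of the two request paths in Python; the DSN, endpoint URL, and table/field names are placeholders, not SourceHut's actual code:

    import requests
    import sqlalchemy

    engine = sqlalchemy.create_engine("postgresql:///sourcehut")  # placeholder DSN

    def get_repo_direct(name):
        # Today's path: Python talks straight to PostgreSQL via SQLAlchemy.
        with engine.connect() as conn:
            row = conn.execute(
                sqlalchemy.text("SELECT id, name FROM repository WHERE name = :n"),
                {"n": name},
            ).one()
            return dict(row._mapping)

    def get_repo_via_graphql(name):
        # Proposed path: the same lookup is forwarded through a GraphQL
        # service, which presumably queries PostgreSQL itself -- one extra hop.
        resp = requests.post(
            "http://localhost:5001/query",  # placeholder endpoint
            json={
                "query": "query($n: String!) { repository(name: $n) { id name } }",
                "variables": {"n": name},
            },
        )
        return resp.json()["data"]["repository"]

It's hard to see where a performance win comes from in the second path unless the GraphQL service adds caching or batching that the Python layer lacks.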
Perhaps I am missing something. Indeed good luck to Drew here.
But claiming some nebulous backend that's more performant and robust than Postgres is like, WTF? Are you using an actual graph database like Neo4j? Are you putting a GraphQL frontend on Postgres, like PostGraphQL? Much of the post doesn't really make sense, because GraphQL is a query language, not a data store. What are the CAP theorem tradeoffs in the new backend? What does "more robust" mean? What does "more performant" mean? This is a source control app; those tradeoffs are meaningful.
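For what it's worth, the query-language-vs-data-store distinction is easy to see in code: a GraphQL schema is just typed fields plus resolvers, and the resolvers still have to fetch from some actual store. A toy sketch using the graphene library (hypothetical types, not SourceHut's schema):

    import graphene

    class Repository(graphene.ObjectType):
        id = graphene.Int()
        name = graphene.String()

    class Query(graphene.ObjectType):
        repository = graphene.Field(Repository, name=graphene.String(required=True))

        def resolve_repository(root, info, name):
            # The storage layer still has to exist somewhere -- a SQL
            # query, Neo4j, whatever. GraphQL only shapes the API.
            return Repository(id=1, name=name)  # stub instead of a real lookup

    schema = graphene.Schema(query=Query)
    result = schema.execute('{ repository(name: "example") { id name } }')
    print(result.data)  # {'repository': {'id': 1, 'name': 'example'}}

Whatever sits behind the resolvers is where the CAP tradeoffs live, and the post never says what that is.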
There seems to be a lot of conflation of API design, data stores, and core programming tools, all mixed into a big post that mostly sounds to me like, "I don't get how to make this (extremely popular and well-known platform that drives many websites 10000x my size) work well, so I'm trying something different that sounds cool."
Which, again, the author has always said this is an experiment, and that's cool. But the conceptual confusion in the post makes me think that moving away from boring tools and trying new tools is not going to end up going well.
But this is a source control app, and it's hopefully backed up somewhere besides sourcehut, so it should be fine if he needs to backtrack.
Python's GIL-bound execution model makes it difficult to respond to small queries quickly while simultaneously serving large, time-consuming ones (e.g. git operations). You can get around this using worker queues that hand off to separate interpreter processes and an async design, or otherwise splitting your workload up... or you can use a language where "have a threadpool" is actually a properly supported concept, and an architecture where sharding the git/email/etc. backends is feasible.
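A minimal sketch of that worker-queue workaround, assuming a threaded WSGI-style server; the job and handler names are hypothetical:

    import concurrent.futures
    import subprocess

    # Separate interpreter processes sidestep the GIL: the serving process
    # stays free for small, fast queries while heavy jobs run elsewhere.
    pool = concurrent.futures.ProcessPoolExecutor(max_workers=4)

    def render_large_diff(repo_path, commit):
        # Hypothetical heavy job: shelling out to git for a big diff.
        out = subprocess.run(
            ["git", "-C", repo_path, "show", "--stat", commit],
            capture_output=True, text=True, check=True,
        )
        return out.stdout

    def handle_diff_request(repo_path, commit):
        # Only this request blocks on the future; other server threads
        # keep answering quick queries in the meantime.
        return pool.submit(render_large_diff, repo_path, commit).result()

This works, but it's exactly the kind of plumbing you get for free in a language with real threads.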