153 points michaelanckaert | 39 comments
1. ianamartin ◴[] No.23485568[source]
Well, that's too bad. I always thought this was a cool project. But if you can't dev your way into decent performance for a small alpha project using python/flask/sql, I don't think your tools are the problem. And I guarantee that a graphql isn't the solution.

So, I mean, good luck.

replies(5): >>23485616 #>>23485627 #>>23485632 #>>23487973 #>>23489310 #
2. ghoshbishakh ◴[] No.23485616[source]
what do you feel the problem is?
replies(1): >>23485681 #
3. searchableguy ◴[] No.23485627[source]
I didn't read the post that way. I feel scaling is a small issue; it's more about organization. GraphQL does make sense for something like SourceHut.

Why not use type hints in python? Isn't that a good enough substitute?
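For readers unfamiliar with the feature, here is a minimal sketch of what Python type hints look like. The `Repository` record and `clone_path` helper are hypothetical names made up for illustration, not anything from SourceHut. Note that hints are not enforced at runtime; a separate checker such as mypy has to be run to catch mismatches.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Repository:
    # Hypothetical record; the field names are illustrative only.
    name: str
    owner: str
    description: Optional[str] = None


def clone_path(repo: Repository) -> str:
    """Build a '~owner/name' style path from a typed record."""
    return f"~{repo.owner}/{repo.name}"
```

A static checker would flag a call like `clone_path("scripts")`, while plain Python would only fail once the attribute access runs.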

I wonder why go instead of rust if he wanted static typing, long term ease of maintenance and performance. Go's type system is not great especially for something like graphql. Gqlgen relies heavily on code generation. Last time I used it, I ran into so many issues. I ditched go altogether after several painful clashes with it that the community always responded to with: oh you don't need this.

(yeah except they implemented all those parts in arguably worse ways and ditched community solutions in the next few years)

One major benefit the GP fails to mention is that with graphql, it is easy to generate types for frontend. This makes your frontend far more sane. It's also way easier to test graphql since there are tools to automatically generate queries for performance testing unlike rest.

There is no need to add something for docs or interactivity like swagger.

replies(3): >>23485642 #>>23485652 #>>23485686 #
4. CameronNemo ◴[] No.23485632[source]
This quote stuck out to me:

>Today, the Python backends to the web services communicate directly with PostgreSQL via SQLAlchemy, but it is my intention to build out experimental replacement backends which are routed through GraphQL instead. This way, the much more performant and robust GraphQL backends become the single source of truth for all information in SourceHut.

I wonder how adding a layer of indirection can significantly improve performance. If I were writing this service, I would go all in on GraphQL and have the frontend talk to the GraphQL services directly rather than routing the requests from Python through to a GraphQL service then presumably to PostgreSQL.

Perhaps I am missing something. Indeed good luck to Drew here.

replies(3): >>23485739 #>>23485808 #>>23487994 #
5. CameronNemo ◴[] No.23485642[source]
>One major benefit the GP fails to mention is that with graphql, it is easy to generate types for frontend. This makes your frontend far more sane.

By GP (grandparent?) do you mean the article / blog post?

Because if so I see no indication that Drew plans to adopt a SPA architecture -- he seems intent on continuing to use server side rendering with little javascript, which would make "frontend types" sort of irrelevant.

replies(1): >>23485669 #
6. TeeWEE ◴[] No.23485652[source]
Code generation (of types and client/server stubs) can be done with any IDL (interface definition language). OpenAPI, gRPC, Thrift etc etc all support it. No reason to choose GraphQL only because of this.

The power in GraphQL comes from the graph and flexibility in fetching what you need. Useful in general purpose APIs (like GitHub has).
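To illustrate the "fetch what you need" idea, here is a rough sketch in plain Python. It is not GraphQL's actual execution algorithm; the `select` function and the shape of the selection dict are assumptions made up for this example. The client names the fields it wants, nested as deep as needed, and gets back only those.

```python
from typing import Any, Dict

# A selection maps a field name to None (leaf) or to a nested selection.
Selection = Dict[str, Any]


def select(record: Dict[str, Any], selection: Selection) -> Dict[str, Any]:
    """Return only the requested fields of a nested record."""
    result: Dict[str, Any] = {}
    for field, sub in selection.items():
        value = record[field]
        if sub is None:
            # Leaf field: copy the value as-is.
            result[field] = value
        elif isinstance(value, list):
            # List of objects: apply the nested selection to each item.
            result[field] = [select(item, sub) for item in value]
        else:
            # Single nested object.
            result[field] = select(value, sub)
    return result
```

Asking for `{"name": None, "owner": {"name": None}}` on a repository record would return the repository name and the owner's name and nothing else, which is the part a fixed REST response shape does not give you for free.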

replies(1): >>23485659 #
7. searchableguy ◴[] No.23485659{3}[source]
I never said it was the only reason. I said it is one of the benefits that GP failed to mention in the post. Other benefits that GP mentioned still apply.

You can of course do this with other standards but ime, it's easier to do this with graphql since you only have to build the api. There is less overhead overall since type information is part of the standard, not necessarily something people add afterwards or choose to. Introspection, graphiql and all the tooling is easier to use and doesn't need integrating something like swagger.

It comes setup by default on most solid graphql frameworks.

8. searchableguy ◴[] No.23485669{3}[source]
You don't need an SPA to benefit from the generated types from graphql. If you use something like typescript on the backend (SSR app), the types will be stripped out in the end so it doesn't affect your bundle size.
replies(2): >>23485680 #>>23487539 #
9. CameronNemo ◴[] No.23485680{4}[source]
You must understand my confusion when your original comment explicitly states that frontend types can be generated, but your reply here seems to be talking about a javascript/typescript backend service.
replies(1): >>23485706 #
10. ianamartin ◴[] No.23485681[source]
There are lots of production sites that serve 10,000x as much traffic as sourcehut that are built on Python/flask/sqlalchemy serving RESTful APIs.

If you can't make that combination work well, there's another place to look for problems besides your tool kit. You might need to ask yourself if you really understand the tools you're trying to use.

But like I said, this has always been a very cool project. My "good luck" was meant more as actual good luck than a Morgan Freeman You're-trying-to-blackmail-batman kind of good luck.

replies(2): >>23485782 #>>23485792 #
11. erk__ ◴[] No.23485686[source]
As for the reason to use Go instead of Rust, it is probably just down to the creator of SourceHut; he has expressed multiple times that he dislikes Rust quite a bit.

He has written a blog post about how he chooses programming languages as well https://drewdevault.com/2019/09/08/Enough-to-decide.html

replies(1): >>23486796 #
12. searchableguy ◴[] No.23485706{5}[source]
Sorry, by backend I mean your SSR app.
13. ◴[] No.23485739[source]
14. ◴[] No.23485782{3}[source]
15. xrisk ◴[] No.23485792{3}[source]
What are some production websites that run Python stacks? I’m curious.

On a tangential note: if anybody has blog posts on scaling Flask/SQLAlchemy or Django stacks I would appreciate it.

replies(3): >>23485819 #>>23485827 #>>23485987 #
16. ianamartin ◴[] No.23485808[source]
That quote is sort of exactly what's conceptually wrong with what's going on in my opinion. Yes, I know, armchair quarterback and I'm not the one out there building stuff like this for free, etc., etc.

But claiming some nebulous backend that's more performant and robust than Postgres is like, WTF? Are you using an actual GraphDB like Neo4J? Are you putting a graph frontend on Postgres like PostGraphQL? None of the post really makes any sense because GraphQL is a Query Language, not a data store. What are the CAP theorem tradeoffs in the new backend? What does more robust mean? What does more performant mean? This is a source control app. Those tradeoffs are meaningful.

There seems to be a lot of conflation between API design and data store and core programming tools all mixed into a big post that mostly sounds to me like, "I don't get how to make this (extremely popular and well-known platform that drives many websites 10000x my size) work well, so I'm trying something different that sounds cool."

Which, again, the author has always said this is an experiment, and that's cool. But the conceptual confusion in the post makes me think that moving away from boring tools and trying new tools is not going to end up going well.

But this is a source control app, and it's hopefully backed up somewhere besides sourcehut so it should be fine if he needs to backtrack.

replies(3): >>23486254 #>>23487671 #>>23489020 #
17. searchableguy ◴[] No.23485819{4}[source]
Dropbox is a notable one born here.
replies(1): >>23485998 #
18. alexchamberlain ◴[] No.23485827{4}[source]
Don't Instagram and YouTube still have large Python code bases?
19. ianamartin ◴[] No.23485987{4}[source]
Reddit is one you might have heard of.

pypi.org is another that's familiar. You know, every time you type `pip install x` yeah, that's pypi.

Although I think those are both powered mainly by Pyramid rather than flask. Still, same concept.

As others mention, large parts of google and youtube are still python. Dropbox was so invested in python that they employed Guido van Rossum for a while. Instagram, a lot of Yahoo! back when they were a thing, Spotify, Quora, Pinterest, Hipmunk, Disqus, and this really obscure satire site called The Onion that totally never gets any traffic at all.

All of them powered by python at their core, many of them Django, some Pyramid, and some Flask.

Yes, getting that big does require big teams. Becoming one of the top 100 or so sites on the internet always requires some special sauce as well as dedicated teams. But most of these companies started with Python and a framework and got to massive web scale along the way and never changed the core platform because there really wasn't a need. Handling scale isn't about your core language or framework. It's about dozens of other things that you can offload to other things if you're smart. But let's be real: sourcehut isn't close to any of that level of traffic.

My negativity on this isn't about stanning a particular language. I'm an agnostic in multiple ways. I'll use whatever tool seems like the best fit. I'm down on this because the explanation is tool-blaming, murky, unclear, and doesn't provide a lot of the detail I would want to have if I were depending on this service.

On the other hand, the guy has always said this is an alpha project and you should expect major changes. That's all fine. It's just weird to me to see a "why I'm changing from X to Y" post that doesn't really explain anything other than "I might be bad at this."

replies(1): >>23494341 #
20. procinct ◴[] No.23485998{5}[source]
Especially considering they employed the BDFL.
replies(1): >>23486929 #
21. zapf ◴[] No.23486254{3}[source]
Well said! Couldn't agree more.

The GraphQL confusion is one more bullshit in the world of web dev.

replies(1): >>23486460 #
22. ianamartin ◴[] No.23486460{4}[source]
I'm not totally against GraphQL in general. As an alternative to REST it can sometimes make sense. And let's be real, most REST APIs are absolute garbage. Anything would be better than a bad REST API.

And if, in fact, you are storing a graph in a graph database, the QL makes a bit of sense.

But nothing in the post makes any sense out of any of that. It's just Python bad; REST bad; I read too much hacker news, and I feel like it's time for a change.

Like, when I complain about other people's REST APIs, that's out of my control. This guy is saying that his API is garbage, and instead of fixing it to make it better, he's just going to redo everything with a worse result. I don't get it.

23. Kuinox ◴[] No.23486796{3}[source]
It's the first time I hear that haskell has awful package management...
replies(2): >>23487508 #>>23490584 #
24. square_usual ◴[] No.23486929{6}[source]
No longer the BDFL, unfortunately.
replies(1): >>23490426 #
25. tome ◴[] No.23487508{4}[source]
It used to, a few years ago, before Cabal v2-style. Nowadays package management is rather good, but its reputation hasn't caught up yet.
26. vertex-four ◴[] No.23487539{4}[source]
What bundle size? There's no javascript being shipped to the client.
replies(1): >>23487759 #
27. vertex-four ◴[] No.23487671{3}[source]
The goal here is to generate a typed API across a bunch of microservices (written in some typed language suited for the job) that are consumed by a Python frontend. The current design is a pile of vertically-integrated monoliths that touch the disk, database, perform backend operations and rendering all in one process.

Python's single-threaded design makes it difficult to be responsive to small queries quickly while simultaneously serving large, time-consuming queries (i.e. git operations). You can get around this using worker queues to separate interpreter processes and an async design, or otherwise splitting your workload up... or you can use a language where "have a threadpool" is actually a properly supported concept, and an architecture where sharding git/email/etc backends is feasible.
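As a rough illustration of the "split your workload up" option, here is a minimal sketch using the standard library's `concurrent.futures`. The function names are hypothetical and `time.sleep` stands in for a slow, mostly I/O-bound job such as waiting on a git subprocess. Threads help in that case because the GIL is released while waiting; CPU-bound work would still contend for it and would need separate processes instead.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import List


def slow_git_operation(seconds: float) -> str:
    # Stand-in for a long-running, mostly I/O-bound task (hypothetical).
    time.sleep(seconds)
    return "slow"


def quick_lookup() -> str:
    # Stand-in for a small query that should not wait behind the slow one.
    return "quick"


def completion_order(slow_seconds: float = 0.2) -> List[str]:
    """Submit the slow job first and report the order in which jobs finish."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [
            pool.submit(slow_git_operation, slow_seconds),
            pool.submit(quick_lookup),
        ]
        return [future.result() for future in as_completed(futures)]
```

Even though the slow job is submitted first, the quick one is served without waiting for it, which is the responsiveness property being discussed.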

replies(1): >>23489341 #
28. gbear605 ◴[] No.23487759{5}[source]
Technically there is some used for a few things, like a text editor (for writing build manifests) and for payments. But those are very limited and aren’t relevant to the use of GraphQL.
29. ddevault ◴[] No.23487973[source]
Performance is a secondary concern. SourceHut already has the best performance in the industry, built on Python:

https://forgeperf.org

But I think it could be even better, and this work will help. It will make it easier to write performant code without explicitly hand-optimizing everything.

There are more important reasons to consider GraphQL than performance, which I cover in detail in TFA.

30. ddevault ◴[] No.23487994[source]
It will be performing this indirection over localhost. I don't see it as much different from the indirection of SQLAlchemy. Yes, there is the question of parsing the GQL and so on, but I think that they're surmountable and fit well within our desired performance budget.

Performance is also just one of many reasons why this approach is being considered.

31. karatestomp ◴[] No.23489020{3}[source]
> Are you using an actual GraphDB like Neo4J? Are you putting a graph frontend on Postgres like PostGraphQL? None of the post really makes any sense because GraphQL is a Query Language, not a data store. What are the CAP theorem tradeoffs in the new backend? What does more robust mean? What does more performant mean? This is a source control app. Those tradeoffs are meaningful.

GraphQL isn't particularly "graphy". Its name sucks. But don't worry, plenty of half-techy middle managers are out there making the same mistake and going "we do graph things, why don't you guys look into this GraphQL thing that's getting so much buzz?" It's not a great fit for graph operations, in fact. Not more than SQL, certainly.

As for N4J in particular, don't count on that to improve performance even if you're doing lots of graph stuff. Depends heavily on your usage patterns and it's very easy to modify a query in a way that seems like it'd be fine but in fact makes performance fall off a cliff. OTOH Cypher, unlike GraphQL, is a very nice language for querying graphs.

32. pknopf ◴[] No.23489310[source]
> And I guarantee that a graphql isn't the solution.

I agree. Out of the pan and into the fryer.

He had a good idea though.

33. pknopf ◴[] No.23489341{4}[source]
> The goal here is to generate a typed API across a bunch of microservices

You are describing gRPC.

replies(1): >>23489622 #
34. vertex-four ◴[] No.23489622{5}[source]
I'm describing a lot of things, JSON-Schema documented REST APIs being another one. The other thing about GraphQL is that you can make a query that contains multiple requests and allows the server to optimise how to process them, which is not something that REST or gRPC are very good at.
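A sketch of the kind of optimisation meant here, under stated assumptions: because the server sees every lookup in the query at once, it can deduplicate and batch them into one backend round trip. The `USERS` data, `fetch_users`, and `resolve_batched` are all hypothetical names for illustration, not how any particular GraphQL server implements it.

```python
from typing import Dict, List

USERS: Dict[int, str] = {1: "alice", 2: "bob", 3: "carol"}  # hypothetical data
CALLS: List[List[int]] = []  # records each simulated backend round trip


def fetch_users(ids: List[int]) -> Dict[int, str]:
    """Simulated backend call that loads many users in one round trip."""
    CALLS.append(list(ids))
    return {user_id: USERS[user_id] for user_id in ids}


def resolve_batched(requested_ids: List[int]) -> List[str]:
    """Resolve every user lookup in a query with a single backend call."""
    unique_ids = sorted(set(requested_ids))  # deduplicate repeated lookups
    found = fetch_users(unique_ids)
    # Return results in the order the query asked for them.
    return [found[user_id] for user_id in requested_ids]
```

With separate REST calls, each of those lookups would typically be its own request, and the server would have no chance to merge them.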
35. iso-8859-1 ◴[] No.23490426{7}[source]
The BDFL is a position that you have for life. It is embedded in the definition. So it is impossible to shed the title.
replies(1): >>23490460 #
36. dragonwriter ◴[] No.23490460{8}[source]
Like most “for life” positions (the papacy, US federal judges, the British monarchy, etc.), it can be shed though it is not regularly expected to be lost other than at the holder's decision.
37. Scarbutt ◴[] No.23490584{4}[source]
I hear the opposite, it's so awful they have to use Nix to keep it sane.
replies(1): >>23497028 #
38. kupaka ◴[] No.23494341{5}[source]
jfyi, The Onion hasn't run on Python for a few years now. In 2017, they migrated over to Gizmodo's Scala stack.
39. tome ◴[] No.23497028{5}[source]
Interesting. Where did you hear that?