324 points | onnnon | 12 comments
irskep ◴[] No.42729983[source]
I agree with most of the other comments here, and it sounds like Shopify made sound tradeoffs for their business. I'm sure the people who use Shopify's apps are able to accomplish the tasks they need to.

But as a user of computers and occasional native mobile app developer, hearing "<500ms screen load times" stated as a win is very disappointing. Having your app burn battery for half a second doing absolutely nothing is bad UX. That kind of latency does have a meaningful effect on productivity for a heavy user.

Besides that, having done a serious evaluation of whether to migrate a pair of native apps supported by multi-person engineering teams to RN, I think this is a very level-headed take on how to make such a migration work in practice. If you're going to take this path, this is the way to do it. I just hope that people choose targets closer to 100ms.

fxtentacle ◴[] No.42730123[source]
I would read the <500ms screen loads as follows:

When the user clicks a button, we start a server round-trip and fetch the data and do client-side parsing, layout, formatting and rendering and then less than 500ms later, the user can see the result on his/her screen.

With a worst-case ping of 200ms for a round-trip, that leaves about 200ms for DB queries and then 100ms for the GUI rendering, which is roughly what you'd expect.

1. sgarland ◴[] No.42732820[source]
> 200ms for DB queries

No. Just no. There’s an entire generation of devs at this point who are convinced that a DB is something you throw JSON into, use UUIDs for everything, add indices when things are slower than you expected, and then upsize the DB when that doesn’t fix it.

RAM access on modern hardware has a latency of something like 10 nanoseconds. NVMe reads vary based on queue depth and block size, but sub-msec is easily attainable. Even if your disks are actually a SAN, you should still see 1-2 msec. The rest is up to the DB.

All that to say, a small point query on a well-designed schema should easily execute in sub-msec times if the pages are in the DB’s buffer pool. Even one with a small number of joins shouldn’t take more than 1-2 msec. If this is not the case for you, your schema, query, or DB parameters are sub-optimal, or you’re doing some kind of large aggregation query.
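
To make that concrete, by "small point query" I mean something like this (table and column names invented for illustration):

    -- single-row lookup on the primary key; with the page already in the
    -- buffer pool this should come back in well under a millisecond
    SELECT id, status, total_cents
    FROM orders
    WHERE id = 123456;

    -- a couple of joins on indexed foreign keys should still land in the
    -- low single-digit msec range
    SELECT o.id, o.total_cents, c.email
    FROM orders o
    JOIN customers c ON c.id = o.customer_id
    WHERE o.id = 123456;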

I took a query from 70 to 40 msec today just by rewriting it. Zero additional indexing or tuning, just unrolling several unnecessary nested subqueries, and adding a more selective predicate. I have no doubt that it could get into the single digits if better indexing was applied.
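
I can't paste the real query, but the shape of the rewrite was roughly this (tables and predicates invented):

    -- before: nested derived tables that the planner may end up
    -- materializing before it can filter
    SELECT * FROM (
        SELECT * FROM (
            SELECT * FROM orders WHERE created_at > '2024-01-01'
        ) t WHERE status = 'open'
    ) t2;

    -- after: one flat query, plus a predicate the caller can always
    -- supply (customer_id here) that is selective enough to drive an index
    SELECT *
    FROM orders
    WHERE customer_id = 42
      AND status = 'open'
      AND created_at > '2024-01-01';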

I beg of devs, please take the time to learn SQL, to read EXPLAIN plans, and to measure performance. Don’t accept 200 msec queries as “good enough” because you’re meeting your SLOs. They can be so much faster.

2. reissbaker ◴[] No.42733202[source]
I think 500ms P75 is good for an app that hits network in a hot path (mobile networks are no joke), but I agree that 200ms is very very bad for hitting the DB on the backend. I've managed apps with tables in the many, many billions of rows in MySQL and would typically expect single digit millisecond responses. If you use EXPLAIN you can quickly learn to index appropriately and adjust queries when necessary.
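
The habit is cheap, e.g. (table name invented):

    EXPLAIN SELECT id, status
    FROM orders
    WHERE customer_id = 42
      AND created_at > '2024-01-01';

If the key column in the output comes back NULL, or the rows estimate is a big chunk of the table, you know you're about to scan and can fix the index or the query before it ships.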
3. gooosle ◴[] No.42733370[source]
500ms p75 is not good for the (low) complexity of the shopify app.

Also reporting p75 latency instead of p99+ just screams to me that their p99 is embarrassing and they chose p75 to make it seem reasonable.

4. charleslmunger ◴[] No.42734430[source]
>RAM access on modern hardware has a latency of something like 10 nanoseconds

What modern hardware are you using that this is true? That's faster than L3 cache on many processors.

5. ezekiel68 ◴[] No.42735780[source]
Beg all you want. They're still going to dump JSON strings (not even jsonb) and UUIDs in them anyway, because "Move fast and break things."

I lament along with you.

6. sgarland ◴[] No.42737193[source]
Correction: DRAM latency is ~10 - 20 nsec on most DDR4 and DDR5 sticks. The access time as seen by a running program is much more than that.

As an actual example of RAM latency: DDR4-3200 has an I/O clock of 1600 MHz (the 3200 is MT/s, double-pumped), so CL22 works out to 22 cycles / 1.6E9 cycles/sec ≈ 13.75 nsec.

7. sgarland ◴[] No.42737207[source]
“We’re disrupting!”

“Yeah, you’re disrupting my sleep by breaking the DB.”

8. refset ◴[] No.42737433[source]
> just unrolling several unnecessary nested subqueries, and adding a more selective predicate

And state of the art query optimizers can even do all this automatically!

9. fxtentacle ◴[] No.42737686[source]
"All that to say, a small point query on a well-designed schema should easily execute in sub-msec times if the pages are in the DB’s buffer pool"

Shopify is hosting a large number of webshops with billions of product descriptions, but each store only has a low visitor count. So we are talking about a very large and, hence, uncacheable dataset with sparse access. That means almost every DB query to fetch a product description will hit the disk. I'd even assume a RAID of spinning HDDs for price reasons.

10. sgarland ◴[] No.42738208[source]
Shopify runs a heavily sharded MySQL backend. Their Shop app uses Vitess; last I knew the main Shopify backend wasn’t on Vitess (still sharded, just in-house), but I could be wrong.

I would be very surprised if “almost every query” was hitting disk, and I’d be even more surprised to learn that they used spinners.

11. sgarland ◴[] No.42738251[source]
Sometimes, yes. Sometimes not. This was on MySQL 5.7, and I wound up needing to trace the optimizer path to figure out why it was slower than expected.

While I do very much appreciate things like WHERE foo IN -> WHERE EXISTS being automatically done, I also would love it if devs would just write the latter form. Planners are fickle, and if statistics get borked, query plans can flip. It's much harder to diagnose when all along, the planner has been silently rewriting your query, and only now is actually running it as written.
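
i.e. instead of leaning on the rewrite, just write it that way to begin with (tables invented):

    -- relies on the planner to transform IN into a semi-join
    SELECT o.id
    FROM orders o
    WHERE o.customer_id IN (
        SELECT c.id FROM customers c WHERE c.region = 'EU'
    );

    -- says the same thing explicitly
    SELECT o.id
    FROM orders o
    WHERE EXISTS (
        SELECT 1
        FROM customers c
        WHERE c.id = o.customer_id
          AND c.region = 'EU'
    );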

12. refset ◴[] No.42743728{3}[source]
Explicit query plan pinning helps a lot, alongside strong profiling and monitoring tools.