The Samsung 990 Pro 2TB has a latency of about 40 μs.
DDR4-2133 with a CAS latency of 15 has a latency of about 14 nanoseconds.
DDR4 latency is 0.035% of one of the fastest SSDs, or to put it another way, DDR4 is 2,857x faster than an SSD.
L1 cache is typically accessible in 4 clock cycles; on a 4.8 GHz CPU like the i7-10700, that puts L1 latency under 1 ns.
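For anyone who wants to sanity-check those ratios, here's a quick back-of-the-envelope sketch (the 40 μs SSD figure is the assumed number from above, not a measurement):

```python
# Back-of-the-envelope check of the latency ratios quoted above.
ssd_latency_ns = 40_000        # ~40 us for a fast NVMe SSD (assumed figure)
dram_latency_ns = 14           # DDR4-2133 CL15: 15 cycles / 1066.5 MHz ~= 14 ns
l1_latency_ns = 4 / 4.8        # 4 cycles at 4.8 GHz ~= 0.83 ns

print(f"DRAM vs SSD: {dram_latency_ns / ssd_latency_ns:.3%}")    # ~0.035%
print(f"SSD / DRAM:  {ssd_latency_ns / dram_latency_ns:,.0f}x")  # ~2,857x
print(f"DRAM / L1:   {dram_latency_ns / l1_latency_ns:.0f}x")    # ~17x
```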
The amount of complexity the architecture has because of those constraints is insane.
At my previous job, management kept asking for designs at that scale for less than 1/1000th of the throughput, and I was constantly pushing back. There are real costs to building for more scale than you need. It's not as simple as just tweaking a few things.
To me there's a couple of big breakpoints in scale:
* When you can run on a single server
* When you can still run on a single server, but need HA redundancy
* When you have to scale beyond a single server
* When you have to adapt your design to the limits of a distributed system, e.g. designing around DynamoDB's partition limits.
Each step in that chain adds irrevocable complexity, adds OE, and adds to the cost to run and the cost to build. Be sure you have to take those steps before you decide to.
Even a very unoptimized application running on a dev laptop can serve 1Gbps nowadays without issues.
So what are the constraints that demand a complex architecture?
* Reading/fetching the data - usernames, phone number, message, etc.
* Generating the content for each message - it might be custom per person
* This uses a 3rd-party API that might take anywhere from 100 ms to 2 s to respond, and you need to leave a connection open.
* Retries on errors, rescheduling, backoffs (see the sketch after this list)
* At least once or at most once sends? Each has tradeoffs
* Stopping/starting that many messages at any time
* Rate limits on some services you might be using alongside your service (network gateway, database, etc)
* Recordkeeping - did the message send? When?
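To make the retry/backoff and open-connection points concrete, here's a minimal sketch of the sending side, assuming an asyncio worker with a hypothetical `send_via_provider` standing in for the real 3rd-party API call. It illustrates bounded concurrency plus backoff; it is not a production sender.

```python
# Minimal sketch: bounded-concurrency sends with retries and exponential
# backoff against a slow third-party API. send_via_provider is a
# hypothetical stand-in for the real provider call.
import asyncio
import random

MAX_IN_FLIGHT = 200     # connections we're willing to hold open at once
MAX_ATTEMPTS = 5
BASE_BACKOFF_S = 0.5

async def send_via_provider(message: dict) -> None:
    # Placeholder for the real API call, which may take 100 ms to 2 s.
    await asyncio.sleep(random.uniform(0.1, 2.0))
    if random.random() < 0.05:          # simulate transient provider errors
        raise RuntimeError("provider 5xx")

async def send_with_retries(message: dict, sem: asyncio.Semaphore) -> bool:
    for attempt in range(1, MAX_ATTEMPTS + 1):
        async with sem:                 # cap how many connections stay open
            try:
                await send_via_provider(message)
                return True             # record success here for at-least-once
            except Exception:
                pass
        # exponential backoff with jitter before rescheduling the send
        await asyncio.sleep(BASE_BACKOFF_S * 2 ** (attempt - 1) * random.uniform(0.5, 1.5))
    return False                        # hand off to a dead-letter queue, alerting, etc.

async def main(messages: list[dict]) -> None:
    sem = asyncio.Semaphore(MAX_IN_FLIGHT)
    results = await asyncio.gather(*(send_with_retries(m, sem) for m in messages))
    print(f"sent {sum(results)}/{len(messages)}")

if __name__ == "__main__":
    asyncio.run(main([{"to": f"user{i}", "body": "hi"} for i in range(1_000)]))
```

Even this toy version already has to answer the at-least-once vs at-most-once question (here, a retry after a timeout can double-send), which is exactly where the complexity creeps in.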
The success that VCs are after is when your customer base doubles every month. Better yet, every week. Having a reasonably scalable infra at the start ensures that a success won't kill you.
Of course, the chances of a runaway success like this are slim, so 99% or more of startups overbuild, given their resulting customer base. But it's like the 99% or more of pilots who put on a parachute and never end up using it; the whole point is the small minority who do, and you never know.
For a stable, predictable, medium-scale business it may make total sense to have a few dedicated physical boxes and run their whole operation from them comfortably, for a fraction of cloud costs. But starting with it is more expensive than starting with a cloud, because you immediately need an SRE, or two.
Look at the big successes such as YouTube, Twitter, Facebook, Airbnb, Lyft, Google, Yahoo - exactly zero of them did this preventatively. Even AltaVista and Babel Fish, done by DEC and running on Alphas, which they had plenty of, had to be redone multiple times due to growth. Heck, look at the first 5 years of Amazon. AWS was initially ideated in a contract job for Target.
Address the immediate and real needs and business cases, not pie-in-the-sky aspirations of global dominance - wait until it becomes a need and then do it.
The chances of getting there are only reasonable if you move instead of plan; otherwise you'll miss the window and the product opportunity.
I know it ruffles your engineering feathers - that's one of the reasons most attempts at building these things fail. The best ways feel wrong, are counterintuitive, and are incidentally often executed by young college kids who don't know any better. It's why successful tech founders tend to be inexperienced; it can actually be advantageous if they make the right "mistakes".
Forget about any supposedly inevitable disaster until it's actually affecting your numbers. I know it's hard, but the most controllable difference between success and failure in the startup space is in the behavioral patterns of the stakeholders.
So the converse argument might be: don't bungle it up because you failed to plan. Provision for at least 10x growth with every (re-)implementation.
https://highscalability.com/friendster-lost-lead-because-of-...
MySpace was the one that took the lead over Friendster, and it withered after News Corp acquired it for $580 million, because that was the liquidity event. That's when Facebook gained ground. Your timeline is wrong.
The MySpace switch was because of themes and other features the users found more appealing. Twitter had similar crashes with its fail whale for a long time, and it survived them fine. The teen exodus from Friendster wasn't because of TTLB waterfall graphs.
Also, MySpace did everything on cheap Microsoft IIS 6 servers in ASP.NET 2.0 after switching from ColdFusion written in Macromedia HomeSite; they weren't geniuses. It was a knockoff created by amateurs with a couple of new twists. (A modern clone has 2.5 million users, still mostly teenagers: see https://spacehey.com/browse)
Besides, by the time the final Friendster holdout, the Asian market, went into exponential decline in 2008, the scaling problems of five years earlier had long since been fixed. Faster load times did not make up for a product consumers no longer found compelling.
Also, Facebook initially ran literally out of Mark's dorm room. In 2007, after they had won the war, their code got leaked because their deploy process shipped the .svn directory. Their code was widely mocked. So there we are again.
I don't care if you can find someone who agrees with you on the Friendster scaling thing. Almost every collapsed startup has someone who says "we were just too successful and couldn't keep up", because thinking you were just too awesome is gentler on the ego than realizing a bunch of scrappy hackers just gave people more of what they wanted, and either you didn't realize it or you thought your lack of adaptation was a virtue.
You're a highly technical user. Non-technical people are weird - part of the MySpace exodus was the belief that it spread "computer viruses", really.
There was more to the switches, but I'd have to dredge it up, probably through archive sites these days. The reasons the surveys supported I considered ridiculous, but that doesn't matter; it's better to understand consumer behavior - we can't easily change it.
Especially these days. It was not possible for me to be a teenager with high-speed Wi-Fi when I was one 30 years ago. I've got near-zero understanding of the modern consumer youth market or what they think. Against all my expectations, I've become an old person.
Anyway, the freeform HTML was a major driver - it was GeoCities with less effort, which had also exited through a liquidity event and has a modern clone these days: https://neocities.org/browse
Also, sometimes it is poor communication. Just yesterday I saw some code that requests an auth token before every request, even though each bearer token comes with an expires_in of about twelve hours.
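The fix being described is just caching the token until it's near expiry; here's a minimal sketch, with a hypothetical `fetch_token()` standing in for the real auth endpoint:

```python
import time

def fetch_token() -> dict:
    # Hypothetical stand-in for the real auth endpoint (e.g. an OAuth
    # client-credentials call); expires_in mirrors the ~12 h mentioned above.
    return {"access_token": "example-token", "expires_in": 12 * 3600}

_token = None          # cached bearer token
_expires_at = 0.0      # unix timestamp when the cached token expires

def get_token() -> str:
    """Return a cached bearer token, refreshing only when it's about to expire."""
    global _token, _expires_at
    # Refresh a minute early so we never send a token that expires mid-request.
    if _token is None or time.time() > _expires_at - 60:
        resp = fetch_token()
        _token = resp["access_token"]
        _expires_at = time.time() + resp["expires_in"]
    return _token
```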
The third-party API is the part that has the potential to turn this straightforward task into a byzantine mess, though, so I suspect that's the missing piece of information.
I'm comparing this to my own experience with IRC, where handling the same or larger streams of messages is common. And that's while receiving them in real time, storing the messages, matching and potentially reacting to them, and doing all of that while running on a Raspberry Pi.