I helped build Ars Technica and spent the majority of my time trying to make the site fast. We used a content delivery network to cache static content close to anonymous readers, and it worked very well for them. But the most valuable readers weren't the anonymous ones; they were the people who paid for subscriptions. They wanted personalized content and features for interacting with the community – and we couldn't make those fast. Content delivery networks don't work for Ars Technica's best customers.
Running Docker apps close to users helps get past the "slow" speed of light. Most interactions with an app server seem slow because of latency between the hardware it's running on (frequently in Virginia) and the end user (frequently not in Virginia). Moving server apps close to users is a simple way to decrease latency, sometimes by 80% or more.
fly.io is really a way to run Docker images on servers in different cities and a global router to connect users to the nearest available instance. We convert your Docker image into a root filesystem, boot tiny VMs using a project called Firecracker (recently discussed here: https://news.ycombinator.com/item?id=22512196) and then proxy connections to it. As your app gets more traffic, we add VMs in the most popular locations.
We wrote a Rust based router to distribute incoming connections from end users. The router terminates TLS when necessary (some customers handle their own TLS) and then hands the connection off to the best available Firecracker VM, which is frequently in a different city.
Networking took us a lot of time to get right. Applications get dedicated IP addresses from an Anycast block. Anycast is an internet routing feature that lets us "announce" the same addresses from multiple datacenters, and then core routers pick the destination with the shortest route (mostly). We run a mesh WireGuard network for backhaul, so in-flight data is encrypted all the way into a user application. This is the same kind of network infrastructure the good content delivery networks use.
We got a handful of enterprise companies to pay for this, and spent almost a year making it simple to use — it takes 3 commands to deploy a Docker image and have it running in 17 cities: https://fly.io/docs/speedrun/. We also built "Turboku" to speed up Heroku apps. Pick a Heroku app and we deploy the slug on our infrastructure; typical Heroku apps are 800ms faster on fly.io: https://fly.io/heroku/
We've also built some features based on Hacker News comments. When people launch container hosting on Hacker News, there's almost always a comment asking for:
1. gRPC support: apps deployed to fly.io can accept any kind of TCP connection. We kept seeing people say "hey I want to run gRPC servers on this shiny container runtime". So you can! You can specify whether you want us to handle TLS or HTTP for an app, or just do everything yourself (there's a minimal raw-TCP sketch after this list).
2. Max monthly spend: unexpected traffic spikes happen, and the thought of spending an unbounded amount of money in a month is really uncomfortable. You can configure fly.io apps with a max monthly budget, we'll suspend them when they hit that budget, and then re-enable them at the beginning of the next month.
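To make the "any kind of TCP" point in item 1 concrete, here's a minimal sketch of the sort of non-HTTP service this supports. It's just a threaded echo server; the port is arbitrary and nothing here is fly-specific.

```python
# Minimal sketch: a plain TCP echo server, the kind of non-HTTP service
# that raw TCP handling supports. Port 10000 is arbitrary.
import socketserver

class EchoHandler(socketserver.StreamRequestHandler):
    def handle(self):
        # Read lines from the client and echo them back unchanged.
        for line in self.rfile:
            self.wfile.write(line)

if __name__ == "__main__":
    # Listen on all interfaces so the proxy in front of the VM can reach it.
    with socketserver.ThreadingTCPServer(("0.0.0.0", 10000), EchoHandler) as srv:
        srv.serve_forever()
```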
One of the best parts of building this has been seeing the problems that developers are trying to solve, often problems we didn't know about beforehand. My favorite is a project to re-encode MP3s at variable speeds for specific users (apparently the Apple Audiobook player has no option for playback speed). Another is "TensorFlow at the edge" — they trained a TensorFlow model to detect bots and run predictions before handling requests.
We're really happy we get to show this to you all, thank you for reading about it! Please let us know your thoughts and questions in the comments.
You seem to be addressing the same problems.
[1] https://en.wikipedia.org/wiki/Information-centric_networking [2] https://irtf.org/icnrg
The list of cities looks pretty random to me. In particular, I am not seeing anything in the Northeast, New York, etc. In upstate New York I already have 30ms latency to AWS and Azure in Ohio, without terrible tail latency.
Instead, we give everyone $10/mo of credits and have a really tiny VM that you can run full time for $2.67/mo.
I'm using bare AWS at the moment because a) they gave me $5k in credits for YC SUS, b) they own the physical servers, and c) I can trust that they'll be around a long time, so I'd rather get locked into AWS proper rather than a service that might be built on top of AWS (e.g. CloudFormation vs. Terraform).
But I get, better than I did two months ago, just how freaking hard it is to build something, anything. This is amazing work, and I couldn't do it. Kudos to you, and I look forward to hearing about your amazing success!
`ewr Parsippany, NJ (US)`
We _tend_ to do better than AWS on latency to your apps, and from upstate New York you'd probably be connecting to New Jersey. I would bet Virginia is quicker than Ohio for you most of the time too.
I discovered this earlier when I was playing Titanfall and noticed a much lower ping to their Azure data center in Ohio. I confirmed it by setting up my own host in Azure.
I was thinking of switching to Azure but pretty soon AWS opened us-east-2 and I moved my stuff there.
edit: those regions are the same because it's the easiest set of cities to roll this out in :)
Any plans to launch any datacenters in India? Could not find any here - https://fly.io/docs/regions/#welcome-message
Most people we've worked with want to run apps they've already written (or open source like https://github.com/h2non/imaginary).
Out of curiosity, what kind of customers/teams are asking you for gRPC support? Is this coming from your enterprise customers or from smaller teams?
I was looking at containerized PostgreSQL on AWS because I want to colocate a job scheduling tool (pg_cron) with the database process, but RDS doesn't support that extension. Apparently (or at least I hope), ecs-cli compose supports docker volumes through EBS, which is the same base as EKS persistent volumes. There's next to no information for ECS + EBS though, everybody uses EC2 or full on EKS.
I was just thinking, if you needed to handle excessive read load on small quantities of data, having separate data layers would enable you to autoscale db instances while still having the same volumes, instead of using an entirely separate caching layer which could introduce bugs and increase maintenance overhead. If you guys had native HA with docker exec access and passed savings to consumers that would be huge for me and my use cases.
I just checked one of the performance tools we use a lot and it's <3ms to connect to fly.io New Jersey from NYC. It's not the best test because datacenter-to-datacenter behaves differently than consumer internet and NYC isn't upstate New York. If you feel like testing I'm curious what you see to https://flyio-ui.fly.dev
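If anyone upstate feels like measuring from their own connection, here's a rough sketch that times bare TCP connects. It only measures connection setup (no TLS, no HTTP), so it understates a full request but shows the network distance:

```python
# Rough client-side latency check: time a few TCP connects to port 443.
import socket, time

HOST, PORT, SAMPLES = "flyio-ui.fly.dev", 443, 5

for i in range(SAMPLES):
    start = time.perf_counter()
    with socket.create_connection((HOST, PORT), timeout=5):
        pass  # connection established; close immediately
    elapsed_ms = (time.perf_counter() - start) * 1000
    print(f"connect {i + 1}: {elapsed_ms:.1f} ms")
```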
Ironically, these also make you a prime acquisition target (because the product idea rocks), which renders your long-term future unclear.
I am building a search engine and this would let me derive your performance benefits using region-scoped databases and search indices.
https://news.ycombinator.com/item?id=19612577
It's usually small teams, individual devs who want gRPC. Even if it's within a large company, it's almost always one technical person.
It _might_, if you need a bunch of IPv4 addresses it'll add up fast. But you could always put your own router app in place to accept that port on one IP, find the right IPv6 address, and forward connections along.
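A minimal sketch of that router app idea, using asyncio; the port-to-backend table is a stand-in for whatever lookup you'd actually do, and the IPv6 address is made up:

```python
# Sketch of a tiny "router app": accept TCP on one public IPv4 port and
# forward the bytes to an IPv6 backend. PORT_TO_BACKEND is a stand-in
# for real service discovery.
import asyncio

PORT_TO_BACKEND = {
    9000: ("fdaa::10", 9000),  # hypothetical internal IPv6 address
}

async def pump(reader, writer):
    # Copy bytes in one direction until EOF.
    try:
        while data := await reader.read(65536):
            writer.write(data)
            await writer.drain()
    finally:
        writer.close()

async def handle(client_reader, client_writer):
    port = client_writer.get_extra_info("sockname")[1]
    host, backend_port = PORT_TO_BACKEND[port]
    backend_reader, backend_writer = await asyncio.open_connection(host, backend_port)
    await asyncio.gather(
        pump(client_reader, backend_writer),
        pump(backend_reader, client_writer),
    )

async def main():
    server = await asyncio.start_server(handle, "0.0.0.0", 9000)
    async with server:
        await server.serve_forever()

asyncio.run(main())
```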
I’m running a single central Postgres server on Heroku and planning to use the Redis edges to cache.
A few questions, if I may:
> We run a mesh WireGuard network for backhaul, so in-flight data is encrypted all the way into a user application. This is the same kind of network infrastructure the good content delivery networks use.
Does it mean the backhaul is private and not tunneling through the public internet?
> fly.io is really a way to run Docker images on servers in different cities and a global router to connect users to the nearest available instance.
I use Cloudflare Workers and I find that at times they load-balance the traffic away from the nearest location [0][1] to some location halfway around the world, adding up to 8x the usual latency, which we'd rather not have. I understand the point of not running an app in all locations, especially for low-traffic or cold apps, but do you also "load-balance" traffic away to data centers with higher capacity? If so, is there documentation around this? I'm asking because for my use-case, I'd rather have the app running in the next-nearest location and not the least-load location.
> The router terminates TLS when necessary and then hands the connection off to the best available Firecracker VM, which is frequently in a different city.
Frequently? Are these server-routers running in more locations than data centers that run apps?
Out of curiosity, are these server-routers eBPF-based or dpdk or...?
> Networking took us a lot of time to get right.
Interesting, and if you're okay sharing more -- is it the anycast setup and routing that took time, or figuring out networking wrt the app/containers?
Thanks a lot.
[0] https://community.cloudflare.com/t/caveat-emptor-code-runs-i...
I definitely feel more confident about our Rust code. It's no silver bullet, but it prevents a lot of unsoundness with its compile-time guarantees.
I can't really compare to C++, but it's easy to write new code or refactor old code. It took some time to get there, though.
All in all, I would recommend Rust wholeheartedly. The ecosystem is growing and getting more mature every week. The community is very helpful in general, especially the tokio folks.
We're testing persistent storage privately with a few customers now and the results are exciting. My favorite is using minio as a private global s3 for caching.
What are you using for the index storage engine?
(Also I got permission denied when attempting to curl the script when writing to /usr/local/bin, I needed sudo. I'm on Ubuntu 19.10 Eoan Ermine. Not sure whether security implications for `curl | sh` outweigh convenience, but I trust you guys and my connection. :P)
Enterprise workloads are far more conservative than I am, those guys spend decades running the same servers. It's why they can focus on sales and customer success and rake in money, which is what actually puts food on the table for their kids.
The script just picks the binary for your OS/arch and puts it in your PATH. We have instructions for doing it yourself here: https://fly.io/docs/getting-started/installing-flyctl/#comma...
Or you can download straight from github: https://github.com/superfly/flyctl/releases
Hopefully we can get on snap soon!
For example in the Paris (eu-west-3) region, their availability zones are operated on hardware owned and managed by Telehouse, Interxion and Equinix.
I just assumed if they're creating their own chips, they're probably creating their own servers, datacenters, networks, etc. but I guess I shouldn't jump to conclusions.
Backhaul runs only through the encrypted tunnel. The Wireguard connection itself _can_ go over the public internet, but the data within the tunnel is encrypted and never exposed.
> I use Cloudflare Workers and I find that at times they load-balance the traffic away from the nearest location [0][1] to some location halfway around the world, adding up to 8x the usual latency, which we'd rather not have. I understand the point of not running an app in all locations, especially for low-traffic or cold apps, but do you also "load-balance" traffic away to data centers with higher capacity?
This is actually a few different problems. Anycast can be confusing and sometimes you'll see weird internet routes, we've seen people from Michigan get routed to Tokyo for some reason. This is especially bad when you have hundreds of locations announcing an IP block.
Server capacity is a slightly different issue. We put apps where we see the most "users" (based on connection volumes). If we get a spike that fills up a region and can't put your app there, we'll put it in the next nearest region, which I think is what you want!
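To be concrete about what that fallback means, here's an illustration of the behavior described (not our actual scheduler code): place a VM in the nearest region with spare capacity, spilling over to the next nearest.

```python
# Illustration only: nearest region with free capacity wins.
def place_vm(user_region, regions_by_distance, capacity):
    """regions_by_distance: regions ordered nearest-first for each user region.
    capacity: dict of region -> free VM slots."""
    for region in regions_by_distance[user_region]:
        if capacity.get(region, 0) > 0:
            capacity[region] -= 1
            return region
    raise RuntimeError("no capacity anywhere")

# Example: ewr (New Jersey) is full, so the VM lands in iad (Virginia).
capacity = {"ewr": 0, "iad": 3, "ord": 5}
nearest = {"nyc-user": ["ewr", "iad", "ord"]}
print(place_vm("nyc-user", nearest, capacity))  # -> "iad"
```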
CDNs are notorious for forcing traffic to their cheapest locations, which they can do because they're pretty opaque. We probably couldn't get away with that even if we wanted to.
> Frequently? Are these server-routers running in more locations than data centers that run apps?
We run routers + apps in all the regions we're in, but it's somewhat common to see apps with VMs in, say, 3 regions. This happens when they don't get enough traffic to run in every region (based on the scaling settings), or occasionally when they have _so much_ traffic in a few regions all their VMs get migrated there.
> Interesting, and if you're okay sharing more -- is it the anycast setup and routing that took time, or figuring out networking wrt the app/containers?
Anycast was a giant pain to get right, then WireGuard + backhaul were tricky (we use a tool called autowire to maintain WireGuard settings across all the servers). The actual container networking was pretty simple since we started with IPv6. When you have more IP addresses than atoms in the universe you can be a little inefficient with them. :)
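For the curious, "a little inefficient" looks roughly like carving a big prefix into generous per-app subnets. The fd00 ULA prefix below is made up for illustration, not our real addressing:

```python
# Rough idea: carve a /48 into per-app /112 subnets, leaving 2^16
# addresses per app. fd00:1234:5678::/48 is a made-up ULA prefix.
import ipaddress

block = ipaddress.ip_network("fd00:1234:5678::/48")
per_app_subnets = block.subnets(new_prefix=112)  # generator of /112s

app_subnet = {app: next(per_app_subnets) for app in ["app-a", "app-b", "app-c"]}
for app, subnet in app_subnet.items():
    # Hand each VM an address out of its app's subnet.
    first_vm_addr = next(subnet.hosts())
    print(app, subnet, first_vm_addr)
```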
(Also I owe you an email, I will absolutely respond to you and I'm sorry it's taken so long)
We're also a LIR and want to build an anycast network (we anonymize streaming data), any helpful resources you can share on this?
Cool product btw! I think this will be a very interesting area in the coming years, the fact that you offer Docker containers is a good USP as compared e.g. to Cloudflare workers, we might even consider using your service ourselves if you provide service in Europe (our customers are mostly in Germany)!
Not sure I understand the use case of a single Docker image in a city outside of your entire backend services, especially the DB. If your Docker image talks to something else on AWS / GCP for example you add a lot of latency using public routes.
It looks more like: https://workers.cloudflare.com/
One of the things we want to do, though, is make "boring" apps really fast. My heuristic for this is "can you put a Rails app on fly.io without a rewrite?".
Many of these applications add a caching layer. Normally if someone wants to make a Rails app fast, they'll start by minimizing database round trips and caching views or model data. If someone has already done this work, fly.io might just work for the app since we have a global Redis service (https://fly.io/docs/redis/).
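The pattern we mean, sketched in Python rather than Ruby (in Rails it's just `Rails.cache.fetch`); the REDIS_URL env var and the DB helper are placeholders:

```python
# Cache-aside sketch: check the nearby Redis first, fall back to the
# faraway primary database, then populate the cache with a TTL.
import json
import os

import redis

cache = redis.Redis.from_url(os.environ["REDIS_URL"])

def fetch_article_from_primary_db(article_id):
    # Stand-in for the real DB query against the primary in, say, Virginia.
    return {"id": article_id, "title": "placeholder"}

def get_article(article_id, ttl_seconds=60):
    key = f"article:{article_id}"
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)  # served from the nearby cache
    article = fetch_article_from_primary_db(article_id)  # slow, long round trip
    cache.setex(key, ttl_seconds, json.dumps(article))
    return article
```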
We have experimented with using CockroachDB in place of Postgres to get us even farther, but it doesn't work with most frameworks' migration tools.
We're also thinking of running fast-to-boot read replicas for Postgres, so people could leave their DB in Virginia but bring up replicas alongside their app servers.
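In app code that usually ends up as a simple read/write split; a rough sketch (the env var names are placeholders):

```python
# Writes go to the primary (e.g. Virginia), reads go to a nearby replica.
import os

import psycopg2

PRIMARY_DSN = os.environ["PRIMARY_DATABASE_URL"]                # far away, authoritative
REPLICA_DSN = os.environ.get("LOCAL_REPLICA_URL", PRIMARY_DSN)  # nearby, read-only

def run_query(sql, params=(), readonly=True):
    dsn = REPLICA_DSN if readonly else PRIMARY_DSN
    with psycopg2.connect(dsn) as conn:
        with conn.cursor() as cur:
            cur.execute(sql, params)
            return cur.fetchall() if readonly else None

# Reads hit the nearby replica; the occasional write pays the long round trip.
rows = run_query("SELECT id, title FROM articles ORDER BY id DESC LIMIT 10")
run_query("UPDATE articles SET views = views + 1 WHERE id = %s", (1,), readonly=False)
```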
If you've seen anyone do anything clever to "globalize" their database we're all ears.
Networks are crazy too, especially between continents, ownership of undersea cables is fascinating: https://en.wikipedia.org/wiki/Submarine_communications_cable
I'm guessing it's likely that this solution was adopted to quickly expand to a lot of countries due to data location and privacy questions being raised.
Building an anycast network is expensive. That's part of what we want to make accessible to devs. There are a couple of companies (like Packet, and possibly Vultr) you can lease servers from that will handle anycast. These tend to get you into the same ~16 regions, expanding past those can be difficult and even more expensive. That's what we're working on now.
The question I have though: how do you take advantage of the gains from this if you still need one master, strictly consistent DB for writes?
Would a system design pattern to take advantage of fly.io be to have read-only replicas on each geographic deploy, or to only have region-specific persistence? Apologies if this was already answered, I read through everything I saw. Thanks!
Read-only replicas are a great first step for most applications. I'd probably do caching first, then replicas (which are kind of like caching).
Region-specific persistence is one way to improve write latency, and I think the simplest for most apps. We've experimented with CockroachDB for this (it keeps ranges of rows where they're most busy), and you can actually deploy MongoDB this way.
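For the region-specific option, the shape is basically "home" each user to a region and key the database choice off that; a toy sketch (the DSNs are made up):

```python
# Illustration of region-specific persistence: each region owns the rows
# for users homed there, so writes stay local.
REGION_DSN = {
    "ewr": "postgres://ewr-db.internal/app",
    "ams": "postgres://ams-db.internal/app",
    "syd": "postgres://syd-db.internal/app",
}

def dsn_for_user(user):
    # A user's "home region" is decided once (e.g. at signup) and stored
    # with the account, so every later write lands in the same region.
    return REGION_DSN[user["home_region"]]

print(dsn_for_user({"id": 42, "home_region": "ams"}))
```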
- Our JS apps require customers to write new code to solve problems. That’s a tough sell for companies with existing code they need to make fast.
- The more people used JS apps, the more unexpected things they wanted to do: TensorFlow at the edge, audio and video encoding, game servers, etc. There was no way we could support any of that without moving down the stack.
- Reimplementing the service worker API was a slog we didn't want to continue. Deno is fantastic and we'd rather just run those apps than compete with it.
I like this; not having caps is a major problem with some of your competition for smaller projects/companies where the max caps are more important than availability.
I've heard of more than one project that made some mistake themselves wrt. code generating requests in their client, or had some other reason for insane usage spikes, causing them to basically go bankrupt in a matter of a few hours, not even days. (Though one project lucked out: if I remember correctly, Amazon bailed them out to prevent bad press.)
IMHO for the majority of server applications, availability is important but only up to a certain cost. After that, being unavailable for some time is better, even if you lose some customers. (Yes, as always, there are exceptions.)
One question: when I ran "flyctl deploy" it said "Docker daemon available, performing local build..."
If I turn off my local Docker, would it instead just upload the Dockerfile somewhere and perform the build for me?
If so, is there a way to force it to do that? I'd much rather upload a few hundred bytes of Dockerfile than build and push 100s of MBs of compiled image from my laptop.
It would make sense to be able to force that. Right now you'd have to stop Docker.
(Also I'm a huge fan of Django)
“fly.io is really a way to run Docker images on servers in different cities and a global router to connect users to the nearest available instance. We convert your Docker image into a root filesystem, boot tiny VMs using an Amazon project called Firecracker, and then proxy connections to it. As your app gets more traffic, we add VMs in the most popular locations.”
Exciting stuff! My best to you!
I had to read these parts in the doc to get how Fly solves the problem differently:
> Think of a world where your application appears in the location where it is needed. When a user connects to a Fly application, the system determines the nearest location for the lowest latency and starts the application there.
> Compare those Fly features with a traditional non-Fly global cloud. There you create instances of your application at every location where you want low latency. Then, as demand grows, you'll end up scaling-up each location because scaling down is tricky and would involve a lot of automation. The likelihood is that you'll end up paying 24/7 for that scaled-up cloud version. Repeat that at every location and it’s a lot to pay for.
Full disclosure: I am another YC founder. Fly did not ask me or encourage me to post this in any way.
I can't think of many back-end applications in between purely static content (just use a CDN) and needing a database connection. Probably video game servers, where you don't need the game state to be (immediately) stored/accessed globally.
Your quickstart being called speedrun is too good for you alone to have it, so I'm stealing it the next chance I get.
Either way congratulations Kurt & team! (and no I still have not bought that Safari 911 or any 911 for that matter and yes professional help is being sought).
How suitable do you think this is for CPU-intensive work? I'm interested in having servers for scientific-computational work, which would be rather CPU-heavy. It would be great to offload some of this to somewhere near the browser for the bits and pieces that want low latency.
I know locations are probably not super important to you guys as you see it as a starting point or something super flexible, but I always find myself drawn to the concrete stuff like that. Largely as a measure of how mature the product is.
Holding transactions open for long distance round trips is going to get you into trouble in a myriad of ways. It does not scale.
Instead, I'm trying to make sense of APIs to figure out what a product called Floozbobble.io does and it turns out that it's SaaS for making SaaS product factory factories, and I don't care about that, but then some other product called Dizmeple.cloud comes out that makes it easier for me to manage my database deployments, which I do care about, and I can't tell I would want it because it has no fucking description!
When did we start to prefer these crap marketing sites that take 12,000 spins on my scroll wheel to get through and still don't tell you anything?
We've been talking to a lot of startups doing communications tools, especially for remote work.
Lots of full-stack apps benefit from app servers + a Redis cache in different regions. They need a database connection, but if they've already done the work to minimize DB round trips they might just work with no code changes.
There are also a bunch of folks doing really dynamic video and image delivery. Where an individual user gets an entirely unique blob of binary data.
I have some questions about the pricing.
Say I want to use micro-1x with hard_limit/soft_limit = 20 and I get 40 concurrent requests for one hour. Would it cost $2.67 (micro-1x price) × 2 = $5.34 per month? If that is the case, can I set a limit on how many instances I want to run at most?
Another question: is the price calculated per second, or is it there just to compare with other services? If it's per second, since you don't fully scale to zero, should I plan on always having at least one VM active full time?
You're exactly right about the Heroku deploy. We convert your app's slug to a Docker image and launch the web process in it. DB & other dynos still run on Heroku.
We don't have any hard connection limits on the redis cache. It's usually not an issue anyway since apps are often distributed across many regions and many redis servers.
We had some minor reliability issues with the edge platform early on, but their support and responsiveness was excellent.
We are very happy with them!
That's not meant to be a snarky question, I genuinely don't understand what business problem that's going to be solved by saving at most 30 ms. Anything written in Rails/Django, talking to a DB, etc. is going to have request latency dominated by other parts of the stack.
It seems like no two apps have the same scaling needs, so if you have any questions or can't make something work let us know and we'll help!
They just updated it from being JS only to being able to run any Docker image. Cloudflare gives you a persistent key/value store and Fly provides a non-persistent Redis cache.
You don’t have to move your entire app but there are plenty of use-cases where you can move more logic to the edge.
Do you do anything to reduce latency from the edge to the database?
So you want to be able to upload your code and have someone manage the infrastructure and datastore.
Isn't that the definition of FaaS?
Nice. Here's their launch-hn: https://news.ycombinator.com/item?id=19163081
Also, depending on how tight the limits are for VM lifetime / bandwidth / outbound connections, I could see using these as a kind of virtual NIC / service mesh type thing for consumer-grade internet connections, to restore the inbound routing capabilities precluded by carrier-grade NAT and avoid their traffic discrimination, as well as potentially on-boarding to higher-quality transit as early as possible for use when accessing latency-sensitive services further 'interior' to the cloud.
They can also accept any kind of TCP traffic (and we're trialing UDP), so lots of interesting network services. This is especially interesting for people who want to do live video.
AND we have disks. So you can deploy Varnish, or nginx caches, etc. This is something we enable by hand per app.
The second example would be interesting to try. There's no real limit on VM lifetime or outbound connections, bandwidth is more of a budget problem. VMs are ephemeral, so they _can_ go away but we're all happier if they just run forever.
- image, video, audio processing near consumers
- game or video chat servers running near the centroid of people in a session
- server side rendering of single page js apps
- route users to regional data centers for compliance
- graphql stitching / caching (we do this!)
- pass-through cache for S3, or just minio as a global S3 (we do this! see the sketch at the end of this comment)
- buildkite / GitHub Action agents (we do this!)
- tensorflow prediction / DDoS & bot detection
- load balance between spot instances on AWS/GCP
- TLS termination for custom domains
- authentication proxy / api gateway (we do this!)
- IoT stream processing near devices
CloudFlare Workers is a fantastic serverless function runtime, but we're a layer lower than that. You can actually build it on top of fly.
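To make one of those use cases concrete, here's roughly what the S3 pass-through cache looks like as a tiny service. The upstream bucket URL is a placeholder, and a real one would forward content types and handle misses/errors properly:

```python
# Rough sketch of a pass-through cache for S3: serve objects from a local
# disk cache, fetching from an upstream bucket URL on a miss.
import os
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

UPSTREAM = "https://example-bucket.s3.amazonaws.com"  # placeholder
CACHE_DIR = "/tmp/s3cache"

class CacheHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        local_path = os.path.join(CACHE_DIR, self.path.lstrip("/").replace("/", "_"))
        if not os.path.exists(local_path):
            # Miss: pull the object from the upstream bucket once.
            os.makedirs(CACHE_DIR, exist_ok=True)
            with urllib.request.urlopen(UPSTREAM + self.path) as upstream, open(local_path, "wb") as f:
                f.write(upstream.read())
        with open(local_path, "rb") as f:
            body = f.read()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), CacheHandler).serve_forever()
```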
Our main Rails app does a bunch of things that can't run globally, and it takes a long time to build. The customer-facing app is a lightweight Rails app that consolidated several static and single-page React apps into one less gross place.
I'm guessing that's this? https://github.com/geniousphp/autowire
Looks like it uses Consul - is there a separate WireGuard net for Consul, or does Consul run over the internet directly?
We've gotten so many comments about the docs, I wish we could open source something but it's tightly coupled to other things that aren't useful to anyone else.
I've gotten several friends to switch and they've all said the same thing. If you haven't given it a shot yet, there's a simple 1-click Heroku-to-Fly deployment you can use to give them a shot.
You should also check out something like FaunaDB.
Don't host in such a way that you're paying for traffic... Hetzner, OVH and Packet all have dedicated servers where you don't pay for the traffic, inbound or outbound.
Edit: judging by other comments here, it seems like the US zone of Fly.io is in fact hosted on Packet, so they are probably not paying for the traffic themselves. Maybe they are using Hetzner for the EU zone (or OVH for that matter).
Pairing this with a global SQL database (*cough* CockroachDB *cough*) is literally the app platform I've been dreaming about.
I would like to see more documentation around push-based architectures. That is, I want to build a system where a process pushes to the Redis in Fly but is not itself running in Fly; basically something that may be unroutable for pulls.
In any case congrats fly team!
Do you consider zeit.co or netlify competitors? I saw in a comment that you're interested in making it dead easy to deploy a simple Rails app to the edge. These companies have gone deep on a different segment that's deploying web apps without DBs. Is your roadmap kind of routing around the JAMstack crowd, straight to supporting traditional full-stack apps? Seems much harder, but a more valuable prize if so.
As you said, they're both going deep for JavaScript apps (and doing an awesome job at it!) and we're focusing on being an awesome place to run full stack apps and exotic (non-http) servers.
We'd like to get network prices down, but we can't run our service on OVH or Hetzner.
I think it's misleading to say that deficiencies of Heroku have anything to do with AWS though. It's really, really easy to set up anycast [https://aws.amazon.com/global-accelerator/] with ECS [especially if you're willing to pay for Fargate]. If your product does something meaningfully different than that, I'd love to know more.
NB: I'm in no way affiliated with AWS or Heroku, just have experience with both in the past.
Your push example is interesting. We don't have a way to connect to redis from outside fly, but you could certainly boot up a tiny app on fly that acts as a proxy from external apps into the fly redis.
I know Stackpath has been offering this kind of thing for a while. So how would your product compare to theirs, since Stackpath already has a well-established CDN network?
Stackpath is an amalgamation of many acquired companies. Their CDN is fine but nothing special. The computing services aren't great. Not very competitive on price and have reliability and latency issues. Their cloud storage is white-labeled Wasabi. I wouldn't recommend them as the first choice for anything.
Zeit Now version 1 was also a run-your-own-container runtime, but that has been deprecated: https://zeit.co/docs/v1/getting-started/deployment#docker-de...
I understand this might be due to Stackpath being a larger company and owning hardware instead of renting it. But the price for traffic and compute seem to be cheaper. There is also no mention of how much you charge for storage on the pricing page.
I’m looking to deploy my next app onto one of these platforms and would like to know the price differences!
Couldn't one simply use a traditional CDN wherever their customers are, which would then allow inbound network requests to jump onto private network routing to wherever the app truly lives more quickly - essentially making for a more responsive "business logic" app feel? Assuming all infrastructure was on the same cloud provider, say AWS.
I understand this approach is less dynamic in nature but would have been a solution for the Ars Technica problem presented I feel. If not, what am I missing? Thanks!
The problem is that everything useful a server side application does still requires round trips. A TCP handshake, a TLS handshake, and the request itself each cost a round trip, so a couple hundred milliseconds of distance compounds quickly. Even for the most boring content, an 800ms delay is pretty normal if you have a spread out audience.
CPU-based pricing is pretty close. We've heard our CPUs are higher performance, but we haven't done any real testing. The people who run high-CPU apps on us _tend_ to pay less because we scale up and down so quickly.
I wish you and your team the best of luck!
We charge for certificates because the infrastructure to make SSL work (even when the certificates themselves are free) is complicated.
Managing certificate creation can be tricky, we have to deal with all kinds of edge cases (like mismatched A and AAAA records breaking validation). We also generate both RSA and ECDSA certificates, have infrastructure for ALPN validation, and a whole setup for DNS challenges.
And then we have to actually use them. We run a global Vault cluster to store certificates securely, and then cache them in memory in each of our router processes.
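The routers are Rust and the real path is more involved, but the shape of the lookup is simple enough to sketch: an in-memory cache keyed by hostname that falls back to the secret store on a miss.

```python
# Illustration only (not the real Rust/Vault code): a TTL'd in-memory
# certificate cache keyed by SNI hostname, falling back to a store.
import time

class CertCache:
    def __init__(self, store, ttl_seconds=300):
        self.store = store   # anything with a fetch(hostname) method
        self.ttl = ttl_seconds
        self._cache = {}     # hostname -> (expires_at, cert_bundle)

    def get(self, hostname):
        entry = self._cache.get(hostname)
        if entry and entry[0] > time.monotonic():
            return entry[1]  # warm: no trip to the secret store
        cert_bundle = self.store.fetch(hostname)  # cold: e.g. a Vault read
        self._cache[hostname] = (time.monotonic() + self.ttl, cert_bundle)
        return cert_bundle

class FakeStore:
    def fetch(self, hostname):
        return {"hostname": hostname, "rsa": "...", "ecdsa": "..."}

cache = CertCache(FakeStore())
cache.get("example.com")  # cold, hits the store
cache.get("example.com")  # warm, served from memory
```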
The developers who use the certificates the most love paying us to manage certs, and one person who posted in the comments here was able to replace an entire Kubernetes cluster they were using to manage certificates for their customers.
When Let’s Encrypt invalidated millions of certificates a few weeks ago, none of our customers even noticed. That’s what they’re paying us for.
[0] You are missing a FAQs page.
It is a filter. If it keeps you away from them, the filter worked? Fwiw, a younger me would find this proposition very attractive.
They plan to add many more capabilities wrt db: https://news.ycombinator.com/item?id=22619613
Writes... remain expensive.
I don't recall us being at Hack Arizona, certainly not me. I googled it and all it yielded was this HN post.
Your comment couldn't be further from the truth. I can't speak for whoever used these words (if they did), but I think we have pretty great work/life balance.
We all have families of our own and recognize they are far more important than our business. These things happen, such is life. Your kid gets sick, you want to care for them. Time off is always paid and we encourage people to take some. People often find it hard to take time off, but we've been good at it.
Nobody, generally, works more than 40 hours a week. I say "generally" because these past few weeks have been more intense given the end of our YC adventure, demo day, virtually meeting with investors and this Launch HN post. In normal times, I might work a few hours on a weekend but only if that brings me joy.
... and of course we're very flexible on work schedules because we're a remote-only company. Some weeks this might mean working only a few hours here and there because of life activities or the need to take time off. Other weeks, it might be the opposite. We recognize and embrace that.
The only 'real' solution is proper alerting, but even then it's pretty easy to rack up a bill of several thousand dollars before anyone realizes what's going on.
We use that KeyCDN test pretty frequently with different results.
Those TLS handshake times aren’t great, I think that was probably the first load from Vault on certificates. You should see most handshakes at <30ms on there.
If you ever come to SA, never partner with Localweb (the biggest/major local server provider); they are garbage.
If you want to try them out, you can create an app and then send an email to either me or support at fly.io and we'll turn them on for you.
Surprising since NodeJS routinely comes up as the fastest runtime in Lambda benchmarks, especially for cold-starts: https://levelup.gitconnected.com/aws-lambda-cold-start-langu...
Cloudflare Workers KV has the simplest model, with a central DB that transparently and eventually replicates read-only, hot data specific to a DC, but writes continue to incur a heavy penalty in terms of operations per second, cost, and latency.
In our production setup, we back Workers KV with a single-region, source-of-truth DynamoDB [1] and employ DynamoDB Streams to push data to Workers KV [2] (a rough sketch of the push step follows the list), that is,
Writes (control-plane): clients -> (graphql) DynamoDB -> Streams -> Workers KV
Reads (data-plane): clients -> Workers KV
Reads (control-plane): clients -> (graphql) DynamoDB
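The Streams → KV push step is just a small Lambda, roughly like the sketch below. The Cloudflare KV endpoint, env var names, and attribute names here are placeholders from memory, so double-check against the API docs before copying:

```python
# Shape of the DynamoDB Streams -> Workers KV push (illustration only).
import os
import urllib.request

ACCOUNT_ID = os.environ["CF_ACCOUNT_ID"]
NAMESPACE_ID = os.environ["CF_KV_NAMESPACE_ID"]
API_TOKEN = os.environ["CF_API_TOKEN"]

def kv_put(key, value):
    url = (f"https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}"
           f"/storage/kv/namespaces/{NAMESPACE_ID}/values/{key}")
    req = urllib.request.Request(url, data=value.encode(), method="PUT",
                                 headers={"Authorization": f"Bearer {API_TOKEN}"})
    urllib.request.urlopen(req)

def handler(event, context):
    # Lambda handler attached to the table's stream: mirror each new image
    # into Workers KV so reads can be served at the edge.
    for record in event["Records"]:
        if record["eventName"] in ("INSERT", "MODIFY"):
            image = record["dynamodb"]["NewImage"]
            key = image["pk"]["S"]              # assumes a string partition key named "pk"
            kv_put(key, image["payload"]["S"])  # assumes a string attribute "payload"
```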
[0] https://news.ycombinator.com/item?id=19307122
[1] We really should switch to QLDB once it supports Triggers.
[2] We do so mainly because we do not want to be locked into Workers KV, especially at its very nascent stage.
We got accepted into High Performance Transaction Systems (HPTS) last year for the innovations around CRDTs for strong eventual consistency (SEC) with low read and write latencies.
I'm trying to figure out how to provide a simple, lightweight way for fly.io users to use our global DB in their apps. It would allow a full stack to run at the edge, with the compute on fly.io and the data on Macrometa, either directly on fly.io or in a nearby PoP (same city). Will update.