Episode #2: more than you ever wanted to know about metered billing systems
News, tips, and behind-the-scenes technical mumbo jumbo

Hey friends! Welcome to episode #2 — the one where we share more than you ever wanted to know about metered billing systems.

Also, a big thank you for all the kind feedback and messages after our last newsletter. Real humans do write this, and we loved hearing from you. If you have questions, concerns, etc. (about Fly.io only; there are limits to our abilities), or you’ve built something you want us to know about, just reply to this email. We’re here for ya!

Product Updates

It’s been a busy few weeks over on our Fresh Produce site.

One of the big features we’ve shipped is suspend+resume for Fly Machines. Unlike a regular stop, a suspended machine will restore memory from a snapshot when it starts back up, rather than booting from scratch. This is a game-changer if you need sub-250ms auto-starts but have a chonky app that takes its sweet time booting up.

Here’s a few more highlights:

Inside Fly-Ball

It’s a billing bonanza over here (hold on to your seats). We finally have a new, totally customizable billing system set up. You’ll find its origin story below, but for now, all you need to know is that we think it’s great and are already turning out some nice improvements to how we bill for stuff.

So far we’re using it to a) make sure we don’t go out of business, by b) tying our pricing more closely to your specific needs, so you can make the call on how you want to use us (and therefore, how much you pay).

For example:

Partner News

We have two new extensions in public beta!

Finally, if you haven’t checked out our Sentry extension yet, you should take a look. All Fly.io organizations can create a new Sentry organization that includes a free 12 month trial of the Sentry Team Plan. Even our friends who write bug-free Rust code will find something helpful in here.

Fly.io For Real Life

You can catch our DevRel team live streaming this Friday (7/26) at 10am PDT, talking about how to pick a model and infrastructure for running AI workloads.

That’s not totally real life, we get it. Here’s where they’ll be truly IRL:

And Now A Word About Billing

Two gigantic engineering projects are “landing” at Fly.io this month. The first is Fly Machine Migrations, which you’ll read more about on our blog shortly. The second is billing.

When we launched Fly.io in 2020, we did what every other startup does: we hooked up Stripe. And so in the beginning Kurt said, “let there be billing!”, and there was founder-written code, and it was… fine, I guess.

See, Stripe has no problem dealing with normal startup billing problems. A $49/mo prorated per-seat license? Please. $99/mo with a $5.99/mo add-on? Of course Stripe can do recurring subscriptions. $49/mo plus $1 per widget? Yes! Stripe can do metered billing!

But Stripe cannot do Fly.io billing. At least not on its own.

Billing for Fly.io is hard in three directions:

  1. We meter at absurd granularity – by the CPU-second.
  2. Our metering is combinatorially explosive, because it’s broken out so many ways (in particular, regionally).
  3. We push events at insane frequency.

That last one is a killer.

When the billing people at Stripe think about SaaS startups, they’re thinking on the scale of a nice, comfortable neighborhood restaurant. Good tapas. 100 covers a night.

Fly.io looks more like McDonald’s. Good french fries, at industrial scale, in zillions of places around the world. Stripe nopes right out of that. We literally lost revenue: their billing API rate-limited us so hard that we had to drop events. We’re awesome, we know, but not so much that Stripe will make exceptions for us.

We’ve worked around this with aggregate stats. We couldn’t have Stripe directly tracking every Fly Machine you ran, so we just sum the seconds for all of them: 1 second of 4 Fly Machines gets recorded as 4 seconds. That’s functional, but not good!
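Here’s a toy sketch of what that kind of aggregation does to the data. The record shape, machine IDs, and region codes are made up for illustration; this is not the real pipeline, just the shape of the problem.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageSample:
    """One hypothetical usage record: a machine ran for some seconds."""
    machine_id: str
    region: str
    seconds: int


def aggregate_seconds(samples):
    """Collapse every sample into one total, dropping machine and region."""
    return sum(s.seconds for s in samples)


# Four machines, each running for one second.
samples = [
    UsageSample("machine-a", "ord", 1),
    UsageSample("machine-b", "ord", 1),
    UsageSample("machine-c", "gru", 1),
    UsageSample("machine-d", "gru", 1),
]

total = aggregate_seconds(samples)
print(total)  # 4
```

The total is correct, but once the samples are summed there’s no way to get back to which machine (or which region) contributed what.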

At the end of your billing period, you get an invoice. The invoice says how much we charged you, and, more importantly, why. With aggregated stats, you’d see things like charges for 9,972,448 seconds of compute time. 9,972,448? Sounds wrong! To which our support team would have to reply, “no, it’s right, you had X Fly Machines with Y cores running for Z hours”.
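To see why a raw seconds figure is so hard to sanity-check, here’s a quick back-of-the-envelope conversion. It assumes a 30-day billing period and ignores core counts entirely, so treat it as a rough illustration rather than how any invoice is actually computed.

```python
SECONDS_PER_HOUR = 3600
# Assumption: a 30-day billing period (2,592,000 seconds).
SECONDS_PER_30_DAYS = 30 * 24 * SECONDS_PER_HOUR

invoice_seconds = 9_972_448  # the figure from the invoice example

machine_hours = invoice_seconds / SECONDS_PER_HOUR
always_on_equivalent = invoice_seconds / SECONDS_PER_30_DAYS

print(f"{machine_hours:,.1f} machine-hours")
print(f"about {always_on_equivalent:.2f} machines running nonstop for 30 days")
```

Under those assumptions, the scary number works out to a little under four always-on machines, which is a lot easier to recognize than a seven-digit count of seconds.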

It was a total nightmare, and it cost us a lot of trust.

And this is just nuts and bolts stuff. There’s lots of extra stuff we’ve wanted to do! We’d love to dole out discounts like back-in-the-day Heroku, or give people $500 coupons for putting up with a slow hiring process. For a company like us doing metered billing, Stripe’s not set up to do that. We’d love to build features that assume a single source of billing truth, so we can give users visibility. Our aggregation workaround screwed all that up.

All this is to say, we did the thing our Moms told us never to do. We were confused, in the back seat of the Oldsmobile Cruiser on the way to 2nd Grade, why Mom kept lecturing us about not doing our own billing system. Now we know.

Our billing system is straightforward in theory. We’re working with a new metered billing provider. If you’re familiar with Prometheus metrics, you’ve got a good mental model: we send them raw usage data according to some schema, apply some transformations, then aggregate and match it to a rate. Poof, a billable! This is the dream.
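If it helps to see that flow spelled out, here’s a minimal sketch of the raw usage, transform, aggregate, match-to-a-rate pipeline. Everything in it is hypothetical: the event fields, the meter name, the region codes, and especially the rates, which are placeholder numbers and not real prices. It’s also not the provider’s API, just the general idea.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical raw usage events; the field names are illustrative only.
raw_events = [
    {"meter": "cpu_seconds", "region": "ord", "machine_id": "m1", "value": 120},
    {"meter": "cpu_seconds", "region": "ord", "machine_id": "m2", "value": 60},
    {"meter": "cpu_seconds", "region": "gru", "machine_id": "m3", "value": 30},
]

# Hypothetical per-unit rates keyed by (meter, region). Not real prices.
RATES = {
    ("cpu_seconds", "ord"): Decimal("0.000010"),
    ("cpu_seconds", "gru"): Decimal("0.000015"),
}


def transform(event):
    """Normalize an event into a (grouping key, quantity) pair."""
    return (event["meter"], event["region"]), event["value"]


def aggregate(events):
    """Sum quantities per (meter, region) key."""
    totals = defaultdict(int)
    for event in events:
        key, quantity = transform(event)
        totals[key] += quantity
    return dict(totals)


def to_billables(totals, rates):
    """Match each aggregated quantity to its rate to produce line items."""
    return [
        {
            "meter": meter,
            "region": region,
            "quantity": quantity,
            "amount": quantity * rates[(meter, region)],
        }
        for (meter, region), quantity in sorted(totals.items())
    ]


billables = to_billables(aggregate(raw_events), RATES)
for item in billables:
    print(item)
```

Because the grouping key keeps the region, each line item can carry its own rate, and the invoice can say where the usage happened instead of reporting one giant total.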

Remember the Programmer’s Credo: “we do these things not because they are easy, but because we thought they would be easy”. We unearthed corner cases in our subscription tracking we didn’t know about. We unearthed corner cases in our billing provider’s data model they didn’t know about. We managed to generate tens of thousands of dollars of API bills exporting draft invoices nobody used (we got this credited back). We sent some customer email blasts we didn’t expect to send.

But we got there! We think! And we’re in a much better place today.

We’re not a typical SaaS company, but basic business physics certainly apply to us. A good rule of thumb for SaaS margins is “70%”. If we charge you $1, no more than $0.30 of it should be going toward covering our costs.
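The arithmetic behind that rule of thumb, as a tiny sketch. The function names are ours for illustration, and the 70% target is just the rule of thumb from the text, not a statement of actual margins.

```python
from decimal import Decimal


def max_cost_for_margin(price, target_margin):
    """Largest cost that still hits the target gross margin at a given price."""
    return price * (Decimal(1) - target_margin)


def gross_margin(price, cost):
    """Gross margin as a fraction of price."""
    return (price - cost) / price


print(max_cost_for_margin(Decimal("1.00"), Decimal("0.70")))  # 0.3000
print(gross_margin(Decimal("1.00"), Decimal("0.30")))  # 0.7
```

Run it the other way and the problem with flat pricing shows up fast: if costs in one region are higher than that ceiling, the same price yields a thinner margin there.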

To make those numbers work with our old billing contraption, we have to do silly things. For instance: Fly Machine compute time costs the same in every region, even though Brazil absolutely stomps us with up-front taxes any time we rack a machine. Having flat compute pricing is great if you’re running solely in São Paulo! But it’s not such a great deal if you’re running in Chicago and cross-subsidizing the Brazil apps.

We’re still using Stripe for what it’s good at: payment processing runs through Stripe. Stripe does our fraud detection. Stripe handles our refunds.

But the rest of it is us, which means we can do:

The thing about this situation is, as painful as it was getting here, it probably worked out as well as it possibly could have. We made the right call limping along with Stripe as long as we did. If we had built a billing system in 2022, it would have been wrong (2023 was a big learning year for us).


Phew, you made it! We hope you enjoyed reading about billing as much as we enjoyed working on it. 😬

Stay classy, devs.

— The Fly.io Team

