Blogmark

Rate Limiting Using the Token Bucket Algorithm

via jbranchaud@gmail.com

https://en.wikipedia.org/wiki/Token_bucket
Software Development Rate Limiting

I was curious what it looked like to do metered access to a resource. Commonly, when people talk about this topic, that resource is their own API, which has a throughput ceiling. I was coming at this from a different angle: an app that internally makes third-party API calls that are billed per request. In that scenario I'd like to implement some level of spend control so that I don't wake up to a huge bill.

The Token Bucket Algorithm appears to be one common answer to this problem.

Each consumer of the app/API (a user, say) has a bucket of tokens, and each token can be redeemed for one access to the limited resource. Each bucket can only hold so many tokens, and you decide how to meter refilling it. I think the typical approach is to periodically (e.g. "every X seconds") add a token to each bucket that isn't full. The other extreme would be requiring a user to manually "refill" the bucket, i.e. recharge their account with more credits. You might even mix in a heuristic guided by some global value like "% of max spend for the period."
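To make that concrete for myself, here is a minimal sketch of the periodic-refill version in Python. None of this comes from the linked article: the class name `TokenBucket`, the methods `try_consume` and `add_tokens`, and the `capacity` / `refill_interval` parameters are all names I made up for illustration. Instead of a background timer, it assumes refills can be computed lazily from elapsed time whenever the bucket is checked.

```python
import time
from typing import Callable


class TokenBucket:
    """Hypothetical single-consumer bucket; one token buys one request."""

    def __init__(
        self,
        capacity: int,
        refill_interval: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.capacity = capacity                # max tokens the bucket can hold
        self.refill_interval = refill_interval  # seconds per added token
        self.clock = clock                      # injectable so it can be tested
        self.tokens = capacity                  # assumption: buckets start full
        self.last_refill = clock()

    def _refill(self) -> None:
        # Add one token per whole interval elapsed, never exceeding capacity.
        elapsed = self.clock() - self.last_refill
        new_tokens = int(elapsed // self.refill_interval)
        if new_tokens > 0:
            self.tokens = min(self.capacity, self.tokens + new_tokens)
            self.last_refill += new_tokens * self.refill_interval

    def try_consume(self) -> bool:
        """Spend one token if available; return False when the bucket is empty."""
        self._refill()
        if self.tokens > 0:
            self.tokens -= 1
            return True
        return False

    def add_tokens(self, count: int) -> None:
        """Manual 'recharge' path, still capped at capacity."""
        self.tokens = min(self.capacity, self.tokens + count)
```

In an app, I imagine keeping one bucket per consumer (for example, a dict keyed by user ID) and calling `try_consume()` before each paid third-party request, skipping or deferring the call when it returns `False`.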

I like the carnival analogy for the Token Bucket Algorithm described here: https://www.krakend.io/docs/throttling/token-bucket/#a-quick-analogy