How to Keep Your Services Secure With Traefik’s Rate Limiting

The internet can be a challenging environment for running applications.
When you expose a service to the public internet, it's crucial to assess the risks involved. Malicious actors may try to misuse your resources or even bring your service down.
You can't just make your service publicly available without protection and expect things to go smoothly. They won’t. That’s why safeguarding your service is essential. There are many threats to address. Configuring your server properly is a good start, but you also need to protect your application from harmful actions. Our Web Application Firewall (WAF) can assist with that.
However, today we're focusing on a different threat: attempts to overwhelm your resources or disrupt your service with an excessive number of requests. This is where rate limiting comes in.
What is Rate Limiting?
Rate limiting is the process of controlling the flow of requests that reach your servers. Think of it like a funnel, where a large pipe of water narrows into a smaller one, flowing at a much more manageable rate before reaching its destination:

In this analogy, each water molecule represents an HTTP request. A pipe of a certain size limits how many molecules can pass through at once. When the pipe narrows, fewer molecules can flow through, reducing the flow rate.
This illustrates how a rate limiter works to control the flow of requests. Additionally, we can fine-tune the control by limiting traffic based on characteristics like IP address, user, or other request details. You might also want to allow brief bursts of traffic without blocking them entirely.
When discussing rate limiting, the following algorithms are commonly mentioned:
- Token Bucket: The one used by Traefik, which we'll focus on below.
- Leaky Bucket: A first-in, first-out queue that releases traffic at a steady rate. It's less flexible than Token Bucket for handling traffic bursts.
- Generic Cell Rate Algorithm (GCRA): Similar to Leaky Bucket but ensures packets follow a set timing interval instead of draining at a fixed rate.
Two other types of algorithms often come up:
- Sliding Window
- Fixed Window
Unlike the algorithms above, which regulate the flow and timing of requests, these two count the total number of requests within a specific period. They're better suited to enforcing strict usage quotas than to shaping request flow.
Token Bucket
The Token Bucket algorithm controls the flow of requests by using a metaphorical bucket that holds tokens. Tokens are generated at a constant rate and added to the bucket, which has a fixed capacity.
When a request arrives, the system checks if there are enough tokens in the bucket. Each request consumes one token. If enough tokens are available, the request proceeds, and a token is removed. If there aren’t enough tokens, the request is either delayed until more tokens are available or blocked if the delay would be too long.
The bucket can't hold more tokens than its maximum capacity, so once it's full, any new tokens are discarded. This prevents tokens from accumulating indefinitely and allows the system to handle short bursts of high traffic, as long as the bucket has enough capacity.
When the bucket is low or empty, the system slows down, giving time for more tokens to be generated.

The behavior of the Token Bucket algorithm is determined by two key factors:
- Bucket size: This defines how large a burst of requests the system can absorb at once.
- Rate: This controls how frequently new request opportunities become available as tokens are generated.
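The mechanics described above can be sketched in a few lines. This is an illustrative implementation, not Traefik's actual code; the names `rate` and `capacity` map to the two factors just listed.

```python
import time

class TokenBucket:
    """Illustrative token bucket: `rate` tokens/second, holding at most `capacity`."""

    def __init__(self, rate: float, capacity: int):
        self.rate = rate                 # token generation rate (tokens per second)
        self.capacity = capacity         # bucket size: tokens above this are discarded
        self.tokens = float(capacity)    # start with a full bucket
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill at the constant rate, capped at the bucket's capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1             # each request consumes one token
            return True
        return False                     # bucket empty: delay or reject the request

# A bucket of capacity 3 absorbs a burst of 3 requests, then rejects until it refills.
bucket = TokenBucket(rate=10, capacity=3)
results = [bucket.allow() for _ in range(5)]
```

Because the bucket starts full, a short burst passes immediately; sustained traffic is paced at `rate` requests per second.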
Rate Limiting in Traefik
Traefik allows you to define a RateLimit middleware and attach it to your routers. Assigning this middleware ensures that the flow of incoming requests doesn't exceed the configured rate.
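As a minimal sketch using the file provider, the middleware's `average` option sets the sustained request rate and `burst` sets the bucket size; the middleware, router, and host names below are placeholders:

```yaml
# Traefik dynamic configuration (file provider) — illustrative values.
http:
  middlewares:
    my-ratelimit:
      rateLimit:
        average: 100   # allow an average of 100 requests per second
        burst: 50      # absorb short bursts of up to 50 requests

  routers:
    my-router:
      rule: "Host(`example.com`)"
      service: my-service
      middlewares:
        - my-ratelimit
```

With this in place, traffic to the router is paced by the token bucket described above, and excess requests receive an error response instead of reaching your service.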