How to Size Your Traefik Hub API Gateway Instances

When onboarding new customers, one question we hear frequently is what resource configuration to set for a Traefik Hub API Gateway instance. Unfortunately, there is no one-size-fits-all answer: several factors need to be considered, such as the infrastructure type, the traffic volume, and how Traefik Hub API Gateway will be used.
Let’s explore why these factors are crucial and how to customize the Traefik Hub API Gateway deployment accordingly.
Define the size
From the tests we have run internally, we know that a single Traefik Hub API Gateway instance (installed on a machine with 8 CPUs and 16 GB of RAM) can route over 72,000 HTTP requests per second (RPS), and more with larger configurations, when no security or other processing is applied to the requests. With the introduction of the experimental fastProxy option, performance can even reach up to 102,000 RPS with the same configuration. This suggests that, provided you allocate a sufficiently powerful machine, RPS should not be your primary concern in production, at least in theory.
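If you want to experiment with that option, it is enabled in Traefik's static configuration. Below is a minimal sketch, assuming a Traefik Hub API Gateway version built on a Traefik v3 release that ships the experimental fastProxy option; verify the exact option name against the documentation for your version.

```yaml
# Static configuration sketch (e.g., traefik.yml): enable the experimental
# fastProxy option. Availability and exact option names depend on your
# Traefik Hub API Gateway version; check the documentation before relying on it.
experimental:
  fastProxy: {}
```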
In practice, the maximum RPS depends heavily on the type of traffic being managed. For example, if your Traefik Hub API Gateway instance is handling TLS connections or utilizing middleware (such as header management or rate limiting), the RPS capacity will decrease.
Additionally, the higher the RPS you want a single Traefik Hub API Gateway instance to handle, the larger, and the more expensive, the machine needs to be. While such machines offer high performance, they come at a high cost and with a poor return on investment (ROI): you likely won't use their full capacity most of the time, but you'll pay for them regardless.
On top of that, a key best practice is to favor scaling out (horizontally adding more smaller instances) over scaling up (vertically increasing the resources of a single instance). This approach not only makes your infrastructure more scalable but also more resilient: you reduce the risk of a Single Point of Failure (SPOF) and ensure that your system can handle traffic spikes more effectively.
For these reasons, we recommend deploying multiple smaller instances rather than a few powerful ones. The exact number and size of instances depend on your infrastructure setup, as we will see below.
Define the number
When Traefik Hub API Gateway is deployed within an orchestrator (like Kubernetes), sizing becomes more straightforward. We recommend deploying enough instances (each with 2 vCPUs and 4 GB of RAM) to handle up to 3,000 RPS per instance under normal traffic conditions.
The key factor here is observability: in such an environment, monitor your containers or pods and make sure they scale out when resource consumption reaches 70% of CPU or RAM capacity, then scale back down when traffic returns to normal.
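As a minimal sketch, assuming a Deployment named traefik-hub in a traefik namespace (the names and the image reference are placeholders), this is roughly what the sizing and scaling rules above look like in Kubernetes: resource requests matching the 2 vCPU / 4 GB recommendation, plus a HorizontalPodAutoscaler targeting 70% CPU and memory utilization.

```yaml
# Resource sizing on a hypothetical traefik-hub Deployment:
# requests reflect the 2 vCPU / 4 GB recommendation per instance.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: traefik-hub
  namespace: traefik
spec:
  replicas: 2
  selector:
    matchLabels:
      app: traefik-hub
  template:
    metadata:
      labels:
        app: traefik-hub
    spec:
      containers:
        - name: traefik-hub
          # Placeholder image reference; use the image from the official
          # installation instructions for your version.
          image: ghcr.io/traefik/traefik-hub:latest
          resources:
            requests:
              cpu: "2"
              memory: 4Gi
            limits:
              memory: 4Gi
---
# Scale out at 70% CPU or memory utilization, back down when traffic normalizes.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: traefik-hub
  namespace: traefik
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: traefik-hub
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 70
```

Adjust minReplicas and maxReplicas to your expected traffic; the 70% target simply gives the autoscaler time to add pods before instances saturate.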
However, despite Traefik Hub API Gateway being a cloud-native Ingress Controller primarily designed for container environments, it performs admirably on traditional bare-metal infrastructure as well.
In these environments, the challenge is finding the right balance between avoiding a Single Point of Failure (SPOF) and minimizing over-provisioning of instances. Deploying multiple small VMs with limited CPU and memory resources can handle normal traffic loads, but this approach may fall short during sudden traffic spikes if the combined RPS capacity is insufficient.
On the other hand, using a few larger VMs and scaling them up is both time-consuming and operationally intensive compared to simply adding replicas in a Kubernetes deployment. This makes it crucial to anticipate traffic peaks, ensuring each machine is capable of handling unexpected spikes and that scaling occurs promptly.
For bare-metal environments, we recommend deploying medium-sized VMs that can each reliably handle around 50% of peak traffic, providing sufficient capacity while maintaining resilience. For example, two such VMs together cover a full peak, and a third keeps that capacity available even if one of them fails.
Real-Life Example: Traefik Hub API Gateway in Kubernetes
In today’s landscape, securing API access is non-negotiable. Therefore, it's important to configure the Traefik Hub API Gateway in a way that accounts for the additional latency introduced by the TLS handshake and JWT authentication verification.
The use of the headers middleware is also common, as it allows you to pass extra information to your backend services. However, adding headers introduces additional latency, which is why we include it in this example.
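To make this concrete, here is a minimal sketch using the Kubernetes CRD provider. It assumes a jwt-auth middleware already defined according to the Traefik Hub API Gateway documentation (the name is hypothetical), and placeholder namespace, host, Secret, and service names: the IngressRoute terminates TLS with a certificate stored in a Secret and chains the JWT and headers middlewares.

```yaml
# Headers middleware: inject an extra request header for the backend.
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: add-headers
  namespace: apps
spec:
  headers:
    customRequestHeaders:
      X-Gateway: "traefik-hub"        # illustrative header
---
# HTTPS route: TLS termination + JWT authentication + header injection.
# The jwt-auth middleware is assumed to be defined elsewhere, following
# the Traefik Hub API Gateway documentation.
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: my-api
  namespace: apps
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`api.example.com`)
      kind: Rule
      middlewares:
        - name: jwt-auth              # hypothetical JWT middleware
        - name: add-headers
      services:
        - name: my-api                # placeholder backend Service
          port: 80
  tls:
    secretName: api-example-com-tls   # certificate stored in a Kubernetes Secret
```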
Finally, gathering metrics for observability is crucial for any production platform. Exposing traffic metrics through OpenTelemetry allows you to monitor and gain insights into system performance, but it too has an impact on Traefik Hub API Gateway’s overall performance.
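In Traefik's static configuration, pushing metrics to an OpenTelemetry collector looks roughly like the sketch below; the collector endpoint is a placeholder, and the exact option names should be verified against your Traefik Hub API Gateway version.

```yaml
# Static configuration sketch: export metrics to an OpenTelemetry collector.
# The endpoint is a placeholder; verify option names for your version.
metrics:
  otlp:
    http:
      endpoint: http://otel-collector.observability:4318/v1/metrics
```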
Let's now apply the sizing recommendations we discussed earlier to two typical use cases with different requirements:
Up to 1,000 RPS per Instance
This scenario represents a common need for exposing internal APIs. Here, the traffic volume is moderate, with each Traefik Hub API Gateway instance handling up to 1,000 RPS.
- Instance Sizing: For this scenario, we recommend deploying several small instances (e.g., 1-2 vCPUs and 1-2 GB of RAM). Traefik Hub API Gateway can scale horizontally, allowing you to start with a modest setup and add instances as traffic grows.
- Middleware Impact: Even though the traffic is internal, enabling TLS for secure communication and using the JWT authentication middleware (as well as other middleware or operations such as access logging) will introduce some overhead. Injecting headers also adds to the processing time, but at this scale, the latency impact should remain manageable.
- Scaling Strategy: Ensure autoscaling triggers when resource usage (CPU, memory) hits 80%, and scale down once traffic normalizes. A few well-sized instances should be enough to handle this load efficiently while maintaining cost-effectiveness.
- Recommendation: Start with 2-5 replicas to ensure resiliency, and monitor performance to adjust as needed; the Helm values sketched after this list show one possible starting point.
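As that starting point, the values below sketch this scenario, assuming the standard Traefik Helm chart layout for resources and autoscaling; value names may differ between chart versions, so check the chart you actually deploy.

```yaml
# Helm values sketch for the "up to 1,000 RPS per instance" scenario.
# Value names assume the standard Traefik Helm chart layout; verify them
# against the chart version you deploy.
resources:
  requests:
    cpu: "1"
    memory: 1Gi
  limits:
    memory: 2Gi

autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80
```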
1,000 to 5,000 RPS per Instance
This scenario is geared toward more demanding use cases, such as exposing APIs to external partners, customers, or commercial websites. With traffic ranging from 1,000 to 5,000 RPS per instance, the need for scalability and performance is more critical.
- Instance Sizing: For this higher traffic volume, each Traefik Hub API Gateway instance should have more resources—typically around 2-4 vCPUs and 4-8 GB of RAM per instance. However, instead of using a few large machines, it’s best to deploy multiple medium-sized instances to balance performance and cost.
- Middleware Impact: As with internal API exposure, TLS termination and JWT Authentication will still introduce latency, but at this higher traffic rate, the impact is more pronounced. Ensure the resources allocated to each instance are sufficient to handle the added load from security measures and header management. Also, exporting metrics in OpenTelemetry format will increase the processing load, so factor that into resource allocation.
- Scaling Strategy: For this scenario, it is essential to configure autoscaling carefully to handle traffic spikes effectively. Ensure your system scales before traffic exceeds 80% of your instance capacity to avoid bottlenecks, and consider pre-emptively adding extra replicas during anticipated peak periods (e.g., during product launches or promotional events); see the autoscaler behavior sketch after this list.
- Recommendation: Start with 4-10 replicas to handle external traffic spikes effectively, ensuring a minimum of 20% buffer capacity for peak periods.
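One way to make sure the autoscaler reacts before instances saturate is the behavior section of an autoscaling/v2 HorizontalPodAutoscaler. The sketch below, again assuming a hypothetical traefik-hub Deployment, lets scale-up double the replica count quickly while smoothing scale-down over a longer window.

```yaml
# Sketch: fast scale-up and slow scale-down for spiky external traffic.
# The traefik-hub and traefik names are hypothetical placeholders.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: traefik-hub
  namespace: traefik
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: traefik-hub
  minReplicas: 4
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 0     # react immediately to spikes
      policies:
        - type: Percent
          value: 100                    # allow doubling the replica count per minute
          periodSeconds: 60
      selectPolicy: Max
    scaleDown:
      stabilizationWindowSeconds: 300   # wait before removing pods
      policies:
        - type: Pods
          value: 1                      # remove at most one pod every two minutes
          periodSeconds: 120
```

Raising minReplicas ahead of a known event (a product launch, a promotion) is also a simple, low-risk way to pre-provision the extra buffer mentioned above.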
More than 5,000 RPS per Instance
For scenarios where APIs are exposed to high-traffic environments, such as public-facing commercial websites or services with millions of users, a more specialized setup is needed:
- Instance Sizing: Deploy large instances with at least 4-8 CPUs and 8-16 GB of RAM. You might also consider using high-performance nodes or dedicated bare-metal instances if the traffic is extremely high.
- Scaling Strategy: Each instance should be optimized to handle more than 5,000 RPS. You may need to use both horizontal scaling (increasing instance count) and vertical scaling (boosting instance capacity) to meet traffic demands.
- Middleware Impact: At this level, even small latencies introduced by TLS handshakes, JWT, header manipulation, and metrics collection can add up quickly. It’s critical to identify each bottleneck that slows down the traffic (using Tracing) to optimize each layer of the stack for performance, and potentially offload some tasks to specialized systems (e.g., a dedicated authentication service).
- Recommendation: Start with at least 10-20 replicas and monitor usage closely to ensure a quick reaction to traffic spikes. For very high loads, consider deploying Traefik Hub API Gateway in a high-availability setup, with global load balancing across multiple regions or data centers.
Conclusion
Properly sizing Traefik Hub API Gateway instances is key to ensuring secured API publication, whether you're dealing with internal or external traffic. By understanding the specific needs of your environment—such as traffic volume, middleware usage, and infrastructure type—you can deploy a cost-effective and scalable Traefik Hub API Gateway configuration that meets your performance goals.
Regardless of the scenario, monitoring resource consumption and scaling based on usage thresholds (like 80% of CPU or RAM) is crucial to avoiding performance bottlenecks. By following these guidelines, you’ll ensure your Traefik Hub API Gateway deployment is both resilient and optimized for your specific use case.
Useful Links
- Traefik API Gateway Documentation
- Traefik API Gateway Webpage
- Our Community Forum