Service Middlewares: A New Composition Layer In Traefik v3.7

For most of Traefik’s existence, every middleware has been attached to a router. That model works well for what it was built for: access control, rate limiting, CORS, redirects—things that describe the route’s policy. But some transformations have nothing to do with the route. They exist because a backend has its own contract, and every request reaching that backend needs to honor it. With one router pointing to one dedicated service, you can fudge that by hanging backend-contract middlewares off the router. The model breaks the moment a service is shared by several routers, or composed from several backends.

The composite case makes this structural. As soon as you reach for composite Traefik services (weighted splits for canary deployments, mirroring for shadow validation, failover between heterogeneous upstreams), a single router middleware can no longer express what you need. If you want to add a header to the canary branch only, a router middleware applies it to both branches. If you want to strip credentials from mirrored traffic, a router middleware strips them from the primary too. And if you want each side of a failover to carry its own upstream conventions, you have only one middleware chain, and it runs before the service is selected.

These aren’t edge cases you can work around with better routing rules. They’re things the router-middleware model simply cannot express.

Traefik v3.7 adds a second layer. Middlewares can now live on the service itself, executing after the routing decision and, crucially, after the service-selection decision in composite services. This post walks through the scenarios where that matters.

The Two-Layer Model

The historical pipeline was straightforward:

Request → Route match → Router middlewares → Service → Backend

Service middlewares add a second execution point:

Request → Route match → Router middlewares → Service → Service middlewares → Backend

The rule of thumb fits in two sentences: If the middleware concerns who can access or how the request arrives, it belongs on the router. If it concerns how the backend receives the request, it belongs on the service.

Authentication (user-facing), rate limiting meant to protect a route, and CORS are router middlewares. Backend-specific header adaptation, credential injection for an upstream API, and path rewriting that matches a backend’s expectations are service middlewares. The split is almost always obvious once you ask which layer owns the concern.

Three things follow from this model. First, the Middlewares field sits on the Service type itself, so it’s available uniformly whether the service is a plain LoadBalancer, a Weighted, a Mirroring, or a Failover. Second, in composite services (Weighted, Mirroring, Failover), service middlewares on a child execute only on the branch that actually receives the request, which is what makes every scenario below work. Third, the Kubernetes Gateway API expresses the exact same idea through backendRefs[].filters; what Traefik calls a service middleware, Gateway API calls a backendRef filter.

One practical note on where middlewares attach. In the Kubernetes CRD and the Gateway API, you can nest middlewares inline on a child of a composite (a weighted entry, a mirror, a failover branch). In the file provider, a weighted child or a mirror entry is just a reference with no inline middleware field, so per-branch behaviour is expressed by making the child a standalone service that carries its own middlewares. Same outcome, one level of indirection. Scenario 3 below uses this form.
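As a rough sketch of that indirection for a weighted split in the file provider: the weighted entries stay plain references, and the canary child is a standalone service carrying its own middleware. The service names and backend URLs here are hypothetical, and the placement of the middlewares field follows the file-provider form used in Scenario 3.

```yaml
http:
  services:
    api:
      weighted:
        services:
          # Plain references: no inline middlewares field here
          - name: api-stable
            weight: 90
          - name: api-canary
            weight: 10

    api-stable:
      loadBalancer:
        servers:
          - url: "http://api-stable.internal"   # hypothetical URL

    api-canary:
      loadBalancer:
        servers:
          - url: "http://api-canary.internal"   # hypothetical URL
      # Runs only for requests the weighted service sends to this child
      middlewares:
        - add-feature-flag

  middlewares:
    add-feature-flag:
      headers:
        customRequestHeaders:
          X-Feature-Flag: "new-engine"
```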

Scenario 1: Canary with Per-Backend Transformation

You run a Weighted service that sends 90% of traffic to the stable version of your API and 10% to a canary. The canary was built against a newer internal contract and expects an extra header (let's say X-Feature-Flag: new-engine) that the stable backend doesn’t recognise and would silently ignore (or, worse, reject).

Attaching X-Feature-Flag to the router adds the header to both branches. Removing it from the router and writing application logic to forward it conditionally moves the problem somewhere else. Duplicating the router for each variant defeats the purpose of weighted traffic in the first place.

With service middlewares, the transformation lives on the canary child only:

apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: add-feature-flag
spec:
  headers:
    customRequestHeaders:
      X-Feature-Flag: new-engine
---
apiVersion: traefik.io/v1alpha1
kind: TraefikService
metadata:
  name: api
spec:
  weighted:
    services:
      - name: api-stable
        port: 80
        weight: 90
      - name: api-canary
        port: 80
        weight: 10
        middlewares:
          - name: add-feature-flag

The stable branch is untouched. The canary branch fires the middleware after the weighted decision has been made, so it only applies to the 10% that actually reach the canary backend. Roll the canary back by flipping the weights to 100 / 0 and there's no middleware to remove, no route to change.

The Gateway API expresses this with its own vocabulary. The same idea, “apply this header modification to this backend only,” becomes a filter on a backendRef:

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: api
spec:
  rules:
    - backendRefs:
        - name: api-stable
          port: 80
          weight: 90
        - name: api-canary
          port: 80
          weight: 10
          filters:
            - type: RequestHeaderModifier
              requestHeaderModifier:
                set:
                  - name: X-Feature-Flag
                    value: new-engine

Traefik v3.7 supports RequestHeaderModifier, ResponseHeaderModifier, RequestRedirect, and URLRewrite on backendRefs[].filters, plus ExtensionRef for Traefik-native middlewares. If you already standardise on Gateway API, this is the native spelling and you get per-backend transformations without leaving the spec. If you’re on the Traefik CRDs, the TraefikService version above gives you the same behaviour. Same idea, two spellings.
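For the ExtensionRef case, a backendRef filter might point at the add-feature-flag Middleware defined earlier. This is a sketch only: the group and kind values are assumed to match the ones Traefik uses for rule-level ExtensionRef filters, so check the Gateway API provider reference before relying on them.

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: api
spec:
  rules:
    - backendRefs:
        - name: api-stable
          port: 80
          weight: 90
        - name: api-canary
          port: 80
          weight: 10
          filters:
            # Assumed shape: references a Traefik Middleware in the same namespace
            - type: ExtensionRef
              extensionRef:
                group: traefik.io
                kind: Middleware
                name: add-feature-flag
```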

Scenario 2: Mirroring with Redaction

Traffic mirroring is how you validate a new implementation, a shadow analytics pipeline, or a benchmarking rig against real requests without risking production. Traefik’s Mirroring service duplicates incoming requests to one or more mirror backends while the primary response is what the client actually sees.

The problem is that mirrored requests are identical copies of the originals. Every Authorization header, every session cookie, every API key your primary backend receives also ends up on the mirror. If the mirror is run by a different team, lives in a sandbox environment, or is instrumented with verbose logging, you’ve just extended the trust boundary of your production secrets to places it probably shouldn’t reach.

Router middlewares can’t help: stripping Authorization from the router would also strip it from the primary backend and break production auth. Service middlewares, placed on the mirror branch only, solve it:

apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: scrub-secrets
spec:
  headers:
    customRequestHeaders:
      Authorization: ""
      Cookie: ""
      X-Api-Key: ""
      X-Shadow: "true"
---
apiVersion: traefik.io/v1alpha1
kind: TraefikService
metadata:
  name: orders-with-shadow
spec:
  mirroring:
    name: orders-primary
    port: 80
    mirrors:
      - name: orders-shadow
        port: 80
        percent: 100
        middlewares:
          - name: scrub-secrets

The primary backend keeps its full request. The mirror receives the same request with credentials removed and an X-Shadow: true marker so its logs are unambiguous. The scrub is a single piece of configuration that can be audited in one place, which matters when the middleware is a security control, not a convenience.
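In the file provider, where a mirror entry is only a reference, the same redaction could be sketched by making the shadow a standalone service with its own middlewares. The backend URLs below are hypothetical placeholders.

```yaml
http:
  services:
    orders-with-shadow:
      mirroring:
        service: orders-primary
        mirrors:
          # Reference only; the scrub lives on the orders-shadow service
          - name: orders-shadow
            percent: 100

    orders-primary:
      loadBalancer:
        servers:
          - url: "http://orders.internal"          # hypothetical URL

    orders-shadow:
      loadBalancer:
        servers:
          - url: "http://orders-shadow.sandbox"    # hypothetical URL
      middlewares:
        - scrub-secrets

  middlewares:
    scrub-secrets:
      headers:
        customRequestHeaders:
          # An empty value removes the header from the forwarded request
          Authorization: ""
          Cookie: ""
          X-Api-Key: ""
          X-Shadow: "true"
```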

Scenario 3: Failover Between Heterogeneous Providers

A Failover service sends traffic to a primary backend and falls back to a secondary when the primary is unhealthy. When the two backends are identical (for example, two replicas of the same service), no per-branch transformation is needed. But when they aren’t identical, each side has its own contract.

Take payment processing, for example. Your primary is Cashew, your fallback is Doshly. Cashew expects Accept: application/vnd.cashew.v2+json and accepts charges at /v1/charges, while Doshly wants Accept: application/vnd.doshly.v3+json and routes payments through /v3/payments. A router middleware can set only one Accept header and one path, and it runs before the failover decision. Whichever contract it encodes is wrong for the other backend.

Service middlewares let each branch carry its own adaptation:

http:
  services:
    payments:
      failover:
        service: cashew
        fallback: doshly
        errors:
          status:
            - "500-599"

    cashew:
      loadBalancer:
        servers:
          - url: "https://api.cashew.io"
      middlewares:
        - cashew-version
        - cashew-path

    doshly:
      loadBalancer:
        servers:
          - url: "https://checkout.doshly.io"
      middlewares:
        - doshly-version
        - doshly-path

  middlewares:
    cashew-version:
      headers:
        customRequestHeaders:
          Accept: "application/vnd.cashew.v2+json"

    cashew-path:
      replacePath:
        path: "/v1/charges"

    doshly-version:
      headers:
        customRequestHeaders:
          Accept: "application/vnd.doshly.v3+json"

    doshly-path:
      replacePath:
        path: "/v3/payments"

The failover decision picks the right backend and the right contract in a single configuration. Switch upstreams by swapping which one is primary. The middleware chain follows automatically.

Scenario 4: Per-Backend Rate Limiting for a Fragile Upstream

Your search endpoint fans out to two backends through a Weighted service: an internal catalog that handles thousands of requests per second without flinching, and a partner’s public API with a hard quota of 100 req/sec. Cross it and you get a barrage of 429s plus a 30-second cool-down that hurts everyone downstream.

A router-level rate limit forces you to size for the slowest backend. Throttle at the partner’s 100 req/sec and you choke the internal catalog for no reason. Keep the router’s limit high and you push the partner over the cliff on every surge. Either way, when your tier is renegotiated and the quota moves, you have to find every router that touches this service to update it.

A service middleware puts the limit with the backend that actually has the constraint:

apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: partner-quota
spec:
  rateLimit:
    average: 80
    burst: 40
    # bucket by request Host rather than by client IP — only the
    # public search host routes to this backend, so every bucket
    # collapses to one, giving us a single global counter that
    # protects the upstream instead of fairly sharing among clients
    sourceCriterion:
      requestHost: true
---
apiVersion: traefik.io/v1alpha1
kind: TraefikService
metadata:
  name: search
spec:
  weighted:
    services:
      - name: catalog-internal
        port: 80
        weight: 70
      - name: catalog-partner
        port: 80
        weight: 30
        middlewares:
          - name: partner-quota

The internal branch runs uncapped. The partner branch is protected from your own bursts, so you stop pushing it over the cliff and triggering 429-storms. Callers see a Traefik 429 you can absorb (cache, retry with backoff, degrade gracefully) instead of a partner-side rejection that takes half a minute to clear. When the quota changes, one object updates.

The same reasoning extends to circuitBreaker for unpredictable degradation: rate limit when you know the quota up front, circuit breaker when you have to react to symptoms. Both belong on the service that’s fragile, not on a router that doesn’t know which branch the request will end up on.

Scenario 5: External API with Credential Injection

Your platform proxies requests to an external geocoding API that authenticates with a custom header. Internally, three routes send traffic to this service: one from the public search page, one from the admin dashboard, one from a batch processing pipeline. Each route has its own access control, but the geocoding API always needs the same X-Api-Key injected.

Without service middlewares, you’d duplicate the credential-injection middleware on each router, and every new route to this service needs to remember to include it. Forget one, and requests reach the external API unauthenticated, producing 401s that are hard to trace back to the missing middleware.

With a service middleware, the injection lives on the service itself:

http:
  routers:
    geocoding-public:
      rule: "Host(`search.example.com`) && PathPrefix(`/geocode`)"
      middlewares:
        - public-rate-limit
      service: geocoding

    geocoding-admin:
      rule: "Host(`admin.internal`) && PathPrefix(`/geocode`)"
      middlewares:
        - admin-ipallowlist
      service: geocoding

    geocoding-batch:
      rule: "Host(`batch.internal`) && PathPrefix(`/geocode`)"
      middlewares:
        - batch-basicauth
      service: geocoding

  services:
    geocoding:
      loadBalancer:
        servers:
          - url: "https://api.geocoding-provider.com"
      middlewares:
        - inject-geo-key

  middlewares:
    inject-geo-key:
      headers:
        customRequestHeaders:
          X-Api-Key: "geo_prod_xxx"

Each route keeps its own access policy (a rate limit, an IP allowlist, basic auth). The backend’s credential (i.e., how the upstream API wants to be spoken to) lives in one place and travels with the service. Add a fourth route without knowing that inject-geo-key exists; it works automatically.

One caveat: customRequestHeaders takes literal strings, not Secret references. Source the key through your secret-injection workflow (templated manifests, External Secrets Operator, sealed secrets) rather than committing the value as shown.

More Places This Fits

The same pattern shows up in many other situations. A few that don’t need a full walkthrough but are worth knowing about:

Circuit Breaker on a Fragile Backend

Some upstreams don’t fail at a predictable quota; they return 500s in bursts, intermittent failures that cascade if you keep sending traffic. A circuitBreaker as a service middleware trips when the error ratio crosses a threshold and gives the backend time to recover, while the other backends in the composite keep serving normally.
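A minimal sketch of that setup, reusing the catalog-partner child from Scenario 4. The middleware name and the 30% threshold are illustrative assumptions, not recommended values.

```yaml
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: partner-breaker        # hypothetical name
spec:
  circuitBreaker:
    # Trip when more than 30% of responses are 5xx (illustrative threshold)
    expression: "ResponseCodeRatio(500, 600, 0, 600) > 0.30"
---
apiVersion: traefik.io/v1alpha1
kind: TraefikService
metadata:
  name: search
spec:
  weighted:
    services:
      - name: catalog-internal
        port: 80
        weight: 70
      - name: catalog-partner
        port: 80
        weight: 30
        # The breaker only observes responses from this branch
        middlewares:
          - name: partner-breaker
```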

Plugin-Driven Body Transformations

Built-in middlewares don't rewrite the content of request or response bodies; plugins can. As service middlewares, they slot into the same composition model. A plugin on a weighted child can rewrite JSON for the migrating branch only, leaving the legacy branch untouched until the migration finishes.

Header Translation During a Backend Migration

You’re shifting traffic from a legacy backend that expects Authorization: Bearer … to a new one with its own X-Api-Key (a backend-owned secret, independent of the client’s bearer). A Weighted service runs the split; the new backend carries a middleware that sets its X-Api-Key and strips the inbound Authorization. Roll forward by adjusting the weights. When the legacy service is decommissioned, the translation middleware goes with it.
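One way this could look with the Traefik CRDs. The service names, weights, and key value are hypothetical placeholders, and the key should come from your secret-injection workflow rather than a committed manifest.

```yaml
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: translate-auth           # hypothetical name
spec:
  headers:
    customRequestHeaders:
      # An empty value strips the client's bearer token
      Authorization: ""
      # Backend-owned secret; placeholder value only
      X-Api-Key: "new_backend_xxx"
---
apiVersion: traefik.io/v1alpha1
kind: TraefikService
metadata:
  name: accounts                 # hypothetical name
spec:
  weighted:
    services:
      - name: accounts-legacy    # still receives Authorization: Bearer ...
        port: 80
        weight: 80
      - name: accounts-new
        port: 80
        weight: 20
        middlewares:
          - name: translate-auth
```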

Which Middleware Goes Where

Here's a short guide to keep handy. While there are no hard and fast rules, the guide below provides some default answers to the key question, "what is this middleware actually about?"

Middleware Concern	Layer
User authentication (who can access the route)	Router
Credential injection for the upstream API	Service
Rate limit to protect a route or a client	Router
Rate limit to protect a fragile backend	Service
CORS	Router
Strip prefix, path rewrite	Service
Request headers adapting to a backend’s contract	Service
Response headers shaping the client view	Router
User-facing redirects	Router
Host header rewriting for upstream routing	Service

When you’re unsure, ask whether the middleware would still make sense if you swapped the backend for an entirely different implementation. If the answer is “yes, the middleware is still needed,” it belongs on the router. If the answer is “no, this is specific to how this backend wants to be spoken to,” it belongs on the service.

Service middlewares are available from Traefik v3.7 onwards, in the file provider, the Kubernetes CRD (standalone IngressRoute services and all TraefikService variants), the Kubernetes Ingress provider (via the traefik.ingress.kubernetes.io/service.middlewares annotation), and the Gateway API (backendRefs[].filters). The full reference is in the Traefik documentation.