What Happens If The Connection Goes Down? Lessons Learned from Anthropic's Fable 5 and Mythos 5 Going Dark

A government switched off a frontier model overnight. Here is why that reframes the sovereignty argument, and why the open-weights ecosystem finally makes the alternative real.

Graphic showing Anthropic's Fable 5 and Mythos 5 being disconnected.

On Friday, June 12, at 5:21 pm Eastern, Anthropic received a letter. By the time most of the US East Coast was thinking about dinner, two of the most capable AI models in commercial use, Fable 5 and Mythos 5, were gone. Disabled for every customer, worldwide, within the span of an afternoon.

The trigger was an export control directive from the US government, citing national security authorities, ordering Anthropic to suspend access for any foreign national, whether inside or outside the United States, including the company's own foreign-national employees. Unable to cleanly segment access by nationality, Anthropic disabled the models for everyone. The company said it disagreed with the order and believed it was a misunderstanding. The models stayed dark anyway.

The shape of this matters more than the headline.

An outage would have ended. A price change or a deprecation notice would have come with a warning and a migration window, the kind of disruption every vendor relationship is built to absorb. This was different in kind. Someone outside your company, and outside your vendor's full control, decided that a model your product depended on would stop answering. Available to hundreds of millions of people that morning. Gone by that evening.

One detail makes this concrete for me in a way it might not be for a reader in San Francisco. I am a foreign national, working from Budapest, and the directive, read literally, was about people like me. The access I had at the start of my workday, under the strict terms of the order, was gone by the close of business. The reason came down to the passport I hold.

We Wrote the Question in April. In June We Got Our Answer.

Two months ago, we published The Five Levels of AI Sovereignty, a maturity model for figuring out how much control an organization actually has over its AI infrastructure. One of the first questions we tell people to ask is blunt: what happens if your internet connection goes down?

We meant it literally. Outages, air-gaps, and the network cable. We were thinking about availability zones and egress firewalls.

June 12 added a way to lose a model we hadn't explored. The connection was fine. The data center was fine. The vendor was willing. The model still went away, because the legal right to use it was revoked by someone with no stake in your uptime. Our framework has a dimension that catches this after the fact, the one we labeled vendor lock-in and escape velocity. We framed it around contracts and pricing renewals. A Friday-afternoon letter from the Commerce Department belongs in the same column, and it never occurred to us to put it there.

The lesson holds, sharpened. If the most capable thing in your stack is a model you do not control, your continuity is a function of decisions made in rooms you will never sit in.

The Reason This Argument Was Easy to Ignore Until Now

For most of the last two years, the honest counterargument to all of this was capability. You could run a model locally, and it would be meaningfully worse than the hosted frontier. For many teams, the math was uneven enough that they accepted the dependency and moved on. The hosted model was simply better, and the gap was wide.

That calculation changed this spring, and it changed fast.

The strongest open-weight models in the world right now are Chinese. Moonshot's Kimi K2.6 leads the open field on the Artificial Analysis Intelligence Index at around 54, ahead of every open competitor and inside the top tier overall. DeepSeek shipped V4 Pro and V4 Flash in April. Alibaba's Qwen 3.5 and 3.6 lines landed under permissive Apache licensing. Z.ai shipped GLM-5.2 on June 13, the day after the Fable 5 order, posting the highest open-weight SWE-bench Pro coding score to date and, of note, an Anthropic-compatible API.

Then, on June 4, Nvidia released Nemotron 3 Ultra, announced days earlier at Computex. It is a 550-billion-parameter mixture-of-experts model with roughly 55 billion parameters active per token, the most capable open-weight model a US lab has shipped, scoring about 48 on the same index. Nvidia published the weights, the post-trained checkpoints, the datasets, and the training recipes. It serves over 300 tokens per second on early endpoints, several times faster than comparably sized peers, and it was built deliberately for agentic work: tool calls, structured output, long-horizon planning that holds up over many turns.

The gaps are real. Nemotron 3 Ultra is the best US open model, and it still trails Kimi K2.6 by several points on the index. A 550B model wants a real GPU; this is data-center hardware, not a spare workstation. Self-hosting swaps an API bill for a hardware and operations bill, and that cost is worth modeling carefully before anyone commits.

Here is what survives all those caveats. Each of these models can run with zero external dependency. You pull the weights once. After that, there is no API to call, no license server to phone home to, no provider in the request path, and nobody in a position to send a letter that turns the model off. The capability gap is now small enough that, for a large class of workloads, closing the dependency is worth the few benchmark points you give up.

The Part Where the Gateway Earns Its Place

A swap like this only survives contact with a real codebase if the application does not know which model answered.

If your services call a model provider's SDK directly, switching from a hosted model to a local one means editing and redeploying every service that talks to a model. Routed through a gateway, the model becomes a setting. The address your applications send their prompts to can point at a hosted frontier model today and a self-hosted Nemotron or Kimi deployment tomorrow, and nothing downstream has to change. The application keeps speaking the same protocol. The gateway decides where the request actually lands.

The safety layer matters here just as much, and it is the dimension that quietly drops teams a level. If your jailbreak detection, PII redaction, and content filtering are API calls to a hosted moderation model, then your "sovereign" deployment has a cloud dependency baked into every single request. Move those checks to local models running behind the same gateway, and the dependency disappears. That is the AI Safety Architecture dimension we flagged as the one teams most often miss. After June 12, it is the one I would check first.

This is the same place we have been pointing to for a while. Model routing, fallback, token-cost governance, and safety are cross-cutting concerns that belong in the infrastructure layer rather than scattered through application code. Put them at the gate, and you can change models, run them air-gapped, and keep one audit trail across the three trust boundaries where AI traffic actually crosses. None of this is new architecture invented for the occasion. The AI Gateway here is part of Traefik Hub, built on the same Traefik foundation that teams already run in front of their traffic. The air-gapped deployment path is the one defense and healthcare teams have been asking us about for a year. June 12 simply gave everyone else a concrete reason to care.

The New Question Is Who Can Switch You Off

The sovereignty conversation has mostly been about data: where it lives, which jurisdiction's rules apply, and whether a region's name on a contract satisfies an auditor. Those questions still matter.

The Fable 5 directive added a different one. Can someone outside your organization turn off the model your product runs on without notice and without recourse? For a hosted frontier model, the answer is now demonstrably yes, and that someone does not have to be your vendor. For a model whose weights sit on a disk you own, behind a gateway you operate, the answer is no. That is the whole distinction, and as of this spring, it is finally buildable without surrendering so much capability that no one would choose it.

We built the maturity model to help teams find out where they really stand. If you read it in April and filed sovereignty under "important, not urgent," this is the part where the second word changes.

Frequently Asked Questions

Does running an open-weight model mean giving up the frontier?

For now, partly. The strongest open model, Kimi K2.6, trails the best closed models, and the best US open model, Nemotron 3 Ultra, trails Kimi. The gap is real and far smaller than it was a year ago. For many production workloads, it is small enough that removing the external dependency is the better trade. For the few tasks that genuinely need the absolute top of the frontier, a gateway lets you route only those to a hosted model and keep everything else local.

Isn't this just a hardware problem in disguise?

Partly, yes. A 550B model needs a serious GPU, and self-hosting trades an API bill for a hardware and operations bill. That is a real cost and worth modeling carefully. What you buy with it is continuity that no external party can revoke. Whether it is worth the spend depends on how much your product would suffer if the model went dark on a Friday afternoon.

We already route through a gateway. Are we covered?

You are covered for the swap, which is the hard part. If your apps reach models through a gateway rather than calling providers directly, changing the model is a configuration change instead of a cross-service rewrite. The piece most setups still miss is that the model is only one of three boundaries an agent crosses. Traefik runs all three as one control plane from a single binary: the API Gateway in front of your services, the AI Gateway in front of the models, and the MCP Gateway in front of the tools agents call. Routing the model locally closes one dependency. Running the safety checks, tool authorization, and policy for all three at the same gate, on your own infrastructure, is what keeps the entire chain off the public internet.