About William & Mary
William & Mary is a top public research university and one of only eight “Public Ivy” schools in the United States, known for its rigorous academic program. The W&M IT team manages the entire university’s technical infrastructure, ensuring smooth operations for approximately 8,000 students and 1,300 staff. The IT team of roughly 100 people administers campus networking, desktop support, support for business, administrative, and faculty systems, end-user client support, and research and high-performance computing.
As a residential campus, W&M sees high application load and data volumes that peak at specific times in the academic calendar, especially during registration. The team operates a heterogeneous technical infrastructure and hosts a wide array of technical integrations. Proprietary software, such as their ERP system, which requires updates that adhere to industry-specific regulations, is central to their infrastructure. The team selects open source software when possible, as it enables developer collaboration with industry and with other colleges and universities.
The University’s technical infrastructure has successfully transformed from running primarily on legacy systems to a diverse technical stack that includes bare metal, containers, and the public cloud. It now supports a wide range of services behind traditional load balancers and VMs, as well as auto-scaling containerized distributed computing platforms built on Traefik and Kubernetes.
Phil Fenstermacher, the Lead Linux Engineer, is part of the leadership team responsible for building and managing the platforms used by students and staff at W&M. The team is governed by a charter to provide functional solutions that adhere to the ever-changing, strict regulatory demands of industry-specific software, and is expected to meet these technological needs on a lean budget. As technology continues to change rapidly and their workload has grown, evolving from a virtualized environment to one that supports containers and Kubernetes became a priority.
As an initial step toward a container environment, they implemented Docker Swarm and used the internal load balancer that ships with it. Over time, as more workloads were containerized and moved into this new environment, the team at W&M needed a cloud-native reverse proxy with additional capabilities: persistent sessions, the ability to handle bursty, large workloads, and integrations with the commercial software required for core business systems. The successful implementation of complex systems such as ERP, CMS, and procurement software was critical. Docker Swarm was easy to use, but these new requirements demanded a feature-rich, protocol-aware load balancer.
Inspired by an upcoming ERP update, which would require managing hundreds of production and non-production virtual machines, Phil’s team sought to leverage the inherent benefits of containers. Core requirements for the update ruled out rewriting the commercial software, creating massive amounts of custom wrapper scripts, or changing the fundamental architecture of these applications. The challenge was finding a load balancer that seamlessly handled multiple integrations and worked well with both Kubernetes and Docker Swarm.
They sought options to evaluate and came across Traefik.
Traefik checked many of the boxes: it is open source, highly cost-effective, has a great user community, keeps close step with the expanding Kubernetes ecosystem, and also works in the existing Docker Swarm environment. In the spirit of academic rigor, though, Phil’s team wanted to test a variety of solutions on the market, such as NGINX and HAProxy, comparing them side by side with Traefik.
Phil’s team observed that Traefik ran smoothly out-of-the-box and surpassed alternatives in the ways that mattered. The other solutions required large amounts of manual configuration and lacked some service discovery capabilities. Also, features one would expect from a modern cloud-native load balancer were suspiciously absent. Traefik offered the functionality the team at W&M needed, including service discovery, persistent sessions, header modifications, Prometheus integration, a visually intuitive dashboard for monitoring, and easy deployment and operation.
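To illustrate two of those features, the Prometheus integration and the dashboard are typically switched on in Traefik v2’s static configuration. A minimal, generic sketch (not W&M’s actual configuration):

```yaml
# traefik.yml (static configuration) - illustrative, not W&M's setup
entryPoints:
  web:
    address: ":80"

# Expose Traefik's built-in metrics endpoint for Prometheus to scrape
metrics:
  prometheus: {}

# Enable the monitoring dashboard (protect it with auth in production)
api:
  dashboard: true
```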
“The simplicity of using Traefik for persistent sessions, by simply copy and pasting a line of code has been a game-changer.” - Phil Fenstermacher, Lead Linux Engineer
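In Traefik v2 on Docker Swarm, cookie-based persistent (sticky) sessions can indeed be enabled with a single label. A sketch of what that looks like, with the service and host names invented for illustration:

```yaml
# docker-compose.yml fragment (Swarm mode) - names are illustrative
services:
  erp-app:
    image: example/erp-app:latest
    deploy:
      labels:
        - "traefik.enable=true"
        - "traefik.http.routers.erp.rule=Host(`erp.example.edu`)"
        - "traefik.http.services.erp.loadbalancer.server.port=8080"
        # The one line that turns on cookie-based persistent sessions:
        - "traefik.http.services.erp.loadbalancer.sticky.cookie=true"
```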
The University currently runs approximately 100 services through an on-premises deployment of Traefik on Docker Swarm, and another 30 services on Kubernetes in the cloud. A smaller on-premises Kubernetes cluster hosts everything from academic to enterprise applications. William & Mary’s small engineering team can easily manage the 150+ services and 400+ containers because Traefik’s configuration is kept alongside application configurations, using labels in Docker Swarm or an Ingress object in Kubernetes. A similar configuration scheme for both orchestrators means the same engineers can support both without the overhead of another unique system.
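On the Kubernetes side, the per-application configuration is a standard Ingress object that Traefik discovers automatically. A minimal sketch, with names assumed for illustration:

```yaml
# Illustrative Ingress; Traefik picks it up via its Kubernetes provider
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: cms
  annotations:
    kubernetes.io/ingress.class: traefik
spec:
  rules:
    - host: cms.example.edu
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: cms
                port:
                  number: 80
```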
W&M has used Traefik since version 1.3 and has since migrated to version 2.2 on Kubernetes, which supports native Ingress resource annotations. The latest version of Traefik makes their engineering workflow more straightforward, notably because having both Ingress and Traefik's CRDs allows for less manual configuration and lets them manage more complex settings with a single solution.
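For settings that a plain Ingress cannot express, Traefik's IngressRoute CRD can be used alongside it. A sketch with hypothetical names (the `security-headers` middleware is assumed to be defined elsewhere):

```yaml
# Illustrative Traefik v2 IngressRoute CRD (API group for the 2.x line)
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: portal
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`portal.example.edu`)
      kind: Rule
      middlewares:
        - name: security-headers   # hypothetical middleware
      services:
        - name: portal
          port: 80
```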
“It's like the Traefik load balancer is almost the boring piece. We don't spend a lot of time talking about it. We use Traefik, it does what we need, what it's expected to do, and reliably.”
William & Mary selects only high-performance, reliable, and budget-savvy software solutions. They have chosen Traefik to manage the university's load balancing needs, as the software is easy to implement, maintain, and trust. Without Traefik, W&M faced the deployment of its new ERP system in a legacy environment using a traditional load balancer. Ultimately, this would have meant higher costs for both operation and maintenance, all with less predictability, functionality, and consistency.