The rollout of single-sign-on (SSO) at the Wikimedia Foundation
July 22, 2021
Site Reliability Engineering Moritz Mühlenhoff and John Bond
By Moritz Mühlenhoff, Staff Site Reliability Engineer, and John Bond, Staff Site Reliability Engineer
This is the second in a three-part series that will describe the introduction of Apereo CAS as an identity provider for single sign-on (SSO) with the services operated by the Wikimedia Foundation. The first post in this series covers the original landscape of Wikimedia’s web-based services, summarizes requirements for a new SSO provider, looks at existing FLOSS solutions, and explains why Apereo CAS was chosen.
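This post covers the setup of the identity providers (IDPs), high availability, the build and deployment process, and second-factor authentication.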
The Wikimedia Foundation operates two main data centers in the US (Virginia and Texas). Additionally, we operate cache sites in Amsterdam (Netherlands), San Francisco (USA), Singapore, and soon, Marseille (France). Most of our services run in both main data centers to provide redundancy against major outages affecting a complete data center. The new identity providers (IDPs) needed the same redundancy. In addition to the main IDPs, we set up two staging hosts to allow for tests without production impact.
In addition to roughly 1,500 bare-metal servers, we also run several Ganeti clusters. Ganeti is an open-source virtual machine management solution based on KVM and DRBD. Given the relatively small memory and CPU footprint of CAS, we decided to deploy the IDPs on virtual machines (VMs).
Wikimedia’s servers run Debian, which provides us with OpenJDK 11 and Tomcat 9 in the current release (Buster/Debian 10).
All our servers are centrally managed via Puppet, an open-source configuration management system. We wrote a Puppet module and some profiles to manage the configuration of the IDP. The profiles use various Wikimedia-specific Puppet code. However, the core apereo_cas module works with the standard version of Puppet. You can find the module at​. We’d be happy to incorporate changes that are generally useful.
High availability
To provide a highly available IDP service, the following bits needed to be made redundant:

Schematic graphic by Moritz Mühlenhoff, CC BY-SA 4.0
The most common way to deploy Apereo CAS is via a WAR overlay, which is built using Gradle. Gradle keeps track of all upstream dependencies, and among other things, allows us to easily build a WAR file or a standalone JAR with an embedded Tomcat server. The standalone JAR is useful for local testing and we use the WAR file so that we can easily deploy the web application to Tomcat.
Our initial deployment forked the upstream cas-overlay-template repository into a local repository on Wikimedia’s code collaboration platform. We did this for robustness in case the upstream repository is inaccessible, but also to track a few local modifications, such as a custom Wikimedia theme for our IDP login page.
In the early phase of deployment, our Puppet code checked out the overlay repository locally, built the WAR using Gradle, and deployed the latest version. While this was fine for the initial ramp-up, we soon moved to building Debian packages (.deb) for deployments. 
For this, we added support to the overlay repository for creating a .deb from the latest release. This allowed us to decouple the build of a new version from its rollout (so that we can, for example, test a new deployment on the staging IDPs first). The resulting cas.deb package ships the WAR file and depends on Tomcat as shipped in Debian; the package's post-installation script deploys the WAR file automatically. Time permitting, we plan to upstream the Debian packaging work. Interested parties can find the necessary bits in the debian directory of our overlay repository.
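To illustrate what a deployment step of this kind might involve, here is a minimal Python sketch. It is not our actual post-installation script (that is a maintainer script in the package). The function name deploy_war and the default webapps path are assumptions made for this example. The sketch copies the WAR into Tomcat's webapps directory under a temporary name and then renames it, so that Tomcat's auto-deployer would never pick up a half-written archive.

```python
import shutil
import tempfile
from pathlib import Path

# Assumption for this sketch: Debian's tomcat9 package serves web
# applications from this directory.
DEFAULT_WEBAPPS_DIR = Path("/var/lib/tomcat9/webapps")


def deploy_war(war_file: Path, webapps_dir: Path = DEFAULT_WEBAPPS_DIR) -> Path:
    """Copy a WAR file into the webapps directory and return its final path.

    Hypothetical helper; it only mirrors the idea of a post-installation
    step and is not taken from the cas.deb package.
    """
    if not war_file.is_file():
        raise FileNotFoundError(f"WAR file not found: {war_file}")
    webapps_dir.mkdir(parents=True, exist_ok=True)
    target = webapps_dir / war_file.name

    # Reserve a temporary file name in the same directory, so the final
    # rename stays on one filesystem.
    with tempfile.NamedTemporaryFile(
        dir=webapps_dir, suffix=".tmp", delete=False
    ) as tmp:
        tmp_path = Path(tmp.name)

    # Copy the archive under the temporary name, then move it into place.
    shutil.copyfile(war_file, tmp_path)
    tmp_path.replace(target)
    return target
```

On a staging host, such a helper could be called as deploy_war(Path("cas.war")). Passing a different webapps_dir allows a dry run against a scratch directory.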
Second factor authentication
One of the drivers for implementing SSO at the Wikimedia Foundation was to have a centralized place where we could manage two-factor authentication (2FA) policies and flows. Specifically, we wanted to deploy Universal 2nd Factor (U2F) as a second authentication factor. To allow for a gradual rollout of U2F, we added an LDAP schema extension that defines custom user attributes, and added a Groovy script that enables U2F for any user who has it enabled in OpenLDAP via the mfa-method LDAP attribute.
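The per-user decision described above can be sketched in a few lines. The following Python snippet is only an illustration of the logic, not the Groovy script we deploy. The function name select_mfa_provider and the attribute value "mfa-u2f" are assumptions made for this example; only the attribute name mfa-method comes from our setup.

```python
from typing import Mapping, Optional, Sequence

# LDAP attribute that opts a user into a second factor.
MFA_ATTRIBUTE = "mfa-method"
# Assumption for this sketch: the attribute holds this value for U2F users.
U2F_PROVIDER_ID = "mfa-u2f"


def select_mfa_provider(
    attributes: Mapping[str, Sequence[str]],
) -> Optional[str]:
    """Return the MFA provider to trigger for a user, or None if the
    user has not been opted in and logs in with a password only.

    LDAP attributes are multi-valued, so each value is a sequence of strings.
    """
    values = attributes.get(MFA_ATTRIBUTE, ())
    if U2F_PROVIDER_ID in values:
        return U2F_PROVIDER_ID
    return None
```

Because the decision is driven purely by a directory attribute, users can be opted in one at a time without changing the IDP configuration, which is what makes a gradual rollout possible.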
In the last part of this blog series, we’ll explain how services were integrated into the new SSO realm and what steps were taken to implement single-sign-out.
About this post
Featured image credit: TurkishHandmadePadlocks​, Muscol, CC BY-SA 3.0