Switch deployment server to codfw (July 2021)
Closed, Declined (Public)

Description

As part of the Datacenter-Switchover we should test failing over the deployment server from deploy1002 to deploy2002.

https://wikitech.wikimedia.org/wiki/Switch_Datacenter/DeploymentServer

Tentatively scheduling this for Tuesday July 13 at 15:00 UTC / 8:00 PDT - please speak up if that's a bad time.

If this does well, I'd like to make switching the deployment server a normal part of the switchover process (like on Wednesday).

Event Timeline

brennen subscribed.

Offhand, I think that's likely to be fine, but cc @thcipriani for awareness around train planning. (And added to the RelEng team etherpad.)

@Legoktm poked me about this in IRC, but I didn't think about the train. For clarity, @Legoktm should we halt deployments for this switchover?

No one should be deploying *during* the switchover window but I don't think we need to cancel the entire train or anything.

I picked the time by staring at this week's version of the deployment calendar, looking for a spot that is in the EU/NA overlap and has a bit of breathing room from other deployments. Ideally it would've been on Monday, so it is less likely to impact a train if something goes wrong, but we have SRE team and subteam meetings at that time. So Tuesday, which is still before the train goes out.

If this does well, I'd like to make switching the deployment server a normal part of the switchover process (like on Wednesday).

To be explicit, it would go like: Monday: Traffic/Services, Tuesday: MediaWiki, Wednesday: deployment server. Having it on a Wednesday does mean it could impact the train, but by the switchback I'd expect there not to be any issues. I think this would also remove the cognitive overhead for everyone who's currently in the split state of "codfw everything except wait not deploy".

taavi subscribed.

This was not done this time.