
Deploy buildkitd to trusted GitLab runners
Closed, Resolved · Public · 3 Estimated Story Points

Description

RelEng has landed on buildkitd as the best option for building production bound images from protected GitLab CI workloads. (See T307599: Investigate alternatives to docker-in-docker for container image creation in GitLab and T307810: Investigate buildkitd instances as image builders for GitLab.)

Per discussion with @Jelto and the rest of RelEng, it sounds like there are security concerns with integrating any third-party k8s cluster with our trusted GitLab runners, so instead of a k8s deployment, we will target running buildkitd in rootless mode via dockerd on the trusted runners, using a WMF-packaged image from docker-registry.wikimedia.org.

Event Timeline

dduvall changed the task status from Open to In Progress. May 12 2022, 5:23 PM
dduvall triaged this task as High priority.
dduvall set the point value for this task to 3.

Change 791427 had a related patch set uploaded (by Dduvall; author: Dduvall):

[operations/docker-images/production-images@master] buildkitd: Provide buildkitd image for trusted GitLab runners

https://gerrit.wikimedia.org/r/791427

The image built from https://gerrit.wikimedia.org/r/791427 works and I'm working on the operations/puppet manifests. I'm not quite sure how to go about setting up the mTLS yet.

It's my understanding that the mTLS auth for buildkitd is meant both to secure the network communication and to authorize clients. In the case of trusted runners, it seems like we've already de facto authorized the client workload. However, adding an extra auth layer and encrypting container network communication seems wise to me. I just don't know what the setup should look like. Specifically, where should we store the CA and root cert, and how should we go about generating and assigning new client certs?
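For concreteness, the moving parts would look roughly like this (a sketch using plain openssl; the CNs, SAN, and file locations are placeholders, not a proposal for where the CA material should live):

# CA key and cert (stored wherever we decide, e.g. puppet private)
openssl req -x509 -newkey rsa:4096 -nodes -days 3650 \
  -subj "/CN=buildkitd CA" -keyout ca.key -out ca.pem

# server cert; the SAN must match the address buildctl dials
openssl req -newkey rsa:4096 -nodes -subj "/CN=buildkitd" \
  -keyout server.key -out server.csr
openssl x509 -req -in server.csr -CA ca.pem -CAkey ca.key -CAcreateserial \
  -days 365 -extfile <(echo "subjectAltName=DNS:buildkitd") -out server.pem

# client cert issued to the runner workload
openssl req -newkey rsa:4096 -nodes -subj "/CN=gitlab-runner" \
  -keyout client.key -out client.csr
openssl x509 -req -in client.csr -CA ca.pem -CAkey ca.key -CAcreateserial \
  -days 365 -out client.pem

# daemon side
buildkitd --addr tcp://0.0.0.0:1234 \
  --tlscacert ca.pem --tlscert server.pem --tlskey server.key

# client side
buildctl --addr tcp://buildkitd:1234 \
  --tlscacert ca.pem --tlscert client.pem --tlskey client.key ...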

@Jelto do you have thoughts?

Change 791427 merged by Dzahn:

[operations/docker-images/production-images@master] buildkitd: Provide buildkitd image for trusted GitLab runners

https://gerrit.wikimedia.org/r/791427

Change 791655 had a related patch set uploaded (by Dduvall; author: Dduvall):

[operations/puppet@production] WIP: Optionally provide buildkitd to GitLab runners

https://gerrit.wikimedia.org/r/791655

Mentioned in SAL (#wikimedia-operations) [2022-06-14T23:08:09Z] <mutante> disabling puppet in gitlab-runners (via cumin /disable-puppet) before deploying gerrit:791655 to provide gitlab-runners with buildkit and new docker network - T308271

Change 791655 merged by Dzahn:

[operations/puppet@production] Provide buildkitd to GitLab runners

https://gerrit.wikimedia.org/r/791655

Mentioned in SAL (#wikimedia-cloud) [2022-06-14T23:29:51Z] <mutante> - creating instance gitlab-runner-1001 since we did not have a test machine for gitlab-runners but need one to test things like gerrit:791655 before hitting prod T308271

Change 805434 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] Revert "Provide buildkitd to GitLab runners"

https://gerrit.wikimedia.org/r/805434

Mentioned in SAL (#wikimedia-operations) [2022-06-14T23:49:26Z] <mutante> gitlab-runner1002 - systemctl restart docker; run-puppet-agent ; systemctl start buildkitd - fails though T308271

Change 805434 merged by Dzahn:

[operations/puppet@production] Revert "Provide buildkitd to GitLab runners"

https://gerrit.wikimedia.org/r/805434

Mentioned in SAL (#wikimedia-operations) [2022-06-14T23:52:05Z] <mutante> gitlab-runner1001/1002 - clean revert not possible, icinga alerting about failed buildkitd service, manually deleting systemd unit and trying to clean up T308271

Change 806327 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] gitlab::runner: set sysctl kernel.unprivileged_userns_clone = 1

https://gerrit.wikimedia.org/r/806327

Change 806341 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] docker::network: refresh service docker after adding a docker network

https://gerrit.wikimedia.org/r/806341

buildkitd is now running on all (6) gitlab-runners. It's 6 because the VMs 1001 and 2001 were decommissioned earlier today, leaving 3 physical hosts per DC.

 sudo cumin 'gitlab-runner*' "systemctl status buildkitd | grep Active | cut -d: -f1" 
6 hosts will be targeted:
gitlab-runner[2002-2004].codfw.wmnet,gitlab-runner[1002-1004].eqiad.wmnet
Ok to proceed on 6 hosts? Enter the number of affected hosts to confirm or "q" to quit 6
===== NODE GROUP =====                                                                                                                                                                                             
(6) gitlab-runner[2002-2004].codfw.wmnet,gitlab-runner[1002-1004].eqiad.wmnet                                                                                                                                      
----- OUTPUT of 'systemctl status...ve | cut -d: -f1' -----                                                                                                                                                        
     Active

The following steps were needed to make it work:
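Roughly, per the SAL entries above (a reconstruction; exact commands and ordering may have differed):

sudo sysctl -w kernel.unprivileged_userns_clone=1   # per gerrit:806327
sudo systemctl restart docker                       # pick up the new docker network
sudo run-puppet-agent
sudo systemctl start buildkitd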

I did this on all 6 hosts, buildkitd service is running, Icinga is happy.

Dzahn lowered the priority of this task from High to Medium. Jun 16 2022, 10:08 PM

It's deployed but we have some follow-ups. I guess lowering the prio a bit is appropriate for this state.

Thanks, @Dzahn ! Is there any follow-up work I can assist with? It looks mostly done and in review FWICT.

@dduvall Well, you could try to use it to build a docker image from a dockerfile. So far it's just "the buildkitd service is running", and the follow-ups are about making sure it survives reboots and that the next time we set up a gitlab-runner it works more automatically. Don't worry about that part. But I don't think anyone has actually let it build an image yet.

I think you're right. We haven't actually built an image on a trusted runner yet. I'll test that today.
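For the record, such a test boils down to pointing buildctl at the daemon with a Blubber config, something like this (a sketch; the address, variant name, and file path are placeholders):

buildctl --addr tcp://buildkitd:1234 build \
  --frontend dockerfile.v0 \
  --local context=. \
  --local dockerfile=. \
  --opt filename=blubber.yaml \
  --opt target=test
# the "# syntax = ...blubber-buildkit..." line in blubber.yaml makes the
# dockerfile.v0 frontend delegate to the Blubber gateway image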

Change 806327 merged by Dzahn:

[operations/puppet@production] base: create profile to allow unprivileged userns, use it on gitlab_runners

https://gerrit.wikimedia.org/r/806327

Change 806341 abandoned by Dzahn:

[operations/puppet@production] docker::network: refresh service docker after adding a docker network

Reason:

service not managed by puppet

https://gerrit.wikimedia.org/r/806341

Today we had the first ever successful Compiling the code... $ echo "Compile complete." job, after the subtask about DNS resolution was resolved. Special thanks to taavi for providing the needed firewall change.

https://phabricator.wikimedia.org/T311241#8034714

Yesterday I circled back to this task to verify that building a basic image via buildkitd and the Blubber buildkit frontend on a trusted runner would work correctly. However, the build process failed immediately. A look at the buildkitd logs seemed to point to a DNS (over IPv6) failure within the buildkit graph solve.

Sep 08 21:30:26 gitlab-runner2002 docker[1359]: time="2022-09-08T21:30:26Z" level=error msg="/moby.buildkit.v1.Control/Solve returned error: rpc error: code = Unknown desc = failed to compile to LLB state: docker-registry.wikimedia.org/bullseye: failed to do request: Head \"https://docker-registry.wikimedia.org/v2/bullseye/manifests/latest\": dial tcp: lookup docker-registry.wikimedia.org on [2001:4860:4860::8888]:53: dial udp [2001:4860:4860::8888]:53: connect: cannot assign requested address\n"

The DNS part is a long story but basically: In some cases, when scheduling a container to solve an image build, buildkitd's OCI executor will populate the resolv.conf of the image build container with a set of default nameservers, specifically Google's nameservers (IPv4 first, then IPv6 servers).

Sep 08 21:30:14 gitlab-runner2002 docker[1359]: time="2022-09-08T21:30:14Z" level=info msg="No non-localhost DNS nameservers are left in resolv.conf. Using default external servers: [nameserver 8.8.8.8 nameserver 8.8.4.4]"
Sep 08 21:30:14 gitlab-runner2002 docker[1359]: time="2022-09-08T21:30:14Z" level=info msg="IPv6 enabled; Adding default IPv6 external servers: [nameserver 2001:4860:4860::8888 nameserver 2001:4860:4860::8844]"
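That fallback kicks in when the resolv.conf buildkitd reads contains no usable non-localhost nameservers. A quick way to check what it starts from (assuming the daemon runs in a container named buildkitd):

sudo docker exec buildkitd cat /etc/resolv.conf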

But really, this is not a DNS problem exactly but a networking one: we're disallowing all internet bound traffic from trusted runners. This is a mismatch between how we build images for production currently and what we're building out for GitLab CI.

@Jelto, what is needed to allow outbound connections from trusted runners?

I ran an even more minimal test to see if I could at least build a scratch image containing a single readme file (no remote resources to fetch), but even that failed, due to a failure to fetch the helper image (docker.io/docker/dockerfile-copy) that buildkit uses to copy files across container filesystems.

Blubber file:

# syntax = docker-registry.wikimedia.org/wikimedia/blubber-buildkit:0.9.0
version: v4
variants:
  readme:
    base: ~
    copies:
      - from: local
        source: README.md

Job console:

#1 resolve image config for docker-registry.wikimedia.org/wikimedia/blubber-buildkit:0.9.0
#1 DONE 0.3s
#2 docker-image://docker-registry.wikimedia.org/wikimedia/blubber-buildkit:0.9.0@sha256:ab835a4cb57fcbfd249f360c562c8bbf73fe709e1bfa360094cd7e50e3897692
#2 resolve docker-registry.wikimedia.org/wikimedia/blubber-buildkit:0.9.0@sha256:ab835a4cb57fcbfd249f360c562c8bbf73fe709e1bfa360094cd7e50e3897692 done
#2 sha256:b6bc2f48899de1c7090c9fb747dfac6be02fe01571f221e2ba2dcc00520c3bee 115.26kB / 115.26kB 0.2s done
#2 sha256:ff9f943ae033913fd00af72818a5225e83332a421b3e14753f85dbc27b6d7485 0B / 9.72MB 0.2s
#2 sha256:ff9f943ae033913fd00af72818a5225e83332a421b3e14753f85dbc27b6d7485 5.24MB / 9.72MB 0.3s
#2 sha256:ff9f943ae033913fd00af72818a5225e83332a421b3e14753f85dbc27b6d7485 9.72MB / 9.72MB 0.5s done
#2 extracting sha256:ff9f943ae033913fd00af72818a5225e83332a421b3e14753f85dbc27b6d7485
#2 extracting sha256:ff9f943ae033913fd00af72818a5225e83332a421b3e14753f85dbc27b6d7485 0.2s done
#2 DONE 0.7s
#2 docker-image://docker-registry.wikimedia.org/wikimedia/blubber-buildkit:0.9.0@sha256:ab835a4cb57fcbfd249f360c562c8bbf73fe709e1bfa360094cd7e50e3897692
#2 extracting sha256:b6bc2f48899de1c7090c9fb747dfac6be02fe01571f221e2ba2dcc00520c3bee 0.1s done
#2 DONE 0.8s
#3 local://dockerfile
#3 transferring dockerfile: 260B done
#3 DONE 0.0s
#4 [internal] load build context
#4 transferring context: 158B done
#4 DONE 0.0s
#5 [internal] helper image for file operations
#5 resolve docker.io/docker/dockerfile-copy:v0.1.9@sha256:e8f159d3f00786604b93c675ee2783f8dc194bb565e61ca5788f6a6e9d304061
#5 resolve docker.io/docker/dockerfile-copy:v0.1.9@sha256:e8f159d3f00786604b93c675ee2783f8dc194bb565e61ca5788f6a6e9d304061 60.0s done
#5 ERROR: failed to do request: Head "https://registry-1.docker.io/v2/docker/dockerfile-copy/manifests/sha256:e8f159d3f00786604b93c675ee2783f8dc194bb565e61ca5788f6a6e9d304061": dial tcp 54.242.59.189:443: i/o timeout
------
 > [internal] helper image for file operations:
------
error: failed to solve: failed to load cache key: failed to do request: Head "https://registry-1.docker.io/v2/docker/dockerfile-copy/manifests/sha256:e8f159d3f00786604b93c675ee2783f8dc194bb565e61ca5788f6a6e9d304061": dial tcp 54.242.59.189:443: i/o timeout

This confirms that the issue is not limited to DNS but extends to all outbound traffic, and that we cannot build any images on trusted runners without network egress.

Also cc'ing task {T317341} in case this is related to any changes made on account of the security review.

Change 831481 had a related patch set uploaded (by Jelto; author: Jelto):

[operations/puppet@production] gitlab_runner: allow Trusted Runners to access wikimedia docker-registry

https://gerrit.wikimedia.org/r/831481

Change 831481 merged by Jelto:

[operations/puppet@production] gitlab_runner: allow Trusted Runners to access wikimedia docker-registry

https://gerrit.wikimedia.org/r/831481

Thanks @dduvall for testing the Trusted Runner buildkitd setup!

The DNS part is a long story but basically: In some cases, when scheduling a container to solve an image build, buildkitd's OCI executor will populate the resolv.conf of the image build container with a set of default nameservers, specifically Google's nameservers (IPv4 first, then IPv6 servers).

We don't allow the Trusted Runners access to Google's nameservers. Similar to the normal GitLab CI job containers, we should try to configure Docker's internal DNS (127.0.0.11) or the Wikimedia DNS recursor (recdns.anycast.wmnet) for buildkitd. Normal CI jobs seem to be happy with internal/Wikimedia DNS. We have to test if it's enough to append --dns to the buildkitd systemd service ExecStart command.
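For example, something along these lines in the unit's docker invocation (an untested sketch; the real ExecStart lives in the puppet module and carries more flags, and the network and image names here are approximations):

/usr/bin/docker run --rm --name buildkitd \
  --network gitlab-runner \
  --dns 10.3.0.1 \
  docker-registry.wikimedia.org/buildkitd
# 10.3.0.1 assumed here to be the recdns.anycast.wmnet anycast address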

But really, this is not a DNS problem exactly but a networking one: we're disallowing all internet bound traffic from trusted runners. This is a mismatch between how we build images for production currently and what we're building out for GitLab CI.

@Jelto, what is needed to allow outbound connections from trusted runners?

That's right, most outbound traffic is not allowed. All services used by Trusted Runners have to be added to profile::gitlab::runner::allowed_services. I prepared https://gerrit.wikimedia.org/r/831481 to allow Trusted Runners to access our Docker registry. Access here just means allowing HTTPS traffic so images can be pulled; no credentials are involved.

I re-ran the failed test job and DNS resolution seems to work. However, buildkitd also wants to use the official Docker registry to get a helper image for file operations ([internal] helper image for file operations). Access to the official Docker registry is not allowed for running containers. Only the gitlab-runner daemon is allowed to pull certain images from allowed groups/projects and run them as CI jobs.

I have some concerns about allowing Trusted Runners access to the official Docker registry globally. Do you know if it's possible to mirror this subset of helper images? And is it possible to restrict images for the buildkitd instance? Allowing the official Docker registry would be a change similar to https://gerrit.wikimedia.org/r/831481, but before configuring that I'd like to do some more research and discussion about what risks we would be introducing here.

@Jelto, what is needed to allow outbound connections from trusted runners?

That's right, most outbound traffic is not allowed. All services used by Trusted Runners have to be added to profile::gitlab::runner::allowed_services.

Hrm, this is different from what we currently do for building images that run in production.

Since the code running on the trusted runners is trusted and reviewed (trusted runners are used to build code for production), anyone who could get their code running here could already check in whatever code they wanted. So we didn't limit outbound connections from the trusted hosts (that way, dependencies are fetched from CI hosts rather than by individuals on their laptops).

Is that all still true? Or are these "trusted runners" to be used for something different than runners executing trusted code?

We think it's better to maintain an explicit allow_list instead of opening it up to everything, in the spirit of defense in depth.

We have allowed our own in-house docker registry and kind of expected that would be used. It sounds like you need the official docker registry. If we add that to profile::gitlab::runner::allowed_services you would be unblocked, is that right?

Thanks @dduvall for testing the Trusted Runner buildkitd setup!

Happy to! :)

The DNS part is a long story but basically: In some cases, when scheduling a container to solve an image build, buildkitd's OCI executor will populate the resolv.conf of the image build container with a set of default nameservers, specifically Google's nameservers (IPv4 first, then IPv6 servers).

We don't allow the Trusted Runners access to Google's nameservers. Similar to the normal GitLab CI job containers, we should try to configure Docker's internal DNS (127.0.0.11) or the Wikimedia DNS recursor (recdns.anycast.wmnet) for buildkitd. Normal CI jobs seem to be happy with internal/Wikimedia DNS. We have to test if it's enough to append --dns to the buildkitd systemd service ExecStart command.

Sadly, I don't think --dns works with a custom Docker network, but I'll do some local testing to verify that. I'm leaning toward abandoning the custom network (which I introduced, sorry) and finding another way to achieve a consistently addressable buildkitd server, but I'll try some other workarounds first.
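A quick local check (a sketch; on a user-defined network I'd expect the container's resolv.conf to point at Docker's embedded resolver, 127.0.0.11, with the --dns servers becoming its upstreams):

docker network create dns-test
docker run --rm --network dns-test --dns 10.3.0.1 busybox cat /etc/resolv.conf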

But really, this is not a DNS problem exactly but a networking one: we're disallowing all internet bound traffic from trusted runners. This is a mismatch between how we build images for production currently and what we're building out for GitLab CI.

@Jelto, what is needed to allow outbound connections from trusted runners?

That's right, most outbound traffic is not allowed. All services used by Trusted Runners have to be added to profile::gitlab::runner::allowed_services. I prepared https://gerrit.wikimedia.org/r/831481 to allow Trusted Runners to access our Docker registry. Access here just means allowing HTTPS traffic so images can be pulled; no credentials are involved.

For some reason I thought this was only for internal services. Thanks for clarifying that!

I re-ran the failed test job and DNS resolution seems to work. However, buildkitd also wants to use the official Docker registry to get a helper image for file operations ([internal] helper image for file operations). Access to the official Docker registry is not allowed for running containers. Only the gitlab-runner daemon is allowed to pull certain images from allowed groups/projects and run them as CI jobs.

I have some concerns about allowing Trusted Runners access to the official Docker registry globally. Do you know if it's possible to mirror this subset of helper images? And is it possible to restrict images for the buildkitd instance?

I don't think it's possible, but I can do more research. However, see the responses below.

Allowing the official Docker registry would be a change similar to https://gerrit.wikimedia.org/r/831481, but before configuring that I'd like to do some more research and discussion about what risks we would be introducing here.

Like @thcipriani mentioned, trusted runners will only be operating on MRs targeting protected branches that were reviewed by people with the right access (ostensibly people we trust), so it doesn't seem riskier than what we assume in our current model of review and deployment.

We think it's better to maintain an explicit allow_list instead of opening it up to everything, in the spirit of defense in depth.

If we were to move to an allow-list model, we'd basically have to audit and analyze every gate-and-submit and post-merge job we currently run to see what external networks they access, and add those to the list in order to achieve a full migration from Gerrit/Zuul/Jenkins. I'm assuming this would at least include the public npm registry and PyPI, but it's probably much broader.

We have allowed our own in-house docker registry and kind of expected that would be used. It sounds like you need the official docker registry. If we add that to profile::gitlab::runner::allowed_services you would be unblocked, is that right?

We still expect to constrain base images to those in our registry, but in this case buildkitd uses some official Docker images for internal functions such as copying files.

We think it's better to maintain an explicit allow_list instead of opening it up to everything, in the spirit of defense in depth.

If we were to move to an allow-list model, we'd basically have to audit and analyze every gate-and-submit and post-merge job we currently run to see what external networks they access, and add those to the list in order to achieve a full migration from Gerrit/Zuul/Jenkins. I'm assuming this would at least include the public npm registry and PyPI, but it's probably much broader.

In the firewall configuration section of the GitLab runner Security Evaluation it says, "Requests to the internet are allowed." Why change now?

Changing to an explicit allow_list for egress would block us from moving to GitLab and force us to first redesign the service pipeline we built together.

In the firewall configuration section of the GitLab runner Security Evaluation it says, "Requests to the internet are allowed." Why change now?

Changing to an explicit allow_list for egress would block us from moving to GitLab and force us to first redesign the service pipeline we built together.

For some reason I thought this was only for internal services. Thanks for clarifying that!

Sorry, I was wrong about the firewall rules for Trusted Runners. The docs are right: only the internal network should be blocked; access to external services should work. The firewall implementation uses 10.0.0.0/8 for blocking the internal services. I'm currently debugging why traffic to external services (the internet) is affected here. I'm not entirely sure if it's due to the firewall rules, the dedicated network, or DNS. I'll try to find that out and update the task with a comment.

So I'd like to unblock this by troubleshooting and properly implementing the firewall with an allow-list for internal services and unrestricted access to internet services. However, I think it makes sense to have a short discussion about this, maybe in the next IC meeting, if that works for you.

For some reason I thought this was only for internal services. Thanks for clarifying that!

Sorry, I was wrong about the firewall rules for Trusted Runners. The docs are right: only the internal network should be blocked; access to external services should work. The firewall implementation uses 10.0.0.0/8 for blocking the internal services. I'm currently debugging why traffic to external services (the internet) is affected here. I'm not entirely sure if it's due to the firewall rules, the dedicated network, or DNS. I'll try to find that out and update the task with a comment.

Oh phew! That's good news on our end. :) Thanks for looking into that. I'll try to do a little debugging today as well, or possibly tackle the DNS issue.

So I'd like to unblock this by troubleshooting and properly implementing the firewall with an allow-list for internal services and unrestricted access to internet services. However, I think it makes sense to have a short discussion about this, maybe in the next IC meeting, if that works for you.

Definitely! Thanks, @Jelto.

Some updates from troubleshooting the networking issue:

I see two independent problems here:

  • firewall restrictions for the internal network 10.0.0.0/8 have no effect
  • outgoing network access doesn't "work"

The former issue needs a fix (I'll mention this in T295481 or a subtask) but is not blocking this task currently.

The latter is a more general problem. GitLab Runners are configured with private IPs only (wmnet). These machines do not have general internet access, as far as I understand now. The same "issues" are present on the gitlab-runner host machine: access to internal services works but access to external services doesn't. So this is expected behavior for machines with private addresses. It is possible to use our internal http_proxy to access external resources; with it, curl and docker pulls work.
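For example (the proxy address here is an assumption and may differ per DC):

https_proxy=http://webproxy.eqiad.wmnet:8080 \
  curl -sI https://registry-1.docker.io/v2/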

I tried to configure buildkit and buildctl to use this proxy by setting the http_proxy, https_proxy, and no_proxy environment variables. With this configuration the job is able to access the helper images. However, the job fails with a permission-denied error a few lines later, which doesn't sound network related. Note: gitlab-runner1004 is currently paused and used for this testing.

I have doubts about whether it is suitable to configure these proxies for all build tools we will be using. Each and every tool and package manager would have to be configured for the proxies individually(?).

I think Trusted Runners need outgoing internet access. I assume this is also the reason why the contint machines have public addresses. I'm not sure if we are able to supply all 6 Trusted Runners with a public IP, but maybe we can create one or two Trusted Runners with a dedicated tag for tools like buildkit. I'll try to dig a little deeper into that in T295481.

I think Trusted Runners need outgoing internet access. I assume this is also the reason why the contint machines have public addresses. I'm not sure if we are able to supply all 6 Trusted Runners with a public IP, but maybe we can create one or two Trusted Runners with a dedicated tag for tools like buildkit. I'll try to dig a little deeper into that in T295481.

The contint machines are a weird case. They're runners, they serve the Jenkins web UI, and they have to talk over SSH to WMCS: they're strange beasts from a networking point of view.

Thanks for digging into this <3

I second that: Thanks for digging in :)

I tried to configure buildkit and buildctl to use this proxy by setting the http_proxy, https_proxy, and no_proxy environment variables. With this configuration the job is able to access the helper images. However, the job fails with a permission-denied error a few lines later, which doesn't sound network related. Note: gitlab-runner1004 is currently paused and used for this testing.

The permission error seems to be related to the buildkit user not owning its own home directory within the buildctl image. I'll submit a patch and rebuild the image.

I have doubts about whether it is suitable to configure these proxies for all build tools we will be using. Each and every tool and package manager would have to be configured for the proxies individually(?).

It does seem quite onerous to expect all build tools to both rely only on HTTP(S) and support proxies, but this is an option to at least unblock us on testing other aspects of the pipeline such as JWT auth and publishing, so thank you very much!
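Presumably the new proxy option amounts to passing the proxy environment into the buildkitd container, along these lines (a sketch with placeholder values):

docker run -d --name buildkitd \
  -e http_proxy=http://webproxy.eqiad.wmnet:8080 \
  -e https_proxy=http://webproxy.eqiad.wmnet:8080 \
  -e no_proxy=.wikimedia.org,.wmnet \
  docker-registry.wikimedia.org/buildkitd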

I think Trusted Runners need outgoing internet access. I assume this is also the reason why the contint machines have public addresses. I'm not sure if we are able to supply all 6 Trusted Runners with a public IP, but maybe we can create one or two Trusted Runners with a dedicated tag for tools like buildkit. I'll try to dig a little deeper into that in T295481.

I think all will need egress. Do we not have the option of SNATing?

Change 832374 had a related patch set uploaded (by Dduvall; author: Dduvall):

[integration/config@master] dockerfiles: Create and chown home directory for buildctl user

https://gerrit.wikimedia.org/r/832374

Change 832460 had a related patch set uploaded (by Jelto; author: Jelto):

[operations/puppet@production] buildkitd: add option to enable proxy settings for buildkitd

https://gerrit.wikimedia.org/r/832460

Change 832460 merged by Jelto:

[operations/puppet@production] buildkitd: add option to enable proxy settings for buildkitd

https://gerrit.wikimedia.org/r/832460

Change 832374 merged by jenkins-bot:

[integration/config@master] dockerfiles: Create the buildctl user home directory

https://gerrit.wikimedia.org/r/832374

Change 835162 had a related patch set uploaded (by Jelto; author: Jelto):

[operations/puppet@production] gitlab_runner: enable unprivileged_userns_clone in WMCS

https://gerrit.wikimedia.org/r/835162

Change 835162 merged by Dzahn:

[operations/puppet@production] gitlab_runner: enable unprivileged_userns_clone in WMCS

https://gerrit.wikimedia.org/r/835162

Change 839457 had a related patch set uploaded (by Jelto; author: Jelto):

[operations/puppet@production] buildkit: add no_proxy for wmf domains

https://gerrit.wikimedia.org/r/839457

Change 839457 merged by Jelto:

[operations/puppet@production] buildkit: add no_proxy for wmf domains

https://gerrit.wikimedia.org/r/839457

Cross-post from T308501#8290690:

It seems we have our first successful buildkitd build job executed on the Trusted Runners.

Change 841557 had a related patch set uploaded (by Ahmon Dancy; author: Ahmon Dancy):

[operations/puppet@production] Restart buildkitd if its config files change

https://gerrit.wikimedia.org/r/841557

Change 841557 merged by Jbond:

[operations/puppet@production] Restart buildkitd if its config files change

https://gerrit.wikimedia.org/r/841557