Upstream: https://issues.jenkins.io/browse/JENKINS-65590
Tentative patch: https://github.com/jenkinsci/gearman-plugin/pull/13
Reported by @Tarrow:
The number of waiting Gearman jobs seems to have kept growing since around 18:00 UTC yesterday.
See https://grafana.wikimedia.org/d/000000322/zuul-gearman?orgId=1&from=now-24h&to=now
I'm not even sure this is really a problem, hence why I stuck my head in here. Anecdotally, some people here at WMDE are claiming that more jobs are timing out than usual.
On contint2001:
$ zuul-gearman.py status|sort -k2 -g |tail -n2
stop:contint2001.wikimedia.org 391 1 1
set_description:contint2001.wikimedia.org 3365 0 1
Zuul is sending stop and set_description jobs which apparently are never run?
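To spot such backlogs without eyeballing the sorted output, the status lines could be parsed with a small script. This is only a sketch: it assumes the four columns follow the Gearman admin `status` layout (function name, jobs in queue, jobs running, available workers), and `parse_status`, `backlogged` and the threshold of 100 are hypothetical names and values chosen for illustration, not part of zuul-gearman.py.

```python
def parse_status(text):
    """Parse status lines of the form: name queued running workers."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 4:
            continue  # skip blank lines and anything that is not a 4-column row
        name, queued, running, workers = parts
        rows.append((name, int(queued), int(running), int(workers)))
    return rows


def backlogged(rows, threshold=100):
    """Return rows where queued minus running exceeds threshold, largest queue first.

    The threshold is an arbitrary illustrative value (assumption).
    """
    stuck = [row for row in rows if row[1] - row[2] > threshold]
    return sorted(stuck, key=lambda row: row[1], reverse=True)


# Sample input copied from the output on contint2001.
sample = """\
stop:contint2001.wikimedia.org 391 1 1
set_description:contint2001.wikimedia.org 3365 0 1
"""

for name, queued, running, workers in backlogged(parse_status(sample)):
    print(f"{name}: {queued} queued, {running} running, {workers} workers")
```

With the sample above, both functions are reported, set_description first, since each has a worker registered yet almost nothing running.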
My bet is that the Gearman plugin is unable to carry out those tasks because it lacks some permission. But that would not explain why it started happening on a Sunday (May 2nd, ~18:00 UTC).