On January 19th around 14:45, contint1001.wikimedia.org became slow, with high CPU, memory, and disk I/O usage.
The Jenkins controller mostly fails to SSH to it. Launch attempts either fail after about 65 seconds or take a long time to complete:
Jan 19 14:56:31 contint2001 jenkins[998]: [01/19/22 14:56:31] SSH Launch of contint1001 on 208.80.154.17 failed in 65,004 ms
Jan 19 15:17:32 contint2001 jenkins[998]: [01/19/22 15:17:32] SSH Launch of contint1001 on 208.80.154.17 completed in 1,206,232 ms
Jan 19 15:20:19 contint2001 jenkins[998]: [01/19/22 15:20:19] SSH Launch of contint1001 on 208.80.154.17 completed in 53,227 ms
Jan 19 15:46:31 contint2001 jenkins[998]: [01/19/22 15:46:31] SSH Launch of contint1001 on 208.80.154.17 failed in 65,029 ms
Jan 19 15:48:31 contint2001 jenkins[998]: [01/19/22 15:48:31] SSH Launch of contint1001 on 208.80.154.17 failed in 65,004 ms
Jan 19 15:50:31 contint2001 jenkins[998]: [01/19/22 15:50:31] SSH Launch of contint1001 on 208.80.154.17 failed in 65,004 ms
Jan 19 15:52:31 contint2001 jenkins[998]: [01/19/22 15:52:31] SSH Launch of contint1001 on 208.80.154.17 failed in 65,004 ms
Jan 19 15:54:31 contint2001 jenkins[998]: [01/19/22 15:54:31] SSH Launch of contint1001 on 208.80.154.17 failed in 65,005 ms
Jan 19 16:10:15 contint2001 jenkins[998]: [01/19/22 16:10:15] SSH Launch of contint1001 on 208.80.154.17 failed in 65,004 ms
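To get a quick tally of failed versus completed launches from a log like the one above, something along these lines could work. It is only a rough sketch: the regular expression is inferred from the lines shown here rather than from any documented Jenkins log format, and the names `parse_launches` and `summarize` are made up for illustration. The sample embeds just the first three entries of the log.

```python
import re

# First three entries from the log excerpt above (sample input only).
LOG = """\
Jan 19 14:56:31 contint2001 jenkins[998]: [01/19/22 14:56:31] SSH Launch of contint1001 on 208.80.154.17 failed in 65,004 ms
Jan 19 15:17:32 contint2001 jenkins[998]: [01/19/22 15:17:32] SSH Launch of contint1001 on 208.80.154.17 completed in 1,206,232 ms
Jan 19 15:20:19 contint2001 jenkins[998]: [01/19/22 15:20:19] SSH Launch of contint1001 on 208.80.154.17 completed in 53,227 ms
"""

# Assumed pattern, derived only from the lines pasted in this report.
PATTERN = re.compile(
    r"\[(?P<ts>[^\]]+)\] SSH Launch of (?P<agent>\S+) on (?P<ip>\S+) "
    r"(?P<status>failed|completed) in (?P<ms>[\d,]+) ms"
)


def parse_launches(text):
    """Return (timestamp, agent, status, duration_ms) for each launch entry."""
    launches = []
    # finditer scans the whole text, so entries need not be one per line.
    for m in PATTERN.finditer(text):
        duration_ms = int(m["ms"].replace(",", ""))
        launches.append((m["ts"], m["agent"], m["status"], duration_ms))
    return launches


def summarize(launches):
    """Count launches per status and record the longest duration of each."""
    summary = {}
    for _ts, _agent, status, duration_ms in launches:
        entry = summary.setdefault(status, {"count": 0, "max_ms": 0})
        entry["count"] += 1
        entry["max_ms"] = max(entry["max_ms"], duration_ms)
    return summary


if __name__ == "__main__":
    for status, stats in summarize(parse_launches(LOG)).items():
        print(status, stats["count"], "launches, slowest", stats["max_ms"], "ms")
```

The durations already tell part of the story: the slowest completed launch (1,206,232 ms) took roughly 20 minutes, and the failures all cluster around 65 seconds, which would be consistent with a fixed timeout, although the log itself does not confirm that.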
According to htop, the host has 293 tasks and 2158 threads.