GlobalRename has become very slow (March 13, 2021)
Closed, Declined · Public

Description

Normally, renaming a user takes between a few seconds and a couple of minutes. For some reason, global renames are extremely slow today. Are there any DB or JobQueue issues?

Event Timeline

Restricted Application added a subscriber: Aklapper.

I don't know if it matters, but two days ago, from 9:30 (UTC), the queue was completely stopped for more than an hour.

Generally we have a latency issue:

image.png (345×1 px, 74 KB)

https://grafana.wikimedia.org/d/RIA1lzDZk/application-servers-red-dashboard?orgId=1

It's slowly recovering, but I don't know what the root cause is.

Pppery renamed this task from GlobalRename has become very slow (March 13, 2020) to GlobalRename has become very slow (March 13, 2021). · Mar 13 2021, 7:33 PM

Just to mention that global rename seems to be usable again since a couple of days ago. I'm not sure if this is related to the task @Daimona (auto)linked above but this incident looks resolved. @Ladsgroup, feel free to close if you agree.

To be honest, yesterday around 16:15 (UTC) they were still very slow :( btw if there is nothing else to do here, you can close!

So let me try to explain why this happens. Every rename triggers lots of jobs to be queued (and eventually run). The priority of these jobs is set to "low", since they matter less than, for example, cleanups after deletions (which avoid leaving private data lingering around), and even if a rename takes an hour or so to finish, it shouldn't be a big deal. We can increase its priority or its concurrency, but tbh we shouldn't. The reason you see this problem now and not a month ago is that we upgraded our jobrunners to the new Debian release (buster), which is slower because it has better security, meaning we are slowly reaching our capacity (and adding capacity is not super easy). This will hopefully get better, but until then global rename will be like the canary in a coal mine.

If it's really a big problem, let us know and we can bump its priority.
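
To illustrate the "canary" effect described above, here is a minimal toy simulation in Python. It is only a sketch under invented assumptions: the function name `simulate`, the two priority levels, and all the numbers are hypothetical, and it does not reflect how the real JobQueue or the jobrunners are implemented. It just shows how low-priority jobs are the first to slow down when runner capacity gets close to the rate of incoming higher-priority work.

```python
import heapq
from itertools import count

HIGH, LOW = 0, 1  # smaller number is served first (hypothetical levels)


def simulate(capacity_per_tick, high_per_tick, rename_jobs, max_ticks=10_000):
    """Return the tick at which the last low-priority rename job finishes.

    Each tick, `high_per_tick` high-priority jobs arrive, then the runners
    execute up to `capacity_per_tick` jobs, always taking the highest
    priority (and, within a priority, the oldest) job first.
    Returns None if the rename jobs never finish within `max_ticks`.
    """
    queue = []
    order = count()  # tie-breaker so equal priorities stay FIFO
    for _ in range(rename_jobs):
        heapq.heappush(queue, (LOW, next(order)))
    remaining_low = rename_jobs

    for tick in range(1, max_ticks + 1):
        # New high-priority work arrives before the runners pick jobs.
        for _ in range(high_per_tick):
            heapq.heappush(queue, (HIGH, next(order)))
        # Runners process as many jobs as capacity allows this tick.
        for _ in range(capacity_per_tick):
            if not queue:
                break
            priority, _seq = heapq.heappop(queue)
            if priority == LOW:
                remaining_low -= 1
        if remaining_low == 0:
            return tick
    return None


if __name__ == "__main__":
    # Same workload, shrinking capacity (made-up numbers).
    for capacity in (10, 9, 8):
        print(capacity, simulate(capacity, high_per_tick=8, rename_jobs=20, max_ticks=100))
```

In this toy model, dropping capacity from 10 to 9 jobs per tick doubles the time a rename takes, and at 8 the rename jobs never finish, while the high-priority jobs are unaffected throughout. That is the sense in which low-priority work reveals a capacity problem first.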