Investigate cache issues after WDQS UI deployments
Closed, Resolved, Public

Description

Several times now, we’ve had reports of a broken Wikidata Query GUI (https://query.wikidata.org/) after merging a new version in the wikidata/query/gui-deploy repo (which gets automatically deployed with the next Puppet run). Apparently a cache purge via purgeList.php can solve it (T301457#7700452), but we should look into why that’s even necessary.

Event Timeline

I think I understand what’s going on… this is the cache header we send for /index.html and /embed.html:

cache-control: max-age=3600, must-revalidate

Because the HTTP caching spec was written by three knights who only speak in riddles, must-revalidate only means that the response must be revalidated when it’s stale; within that max-age (one hour), it can be freely reused. So people might see the old index.html and embed.html files, referencing JS/CSS files that no longer exist, for up to an hour (or two, three, four hours, depending on how many cache layers there are). Ideally, the old JS/CSS files would be cached during that time as well, but apparently that’s not always happening.
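To make the freshness rule concrete, here is a minimal sketch of the decision a cache makes before reusing a stored response. The function names are hypothetical, and this is a simplification (it ignores heuristic freshness, s-maxage, and request directives); it is not how any particular cache layer in front of WDQS is implemented.

```python
def parse_cache_control(header: str) -> dict:
    """Split a Cache-Control value into {directive: value or None}."""
    directives = {}
    for part in header.split(","):
        part = part.strip().lower()
        if not part:
            continue
        name, _, value = part.partition("=")
        directives[name.strip()] = value.strip() or None
    return directives


def can_reuse_without_revalidation(header: str, age_seconds: int) -> bool:
    """True if a cache may serve the stored response without asking the origin.

    Hypothetical helper for illustration only.
    """
    directives = parse_cache_control(header)
    # no-store forbids storing; no-cache requires validation before every reuse.
    if "no-store" in directives or "no-cache" in directives:
        return False
    max_age = directives.get("max-age")
    if max_age is None:
        return False  # simplification: no heuristic freshness
    # must-revalidate is deliberately not consulted here: it only matters
    # once the response is stale, which is the point made above.
    return age_seconds < int(max_age)
```

With the header quoted above, a 30-minute-old copy of index.html is reused as-is (`can_reuse_without_revalidation("max-age=3600, must-revalidate", 1800)` is true), while the same call with `"no-cache"` is false even at age 0.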

I think what we want instead is no-cache, which (obviously!) doesn’t mean that the response can’t be cached at all, but only that it must be validated (e.g. If-Modified-Since) before the cached response can be reused. This seems like a better fit for the /index.html and /embed.html pages. For the JS/CSS files, we also send max-age=3600, must-revalidate, and in that case I think that’s fine, since those files are basically content-addressed via the hash in their file name (we could perhaps go even stronger, e.g. immutable and long max-age – but we don’t have to).
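The validation step that no-cache forces can be sketched from the origin's side roughly as follows. This is an illustrative approximation of If-Modified-Since handling, not Apache's actual logic; the function name and the dates in the usage note are made up.

```python
from email.utils import parsedate_to_datetime
from typing import Optional


def validation_status(if_modified_since: Optional[str], last_modified: str) -> int:
    """Return 304 if the client's cached copy is still current, else 200.

    Hypothetical helper: both arguments are HTTP date strings such as
    "Tue, 21 Jun 2022 18:00:00 GMT".
    """
    if if_modified_since is None:
        return 200  # unconditional request: send the full response
    try:
        since = parsedate_to_datetime(if_modified_since)
        modified = parsedate_to_datetime(last_modified)
        # Not modified after the client's copy: the cache may reuse it.
        return 304 if modified <= since else 200
    except (TypeError, ValueError):
        return 200  # unparseable or incomparable dates: fall back to a full response
```

For example, if index.html was last modified at "Tue, 21 Jun 2022 18:00:00 GMT", a request carrying that same date in If-Modified-Since gets a 304 and the cached copy is reused, while a request carrying a date from the day before gets a 200 with the new file. Either way the cache has checked with the origin, so a freshly deployed index.html is picked up right away.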

Change 799297 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[operations/puppet@production] query_service: don’t cache index files

https://gerrit.wikimedia.org/r/799297

Change 799297 merged by RLazarus:

[operations/puppet@production] query_service: don’t cache index files

https://gerrit.wikimedia.org/r/799297

Status update: the Puppet change has been merged, but apparently the Apache config isn’t reloaded automatically. @RLazarus indicated on Gerrit that @EBernhardson would be the right person for that.

@Lucas_Werkmeister_WMDE thanks for the heads up, I'll get the configs reloaded on Monday.

Mentioned in SAL (#wikimedia-operations) [2022-06-21T18:48:18Z] <ryankemper> T301461 ryankemper@miscweb1002:~$ sudo systemctl reload apache2

Change 807200 had a related patch set uploaded (by Ryan Kemper; author: Ryan Kemper):

[operations/puppet@production] query_service: fix syntax error

https://gerrit.wikimedia.org/r/807200

Mentioned in SAL (#wikimedia-operations) [2022-06-21T18:56:33Z] <ryankemper> T301461 ryankemper@miscweb1002:~$ sudo systemctl reload apache2 failed due to syntax error, patch here: https://gerrit.wikimedia.org/r/c/operations/puppet/+/807200

Change 807200 merged by Ryan Kemper:

[operations/puppet@production] query_service: fix syntax error in apache config

https://gerrit.wikimedia.org/r/807200

@RKemper thanks, that would be great!

Sorry for the delay here, we had some confusion as to whether the correct file was modified in the previous patch (spoiler alert: it was!). Just reloaded the config, and tests are passing after a quick patch:

ryankemper@cumin1001:~$ httpbb /srv/deployment/httpbb-tests/query_service/test_wdqs.yaml --hosts wdqs1012.eqiad.wmnet
Sending to wdqs1012.eqiad.wmnet...
PASS: 4 requests sent to wdqs1012.eqiad.wmnet. All assertions passed.

We should be good here, although I'll defer to @Lucas_Werkmeister_WMDE for any further testing

Thanks! Header looks good to me now:

$ curl -sI https://query.wikidata.org | grep -i cache-control
cache-control: no-cache
$ curl -sI https://query.wikidata.org/js/wdqs.min.3e953121da5bfd51eb0d.js | grep -i cache-control
cache-control: max-age=3600, must-revalidate

I’ll deploy some WDQS UI and query builder updates later, and hopefully they won’t cause any cache issues this time.

Lucas_Werkmeister_WMDE claimed this task.
Lucas_Werkmeister_WMDE moved this task from Backlog to Done on the Wikidata Query UI board.

Looks like it’s working, I didn’t notice any errors with today’s WDQS UI deployment and haven’t heard of any user problems either.