This URL should load the beta version of ORES deployments. Currently it spins. It could be that the wsgi service is not responding.
Mentioned in SAL (#wikimedia-releng) [2021-04-17T07:23:56Z] <Majavah> restart uwsgi-ores on deployment-ores01 for T280420
I tried to restart it, without success:
Apr 17 07:24:18 deployment-ores01 systemd[1]: uwsgi-ores.service: Failed with result 'timeout'.
I killed the uwsgi process and restarted it, but any HTTP query to port 8081 still seems to hang. I see the following in the logs:
2021-04-17 07:24:20,082 WARNING ores.scoring_context: Loading model arwiki_goodfaith with sub-process
2021-04-17 07:24:20,194 WARNING revscoring.scoring.environment: Differences between the current environment and the environment in which the model was constructed environment were detected:
 - revscoring_version '2.8.0' mismatch with original environment '2.8.2'
 - python_build ('default', 'Sep 27 2018 17:25:39') mismatch with original environment ('default', 'Apr 5 2021 09:00:41')
 - version '#1 SMP Debian 4.9.189-3+deb9u1 (2019-09-20)' mismatch with original environment '#1 SMP Debian 4.19.171-2~deb9u1 (2021-02-08)'
 - release '4.9.0-11-amd64' mismatch with original environment '4.19.0-0.bpo.14-amd64'
 - platform 'Linux-4.9.0-11-amd64-x86_64-with-debian-9.12' mismatch with original environment 'Linux-4.19.0-0.bpo.14-amd64-x86_64-with-debian-9.3'
Maybe not related, but worth noting :)
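A hung service and a dead one look different from the client side, so it can help to probe the port with an explicit timeout. The following is only a minimal sketch: the function name probe, the 5-second default and the "/" path in the usage comment are assumptions made for illustration, and only port 8081 comes from this task.

```python
import socket
import urllib.error
import urllib.request


def probe(url, timeout=5.0):
    """Classify an HTTP endpoint as 'responded (<code>)', 'timeout' or 'unreachable'."""
    # Ignore any proxy configured in the environment: the target is a local port.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as resp:
            return "responded ({})".format(resp.status)
    except urllib.error.HTTPError as exc:
        # The server answered, even if with an error status.
        return "responded ({})".format(exc.code)
    except urllib.error.URLError as exc:
        # Timeouts during connect are wrapped in URLError.
        if isinstance(exc.reason, socket.timeout):
            return "timeout"
        return "unreachable"
    except socket.timeout:
        # Timeouts while waiting for the response are raised directly.
        return "timeout"
    except OSError:
        return "unreachable"


# Hypothetical usage on the host: probe("http://localhost:8081/")
```

A result of "timeout" means the connection was accepted but nothing answered, which matches the hang described above. "unreachable" would instead suggest that nothing is listening on the port.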
Celery fails with:
Apr 13 17:31:19 deployment-ores01 systemd[1]: Started Celery workers.
Apr 13 17:31:43 deployment-ores01 celery-ores-worker[23526]: Process Process-30:
Apr 13 17:31:43 deployment-ores01 celery-ores-worker[23526]: Traceback (most recent call last):
Apr 13 17:31:43 deployment-ores01 celery-ores-worker[23526]: File "/usr/lib/python3.5/multiprocessing/process.py", line 249, in _bootstrap
Apr 13 17:31:43 deployment-ores01 celery-ores-worker[23526]: self.run()
Apr 13 17:31:43 deployment-ores01 celery-ores-worker[23526]: File "/usr/lib/python3.5/multiprocessing/process.py", line 93, in run
Apr 13 17:31:43 deployment-ores01 celery-ores-worker[23526]: self._target(*self._args, **self._kwargs)
Apr 13 17:31:43 deployment-ores01 celery-ores-worker[23526]: File "/srv/deployment/ores/deploy-cache/revs/257a349d02347537c1cbb5d6a4a367ccaf08a3cb/ores/scoring_context.py", line 278, in load_model_and_queue
Apr 13 17:31:43 deployment-ores01 celery-ores-worker[23526]: model = Model.from_config(config, key)
Apr 13 17:31:43 deployment-ores01 celery-ores-worker[23526]: File "/srv/deployment/ores/deploy-cache/revs/257a349d02347537c1cbb5d6a4a367ccaf08a3cb/venv/lib/python3.5/site-packages/revscoring/scoring/models/model.py", line 131, in from_config
Apr 13 17:31:43 deployment-ores01 celery-ores-worker[23526]: return Class.load(stream)
Apr 13 17:31:43 deployment-ores01 celery-ores-worker[23526]: File "/srv/deployment/ores/deploy-cache/revs/257a349d02347537c1cbb5d6a4a367ccaf08a3cb/venv/lib/python3.5/site-packages/revscoring/scoring/models/model.py", line 104, in load
Apr 13 17:31:43 deployment-ores01 celery-ores-worker[23526]: model = pickle.load(f)
Apr 13 17:31:43 deployment-ores01 celery-ores-worker[23526]: File "/srv/deployment/ores/deploy-cache/revs/257a349d02347537c1cbb5d6a4a367ccaf08a3cb/drafttopic/feature_lists/euwiki.py", line 7, in <module>
Apr 13 17:31:43 deployment-ores01 celery-ores-worker[23526]: filename="euwiki-20200501-learned_vectors.50_cell.10k.kv", mmap='r')
Apr 13 17:31:43 deployment-ores01 celery-ores-worker[23526]: File "/srv/deployment/ores/deploy-cache/revs/257a349d02347537c1cbb5d6a4a367ccaf08a3cb/venv/lib/python3.5/site-packages/revscoring/datasources/meta/vectorizers.py", line 80, in load_gensim_kv
Apr 13 17:31:43 deployment-ores01 celery-ores-worker[23526]: speficies file path of the binary")
Apr 13 17:31:43 deployment-ores01 celery-ores-worker[23526]: FileNotFoundError: Please make sure that 'filename' specifies the word vector binary name in default search paths or 'path' speficies file path of the binary
Apr 17 10:46:58 deployment-ores01 systemd[1]: Stopping Celery workers...
Apr 17 10:46:58 deployment-ores01 systemd[1]: Stopped Celery workers.
Apr 17 10:46:58 deployment-ores01 systemd[1]: Started Celery workers.
Apr 17 10:47:24 deployment-ores01 celery-ores-worker[10152]: Process Process-31:
Apr 17 10:47:24 deployment-ores01 celery-ores-worker[10152]: Traceback (most recent call last):
Apr 17 10:47:24 deployment-ores01 celery-ores-worker[10152]: File "/usr/lib/python3.5/multiprocessing/process.py", line 249, in _bootstrap
Apr 17 10:47:24 deployment-ores01 celery-ores-worker[10152]: self.run()
Apr 17 10:47:24 deployment-ores01 celery-ores-worker[10152]: File "/usr/lib/python3.5/multiprocessing/process.py", line 93, in run
Apr 17 10:47:24 deployment-ores01 celery-ores-worker[10152]: self._target(*self._args, **self._kwargs)
Apr 17 10:47:24 deployment-ores01 celery-ores-worker[10152]: File "/srv/deployment/ores/deploy-cache/revs/257a349d02347537c1cbb5d6a4a367ccaf08a3cb/ores/scoring_context.py", line 278, in load_model_and_queue
Apr 17 10:47:24 deployment-ores01 celery-ores-worker[10152]: model = Model.from_config(config, key)
Apr 17 10:47:24 deployment-ores01 celery-ores-worker[10152]: File "/srv/deployment/ores/deploy-cache/revs/257a349d02347537c1cbb5d6a4a367ccaf08a3cb/venv/lib/python3.5/site-packages/revscoring/scoring/models/model.py", line 131, in from_config
Apr 17 10:47:24 deployment-ores01 celery-ores-worker[10152]: return Class.load(stream)
Apr 17 10:47:24 deployment-ores01 celery-ores-worker[10152]: File "/srv/deployment/ores/deploy-cache/revs/257a349d02347537c1cbb5d6a4a367ccaf08a3cb/venv/lib/python3.5/site-packages/revscoring/scoring/models/model.py", line 104, in load
Apr 17 10:47:24 deployment-ores01 celery-ores-worker[10152]: model = pickle.load(f)
Apr 17 10:47:24 deployment-ores01 celery-ores-worker[10152]: File "/srv/deployment/ores/deploy-cache/revs/257a349d02347537c1cbb5d6a4a367ccaf08a3cb/drafttopic/feature_lists/euwiki.py", line 7, in <module>
Apr 17 10:47:24 deployment-ores01 celery-ores-worker[10152]: filename="euwiki-20200501-learned_vectors.50_cell.10k.kv", mmap='r')
Apr 17 10:47:24 deployment-ores01 celery-ores-worker[10152]: File "/srv/deployment/ores/deploy-cache/revs/257a349d02347537c1cbb5d6a4a367ccaf08a3cb/venv/lib/python3.5/site-packages/revscoring/datasources/meta/vectorizers.py", line 80, in load_gensim_kv
Apr 17 10:47:24 deployment-ores01 celery-ores-worker[10152]: speficies file path of the binary")
Apr 17 10:47:24 deployment-ores01 celery-ores-worker[10152]: FileNotFoundError: Please make sure that 'filename' specifies the word vector binary name in default search paths or 'path'
Also this is the status of uwsgi and celery after the restart:
www-data 10152 1.9 11.3 1585260 923940 ? Ss 10:46 0:10 /srv/deployment/ores/deploy-cache/revs/257a349d02347537c1cbb5d6a4a367ccaf08a3cb/venv/bin/python3 /srv/deployment/ores/deploy/venv/bin/celery worker --app ores_celery.application --loglevel ERROR
www-data 10237 0.2 0.0 0 0 ? Z 10:47 0:01 \_ [celery] <defunct>
www-data 10338 1.0 3.4 937220 283452 ? Ss 10:49 0:04 /usr/bin/uwsgi --die-on-term --ini /etc/uwsgi/apps-enabled/ores.ini
www-data 10416 0.3 0.0 0 0 ? Z 10:49 0:01 \_ [uwsgi] <defunct>
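Both services show a defunct child in the listing above, which is consistent with a sub-process dying shortly after startup. As a rough sketch, such entries can be picked out of ps output by checking the STAT column. The function name defunct_processes is hypothetical, and the column layout assumed here is the usual "ps aux" one (USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND).

```python
def defunct_processes(ps_aux_output):
    """Return (pid, command) pairs for zombie processes in 'ps aux' style text."""
    zombies = []
    for line in ps_aux_output.splitlines():
        # Split into the 10 fixed columns plus the free-form command.
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # header line or something that is not a process row
        if fields[7].startswith("Z"):  # STAT column: Z marks a zombie
            zombies.append((int(fields[1]), fields[10]))
    return zombies
```

Fed the four lines above, this should return the two defunct entries, PIDs 10237 and 10416.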
@Halfak can it be something related to your last change? Maybe it is missing something?
elukey@deployment-ores01:/srv/deployment/ores/deploy$ sudo find -name euwiki*
./submodules/articlequality/articlequality/feature_lists/euwiki.py
./submodules/articlequality/model_info/euwiki.wp10.md
./submodules/articlequality/tuning_reports/euwiki.wp10.md
./submodules/articlequality/models/euwiki.wp10.random_forest.model
./submodules/drafttopic/model_info/euwiki.articletopic.md
./submodules/drafttopic/model_info/euwiki.drafttopic.md
./submodules/drafttopic/drafttopic/feature_lists/euwiki.py
./submodules/drafttopic/models/euwiki.drafttopic.gradient_boosting.model
./submodules/drafttopic/models/euwiki.articletopic.gradient_boosting.model
./submodules/assets/word2vec/euwiki-20201201-learned_vectors.50_cell.10k.kv <====== this is not "euwiki-20200501-learned_vectors.50_cell.10k.kv"
Aha! It does seem like there is a mismatch here. I'm not sure why it appears that the submodules are not being updated. That might be a red herring. This code and these assets should be in alignment and they are not. I'll go digging. Thanks @elukey
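The mismatch above (code asking for a 20200501 vectors file while the assets directory holds a 20201201 one) is the kind of thing a small pre-deploy check could catch. What follows is a minimal sketch under stated assumptions, not how ORES or revscoring actually validates assets: the function name missing_vector_files is hypothetical, and it simply looks for quoted strings ending in .kv inside Python files and compares them with the .kv files found on disk.

```python
import re
from pathlib import Path

# Quoted string literals that end in ".kv", e.g. filename="....kv"
KV_NAME = re.compile(r"""["']([^"']+\.kv)["']""")


def missing_vector_files(code_dir, assets_dir):
    """List (source file, referenced name) pairs with no matching .kv asset."""
    available = {p.name for p in Path(assets_dir).rglob("*.kv")}
    missing = []
    for source in sorted(Path(code_dir).rglob("*.py")):
        text = source.read_text(encoding="utf-8")
        for name in KV_NAME.findall(text):
            if name not in available:
                missing.append((str(source), name))
    return missing
```

Run against the drafttopic feature lists and the assets submodule from the find output above, a check like this would be expected to flag euwiki.py, since the name it references is not among the files on disk.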
I confirmed that some code was not updated for these models and that is causing the issue. I have a change in progress that should resolve the issue. I'd like to keep this task open until we can get ores-beta back online.