Analytics-Clusters (Component)
Archived · Public

Members (3)

Details

Description

Superseded by Shared-Data-Infrastructure

Wikimedia's Big Data platform https://wikitech.wikimedia.org/wiki/Analytics/Cluster

Analytics projects aim to give the wiki movement a data services platform that provides insight into community activity (WMF Data Engineering team).

Recent Activity

Feb 5 2024

awight added a comment to T268784: Configure superset cache.

We're curious whether caching can be turned on after the Superset 3 upgrade. We're having trouble finding the newest task about this...

Feb 5 2024, 11:49 AM · Analytics-Clusters, Product-Analytics

Jan 16 2024

gerritbot added a comment to T273642: Add analytics-presto.eqiad.wmnet CNAME for Presto coordinator failover.

Change 709713 merged by Btullis:

[operations/puppet@production] Switch presto from Puppet to PKI certificates

https://gerrit.wikimedia.org/r/709713

Jan 16 2024, 11:06 AM · Patch-For-Review, Analytics-Kanban, Analytics-Clusters

Dec 19 2023

gerritbot added a comment to T273642: Add analytics-presto.eqiad.wmnet CNAME for Presto coordinator failover.

Change 709737 abandoned by Btullis:

[operations/puppet@production] Add presto keytabs to the cluster coordinator replica role

Reason:

No longer needed.

https://gerrit.wikimedia.org/r/709737

Dec 19 2023, 10:38 AM · Patch-For-Review, Analytics-Kanban, Analytics-Clusters

Oct 19 2023

Maintenance_bot removed a project from T255026: Upgrade schema[12]00[12] to Debian Buster: Patch-For-Review.
Oct 19 2023, 11:11 AM · Analytics-Kanban, Analytics-Clusters

Aug 16 2023

Aklapper set the color for Analytics-Clusters to Red.
Aug 16 2023, 3:09 PM

Aug 10 2023

gerritbot added a comment to T278424: Upgrade the Hadoop coordinators to Debian Buster.

Change 947857 merged by Btullis:

[operations/puppet@production] Create component/libmysql-java for bullseye

https://gerrit.wikimedia.org/r/947857

Aug 10 2023, 4:10 PM · Analytics-Kanban, Patch-For-Review, Analytics-Clusters
gerritbot added a comment to T278424: Upgrade the Hadoop coordinators to Debian Buster.

Change 947857 had a related patch set uploaded (by Btullis; author: Btullis):

[operations/puppet@production] Create component/libmysql-java for bullseye

https://gerrit.wikimedia.org/r/947857

Aug 10 2023, 2:41 PM · Analytics-Kanban, Patch-For-Review, Analytics-Clusters

Jul 12 2023

Maintenance_bot removed a project from T257572: Set up a testing environment for the AQS Cassandra 3 migration: Patch-For-Review.
Jul 12 2023, 12:10 PM · Data-Engineering-Kanban, Data-Engineering, Analytics-Kanban, Analytics-Clusters, Cassandra
gerritbot added a comment to T257572: Set up a testing environment for the AQS Cassandra 3 migration.

Change 679295 abandoned by Hnowlan:

[analytics/aqs@master] Add docker-compose environment with cassandra

Reason:

Not needed

https://gerrit.wikimedia.org/r/679295

Jul 12 2023, 11:48 AM · Data-Engineering-Kanban, Data-Engineering, Analytics-Kanban, Analytics-Clusters, Cassandra

May 16 2023

Maintenance_bot removed a project from T185581: Upgrade spark2 .deb to spark 2.2.1: Patch-For-Review.
May 16 2023, 11:31 AM · User-Elukey, Analytics-Clusters, Analytics-Kanban

Apr 27 2023

elukey closed T275896: Review ROCm deployment procedures and current packages, a subtask of T231067: Install Debian Buster on Hadoop, as Declined.
Apr 27 2023, 8:38 AM · Analytics-Clusters

Mar 29 2023

BTullis closed T229347: Rebuild spark2 for Debian Buster, a subtask of T222253: Upgrade Spark to 2.4.x, as Resolved.
Mar 29 2023, 10:03 AM · Analytics-Kanban, Analytics-Clusters

Mar 28 2023

Maintenance_bot removed a project from T268985: Improve user experience for Kerberos by creating automatic token renewal service: Patch-For-Review.
Mar 28 2023, 9:30 AM · Data-Engineering, Data-Engineering-Kanban, Analytics-Kanban, User-MoritzMuehlenhoff, Analytics-Clusters

Feb 24 2023

Aklapper archived Analytics-Clusters.
Feb 24 2023, 3:29 PM
Aklapper removed a hashtag from Analytics-Clusters: #analytics-cluster.
Feb 24 2023, 3:29 PM

Feb 23 2023

Dzahn merged T330360: too many puppet failures (puppet errors on stat hosts) into T330394: Scap issues with stat hosts.
Feb 23 2023, 7:19 PM · Analytics-Clusters, Scap
Dzahn added a comment to T330394: Scap issues with stat hosts.

Looks like it is. Merging in as duplicate. Also see T326668.

Feb 23 2023, 7:18 PM · Analytics-Clusters, Scap
Dzahn added a comment to T330394: Scap issues with stat hosts.

Is T330360 a duplicate of this?

Feb 23 2023, 7:15 PM · Analytics-Clusters, Scap
hashar added a comment to T330394: Scap issues with stat hosts.

Very nice fix @nfraison thank you!

Feb 23 2023, 5:15 PM · Analytics-Clusters, Scap
nfraison closed T330394: Scap issues with stat hosts as Resolved.
Feb 23 2023, 3:30 PM · Analytics-Clusters, Scap
Maintenance_bot removed a project from T330394: Scap issues with stat hosts: Patch-For-Review.
Feb 23 2023, 3:30 PM · Analytics-Clusters, Scap
gerritbot added a comment to T330394: Scap issues with stat hosts.

Change 891555 merged by Nicolas Fraison:

[operations/puppet@production] provider_scap3: update the query to execute as the deploy_user

https://gerrit.wikimedia.org/r/891555

Feb 23 2023, 3:21 PM · Analytics-Clusters, Scap
nfraison added a comment to T330394: Scap issues with stat hosts.

Seems to work fine.
Before trying to redeploy analytics/hdfs-tools/deploy:

Info: Unable to serialize catalog to json, retrying with pson
Info: Applying configuration version '(cb9d9be2dc) Muehlenhoff - Switch puppetdb to profile::java'
Error: Execution of '/usr/bin/scap deploy-local --repo analytics/hdfs-tools/deploy -D log_json:False' returned 70: 
Error: /Stage[main]/Profile::Analytics::Hdfs_tools/Scap::Target[analytics/hdfs-tools/deploy]/Package[analytics/hdfs-tools/deploy]/ensure: change from 'absent' to 'present' failed: Execution of '/usr/bin/scap deploy-local --repo analytics/hdfs-tools/deploy -D log_json:False' returned 70:  (corrective)
Notice: /Stage[main]/Profile::Analytics::Refinery::Repository/Scap::Target[analytics/refinery]/Package[analytics/refinery]/ensure: created (corrective)
Notice: /Stage[main]/Profile::Analytics::Hdfs_tools/File[/usr/local/bin/hdfs-rsync]: Dependency Package[analytics/hdfs-tools/deploy] has failures: true
Warning: /Stage[main]/Profile::Analytics::Hdfs_tools/File[/usr/local/bin/hdfs-rsync]: Skipping because of failed dependencies
Notice: /Stage[main]/Profile::Airflow/Airflow::Instance[analytics_test]/Scap::Target[airflow-dags/analytics_test]/Package[airflow-dags/analytics_test]/ensure: created (corrective)
Info: Stage[main]: Unscheduling all events on Stage[main]
Notice: Applied catalog in 60.36 seconds
Feb 23 2023, 2:57 PM · Analytics-Clusters, Scap
jbond added a comment to T330394: Scap issues with stat hosts.

Seems we mostly ended up with the same patch @jbond https://gerrit.wikimedia.org/r/891555 / https://gerrit.wikimedia.org/r/891557

I would be interested in reading or seeing how you test it manually. We could use an-test-client1001 for this test.

Feb 23 2023, 2:48 PM · Analytics-Clusters, Scap
gerritbot added a comment to T330394: Scap issues with stat hosts.

Change 891557 abandoned by Jbond:

[operations/puppet@production] scap - provider: update scap provider to run git with correct user

Reason:

Abandon in favour of 891555, which also has the tests.

https://gerrit.wikimedia.org/r/891557

Feb 23 2023, 2:41 PM · Analytics-Clusters, Scap
nfraison added a comment to T330394: Scap issues with stat hosts.

Seems we mostly ended up with the same patch @jbond https://gerrit.wikimedia.org/r/891555 / https://gerrit.wikimedia.org/r/891557

Feb 23 2023, 2:33 PM · Analytics-Clusters, Scap
jbond added a comment to T330394: Scap issues with stat hosts.

Running the /usr/bin/git -C /srv/deployment/analytics/hdfs-tools/deploy tag --points-at HEAD manually as the user owning the folder (analytics-deploy) works:
scap/sync/2020-02-28/0001

Feb 23 2023, 2:30 PM · Analytics-Clusters, Scap
gerritbot added a comment to T330394: Scap issues with stat hosts.

Change 891557 had a related patch set uploaded (by Jbond; author: jbond):

[operations/puppet@production] scap - provider: update scap provider to run git with correct user

https://gerrit.wikimedia.org/r/891557

Feb 23 2023, 2:28 PM · Analytics-Clusters, Scap
gerritbot added a project to T330394: Scap issues with stat hosts: Patch-For-Review.
Feb 23 2023, 2:27 PM · Analytics-Clusters, Scap
gerritbot added a comment to T330394: Scap issues with stat hosts.

Change 891555 had a related patch set uploaded (by Nicolas Fraison; author: Nicolas Fraison):

[operations/puppet@production] provider_scap3: update the query to execute as the deploy_user

https://gerrit.wikimedia.org/r/891555

Feb 23 2023, 2:27 PM · Analytics-Clusters, Scap
MoritzMuehlenhoff added a comment to T330394: Scap issues with stat hosts.

Or we could ensure that the first call to get state is also run as the user owning the folder?

Feb 23 2023, 2:14 PM · Analytics-Clusters, Scap
hashar added a comment to T330394: Scap issues with stat hosts.

The git security update for safe.directory is intended for exactly that use case. A deployer could inject a hook into the git repository (as the deployment user); then, when Puppet runs git as root on the repo, git might run some config or hook as root, resulting in a privilege escalation.
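
To make the ownership rule concrete, here is a minimal sketch of the kind of check described above. It is a simplified model and not git's actual implementation: the function name `would_trust` and the `safe_directories` parameter are hypothetical, standing in for the repository-owner check and the `safe.directory` allow-list.

```python
import os


def would_trust(path: str, uid: int, safe_directories: frozenset = frozenset()) -> bool:
    """Simplified model of an ownership check (not git's real logic).

    The repository is trusted when the calling user owns it, or when its
    resolved path has been explicitly allow-listed (compare safe.directory).
    """
    owned_by_caller = os.stat(path).st_uid == uid
    allow_listed = os.path.realpath(path) in safe_directories
    return owned_by_caller or allow_listed
```

Under this model, root calling into a repository owned by the deployment user is refused unless the path has been allow-listed, which matches the "dubious ownership" failure reported in this task.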

Feb 23 2023, 1:51 PM · Analytics-Clusters, Scap
nfraison added a comment to T330394: Scap issues with stat hosts.

Or we could ensure that the first call to get state is also run as the user owning the folder?
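
A rough sketch of that idea, assuming the caller is allowed to switch users with `sudo -u`. This is illustrative only and is not the merged Puppet provider change; the helper name `git_command_as_owner` is hypothetical.

```python
import os


def git_command_as_owner(repo_path: str, git_args: list) -> list:
    """Build an argv that runs git in repo_path as the directory's owner.

    Assumption: sudo is available and permits switching to the owning uid.
    """
    owner_uid = os.stat(repo_path).st_uid
    git_cmd = ["/usr/bin/git", "-C", repo_path, *git_args]
    if owner_uid == os.geteuid():
        # Already running as the owner, so no user switch is needed.
        return git_cmd
    # sudo accepts a numeric uid when it is prefixed with '#'.
    return ["sudo", "-u", f"#{owner_uid}", *git_cmd]
```

For example, `git_command_as_owner("/srv/deployment/analytics/hdfs-tools/deploy", ["tag", "--points-at", "HEAD"])` would return a command that runs as the owner of that directory, so the ownership check no longer sees a root caller.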

Feb 23 2023, 1:45 PM · Analytics-Clusters, Scap
hashar added a comment to T330394: Scap issues with stat hosts.

Running the /usr/bin/git -C /srv/deployment/analytics/hdfs-tools/deploy tag --points-at HEAD manually as the user owning the folder (analytics-deploy) works:
scap/sync/2020-02-28/0001

Running it as root, however, fails:

nfraison@stat1008:/srv/deployment/analytics/hdfs-tools/deploy$ sudo /usr/bin/git -C /srv/deployment/analytics/hdfs-tools/deploy tag --points-at HEAD
fatal: detected dubious ownership in repository at '/srv/deployment/analytics/hdfs-tools/deploy-cache/revs/0c6e3ca61c094338d821ae7c73e244f1abb5b8bc'
To add an exception for this directory, call:

	git config --global --add safe.directory /srv/deployment/analytics/hdfs-tools/deploy-cache/revs/0c6e3ca61c094338d821ae7c73e244f1abb5b8bc

From the Puppet log, it seems that this one is run as root and the second one as the owning user.
@hashar is that expected?

Feb 23 2023, 1:43 PM · Analytics-Clusters, Scap
nfraison claimed T330394: Scap issues with stat hosts.
Feb 23 2023, 1:39 PM · Analytics-Clusters, Scap
nfraison updated subscribers of T330394: Scap issues with stat hosts.

Running the /usr/bin/git -C /srv/deployment/analytics/hdfs-tools/deploy tag --points-at HEAD manually as the user owning the folder (analytics-deploy) works:
scap/sync/2020-02-28/0001

Feb 23 2023, 1:38 PM · Analytics-Clusters, Scap
nfraison added a comment to T330394: Scap issues with stat hosts.

Here are the Puppet debug logs for one of those scap targets:

Debug: Executing: '/usr/bin/git -C /srv/deployment/analytics/hdfs-tools/deploy tag --points-at HEAD'
Debug: scap pkg [analytics/hdfs-tools/deploy] root=/srv/deployment/analytics/hdfs-tools, user=analytics-deploy
Debug: Executing with uid=494: '/usr/bin/scap deploy-local --repo analytics/hdfs-tools/deploy -D log_json:False'
Notice: /Stage[main]/Profile::Analytics::Hdfs_tools/Scap::Target[analytics/hdfs-tools/deploy]/Package[analytics/hdfs-tools/deploy]/ensure: created (corrective)
Debug: /Package[analytics/hdfs-tools/deploy]: The container Scap::Target[analytics/hdfs-tools/deploy] will propagate my refresh event
Feb 23 2023, 1:35 PM · Analytics-Clusters, Scap
nfraison added a comment to T330394: Scap issues with stat hosts.

Updating the .config file in /srv/deployment/analytics/hdfs-tools/deploy-cache so that git_server is set to deploy1002.eqiad.wmnet instead of deploy1001.eqiad.wmnet.

Feb 23 2023, 1:34 PM · Analytics-Clusters, Scap
Maintenance_bot removed a project from T222253: Upgrade Spark to 2.4.x: Patch-For-Review.
Feb 23 2023, 1:11 PM · Analytics-Kanban, Analytics-Clusters
jbond reopened T229347: Rebuild spark2 for Debian Buster, a subtask of T222253: Upgrade Spark to 2.4.x, as Open.
Feb 23 2023, 12:50 PM · Analytics-Kanban, Analytics-Clusters
jbond created T330394: Scap issues with stat hosts.
Feb 23 2023, 12:36 PM · Analytics-Clusters, Scap

Feb 14 2023

gerritbot added a comment to T278424: Upgrade the Hadoop coordinators to Debian Buster.

Change 888718 merged by Btullis:

[operations/puppet@production] Try libmariadb-java with sqoop on bullseye

https://gerrit.wikimedia.org/r/888718

Feb 14 2023, 10:17 AM · Analytics-Kanban, Patch-For-Review, Analytics-Clusters

Feb 13 2023

gerritbot added a comment to T278424: Upgrade the Hadoop coordinators to Debian Buster.

Change 888718 had a related patch set uploaded (by Btullis; author: Btullis):

[operations/puppet@production] Try libmariadb-java with sqoop on bullseye

https://gerrit.wikimedia.org/r/888718

Feb 13 2023, 3:05 PM · Analytics-Kanban, Patch-For-Review, Analytics-Clusters

Feb 10 2023

Aklapper edited Description on Analytics-Clusters.
Feb 10 2023, 5:40 PM

Jan 27 2023

Maintenance_bot removed a project from T268801: Deprecate the 'researchers' posix group: Patch-For-Review.
Jan 27 2023, 10:32 AM · Analytics-Clusters, Analytics-Kanban

Jan 6 2023

EChetty moved T276088: Configuration Management for Kafka settings from Backlog to Event Platform on the Data-Engineering-Planning board.
Jan 6 2023, 12:15 PM · Data-Platform-SRE, Data-Engineering, serviceops-radar, Event-Platform, Analytics-Radar, SRE

Dec 14 2022

Maintenance_bot removed a project from T262189: Create a cookbook to automate the bootstrap of new Hadoop workers: Patch-For-Review.
Dec 14 2022, 3:31 PM · Analytics-Kanban, Analytics-Clusters

Dec 5 2022

akosiaris changed the status of T276088: Configuration Management for Kafka settings from Open to Stalled.

Cool, thanks for that write-up @Ottomata. I am going to switch this to Stalled, to reflect that no one is currently working on it but it is still something we want to do.

Dec 5 2022, 5:28 PM · Data-Platform-SRE, Data-Engineering, serviceops-radar, Event-Platform, Analytics-Radar, SRE
Ottomata added a comment to T276088: Configuration Management for Kafka settings.

I would like to see config management for Kafka topics one day, ideally integrating with EventStreamConfig (or whatever we use to declare and manage streams and their schemas).

Dec 5 2022, 5:02 PM · Data-Platform-SRE, Data-Engineering, serviceops-radar, Event-Platform, Analytics-Radar, SRE
akosiaris added a comment to T276088: Configuration Management for Kafka settings.

@Ottomata, @elukey any updates on this? Should we keep it open/close it/stall it?

Dec 5 2022, 4:55 PM · Data-Platform-SRE, Data-Engineering, serviceops-radar, Event-Platform, Analytics-Radar, SRE