Upgrade Grafana to 8.x
Closed, Resolved (Public)

Event Timeline

Mentioned in SAL (#wikimedia-operations) [2021-06-07T10:38:06Z] <godog> downgrade grafana to 7.4.2 on grafana2001 - T282863

The upgrade worked, though it requires stopping puppet so that it doesn't overwrite the sqlite database. I've re-enabled puppet and reverted grafana to 7.4.2 for now.

fgiunchedi renamed this task from Upgrade Grafana to 8 to Upgrade Grafana to 8.1. Aug 6 2021, 9:47 AM
fgiunchedi updated the task description.
lmata triaged this task as Medium priority. Sep 30 2021, 9:48 PM

Upgraded to 7.5.11 today for CVE-2021-39226. We'll want >= 8.1.6 when we get around to this.
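A minimum-version requirement like the one above can be checked mechanically before an upgrade is called done. The following is a minimal sketch, not part of the original task: `version_ge` is a hypothetical helper name, and it assumes GNU `sort` with `-V` (version sort) is available.

```shell
#!/bin/sh
# Hypothetical helper: succeed when version $1 is >= version $2.
# Relies on GNU sort -V, which orders 8.1.6 before 8.1.10.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Possible usage on a Debian host (assumes the package is named "grafana"):
#   installed=$(dpkg-query -W -f='${Version}' grafana)
#   version_ge "$installed" 8.1.6 || echo "grafana $installed is older than 8.1.6"
```

On Debian systems `dpkg --compare-versions "$installed" ge 8.1.6` should give an equivalent answer without a helper function.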

Change 726858 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Add grafana import hook to pick the latest Grafana 7.x version

https://gerrit.wikimedia.org/r/726858

Change 726858 merged by Muehlenhoff:

[operations/puppet@production] Add grafana import hook to pick the latest Grafana 7.x version

https://gerrit.wikimedia.org/r/726858

colewhite renamed this task from Upgrade Grafana to 8.1 to Upgrade Grafana to 8.x. Nov 22 2021, 9:36 PM

Change 740682 had a related patch set uploaded (by Cwhite; author: Cwhite):

[operations/puppet@production] profile: turn off grafana db sync ahead of 8.x upgrade

https://gerrit.wikimedia.org/r/740682

Note: given the recent Grafana 8 vulnerability, we should make sure to upgrade to the latest 8.x release.

Change 740682 merged by Cwhite:

[operations/puppet@production] profile: turn off grafana db sync ahead of 8.x upgrade

https://gerrit.wikimedia.org/r/740682

Mentioned in SAL (#wikimedia-operations) [2022-01-03T21:50:57Z] <cwhite> manually upgrade to grafana 8 on grafana-next (T282863)

Grafana 8 is running on grafana-next.

Logging in takes the user back to the primary active server running Grafana 7 via grafana-rw.wm.o.

Another data point: today while investigating T298945 I ran into this in grafana2001's logs:

t=2022-01-17T10:43:03+0000 lvl=warn msg="Could not render image, no image renderer found/installed. For image rendering support please install the grafana-image-renderer plugin. Read more at https://grafana.com/docs/grafana/latest/administration/image_rendering/" logger=rendering
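Warnings like this are easy to miss among other log output. As a rough sketch (not something used in the original task), a small filter can pull just the rendering warnings out of a logfmt-style log; `rendering_warnings` is a hypothetical name, and the log path in the usage comment is the Debian package default, assumed rather than confirmed for this host.

```shell
#!/bin/sh
# Hypothetical helper: read Grafana log lines on stdin and print the msg="..."
# text of every warn-level line emitted by the "rendering" logger.
rendering_warnings() {
    grep 'lvl=warn' | grep 'logger=rendering' |
        sed -n 's/.*msg="\([^"]*\)".*/\1/p'
}

# Possible usage (path is an assumption):
#   sudo cat /var/log/grafana/grafana.log | rendering_warnings | sort | uniq -c
```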

Change 757774 had a related patch set uploaded (by Cwhite; author: Cwhite):

[operations/puppet@production] hiera: set domainrw to grafana-next-rw in codfw

https://gerrit.wikimedia.org/r/757774

Change 757775 had a related patch set uploaded (by Cwhite; author: Cwhite):

[operations/puppet@production] graphite: add grafana-next-rw to cors origins

https://gerrit.wikimedia.org/r/757775

Change 757776 had a related patch set uploaded (by Cwhite; author: Cwhite):

[operations/puppet@production] idp, grafana: configure grafana-next-rw for sso

https://gerrit.wikimedia.org/r/757776

Change 757777 had a related patch set uploaded (by Cwhite; author: Cwhite):

[operations/puppet@production] hiera: add grafana-next-rw to grafana public_aliases

https://gerrit.wikimedia.org/r/757777

Change 757778 had a related patch set uploaded (by Cwhite; author: Cwhite):

[operations/puppet@production] hiera: configure mapping and cache rules for grafana-next-rw

https://gerrit.wikimedia.org/r/757778

Change 757780 had a related patch set uploaded (by Cwhite; author: Cwhite):

[operations/dns@master] wikimedia.org: add grafana-next-rw

https://gerrit.wikimedia.org/r/757780

Hi @fgiunchedi, I have a question about the alerts in Grafana 8.x. The performance team has alerts created in Grafana, and in 8.3 Grafana's new alert system became the default. I've [tried it](T298477#7595658) and it will add some functionality that is great for us. I will miss being able to see exactly when alerts fire in the dashboards, though. When do you think would be a good time to migrate to the new system?

Thank you @Peter for trying out the new Grafana alerting! I haven't tried it myself yet, but it looks like it'll be the future. It seems to me we should migrate after the upgrade to 8 is completed (i.e. decouple the two; my understanding is that turning on the new alerting is a one-way street).

On the alerting specifics, I agree it is a loss not to be able to visualize thresholds and alerts directly on graphs. It seems to me that the new alerting will be good for mixed-source alerts (i.e. graphite + prometheus), and for prometheus-only alerts we might as well use the operations/alerts.git repository (self service) and write prometheus-native alerting rules. What do you think?

@fgiunchedi sounds good, please ping me later when you feel we can go ahead!

Change 757780 merged by Cwhite:

[operations/dns@master] wikimedia.org: add grafana-next-rw

https://gerrit.wikimedia.org/r/757780

Change 757774 merged by Cwhite:

[operations/puppet@production] hiera: set domainrw to grafana-next-rw in codfw

https://gerrit.wikimedia.org/r/757774

Change 757775 merged by Cwhite:

[operations/puppet@production] graphite: add grafana-next-rw to cors origins

https://gerrit.wikimedia.org/r/757775

Change 757776 merged by Cwhite:

[operations/puppet@production] idp, grafana: configure grafana-next-rw for sso

https://gerrit.wikimedia.org/r/757776

Change 757777 merged by Cwhite:

[operations/puppet@production] hiera: add grafana-next and grafana-next-rw to grafana public_aliases

https://gerrit.wikimedia.org/r/757777

Change 757778 merged by Cwhite:

[operations/puppet@production] hiera: configure mapping and cache rules for grafana-next-rw

https://gerrit.wikimedia.org/r/757778

Change 761022 had a related patch set uploaded (by Cwhite; author: Cwhite):

[operations/puppet@production] ssl: add regenerated grafana cert

https://gerrit.wikimedia.org/r/761022

Change 761022 merged by Cwhite:

[operations/puppet@production] ssl: add regenerated grafana cert

https://gerrit.wikimedia.org/r/761022

FYI, if I go to a dashboard or show a specific panel and then click login, after the IDP login I get redirected back to the homepage, not the dashboard/panel I had open. This doesn't happen with the current grafana, which correctly redirects back to the last page visited before clicking login.

Change 763329 had a related patch set uploaded (by Cwhite; author: Cwhite):

[operations/puppet@production] grafana-next: set grafana codfw base domain to grafana next

https://gerrit.wikimedia.org/r/763329

Change 763334 had a related patch set uploaded (by Cwhite; author: Cwhite):

[operations/debs/grafana-plugins@master] remove deprecated piechart plugin

https://gerrit.wikimedia.org/r/763334

Change 763335 had a related patch set uploaded (by Cwhite; author: Cwhite):

[operations/debs/grafana-plugins@master] update grafana-image-renderer to 3.3.0

https://gerrit.wikimedia.org/r/763335

Change 763337 had a related patch set uploaded (by Cwhite; author: Cwhite):

[operations/debs/grafana-plugins@master] update grafana-simple-json-datasource to 1.4.2

https://gerrit.wikimedia.org/r/763337

Change 763329 merged by Cwhite:

[operations/puppet@production] grafana-next: set grafana codfw base domain to grafana next

https://gerrit.wikimedia.org/r/763329

FYI, if I go to a dashboard or show a specific panel and then click login, after the IDP login I get redirected back to the homepage, not the dashboard/panel I had open. This doesn't happen with the current grafana, which correctly redirects back to the last page visited before clicking login.

This wasn't a Grafana 8 problem, but it should be fixed now.

Good catch! It seems some dashboards are using the grafana-piechart-panel plugin, which is deprecated; piechart ships in grafana core now. Dashboards can be changed to use the core plugin on production prior to the upgrade.

Updated panels to use the piechart v2 plugin on production, with the exception of Cassandra Storage (-next). There appears to be a bug preventing piechart rendering that is fixed in Grafana 8.

Change 763618 had a related patch set uploaded (by Cwhite; author: Cwhite):

[operations/debs/grafana-plugins@master] use grafana api for worldmap plugin artifact

https://gerrit.wikimedia.org/r/763618

Change 763334 merged by Cwhite:

[operations/debs/grafana-plugins@master] remove deprecated piechart plugin

https://gerrit.wikimedia.org/r/763334

Change 763335 merged by Cwhite:

[operations/debs/grafana-plugins@master] update grafana-image-renderer to 3.4.0

https://gerrit.wikimedia.org/r/763335

Change 763337 merged by Cwhite:

[operations/debs/grafana-plugins@master] update grafana-simple-json-datasource to 1.4.2

https://gerrit.wikimedia.org/r/763337

Change 763618 merged by Cwhite:

[operations/debs/grafana-plugins@master] use grafana api for worldmap plugin artifact

https://gerrit.wikimedia.org/r/763618

The dashboards I visit regularly (linkrecommendation, special-homepage-and-suggested-edits, growth-team-product-kpis) look good to me.

Mentioned in SAL (#wikimedia-operations) [2022-03-01T17:52:06Z] <cwhite> completed grafana upgrade in eqiad T282863

Packages uploaded to reprepro.

Upgrade process:

sudo systemctl stop grafana-server

# take a db backup
sudo cp /var/lib/grafana/grafana.db ~/grafana.db-$(date -I)

# clean up old plugins completely
sudo apt remove --purge grafana-plugins
sudo rm -rf /var/lib/grafana/plugins/*

# install
sudo apt update
sudo apt install grafana grafana-plugins
sudo run-puppet-agent
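
The backup and install steps above can be sanity-checked with two small helpers. This is a sketch under stated assumptions, not part of the recorded procedure: `backup_db` and `health_version` are hypothetical names, and the usage comment assumes Grafana listens on its default port 3000 on localhost and exposes its standard `/api/health` endpoint.

```shell
#!/bin/sh
# Hypothetical helper: copy the sqlite file and confirm the copy is
# byte-identical. Run it while grafana-server is stopped, as in the steps above.
backup_db() {
    src=$1
    dest=$2
    cp -- "$src" "$dest" && cmp -s -- "$src" "$dest"
}

# Hypothetical helper: extract the "version" value from the JSON that
# Grafana's /api/health endpoint returns, read from stdin.
health_version() {
    sed -n 's/.*"version"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Possible usage after the upgrade (host and port are assumptions):
#   backup_db /var/lib/grafana/grafana.db "$HOME/grafana.db-$(date -I)"
#   curl -s http://localhost:3000/api/health | health_version
```

Comparing the reported version with the expected package version gives a quick confirmation that the new binary is the one actually serving requests.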

colewhite claimed this task.

Change 767608 had a related patch set uploaded (by Cwhite; author: Cwhite):

[operations/puppet@production] aptrepo: update grafana version to <8.4

https://gerrit.wikimedia.org/r/767608

Mentioned in SAL (#wikimedia-cloud) [2022-03-03T08:49:25Z] <taavi> deploying cloudmetrics grafana to grafana 8, T282863

Change 767608 merged by Cwhite:

[operations/puppet@production] aptrepo: update grafana version to <8.4

https://gerrit.wikimedia.org/r/767608