
Investigate setting up HTTPS directly on beta appservers
Open, Medium, Public

Description

This is related to T206003: Beta Cluster: Parsoid config request failures from the MediaWiki API.

<Krenair> _joe_, when you say 'an https interface to mediawiki'
<Krenair> you're not talking about the nginx/varnish layer being able to talk HTTPS are you?
<Krenair> i.e. https://en.wikipedia.beta.wmflabs.org/wiki/Main_Page
<_joe_> Krenair: no I mean TLS on the application servers in beta
<_joe_> we have it in production, we should add it there too, even though it's not as necessary
<Krenair> _joe_, what certs do you use for that in prod?
<_joe_> just to allow people to test connecting apps via TLS, which is a sane practice
<Krenair> puppet?
<_joe_> we generate certs with the puppet CA
<Krenair> so, extra certs from the puppet CA that aren't the host's normal puppet certs?
<_joe_> if you ping me in a couple weeks (post-switchback), I can show you how it's done or do it myself
<_joe_> yes

I might look into this before then, to bring beta closer to the prod setup again. It may require completing the Apache config consolidation, so I should first check where Giuseppe got to with the prod Apache changes.

Event Timeline

Removing task assignee due to inactivity, as this open task has been assigned to the same person for more than two years (see the emails sent to the task assignee on Oct 27 and Nov 23). Please assign this task to yourself again if you still realistically [plan to] work on this task - it would be welcome.
(See https://www.mediawiki.org/wiki/Bug_management/Assignee_cleanup for tips on how to best manage your individual work in Phabricator.)

Now that CFSSL is a thing, I took a look at this as the certs are much easier to generate. I added the following hiera to deployment-mediawiki11:

profile::envoy::ensure: present
profile::mediawiki::webserver::has_tls: true
profile::pki::client::ensure: present
profile::services_proxy::envoy::enabled_listeners:
- dummy-to-workaround-requirements
profile::services_proxy::envoy::listeners:
- name: dummy-to-workaround-requirements
  port: 24107
  service: dummy-to-workaround-requirements
  timeout: 1s
  upstream: example.com
profile::tlsproxy::envoy::access_log: true
profile::tlsproxy::envoy::capitalize_headers: true
profile::tlsproxy::envoy::cfssl_label: deployment-prep.eqiad1.wikimedia.cloud
profile::tlsproxy::envoy::global_cert_name: deployment-mediawiki11.deployment-prep.eqiad1.wikimedia.cloud
profile::tlsproxy::envoy::retries: false
profile::tlsproxy::envoy::services:
- port: 80
  server_names:
  - '*'
profile::tlsproxy::envoy::sni_support: 'no'
profile::tlsproxy::envoy::ssl_provider: cfssl
profile::tlsproxy::envoy::tls_port: 443
profile::tlsproxy::envoy::upstream_response_timeout: 203.0
service::catalog:
  dummy-to-workaround-requirements:
    description: this only exists to workaround https://github.com/wikimedia/puppet/blob/d780c818b698f2f255f51f105c3d815099291c39/modules/profile/manifests/services_proxy/envoy.pp#L37
    encryption: false
    ip:
      eqiad:
        default: 192.0.2.1
    lvs:
      class: high-traffic1
      conftool:
        cluster: dummy-to-workaround-requirements
        service: dummy-to-workaround-requirements
      depool_threshold: '.5'
      enabled: true
      monitors:
        IdleConnection:
          max-delay: 300
          timeout-clean-reconnect: 3
      scheduler: wrr
    port: 24108
    sites: []
    state: production

So Envoy will only be installed if profile::envoy::ensure is present, but if that's set to present you need to have at least one services_proxy enabled listener. Not sure if intentional or not.
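The dependency described above can be written down as a small check over the hiera data. This is only a sketch of the observed behaviour, not the Puppet logic itself: `check_envoy_hiera` is a hypothetical helper, it assumes the hiera has already been loaded into a plain dict, and the cross-check between enabled and defined listeners is an assumption based on the keys used in this task.

```python
def check_envoy_hiera(hiera):
    """Return a list of problems found in the Envoy-related hiera keys.

    Hypothetical helper: it mirrors the behaviour observed in this task,
    not the actual Puppet manifests.
    """
    problems = []
    if hiera.get("profile::envoy::ensure") != "present":
        # Envoy is not installed at all, so there is nothing to check.
        return problems

    enabled = hiera.get("profile::services_proxy::envoy::enabled_listeners") or []
    if not enabled:
        problems.append(
            "profile::envoy::ensure is present but no services_proxy "
            "listener is enabled"
        )

    # Assumption: every enabled listener should also have a definition.
    listeners = hiera.get("profile::services_proxy::envoy::listeners") or []
    defined = {listener.get("name") for listener in listeners}
    for name in enabled:
        if name not in defined:
            problems.append(f"enabled listener {name!r} has no definition")
    return problems
```

With the hiera pasted above, where the dummy listener is both defined and enabled, this returns an empty list. Dropping the dummy entry while keeping `profile::envoy::ensure: present` produces the first problem.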

It created some envoy configs that look correct to my untrained eye and the envoyproxy service is running, but it's not accepting requests. @Joe any ideas?

In T206158#7048731, @Majavah wrote:

So Envoy will only be installed if profile::envoy::ensure is present, but if that's set to present you need to have at least one services_proxy enabled listener. Not sure if intentional or not.

That was worked around via https://gerrit.wikimedia.org/r/c/operations/puppet/+/683837

It created some envoy configs that look correct to my untrained eye and the envoyproxy service is running, but it's not accepting requests.

[2021-04-30 11:44:39.252][25901][critical][main] [source/server/config_validation/server.cc:60] error initializing configuration '/tmp/.envoyconfig/envoy.yaml': Invalid path: /etc/cfssl/ssl/deployment-prep_eqiad1_wikimedia_cloud__deployment-mediawiki11_deployment-prep_eqiad1_wikimedia_cloud_server/deployment-prep_eqiad1_wikimedia_cloud__deployment-mediawiki11_deployment-prep_eqiad1_wikimedia_cloud_server.pem

Envoy rejects the certificate path, even though the path is correct and the file exists. I'm not sure why: the validation in envoy doesn't check lengths or anything like that.
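The cause of the "Invalid path" error is not established here. One thing that could be ruled out is whether every component of the path is reachable by the user doing the validation. The sketch below is a generic diagnostic, not something taken from Envoy or Puppet: `explain_path` is a hypothetical helper, and it only reports what the user running it can see, so it would need to be run as whichever user runs the config validation.

```python
import os
import stat
from pathlib import Path


def explain_path(path):
    """Walk from the root down to `path` and report the status of each component.

    Returns a list of (component, status) tuples and stops at the first
    component that cannot be inspected.
    """
    target = Path(path)
    report = []
    for part in [*reversed(target.parents), target]:
        try:
            mode = os.stat(part).st_mode
        except FileNotFoundError:
            report.append((str(part), "missing"))
            break
        except OSError as exc:
            # For example: permission denied on a parent directory.
            report.append((str(part), exc.strerror or "error"))
            break
        # Directories need search (execute) permission, files need read.
        needed = os.X_OK if stat.S_ISDIR(mode) else os.R_OK
        status = "ok" if os.access(part, needed) else "no access"
        report.append((str(part), status))
    return report
```

Calling `explain_path()` on the `.pem` path from the log line above and looking for the first entry that is not `ok` shows where access stops, if it stops anywhere.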

This is now working on deployment-mediawiki11, after jbond helped get CFSSL working with envoy. The next step is to get the caches and other hosts to trust it.
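Whether a given host trusts the new certificate can be checked with a short TLS probe. This is a minimal sketch using only the Python standard library: `make_context` and `probe` are hypothetical names, and the CA bundle path is something the caller has to supply, since this task does not say where the CFSSL CA certificate is installed on client hosts.

```python
import socket
import ssl


def make_context(ca_file=None):
    """Build a client context that verifies the certificate chain and hostname.

    With ca_file=None the system trust store is used; otherwise only the
    given CA bundle is loaded.
    """
    return ssl.create_default_context(cafile=ca_file)


def probe(host, port=443, ca_file=None, timeout=5.0):
    """Open a verified TLS connection and return (TLS version, certificate subject).

    Raises ssl.SSLCertVerificationError if the certificate is not trusted.
    """
    ctx = make_context(ca_file)
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            cert = tls.getpeercert()
            return tls.version(), cert.get("subject")
```

Running `probe("deployment-mediawiki11.deployment-prep.eqiad1.wikimedia.cloud", ca_file=...)` from a cache host, with `ca_file` pointing at that host's copy of the CA certificate, either returns the negotiated TLS version and subject or raises a verification error if the host does not trust the issuing CA.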

Mentioned in SAL (#wikimedia-releng) [2021-04-30T14:13:30Z] <Majavah> deployment-cache-text: trying out using HTTPS for backend traffic to deployment-mediawiki11 T206158

Mentioned in SAL (#wikimedia-releng) [2021-04-30T14:15:41Z] <Majavah> revert above as it's not working, T206158

taavi triaged this task as Medium priority. May 16 2021, 9:52 AM
taavi removed taavi as the assignee of this task. Sep 10 2022, 7:33 AM