SRE
[[SRE/Infrastructure Foundations|SRE Infrastructure Foundations]] - Automation and Networking (cumin, netbox, puppet, spicerack).
The team focuses on building and maintaining our base platform (“metal cloud”), the foundation on which nearly everything else in our infrastructure is built. In addition to our bare metal deployments, their responsibilities include (but are not limited to) configuration management systems, infrastructure automation, orchestration tooling, infrastructure security and network operations.
[[Observability|SRE Observability]] - Monitoring and Logging (Prometheus/Grafana and Elasticsearch, plus some Kafka).
The Observability team, or "o11y" for short, works across SRE and Technology to provide teams with tools, platforms and insights into how systems and services are performing. It uses technologies such as Grafana, Kibana/Logstash, Prometheus, Alertmanager and more.
[[SRE/Service Operations|SRE Service Operations]] - MediaWiki Operations and Supporting Services (Kubernetes, memcached, redis, and infrastructure for GitLab, OTRS and Phabricator).
The Service Operations team takes care of public and “user-visible” services alongside Technology and Product teams. This covers, for example, our MediaWiki platform, but also the newer (micro)services that make up our stack. It also includes miscellaneous services and components that we rely upon (such as Phabricator, mail systems and OTRS). The team is also building our new SOA service infrastructure based on Kubernetes.
[[Traffic|SRE Traffic]] - Caching and DNS (ATS, Varnish, GeoDNS, wikidough).
The Traffic team is responsible for the critical first layer of high-traffic infrastructure which now spans much of the globe, including our TLS termination and caching layers (ATS, Varnish), load balancing, DNS and our own network.