Awesome Open Source

How they SRE

A curated collection of publicly available resources on how technology and tech-savvy organizations around the world practice Site Reliability Engineering (SRE)

Introduction

How They SRE is a curated knowledge repository of the SRE best practices, tools, techniques, and culture adopted by leading technology and tech-savvy organizations.

Many organizations regularly share their best practices, tools, and techniques, and offer insight into their engineering culture, on public platforms such as engineering blogs, conferences, and meetups. The content in this repository is curated from these sources.

Note to readers: Some of the articles, posts, videos, tools, and techniques in this list were published before 2015. Use such material with caution, as more recent advances in technology and practice may offer better alternatives and perspectives.

Topics

  • Site Reliability Engineering
  • Hiring and Building SRE teams
  • SRE Culture
  • DevOps
  • Monitoring & Observability
  • Alerting
  • Incident Response & Post-Mortem
  • On-Call
  • Testing in Production
  • Chaos Engineering
  • Automation
  • Performance

Organizations

Achievers

Blog Posts

Airbnb

Blog Posts

Algolia

Blog Posts

Asana

Blog Posts

ASOS

Blog Posts

Atlassian

Blog Posts

BackMarket

Blog Posts

Baidu

Videos

Basecamp

Blog Posts

Books

Bloomberg

Videos

Booking.com

Blog Posts

Videos

Capital One

Blog Posts

Major incidents & analysis reports

Videos

DBS

Blog Posts

Videos

DeepSource

Blog Posts

Dropbox

Blog Posts

Videos

eBay

Blog Posts

Videos

Etsy

Blog Posts

Videos

Expedia

Blog Posts

Facebook

Videos

Fastly

Videos

GitHub

Blog Posts

Major incidents & analysis reports

Videos

GitLab

Blog Posts

GoCardless

Blog Posts

Major incidents & analysis reports

Gojek

Blog Posts

Google

Blog Posts

Videos

Grab

Blog Posts

Grammarly

Blog Posts

Heroku

Blog Posts

Indeed

Blog Posts

Videos

Khan Academy

Blog Posts

LinkedIn

Blog Posts

Videos

Mercari

Blog Posts

Microsoft

Videos

MIRO

Blog Posts

Monzo

Blog Posts

Videos

Netflix

Blog Posts

Major incidents & analysis reports

Videos

PayPal

Videos

Pinterest

Blog Posts

Videos

Postman

Blog Posts

Red Hat

Blog Posts

Scribd

Blog Posts

Shopify

Blog Posts

Videos

Sky Betting and Gaming

Blog Posts

Slack

Blog Posts

Videos

Slalom Build

Blog Posts

Soundcloud

Blog Posts

Spotify

Blog Posts

Videos

Squarespace

Blog Posts

Videos

Stack Overflow

Blog Posts

Videos

Stripe

Blog Posts

Videos

Target

Blog Posts

Teads

Blog Posts

Trivago

Blog Posts

Uber

Blog Posts

Videos

VGW

Blog Posts

Videos

Wikimedia Foundation

Videos

Yelp

Blog Posts

Videos

Zerodha

Blog Posts

SRECon Mix Playlist

Videos


Resources

Books

Events

Other Goodies

Awesome Lists

SRE Resources from various organizations

Newsletters

Credits

Other How They... repos

Contribute

Contributions welcome! Read the contribution guidelines first.

License

CC0

To the extent possible under law, Unmesh Gundecha has waived all copyright and related or neighboring rights to this work.


If you decide to use this anywhere, please credit @upgundecha on Twitter. If you like my work, check out my other projects on GitHub.

