10 Learnings From Running Production Infrastructure at Google • Christof Leng • GOTO 2023

This presentation was recorded at GOTO Amsterdam 2023. #GOTOcon #GOTOams

Christof Leng - Lead for Google’s SRE Engagement Model and SRE Review Programs @ChristofLeng

ORIGINAL TALK TITLE
Ten Things We’ve Learned From Running Production Infrastructure at Google

RESOURCES
Christof

ABSTRACT
Google’s production infrastructure might be one of the most complex machines that humanity has built so far. It is constantly changing and evolving. Site Reliability Engineers (SREs) are the specialists to manage and improve the architectures, tooling, and operational procedures that enable Google to keep its products reliable, scalable, efficient, and agile. This talk will discuss a number of fundamental organizational principles that Google SRE has learned over the years. [...]

TIMECODES
00:00 Intro
05:26 Culture
07:00 1. Reliability can’t be taken for granted
10:42 2. Cattle vs. Pets
14:11 3. Blamelessness
16:15 4. Measure what matters
19:22 A word on Ops
20:16 5. Failure modes
21:59 6. No heroes
25:58 7. Automation
27:55 Change is constant
28:05 8. Change is No. 1 reason for outages
30:42 9. Outages are inevitable
34:30 10. No haunted graveyards
36:53 What did we learn?
38:10 Outro

Download slides and read the full abstract here:

RECOMMENDED BOOKS
Murphy, Beyer, Jones & Petoff • Site Reliability Engineering
Beyer, Murphy, Rensin, Kawahara & Thorne • The Site Reliability Workbook
Adkins, Beyer, Blankinship, Lewandowski, Oprea & Stubblefield • Building Secure and Reliable Systems
Nora Jones & Casey Rosenthal • Chaos Engineering
Russ Miles • Learning Chaos Engineering

#SRE #SiteReliabilityEngineering #ChaosEngineering #AtGoogle #ProductionInfrastructure #ChristofLeng #OrganizationalPrinciples #OrganizationalCulture #Change #Simplicity