Why Is My App SLOw? Defining Reliability in Platform Engineering • Jez Humble • GOTO 2023

This presentation was recorded at GOTO Aarhus 2023. #GOTOcon #GOTOaar

Jez Humble - SRE at Google Cloud & Lecturer at UC Berkeley @JezHumble

ABSTRACT
Platform engineering is all fun and games until platform customers start complaining about their apps running slowly. Is it the app code or the platform? This talk looks at how Google’s Serverless SRE team detects platform-level latency regressions before users do, measures the impact of regressions, and tracks performance over time. We’ll discuss the limitations of SLOs in this context and how to take a statistical approach that gives a customer-centric picture of the performance of our platform instead. [...]

TIMECODES
00:00 Intro
02:08 Serverless platform is amazing
05:56 “My app is slow”
08:14 The platform is slow
09:29 Total (end-to-end) latency distribution
10:54 Request delivery latency
12:37 Goal
13:39 Reliability in practice
17:09 Applying to the model
18:39 Stationarity
19:31 2-Sigma Technique
24:30 Mechanics
26:41 Overload score
28:08 Impact analysis
29:03 FAQ
31:56 Backtesting
33:21 Limitations
35:13 Other applications
35:20 Streamlined diagnosis
37:06 Approximate cohort A/B testing
37:33 Conclusions
40:02 Outro

RECOMMENDED BOOKS
Nicole Forsgren, Jez Humble & Gene Kim • Accelerate
Kim, Humble, Debois, Willis & Forsgren • The DevOps Handbook
Jez Humble & David Farley • Continuous Delivery
Jez Humble, Joanne Molesky & Barry O’Reilly • Lean Enterprise

#SLO #SRE #ChaosEngineering #Serverless #PlatformEngineering #2Sigma #GoogleCloud #JezHumble