Software Engineering Radio is a podcast targeted at the professional software developer. The goal is to be a lasting educational resource, not a newscast. SE Radio covers all topics software engineering. Episodes are either tutorials on a specific topic, or an interview with a well-known character from the software engineering world. All SE Radio episodes are original content — we do not record conferences or talks given in other venues. Each episode comprises two speakers to ensure a lively listening experience. SE Radio is brought to you by the IEEE Computer Society and IEEE Software magazine.
SE-Radio-Episode-276-Björn-Rabenstein-on-Site-Reliability-Engineering
Björn Rabenstein discusses the field of Site Reliability Engineering (SRE) with host Robert Blumen. The term SRE has recently emerged to mean Google's approach to DevOps. The publication of Google's book on SRE has brought many of their practices into more public discussion. The interview covers: what is distinct about SRE versus devops; the SRE focus on development of operational software to minimize manual tasks; the emphasis on reliability; Dickerson's hierarchy of reliability; how reliability can be measured; is there such a thing as too much reliability?; can Google's approach to SRE be applied outside of Google?; Björn's experience in applying SRE to Soundcloud - what worked and what did not; how can engineers best apply SRE to their organizational situation?; the importance of monitoring; monitoring and alerting; being on call, responding to incidents; the importance of documentation for responding to problems; they wrap up with a discussion of why people from non-computer science backgrounds are often found in devops and SRE.