A show about getting your best ideas into the world and seeing what happens. We talk about code, ops, infrastructure, and the people that make it happen. Gerhard Lazu and friends explore all things DevOps, infra, and running apps in production. Whether you’re cloud native, Kubernetes curious, a pro SRE, or just operating a VPS… you’ll love coming along for the ride. Ship It honors the makers, the shippers, and the visionaries that see it through. Some people search for ShipIt or ShipItFM and can’t find the show, so now the strings ShipIt and ShipItFM are in our description too.

Learning from incidents

September 30, 2021 55:36 54.04 MB Downloads: 0

Things go wrong all the time. We all make mistakes. And that is okay. What is not okay, is to think that it won’t happen, or that there will be someone else around when it does. In that moment, it doesn’t matter who wrote that module, package or microservice. But there is a better way to think about this, and there is an approach that makes people actually look forward to incidents.

It all starts with thinking of incidents as opportunities to learn, and then share those learnings with everyone, so that you can all improve. In this episode, Gerhard is joined by Stephen Whitworth and Chris Evans, incident.io co-founders, and former Staff Engineers at Monzo.

They get it, we get it, and now you can get it too.

Discuss on Changelog News

Changelog++ members save 4 minutes on this episode because they made the ads disappear. Join today!

Sponsors

  • Fly.io – Deploy your apps and databases close to your users. In minutes you can run your Ruby, Go, Node, Deno, Python, or Elixir app (and databases!) all over the world. No ops required. Learn more at fly.io/changelog and check out the speedrun in their docs.
  • PlanetScale – PlanetScale is the only serverless database platform you can start in an instant and scale indefinitely with unlimited connections. Never think about database servers again. Everything you want to control is available through the beautifully designed PlanetScale CLI. Learn more and start your database in seconds at planetscale.com
  • Honeycomb – Guess less, know more. When production is running slow, it’s hard to know where problems originate: is it your application code, users, or the underlying systems? With Honeycomb you get a fast, unified, and clear understanding of the one thing driving your business: production. Join the swarm and try Honeycomb free today at honeycomb.io/changelog
  • FireHydrant – The reliability platform for teams of all sizes. With FireHydrant, teams achieve reliability at scale by enabling speed and consistency from a service deployment to an unexpected outage. Try FireHydrant free for 14 days at firehydrant.io

Featuring

Notes and Links

Gerhard, Chris & Stephen

Something missing or broken? PRs welcome!