DevOps Days London 2022
DevOps Days London 2022
Written by Nick Otter.
Intro
We (8x8) attended DevOps Days London 2022 this week. Here are my personal takeaways. For fun and laziness - I am only listing each takeaway as a one liner, to be explored in more depth outside of this post.
Platform as a Product
- Use PACT (code-first consumer-driven contract testing tool) to test Infrastructure.
- SLA contracts for platform changes that have to pass Dev tests before deployment.
- Platform resources are not considered implicit, there is a feature roadmap.
- User personas for Dev teams, try and group dev needs as user personas.
- Platform repos are completely open to Dev, innersource mentality - dev can PR immediate changes they need if there is too much of a backlog of work.
- Understand Thin platform team or Fat platform team.
Issues with Embedded DevOps
- A dedicated DevOps engineer per Dev team creates Siloing and Code sprawl. Tribal knowledge is required to unify all DevOps individuals embedded in Dev teams.
DevOps transfomations
- Evangelise Operations orchestrating DevOps not Dev movement to DevOps which squeezes out Operations.
- Understand Operations DevOps vs. Cloud Engineer/Platform engineer vs. Embedded DevOps.
Working past ‘DevOps’ as a job title
- DevOps as a job title is far too tenuous.
- ‘DevOps engineer’ may mean working as Operations, much time spent working on Production with some dev tooling.
- ‘DevOps engineer’ may mean acting as an informal Platform Architect.
- ‘DevOps engineer’ may mean being a Cloud Engineer.
DevOps metrics of success
- DORA.
- SLA contracts for platform changes that have to pass Dev tests before deployment.
- CALMS - Culture, Automation, Lean, Measurement, and Sharing.
- DiRT - Disaster Recovery Test, regular firedrills.
Terraform
- Be wary of Terraform module dependencies.
Incidents
- Weekly firedrills, DiRT weekly type events.
- Service can’t go live without thorough documentation and incidence response steps.
- Documentation should pass the ‘3am test’.
- Diverse pool of engineers on call at the same time, primary can escalate to SME.
- Who can fix it in 10mins? Don’t chase your losses.
Misc
- ‘Centralised platform architecture to collaborative platform architecture’
- ‘Domain based boundaries for services’
- ‘CUPID idiomatic vs. SOLID’
- ‘An architectural quantum’
- Using Pulumi, single language, single framework with Dev
- Application driven systems, decoupled from cloud provider
- Feature management DevOps.
- Team topologies.
- Flow aligned, complex subsystems.
- Keep deployment configuration in Dev codebase as much as possible
- ‘Open source license auditor, audits most dependended upon open source projects’
Thanks. This was written by Nick Otter.