Writing
Effective Devops (O’Reilly Media, 2016)
In Effective Devops, Jennifer Davis and I discuss individual collaboration, inter-team affinity, tool usage, and scaling challenges as the four pillars of effective devops transformations. We collected a variety of case studies and stories showing how organizations have been able to successfully change and maintain cultures of collaboration, trust, and high-performance engineering.
Rather than prescribing a one-size-fits-all solution, we describe key ideas and practices that can be tailored to fit your teams and companies at any stage of the organizational lifecycle. Practical examples and troubleshooting scenarios give readers the tools they need to tackle challenges such as increasing trust and communication within a growing team, helping multiple teams with conflicting priorities work together in pursuit of business goals, and finding tools and processes that work for their organization in an ever-changing landscape of vendors and best practices.
Aimed at decision-makers throughout the organization, managers and individual contributors alike, this book is an actionable, technology-agnostic guide to understanding the cultural movement that is devops.
I have also contributed articles to industry publications including SysAdvent, Code as Craft, InfoQ, and The Pastry Box Project. My articles range from technical discussions of database upgrades, monitoring best practices, and ELK administration to cultural topics like burnout, culture design, and building organizational resilience. If you are interested in having me write something for your publication, please reach out.
Selected articles include:
How to build organizational resilience (2021)
In the Resilience issue of Increment, I look at what makes systems resilient to unknown unknowns, as opposed to merely robust against a known set of failure modes. With this as an example, I then present a process for designing and implementing lasting culture changes.
Putting the Dev in Devops (2016)
This Code as Craft post describes the process of bringing software engineering best practices to a suite of infrastructure operations tools. Aimed at ops practitioners who would like to add rigor and robustness to their tooling and practices, it is a practical example of making changes in the face of existing constraints.
Crafting a Resilient Culture: Or, How to Survive an Accidental Mid-Day Production Incident (2019)
Telling the story of the Apache SNAFU that won me Etsy’s three-armed-sweater award, this article discusses the human side of incident response. Anyone interested in resilience engineering will find real-world examples of how engineering cultures can be changed to optimize for resilience and organizational learning.
How to put the “plus” in STAFF+ Engineer (2023)
Written for GitHub’s ReadME Project, this post discusses what it means to be a staff-level engineer, from helping define goals and guiding your team towards them to improving decision-making and other collaborative skills that can make you a force multiplier within your organization.
How to Create Sustainable On-Call Rotations (2022)
This LeadDev article looks at the human side of on-call rotations. Aimed at both engineering managers and on-call engineers themselves, it covers strategies for creating an on-call system that can achieve your engineering goals without burning out your staff in the process.