The Origin and Key Principles of DevOps
In Part 1 of this series we took a look at some common symptoms of organizations that could benefit from the adoption of DevOps principles. In this post we’ll dig a little deeper into DevOps and answer questions like: Where did DevOps originate? How does DevOps define itself? And what are some of DevOps’ core tenets? Let’s start with the first question: where did DevOps originate?
The Origin of DevOps
DevOps as a term originated in 2009 following a talk at the O’Reilly Velocity Conference titled “10+ Deploys per Day: Dev and Ops Cooperation at Flickr.” John Allspaw and Paul Hammond walked through some of the pains in the software development lifecycle of the time, identifying contentious scenarios that had become all too common between development and operations teams: “It’s not our machines, it’s the developers’ code!” “We can’t test our code because operations can’t get us a production environment!” John and Paul made the case that the only rational way forward was to integrate development and operations into a more cohesive unit. This talk is widely accepted as the birth of the term DevOps and the beginning of a movement in IT that is still very much with us today. For a more complete DevOps history, check out The Origin of DevOps: What’s in a Name?
Even though DevOps was coined as a term over 10 years ago, it has become a confusing buzzword in recent years, and there are a lot of misconceptions about how DevOps is actually defined. There are job postings for DevOps Engineers, a seemingly infinite number of tools in the space, and even companies offering DevOps as a service. With all of the misinformation out there, it can be hard to understand DevOps. In fact, it may be easier to start with some common misconceptions before moving on to a more formal definition of DevOps.
What DevOps is Not
- A rebranding of System Administrators and/or Operations Teams. DevOps requires more than a title change to be effective. It requires a shift in thinking. Many DevOps professionals become frustrated when they think they are joining an organization that subscribes to a DevOps philosophy, only to find out they are expected to do systems administration work.
- A separate team or role within the organization. While it is possible to start your DevOps journey with a single team, completing the transformation requires a cultural shift that is company-wide. Companies that are hiring for a DevOps role are often either very early in their DevOps journey or do not completely understand the movement.
- Beholden to a particular toolchain. While many professionals that subscribe to DevOps philosophies are versed in technologies for cloud, configuration management, and automation, they know that these technologies are just tools that can easily be exchanged with one another.
- Combining Developers and Operations into a single team and crossing your fingers. While DevOps advocates decreasing the void between developers and operations, simply putting them on the same team will not achieve the desired outcome if the two disciplines do not know how to work together.
DevOps is a young movement and many organizations are still wrestling with its definition and the value it provides. Consequently, the mistakes above are not uncommon in this field. Let’s take a look at what DevOps actually is, and how the community defines itself. While this article provides a good overview of DevOps, I highly encourage anyone looking to continue their journey in this space to read The Phoenix Project and The DevOps Handbook. I consider these books must-reads for any new hire at Callibrity.
What DevOps Is
- A way of doing work. DevOps is a philosophy that permeates the entire organization from top to bottom. Often the journey to DevOps starts with pockets of highly motivated people or teams adopting DevOps principles. These teams are rewarded by bringing great value to the organization more quickly and reliably than other teams. At this point the benefits of DevOps are measurable and other teams in the organization begin to take notice. This new way of doing work then spreads like a virus to ‘infect’ the entire company.
- A resolution to the core IT conflict of stability vs. innovation. In traditional organizations, there is a constant struggle between those who want to add new features to software to provide more value to customers, and those who want to keep things running smoothly so as not to disrupt the stability that customers have come to rely on. DevOps aims to remove this conflict by allowing the organization to move quickly while also maintaining stability.
- A combination of lessons learned from a variety of process improvement movements. These include The Toyota Way, Lean Manufacturing, The Goal, Lean Six Sigma, and Agile. Each of these movements has revolutionized its respective industry and become a gold standard for operational excellence. DevOps has combined these methodologies and extracted the core principles into what we call The Three Ways.
The Three Ways are instrumental to any DevOps practitioner and deserve a deeper dive. Let’s break down the three ways and provide some examples of each one.
The Three Ways
The First Way - Flow
The First Way says we need to accelerate the flow of work through the organization. This is often done by visualizing how work is done using a process like value stream mapping. Value stream mapping involves creating a visual representation of how work flows through an organization from beginning to end. In software we often think of the start being ideation of a new feature, and the end being the new feature running in production and available to customers. When creating a value stream map you should involve all parties that take part or have a stake in the work being done. This helps to ensure everyone has the same base level understanding and no critical parts of the work are missed.
The First Way promotes the idea of Systems Thinking, meaning we should try to look at an entire system when considering solutions, not just an individual part. It is not valuable to spend an excessive amount of time perfecting the development process if things grind to a halt once we try to deploy to production. In this case the deployment process is a bottleneck within our value stream. Once a value stream map has been completed, it becomes much easier to identify such bottlenecks in an organization. A bottleneck is not always caused by time spent doing work. Sometimes it comes from the time pending work spends sitting in a queue after a handoff, or waiting for approval in an overburdened change management process.
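To make the idea concrete, here is a minimal sketch of how a finished value stream map could be turned into numbers. The stage names and hours are invented for illustration, and the "flow efficiency" figure (active work time divided by total elapsed time) is a common Lean measure added here as an assumption, not something prescribed by The Three Ways.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    """One step in a value stream (all names and numbers here are made up)."""
    name: str
    work_hours: float  # time someone is actively working on the item
    wait_hours: float  # time the item sits in a queue before work starts

    @property
    def total_hours(self) -> float:
        return self.work_hours + self.wait_hours


def find_bottleneck(stages: list[Stage]) -> Stage:
    """Return the stage that contributes the most elapsed time."""
    return max(stages, key=lambda s: s.total_hours)


def flow_efficiency(stages: list[Stage]) -> float:
    """Fraction of total lead time spent on active work rather than waiting."""
    lead_time = sum(s.total_hours for s in stages)
    return sum(s.work_hours for s in stages) / lead_time


# Hypothetical value stream, from idea to production.
value_stream = [
    Stage("Design", work_hours=8, wait_hours=4),
    Stage("Development", work_hours=24, wait_hours=8),
    Stage("Change approval", work_hours=1, wait_hours=72),
    Stage("Deployment", work_hours=2, wait_hours=16),
]

bottleneck = find_bottleneck(value_stream)
print(f"Bottleneck: {bottleneck.name} ({bottleneck.total_hours:.0f}h elapsed)")
print(f"Flow efficiency: {flow_efficiency(value_stream):.0%}")
```

In this made-up example the bottleneck is the change approval step, even though it involves only one hour of actual work. Almost all of its elapsed time is waiting in a queue.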
Another way to improve the flow of work is to automate whenever possible and decrease Work In Progress (WIP) to manageable levels. Organizations can see great productivity gains by automating tedious, repetitive tasks like building applications, deploying code, and provisioning infrastructure. Decreasing WIP allows teams to focus on a smaller number of tasks at a time. This increased focus leads to better quality work, decreasing the likelihood that a piece of work is sent backward in the pipeline. Ideally, all work flows from left to right and is never sent backward. Any time work is sent back in the pipeline, someone must context switch and stop the flow of the work that is currently in progress.
The First Way defines four types of work within an organization.
- Business Projects - These are projects that contribute directly to the bottom line of the business. Usually these are new products or new features.
- Internal IT Projects - Internal projects have the goal of making the organization more efficient. Examples of these types of projects usually involve internal tools or automation.
- Changes - DevOps defines changes as work. An example of making changes is deploying code to production or tweaking configuration.
- Unplanned Work - This is the type of work organizations should always strive to minimize to zero if possible. Examples of unplanned work are production incidents and work being sent backwards in the value stream. Unplanned work is detrimental to the productivity of an organization.
The Second Way - Feedback Loops
In order to accelerate the flow of work in an organization, we need to introduce and amplify feedback loops throughout the value stream. The faster a developer or any member of the team receives feedback on broken builds, failing tests, or features that don’t meet requirements, the faster the issue can be rectified and the work can continue to flow. We want to create feedback loops that inform from right to left in the value stream. You will often hear the term ‘shift left’, which means shifting the feedback as far left, or as early in the value stream, as possible. This is an extremely important concept because the earlier problems are discovered, the less expensive they are to fix.
To create some of these feedback loops, organizations often rely on Continuous Integration pipelines that provide immediate feedback on problems with new code. This works by running tests and creating builds whenever developers check new code into the repository. Developers are then alerted to the status of their latest check-in. As organizations become more mature they can move to Continuous Delivery, which means always having a build that is ready to deploy. Ultimately you can move to a Continuous Deployment model, in which code flows from a developer’s machine all the way to production without any manual intervention.
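The feedback behavior of such a pipeline can be sketched in a few lines. This is only an illustration of the fail-fast idea, not how any particular CI product works: the step names and the stand-in functions are hypothetical, and a real pipeline would call the project's own build and test tools.

```python
from typing import Callable

# A step is a (name, function) pair; the function returns True on success.
Step = tuple[str, Callable[[], bool]]


def run_pipeline(steps: list[Step]) -> tuple[bool, str]:
    """Run steps in order and stop at the first failure (fail fast)."""
    for name, run in steps:
        if not run():
            # Report the earliest failing step so the developer hears
            # about it without waiting for later steps to finish.
            return False, f"FAILED at '{name}'"
    return True, "all steps passed"


# Hypothetical stand-ins for real build and test commands.
def compile_code() -> bool:
    return True


def unit_tests() -> bool:
    return 2 + 2 == 4


def package_build() -> bool:
    return True


ok, message = run_pipeline([
    ("compile", compile_code),
    ("unit tests", unit_tests),
    ("package", package_build),
])
print(ok, message)
```

Stopping at the first failing step keeps the loop short: the developer is told which step broke as soon as it breaks.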
Feedback loops should not only exist in the workflow pipeline, but also in production environments. It is extremely important to have good metrics around how your products are behaving in production so that you can respond to your customers’ needs. From a software and operations perspective, we are usually concerned about the performance of our production applications. A good set of metrics to use for monitoring production workloads is the Four Golden Signals: Error Rate, Latency, Traffic, and Saturation. Production monitoring allows you to obtain real-time feedback on problems and, ideally, respond to issues before your customers even know they exist.
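As a rough sketch, the four signals can be computed from a window of request records. Everything specific here is an assumption made for illustration: the `Request` shape, counting only 5xx responses as errors, using the 95th percentile for latency, and using CPU utilization as the saturation measure. Real systems choose these according to their own workloads.

```python
import math
from dataclasses import dataclass


@dataclass
class Request:
    latency_ms: float
    status: int  # HTTP status code


def golden_signals(requests, window_seconds, cpu_utilization):
    """Summarize one monitoring window.

    cpu_utilization stands in for saturation; memory, queue depth, or
    connection pool usage may be better fits for a given service.
    """
    if not requests:
        raise ValueError("no requests in window")
    errors = sum(1 for r in requests if r.status >= 500)
    latencies = sorted(r.latency_ms for r in requests)
    # Nearest-rank 95th percentile.
    p95_index = math.ceil(0.95 * len(latencies)) - 1
    return {
        "error_rate": errors / len(requests),
        "latency_p95_ms": latencies[p95_index],
        "traffic_rps": len(requests) / window_seconds,
        "saturation": cpu_utilization,
    }


# Hypothetical two-second window of traffic.
window = [Request(120, 200), Request(95, 200), Request(310, 500), Request(101, 200)]
signals = golden_signals(window, window_seconds=2, cpu_utilization=0.62)
print(signals)
```

Alerting on a handful of signals like these, rather than on every available metric, keeps the production feedback loop focused on what customers actually experience.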
The Third Way - Continuous Improvement
Left alone, all processes tend toward entropy, and software development is no different. The Third Way calls for creating a culture of continuous experimentation, learning, and improvement within an organization. An organization should look to carve out time for employees to experiment outside of their daily duties. Google, for example, is famous for allowing employees to spend 20% of their time on work outside of business projects.
Another key tenet of The Third Way is shared responsibility. A healthy culture avoids pointing fingers or playing the blame game when things go wrong. In fact, a DevOps organization will look at failures as opportunities for improvement. Organizations like Netflix deliberately introduce failures in production using tools like Chaos Monkey, which purposely takes down infrastructure in an effort to force improvement in software resiliency. Practicing failure in this way helps them withstand large-scale outages, including the loss of an entire AWS region.
The Third Way also subscribes to the idea that repetition leads to mastery. An organization must practice certain scenarios in order to get good at them. An example of this is production incidents. All organizations suffer production incidents, but how the organization responds can vary greatly. The Third Way advocates for a concept called Game Days, in which production-like outages are created so the team can practice responding in an orderly fashion. The goal of this effort is to reduce Mean Time To Resolution (MTTR) for outages, which is another key metric of highly functioning DevOps organizations.
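Mean Time To Resolution is simple to compute once incidents are recorded consistently. The sketch below assumes each incident is stored as a (detected, resolved) pair of timestamps; the incident data is invented, and teams differ on whether the clock starts at detection or at the start of the outage.

```python
from datetime import datetime, timedelta


def mean_time_to_resolution(incidents):
    """Average of (resolved - detected) across all incidents."""
    if not incidents:
        raise ValueError("need at least one incident")
    total = sum(
        (resolved - detected for detected, resolved in incidents),
        timedelta(),  # start value so sum() adds timedeltas, not ints
    )
    return total / len(incidents)


# Hypothetical incident log: (detected, resolved).
incidents = [
    (datetime(2024, 1, 3, 9, 0), datetime(2024, 1, 3, 10, 30)),      # 90 minutes
    (datetime(2024, 2, 11, 22, 15), datetime(2024, 2, 11, 22, 45)),  # 30 minutes
    (datetime(2024, 3, 7, 14, 0), datetime(2024, 3, 7, 15, 0)),      # 60 minutes
]

mttr = mean_time_to_resolution(incidents)
print(f"MTTR: {mttr}")  # prints "MTTR: 1:00:00"
```

Tracking this number before and after a series of Game Days is one way to check whether the practice is paying off.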
Today we explored the key principles of DevOps, some misconceptions, and its origin. Keep an eye out for Part 3 of this series where we will take a look at some DevOps best practices, discuss how you can begin a DevOps transformation in your organization, and take a look at a case study.