What do Google’s DevOps Research and Assessment (DORA) and Rollbar have to do with each other? DORA identified four key metrics for measuring DevOps performance and defined four performance levels, from Low to Elite. One way for a team to become an Elite DevOps performer is by focusing on Continuous Code Improvement.
What is DORA?
The DevOps Research and Assessment (DORA) team is a Google research group that is best known for its work on measuring and understanding DevOps practices and capabilities across the IT industry. The group produced an annual State of DevOps Report (2014-2019) as well as an ROI whitepaper providing insights into DevOps transformations.
The DORA team identified four key metrics that indicate software development and delivery performance. To learn more about the findings, we recommend the book “Accelerate: The Science of Lean Software and DevOps: Building and Scaling High Performing Technology Organizations”, co-authored by Nicole Forsgren, who is a DORA team lead.
What is Continuous Code Improvement?
Continuous Code Improvement is an approach to maintaining and updating any software application that allows for faster deployments, fewer errors and quicker fixes to problems. Companies that follow this approach have a compact feedback loop to know when there's a code issue that needs to be fixed, fix it, and go back to writing and running code.
The continuous code improvement feedback loop consists of:
- Visibility: real-time identification that there is an issue
- Grouping: recognizing error patterns to provide trustworthy signals
- Automated Response: proactive issue remediation through automated workflows
- Error Resolution: contextual information and metadata that enable a quick code fix
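To make the Grouping step concrete, here is a minimal sketch of collapsing repeated errors into one item by a fingerprint. The fingerprint rule (exception type plus innermost stack frame) and the dictionary shape are assumptions chosen for illustration; this is not a description of how Rollbar groups errors.

```python
from collections import Counter


def fingerprint(error: dict) -> tuple:
    """Build a grouping key from the exception type and the innermost frame."""
    # Assumption: frames are ordered from outermost to innermost call.
    top_frame = error["frames"][-1]
    return (error["type"], top_frame["file"], top_frame["function"])


def group_errors(errors: list) -> Counter:
    """Count occurrences per fingerprint so repeats show up as one item."""
    return Counter(fingerprint(e) for e in errors)


# Hypothetical error events: two KeyErrors raised from the same place
# via different callers, plus one unrelated timeout.
errors = [
    {"type": "KeyError", "frames": [
        {"file": "app.py", "function": "main"},
        {"file": "cart.py", "function": "total"}]},
    {"type": "KeyError", "frames": [
        {"file": "api.py", "function": "handle"},
        {"file": "cart.py", "function": "total"}]},
    {"type": "TimeoutError", "frames": [
        {"file": "api.py", "function": "fetch"}]},
]

groups = group_errors(errors)
for key, count in groups.most_common():
    print(key, count)
```

Grouping on the innermost frame means the two `KeyError` events count as one recurring problem even though they were reached through different callers, which is one way to turn a noisy stream into a signal a team can trust.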
What Are the Four Key Metrics DORA Identified?
| Aspect of Software Delivery Performance | Elite | High | Medium | Low |
|---|---|---|---|---|
| Deployment Frequency: For the primary application or service you work on, how often does your organization deploy code to production or release it to end users? | On-demand (multiple deploys per day) | Between once per day and once per week | Between once per week and once per month | Between once per month and once every six months |
| Lead Time for Changes: What is your lead time for changes (i.e., how long does it take to go from code committed to code successfully running in production)? | Less than one day | Between one day and one week | Between one week and one month | Between one month and six months |
| Time to Restore Service: How long does it generally take to restore service when a service incident or a defect that impacts users occurs (e.g., unplanned outage or service impairment)? | Less than one hour | Less than one day | Less than one day | Between one week and one month |
| Change Failure Rate: What percentage of changes to production or released to users result in degraded service (e.g., lead to service impairment or service outage) and subsequently require remediation (e.g., require a hotfix, rollback, fix forward, patch)? | 0-15% | 0-15% | 0-15% | 46-60% |
These key DORA metrics are used to help a DevOps organization understand where it stands and how it can improve. The first and most important aspect is to understand where your team is today. From there, setting the path to become an Elite performer and improving your DORA metrics will be much easier as you have a solid baseline to work from.
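As a starting point for that baseline, the four metrics can be computed from a simple log of deployments. The sketch below assumes a hypothetical `Deployment` record with commit, deploy, and restore timestamps; the field names and the choice of medians are illustrative assumptions, not something DORA or Rollbar prescribes.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    committed_at: datetime                   # when the change was committed
    deployed_at: datetime                    # when it reached production
    failed: bool = False                     # did it degrade service?
    restored_at: Optional[datetime] = None   # when service was restored


def hours(delta: timedelta) -> float:
    return delta.total_seconds() / 3600


def baseline(deployments: list, period_days: int) -> dict:
    """Summarize the four key metrics for one observation period."""
    if not deployments:
        raise ValueError("need at least one deployment to compute a baseline")
    lead_times = [hours(d.deployed_at - d.committed_at) for d in deployments]
    failures = [d for d in deployments if d.failed]
    restores = [hours(d.restored_at - d.deployed_at)
                for d in failures if d.restored_at is not None]
    return {
        "deploys_per_day": len(deployments) / period_days,
        "median_lead_time_hours": median(lead_times),
        "change_failure_rate": len(failures) / len(deployments),
        "median_time_to_restore_hours": median(restores) if restores else None,
    }


# Hypothetical week of deployments.
t0 = datetime(2024, 1, 1, 9, 0)
log = [
    Deployment(t0, t0 + timedelta(hours=2)),
    Deployment(t0 + timedelta(days=1), t0 + timedelta(days=1, hours=4),
               failed=True,
               restored_at=t0 + timedelta(days=1, hours=4, minutes=30)),
    Deployment(t0 + timedelta(days=2), t0 + timedelta(days=2, hours=6)),
    Deployment(t0 + timedelta(days=3), t0 + timedelta(days=3, hours=8)),
]
result = baseline(log, period_days=7)
print(result)
```

Running the same calculation over each week or sprint gives a trend line, which is usually more useful than a single snapshot when deciding what to improve first.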
What Are the Benefits of Becoming an Elite Performer?
Before we start down the path to becoming an Elite performer, let's understand the real benefits of becoming one.
The DORA research highlighted these key benefits of Elite or High Performing teams.
| Increase the speed of your deployments | Improve the stability of your software | Build security in from the start |
|---|---|---|
| The best teams deploy 208x more frequently and have lead times 106x faster than low performers. | High performers don't trade off speed and stability. The best teams recover from incidents 2,604x faster and have change failure rates 7x lower. | High performers spend 50% less time fixing security issues than low performers. |
Beginning the Journey with Continuous Code Improvement
Let's take a look at each of the four DORA metrics to understand what’s needed to start the journey to become an Elite performer and how Continuous Code Improvement and Rollbar can help.
1. Lead Time For Changes
According to DORA, Elite teams take a change from commit to production in less than one day. Ensuring a fast and smooth delivery pipeline is critical to reducing the lead time for any change, be it large or small. Working on smaller, manageable pieces of code allows teams to focus on the features and capabilities that matter to end users (customers).
Rollbar monitors errors occurring in the build pipeline as code changes progress through the testing cycle into pre-production and, ultimately, production environments. Developers are notified in real time, so they can keep an eye on this key DORA metric and begin making fixes before the automated test suites finish. This greatly reduces the average lead time for new features compared with waiting for testing cycles to finish and only then reviewing any issues.
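To see where a measured lead time falls among the DORA ranges listed earlier, a small lookup is enough. This sketch assumes a month is 30 days and that a value on a boundary belongs to the better tier; both choices are assumptions, since the published ranges do not spell them out.

```python
from datetime import timedelta

ONE_DAY = timedelta(days=1)
ONE_WEEK = timedelta(weeks=1)
ONE_MONTH = timedelta(days=30)     # assumption: a month is 30 days
SIX_MONTHS = timedelta(days=180)   # assumption: six 30-day months


def lead_time_tier(lead_time: timedelta) -> str:
    """Map a commit-to-production lead time to a DORA performance tier."""
    if lead_time < ONE_DAY:
        return "Elite"
    if lead_time <= ONE_WEEK:
        return "High"
    if lead_time <= ONE_MONTH:
        return "Medium"
    if lead_time <= SIX_MONTHS:
        return "Low"
    return "Outside the published ranges"


print(lead_time_tier(timedelta(hours=5)))
print(lead_time_tier(timedelta(days=3)))
```

Feeding this function the median lead time from your own deployment history gives a quick read on how far the team is from the Elite threshold.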
2. Deployment Frequency
Elite teams deploy on demand (multiple times per day). This deployment frequency is achievable if you are confident that your team can identify any error or defect in real time and quickly do something about it (rollback, feature flag toggle, etc.). That confidence is what lets you improve this DORA metric.
The best way to do this is directly inside the application code itself. External monitoring cannot give you the real-time insight into code execution including handled and unhandled exceptions.
Implementing Rollbar’s SDK into your application allows for real-time error detection with the ability to act on any error in real-time. Rollbar notifies you of any errors occurring early in the deployment phase, so that releases can be hot fixed, paused, or rolled back easily.
Knowing that any error will be caught immediately gives teams the confidence to deploy more frequently, getting more functionality out into the hands of users.
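The difference between handled and unhandled exceptions is easiest to see in code. The following sketch uses only the Python standard library; the `report` function and the `captured` list are hypothetical stand-ins for whatever an error-monitoring SDK would do with the event, and do not represent Rollbar's API.

```python
import sys
import traceback

captured = []  # hypothetical stand-in for sending events to a monitoring service


def report(exc: BaseException, handled: bool) -> None:
    """Record an exception together with its stack trace."""
    captured.append({
        "type": type(exc).__name__,
        "message": str(exc),
        "handled": handled,
        "trace": traceback.format_exception(type(exc), exc, exc.__traceback__),
    })


def excepthook(exc_type, exc, tb):
    """Report exceptions nobody caught, then fall back to the default hook."""
    report(exc, handled=False)
    sys.__excepthook__(exc_type, exc, tb)


# Python calls sys.excepthook for any exception that reaches the top level.
sys.excepthook = excepthook


def parse_quantity(raw: str) -> int:
    """Handled case: the code recovers, but the error is still reported."""
    try:
        return int(raw)
    except ValueError as exc:
        report(exc, handled=True)
        return 0


print(parse_quantity("abc"))
print(captured[-1]["type"], captured[-1]["handled"])
```

An external health check would see nothing wrong when `parse_quantity` silently falls back to `0`, which is why instrumentation inside the application gives a fuller picture of each deployment.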
3. Time to Restore Service
Elite teams restore service in less than one hour. The first and most important aspect of this DORA metric is knowing about the problem before your customers do, measured as Mean Time To Awareness (MTTA). From there, it’s about how quickly you can resolve the issue, measured as Mean Time To Resolution (MTTR).
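Both averages are simple to compute once each incident has three timestamps. The sketch below assumes hypothetical `started`, `detected`, and `resolved` fields, and measures both MTTA and MTTR from the start of the incident; teams that start the MTTR clock at detection would adjust the keys accordingly.

```python
from datetime import datetime
from statistics import mean


def mean_minutes(incidents: list, start_key: str, end_key: str) -> float:
    """Average elapsed minutes between two timestamps across incidents."""
    return mean([(i[end_key] - i[start_key]).total_seconds() / 60
                 for i in incidents])


# Hypothetical incident log.
incidents = [
    {"started": datetime(2024, 1, 1, 10, 0),
     "detected": datetime(2024, 1, 1, 10, 5),
     "resolved": datetime(2024, 1, 1, 10, 45)},
    {"started": datetime(2024, 1, 2, 14, 0),
     "detected": datetime(2024, 1, 2, 14, 15),
     "resolved": datetime(2024, 1, 2, 15, 15)},
]

mtta = mean_minutes(incidents, "started", "detected")   # time to awareness
mttr = mean_minutes(incidents, "started", "resolved")   # time to resolution
print(f"MTTA: {mtta:.0f} min, MTTR: {mttr:.0f} min")
```

Tracking the two numbers separately shows whether restore time is dominated by slow detection or by slow fixes, which point to different improvements.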
The ability to know about any errors and the impact of an error is critical. Rollbar not only provides real-time visibility into errors, but also identifies whether it's an error that's never been seen before, a reactivated error, or one that’s minor and has been occurring for a while.
Equally important, Rollbar provides all the insights you need to understand the impact, as well as the contextual information and metadata (stack trace, local variables, exact line of code, etc.) to fix the error quickly and address this important DORA metric.
The next phase is using this real-time capability and information to reduce the time to restore even further. Rollbar provides automated workflow capabilities to roll back a release (restore) or toggle off a feature that could be causing the specific error.
4. Change Failure Rate
Elite teams have a 0-15% change failure rate. A reduction in failed deployments will have a substantial impact on a team’s overall productivity. Spending less time on hotfixes and patches and more time on building great products is what everyone wants.
This requires insight into the quality of your application's code: how many new errors each version introduces and which errors have reappeared. Rollbar gives you insight into each deployed version and the errors, warnings, or messages captured in each release. This allows development teams to track their change failure rate over time as each deployment moves into production.
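A minimal sketch of that per-version view is shown below. It assumes each release is described by the set of error fingerprints observed while it was live, and it counts a release as a failed change if it introduced or reactivated any error. Both the data shape and that definition of "failed" are illustrative assumptions; a real team would likely count only errors that required remediation.

```python
def classify_release_errors(releases: list) -> dict:
    """Split each release's errors into new and reactivated ones.

    `releases` is an ordered list of (version, set_of_fingerprints).
    """
    seen_before = set()   # every fingerprint seen in any earlier release
    previous = set()      # fingerprints seen in the immediately prior release
    report = {}
    for version, errors in releases:
        new = errors - seen_before
        # Reactivated: seen at some point in the past, absent last release.
        reactivated = (errors & seen_before) - previous
        report[version] = {"new": new, "reactivated": reactivated}
        seen_before |= errors
        previous = errors
    return report


def change_failure_rate(report: dict) -> float:
    """Share of releases that introduced or reactivated at least one error."""
    failed = sum(1 for r in report.values() if r["new"] or r["reactivated"])
    return failed / len(report)


# Hypothetical release history.
releases = [
    ("1.0", {"KeyError:cart.total"}),
    ("1.1", set()),
    ("1.2", {"KeyError:cart.total", "ValueError:form.parse"}),
    ("1.3", set()),
]
report = classify_release_errors(releases)
print(report["1.2"])
print(change_failure_rate(report))
```

Here version 1.2 both introduces a new error and brings back one that had been fixed in 1.1, which is the kind of regression a per-version breakdown makes visible.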
Improve Your Code Quality and DORA Metrics with Rollbar
If you want to become an Elite performer, you need Rollbar, the Continuous Code Quality Platform, on your side today. Contact our team and we will show you the path to becoming an Elite performer.