Hemant Jain is the founder and owner of Rapidera Technologies, a full-service software development shop and happy Rollbar customer. He and his team focus a lot on modern software delivery techniques and tools. Prior to Rapidera, he managed large enterprise development projects.
Tools like Rollbar have changed the way development teams record and manage their exceptions. What used to be a very personal, developer-by-developer activity can now be a team-wide practice that brings greater transparency and higher application quality.
But many still treat exception monitoring as a developer activity, and they are not leveraging its benefits across all environments, from development to stage and integration, to systems testing and production. In this post and another on QA environments, I will review why exception monitoring in all environments is so beneficial, and some best practices for setting it up.
We are trying to standardize on Rollbar for exception monitoring across environments and clients. It gives our clients visibility into the application and the development process, and therefore better input into both, and it’s a good way for us to ensure quality before delivering releases to customers.
But even after release, the tool has been extremely useful for the following reasons:
1. It is needed to support CD and canary releases:
More and more, we are asked to consider continuous delivery (CD) and canary release processes. In many cases it is not possible or not a good fit, but when we do get an opportunity to implement CD, exception monitoring is the only way to support it, because code goes from the developer to the source repo and directly to prod as long as the basic tests come up green. We know very little about the code and do not have the eyeballs on exceptions that we normally would. With monitoring in place, an exception in prod is one more trigger telling us that a release should be rolled back. We already treat server monitoring this way, so it makes sense to do the same with code. We also use exception data as supporting evidence when A/B testing releases.
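To make the rollback trigger concrete, here is a minimal sketch of how exception counts could feed a canary decision. It is not Rollbar's API or our actual pipeline: `ReleaseStats`, `should_roll_back`, and the `max_ratio` and `min_requests` thresholds are all hypothetical names and values chosen for illustration, and the counts are assumed to come from whatever monitoring tool you use.

```python
from dataclasses import dataclass


@dataclass
class ReleaseStats:
    """Hypothetical summary of one release's traffic and exceptions."""
    requests: int
    exceptions: int

    @property
    def error_rate(self) -> float:
        # Avoid dividing by zero when a release has seen no traffic yet.
        return self.exceptions / self.requests if self.requests else 0.0


def should_roll_back(baseline: ReleaseStats, canary: ReleaseStats,
                     max_ratio: float = 2.0, min_requests: int = 100) -> bool:
    """Return True when the canary's exception rate is clearly worse.

    max_ratio and min_requests are illustrative thresholds, not recommendations.
    """
    if canary.requests < min_requests:
        return False  # not enough canary traffic to judge
    if baseline.error_rate == 0.0:
        return canary.exceptions > 0  # any exception is a regression
    return canary.error_rate > max_ratio * baseline.error_rate


stable = ReleaseStats(requests=10_000, exceptions=20)
canary = ReleaseStats(requests=500, exceptions=5)
print(should_roll_back(stable, canary))
```

In practice this would be one signal among several, alongside server health checks, rather than the sole rollback criterion.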
2. We cannot always ensure environment parity:
We do our best to keep all environments in parity, but full parity is simply not possible. Different applications have different infrastructure requirements. Our development environment is fairly static, and it is local. Production can run on different configurations, and even on different clouds, so we are dealing with a hybrid scenario in which each client’s production environment is unique. Differences between production and dev, such as mismatched frameworks and other artifacts, can easily cause issues in code. When this happens, exception monitoring is sometimes the only way to know.
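Exception monitoring tells you after the fact that a mismatch caused a problem. As a complement, here is a minimal sketch of a parity check that compares version manifests from two environments. The function name `find_parity_gaps` and the sample package names and versions are hypothetical, and how you collect each manifest (a lock file, a package listing, and so on) is left as an assumption.

```python
from typing import Dict, Optional, Tuple


def find_parity_gaps(
    dev: Dict[str, str], prod: Dict[str, str]
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Map each mismatched component to its (dev, prod) versions.

    None means the component is missing from that environment.
    """
    gaps = {}
    for name in sorted(set(dev) | set(prod)):
        dev_version, prod_version = dev.get(name), prod.get(name)
        if dev_version != prod_version:
            gaps[name] = (dev_version, prod_version)
    return gaps


# Illustrative manifests only; real ones would be read from each environment.
dev_manifest = {"framework": "4.2.1", "db-driver": "2.9.0", "cache-client": "1.0.3"}
prod_manifest = {"framework": "4.1.7", "db-driver": "2.9.0"}

for name, (dev_v, prod_v) in find_parity_gaps(dev_manifest, prod_manifest).items():
    print(f"{name}: dev={dev_v} prod={prod_v}")
```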
3. It is a Key Performance Indicator - accountability all the way to production:
I believe that all developers have increasing responsibility for what happens in production. In our case, because we own our clients’ development processes, there is no question that our developers are accountable for code quality all the way up to production. The analytics we receive from Rollbar help us gauge how well our dev groups are doing on code quality where it matters most: with the user. This lets us quantify the impact directly and leaves no question as to how what happens in dev affects users.