Debugging Ruby errors in a production environment can be one of the single most frustrating experiences as a developer. More often than not, the error reports are vague, and identifying the underlying causes can be difficult at best. That said, there are a few common steps that can be followed toward identifying and resolving errors that crop up in production.
More information is always better. The first step toward diagnosing any issue is to increase the log level, using the methods described in "Where Are Ruby Errors Logged?". This allows you to see everything that is happening before and after a problem occurs. There is a good chance that the problems you are experiencing have warnings or messages associated with them that don't make it into the log files by default.
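As a minimal sketch of what raising the log level does, the snippet below uses Ruby's standard-library Logger. The in-memory buffer and the log messages are assumptions made for illustration; in a Rails application the equivalent setting is usually `config.log_level` in the environment configuration.

```ruby
require "logger"
require "stringio"

# Log to an in-memory buffer so the example is self-contained;
# a real application would log to a file or STDOUT.
log_output = StringIO.new
logger = Logger.new(log_output)

# A common production default: only INFO and above are written.
logger.level = Logger::INFO
logger.debug("cache miss for user 42") # dropped at INFO level
logger.info("request started")

# Raise verbosity while investigating a problem.
logger.level = Logger::DEBUG
logger.debug("cache miss for user 42") # now written

puts log_output.string
```

Only the second debug call ends up in the output, which is exactly the kind of detail that goes missing at the default level.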
Once you've increased the log level, the next step is to start retaining logs. Identifying the request parameters, user, browser, and any other information surrounding a given error can be incredibly valuable. While accomplishing this may seem difficult from within the context of a server, it can be easily done through the use of Rollbar, and allows you to start establishing a timeline of events without worrying about the log files being rolled over.
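To illustrate the kind of context worth retaining, here is a rough sketch that bundles an exception with its surrounding details into a single JSON line. The `error_record` helper, the field names, and the sample values are all hypothetical; this is not Rollbar's API, only an illustration of the information an error tracker collects for you.

```ruby
require "json"
require "time"

# Hypothetical helper: combine an exception with the context around it
# into one JSON line that can be stored outside the rotating log files.
def error_record(error, context = {})
  {
    time: Time.now.utc.iso8601,
    error_class: error.class.name,
    message: error.message,
    backtrace: (error.backtrace || []).first(5),
    context: context
  }.to_json
end

begin
  Integer("not a number")
rescue ArgumentError => e
  # Sample context values are made up for the example.
  line = error_record(e, {
    user_id: 42,
    params: { "quantity" => "not a number" },
    user_agent: "ExampleBrowser/1.0"
  })
  puts line
end
```

Keeping each record on one line makes it easy to search and to line up against a timeline later.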
Once you've determined the log lines that relate to the problem at hand, the next step is to attempt to replicate the circumstances of the error in a development environment. Before we can do this, we first need to establish some testing guidelines. This involves doing things like mimicking the production database, the user accounts involved, and even the operating system. Everything is fair game here.
Once you've established the circumstances that you think might throw the exception or error you are hunting down, it's time to test them. Never test exceptions in production. Development and staging environments are designed to be breakable without any impact on end users, so always, always, always try to break your code in a safe environment. To make this step easier, many frameworks offer features that help with information gathering in a development environment. For example, Ruby on Rails provides a debug view helper that renders an object in a human-readable YAML format, right on the page.
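Outside of a Rails view, a similar human-readable dump can be produced with the standard-library YAML module. The sketch below assumes a made-up hash standing in for whatever object you are inspecting.

```ruby
require "yaml"

# Hypothetical data standing in for the object under suspicion.
suspect = {
  "user_id" => 42,
  "roles" => ["admin", "editor"],
  "cart" => { "items" => 0 }
}

# to_yaml gives an indented, line-per-attribute view of nested data.
dump = suspect.to_yaml
puts dump
```

This is only meant for development and staging; dumping whole objects can expose sensitive data, so it should not be left in code that ships.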
If you were unable to replicate the problem in Step 4, then it's back to the drawing board. Not every error is easy to reproduce; some have time-based constraints or other factors that make them difficult to replicate in a non-production environment. Jump back to Step 3, adjust your test parameters, and try it all over again.
Whenever an exception is raised, a backtrace is usually included in the exception instance. But what is a backtrace? In essence, it is a rundown of every file and method call leading up to the error. To be clear, a backtrace doesn't include the files and methods that were touched before the error occurred, only the chain of method calls that was active when the error happened. This allows you to "trace back" through the stack of operations that were in progress when the error happened in order to identify exactly what went wrong, and where.
As an example, let's take a look at the backtrace that is returned from the following (incredibly simplistic) code:
    def do_the_thing
      raise "a thing happened!"
    end
    do_the_thing
When do_the_thing() is executed, an exception is immediately raised. This results in the following backtrace:
    RuntimeError: a thing happened!
        from (irb):2:in `do_the_thing'
        from (irb):4
        from /home/zach/.rvm/rubies/ruby-2.3.3/bin/irb:11:in `<main>'
As you can see, the backtrace gives us more than the exception message alone. Read from the top, it shows that the exception was raised inside do_the_thing() on line 2 of the IRB session, that the method was called from line 4 of that session, and that the session itself was running under the irb executable. For more complicated stack traces, this can be invaluable, as it gives us a lot of post-mortem information, including the file and line the exception happened on.
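The same information is available programmatically. As a small sketch, the script below rescues the exception from the example and prints its class, message, and backtrace entries; the exact text of each entry varies between Ruby versions, so treat the format as approximate.

```ruby
def do_the_thing
  raise "a thing happened!"
end

begin
  do_the_thing
rescue RuntimeError => e
  puts "#{e.class}: #{e.message}"
  # Each entry names a file, a line number, and usually a method.
  # The first entry is where the exception was raised; later entries
  # are the callers, from innermost to outermost.
  e.backtrace.each { |frame| puts "  from #{frame}" }
end
```

Because e.backtrace is an ordinary array of strings, it can be logged, filtered down to your application's own files, or attached to an error report.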
Ultimately, debugging errors in a Ruby application comes down to the amount of information you can eke out of your logs. Rollbar empowers you to not only identify what is happening, but when, where, to whom, and how often. Rather than having to sort through pages of raw text logs, Rollbar aggregates data from individual exceptions and errors and creates a dashboard with as much information as possible for analyzing these issues.
When properly configured, these exceptions can be tied directly to user accounts, and tracked over time across an easy-to-read graph—with deployment markers to boot. While this doesn't necessarily tell you exactly what an issue is, it comes as close as possible to doing so by providing you with more information than you could possibly ask for.