How software triage is changing with AI agents

"Imagine spending hours manually sifting through error logs, trying to pinpoint the root cause of a critical issue. This is a common challenge in software development."

Many engineers face this problem today, and AI agents can greatly improve and automate the process. To be effective, these agents need as much context about the issue as possible. The question, then, is how we can give them the relevant context and real production data.

With this question in mind, we started to investigate how engineering teams can get closer to this reality: AI agents that understand which issues matter most, then focus on each one, diagnose the root cause, and propose a fix. Finally, an AI agent can use that diagnosis to modify the code.

Let’s get into how we can achieve this…

How to identify Critical Issues

The first problem we face is understanding what is actually important. This can start with a simple question like “what are the current trending errors in my project” or “get the top errors in the production environment”.

Today this is a manual job: multiple datasets feed into a ticketing system, or customer-reported issues eventually get bumped up and solved. Either way, the process is slow and time-consuming.

If we can use production data directly from our application code, this process can be streamlined. Let’s look at a real example.

In this example, I have the application instrumented with an error monitoring tool (yes, we are using Rollbar) that gathers real-time data from the application code.

Next, we need to connect our tooling (VS Code or Cursor) to this rich data. The way to do this is with an MCP server. We will cover how to connect an MCP server to tools like VS Code or Cursor a bit later.

Assuming your connection to the AI agent is set up, you can ask a basic question like “list all the rollbars happening that’s ongoing” or “get the top items in the production environment happening right now, return the item counter”.

Now you will receive the top issues (real errors) in your production code that are affecting the most customers and happening most often.
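To make "top issues" concrete, here is a minimal sketch of the kind of ranking involved. It does not call Rollbar or the MCP server. The `Item` fields, the `top_items` helper, and the sample data are all hypothetical and exist only to illustrate ordering errors by customers affected and then by how often they occur.

```python
from dataclasses import dataclass


@dataclass
class Item:
    counter: int         # item number you would later pass to the agent
    title: str
    environment: str
    occurrences: int     # how often the error happened (hypothetical field)
    affected_users: int  # distinct customers hit (hypothetical field)


def top_items(items, environment="production", limit=3):
    """Return items from one environment, most impactful first."""
    in_env = [item for item in items if item.environment == environment]
    ranked = sorted(
        in_env,
        key=lambda item: (item.affected_users, item.occurrences),
        reverse=True,
    )
    return ranked[:limit]


# Made-up sample data, not real Rollbar output.
SAMPLE_ITEMS = [
    Item(101, "TypeError: cannot read property 'id'", "production", 420, 130),
    Item(102, "TimeoutError: upstream request timed out", "production", 95, 60),
    Item(103, "KeyError: 'plan'", "staging", 900, 0),
    Item(104, "ValueError: invalid currency code", "production", 310, 130),
]

for item in top_items(SAMPLE_ITEMS):
    print(item.counter, item.title, item.affected_users, item.occurrences)
```

The staging item is filtered out even though it has the most occurrences, which is the point of scoping the question to the production environment.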

Root Cause Analysis (RCA)

With this data, we can now uncover the root cause with full context, along with a suggested way to fix it.

We can now use a statement like “get item 1025 and diagnose the root cause”. What is critical here is that, from this simple statement, the AI receives a large payload with the full stack trace, custom payload data, and any other production data you provide via a monitoring tool like Rollbar.
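To give a feel for why that payload matters, here is a rough sketch of how a stack trace and custom data could be flattened into text an agent can reason over. The MCP server handles this for you. The `build_diagnosis_context` helper and the shape of `SAMPLE_ITEM` are assumptions made for illustration, not the actual Rollbar payload format.

```python
import json


def build_diagnosis_context(item):
    """Flatten one error item into a text block for an AI agent."""
    lines = [
        f"Item #{item['counter']}: {item['title']}",
        "Stack trace (most recent call last):",
    ]
    for frame in item["frames"]:
        lines.append(f"  {frame['filename']}:{frame['lineno']} in {frame['method']}")
    lines.append("Custom data:")
    lines.append(json.dumps(item.get("custom", {}), indent=2, sort_keys=True))
    return "\n".join(lines)


# Made-up example item; field names are hypothetical.
SAMPLE_ITEM = {
    "counter": 101,
    "title": "KeyError: 'plan'",
    "frames": [
        {"filename": "app/billing.py", "lineno": 42, "method": "charge_customer"},
        {"filename": "app/plans.py", "lineno": 17, "method": "lookup_plan"},
    ],
    "custom": {"customer_tier": "enterprise", "feature_flag": "new_checkout"},
}

print(build_diagnosis_context(SAMPLE_ITEM))
```

The more of this context the agent sees (frames, custom fields, environment), the less it has to guess when proposing a root cause.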

Let’s have a look at what the AI agent returns as the root cause and potential solution.

Here you can see the data as it appears in Rollbar for item 922, which the AI agent can access and use to investigate the root cause and suggest code fixes.

The AI agent has also suggested code changes that an engineer can review and accept, or use as a starting point to resolve the issue. The time needed for root cause analysis and for working out a fix has been dramatically reduced, and the engineer has a great head start on solving the issue.

Implementing the MCP (Model Context Protocol) server

To connect an AI agent to your Rollbar account, you use what is called an MCP server. The MCP server is the connection between your data (from Rollbar, in this case) and the LLM, which can use this data to help triage issues, debug their root cause, and suggest or even create the resolution (code changes).

Here is the MCP server you can use with your existing Rollbar account today.

https://github.com/rollbar/rollbar-mcp-server

The configuration to set up the MCP server is straightforward!
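As a rough sketch, the snippet below generates the kind of JSON entry that MCP clients read. Treat the details as assumptions to verify against the repository README: the npm package name `@rollbar/mcp-server`, the `ROLLBAR_ACCESS_TOKEN` environment variable, and launching via `npx` are our best understanding rather than something stated in this post. The `rollbar_mcp_config` helper is hypothetical. Cursor typically reads `mcpServers` from `.cursor/mcp.json`, while VS Code reads `servers` from `.vscode/mcp.json`.

```python
import json


def rollbar_mcp_config(access_token, client="cursor"):
    """Build an MCP client config entry for a Rollbar MCP server.

    Package name and env var are assumptions; check the repo README.
    """
    server = {
        "type": "stdio",
        "command": "npx",
        "args": ["-y", "@rollbar/mcp-server@latest"],  # assumed package name
        "env": {"ROLLBAR_ACCESS_TOKEN": access_token},  # assumed variable name
    }
    # VS Code uses "servers"; Cursor and several other clients use "mcpServers".
    top_level_key = "servers" if client == "vscode" else "mcpServers"
    return {top_level_key: {"rollbar": server}}


# Placeholder token: substitute a project access token from your own account.
print(json.dumps(rollbar_mcp_config("<your-project-access-token>"), indent=2))
```

Paste the printed JSON into the config file for your editor, and keep the real token out of version control.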