
Improving Bug Hunting [Part 1]

An overview of ideas around the activity of finding and solving bugs by Mário Constantino

In Software development, IT strategy. By Mário Constantino, Tech Lead.

Introduction

At some point in their careers, all software developers have had to deal with bugs reported by users. These reports can arrive as an email from customer management or through a system where users open support tickets that are picked up by the development team. However, developers generally dislike this work: they prefer to build new things rather than face their own mistakes by solving bugs.

In this series, I will explore some ideas around the activity of finding and solving bugs, explain why developers dislike it, and suggest ways we can improve it. That's the whole point!

A TYPICAL DEBUG SESSION

It's worth noting that, despite the frustration that comes with solving production bugs, we're not doing enough to improve the efficiency of the process.

One morning, our team was assigned a new ticket to fix a bug. Although I wasn't usually involved in solving bugs for that system, a colleague and I jumped in to assist the teammates who handle these issues. To illustrate the situation, I'll use the example of a typical e-commerce application.

In this case, we have the typical components of an application. There are multiple systems for different domains within the product. Let's examine each of them:

  • Customer System - This system manages the concept of a customer, including configuration around the customers.
  • Products System - This is where products are stored and their configuration is managed.
  • Carts System - Users generally add items to their cart. This system handles that logic.
  • Calculation System - This system takes in the user's cart with the products and calculates the total cost and available delivery dates, among other things.
  • Orders System - This system is self-explanatory, as it handles the processing of orders.

Figure 1 - How information flows within a typical e-commerce system

Note that even if you don't have a separate system for each concept, you will usually end up having all the concepts within a single system.

The investigation of the ticket went roughly as follows:

  • 1. Identify the problem: The Order ID is not returning any results in the Order database.
  • 2. Check if the problem is with the Calculation system: Review the logs of the Calculation system to see if any errors occurred during the calculation of the order.
    • Identify the path to the log file.
    • Search for any errors related to the Order ID or the calculation process.
    • Keep in mind any time differences between the server timezone and your own timezone.
    • Look for any Null Reference Exceptions or other issues in the logs.
  • 3. If the problem is not with the Calculation system, check the Product configuration.
    • Check the UI for the Products configuration, but keep in mind it may not be user-friendly.
    • Use SQL to query the products table in the database.
    • Join with the configuration table to view the relevant configuration information.
    • Review the results for any issues.
  • 4. Check the Customer configuration:
    • Use SQL to query the customer database for the relevant customer information.
    • Join with the configuration table to view the relevant configuration information.
    • Compare with other customers to identify any discrepancies.
    • Check when the configuration was last changed and if it coincides with the issue.
  • 5. Once the issue has been identified, take steps to fix it and prevent it from happening again in the future.
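Step 2 is the part of the session where time is easily lost, so here is a minimal sketch of it in Python. Everything in it is an assumption made for illustration: the log line format, the sample log content, the order ID 1042, and the server and local UTC offsets. It filters error lines that mention a given order and converts the server timestamps to local time, so the timezone difference does not have to be worked out by hand.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical log format: "<ISO timestamp in server time> <LEVEL> <message>".
# The real Calculation system's log layout is not described in the article.
SAMPLE_LOG = """\
2023-03-01T08:59:58 INFO Calculating cart for order 1001
2023-03-01T09:00:02 ERROR NullReferenceException while calculating order 1042
2023-03-01T09:00:05 INFO Calculating cart for order 1043
"""


def find_order_errors(log_text, order_id, server_tz, local_tz):
    """Return (local_time, message) pairs for ERROR lines mentioning order_id."""
    hits = []
    for line in log_text.splitlines():
        parts = line.split(" ", 2)
        if len(parts) != 3:
            continue  # skip lines that do not match the assumed format
        stamp, level, message = parts
        # Match the ID as a whole token so 104 does not match 1042.
        if level != "ERROR" or str(order_id) not in message.split():
            continue
        # The log stores naive server time; attach the server timezone first,
        # then convert to the reader's own timezone.
        server_time = datetime.fromisoformat(stamp).replace(tzinfo=server_tz)
        hits.append((server_time.astimezone(local_tz), message))
    return hits


# Assumed offsets: server logs in UTC, developer works in UTC+1.
server_tz = timezone.utc
local_tz = timezone(timedelta(hours=1))

hits = find_order_errors(SAMPLE_LOG, 1042, server_tz, local_tz)
for when, message in hits:
    print(when.isoformat(), message)
```

Fixed offsets keep the sketch dependency-free; a real setup would more likely use named zones so daylight saving time is handled correctly.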

AND WHAT ENDED UP BEING THE PROBLEM?

The root cause turned out to be minor: configurations in the Customer system had been mishandled, and went missing, during a data migration. The time and resources spent identifying and resolving it, however, were substantial. The team spent almost an hour discussing, interpreting values, and consulting with colleagues, and could not pin down the cause until they found the mishandled configurations.
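The comparison between customers from step 4 is the kind of check that can be turned into a query. The sketch below uses an in-memory SQLite database from Python. The schema (`customers`, `customer_config`), the column names, and the sample rows are all hypothetical, since the article does not describe the real tables. It lists every configuration key that at least one customer has but another customer lacks.

```python
import sqlite3

# Hypothetical schema and data, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE customer_config (
    customer_id INTEGER,
    config_key TEXT,
    config_value TEXT
);
INSERT INTO customers VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO customer_config VALUES
    (1, 'delivery_zone', 'north'),
    (1, 'price_list', 'standard'),
    (2, 'price_list', 'standard');
""")

# Pair every customer with every key seen anywhere, then keep only the
# pairs that have no matching configuration row.
MISSING_SQL = """
SELECT c.id, c.name, k.config_key
FROM customers AS c
CROSS JOIN (SELECT DISTINCT config_key FROM customer_config) AS k
LEFT JOIN customer_config AS cfg
    ON cfg.customer_id = c.id AND cfg.config_key = k.config_key
WHERE cfg.customer_id IS NULL
ORDER BY c.id, k.config_key
"""

missing = conn.execute(MISSING_SQL).fetchall()
for customer_id, name, config_key in missing:
    print(f"customer {customer_id} ({name}) is missing '{config_key}'")
```

This assumes every customer is expected to have every key, which may not hold in a real system where some settings are optional. Even so, a report like this narrows the search faster than reading rows side by side.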

This experience is not uncommon in complex systems, where even small issues can take a significant amount of time and resources to resolve. That makes it all the more important to have a streamlined process in place to identify and resolve such issues quickly and efficiently.

Upon reflection, the team realized that they were not as efficient as they could have been and spent more time than necessary on this issue. They identified that there were opportunities to improve their process, particularly in identifying where time was spent and implementing changes to address similar issues more effectively in the future.

To that end, the team plans to analyze their current process and identify areas where improvements can be made. They will evaluate the tools they use, the methods they employ, and the roles and responsibilities of team members to identify opportunities for streamlining their process.

CONCLUSION

The team recognizes that there is no magic bullet to cover all scenarios, but they are committed to implementing changes that can address 85% of issues more efficiently. By doing so, they hope to reduce the time and resources required to identify and resolve issues, ultimately increasing the efficiency of their work and improving the overall performance of their complex system.
