Moneyball for software engineering, part 2

What if Billy Beane managed a software team?

Brad Pitt was recently nominated for the Best Actor Oscar for his portrayal of Oakland A’s general manager Billy Beane in the movie “Moneyball,” which was also nominated for Best Picture. If you’re not familiar with “Moneyball,” the subject of the movie and the book on which it’s based is Beane, who helped pioneer an approach to improve baseball team performance based on statistical analysis. Statistics were used to help find valuable players that other teams overlooked and to identify winning strategies for players and coaches that other teams missed.

Last October, when the movie was released, I wrote an article discussing how Moneyball-type statistical analysis can be applied to software teams. As a follow-up, and in honor of the recognition that the movie and Pitt are receiving, I thought it might be interesting to spend a little more time on this, focusing on the process and techniques that a manager could use to introduce metrics to a software team. If you are intrigued by the idea of tracking and studying metrics to help find ways to improve, the following are suggestions of how you can begin to apply these techniques.

Use the data that you have

The first puzzle to solve is how to get metrics for your software team. There are all kinds of things you could “measure.” For example, you could track how many tasks each developer completes, the complexity of each task, the number of production bugs related to each feature, or the number of users added or lost. You could also measure less obvious activities or contributions, such as the number of times a developer gets directly involved in customer support issues or the number of times someone works after hours.

In the movie “Moneyball,” there are lots of scenes showing complicated-looking statistical calculations, graphs, and equations, which makes you think that the statistics Billy Beane used were highly sophisticated and advanced. But in reality, most of the statistics that he used were very basic and were readily available to everyone else. The “innovation” was to examine the statistics more closely in order to discover key fundamentals that contributed to winning. Most teams, Beane realized, disregarded these fundamentals and failed to use them to find players with the appropriate skills. By focusing on these overlooked basics, the Oakland A’s were able to gain a competitive edge.

To apply a similar technique to software teams, you don’t need hard-to-gather data or complex metrics. You can start by using the data you have. In your project management system, you probably have data on the quantity and complexity of development tasks completed. In your bug-tracking and customer support systems, you probably have data on the quantity, rate, and severity of product issues. These systems also typically have simple reporting or export mechanisms that make the data easy to access and use.
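To see how little tooling this requires, here is a minimal sketch that summarizes completed tasks per iteration. The rows, field names, and values are all hypothetical stand-ins for whatever your project management system actually exports:

```python
from collections import defaultdict

# Hypothetical rows from a project-tracker CSV export (in practice you
# would read these with csv.DictReader from the exported file).
rows = [
    {"iteration": "2012-01", "engineer": "ana", "complexity": "3"},
    {"iteration": "2012-01", "engineer": "ben", "complexity": "5"},
    {"iteration": "2012-02", "engineer": "ana", "complexity": "2"},
    {"iteration": "2012-02", "engineer": "ben", "complexity": "8"},
]

task_counts = defaultdict(int)   # tasks completed per iteration
complexity = defaultdict(int)    # summed complexity per iteration
for row in rows:
    task_counts[row["iteration"]] += 1
    complexity[row["iteration"]] += int(row["complexity"])

for it in sorted(task_counts):
    print(it, task_counts[it], complexity[it])
```

A few lines like these, run against a fresh export each iteration, are enough to get started; nothing more sophisticated is needed at first.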

Looking at this type of data and the trends from iteration to iteration is a good place to start. Most teams don’t devote time to examining historical trends and various breakdowns of the data by individual and category. For each individual and the team as a whole, you can look at all your metrics, focusing, at least to start, on fundamentals like productivity and quality. This means examining the history of data captured in your project management, bug tracking, and customer support systems. As you accumulate data over time, you can also analyze how the more recent metrics compare to those in the past. Gathering and regularly examining fundamental engineering and quality metrics is the first step in developing a new process for improving your team.

Establish internal support

If you saw the movie “Moneyball,” you know that much was made of the fact that some of the old, experienced veterans had a hard time getting on board with Beane and his newfangled ideas. The fact that Beane had statistics to back up his viewpoints didn’t matter. The veterans didn’t get it, and they didn’t want to. They were comfortable relying on experience and the way things were already done. Some of them saw the new ideas and approaches — and the young guys who were touting them — as a threat.

If you start talking about gathering and showing metrics, especially individual metrics that might reveal how one person compares to another, some people will likely have concerns. One way to avoid serious backlash, of course, is to move slowly and gradually. The other thing you might do to decrease any negative reaction is to cultivate internal supporters. If you can get one, two, or a few team members on board with the idea of reviewing metrics regularly as a way to identify areas for improvement, they can be a big help to allay the fears of others, if such fears arise.

How do you garner support? Try sitting down individually with as many team members as possible to explain what you hope to do. It’s the kind of discussion you might have informally over lunch. If you are the team manager, you’ll want to explain carefully that historical analysis of metrics isn’t designed to assign blame or grade performance. The goal is to spend more time examining the past in the hopes of finding incremental ways to improve. Rather than just reviewing the past through memory or anecdotes, you hope to get more accuracy and keener insights by examining the data.

After you talk with team members individually, you’ll have a sense for who’s supportive, who’s ambivalent, and who’s concerned. If you have a majority of people who are concerned and a lack of supporters, then you might want to rethink your plan and try to allay concerns before you even start.

Once you begin gathering and reviewing metrics as a team, it’s a good idea to go back and check in periodically with both the supporters and the naysayers, either individually or in groups. You should get their reactions and input on suggestions to improve. If support is going down and concern is going up, then you’ll need to make adjustments, or your use of metrics is headed for failure. If support is going up and concern is going down, then you are on the right track. Beane didn’t get everyone to buy into his approach, but he did get enough internal supporters to give him a chance, and more were converted once they saw the results.


Embed metrics in your process

While there might be some benefit to gathering and looking at historical metrics on your own, you’ll gain more by sharing metrics with your team. The best time to share and review metrics is in meetings that are part of your regular review and planning process. You can simply add an extra element to those meetings to review metrics. Planning or review meetings that occur on a monthly or quarterly basis, for example, are appropriate forums. Too-frequent reviews, however, may become repetitive and wasteful. If you have biweekly review and planning meetings, for example, you might choose to review metrics every other meeting rather than every time.

To make the review of metrics effective and efficient, you can prepare the data for presentation, possibly summarizing key metrics into spreadsheets and graphs, or a small number of presentation slides (examples and resources for metric presentation can be found and shared at codermetrics.org). You will want to show your team summary data and individual breakdowns for the metrics gathered. For example, if you are looking at productivity metrics, then you might look at data such as:

  • The number and complexity of tasks completed in each development iteration.
  • A breakdown of task counts completed grouped by complexity.
  • The total number of tasks and sum of complexity completed by each engineer.
  • The trend of task counts and complexity over multiple development iterations.
  • A comparison of the most recent iteration to the average, highs, and lows of past iterations.
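Each item in that list amounts to a simple group-by over the same task records. As a sketch — again with hypothetical data and field layout — the breakdowns might be computed like this:

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical completed-task records: (iteration, engineer, complexity).
tasks = [
    ("iter1", "ana", 2), ("iter1", "ben", 5), ("iter1", "ana", 3),
    ("iter2", "ana", 5), ("iter2", "ben", 2), ("iter2", "ben", 8),
    ("iter3", "ana", 3), ("iter3", "ben", 3), ("iter3", "ana", 5),
]

# Breakdown of task counts grouped by complexity.
by_complexity = Counter(c for _, _, c in tasks)

# Total number of tasks and summed complexity per engineer.
per_engineer = defaultdict(lambda: [0, 0])
for _, eng, c in tasks:
    per_engineer[eng][0] += 1
    per_engineer[eng][1] += c

# Trend of summed complexity per iteration, and the most recent
# iteration compared to the average, high, and low of past ones.
per_iter = defaultdict(int)
for it, _, c in tasks:
    per_iter[it] += c
history = [per_iter[i] for i in sorted(per_iter)]
*past, latest = history
print("trend:", history)
print("latest %d vs avg %.1f (high %d, low %d)"
      % (latest, mean(past), max(past), min(past)))
```

The point is not the code itself but that every summary in the list above is cheap to produce once the raw task data is exported.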

To start, you are just trying to get the team in the habit of looking at recent and historical data more closely, and it’s not necessary to have a specific intent defined. The goal, when you begin, is to just “see what you can see.” Present the data, and then foster observations and open discussion. As the team examines its metrics, especially over the course of time, patterns may emerge. Individual and team ideas about what the metrics reveal and potential areas of improvement may form. Opinions about the usefulness of specific data or suggestions for new types of metrics may come out.

Software developers are smart people. They are problem-spotters and problem-solvers by nature. Looking at the data from what they have done and the various outcomes is like looking at the diagnostics of a software program. If problems or inefficiencies exist, it is likely the team or certain individuals will spot them. In the same way that engineers fix bugs or tune programs, as they more closely analyze their own metrics, they may identify ways to tune their performance for the better.

There’s a montage in the middle of the movie “Moneyball” where Beane and his assistant are interacting with the baseball players. It’s my favorite part of the movie. They are sharing their statistics-inspired ideas of how games are won and lost, and making small suggestions about how the players can improve. Albeit briefly, we see in the movie that the players themselves begin to figure it out. Beane, his assistant, the coaches and the players are all a team. Knowledge is found, shared, and internalized. As you incorporate metrics-review and metrics-analysis into your development activities, you may see a similar organic process of understanding and evolution take place.

Set short-term, reasonable goals

Small improvements and adjustments can be significant. In baseball, one or two runs can be the difference between a win or a loss, and a few wins over the course of a long season can be the difference between second place and first. On a software team, a 2% productivity improvement equates to just 10 minutes “gained” per eight-hour workday, but that translates to an “extra” week of coding for every developer each year. The larger the team, the more those small improvements add up.

Once you begin to keep and review metrics regularly, the next step is to identify areas that you believe can be improved. There is no rush to do this. You might, for example, share and review metrics as a team for many months before you begin to discuss specific areas that might be improved. Over time, having reviewed their contributions and outcomes more closely, certain individuals may themselves begin to see ways to improve. For example, an engineer whose productivity is inconsistent may realize a way to become more consistent. Or the team may realize there are group goals they’d like to achieve. If, for example, regular examination makes everyone realize that the rate of new production bugs found matches or exceeds the rate of bugs being fixed, the team might decide they’d like to be more focused on turning that trend around.

It’s fine — maybe even better — to target the easy wins to start. It gets the team going and allows you to test and demonstrate the potential usefulness of metrics in setting and achieving improvement goals. Later, you can extend and apply these techniques to other areas for different, and possibly more challenging, types of improvements.

When you have identified an area for improvement, either for an individual or a group, you can identify the associated metric and the target goal. Pick a reasonable goal, especially when you are first testing this process, remembering that small incremental improvements can still have significant effects. Once the goal is set, you can use your metrics to track progress month by month.

To summarize, the simple process for employing metrics to make improvements is:

  1. Gather and review historical metrics for a specific area.
  2. Set metrics-based goals for improvement in that area.
  3. Track the metrics at regular intervals to show progress toward the goal.
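The three steps above reduce to very little code. As a sketch, here is a hypothetical goal of cutting open production bugs from a baseline of 50 down to 40, with the metric checked at monthly intervals:

```python
baseline, goal = 50, 40                # step 2: a metrics-based goal
monthly_open_bugs = [50, 47, 45, 42]   # steps 1 & 3: gathered at intervals

for month, value in enumerate(monthly_open_bugs, start=1):
    progress = (baseline - value) / (baseline - goal)
    print("month %d: %d open bugs, %.0f%% of the way to goal"
          % (month, value, progress * 100))
```

Printed month by month like this, the same numbers you reviewed in meetings double as a running progress report toward the goal.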

The other thing to keep in mind when getting started is that it’s best to focus on goals that can be achieved quickly. Like any test case, you want to see results early to know if it’s working. If you target areas that can show improvement in less than three months, for example, then you can evaluate more quickly whether utilizing metrics is helpful or not. If the process works, then these early and easier wins can help build support for longer-term experiments.

Take one metric at a time

It pays to look at one metric at a time. Again, this is similar to tuning a software program. In that case, you instrument the code or implement other techniques to gather performance metrics and identify key bottlenecks. Once the improvable areas are identified, you work on them one at a time, tuning the code and then testing the results. When one area is completed, you move on to the next.

Focusing the team and individuals on one key metric and one area at a time allows everyone to apply their best effort to improve that area. As with anything else, if you give people too many goals, you run the risk of making it harder to achieve any of the goals, and you also make it harder to pinpoint the cause of failure should that occur.

If you are just starting with metrics, you might have the whole team focus on the same metric and goal. But over time you can have individuals working on different areas with separate metrics, as long as each person is focused on one area at a time. For example, some engineers might be working to improve their personal productivity while others are working to improve their quality.

Once an area is “tuned” and improvement goals are reached, you’ll want to continue reviewing metrics to make sure you don’t fall back. Then you can move on to something else.

Build on small successes

Let’s say that you begin reviewing metrics on production bug counts or development productivity per iteration; then you set some small improvement targets; and after a time, you reach and sustain those goals. Maybe, for example, you reduce a backlog of production bugs by 10%. Maybe this came through extra effort for a short period of time, but at the end, the team determines that metrics helped. Perhaps the metrics helped increase everyone’s understanding of the “problem” and helped maintain a focus on the goal and results.

While this is a fairly trivial example, even a small success like this can help as a first step toward more. If you obtain increased support for metrics, and hopefully some proof of the value, then you are in a great position to gradually expand the metrics you gather and use.

In the long run, the areas that you can measure and analyze go well beyond the trivial. For example, you might expand beyond core software development tasks and skills, beyond productivity and quality, to begin to look at areas like innovation, communication skills, or poise under pressure. Clearly, measuring such areas takes much more thought and effort. To get there, you can build on small, incremental successes using metrics along the way. In so doing, you will not only be embedding metrics-driven analysis in your engineering process, but also in your software development culture. This can extend into other important areas, too, such as how you target and evaluate potential recruits.

Moneyball-type techniques are applicable to small and big software teams alike. They can apply in organizations that are highly successful as well as those just starting out. Bigger teams and larger organizations can sometimes afford to be less efficient, but most can’t, and smaller teams certainly don’t have this luxury. Beane’s great success was making his organization highly competitive while spending far less money (hence the term “Moneyball”). To do this, his team had to be smarter and more efficient. It’s a goal to which we can all aspire.

Jonathan Alexander looked at the connection between Moneyball and software teams in a related webcast.

Photo: scoreboard by popofatticus, on Flickr


  • http://www.third-bit.com Greg Wilson

    Jorge Aranda, who has done some fascinating studies of how software teams actually work, wrote a critique of your first “Moneyball” post at http://www.neverworkintheory.org/?p=225 that readers may find interesting. (Full disclosure: I also contribute to the “Never Work In Theory” blog.) We’d welcome your comments on his comments…

  • http://www.evolvondemand.com Dan Enthoven

    I think the key thing about the Moneyball story for companies is that the most important decision is who you hire. This is what Billy Beane was really looking at.

    A great hire raises the quality of everyone’s work, and a bad hire demotivates everyone.

    Most companies have a really bad process for determining who to hire. The most common tool is the interview. Charismatic people do well, but charisma doesn’t equal performance.

    Other people give tests or “thought questions.” (Write code to find the prime numbers in this array. How many AA batteries get sold each year?)

    All the data out there shows that these filtering techniques are close to coin tossing when it comes to predicting performance.

    As companies think about how to use analytics, I think hiring has got to be the first thing they review.