Justin Bozonier is the Product Optimization Specialist at GrubHub, formerly Sr. Developer/Analyst at Cheezburger. He has engineered a large, scalable analytics system and worked on actuarial modeling software. As Product Optimization Specialist he is currently leading split test design, implementation, and analysis. The opinions expressed here represent my own and not those of my employer. Justin is a DZone MVB and is not an employee of DZone and has posted 27 posts at DZone.

Thoughts on the Zero Defect Mentality of TDD

10.16.2012

Challenges that led me here

  • Heated arguments at work regarding how much TDD is enough and how little is too little. How do we find common ground?
  • An acknowledgement of technical debt and a confusion about how to leverage it. How much debt is too much?
  • Being labeled as pedantic and a zealot. Is a Zero-Defect Mindset ever worthwhile? When?
  • Learning exercise in how we can gain concrete insights using our intuition in a methodical fashion. How can I communicate abstract ideas without concrete evidence in a rigorous manner?

This article represents my lessons learned from this exploration.

Making the Abstract Concrete

It was a normal day at work. A co-worker and I were passionately arguing for the benefits of strict, pure, clean TDD against a couple of other equally passionate co-workers who were sold on the idea of everything in moderation. I had just completed a four-month, full-time Agile immersion with an amazing albeit very idealistic consultant, and his ideas about a zero-defect mindset, and the idea that it was practically achievable, were seductive. I had entertained my own idealistic fantasies for a while, never really thinking they could or should be taken so seriously.

It was liberating. 

Also, it was isolating. Having these thoughts and that excitement placed me on one extreme side of a continuum, with many of my other teammates on the other side or somewhere in the middle, nearer to the side of limiting TDD in the name of practicality. Conversation after conversation, debate after debate, we ended in the same place, perhaps even galvanized a bit by the disagreement and a bit further from finding common ground.

I finally came to understand that regardless of what I knew to be right, everyone on my team had their own perception and their own knowledge of what was right as well. That's not sarcasm. In social interactions there are multiple realities and all of them need to be appreciated and considered valid enough to be worth understanding. 

How could I model my perception of reality in some sort of a concrete way that would enable me to make rigorous (albeit somewhat subjective) predictions? How could I ensure my mental model was at least self-consistent and workable? Like any self-respecting geek, I decided the best way to model uncertainty was to run thousands of simulations and projections of reality to see what lessons could be gleaned.

Finding Common Ground in a Common Purpose

The first decision I had to make was figuring out the underlying metric I would use to compare the two development methodologies. Having been just recently introduced to systems thinking and the Theory of Constraints, I thought a great start would be to use the value throughput of the companies. 

But what is value? When we speak of delivering value to our business customers, what is it we are actually delivering? In discussions with my team, we decided that business value is best seen as the present-day value of your company were it to be valued by an external party. For the purposes of the simulation, I assume the value delivered by each completed story is a random number drawn from a value distribution and assigned without regard for feature size. That's right, it means a feature that takes next to nothing to develop may create an enormous amount of value for the company.

For further assumptions and specifics of my model, read on.

My Model for Thinking About This (AKA My Domain Model)

Concepts and their role in the simulation:
  • User Story- In this simulation, a User Story is the smallest unit of work that the Product Development Team can work on that provides the slightest bit of business value. Each story also has an associated size.
  • Business Customers- Generate a random set of stories each iteration. Each story's size is drawn from a discrete distribution, and its value is also randomly assigned upon creation.
  • Product Backlog- Repository for all stories. New stories are all added as a top priority in the order delivered. Bugs are randomly dispersed into the Product Backlog when they are received. 
  • Product Development Team- Anyone and everyone responsible for getting the release out the door. This includes programmers, testers, technical writers, etc., etc. The Product Development Team iterates over the Product Backlog and works to complete stories. They are also the ones deciding the cost of the various stories. Over time the speed of their work can go up if a velocity range (a maximum velocity greater than the minimum velocity) is specified on construction. The function which controls the team's performance improvement is an "Experience Curve" as documented here: http://en.wikipedia.org/wiki/Experience_curve_effects Without getting too into it, this experience curve essentially models the decreasing cost of development over time.
  • End Users- Who the Product Development Team releases to. Because the Product Development Team includes *everyone* needed to release the software, the End Users may receive the software immediately afterwards. End Users discover bugs in the software. This is currently set to a constant rate per story per iteration. So if the defect rate is 1%, then a team with a hundred completed stories can expect to have, give or take, one story per iteration reenter the Product Backlog as a new Bug Story. The size of the Bug Story is randomly determined based upon a discrete bug size distribution.
  • Bug Story- A Bug Story is a story that is focused on fixing a defect in the software. These stories are unlike normal stories in that they have no real value for the team and thus don't improve throughput. A Bug Story actually represents more of an opportunity cost as valuable work could be done in its place if the Bug Story hadn't needed to be written.
  • Support Team- Who the bugs are reported to. Currently only really used to track the total bug count. Could be used in the future to eliminate bugs due to "user error".
Each of these concepts maps directly to an object in the JavaScript simulation.
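
To make the experience curve and the constant defect rate a little more concrete, here is a minimal sketch. It is not taken from the simulation's source: the function names, the 80% progress ratio, and the idea that velocity is inversely proportional to unit cost and capped at the maximum velocity are all assumptions made for illustration.

```javascript
// Hypothetical sketch: none of these names come from the simulation's repository.

// Experience curve: the cost of the n-th unit is
// first_unit_cost * n^(log2(progress_ratio)).
// A progress_ratio of 0.8 means cost falls to 80% with every doubling of n.
function experience_curve_cost(first_unit_cost, units_completed, progress_ratio)
{
  var exponent = Math.log(progress_ratio) / Math.LN2;
  return first_unit_cost * Math.pow(units_completed, exponent);
}

// Assumption: velocity is inversely proportional to unit cost, starts at
// min_velocity, and can never exceed max_velocity.
function velocity_after(iterations_completed, min_velocity, max_velocity, progress_ratio)
{
  var relative_cost = experience_curve_cost(1, iterations_completed + 1, progress_ratio);
  return Math.min(max_velocity, min_velocity / relative_cost);
}

// Expected number of Bug Stories per iteration at a constant
// per-story, per-iteration defect rate.
function expected_bugs_per_iteration(defect_rate, completed_story_count)
{
  return defect_rate * completed_story_count;
}

console.log('Velocity in the first iteration: ' + velocity_after(0, 1, 16, 0.8));
console.log('Velocity after 10 iterations: ' + velocity_after(10, 1, 16, 0.8));
console.log('Expected bugs at 1% with 100 stories: ' + expected_bugs_per_iteration(0.01, 100));
```

With equal minimum and maximum velocities (as in the standard team's settings), this sketch keeps velocity flat, because the cap is reached immediately.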

How The Simulation Works

First, this is how the topmost level of the simulation runs:
function run_simulation()
{
  var simulation_settings =
  {
    'teams_to_simulate' : 10,
    'weeks_to_project' : 1 * 52,
    // Samples that story sizes and Bug Story sizes are drawn from.
    'story_size_distribution' : [0, 1, 1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 4],
    'bug_size_distribution' : [0,0,0,1,1,1,1,1,1,1,1,1,2,2,2,2,2,4,4],
    // Defect ratio: bugs found per completed story per iteration.
    'tdd_defect_ratio' : 0.07,
    'min_tdd_velocity' : 1,   // starting velocity
    'max_tdd_velocity' : 16,  // velocity ceiling
    'std_defect_ratio' : 0.14,
    'min_std_velocity' : 20,
    'max_std_velocity' : 20
  };
  
  var tdd_std_comparison_chart = new Chart();
  var developmentProcessFactory = new DevelopmentProcessFactory(tdd_std_comparison_chart);
  
  var simulated_development_teams = new SimulatedDevelopmentTeams(simulation_settings);
  simulated_development_teams.create_tdd_development_teams_using(developmentProcessFactory);
  simulated_development_teams.create_standard_development_teams_using(developmentProcessFactory);
  
  for(var i = 0; i < simulation_settings.weeks_to_project; i++)
  {
    simulated_development_teams.iterate();
  }
  
  tdd_std_comparison_chart.draw_chart_to('value_delivered_chart');
}

The distributions you see are assumed to be valid randomly chosen samples of real data. There's one set of data for story sizes and another set of data for bug sizes. In the interest of full disclosure, this is NOT real data. That's for my co-workers only. :) 
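
Drawing from a list of samples like these can be sketched in a few lines. This is a hypothetical stand-in, not the simulation's own DiscreteDistribution: only the method name use_these_as_samples appears in the article's code, while the constructor argument and the draw method are assumptions made for illustration.

```javascript
// Hypothetical stand-in for the simulation's DiscreteDistribution; the
// constructor argument and draw() are assumptions, not the author's API.
function SampleBackedDistribution(random_number_source)
{
  // Any function returning a number in [0, 1); defaults to Math.random.
  this._random = random_number_source || Math.random;
  this._samples = [];
}

SampleBackedDistribution.prototype.use_these_as_samples = function (samples)
{
  // Copy the list so later changes by the caller don't affect the draws.
  this._samples = samples.slice();
};

// Picks one sample uniformly at random, so a value that appears more
// often in the sample list is proportionally more likely to be drawn.
SampleBackedDistribution.prototype.draw = function ()
{
  if (this._samples.length === 0)
  {
    throw new Error('No samples to draw from');
  }
  var index = Math.floor(this._random() * this._samples.length);
  return this._samples[index];
};

var bug_sizes = new SampleBackedDistribution();
bug_sizes.use_these_as_samples([0,0,0,1,1,1,1,1,1,1,1,1,2,2,2,2,2,4,4]);
console.log('Size of the next Bug Story: ' + bug_sizes.draw());
```

Under this scheme the bug size list above yields a size of 1 in 9 out of every 19 draws on average, simply because 9 of its 19 entries are 1. Passing in the random number source also makes the draws repeatable in tests.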

 

The overall process is shown in the following code:
DevelopmentProcess.prototype.iterate = function()
{
  this._report.next_iteration();

  this._customer.deliver_new_stories_to(this._product_backlog);
  this._development_team.work_from(this._product_backlog);
  this._development_team.move_finished_stories_to(this._end_users);
  this._end_users.test_stories_and_report_failures_to(this._bug_queue);
  this._bug_queue.prioritize_and_move_bugs_to(this._product_backlog);
  
  this._product_backlog.report_to(this._report);
  this._end_users.report_to(this._report);
  this._bug_queue.report_to(this._report);
};

That code also shows why I have a disdain for the fixation many developers have with instantiating objects within other objects to "hide worthless noise". With nothing instantiated inside it, the method reads pretty damn well.

Constructing the simulated development teams is handled by the DevelopmentProcessFactory, which constructs the teams using the TDD methodology as well as the standard development teams:
DevelopmentProcessFactory.prototype.create_tdd_development = function (simulation_settings)
{
  var tdd_development_team_report = new Report();
  tdd_development_team_report.plot_results_in_color('green');
  tdd_development_team_report.plot_results_on(this._tdd_comparison_chart);
  
  var business_customers = new BusinessCustomers();
  
  var product_backlog = new ProductBacklog();
  
  var development_team = new DevelopmentTeam(simulation_settings.story_size_distribution);
  development_team.velocity_begins_at(simulation_settings.min_tdd_velocity);
  development_team.maximum_possible_velocity_is(simulation_settings.max_tdd_velocity);
  
  var random_bug_size_generator = new DiscreteDistribution();
  random_bug_size_generator.use_these_as_samples(simulation_settings.bug_size_distribution);
  
  var end_users = new EndUsers(random_bug_size_generator);
  end_users.find_this_many_bugs_per_story_per_iteration(simulation_settings.tdd_defect_ratio);
  
  var support_team = new SupportTeam();
  
  // This is an ugly construction... How to improve it??
  var development_process = new DevelopmentProcess(business_customers, product_backlog, development_team, end_users, support_team, tdd_development_team_report);
  
  return development_process;
};

That code does instantiate objects within a method call. In this case I rationalize it as being part of a factory, where instantiation is the factory's concern. I'm not certain that makes for the cleanest code, though. I'll still be iterating over this after I publish this article. :)
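
One possible answer to the "ugly construction" comment in the factory is to pass the collaborators as a single object with named properties. The sketch below is only an idea, not something in the repository: NamedDevelopmentProcess and its property names are hypothetical, chosen so that the stored fields line up with the ones the iterate method reads.

```javascript
// Hypothetical sketch: NamedDevelopmentProcess and its property names are
// illustrative and do not appear in the simulation's repository.
function NamedDevelopmentProcess(collaborators)
{
  var required_names = ['customer', 'product_backlog', 'development_team',
                        'end_users', 'bug_queue', 'report'];
  for (var i = 0; i < required_names.length; i++)
  {
    var name = required_names[i];
    if (!collaborators[name])
    {
      throw new Error('Missing collaborator: ' + name);
    }
    // Store each collaborator under the underscore-prefixed field that
    // iterate() reads, e.g. this._product_backlog.
    this['_' + name] = collaborators[name];
  }
}

// Empty stand-ins so the sketch runs on its own.
var business_customers = {};
var product_backlog = {};
var development_team = {};
var end_users = {};
var support_team = {};
var team_report = {};

// Each argument is labeled at the call site, so order no longer matters.
var development_process = new NamedDevelopmentProcess({
  customer: business_customers,
  product_backlog: product_backlog,
  development_team: development_team,
  end_users: end_users,
  bug_queue: support_team,
  report: team_report
});
```

The trade-off is a little more typing at the call site in exchange for a constructor call that can be read without looking up the parameter order.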

Here's the rest of the code: http://github.com/jcbozonier/Monte-Carlo-Supporting-TDD

Everything should work in the latest versions of Firefox and Chrome. It requires HTML5 in order to use the canvas for charting.

Here's a sample chart with 50 TDD teams (green tick marks) and 50 standard teams (red tick marks), each running over two years:

[Chart: Total Value to Date vs. Time Passed]

Feel free to open the code, modify the JSON simulation settings yourself, and run your own simulations. Note that as soon as you press the Run button your browser will seize up for a bit. I recommend Chrome for its amazing speed. Don't kill the process! I promise it will finish eventually. ;)

[Insert lesson about how important User Experience design is here]

How It Was Built

Baby steps. I started with one team that could be modeled and just showed final iteration stats on the web page. Then I moved on to simultaneously comparing that team with a team using a standard development model, using a table of data. Next up, I found a pretty good HTML5 charting API and got the key data visualized (Total Value to Date vs. Time Passed).

Lessons Reinforced
  • Lowering the defect rate, even at the cost of reduced performance, results in higher value throughput in the long term. Over the short term, however, lowering the defect rate is hardly ever optimal.
  • A higher defect rate results in a much higher spread of possible value throughput... in other words there's a higher variance in what you can expect in terms of value output from a product development team.
  • Every development team has a point where the highest possible testing and quality rigor begins to outperform the less rigorous teams. The trick is identifying where this begins to happen for your particular company or project.
All of these discussions claim that you should always TDD, always bake quality in, always etc., etc. For some company, these statements are just as accurate for the opposing viewpoint. Where is that point for your company? You'll need to have some answer for this before you can make any sort of a real argument, at least one based upon the ideal of striving for zero defects.
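
To get a feel for where that point might sit, here is a deliberately simplified expected-value sketch. It is not the Monte Carlo simulation itself, and everything in it is an assumption for illustration: constant velocity, every story and bug costing one point, every story being worth one unit of value, and bugs being fixed before any new work. The velocities and defect rates are borrowed from the settings shown earlier, with the TDD team held at its maximum velocity.

```javascript
// Simplified expected-value sketch, not the author's Monte Carlo code.
// Assumptions: constant velocity, every story and bug costs one point,
// every story is worth one unit of value, and bugs are fixed before new work.
function cumulative_value_by_week(velocity, defect_rate, weeks)
{
  var completed_stories = 0;
  var value_by_week = [];
  for (var week = 0; week < weeks; week++)
  {
    // Each completed story spawns defect_rate Bug Stories per iteration,
    // and bug work can use up at most the whole week's capacity.
    var bug_points = Math.min(velocity, defect_rate * completed_stories);
    completed_stories += velocity - bug_points;
    value_by_week.push(completed_stories);
  }
  return value_by_week;
}

// Returns the first 1-based week in which the rigorous team's cumulative
// value exceeds the standard team's, or null if that never happens.
function first_week_rigorous_team_leads(rigorous, standard, weeks)
{
  var rigorous_value = cumulative_value_by_week(rigorous.velocity, rigorous.defect_rate, weeks);
  var standard_value = cumulative_value_by_week(standard.velocity, standard.defect_rate, weeks);
  for (var week = 0; week < weeks; week++)
  {
    if (rigorous_value[week] > standard_value[week])
    {
      return week + 1;
    }
  }
  return null;
}

var crossover_week = first_week_rigorous_team_leads(
  { velocity: 16, defect_rate: 0.07 },
  { velocity: 20, defect_rate: 0.14 },
  52);
console.log('Rigorous team pulls ahead in week: ' + crossover_week);
```

In this toy model the slower, lower-defect team starts behind and only pulls ahead once the faster team's bug load has eaten enough of its capacity. Changing the velocities or defect rates moves the crossover, which is the kind of explicit assumption that can be argued about directly.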

My Hacker Mentality

Given that it's all about deciding where testing becomes the highest-value decision, I realized why I never test personal projects. I set the expected life of my projects to be practically zero, so I end up unable to reasonably justify any testing. By my estimates, testing won't produce any real value. The real problem with my hacker mentality is that I tend to underestimate how long I'll be working on something. Take this simulation as an example. I began it with no tests because I figured I'd slam something together in a night and be done with it. However, I enjoyed it much more than I expected and ended up wanting to explore the nooks and crannies of my model.

Maybe some day I'll learn?

Conclusion

Assuming a business lifetime of T years, where T is far enough out in the future, the more testing the better and the more defects you can prevent the better. This holds regardless of the costs we encounter, because in the long run, if we don't test, our primary concern will be just preventing errors from occurring in pre-existing features, thus barring work on any new features that add value.

However, if you can guarantee yourself a limited time period T, it may actually be in your best interest not to have a zero-defect mindset. Just don't think that you can switch to a 20-year life span at the end of the 5th year and see an instant turnaround in value throughput.

If you've read through this far, you deserve a reward. Here are my conclusions on the questions I asked in the beginning:
  • How do we find common ground? Share our assumptions and make them explicit. Codify them so that they can't be conveniently shifted when the arguments get uncomfortable.
  • How much debt is too much? I didn't model technical debt in terms of needed refactoring... just in terms of defect likelihood. Too much debt is so much that you spend more time paying maintenance costs than delivering value.
  • Is a Zero-Defect Mindset ever worthwhile? When? Yes it is, when you have set a goal of a sufficiently long lifetime for your product.
  • How can I communicate abstract ideas without concrete evidence in a rigorous manner? Hopefully I just did.
   
Published at DZone with permission of Justin Bozonier, author and DZone MVB. (source)
