PERT Estimation 101

bear-to-do-list

Task estimates; the very real enemy of the developer. The numbers you provide for a task or piece of project work (whether this is in hours, days, story points, etc.), as an ‘estimate’, are often not being viewed as an approximation by all involved. This particular subject is covered, rather marvellously by Robert C.Martin, aka Uncle Bob, in his book The Clean Coder: A Code of Conduct for Professional Programmers; a book that I implore every developer to read. To quote Robert C.Martin directly, I believe this covers the underlying issue pretty well in a good number of cases:

The problem is that we view estimates in different ways. Business likes to view estimates as commitments. Developers like to view estimates as guesses. The difference is profound.

On a good number of occasions, I have found myself taking a good hiding, due to the estimates I have offered up on project work; normally where you find the estimate has been relayed to the customer as a concrete guarantee. In many cases, both parties should have shouldered more responsibility for this disparity; as the developer, I should have been more explicit as to what my estimates actually meant. In turn, those relaying information to the customer should have probably dug deeper into the heart of the matter, to extract further raw information as to the ‘likelihood’ that something could be achieved by a certain time.

Good dialogue and communication is a key factor, of course, as is making sure that you are always clear when you are providing an ‘estimate’ and making a ‘commitment’.

This is a subject that I always find myself circling back to and that I truly want to get better at, plain and simple. For that reason, this post will be geared towards looking at one potential strategy for improving the estimation process; the Program Evaluation and Review Technique (which is covered again pretty well in ‘The Clean Coder’).

What is PERT?

The Program Evaluation and Review Technique (PERT) encapsulates an analysis technique that is used for more accurately estimating large and complex projects. The basic idea is to reveal a distribution, or create a probabilistic structure, to reduce inaccuracies in estimates and attempt to account for uncertainty in a project’s schedule. It was created in 1957 for use by the U.S. Navy Special Projects Office to manage the Polaris nuclear submarine project.

In short, the process involves identifying the likely values for the minimum time and maximum time a task will take whilst pinpointing the ‘most likely’ estimate for a task’s completion, from observing a calculated distribution. Uncertainty is factored in using some standard deviation values.

How does it work?

The concept is as follows; first, when an estimate is required, you provide three numbers:

  • O: Optimistic Estimate (known as the ‘wow, how the hell did I do it that quick!’ estimate).
  • N: Nominal Estimate (the estimate with the greatest chance of being successful, or the correct estimate essentially).
  • P: Pessimistic Estimate (the doomsday estimate, where everything goes wrong except the Earths tectonic plates opening up to consume you, as you code the feature).

The first figure, ‘O’, involves thinking about the ‘pipe dream’ figure, i.e. if you could type the code at 1000 thousand lines a minute, making zero errors, and everything was to plug into place and work the first time. If Chuck Norris was coding the feature, this would be his estimate (this should have less than a one percent chance of being the actual time taken in order for the calculation to be meaningful). Next up, come up with the figure for N, which is defined as the estimate with the highest ‘chance’ of being the correct one. On a distribution bar chart of estimates, this would be the highest bar. Lastly, define P, which should be presented as the ‘worst case’ scenario estimate, stopping short of accounting for absolute catastrophes (again, this should have less than a one percent chance of being the ‘correct’ estimate, when all is said and done).

You’ve got three numbers, hoorah! A probability distribution can be gleaned for these figures simply enough (with ET being the expected time, I’ve changed the name of the variables to make more sense to me, when compared to ‘The Clean Coder’ or Wiki, but feel free to go with what fits for you):

ET = (O + 4N + P) / 6

The wiki definitions for all of these numbers are good, so I’ve included them below:

Wiki Definitions for PERT Calculation Facets

Wiki Definitions for PERT Calculation Facets.

The last part of this wonderful, little, puzzle is calculating the standard deviation (SD); which equates to getting your mitts on a value that indicates the degree of uncertainty that is associated with a given task:

SD = (P - O) / 6

Let’s observe a detailed example to get to grips with this fully, concentrating on how some extra elements come into play for a series of tasks.

A detailed example

In this example, Ryan (a name that came totally off the top of my head, milliseconds of thought went into that!) has a total of five items that he needs to estimate, that are part of a full project. Here are the specific tasks and Ryan’s initial estimates (expressed in days), before applying PERT calculations:

SIDE NOTE: Breaking a large project or piece of work into smaller tasks can often be a good strategy for mitigating risk up front and weeding out bugbears ahead of time.

Task Estimate
Write stored procedures for the customer export page 3
Write Web API components for the customer export page 2
Build the business logic layer for the customer export page 5
Build and style the customer export page (UI) 3
Assist Hayley with automated tests 1

That comes to a grand total of 14 days.

Ryan is now asked to apply the PERT approach to his estimates, and the mental cogs start to turn. Here’s a run-down of Ryan’s thoughts on each part of the project, in order:

  1. Stored Procedures: “Ok, I think I am fairly likely to complete this within 3 days, but there are quite a few elements of Dave’s existing helper stored procedures that I don’t understand. So, I’ll actually have to factor that in. Oh, and I have to do a data conversion!”
  2. Web API Components: “I’ve written most the of Web API components up to this point, so I’m well versed in this area at least and am confident with this figure. Even if everything went wrong it would only stretch to 3 days.”
  3. Business Logic Layer: “Jane wrote most of this layer and I’m not that well-versed in how it hangs together. The domain behind this is actually quite new to me. This could take significantly more time…”
  4. Building the page and styling: “The specification looks good and I’ve done a few similar pages in the past. It could take slightly longer if things did go wrong, as there are a few UI bells and whistles, including effects, that are required…”
  5. Assisting Hayley: “I was just kind of time boxing this in my mind, it’s almost certain she will need support beyond this, however – it really is an unknown!”

After a bit of head scratching and discussion with other team members, the figures below come out of the boiling pot. PERT has been applied, times are expressed in days and abbreviations used for the sake of space. Values are rounded to one decimal place:

Task O N P ET SD
Write stored procedures for the customer export page 3 5 11 5.7 1.3
Write Web API components for the customer export page 1 2 3 2 0.3
Build the business logic layer for the customer export page 2 6 13 6.5 1.8
Build and style the customer export page (UI) 1 4 9 4.3 1.3
Assist Hayley with automated tests 1 3 7 3.3 1

ET and SD values are calculated using the formulas previously discussed. Firstly, using the ET values, we can understand that Ryan will ‘likely’ be finished with these tasks (based on the sum of the ET values) in 21.8 days, let’s say 22 days then. To get a feel for the ‘uncertainty’ behind these tasks we have to look at the square root of the sum of the squares of the SD values for the tasks (a bit of a mouthful), as shown here:

(1.3 2 + 0.3 2 + 1.8 2 + 1.3 2 + 1 2) 1/2 =
(1.69 + 0.09 + 3.24 + 1.69 + 1) 1/2 =
7.71 1/2 = ~2.8 days

We can surmise from this that there could be 2.8 days, so let’s say 3 days, of uncertainty to factor in; therefore Ryan could well take somewhere in the region of 25 days to complete the series of tasks, or possibly even 28 days (if we double the time calculated for SD across the sequence). However, we know from this that anything over these values is an unlikely outcome. We are a far cry from the original estimate of 14 days to cover all of the tasks.

Summary

The general premise is to add some pragmatism and realism to the whole process of estimation. I have, on many occasions, estimated too optimistically and then have struggled to hit the mark; this is just one technique for attempting to mitigate that risk by being more methodical in the estimation process.

I hope you’ve enjoyed this primer and happy coding, estimating, etc. Until the next time!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s