Estimation is waste?

There is an ongoing debate about whether to use estimates in your software development process, largely fueled by #NoEstimates. As is often the case, the battle has been picked up by many who have become overzealous: without fully understanding the argument, they tell you with great certainty that you should never estimate.

There are many good explanations of why estimating may not help you, and some great explanations of alternative ways to get the information you need, or to help you understand why the information was not needed. But I want to focus on one of my pet peeves: “Thou shalt not estimate because estimating is ‘waste’.” Every time I see that I shudder, and quite often I am left with the feeling that the person writing doesn’t understand either Lean or #NoEstimates.


What is Waste?

First of all, the statement makes it sound as though waste is bad, and the word itself does seem to imply that. However, Lean is a lot less emotive with the term. Lean is often confused with, and simplified into, the mere reduction or removal of waste, but this is a very lazy and incorrect interpretation.

Lean is about productivity first and foremost, and is about reducing ‘waste’ ONLY when AND if doing so does not harm the productivity of the system. In other words, in Lean ‘waste reduction’ is far less important than system improvement, but for whatever reason we get hung up on waste, especially when bashing others. We reduce waste to improve the system: waste reduction is not our goal, it is just a tool to help us.

Is Estimation waste?

Is Estimation waste?  The short answer is yes, but that is only part of the truth.  Lean considers waste to be any activity that does not directly add value to your product, and classes it as either ‘Necessary Waste’ or ‘Pure Waste’.

Waste covers a whole host of things, including planning, testing, reporting, breaks, vacation, sickness, and a great many others, far too many to mention.

So calling Estimation ‘waste’ is akin to calling Planning ‘waste’. If we were to eliminate, say, planning and testing in an effort to reduce waste, we would very likely cause more and worse waste by producing the wrong thing (overproduction waste), in the wrong order (overproduction), or with poor quality (rework waste).

In other words not all waste is bad and not all waste should be removed – simply calling something ‘waste‘ does not help the conversation.

Sometimes a little waste now can save a lot of waste later.


Is it beneficial?

The real question is whether the activity helps the system and, as a follow-up, whether there is a better way of achieving the same thing.

Questions to ask when considering waste:

  1. Does this activity help the system to be productive now and in the future? (Would removing it impact our productivity?)
  2. Is there a better way to achieve the same outcome?

I can’t answer the question of whether Estimation is beneficial in your system, because every system is unique. If you are using estimation for forecasting purposes then I’d suggest that there may be better alternatives, and #NoEstimates may be a good place to start.  But forecasting is only one use of an estimate; your system may find that estimates are beneficial, and it is for you to decide.

Next time you see someone use a blanket statement about eliminating ‘waste’ as an absolute and unqualified justification for not doing something, please challenge them to qualify it. Remember that waste is often necessary; our goal is more often to improve or understand the wasteful activity than to eliminate it entirely.


Let’s consider a world without waste

If you are still unsure, think about what would happen to your system if you abolished all wasteful activities:

  • Vacations – typically 10% of your productivity lost.
  • Coffee breaks – 10 minutes every 2 hours = another 8% productivity lost
  • Toilet breaks
  • Stand-ups – yep they are waste too.
  • Demos, Retrospectives, Planning, all are ‘waste’

Just imagine how productive you would be with no direction, no feedback and no staff?

 

 

Monte Carlo Syndrome

Monte Carlo Method

Monte Carlo Forecasting is a method for creating predictive forecasts by repeatedly running random simulations drawn from sample data to see a range of possible outcomes, and then using those outcomes to forecast the results of larger (or future) situations.  The predictive model offers a percentage probability of the result falling within certain ranges.

It can be a very effective tool for statistical analysis of data, and there has been a surge in its use in software delivery forecasting.  It can be a useful tool even with small sample sizes, but it relies on three crucial premises.

  1. The sample data MUST be reflective of the forecast situation.
  2. The data must also be statistically independent (the result of one event does not affect another).
  3. The greater the sample size, the more reliable the simulation will be.
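To make the method concrete, here is a minimal sketch of a Monte Carlo delivery forecast. It is my own illustration rather than any particular tool's implementation: the function names, the weekly throughput figures and the backlog size are all made up. Note that the random draw is exactly where the premises above matter, because every simulated week is treated as an independent sample from past weeks.

```python
import math
import random


def monte_carlo_weeks(weekly_throughput, backlog_size, runs=10_000, seed=1):
    """Simulate how many weeks it takes to finish backlog_size stories.

    Each simulated week draws one value at random (with replacement) from
    the observed weekly throughput, i.e. it assumes future weeks look like
    past weeks and that weeks do not influence each other.
    """
    if not any(t > 0 for t in weekly_throughput):
        raise ValueError("need at least one week with non-zero throughput")
    rng = random.Random(seed)
    results = []
    for _ in range(runs):
        remaining, weeks = backlog_size, 0
        while remaining > 0:
            remaining -= rng.choice(weekly_throughput)
            weeks += 1
        results.append(weeks)
    return sorted(results)


def percentile(sorted_results, pct):
    """Smallest outcome that at least pct percent of the runs met or beat."""
    index = max(0, math.ceil(len(sorted_results) * pct / 100) - 1)
    return sorted_results[index]


# Hypothetical data: stories finished in each of the last six weeks.
samples = [3, 5, 2, 6, 4, 4]
outcomes = monte_carlo_weeks(samples, backlog_size=40)
print("50% of runs finished within", percentile(outcomes, 50), "weeks")
print("85% of runs finished within", percentile(outcomes, 85), "weeks")
```

The percentages that come out are percentages of simulated runs, nothing more. They are only as trustworthy as the six numbers that went in.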


Monte Carlo Fallacy

Rather amusingly, there is also a cognitive bias called the Monte Carlo Fallacy (also known as the gambler’s fallacy), where we assume past results impose a probabilistic bias on future events.  “The last 5 spins of a roulette wheel came up Black, therefore the next is more likely to be Red as the odds must balance out.”

That is the fallacy.  On a fair roulette wheel the odds of black or red never change, no matter how many times you get one result; any particular sequence of outcomes has the same odds as any other sequence of the same length.  In fact, it is far more likely that the wheel has a physical bias towards Black than that the odds will right themselves. Probability has no memory.
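If you would rather see it than take my word for it, a small simulation makes the point. This is only a sketch, assuming a fair single-zero wheel (18 red, 18 black and 1 green pocket); the function name and parameters are my own. It compares how often Red comes up overall with how often it comes up immediately after a run of five or more Blacks.

```python
import random


def red_rate_after_black_streak(spins=300_000, streak=5, seed=7):
    """Return (overall red rate, red rate right after `streak` blacks)."""
    rng = random.Random(seed)
    # Assumed fair single-zero wheel: 18 red, 18 black, 1 green pocket.
    pockets = ["red"] * 18 + ["black"] * 18 + ["green"]
    run = 0          # current number of consecutive blacks
    followed = 0     # spins that came right after a long enough black run
    red_after = 0    # how many of those spins were red
    red_total = 0    # reds across all spins
    for _ in range(spins):
        colour = rng.choice(pockets)
        is_red = colour == "red"
        if run >= streak:
            followed += 1
            red_after += int(is_red)
        red_total += int(is_red)
        run = run + 1 if colour == "black" else 0
    return red_total / spins, red_after / followed


overall, after_streak = red_rate_after_black_streak()
print(f"red overall:        {overall:.3f}")
print(f"red after 5 blacks: {after_streak:.3f}")
print(f"theoretical 18/37:  {18 / 37:.3f}")
```

Both rates should land close to 18/37. The streak of Blacks tells the simulated wheel nothing about the next spin.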

Applying the Method to the Fallacy

Amusing as it is, the Monte Carlo Fallacy is the act of using observation to misinterpret probability-based statistics by assuming the probabilities have memory, while the Monte Carlo Method uses observed events to create probabilistic forecasts by assuming that (often dependent) events are independent.

Just as a point of interest: if you used the Monte Carlo Method for gambling on ‘fair’ Roulette Wheels, you would likely do no better than with any other ‘system’. Probability cannot be influenced, so you would simply be using a ‘method’ to compound your Fallacy.

However, in theory the method could be used to identify faulty Roulette Wheels or other unusual variations from probabilistic results (e.g. to spot cheating), or to predict gamblers’ behaviour.  



The Fallacy of the Monte Carlo Method

As you have probably surmised I have some grave concerns about using the Monte Carlo Method for forecasting software delivery.  In my opinion the Monte Carlo Method applies a huge degree of precision to an inaccurate forecast.  And applying precision to inaccuracy is dysfunctional behavior. It feels to me that we are using the fog of precision to create an illusion of accuracy.

The weaker the data available upon which to base one’s conclusion, the greater the precision which should be quoted in order to give the data authenticity.

Norman Ralph Augustine

Applying to Software Development

In software projects we tend to apply the reverse of the Monte Carlo Fallacy: we assume that the past predicts the future so accurately that we can give a percentage confidence level.  By doing so we are making certain assumptions.

The sample data MUST be reflective of the forecast situation.

  • Future stories are similar in size, scope, complexity, and effort to the sample data. E.g. early stories are similar to later stories.
  • Future stories are impacted similarly by external events (we don’t learn or fix problems)
  • Our ability to do the work is constant
  • The team does not change, either in size or skill set
  • The team’s ability to work together does not improve or degrade

The sample data must also be statistically independent (the result of one event does not affect another).

  • The future work is not made easier by learning from previous work
  • The future work is not made harder by adding to a growing code base
  • We are not creating more rework later than we did at the beginning
  • Testing, feedback or support do not change as the product grows or ages.
  • We do not improve our skills, or knowledge of the domain.
  • We do not mitigate problems to prevent recurrence.
  • We do not learn.

Would you feel comfortable giving that list of caveats along with a Monte Carlo forecast?


Where does Monte Carlo Work?

The Monte Carlo Method and its simulations have many useful applications (googling brings up a whole bunch of options), but it works best where observations of sample data can be used to predict behavior.  One example is parcel delivery time: whilst it may change over time, it will likely be stable enough to predict times based on certain volumes. Another is setting reasonable SLAs for an IT help desk, where the variation in the work is likely to be consistent.

Where does Monte Carlo NOT Work?

Where it does not work very well is in situations where the sample is small or is unlikely to reflect the event being forecast, or where past events influence future events: if you learn, improve, or grow, or become overloaded or congested.

Monte Carlo magnifies the significance of the sample data. Say you took a sample of 10 items and a 1-in-100 event occurred: Monte Carlo would apply a 1-in-10 significance to that event despite it being 1 in 100. In software terms, an unusually large or small story, or an abnormal blocking event, can throw off the results by magnifying its significance.
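The magnification is easy to demonstrate. In the sketch below the cycle times are invented, and I am assuming the 30-day story is a rare event that just happened to land in a sample of 10. Resampling has no way of knowing it was rare, so it turns up in roughly one draw in ten.

```python
import random


def outlier_share(sample, outlier, draws=100_000, seed=3):
    """Fraction of random draws (with replacement) that hit the outlier."""
    rng = random.Random(seed)
    hits = sum(rng.choice(sample) == outlier for _ in range(draws))
    return hits / draws


# Hypothetical cycle times in days; assume the 30 is a 1-in-100 event
# that happened to fall inside this sample of 10.
cycle_times = [2, 3, 2, 4, 3, 2, 3, 4, 2, 30]

share = outlier_share(cycle_times, outlier=30)
print(f"the simulation treats the outlier as a {share:.0%} event")
```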

Quick example:

I ask 100 people to solve a puzzle and measure the time each takes.

If I use Monte Carlo forecasting based on the results, it will likely give me a pretty good projection of how long the next 100 people will take to complete the puzzle.

If I apply Monte Carlo to forecast how long it will take those same 100 people to complete the same puzzle again, it will likely get it completely wrong, as some of them will get quicker, having learned from the first attempt.

Forecasting is Hard

The problem is that in most cases software delivery is uncertain: most software products are complex, and the complexity varies from story to story. Work varies. Early stories differ in composition, size and scope from later stories; we don’t order work in a manner that balances effort or delivery time, we order it to maximise value. Work evolves: we respond to feedback and we update. We learn: the first time we see a particular problem we may struggle, but the next time it is a breeze.   As a consequence, forecasting is very, very hard to do.

NoEstimates helped a lot. We discovered that estimating stories upfront only gave us a marginal gain in accuracy for an upfront cost, and that by simply counting stories and measuring throughput we can get an adequate level of predictability, if we accept and understand the limitations.

Whenever I see a forecast written out to two decimal places, I cannot help but wonder if there is a misunderstanding of the limitations of the data, and an illusion of precision.

Barry Ritholtz

Accurate Forecasting of Software Products is Snake Oil

I am being a touch cynical, but in my experience most Monte Carlo Simulations for software delivery forecasting are run by people who do not fully understand the tool, and the results are presented to people who do not understand the limitations.

  • I see people repeatedly tweak the settings until they get the answer they want.
  • Others exclude data they don’t like.
  • Others make forecasts based on sample sizes of 6 stories.
  • Many more base forecasts on backlogs or stories that include work that will be broken down at the point the work starts.
  • Or on backlogs that exclude the possibility of new work being added (customer requests or bugs).

And those receiving the predictions believe the results when they say there is a 90% confidence of hitting a particular date, yet they are completely unaware of the assumptions behind that 90% figure or how it is calculated. A great deal of expectation management will be needed to fix that later.

K.I.S.S.

I much prefer a simple moving average based on story count. Nearly anyone can understand the numbers, and because there are no assertions of precision, the expectation of accuracy is reduced. It invites good, healthy conversation about what might impact the product, and it is very easy to see what could be done if the resulting forecast is beyond expectations.
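For comparison, this is roughly all the calculation a moving-average forecast needs. It is a sketch with invented numbers, and the four-week window is simply my assumption; pick whatever window suits your team.

```python
import math


def moving_average_forecast(weekly_counts, remaining_stories, window=4):
    """Weeks needed to clear the backlog at the recent average story count."""
    recent = weekly_counts[-window:]
    if not recent or sum(recent) <= 0:
        raise ValueError("need recent weeks with at least one finished story")
    average = sum(recent) / len(recent)
    return math.ceil(remaining_stories / average)


# Hypothetical history: stories finished per week, oldest first.
history = [3, 5, 4, 4, 6, 2]
print(moving_average_forecast(history, remaining_stories=40), "weeks")
```

The whole thing fits on a whiteboard, which means everyone in the room can question the inputs.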

Other than for a desire to dazzle someone with smoke and mirrors or to create false confidence, I struggle to see many situations where Monte Carlo adds any value over and above that which can be achieved with simple calculations.  For me there is far more value in everyone understanding the calculation and limitations.

Final word

In the right context and for the right audience the Monte Carlo Method and the simulations are hugely valuable and can be used to great effect. But ensure you understand how and when to use them.

A forecast is only useful if it is data that can be used to make an informed decision to take action.

If you do not fully understand how Monte Carlo Simulations work (including their assumptions and limitations), OR if those you are presenting to do not, then be wary that you are not simply baffling your audience with graphs rather than presenting them with valuable information they can act on. They could be making ill-informed decisions.

 

Lies; Damn Lies; and Forecasting…

NoEstimates in a Nutshell

NoEstimates has gained a lot of traction over the last few years, with good reason. It is primarily about adopting Agile properly: delivering the valuable work in order of priority and in small chunks, and by doing so eliminating the need for a heavy-duty estimation process.  If we are only planning for the next delivery, we can forecast reliably.

But sadly that is generally not good enough and some level of forecasting is often requested.  So NoEstimates came up with a very useful and low-cost method of forecasting. However, it has brought with it a whole host of misunderstandings, most of which are not from the book. The author must be as frustrated as anyone by the misinterpretation of his proposal.  This has led to resistance from many (including me) to adopting this method for forecasting.  I am all for delivering value quickly and in small chunks of prioritized work, but slogans that are used to excuse bad behaviour are damaging and hard to resolve, especially when they seem so simple.

My biggest bugbear, and one I have covered previously, is that many have interpreted NoEstimates as an excuse to skip story refining entirely. This was not in the book, but nevertheless you can find any number of articles on the internet professing how adopting NoEstimates has saved their authors from wasteful refining meetings. The misconception is that if you don’t need to estimate the story, then the act of understanding the story is no longer required.  What the author was actually suggesting is that you don’t need to refine all work up front and can defer deeper understanding until it becomes relevant: the last responsible moment.


Story Writing and being Estimable

I encourage those writing stories to use the INVEST model for assessing the suitability of a story. In that model the ‘E’ is Estimable, but that doesn’t mean you must actually estimate the story, just that you ask yourself whether the story is clear enough and well enough understood to estimate if asked. Are there open questions? Is it clear what the acceptance criteria are and that they can be met?  There may be a subtle distinction there, but NoEstimates does not offer an alternative to writing and refining good stories. It is just a method for simple forecasting that encourages deferring effort until it is necessary.

How does NoEstimates work?

Caveats aside, I will try to give a very high-level summary of how NoEstimates forecasting works, and when and where it doesn’t work. I shall do so via the medium of potatoes.

Preparing Dinner

I have a pile of potatoes on the side and I am peeling them ready for a big family dinner.  My wife asks me how much longer it will take.  By counting how many potatoes I have peeled in the last 5 minutes (10) and how many I still have left to do (30), I can quickly and simply calculate a forecast of 15 minutes.

That is NoEstimates forecasting in a nutshell. It really is that simple.
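Written out as code, the potato forecast above is a single division. The function name is mine and this is only a sketch of the arithmetic, not something taken from the book.

```python
def remaining_minutes(done_count, elapsed_minutes, remaining_count):
    """Forecast the time left from observed throughput alone."""
    if done_count <= 0 or elapsed_minutes <= 0:
        raise ValueError("need some completed work and some elapsed time")
    throughput = done_count / elapsed_minutes  # potatoes per minute
    return remaining_count / throughput


# 10 potatoes peeled in the last 5 minutes, 30 still to do.
print(remaining_minutes(10, 5, 30))  # 15.0
```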

Assumptions

However, the mathematics requires a certain set of assumptions:

1. I did not apply any sorting criteria to the potatoes I selected, e.g. I wasn’t picking either small or large potatoes. We assume my selection was random, or at least consistent with how I will behave in the future.

2. The team doing the work doesn’t change. If my son were to take over to finish the job he may very well be faster or slower than me, and my forecast would not be useful.

3. We also assume that I will not get faster.

4. We assume that all potatoes in the backlog will be peeled, and no others will be added. If my wife asks me to peel more potatoes or to do the carrots too, the forecast will no longer apply and will need revising.

So there we have it, a very simple and surprisingly accurate method for forecasting future work.  But do you see any flaws in the system?

Flaws in the system

Flaw 1. Comparing potatoes with potatoes

The first flaw is that I am getting potatoes ready for roasting, so I want them to be broadly similar in size. When I peel a potato I am also sometimes slicing it: some potatoes only need peeling, others are sliced once, and others more than once.  Some potatoes are bad and I throw them away.

If my wife comes along, sees my pile of potatoes and asks how much longer it will take, I can look at the pile I have completed in the last 5 minutes (18) and count the potatoes I still have left to do (30). The problem is I don’t know how many unpeeled potatoes were needed to produce those 18 peeled and sliced potatoes; I am not comparing like for like.   To be able to give this estimate I would have needed to count how many unpeeled potatoes I had peeled, information I don’t have.  Maybe I could take a guess and then use that guess to extrapolate a forecast, but that sounds like guesswork rather than forecasting.

Flaw 2. Forecasting an unknown

Let’s assume that I am producing 10 peeled potatoes in 5 minutes, and I am asked to give a forecast as to when I will be done. But so far I have been grabbing a handful of potatoes at a time, peeling them and then going back for more. One could say that my backlog of work is not definitive: we have a whole sack of potatoes, but I won’t use them all for this one meal.  I am simply adding work as I need it, aiming to judge when I have done enough and can start cooking.  It is very difficult for me to judge when the sack will be empty or when I will have prepared enough for the meal.

Flaw 3.  Changing and evolving work

It is a big family dinner and Uncle Freddie has just called to say he will be coming, so we need to add more food. Aunt Florance eats like a bird, so it is probably not worth doing a full portion for her.  And the table isn’t really big enough for everyone, so maybe we should do an early meal for the kids first.  The point here is that simple forecasting only works if you have a reasonably good assessment of what work is still to be done. If your backlog is evolving, with work being added or removed, then the forecast will be unstable.

Flaw 4.  Assuming consistency

When selecting work to do next I have a tendency to choose the work that will bring me the most value for the least effort: the highest ROI. So in this case I may choose the small potatoes first, with less peeling and less chopping.  But that means that if I count my completed work and use that to forecast my future work, I will end up underestimating how much is left. The backlog has some really big, awkwardly shaped potatoes that will take far longer to do, but my forecast is based on only doing small, simple potatoes.
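A few made-up numbers show how badly this can go. In the sketch below the peel times are hypothetical and I deliberately work through the easiest potatoes first; the count-based forecast then assumes every remaining potato will take as long as the average one already done.

```python
def count_based_forecast(times_done, items_left):
    """Forecast remaining effort as: average time per finished item x items left."""
    average = sum(times_done) / len(times_done)
    return average * items_left


# Hypothetical peel times in seconds for eight potatoes.
peel_seconds = [20, 25, 30, 35, 40, 60, 80, 100]

# Easiest first: the four smallest are done, the four biggest are left.
easiest_first = sorted(peel_seconds)
done, left = easiest_first[:4], easiest_first[4:]

forecast = count_based_forecast(done, len(left))
actual = sum(left)
print(f"forecast {forecast:.0f}s, actual {actual}s")  # forecast 110s, actual 280s
```

Same count of potatoes on both sides, yet the remaining half takes more than twice the forecast.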

Doesn’t this apply to all forms of estimates and forecasts?

Flaws 2 and 3 apply to any form of forecasting; they are not unique to NoEstimates. Flaw 1 and Flaw 4 could potentially be mitigated with the use of T-shirt sizing or story points, but to do so requires a level of upfront effort.  That is effort not spent on peeling potatoes, so it may well be considered waste, unless you see value in a more reliable forecast.

For me Flaw 1 is my main objection to NoEstimates (beyond the belief that refining is unnecessary). When stories are refined and better understood it is normal to split or discard stories, and often to add stories as the subject becomes better understood. So any forecasting tool that uses a metric based on counting refined stories to predict a backlog of unrefined stories risks oversimplifying the problem. And because the maths is so simple, it can lead to a confidence level that exceeds the quality of the data.  Assumptions based on flawed data get even worse when you use a tool like Monte Carlo forecasting, which applies a further confidence level to the forecast. Giving a date combined with a confidence level adds such a degree of validity and assurance that it is easy to forget that a forecast based on duff data will result in a duff estimate, no matter how prettily we dress it up.

Summary

Forecasting is risky at the best of times, especially in Agile, where it is our goal to have the work evolve and change in order to give the customer what they truly want. Both parties need to understand the forecast and accept that it is an evolving and changing metric. Anyone expecting a forecast to be a commitment or to be static is likely to be disappointed. Just take a look at the weather forecast: the week ahead changes day by day, and the further away the forecast, the more unreliable it becomes.  Understanding the limits of the forecasting method is crucial. A simple tool like NoEstimates is fantastic IF the assumptions can be satisfied; if they cannot, then the forecast will be unreliable.

It is probably also true that your forecast will improve if you spend more effort understanding the work. Time spent refining the stories will improve your knowledge. But no forecast can reliably predict work you do not yet know about.  The question as always is “What problem are you trying to solve by forecasting?” That will guide you in determining whether the up front effort is worth it.