This is a short video giving an overview of #NoEstimates describing how simple and how powerful the tool is but also some clarification of the limitations of the tool.
One of the things I have noticed over the last few years is a drastic increase in the use of metrics. Particularly with the surge in the use of electronic boards for teams data is much more readily available and so is much more frequently used.
If you can’t measure it, you can’t improve it.
Previously there was an effort required to collect and process data to be meaningful, but now much more is easy to do, often it is done for you. But is this a good thing?
Problem 1: Lack of understanding the data or the tool
There are two problems I frequently see, the first is that we have access to data and tools but that doesn’t mean we know how to read the data or understand what it is telling us. Monte Carlo simulations are a great example here, I regularly see them being used without understanding what the data is telling you or understanding the limitations of the information. Instead the results are copied into forecasts and endorsed like they are gospel. The information is too easy to get that there is no longer a need to have an understanding to use the data.
Problem 2: Lack of understanding the consequences
The second problem is that our behavior changes as soon as we are measured, sometimes consciously sometimes not. Generally speaking this is the purpose of measuring, we measure what we want to improve, be it our weight or lap time or velocity.
However, there are nearly always unintended consequences, and they can be unexpected and unpredictable.
The Cobra Effect
The Cobra effect is term that originated in a story from British colonial rule in India. The British rulers were worried about the number of venomous cobras.
So they created a policy of putting a bounty on dead cobras. People were paid handsomely to deliver dead snakes for a payment. What could go wrong?
Well at first the reward system worked well, the number of snakes seemed to decline, but the bounties paid continued to rise. The British became suspicious and investigated, they discovered that the reward was encouraging enterprising locals to start to breed snakes for the reward, it was easier and safer to breed snakes than catch wild ones, and they were breeding them in large numbers.
The British rulers were shocked at the dishonesty (It is simply not Cricket, old boy) and were so outraged they cancelled the program and cracked down on the breeders, who released the captive snakes in to the wild. The result was a massive increase in snakes rather than a decline.
Republic of Congo
There is a much more horrific version of this situation from Leopold II (of Belgium) rule in the Congo. If your quota from your rubber plantation was not met the overseers were expected to execute the wives and children of poor performing farmers and send their hands (as proof of death) in lieu of the rubber quota. – a hand equated to a quantity of rubber.
The problem was, some of the villages found it easier to supply hands than to supply the rubber, so were meeting their quota with a mixture of hands and rubber, there were some reports of entire quotas being met just with hands.
Bad as this policy was, it then got worse. The hands were not necessarily of poor performing farmers or their families. Some villages and soldiers that would collect quotas were attacking other villages, and killing each other just to chop off hands and then using the hands instead of rubber as payment.
Such was the extent of this practice some reports suggest that the population of the region reduced by as much as half as a result of this practice. Estimates suggest that 10million people were killed for their hands.
Now those were extreme cases and it seems unfair to draw parallels given the seriousness but nevertheless it is human nature to respond to how we are measured, and by implication rewarded.
I hear tales of management that expects velocity to rise every sprint, and guess what it does. But does actual value delivered go up – I very much doubt it.
I see teams wanting to mark a card as done and open a new identical one, because the cycle times are looking bad.
But the worse is the measure of being busy, far too many organizations have a fixation with everyone being busy 100% of the time, regardless of whether what they are doing is contributing to delivering value to the customer or if having a little slack would allow the team to deliver far more.
This sounds like a message that metrics are all bad, it is not at all. Metrics can be very useful and as many say you cannot improve what you don’t measure. The key though is to have those being measured understand the desired outcome and share or support it.
Back to weighing yourself for a diet, your goal is to be healthier, your weight is a measurement. tampering with the scale may impact the measurement but will not help you to your goal. You may fool yourself for a while but it is only yourself you are fooling.
A clear goal, and a variety of metrics that support that goal is often more valuable than a single goal, especially if you are supportive of the desired outcome.
“Tell me how you measure me, and I will tell you how I will behave”
Beware of the measurements you use, you may very well be defining the behavior you get, even if it is not your intent.
A Product Owner, or development team will often be asked for an estimate or a forecast for a product; a feature; an iteration; or a story. But when you go to a team member and ask them it is not uncommon for the colour to drain from their face, twitching to start, and their pulse to race. Past experience says that the next words they speak will haunt them for the foreseeable future, maybe even for the rest of their career. It is only a rough estimate you say to reassure them, but they just know you have other plans and the fate of mankind lies in the balance. Or so it seems.
The difference between a forecast and an estimate.
For most practical purposes there are only subtle differences, the main one being that a forecast deals exclusively with the future. When I think of the two I tend to think of an estimate as information and a forecast as expectation. It is likely that I will use estimates to create my forecast and I may use multiple estimates and even apply other factors to derive my subjective forecast. Whereas an estimate is generally an isolated objective assessment.
Both have huge degrees of variation; accuracy; precision; and risk factors, but in some instances it may very well be that my estimate and my forecast are the same for all practical purposes.
For example, If I was asked to estimate a journey, I may say that it is between 15 and 45 minutes and on average it is 30 minutes. If I am asked to forecast when I will arrive, that requires knowing not only the estimate but the time the journey starts and if there are other factors such as a need to stop to get fuel, or food. I may also add a certain weighting to the journey based on the time of day. Thus my forecast, whilst based on an estimate may not be the same as the estimate. Other times though I may simply take the estimate and it will become my forecast.
In practice though the terms are used interchangeably and likely it really makes very little difference, except when it comes to expectations. If I ask “How long will this take” I am asking for an estimate, if I ask “When will this be done” I am asking for a forecast, both are fine so long as everyone knows what is meant by the question.
Forecasting is misunderstood
Forecasting is guesswork, it may be scientific guesswork and it may be based on past experience and metrics and clever projection tools, but it is a guess. You will be wrong far more often than you are right. The more professional and clever and precise the forecast looks the more confidence you may instill in that guess. But it is still a guess, and when your audience gains undue confidence in your guess they have a tendency to believe it as fact not forecast. It might be that a guess is all that is needed but dressing a guess up and delivering it with confidence may create a perception of commitment.
A forecast is commitment (in the eyes of the one asking)
In normal circumstances by giving a forecast there is an implied commitment, even if you give caveat after caveat, and at any point where you even hint at a commitment to delivery to a fixed scope, fixed cost AND fixed date you are setting yourself up for disappointment. And sadly that is what a forecast is seen as by most people.
By all means work to a budget or a schedule or even a scope (although that is very likely to vary) but ideally ONLY one and never all three.
Understanding why a forecast is needed.
There are many reasons why people ask for forecasts, and the why is the most crucial aspect of the process. Understanding the why is the first step to providing the right information, and hopefully changing the conversation. Forecasting for stories and features is both more reliable and more accurate but the project/product level forecasting is where there is the most confusion and least understood purpose.
If possible, try to change the question of “When?” in to an understanding of “Why?”
Generally speaking we gather information to help us make decisions. If the information will have no bearing on our decisions then the gathering of that information is wasteful. Forecasting all too often falls into this category. We take time and effort to produce a forecast and yet the forecast has no bearing on any decision, that effort was wasted, or more often the level of detail in the forecast was unnecessary for the purpose for which it was used.
Making predictions of unknown work based on incomplete information and a variety of assumptions leads to poor decisions, especially when the questions being asked are not directly reflective of the decisions being made.
In my experience the Why? generally falls into three broad categories:
- How long will this take?
- How much will this cost? and
- simple curiosity/routine.
But most people don’t ask why, they spend time creating a forecast and present it without knowing how it will be used. Some people asking for a forecast/estimate may not even know why they are asking, they just always get a forecast. But let us delve a little deeper.
How long will this take?
Why do you need to know? Some typical answers are:
- I need to plan
- I have dependencies
- I need to prioritize
- I have a deadline
How much will this cost?
Why do you need to know? Some typical answers are:
- I want to know if I will get a good Return on Investment
- I need to budget
- I need to prioritize
- I have limited available funds
- We always ask for a forecast, I need to put something in my report
- I want reassurance that you know what you are doing
- I want to know if my project is on track
What you will notice about these questions is that when you ask why, the first request for a forecast suddenly doesn’t make sense anymore, they are not really interested in the forecast itself, but in some other factor that they can infer from the forecast. If the intent is clear then the question can be tailored to get the required information in a better way.
I need a forecast – Why?… I need to know when I can allocate staff to the next product/project.
In this case would a simple high level guess be sufficient? I feel confident that staff won’t be available for the next 3-6 months, in 3 months let’s review, I’ll have a better idea then…
I could put a lot of effort into a detailed forecast but an instinctive response may give all the information needed, saving us a lot of trouble.
I need a forecast – Why?… I want to ship this to maximize the Christmas shopping period – or I want to time the launch for a trade show etc.
This isn’t a request for a forecast, this is a request for an assurance that there will be something suitable available for a particular event/date. I can give you an assurance and confidence level without a detailed forecast, I may even change the priority of some features to ensure that those needs are factored into the product earlier. Or can de-scope some features to meet a certain date.
I need a forecast – Why?… I have limited funds available, and I want to know when I can start getting a return on this investment.
This isn’t a request for a forecast, it is a request to plan the product delivery so that revenue can be realized sooner and for the least investment. It may be possible to organise delivery so that future development is funded from delivering a reduced functionality product early. Or that development is spread over a longer period to meet your budget.
I need a forecast – Why?… I need to budget, The way this company works I must get approval for my project expenses and staffing in advance so I need to present forecasts of costs and timelines. This answer is twofold, first – can you challenge the process? It might be better to have a fixed staffing pool and prioritize products/projects such that the most important ones are done first and then move on to the next, in which case the forecast for this is irrelevant, it is a question of prioritization. Or if the issue is ensuring staffing for the forthcoming year could I simply say whether this product will not be completed in the next budget year?
I need a forecast – Why?… I want confidence that you know what you are doing. This is not a request for a forecast it is an assessment of trust in the team. There are many more reliable ways to ascertain confidence and trust in a team than asking for a forecast.
I need a forecast – Why?… I have a dependency on an aspect of this project. This may not be a request for a forecast of this project, but more a request to prioritize a dependency higher so it is completed sooner to enable other work to start.
I need a forecast – Why?… I want to know if my project is on track. Essentially what you are saying is that I want to track actual progress of work done, against a guess made in advance that was based on incomplete information and unclear expectations. And I will declare this project to be ahead or behind based on this. I am sure those of you reading this will know that what you are measuring here is the accuracy of the original guess, not the health of the project. But we have been doing that for decades so why stop now?
finally the closest to a genuine need for a forecast
I need a forecast – Why?… I need to prioritize or I want to know if I will get a good Return on Investment.
Both of these are very similar questions, but are really requests for estimates not forecasts a rough estimate helps me gauge the cost and when I evaluate that against my determination of the value expected to be gained from the project it may help me decide whether the project is worth doing at all or if there are other projects that are more important. E.g. If it is a short project it may impact my decision on priorities
But even here it not the estimate that has value it is just information that helps me evaluate and prioritize. If I already know that this project has huge value and will be my top priority does forecasting aid with that decision?
When does it make sense to forecast?
Listing those questions above it seems like I am suggesting that a request for a forecast is always the wrong question and is never really needed. And it is true that I did struggle to come up with a good example of when it makes sense to do a detailed project level forecast that included dates or any type of scheduling expectations.
It may be necessary for a sales contract, to have a common set of expectations. Although I would very much hope that sales contract for agile projects are for time and materials and are flexible in scope and dates, if not cost too. So for the purposes of setting expectations and in negotiations I can see that there is value in an estimate, although I do still wonder if a detailed forecast adds anything here that a reasonable cost estimate doesn’t cover. If possible I would rather work to an agreed date or an agreed budget than suggest a forecast that may lead to false expectations.
But the reality is that sometimes your customer does want one and wont tell you why, or doesn’t know why. Some customers (and managers) are willing to accept that a forecast will cost time and money and the more detailed it is the more it costs, and of course being more detailed may not make it any more accurate.
More detailed investigation for your forecasting is likely to build greater confidence but may not be more accurate and you should ask the question “will more detail have a material impact on your decisions?”, and if the extra effort wouldn’t affect your decision then it is just waste.
I would caution that if there is no other alternative and a forecast is made, that it is revised regularly and transparently, the sooner the forecast is seen as variable the more useful it is. There is so much assumption tied to a forecast that it becomes a ball and chain if there is not an expectation of it changing and so it must be refreshed regularly to prevent the early assumptions being seen as certainty, or they will lead to disappointment later.
Short term forecasting
The real value in forecasts is when the forecast is for a short frame of time. Over the short term we can have much more confidence in our forecasts, especially if we have been working on this project for a while and have historic information we can base our forecast on. There are fewer assumptions and less variables.
- Can you forecast which features are likely to be included in the next release?
- If I add a new feature now can you estimate a lead time for this?
- Can you give me an estimate of how much this feature would cost?
By limiting the scope of the forecast to an areas where we have more confidence in our expectations the forecasts become more meaningful and whilst there is still a danger they are seen as commitments the risk is mitigated by the shorter time frame.
Alternatives to forecasting
Many of the questions above could have been resolved with much less effort than a detailed project level forecast. In most cases we could achieve sufficient accuracy for decision making with a high level estimate. E.g. A product like that is similar to ‘x’ therefore I’d estimate 4-8 months for a small team. Or as a rough estimate 12-18 months for two teams and calculate costs accordingly.
These estimates are certainly broad but if you have confidence in your teams and you believe they will use Agile principles to get value early and are able to communicate progress and be transparent with issues then I see no issue with broad estimates. It is sufficient to allocate staff and resources, to prioritize, to schedule and to determine Return on investment decisions.
For the other questions you may achieve far better answers through the use of Product/feature burndown charts, user story mapping, or even simple high level Road maps. These tools provide useful information which can be used for managing expectations, identifying dependencies and visualising the progress of a product. And crucially – aid in setting priorities and keeping the progress transparent.
NoEstimates has made a lot of traction over the last few years, with good reason, it is primarily about adopting Agile properly, delivering the valuable work in order of priority and in small chunks, and by doing so eliminating the need for a heavy duty estimation process. If we are only planning for the next delivery we can reliably forecast.
But sadly that is generally not good enough and some level of forecasting is often requested. So NoEstimates came up with a very useful and low cost method of forecasting. However, it has brought with it a whole host of misunderstandings, most of which are not from the book. The author must be as frustrated as anyone by the misinterpretation of his proposal. This has led to resistance from many (including me) to adopting this method for forecasting. I am all for delivering value quickly and small chunks or prioritized work, but slogans that are used to excuse bad behaviour are damaging and hard to resolve, especially when they seem so simple.
My biggest bugbear and one I have covered previously is that many have interpreted NoEstimates as an excuse to skip story refining entirely, this was not in the book but nevertheless you can see any number of articles on the internet professing how adopting NoEstimates has saved them from wasteful refining meetings, the misconception is that if you don’t need to estimate the story then the act of understanding the story is no longer required. When actually the author was suggesting that you don’t need to refine all work up front and could defer deeper understanding until it became relevant – the last responsible moment.
Story Writing and being Estimable
I encourage those writing stories to use the INVEST model for assessing the suitability of a story and in that: the ‘E’ is Estimable, but that doesn’t mean you must actually estimate the story, just that you ask yourself whether the story is clear enough and well understood enough to estimate if asked – are there open questions? is it clear what the acceptance criteria are and that these can be met? There may be a subtle distinction there, but NoEstimates does not offer an alternative to writing and refining good stories. It is just a method for simple forecasting and encouraging deferring effort until it is necessary.
How does NoEstimates work?
Caveat aside I will try to give a very high level summary of how NoEstimates forecasting works, and when and where it doesn’t work. I shall do so via the medium of potatoes.
I have a pile of potatoes on the side and I am peeling them ready for a big family dinner. My wife asks me how much longer will it take me? By counting how many potatoes I have peeled in the last 5 minutes (10) and by counting the potatoes I still have left to do (30) I can quickly and simply calculate a forecast of 15 minutes.
That is NoEstimates forecasting in a nutshell, it really is that simple.
However, the mathematics requires a certain set of assumptions,
1. I did not apply any sorting criteria to the potatoes I selected- e.g. I wasn’t picking either small or large potatoes, we assume my selection was random or at least consistent with how I will behave in the future.
2. That the team doing the work doesn’t change, if my son were to take over to finish the job he may very well be faster or slower than me and my forecast would not be useful.
3. We also assume that I will not get faster
4. We assume that all potatoes in the backlog will be peeled, and no others will be added. If my wife asks me to peel more potatoes or to do the carrots too, the forecast will no longer apply and will need revising.
So there we have it, a very simple and surprisingly accurate method for forecasting future work. But do you see any flaws to the system?
Flaws in the system
Flaw 1. Comparing potatoes with potatoes
The first flaw is that I am getting potatoes ready for roasting so I want them to be broadly similar in size, so when I get to peel a potato I am also sometimes slicing it, some potatoes only need peeling others may be sliced once and others more than once. Some potatoes are bad and I throw them away.
If my wife comes along and sees my pile of potatoes and asks how much longer it will take? I can look at my pile of potatoes I have completed in the last 5 minutes (18) and I can count the potatoes I still have left to (30). The problem is I don’t know how many unpeeled potatoes were needed to produce those 18 peeled and sliced potatoes, I am not comparing like for like. To be able to give this estimate I would have needed to count how many unpeeled potatoes I had peeled, information I don’t have. Maybe I could take a guess and then use that guess to extrapolate a forecast, but that sounds like guesswork rather than forecasting.
Flaw 2. Forecasting an unknown
Let’s assume that I am producing 10 peeled potatoes in 5 minutes, and I am asked to give a forecast as to when I will be done, but so far I have been grabbing a handful of potatoes at a time, peeling them and then going back for more, one could say that my backlog of work is not definitive, We have a whole sack of potatoes but I won’t use them all for this one meal. I am simply adding work as I need it. My aim being to judge when I am satisfied I am done and start cooking. It is very difficult for me to judge when the sack will be empty or when I have prepared enough for lunch.
Flaw 3. Changing and evolving work
It is a big family dinner and uncle Freddie has just called to say he will be coming so we need to add more food, Aunt Florance eats like a bird so probably not worth doing a full portion for her. And the table isn’t really big enough for everyone, so maybe we should do an early meal for the kids first. The point here is that simple forecasting only works if you have a reasonably good assessment of what the work is still to be done, if your backlog of work is evolving, work being added or removed then the forecast will be unstable.
Flaw 4. Assuming consistency
When selecting work to do next I have a tendency to choose the work that will bring me the most value for the least effort. The highest ROI, so in this case I may choose the small potatoes first, less peeling and less chopping. But that means that if I count my competed work and use that to forecast my future work I will end up underestimating how much is left, the backlog has some really big awkward shaped potatoes that will take far longer to do. But my forecast is based on only doing small simple potatoes.
Doesn’t this apply to all forms of estimates and forecasts?
Flaws 2 and 3 apply to any form of forecasting, they are not unique to NoEstimates. Flaw 1 and Flaw 4 could potentially be mitigated with the use of T-shirt sizing or story points, but to do so requires a level of upfront effort. Effort that is not spent on peeling potatoes, so may well be considered waste – that is unless you see value in a more reliable forecast.
For me Flaw 1 is my main objection to NoEstimates (beyond the belief that refining is unnecessary) When stories are refined and better understood it is normal to split or discard stories, and often add stories as the subject becomes better understood. So any forecasting tool that uses a metric based on counting refined stories to predict a backlog of unrefined stories is risking over simplification of the problem. But because the maths is so simple it can lead to a confidence level that exceeds the quality of the data. These assumptions based on flawed data gets even worse when you use a tool like Monte Carlo forecasting which applies a further confidence level to the forecast. By giving a date combined with a confidence level adds such a degree of validity and assurance that it is easy to forget that a forecast based on duff data will result in a duff estimate – no matter how prettily we dress it up.
Forecasting is risky at the best of times, especially in Agile where it is our goal to have the work evolve and change in order to give the customer what they truly want. Forecasting needs to be understood by both parties and accepted that it is an evolving and changing metric. Anyone expecting a forecast to be a commitment or to be static is likely to be disappointed. Just take a look at the weather forecast, the week ahead changes day by day, the further away the forecast the more unreliable it becomes. Understanding the limits of the forecasting method is crucial, a simple tool like NoEstimates is fantastic IF the assumptions can be satisfied, if they cannot then the forecast will be unreliable.
It is probably also true that your forecast will improve if you spend more effort understanding the work. Time spent refining the stories will improve your knowledge. But no forecast can reliably predict work you do not yet know about. The question as always is “What problem are you trying to solve by forecasting?” That will guide you in determining whether the up front effort is worth it.
As an Agile Coach I naturally talk about agile practices very frequently, and participate in a variety of discussions with people of varied levels of understanding of agile principles and practices. But there are certain words that I notice are used with a disproportionate frequency, and used to describe different things. One word in particular gets used very frequently with emphasis and rarely in the right context, that word is Efficiency. And just for the record, I have been know to misuse it and am making a concerted effort to find the right word for the context, to that end I will try to clarify some of the situations where a different word may be more appropriate.
Efficiency seems to be used to convey such things as. Utilization, Productivity, Effectiveness, Velocity, and sometimes even Efficiency. Naturally I am not suggesting that people are not using English correctly, or that they don’t know what ‘efficiency’ means, but when you look at the examples below you will see that whilst they use the word efficiency in a way that is syntactically correct, they are missing the depth of their question or statement, and often the context. The preoccupation with efficiency results in behaviour that reduces the effectiveness of the team.
- Isn’t Pair Programming less efficient than working solo?
- Isn’t it less efficient to have a developer testing software?
- Isn’t it inefficient to have the whole team doing story writing (or backlog refining, or stand-up, etc)?
- Isn’t it inefficient to attend all these meetings, we’d be more productive here?
- Wouldn’t it be more efficient if I specialize in this one area and all tasks come to me?
- Wouldn’t we be more efficient and increase our velocity if we ignored technical debt or skipped testing?
I previously covered an example where maximizing utilization of resources was seen as being efficient, but that efficiency was at the expense of the profitability of the business. I think you are missing the point…
In most cases these questions are simply missing the bigger picture, or the larger context.
In KanBan there is a saying that any efficiency improvement on anything other than your primary constraint (your bottleneck) is just an illusion.
If something isn’t on your critical path then it shouldn’t be your focus. Of course as you make improvements to flow and address one constraint the critical path may change, but you should stay focused on the critical path. Identify your constraint and put your efforts there.
Isn’t Pair Programming less efficient than working solo?
The Efficiency of Pair Programming is a recurring question, and one that I likely will not quash here, and that is not my goal. The issue is whether the question of efficiency is the right word, or the right focus for your efforts.
The definition of Efficiency is: “achieving a goal with the least effort or least waste.”
Pair Programming assumes two developers at one workstation, and the implication of the question is that more can be achieved with less waste by working independently in parallel.
The problem with Efficiency is understanding what’s it is that you are measuring – what is your goal, and identifying whether this particular activity is your constraint in achieving that goal.
Let us take a car journey as a very simple example: What is the most efficient journey? Is it driving at 50mph so as to use less fuel? Is it to get to the destination in the shortest time to reduce utilization of the vehicle, or could it be driving in the middle of the night when the roads are clear and the vehicle is not needed by anyone else?
In all of those examples we are focused solely on the individual journey itself and even then we still have ambiguity of which efficiency we are concerned with. But very likely the reality in that situation the destination is just a step towards the goal. That isolated Driving journey is unlikely to be entire goal in itself – unless maybe I am a taxi driver, and even then I suspect my constraint is maximizing fares rather than reducing costs, so likely maximizing the number of fares rather than driving efficiently is my goal. Driving efficiently only at 4 am may not be the ideal business model.
The road less travelled…
If for now we assume that we are driving efficiently to use less fuel, because fuel is expensive and money is more of a constraint than our time. Then possibly the most efficient journey would be to combine multiple journeys into one (car sharing, or batch delivery), or make a phone call, in this context a journey not taken(or shared), is far more efficient than the most efficient journey.
I glossed over that fairly quickly, so take a moment to think about it. Efficiently doing an unnecessary task is just waste.
Efficiently doing an unnecessary task is just waste.
Time = Money
But hold on, the most efficient journey for me could simply be one that gets me to my destination on time, regardless of cost or duration. My time and the appointment are more important than the cost efficiency of the journey. Sure I would prefer the journey to be shorter and cost less if that can be achieved without impacting on my goal, but my priority is being there at a particular time. A super efficient journey that gets me to my destination 10 minutes early so I can sit idle for 10 minutes does not help me to my goal, the efficiency gain is just an illusion.
For a while I used to car share, and it saved me quite a lot of money, but I was tied to a schedule and my journey was much longer. After a while I realized my constraint was time and not money. The ability to be flexible over my schedule was a much bigger constraint. My journey became more expensive but I needed flexibility to reach my goal. In other words efficiency in one area may not just be an illusion it may actually cost you more elsewhere as a consequence.
This is a far too common situation in business, a business will often go on an efficiency drive and look to cut costs without any consideration to the larger impact that the efficiency savings are causing. The focus becomes narrow and they delude themselves and often others that they are being efficient, when actually the true constraint is ignored. The cost cutting creates larger costs elsewhere.
Why? What is the goal?
I don’t mean to labour the point, but to be able to understand efficiency we must first understand the true goal, and it is very likely that the true efficiency we are interested in is related to a goal at a higher level.
Let us revisit the Pair Programming. Our goal in programming is to deliver a software solution, likely a quality solution that meets the client’s needs, with minimal support requirements. It is likely that they are also interested in maximizing return on investment. So where does the issue of efficiency for programmers come in?
Pair Programming as an activity shares and grows knowledge, it often enables more creative solutions, the quality is higher, there is an inherent in built code review and QA in to the process. So the question isn’t so much whether Pair Programming is more or less efficient in terms of number of lines of code written; or stories coded; or bugs found; but whether it is producing a better return on investment overall in the context of our true goal: in terms of quality, support and meeting the client’s needs.
I don’t plan on answering the question to the satisfaction of the skeptics but I suspect my opinion is fairly evident. My point is not to directly answer the question, the point is that the real issue is not what it first appears to be when the question was broached.
If your team is dealing with support issues, bugs, deployment problems, getting features to customers quickly and so on. Then it is possible that ‘coding time’ is not your primary constraint.
In this case the response to the question of efficiency is: What are you measuring to determine efficiency? Or… Is efficiency of one singular part of the process our primary goal or is it just an illusion? Would being more efficient help you to be more effective at achieving your goal?
In this example the word efficiency is taken in a context to justify behaviour that possibly diminishes the effectiveness of the team. Lean principles do look to minimize waste and to be more efficient, but lean looks at the end to end flow and is only concerned with efficiencies when it is applied to the critical path or a bottleneck and ONLY when those efficiencies are removing waste in the context of our overall goal, not simply a small part of the flow.
If we improve efficiency at a point before a bottleneck all we do is pile up work faster, if we improve efficiency after a bottleneck or anywhere not on the critical path we fail to see the benefit because the flow is already limited by an earlier bottleneck, there is no net gain to our efforts. Our efforts are just an illusion.
As a very generalised statement we as individuals find it inconceivable! (Had to include it somewhere) to view our work as part of a larger scope, our desire to be efficient in our own little bubble often leads us to behaviour that is inefficient for the larger scope that we are part of. We should be less concerned with being efficient in our roles and more concerned with doing what is effective for the larger system or team we are part of.
Efficiency is not the same as being Effective
Prefer being Effective over being Efficient. Make sure you are effective at achieving the goal first and foremost before even looking at efficiency, and then only if increasing efficiency does not come at the expense of being effective.
Visualising our work in that larger scope is one of the best ways of understanding where we are struggling to be effective and to highlight where the true efficiencies can be gained. But try to see the whole process not just your part of it.
A brief story….
A team found that many of their impediments and problems with delivery were caused by a lack of access to IT operations or lack of responsiveness, they requested a member of the IT operations team be co-located with (and become part of) the delivery team. This was approved and trialled as an experiment.
From the perspective of the development team this was a major success, delivery times improved, delays were reduced and the major constraint to delivery was aleiviated. In short software projects were delivered sooner and with less complications, the overall cost of delivery went down dramatically.
However, the management team decided to remove the IT ops team member and retrurn them to their old team on the basis that they were not being fully utilized. The notion of a human resource not being fully utilized was too much for them to bear.
[Please note when I say utilization – that is on IT Ops tasks – they were using ‘slack time’ to support the team in other ways]
When I challenged this decision, I was told that they were unable to increase headcount for IT operations staff unless it could be demonstrated that existing resources were more than 100% utilized on IT Ops tasks over the course of a normal week. So anyone not 100% utilized was considered a problem for them and became a burden on other team members.
This same IT operations group has a huge backlog of work and lead times were extremely long, they were a major bottleneck for almost every aspect of the business operations. Sadly the management team did not see any correlation between the absence of slack time and the long lead times.
Too often the dependency on IT Operations and their perceived hindrance to business operations, leads to unnecessary conflicts and frustrations. Usually felt by the staff on the IT Ops teams. But it is because IT Operations are so important that they get this reputation.
How can you justify extra resource if you are not fully utilizing the ones you have?
I have no doubt this is a common story, “how can you justify extra resource if you are not fully utilizing the ones you have?” In principle this would appear to be a perfectly reasonable and logical statement, and I can understand why many management teams fall into this trap. But it is the wrong question.
What troubled me most was that we had demonstrated that the impact and delays suffered by the delivery team in just this one example were actually costing the organization so much that they could easily pay for 5 or more additional IT operations staff with the benefits gained. And more so that this situation could be repeated all over the business. The response from the IT Ops manager was that I was missing the point, and they couldn’t justify underutilizing staff. Their measure for success was maximizing utilization of staff, and not supporting business goals, like say profitability.
I see this mainly as a failure of the business, for not making clear to management where they contribute and how they impact on the larger business objectives. That combined with a manager that either doesn’t see or doesn’t challenge this omission leads to perpetuation of dysfunctional behaviour.
Managing resources effectively not efficiently
I apologize for comparing people to resources, but in this context it is applicable.
Could you imagine if that same principle was applied to other resources? Your PC would be taken away if you didn’t use it 40 hours a week, you would be forced sell your car because my guess is that you only use it 10% of the time. You would wear the same clothes every day. Cooking would be a nightmare if you could only have food where all ingredients took exactly the same amount of prep time and cooking time. I’m getting silly now but it is the same principle, by totally ignoring the reason why we have a resource and focusing entirely on maximizing it’s utilization we behave in really rather perverse and contrary ways.
In any other context the notion that anything less than 100% utilization prohibits buying a second item (regardless of how beneficial) is plain nuts.
Please do not misunderstand, I am not encouraging waste by this, I am encouraging understanding of how resources are being used. When you are meeting the business goals and your flow is optimized, that is the time to look at whether reducing waste is possible but to do so in a way that ensures that any reduction in waste is not hurting business goals. It may turn out that just like your car, the optimum utilization is less than 100% and availability and responsiveness are valued higher than utilisation.
In this situation, I am very proud to be “missing the point” and I wish far more managers would take the time to understand how their role impacts the larger business and perhaps they too will miss the point and start asking important questions. IT Operations in particular are the lifeblood of so many businesses and their actions can make or break a business, understanding all the areas of the business that IT operations supports and how the cost of delay is often significantly higher than the individual cost to operations of that task or that ‘resource’. The ability for IT operations to prioritize and react promptly to the needs of a business is a far better measurement than measuring utilization of staff.
I’d highly recommend that anyone in IT Operations reads the book The Phoenix Project by Gene Kim, and ensure that anyone that benefits from the service they provide also read it.
Understanding how you contribute to the goals of the business is so important, and if after that you truly believe that your role should be maximizing the utilization of resources and not supporting the business achieve it’s goals, then I would suggest it may be you that is missing the point.
I participated in a great discussion this week on the use of WiP limits. For those that don’t know, WiP stands for Work in Progress, and is a measure of how many activities you have started but not completed.
It is important to note that this is not a measure of how many tasks you are currently actively working on. If I started a task but have become blocked, so I start another then my Work in Progress is 2 – even though I am only actively working on one. WiP is a measure of how many started but incomplete tasks I have.
For example: call-waiting or being put on hold during a phone call: You are talking away and get another call so you switch to the second call without hanging up, and when you get another call you take that too, how many people can you have on hold at once? You are only working on one call at a time, although you must remember the content and context of all those other calls. For the people still on hold while you have all these conversations I expect they are wishing you had a WiP limit. Your WiP-Work In Progress is the sum of all active(incomplete) phone calls.
There are many reasons for limiting Work in Progress not least because I hate being put on hold. I will go in to some later and it is likely that if you follow any type of Agile framework you already limit WiP although you may not be consciously aware of it.
If you have a backlog, then you are limiting WiP. You are consciously choosing not to start new work immediately but are prioritising it and organising it to work on later. It is likely that you are informally limiting your WiP according to what you feel you or your team can manage at a given time. It is important to realise that this is a WiP limit even though it may not be conscious or scientific.
If you follow Scrum: WiP is consciously and explicitly limited at Sprint Planning, often by estimating effort, or counting stories or calculating story points. The limit is generally set based on Velocity – our average achieved over previous sprints. If on average we finish 10 stories a Sprint, or we average 35 points, then that average is generally taken as the starting point for our WiP limit, we may choose to push for a few more, or if we are expecting to be slower due to vacation time, or known issues we may choose to commit to less. But in Scrum we don’t usually refer to it as a WiP limit. But that is exactly what it is.
KanBan is actually very similar, if we on average are completing 10 stories a week, then it is likely that we will consciously or subconsciously limit ourselves to only take on 10 new stories within a week, although in the case of KanBan this is not done in a Big Bang explicit decision but gradually over the measured period and is more a consequence than a plan (A Pull rather than a Push). We pull stories when we are ready and we pull at the pace we are working at, that just happens to be 10.
Slightly off-topic, but one of the crucial differences between Scrum and KanBan is that Scrum is a ‘Push’ model and KanBan is a ‘Pull’ model. In Scrum we Push an amount of work to the board and spend the Sprint working to Remove(complete) it. With KanBan we are aiming for a more steady amount of Work in Progress and will pull new work as current work is completed, which creates a smoother flow, but conversely lacks a unified time based goal or target. But it is important to understand that both frameworks limit WiP and both focus on ‘completed’ work being the primary objective.
Why do we limit WiP?
I’d like to think it was obvious by now and I think the basic principle is, but there are nuances that may complicate things which may be where the cause for debate comes from.
At it’s simplistic level if I can only complete an average of 10 stories a week/sprint then starting an average of 20 a week without changing anything else is nuts, all that will happen is that on average 10 or more of those stories each week will remain incomplete and stay ‘in progress’ by the start of week 4 we will still have completed 30, but will have 50 now in progress. Chances are by now we will actually go slower because we will be distracted and falling over ourselves trying to juggle more than we can possibly achieve.
Think of all those people on hold, growing frustrated at being ignored, and you trying to remember all those conversations, the phone rings again, can you cope with another caller on hold?
We limit WiP because we understand that by working only on what we can achieve, we can focus and be more effective.
Limiting WiP and WiP limits
“Ah but you have talked about Limiting WiP but you haven’t mentioned WiP limits” I hear you say…
And this is where I think the conversation really stems from and where we get into nuances. KanBan began in manufacturing where work was passed from one workstation to another and there was a measurable flow through the system. Limiting the production on one workstation so that it maintained pace with the entire system limits waste. Having one hyper-performing efficient widget maker that can produce widgets 4 times faster than they can possibly be used is wasted effort. And the same applies to a software process, we use WiP limits to regulate one part of the flow to maintain pace with the flow as a whole, the goal is to visualise bottlenecks and by visualising bottlenecks we can take steps to increase the flow of the whole system.
Other forms of WiP limit
But WiP limits can take many forms and are applicable to almost all aspects of life. The most common example used is a highway, when a road gets busy it slows down, when it gets very busy it grinds to a halt. Limiting access actually speeds things up for everyone on the road, and ultimately more cars can get through far quicker.
But there are other less obvious WiP limits. A long-distance runner wants to increase her times, she starts fast but she runs out of energy and is slowing down near the end of the distance. By setting herself a slower time per mile early on (Pace is a form of WiP limit) she has a consistent speed and reserves energy so is able to run the full distance faster by slowing down during the early part of the race.
Are you dedicated to a single product/project? That is a very low WiP limit of 1 project.
Do you use a calendar and try not to book two meetings at the same time, that is a very tight 1 item at a time WiP limit.
What about a budget? Do you manage your finances as an individual or company. A budget is a WiP limit for your money, by limiting money spent on beer you have more available for clothes. By regulating your spending you can save for a holiday. By remaining in credit at the bank you do not have to pay interest and so on.
Example of how to choose a WiP limit in a workflow
In the case of the widget maker, if the system can only use 10 widgets a day then we may set his WiP limit to 10, when he gets to 10 he is free to go and help someone else. Suddenly rather than unused widgets we have an extra pair of hands to do other work. But hold on! these widgets are easy to make and only take a short while to do, we could probably limit WiP to a lower limit and jump on the workstation as an when needed to produce more. So how do we set the WiP limit? My advice is not very scientific at this point. Start by continue doing what you currently do and just watch, if your workstation/activity columns are sub divided in to doing and done, take a look on a regular basis and see where the biggest stacks are. If one column regularly has more cards on the done column than doing, it ‘may’ be a sign that you should reduce your WiP limit for that activity. Try it and see if it improves your flow.
The largest queues of done work are usually the area that needs limiting, but be aware of ‘bursts’ of activity, for example there may be a workstation that is only available for one day a week, in which case the queue for that workstation may need to be longer – or maybe you should be asking for more regular access to the workstation. But when imposing limits do it gradually and monitor to see if it improves the flow.
Anything in a done column is normally waste, if it is done and not in production that effort cost you money and is just sat there. Sometimes that makes sense but the car industry learned that complete cars sat in storage amounted to $millions of wasted investment, that components stockpiled in advance was essentially big piles of cash that could be used more effectively, we can learn the same.
What I am saying is that a WiP limit being reached is a conversation trigger, a long queue is a conversation trigger, and so on. With so many things in Agile we create early warning systems to create conversations and make decisions when it is necessary. We make small changes to the process and watch, we see what changes and if it is better we carry on, but look for another new small improvement.
WiP limits are not unique to KanBan, and like it or not you are already limiting your work and activities in all aspects of your daily life, you just perhaps didn’t realize it. In the context of a KanBan board we do not arbitrarily impose a WiP limit because we can or because it is fun. We impose a limit when we feel it adds value, normally where we see a queue of completed work growing faster than it is being consumed. We limit various stages of the work because our goal is the output of the system (the whole team) and not the perceived utilization of individuals. One team member being 100% occupied all day producing something that will sit untouched for a week is not efficient.
We are not trying to manage a system where our goal is to keep busy a group of individuals, our goal is to create a cooperative and coordinated team working towards a common purpose in the most effective way possible. A WiP limit is just one of many tools that can be used to enhance the prioritisation of work and focus the team on the true goal.
Which do you use and Why?
I have already blogged about some common metrics but here are ten that I use or have used most frequently. It is my aim to summarise them here and expand each in greater detail over the next few weeks. (I will come back and add links as I do)
Before we begin, the first piece of advice that can be said about metrics is that before you record data or use metrics you should really ask why you are recording the information and how it will be used.
If you cannot be sure that the metrics will be used for the benefit of the team then you shouldn’t record the metrics and certainly shouldn’t use them. If ever metrics are misused they become worthless as they are very easily gamed, especially if the team feel they will be abused.
Metrics are almost always reliant on honesty and there are not many metrics that couldn’t be ‘gamed’ if someone was motivated to do so. Never attribute reward or penalty for a metric or even multiple metrics. As soon as you do they are broken.
Have you ever heard tales of teams that were paid for number of lines of code written or for number of bugs fixed. I have worked at companies where the annual bonus was dependant on a high management approval rating in an annual employee survey.
Any environment where you are effectively penalised for honesty loses all hope of trust, transparency and integrity and your metrics become worthless.
If there is even the hint or suggestion that any of the metrics will be used for anything other than self-improvement and introspection again they immediately become worthless.
But that being said as a Coach measurements can be very useful. Measurements, metrics and reflection and retrospection are hugely valuable for coaching an individual or team on how to improve. An Agile Coach or Scrum Master has a variety of metrics available – and these 10 are by no means a comprehensive list, actually it is a very limited list, Each project or development environment is different and so tailoring appropriate metrics to your particular environment is important, not all metrics apply to all projects.
One last note is that there are myriad tools for tracking projects, some will measure these metrics but many won’t and it will be necessary to manually record the data. It will often be difficult to backdate if you haven’t been recording it.
In projects I work on I try to keep a lot of metrics just for myself, much more than the team uses. If at any point I feel there might be value in a metric I will bring it to the team and allow them to decide if it has value and if they are interested. But too much becomes confusing. As a rule of thumb they team only really have interest in a few of these, although which ones may vary over time. Keeping metrics without using them is also a good way of avoiding them responding to inspection.
Beware of context
I am also aware that metrics can be misleading if taken out of context. Information can be dangerous if misunderstood, if I choose to publish metrics in reports or in the office on a wall I will work hard to make sure the context is clear and if appropriate there is an explanation. But even then you can be sure that someone will misuse the information, so be prepared!
1 – The Burndown:
This is probably the most often used Scrum metric. In essence it measures work outstanding at the start of the sprint and a line is drawn converging on zero at the end of the sprint and each day the progress or burn-down is measured. The goal being to complete all planned(and emerging) work by the end of the sprint.
2 – The Velocity Chart:
The velocity chart can take many forms and can be used to track a number of key factors. In the most simplistic form it is a bar chart showing the number of points or sometimes the number of stories completed in the recent sprints.
3 – The Velocity Committed/Actual Chart:
Sometimes the velocity chart can be further enhanced to show committed points compared with actual points, this is useful when the team is not consistently delivering on commitments.
4 – Release Burndown
The principle is that for a given product we measure the Burndown of the entire product in a similar way to a Sprint Burndown, it may take the form of a cumulative flow diagram or be similar to a velocity chart.
5 – Cumulative Flow Diagram
A cumulative flow diagram tells all sorts of stories about a project, essentially it measures the number of stories in various states as the project progresses.
6 – The Burn-up
The Burn-up measures the actual effort expended in the sprint. In a perfect world it would be an exact inverse of the Burndown. But rarely do we predict all tasks in advance nor estimate the ones we know accurately. It can help see where time went and improve our estimates.
7 – Flow time, Lead Time, Cycle Time
There are some varieties in how these terms are used but I use a simplified version as follows:
Flow-time is the time from when a story is started to when it is completed by the team.
Lead-time is the time from when a story is added to the backlog to when it is completed into production – from conception to getting the value.
Cycle-time is the average time from when a story is started to the point where it is completed – we could be working on multiple stories at once, or overlap them. e.g. each story could have a flow time of 3 weeks but we could stagger them to deliver 1 per week.
Using averages for these allows us to measure whether we are improving over time.
8 – Health check survey
A health check survey is more touch-feely than the other more raw metrics but can be useful for gauging the team’s own opinion of their progress and behaviour. You can measure opinions on anything, from software practices to interactions with other teams, to quality of stories, anything the team has an opinion of can be measured.
9 – Predicted capacity : Actual Capacity
Each sprint during planning – stories can be broken into tasks, each of those tasks can be given a time estimate. When totaled up and the team has committed to the Sprint content, this total effort estimate is the ‘predicted capacity’ at the start of the sprint. This is the important number. Tasks will almost always be added as the sprint progresses and many of the task estimates will be off, but the initial predicted capacity is a measure of what the team is anticipating the workload will be for the sprint.
In theory the actual capacity of the sprint is simply the number of team members multiplied by the number of hours available in the sprint, excluding, planned holidays, planned meetings, predicted time on non-sprint activities.
Tracking these figures allows the team to gauge whether the capacity is realistic in relation to the previous sprint, the end result will be a (hopefully consistent) ratio.
10 – Scope creep
This can be covered by the cumulative flow diagram, and by the release burndown, but sometimes it is useful to explicitly report on what has been added or removed from the Scope each sprint.
Estimates should be based on the relative size of the story. E.g. If our story is to do a 1000pc jigsaw puzzle and we have estimated it as an 8 point story. The story takes us 6 hours to complete. We break up the puzzle and put it back in the box.
We are then asked to do the exact same puzzle again as a new story. We have just done it, so we know the difficult bits, we have fresh knowledge and recent experience, it is highly likely We’d complete the story in much less time. But the story is identical, We still have to do the same puzzle. Last time it was an 8 point story, this time it is still an 8 point story.
In other words our experience changes our ability to complete the story it doesn’t change the relative size of the story. We estimate using relative size because we don’t know who will be doing the story or when it will be done.
Hopefully we learn and get better, equally it is likely a more experienced or senior developers will complete stories quicker, but none of this changes the relative size of the story.
Story point estimates are to offer the ability to forecast, they are accurate in that context and over the long-term only, Think of stories like rolling a dice. A three point story is like rolling a dice 3 times and totalling the results, an 8 point story is like rolling a dice 8 times and totalling the results. Sometimes a 3 point story will take longer than a 5 point story. But in the long run the average will be 3.5 per roll.
I could never guarantee the next roll of the dice will result in a value of 3.5 but what we can offer probability not predictability over a longer period, by the time you roll the dice (take the story into sprint planning) the story points offer no value or interest to the development team. The forecasting value is gone. The story will take as long as the story takes, we must trust the team to do their job.
In June 2014 a new ScrumMaster was brought in to work with an existing and established development team
At the point of joining the team had already raised concerns about environments and tools but the team had been unable to express specific issues that could be resolved, these problems had been intermittent for most of the year. The team was composed of a BA acting as Product Owner, a Lead Engineer, three other developers and two testers with the addition of a ScrumMaster this made a team of size 8, which equates to an approximate cost to the business of around £600,000/$1,000,000 a year.
The team measure their productivity using a term called velocity, which is how much work measured using a relative scale is achieved over a period of time. The amount of work fluctuates for a variety of reasons, holidays, sickness, bugfixing, mistakes, environmental issues, focus or context switching, support requirements, spikes etc. It is an anecdotal estimate of productivity only. But has become the industry de facto standard for measuring improvement. This had been recorded over the previous year and the documented average for the 12 months preceding the new ScrumMaster is noted below as 100 basis points per week.
Velocity can only be used to compare a team with it’s own past performance, i.e. it is a relative scale, not an absolute measure but in that context it is very useful. But should be used with caution. In this case the team had an existing Datum and a set of benchmark stories. The Datum was not modified and the benchmark stories remained in this period, thus there is a consistent base level for comparison.
Year 1 (Prior to ScrumMaster)
The new ScrumMaster, spent some time with the team, observing and evaluating. He sought to identify problem areas and any behaviour in the team where improvements could be made. But unlike a conventional manager or trouble-shooter he did not issue directives on changes to behaviour.
The team scheduled regular ‘retrospective’ meetings where they review the past time period and look for ways to improve. The ScrumMaster used a variety of techniques, to coach and guide the team to focus on areas identified either by the team or by the ScrumMaster as problematic or where improvement could be made. The ScrumMaster collected and compiled metrics to aid this analysis, and facilitated meetings to be productive in deconstructing problems in a non-negative manner.
By using this approach the ScrumMaster did not solve the teams problems, he did not offer ‘his’ solution to the team, nor issue directives for improvement. The ScrumMaster worked with the team to enable them to become aware of their problems and the ScrumMaster created an environment where the team were empowered to solve their own problems. The ScrumMaster uses his past experience to identify problems and may steer the team to suitable solutions, but relies on the team to solve their own problems.
In the following quarter there was a notable upturn in velocity (productivity)
Year 2 (First quarter after ScrumMaster)
The ScrumMaster is responsible for creating an environment where the team can reach a long-term and consistent performance – a spike or a blip is not considered sustainable, so it is important to not seek to boost productivity with short-term fixes like overtime or cutting quality. The aim is sustainable pace and sustainable quality over the long-term.
The ScrumMaster continued to work with the team identifying more bottle-necks and waste, removing impediments and coaching the team how to work around impediments or resolve them themselves, there is always room for improvement in a team.
The ScrumMaster also spent time coaching the Product Owner and Programme Manager on expectations and in their interactions with the Scrum Team, including challenging the Programme Manager on how his priorities are fed to the team so that the team could focus to be more effective.
Year 2 (First full year with new ScrumMaster)
Velocity is only one measure of productivity and is largely anecdotal, however it is the only real measure we currently have available. The productivity improvement should be considered in that context. However, there is a clear and notable improvement in the productivity of the team over the 12 month period. The team deserves the credit for the improvement as they have improved themselves. But by adding an experienced ScrumMaster to an existing team he has been able to coach the team into identifying ways they could improve and in doing so doubling their productivity – a significant change in just 1 year. In this instance it could be argued that the ScrumMaster’s coaching of this one team has resulted in £600,000/$1,000,000 worth of added value to the business and assuming the team does not regress this is an ongoing year on year gain.
It is not often possible to measure the impact on a team, and velocity is a fragile tool for that, but I hope it is clear from these metrics the value that one person can have on a team. It is highly likely that Year 3 will not have the same growth but even if the new velocity can be merely sustained the value to the organisation is significant.