It is not how big it is…

A question I have been asked a number of times is how do we “Scale Agile”?   There has been so much discussion of scaling frameworks such as LeSS, and SaFe, or Nexus or D.A.D. or even the Spotify model, that we have become conditioned to assume that these are the solution to any scaling or indeed any ‘Agile’ problem.

But as with so many situations and so many problems, deciding on a solution before fully considering the question at hand, leads to an outcome that is likely the wrong one and potentially unexpected and undesirable results.

What are you scaling?

Most of the Scaling frameworks deal with organizing and communicating between multiple teams all of which are working on a single product.  But that is not always the case, there is no ‘Normal’ and every organisation is different, some will have many teams and just one product, and some will have many teams and many products, and then there are the multitude of variations in between.

slide1

Many teams, Many products

But if the scaling agile frameworks cover the situation of many teams  and one product what of the remaining situations?  My instinctive response is to wonder whether this is actually an agile scaling question at all. It feels more a question of organisation scaling rather than agile scaling. That may feel like I am splitting hairs but I wonder if we are jumping to finding a solution to a problem that is neither new nor necessarily agile specific.

slide2

If our organisation is simply growing and we have more teams working on more products and there is little or no dependencies between the various teams then there is no scaling in the sense that these frameworks relate to.  The agile frameworks provide a structure for shared communication and consistency of purpose, and there is an overhead for ensuring a common vision and a consistency of direction. All very important when there is strong dependencies and interactions between the teams.

But if you are able to organize your teams with autonomy of purpose, and are able to minimize the coupling and dependencies then are you really scaling or are you simply growing?  The organisation that are multiple teams working on one product are genuinely candidates for scaling and so I will purposefully ignore them for the purposes of this post.

More teams does not mean you need a bureaucratic layer

The difference is that we don’t need huge amounts of overhead or a layer of bureaucracy, if we simply have a larger number of autonomous teams, then we are able to maintain a flat structure but simply grow and duplicate.  Naturally there are elements of the organisation that will need to ‘scale’ to deal with the growing number of teams and people but this is not agile scaling of the teams themselves.

What about dependencies between teams

In some cases there will be teams that have dependencies on another team, but it may be possible to organize your products such that whilst dependencies occur, the dependent team becomes a (very important) stakeholder in the other product, and so there may be a case for prioritizing stories that support the dependency. However,  there is not a tight coupling between these teams, they can still each be run almost entirely independently of each other.

Now this proposal is not entirely without overhead especially when there are teams that are inter-dependent. Identifying the boundaries between products may require some thought and effort. Coordinating the dependencies and ensuring they are given the appropriate priority will require some additional communication.

What about scaling on a small scale?

So that covers many teams and one product and many teams and many products, but what about a few teams and one product? This is actually the situation I consider most difficult.

slide3

But these problems are likely not enough of a problem to warrant the large overhead of structure that a scaling framework imposes.  Once again this seems to be more of a question of organisation than scaling, sure there is an element of scaling, but not so much that we need a heavyweight solution.

For this situation we may be able to make it work with something as simple as a few shared meetings (Scrum of Scrums) combined planning or refining session and a structured approach to delivery.  Organisation, and dare I say it, the true sense of management – not throwing around orders, but coordinating, enabling communication, promoting transparency and creating a clear direction.

The Product Owner may also be able to think about stories in a context that works well with a multi team view. It may be that it is a single large product but could  we divide by feature areas, or logical vertical slices of work.  Yes there may be situations where there is overlap but it may be that we are able to write stories or share stories between the teams that intentionally minimize conflict.  There are some elements that will require special consideration, such as UX which must be considered across the entire product and should be consistent, but that is not true for all stories.

But please do not underestimate the overhead of multiple teams – even just two, it is significant and can cripple the product delivery if not implemented well. You may need to lose some of the freedom of small teams, a shared definition of done, possible a common cadence, common deployment environments etc.  And there is a greater need for trust in an environment that is less conducive to trust. It is not possible to be involved in all decisions in a multi team setup. There will be compromises and conflicts. So much so that in many situations you would be far better to consider extending the timescale and managing with one team.

Summary

  • The first (many teams and 1 product) is well documented in the various Scaling Frameworks, I am not suggesting implementing them will be easy but they are well documented and there are many experts.
  • The second – many teams, many products – is a question of organisation rather than scaling, so should be no harder than you have found it so far, it is repeating the same albeit with a few complications at an organisational level.
  • But the last group fall in to a gap between the two extremes.  If I have say two or three or even 4 or 5 teams working on one product, I will still have the usual communication difficulties, the need for consistency of direction becomes ever more important and we have a situation where teams can be highly dependent on each other.

It may sound like I am discouraging using the Scaled Frameworks, and I suppose in a sense I am, but not because I disagree with their use, but because they all carry with them quite significant overhead of effort and cost and in a lot of situations there is no benefit and may even be a severe detriment. If you can solve the problem with a little bit of organisation, rather than adding layers of overhead and bureaucracy, surely that is better?

Naturally every situation is different, consider each independently, but please don’t assume because you have multiple teams you need a scaling framework.

 

Advertisements

Understanding cycle-time in software

Cycle-time and Lead-time and even Little’s Law are terms that have migrated over into the software development context and are becoming more widely used, but I am not sure they are fully understood so I’ll attempt to clarify my understanding of the terms.

How does cycle time differ from lead time?

In manufacturing:

There are two different ways I have seen to express Cycle-time:

“Cycle Time” is the overall amount of elapsed time taken to create an item. Measured from the point when work starts until item is delivered.

or

“Cycle Time” is the average amount of time between each delivered item.  e.g. 7 items delivered in a week is a cycle time of 1 day

Lead-time seems to be consistent

“Lead Time” is the amount of elapsed time from which the order was first requested by the customer until the customer has received it.

Note: The different uses of cycle time cause confusion and I prefer the first description in a software context, as the second one ignores the impact of WiP. 

Clearly both of these are more complicated in software terms as the boundaries are not as clear.

cycletime

Cycle-time

Cycle-time is generally considered to be the point when a backlog item is committed and the item is “in progress” through our development cycle, in Kanban terms this is likely time counted from when it moves out of the ‘backlog’ column. We stop counting when it is moved to the ‘done’ column.   There can be a whole debate about what ‘done’ means and that is for another topic. But for simplicity the ‘cycle’ ends when your team will stop work on it (ideally it will be in the hands of the customer).

In Scrum cycle-time is typically fixed at a sprint length, we commit at the start of the sprint and deliver at the end.  But that is not universally true, some teams do not always deliver and there will be effort later to deploy as part of a scheduled release, and some teams deliver as soon as a story is complete.

Note: Cycle-time is sometimes called delivery time. But again this get confused with lead-time because of the non-linear manufacturing to software conversion.

Lead-time

Lead-time is a bit more complicated, it may be counted from the point a need is first identified by a customer to when it is resolved but this is rarely a one-to-one mapping, or it may be considered from the point when the need is identified and refined into a backlog item that will be worked on and delivered.

But really for agile backlogs we generally don’t work in a linear fashion, just because a request has been on the backlog longer doesn’t mean it will be delivered sooner. We prioritize and so a request identified last week may be delivered sooner than one from last year.

As a result Lead-time is generally not used as frequently as Cycle-time, not because we can’t measure it but because it is not as meaningful or useful.

Throughput

Another often used term is “Throughput” this is simply the number of units that have passed though the system (units completed) in a given period.  e.g.  Our throughput is: 10 stories per week, or 5 customers per hour.

Unless stated otherwise you can usually assume that it is an average (mean) or the most recent time period.

Little’s Law

Cycle time is a trailing measurement and we are able to come up with metrics and averages from historic results.  But a mathematician from MIT – John Little  devised a mathematical formula for predicatively calculating Cycle-time from the number of units in the system.

Average Cycle Time =  Average Number of items in progress / Average Throughput

Projected Cycle Time   =  Number of items currently in progress  / Average Throughput 

picture1

e.g. So let’s say you have 50 items in progress and on average you complete 10 per week.

picture2

Number of items currently in progress (50)  / Average Throughput (10 per week)

Projected Cycle Time:   50/10  =  5 weeks

What this tells us is that for the new units entering the system, on average it will be 5 weeks before each of them is completed.

If we want to reduce Cycle-time we have two options we can either reduce the amount of work in progress, or we can increase throughput.  Often our throughput is limited by other factors and so may be harder to change, such as team size or equipment availability. But work in progress (WiP) is typically more within our discretion.

Lets say we reduce the work in progress to just 20.

picture3

Number of items currently in progress (20)  / Average Throughput (10 per week)

Projected Cycle Time:   20/10  =  2 weeks

What this tells us is that for the new units entering the system, on average it will be 2 weeks before each of them is completed.

What you see here is that our throughput is the same, we are still delivering the same amount but by reducing work in progress we are able to deliver the same amount in a shorter elapsed time, and in a desire to have more agility then a shorter cycle time enables us to change direction sooner.  Less work in progress does not increase our throughput, however, it makes us more dynamic and more adaptable to change.

The less work you have in progress the sooner it will get done.

Yorke’s motto

In essence – the less work you have in progress the sooner it will get done, that is the lesson that should be taken from this.

But there does become a point where you can reduce the work in progress to a point where throughput is impacted (the system can become starved) and thus there is a trade-off.
In some situations cycle-time is a critical factor so an element of slack time is desirable, say urgent need to see a doctor, compared to a regular appointment.
But in most other cases we would aim to limit our WiP as much as we can to the point where throughput is not impacted.

Other less used terms

Then we come to the fun ones, these terms come from manufacturing but are used much less frequently in a software context but may be useful to be aware of as it can help understand the composition of cycle-time and where you can take action to tighten up your process.

  • Process Time
    • This is the time when someone is working on creating something: writing code, documents, design, tests, etc.
  • Move time
    • In software terms this would the time to move from one user to another or from one action to another and so would include, compilation, building, deploying and hand over/knowledge sharing
  • Inspection time
    • This might be QA time or code reviews, demonstration to Product Owners or stakeholders, but this may overlap with process time as this is still value added work.
  • Queue time
    • This is time when a unit(story/task) is in progress but no worked on, say blocked, or waiting to be code reviewed waiting for QA waiting for Demo, any time work is paused for any reason before it is done.
  • Wait time
    • This is the time a request spends in the backlog before we commit to work on it, e.g. the time from when a customer identifies a feature/function or bug until we start work on it.
  • Value added time
    • Any time in the system where activity that adds value is happening e..g not queuing, waiting.

Summary

Cycle-time and Little’s Law are becoming much more frequently used and whilst they are not complicated to understand the basics of they do bring a new set of terms and associated confusion.

For example I have seen a number of times recently people advising that reducing cycle-time increases throughput.  Mathematically speaking this is incorrect, (although the reverse is true increased through put should reduce cycle-time). Anecdotally less context switching does increase throughput.
What they are advising is good advice – reduction of Work In Progress. However, their expectation and  reasoning is wrong and adds to yet more confusion.

Cycle-time is not a measure of quantity of output, but a measure of the duration it is in the system. It is important to remember that throughput and cycle-time are different measures.

Cycle-time is not a measure of quantity of output, but the duration it is in the system.

I hope this helps and I’d be happy to add any further clarification if there are parts that are unclear.