Estimating Complexity

Over the last few years, teaching people the Cynefin framework early on in engagements has really helped me have useful conversations with my clients about when different processes are appropriate.

There’s one phrase I use a lot, which is self-evident and gets around a lot of the arguments about how to do estimation and get predictability in projects:

“It’s new. If you’ve never done it before, you have no idea how it will turn out.”

This is pretty much common sense. When I teach Cynefin, I also help management and process leads look for the areas of projects or programmes which are newest. These are the areas which are most risky, where the largest number of discoveries will be made, and often where the highest value lies.

A Simple Way to Estimate Complexity

There’s one kind of work which is urgent and unplanned, and we don’t tend to worry about measuring or predicting, because it absolutely has to be done: urgent production bugs, or quick exploits of unexpected situations. This matches Cynefin’s chaotic domain; a place which is short-lived, and in which we must act to resolve the situation lest it resolve itself in a way which is unfavourable to us.

Aside from this domain, all other planned work can be looked at in terms of how much ignorance we have about it.

Something I often get teams to do is to estimate, on a scale of 1 to 5, their levels of ignorance, where 5 is “complete ignorance” and 1 is “everything is known”.

If a team want a more precise scale, I’ve found this roughly corresponds to the following:

5. Nobody in the world has ever done this before.
4. Someone in the world did this, but not in our organization (and probably at a competitor).
3. Someone in our company has done this, or we have access to expertise.
2. Someone in our team knows how to do this.
1. We all know how to do this.

You can see that if a piece of work is estimated at “5”, it’s likely to be a spike, or an experiment of some kind, regardless of how predictable we might like it to be! This matches Cynefin’s complex domain, and sits at the far edge, close to chaos, since we don’t yet know if it’s even possible to do. 4s are also a high-discovery, complex space; we know someone else has done them, but we don’t know how.

As we move down the numbers, we move through complicated work – understood by a few people whom we might consider to be experts – through to simple work that anyone can understand.
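As a minimal sketch, the scale and its rough correspondence to Cynefin's domains could be written down like this in Python. The names `Ignorance` and `cynefin_domain` are made up for illustration, and the exact boundaries (4 and 5 as complex, 2 and 3 as complicated, 1 as simple) are one reading of the description above rather than a fixed rule.

```python
from enum import IntEnum


class Ignorance(IntEnum):
    """The 1-to-5 ignorance scale (hypothetical names; 5 = least known)."""
    EVERYONE_KNOWS = 1       # We all know how to do this.
    TEAMMATE_KNOWS = 2       # Someone in our team knows how to do this.
    COMPANY_KNOWS = 3        # Someone in our company has done this, or we have access to expertise.
    SOMEONE_ELSE_DID_IT = 4  # Someone in the world did this, but not in our organization.
    NEVER_DONE = 5           # Nobody in the world has ever done this before.


def cynefin_domain(level: Ignorance) -> str:
    """Map a score to a Cynefin domain. The cut-off points are an assumption."""
    if level >= Ignorance.SOMEONE_ELSE_DID_IT:
        return "complex"      # high discovery: likely a spike or experiment
    if level >= Ignorance.TEAMMATE_KNOWS:
        return "complicated"  # expertise exists and is retrievable
    return "simple"           # anyone on the team can do it


# Print the scale with 5s first, the same order as the list above.
for level in sorted(Ignorance, reverse=True):
    print(level.value, level.name, cynefin_domain(level))
```

Chaotic work is deliberately left out: it is urgent and unplanned, so it never reaches the point of being estimated.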

We can also measure this complexity across multiple axes: people, technology, and process. If we’ve never worked with someone before, or we’ve never made a stakeholder happy; if there’s a UI or architectural component that’s unusual; if there’s something we’d like to try doing that nobody has done; these are all areas in which the outcome might be unexpected, and in which – as with Cynefin’s complex domain – cause and effect will only be correlated in retrospect.
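One way to record scores across the three axes is sketched below in Python. Taking the highest score as the overall figure is an assumption made for this example (on the grounds that the newest aspect is where the surprises will come from), and `AxisScores` and its methods are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class AxisScores:
    """Ignorance scores (1 to 5) for one piece of work, one per axis."""
    people: int
    technology: int
    process: int

    def overall(self) -> int:
        # Assumption: the work is as complex as its least-understood axis.
        return max(self.people, self.technology, self.process)

    def newest_axes(self) -> list[str]:
        # The axis (or axes) carrying the highest score.
        top = self.overall()
        return [name for name, score in vars(self).items() if score == top]


# Example: a familiar stakeholder and process, but an unusual architectural component.
scores = AxisScores(people=2, technology=4, process=1)
print(scores.overall(), scores.newest_axes())
```

A team might equally prefer to keep the three numbers separate and never combine them; the point is only to make visible which aspect is newest.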

Embracing Uncertainty

Helping teams to be able to estimate the complexity of their work has had a number of interesting outcomes.

Devs are happier to provide estimates in time alongside the complexity estimates. There’s nothing like being able to say, “It’ll take about 20 days, and if you hold us to that you’re an idiot,” with numbers!

Management can then use the estimates to make scoping decisions about releases (in the situations where an MVP might not yet be doable due to large transaction costs elsewhere in the business, like monolithic builds or slow test environment creation). We can also make sensible trade-offs, like whether to use an existing library, or build our own differentiating version now rather than later.

When the scope of a project is decided, be it an MVP or otherwise, it’s very easy to see where the risk in the project is – and to do those aspects first! Even at a very high level, if a team are delivering a new capability for the business, we can still talk about how little we know about that capability, and in what aspects our ignorance is greatest.
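Doing the riskiest aspects first can be as simple as sorting the scoped work by its complexity estimate. The sketch below assumes each item is a (name, score) pair; the item names and scores are invented for illustration.

```python
# Hypothetical scoped work, each item paired with its 1-to-5 ignorance score.
work_items = [
    ("Login page", 1),
    ("Payment provider integration", 4),
    ("Novel pricing engine", 5),
    ("Reporting export", 2),
]


def riskiest_first(items):
    """Order work so that the highest ignorance scores come first."""
    # sorted() is stable, so items with equal scores keep their original order.
    return sorted(items, key=lambda item: item[1], reverse=True)


for name, score in riskiest_first(work_items):
    print(score, name)
```

In practice value and dependencies would also shape the order; this only shows the risk-first part of the decision.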

When it comes to retrospectives, rather than treating actions as definitive process changes, teams can easily see whether an action is something that will predictably lead to an improvement, or whether it should be treated as an experiment and communicated as such (and that last can sometimes be important – the worst commitments are often the ones we don’t realise we’re making!).

And best of all, rather than pushing back on business uncertainty (“I’m sorry, we can’t accept this into our backlog without clear acceptance criteria”), the teams embrace the risk and potential for discovery instead (“What can we do to get some quick feedback on this?”) They can spike, learn from the spike, then take their learning into more stable production code later (Dan North calls this “Spike and Stabilize”). Risk gets addressed earlier in a project, rather than later. Fantastic!

And all you need to do, to enjoy this magic, is estimate which bits of your work you, and the rest of the world, know least about.

Making Better Estimates

One of the things I’ve noticed about development teams is that they often like to make everything complex, particularly the devs.

Testers are very happy to do the same thing over and over again, with minor tweaks. Their patience amazes and inspires me, even if they are utterly evil.

Devs, on the other hand, will automate anything they have to do repeatedly. This turns a complicated problem into a different, complex one.

The chances are that if we’re actually in a well-understood, complicated domain, rather than a complex one, someone will have solved the problem already and – because we hate having to do the same thing twice – they’ll have written up the solution, either in a blog post, or a StackOverflow or other StackExchange answer, or as an open-source library.

So before you go off reinventing the wheel, you can perform a few searches on the internet to see if anyone has some advice for you first. This can help you work out whether your work really is complex or not.

The Evil Hat

One of the things we need to do in the complex domain is ensure that any experiment is safe-to-fail.

A pretty easy way to do that is to put on the Evil Hat, and think about how you could plausibly cause the experiment to fail. You know – for fun. Think about how you could do it in the most destructive way possible. Then try to think of ways that the nasty, good people might stop you from doing that.

Cognitive Edge have a great method called Ritual Dissent, that’s very similar to the pattern Fly-on-the-Wall that Linda Rising taught me some time ago. This is similar to putting on the Evil Hat, or at least, inviting others to do so.

If you have any difficulty coming up with ways in which to cause an experiment to fail, try asking a tester. They’re really evil, and very, very good at breaking things.

Lastly, take a look at Real Options, a significant part of which is about making decisions into experiments instead. (Another part of it is about getting information before decisions are made, so it plays nicely with both complicated and complex spaces, and even helps us move our problems between them).

Since we don’t always know what we don’t know, and, in a genuinely complex space, things which worked last time might not work this time, it’s a pretty useful tool for when we’re not sure exactly how little we know, too.

Coming Up

The complexity estimates turn out to be all kinds of useful. I’ll be writing a couple more blog posts soon; one about capability-based release planning (which I’ve touched on here), and one about pair-programming, including how it relates to complexity.

Watch this space!

Edit 2019-04-12: Also made it more obvious that 4s are complex too, even if not quite as much.
Edit 2021-06-22: Turned the scale upside-down (5s first) to match how I’ve found it best to teach it.

This entry was posted in capability red, complexity, cynefin, evil hat, real options.

27 Responses to Estimating Complexity

  1. Pingback: Five Blogs – 23 July 2013 | 5blogs

  2. This is a really stimulating blog post and I really like the way you are using this scale of 1 to 5 to estimate complexity. However, I am wondering whether what you describe is a matter of estimating complexity, or rather uncertainty? All the more so as I note that in the most recent version of the Cynefin framework, which he shared a while back on Twitter, Dave Snowden has added a mention of the simple domain as that of the “known knowns”, the complicated domain as the “known unknowns”, the complex domain as the “unknown unknowns” and the chaotic domain as the “unknowable unknowns”. Vide, Also I note that with Cynefin, when we are in the domain of novel practice we are in the Chaotic domain. I was wondering what your thoughts on this were.

    • Liz says:

      Hi Pascal,

      I’ve covered the different way in which I treat chaos in the first paragraph after the cut, and heaven help anyone who’s doing a Sensemaking exercise or estimating anything at that point. The novel practices come from acting, urgently, using experts who know what they’re doing (otherwise you’re probably dead).

      So we’re down to the complex, complicated and simple domains.

      If nobody’s done it, or somebody’s done it but you don’t know how or how hard it was, then you’re in the complex domain, possibly on the border of complicated for a 4. Otherwise, you are indeed in complicated or simple land, and those are 3 to 1 on my scale.

      In all honesty, once it’s a 1, 2 or 3 then the expertise is retrievable and I don’t worry about it. My focus with the estimation is entirely on finding and addressing complexity, which is why I call it “estimating complexity” – because I don’t actually care very much about complicatedness. I should rightly call it “estimating to find complexity” or “estimating for complexity”, but it has a nicer ring this way.

  3. Pingback: Capability-based Planning and Lightweight Analysis | Liz Keogh, lunivore

  4. galleman says:

    Liz, just discovered your blog. Good stuff. I work in the space and defense industry, where we do many of the things found in agile, just not called that. My area of interest is cost and schedule forecasting on large programs. I see here the ranking of the forecasting process. We use a geometric scale instead of a linear scale for categories. This creates the needed separation. 2 is twice 1, but 5 is only 25% bigger than 4.

    As well, we use “Reference Class Forecasting”; Google will find everything you need for that.

    Finally, the BDD seems a lot like our “Capabilities Based Planning,” which Google will find for you as well.

    Got you on my track list now, thanks for the contribution.

  5. testerab says:

    ” Testers are very happy to do the same thing over and over again, with minor tweaks. Their patience amazes and inspires me, even if they are utterly evil.”

    Very interesting and useful post, but I had to jump in on this one: I am a tester. Doing the same thing over and over with minor tweaks doesn’t make me happy, it makes me very unhappy. I have an extremely low boredom threshold. As do many good testers.

    So I wouldn’t call it patience, so much as crazy levels of stubbornness when I think I’ve caught a sniff of something. Finding out something new or interesting isn’t boring.

    I think the distinction is very important, because if what non-testers see is “testers like doing boring stuff” then they start feeling it’s okay to dump boring stuff on them (and bad stuff ensues, like demotivated testers, loss of trust in the team, etc). Whereas the truth may be more that “this tester has got their teeth into something potentially interesting to them – even if it looks boring to you from the outside, your judgment of what looks interesting to someone else isn’t as reliable as theirs”.


    • Liz says:

      I can see your perspective, and I’m definitely in favour of devs helping out testers and making their lives less boring. I’m comparing the boredom threshold of testers I have met to my own boredom and attention threshold, which is frequently… ooh! Squirrel!

      • testerab says:

        I think my point is, you’re not seeing their squirrels. I don’t know that for sure of course – I wasn’t there, but I have frequently experienced devs interpreting a test session where I lost focus after about 3 seconds cos I saw a glimpse of a squirrel I just had to chase as “oh my. She musta had to try EVERYTHING to find that one, testers have a real tolerance for boredom don’t they?”

        Things tend to look different from the outside. If I see someone doing something that looks boring to me, that may just mean that the interesting stuff is happening inside their head. Or that I’m not a good person to pick out interesting and engaging activities for them. Or something other than inaccurate perception or different relative values.

      • Liz says:

        As I said, I’m comparing my threshold to testers that I have met.

        I tend to find patient, tenacious testers more valuable. I’ve tended in the past to measure that criterion by their willingness to give me feedback on something that they only just found a bug in half an hour before. And do it again an hour later. And again an hour after that. If I’m working on something with a number of subtle scenarios, performance and security criteria, and necessary usability, and you’re one of the testers who can help me go through those combinations to check that my stuff works, despite the many interactions with the legacy codebase, the stuff that our junior dev wrote and the stuff that I wrote last month for which I have no excuse, then you have more patience and tenacity than me.

        If you don’t, then I won’t find you as valuable, in that role, as a tester who does.

        And I don’t understand how testers have that.

  6. Pingback: Agile Testing & BDD eXchange 2013 – Notes | Jose Lima

  7. Pingback: Cynefin In Software Testing | Duncan Nisbet

  8. Pingback: Negative Scenarios in BDD | Liz Keogh, lunivore

  9. Pingback: Agile Testing Days 2015, Day 2 | A Priori

  10. Pingback: Cynefin #4 – Safe to Fail – Listy Blog

  11. Pingback: # 8 Side notes JAX London 2016 – Automation Journal

  12. Pingback: Mapping Maturity: create context specific maturity models with Wardley Maps informed by Cynefin – Maturity Mapping

  13. Pingback: Agile Requirements & Behavior Driven Development | Pragmatic Coders

  14. Pingback: Envisioning workshop: Co-create shared Vision and Strategies –

  15. Pingback: How Agile Manages Out Innovation | Liz Keogh, lunivore

  16. Pingback: Cynefin Complexity | No Motherships

  17. Pingback: Estimating Complexity - Liz Keogh on The Product Experience - Mind the Product

  18. Pingback: Estimating Complexity – Liz Keogh, lunivore – reflektivism

  19. Andy Longshaw says:

    Hi Liz. Just re-reading this as we are using it as a reference blog post to help people estimating / forecasting and I noticed that neither of the two links to Ritual Dissent and Fly on the Wall work anymore.

    • Liz says:

      Thanks Andy, fixed. The version of F-o-t-W I found isn’t exactly the same as Linda taught me, but I understand better these days why the only-negative-criticism-ritual is important in Ritual Dissent… so I recommend that link anyway!

  20. Pingback: Is the user story overrated? Some story patterns and formats to learn from – Michael’s lean and agile musings

  21. Pingback: Mapping Maturity: create context specific maturity models with Wardley Maps informed by Cynefin - mm
