I’ve found myself repeating this a few times lately, so maybe it will help some people out there.
In BDD, we don’t design using mocks.
We design by thinking about context, responsibility, collaboration and delegation, then we use mocks to express that (or stubs, in the case of context).
If we can find a different word to replace mock, stub, test-double, test-spy, etc., we probably should. In the spirit of BDD’s NLP roots, it might help us think differently. There may be more than one word for different uses. Any suggestions?
While I’m coding, I usually have a bunch of very helpful pixies hanging around my desk. (They’re Dan’s pixies really, since he thought of them first; I’m just borrowing them.)
The pixies are bored, and just waiting for a job to do. So, when I’m coding a class, they look out for opportunities to help out that class. When I’m coding the Game of Life, for instance, I write a Gui class that lets me toggle the cells on the grid. Then I have to work out what happens when I toggle the cells.
I could do it in the same class – in the gui – but fortunately the pixies step in to prevent me making these poor design decisions. “Oh, I’ll do that for you!” one of the pixies calls out. (They usually start with this phrase, and they’re all called Thistle.)
“Thanks, Thistle! Do you know what you’re doing?”
“Um, not really. What’s toggling a cell? Why’s that valuable? What is it you want me to do for you again?”
“I need you to handle the cell living and dying when I toggle it.”
“Oh, okay!” Thistle says. “I don’t like the ‘and’ word so much, though. It makes me feel like I’m doing two things at once. What do you call that? The living and dying thing?”
“Hm.” I think about it while the pixie taps his foot impatiently. “I’d call it a lifespan, maybe. Can you handle the cell lifespan for me? Just let me toggle the cells. I also need you to tell any observers that there have been some changes to the cells, and give them a way of finding out where those changes are. I think they’ve already got an idea of what they want there.”
“Really? Both things?”
“Well, there’s no point doing one if you don’t do the other. It’s all part of the same role.”
“If that’s what’s valuable to you then I’ll do it,” he says. “Just pretend I’m there for the moment; I’ll be back when you need me.”
“Fine,” I say. So I use a mock pixie in place of the real one. I create an interface which does what the pixie’s going to do: IHandleCellLifespans.
(See, it’s an interface that starts with “I”, and it represents a role that the pixie is playing for me. This is a role-based, anthropomorphized interface.)
So, now we have code which compiles. Of course, the real code in the Gui is null, or maybe a null object pattern – I might create something like IHandleCellLifespans.KILLING_THEM_ALL if I’m feeling particularly mean. But that’s all right, because Thistle the pixie will step in when it’s time.
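Here is a minimal sketch of how that role might look in Java. Only the names IHandleCellLifespans, KILLING_THEM_ALL and Gui come from the story; the toggleCellAt and cellClicked methods and the RecordingLifespanHandler stand-in are assumptions made up for illustration, and the observer-notification half of the role is left out to keep it short.

```java
import java.util.ArrayList;
import java.util.List;

// The role the pixie offered to play, named from the Gui's point of view.
// The method name and signature are hypothetical.
interface IHandleCellLifespans {
    // Null object: ignores every toggle, so no cell ever comes to life.
    IHandleCellLifespans KILLING_THEM_ALL = (column, row) -> { };

    void toggleCellAt(int column, int row);
}

// The Gui only knows about the role, not about whoever ends up playing it.
class Gui {
    private final IHandleCellLifespans lifespanHandler;

    Gui(IHandleCellLifespans lifespanHandler) {
        this.lifespanHandler = lifespanHandler;
    }

    // Called when the user clicks a cell on the grid (hypothetical entry point).
    void cellClicked(int column, int row) {
        lifespanHandler.toggleCellAt(column, row);
    }
}

// A hand-rolled stand-in for the pixie that records what it was asked to do.
class RecordingLifespanHandler implements IHandleCellLifespans {
    final List<String> toggled = new ArrayList<>();

    @Override
    public void toggleCellAt(int column, int row) {
        toggled.add(column + "," + row);
    }
}

class GuiExample {
    public static void main(String[] args) {
        // With the stand-in, we can see what the Gui delegated.
        RecordingLifespanHandler pixie = new RecordingLifespanHandler();
        new Gui(pixie).cellClicked(2, 3);
        System.out.println(pixie.toggled);

        // With the null object, the code runs but nothing happens.
        new Gui(IHandleCellLifespans.KILLING_THEM_ALL).cellClicked(2, 3);
    }
}
```

Because the Gui depends only on the role, swapping the null object for a real Engine later shouldn’t require any change to the Gui itself.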
So, I run the code. I’ve usually written an automated scenario. It doesn’t matter whether I run the scenario or step through the game manually; both result in the same thing happening, or not happening – no matter where I click the mouse, no cells appear. Pixies are notoriously unreliable.
Since I can’t rely on the pixies, I inject a new class to handle the dependency instead. I decide to call it the Engine, for the moment, and I write an example of how to use it and what it should do for me.
The next step is the Next button. I think about how this would work in the Engine, and start writing some code to show how the Engine needs to behave. I’ll need to calculate the number of neighbours, and apply the rules accordingly.
One of the pixies pipes up, “Oh, I’ll do the neighbours!” and another one says, “Oh, I’ll handle the rules!”
“Fantastic!” I’m so trusting; I always forget what these pixies are really like when it comes to getting the work done.
“If I’m going to count the neighbours,” Thistle says, “I’ll need some information about where I’m counting from, and what’s around me.”
“Ah, I can get the information from the cell itself,” I say.
“No, don’t do that. It’s fine where it is; I’ll just sit inside the cell and do it from there. Can you give me something that lets me know where the other cells around me are? Then I can do the work for you.”
“Sure,” I say, “the Engine knows where the life is. I’ll just give you access to the Engine and let it play that role for you.”
“Can’t I do it instead? I’m bored,” one of the other pixies asks. “Just give me the information from the engine, and let him talk to me instead.”
“Um, Okay.”
Of course, when I try to run it, I find out that all the pixies have mysteriously vanished, and I end up assigning the role to the Engine anyway, or one of its anonymous inner classes. Having it defined as a different role means that it’s easy to move this responsibility around. Maybe I’ll create a World to look after the cells, and let that do the job instead; the pixies certainly aren’t very helpful.
“What do you mean, we’re not very helpful?” Thistle says. “Look at your code. You haven’t written anything that isn’t needed by something else, so there’s less code to maintain. Because we jump in all the time to try and do jobs for you, every time you can assign a new responsibility to something else, you do – that’s the single responsibility principle in action; none of your classes are doing too much. And you can replace us with something else that does the job at any time – that’s the Liskov Substitution principle. The roles we perform are clearly named. It’s been easy to describe the behaviour of each class using mocks to stand in when we’re not there, and the examples are very readable. You can also use them to work out whether your code still works or not, by running them as tests.”
“Okay. I can’t see myself relying on you guys for bigger, enterprise stuff, though.”
“What do you mean?” Thistle looks offended. Oops.
“Well, let’s say that I’ve got a shop, and I need the tills to talk to stock control.”
“Controlling stock? Oh, I’ll do that!” one of the pixies announces excitedly.
“What, all on your own?”
“Well, I’ll probably delegate it to a team, but that’s my responsibility – you don’t need to worry about it. I’ll be there when you need me. Just pretend I’m there for now. How would you like to find me? What kind of stuff are you going to send to me, and what do I need to do with it? What would you like back?”
So I start with something simple – a URL that I’ll use to find Thistle the pixie, some domain objects that I want to send him, and some objects that I’d like back in return. We talk about how to get the information across, whether some of the tills might provide different stock information, how to talk to the claims department about the quality of the goods we’re selling, and whether I’ll be okay if the claims he gives me have more information than I need.
“Hold on,” I say. “You’ve got me playing this game now. I’m not a Claims Department. I’m not going to do the job myself. I’ve got a home to go to!”
“Meh, never mind,” Thistle replies. “I’ll be sitting with this team over here, coding the stock control. We’ll just pretend you’re doing the job; we’ll mock you out for now.”
“How will I know that I’m doing the job correctly?”
“We’ll have to talk to each other occasionally. Is that going to be hard? We’ll write some scenarios over in our team that describe how we’re going to use you.”
“What if I make a mistake?”
“Do you know what mistakes you’re going to make already?”
“No,” I confess. “I’m sure there will be some, though.”
“When you make a mistake, we’ll deal with it at that point. Sound good?”
I think about it. I reckon I could write some code that pretends to be doing the job of the Claims Department and responds correctly to the way they’ve described how they’re going to use me – just for those examples – then I could go home and Thistle would never know. I knock up a quick stub and slip it into the stock control team’s scenarios, then I disappear too, just like the pixies. I figure I’ll start coding a real Claims Department that does a more robust job tomorrow.
When I get back the next morning, all the pixies performing the role of Stock Control have been replaced with code too. The Stock Control team claim that they’ve never even seen them.
I corner Thistle again. “You’re really not very helpful, are you?”
Thistle looks sulky. “Of course we are! Look at your architecture. You’ve got simple messages going back and forth. Your code is very tolerant of extra information, as is the code on this side. You’ve got lovely RESTful URLs, because you were thinking about how you’d like to find us, instead of us providing you with some weird mechanism that doesn’t match exactly what you wanted. We’ve got clean interfaces and APIs. There are no extra columns in your database, because you only replaced exactly what we said we’d do in the first place. You’ve got scenarios to describe how we work together, and at a unit level examples to describe how you’re delegating responsibility to the other pixies. It’s a lovely, maintainable system with a fairly flat cost of change. Isn’t that what you wanted?”
I nod thoughtfully. “I think it would have been easier for me to just write the code instead of going through you all the time.”
“Ah,” says Thistle, “but then you’d have code that was easy to write, instead of code that’s easy to use.”
I think about how they made me fill in the role of the Claims Department. “You never did any of the work, though! I could have done that job myself; put myself in each of those roles and then replaced myself with real code. That would have let me create consumer-driven interfaces just as easily as using you.”
Thistle shrugs. “If that’s what works for you, sure.”
“Won’t people think I’m a bit mad? If I start talking about how I’m personally going to use a particular class, or how I’m offering to do a job for another?”
Thistle looks at me with raised eyebrows, then gestures at all the other pixies clustered around my desk.
“I think it’s a little late to be worrying about that now,” he says.
My first ever article, “Pulling Power: a new Software Lifespan” is up on InfoQ. BDD, Feature Injection, Lean and Kanban playing nice together!
Big thanks to Dan North, Chris Matts, David Anderson, Amr Elssamadisy and the amalgam of developers who make up “Jerry”; also to the Thoughtworkers who reviewed my article and gave me advice, and the Kanbanjins like Eric Willeke who patiently talked me through Lean and Kanban. Several times. Anything I still haven’t understood is my error, not theirs.
I wish I could have put in everything I’ve learnt about Kanban, which is far larger than this article allows. It strikes me as a lovely, simple, high-discipline, least-wasteful way to deliver software, and it matches my feeling that you should fit your process to reality, not reality to your process.
My experiments with Kanban boards so far have been highly successful. Now I want to know about the problems, too. Have you tried Kanban? What happened? Did you try to introduce it? Did that work out? If not, why not? Did you think about introducing it, but decide not to? Tell me your stories!
Prompted by Szczepan’s post.
I have written tests after code when
- the code is already there, but I / others don’t know what it does or how to use it, and we need to find out
- I wrote a spike which turned out to be pretty good code, and I wanted to describe what it does and how to use it
- I realised I needed to add some simple yet important behaviour while making a different test pass, and forgetting to add the code would possibly introduce serious bugs, so I added it as I went along and now I’m going back to show what it does and how to use it
- I was learning how to write the tests first, and wanted to know what they would look like.
When we first wrote JBehave 1.0 we quickly recognised that there was power in the scenarios; in the conversations that they could help to drive, and in the reusability of the steps.
I loved the ease with which you could combine smaller steps to make bigger ones, use scenarios as the contexts of further scenarios, and take the necessarily procedural automated tests and turn them into sets of reusable objects.
Now BDD is more widely used, and people out there are using JBehave 2 and RSpec, and I hear complaints. Amongst them, this is one of the more common:
“Every time I change this screen, I have to go through fifteen files and add another step.”
Since JBehave uses plain text scenarios you can’t rely on the common refactoring tools, and that can make it a bit more painful than just messing with code. So, I thought I’d have a go at explaining how I avoid this issue, by sharing some of the ways in which I divide or combine contexts, events and outcomes into reusable components that help me avoid duplication in my scenarios.
Given a context
A context is a state which was set up by irrelevant forces, and which is used within the scenario to alter the outcome resulting from the events.
If the manner or time in which the context was set up matters, then that, too, is part of the context; unless it’s part of the behaviour you’re looking for, in which case it’s an event. The only reason for separating the contexts in which a scenario occurs from the events which are performed in the scenario is that it doesn’t matter how the contexts were created. This means that Givens have more flexibility in implementation than Whens.
A context should matter. If you can remove the context from the scenario without changing the outcomes, it isn’t part of the Givens.
A context should be independent of other contexts. So, I prefer Given a wet newspaper to Given a newspaper / Given that the newspaper is wet. The first is less likely to require refactoring than the second.
A context should create an abandonable artifact. By this I mean that the forces which created the artifact – data in the database, files on the disk, a particular page at a given URL – can safely forget about the context they’ve created. Given an article about Iraq is a good context. Given I am logged in is not so good, even though we frequently use it as an example. Sorry. If it helps, it’s a step that rarely needs maintenance. Given we’ve filled in the comment box and are ready to submit it is likely to cause issues, because of all the other tiny steps that you have to use to get there.
When an event happens
An event exercises the feature whose behaviour you’re interested in when describing or running the scenario.
If you don’t care whether it works as long as it leaves things in a clean state, it’s a context. If you don’t actually need to do anything to cause an outcome – you’re simply checking that given some state, some other state is also present – you don’t need to write an event; just skip straight from Given to Then.
An event should create a valuable outcome. The granularity of the ideal event is very similar to that of the ideal context. As a user, I don’t want to go to the screen with the book, click the purchase button, navigate to the basket, enter the credit card number and click “submit”. I just want to buy a book. By wrapping the steps through which a user purchases a book inside this one larger step, we capture the value of that step. Since people rarely do things that they (or their sponsor, paymaster, loved one, etc.) don’t find valuable, this can usually be reused as one large step.
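As a rough sketch of that granularity, here is one way the larger step could wrap the smaller ones. This is plain Java rather than real JBehave step code (no annotations or framework classes are used), and the class name, method names and recorded action strings are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical step class. The scenario text only ever says "When I buy the book".
class BookPurchaseSteps {
    // Records what was done, standing in for driving a real GUI.
    final List<String> actions = new ArrayList<>();

    // The one valuable event that scenarios refer to.
    void whenIBuyTheBook() {
        goToTheBookScreen();
        clickPurchase();
        goToTheBasket();
        enterCreditCardNumber();
        clickSubmit();
    }

    // The small steps stay private, so a change to the screen is made here
    // rather than in every scenario file.
    private void goToTheBookScreen() { actions.add("go to book screen"); }
    private void clickPurchase() { actions.add("click purchase"); }
    private void goToTheBasket() { actions.add("go to basket"); }
    private void enterCreditCardNumber() { actions.add("enter credit card number"); }
    private void clickSubmit() { actions.add("click submit"); }
}

class BookPurchaseExample {
    public static void main(String[] args) {
        BookPurchaseSteps steps = new BookPurchaseSteps();
        steps.whenIBuyTheBook();
        System.out.println(steps.actions);
    }
}
```

If the purchase screen gains another field, only the body of whenIBuyTheBook needs another line; the scenarios that say “When I buy the book” stay as they are.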
An event may be dependent on context or on another event. So, when I buy a book, and when I cancel my order within 15 days, then I should not be charged for the book.
An event should cause or contribute to an outcome. The outcome is something measurable. It could be that the outcome you’re looking for is an absence of something, for instance if a user’s preferences have been changed, and you no longer want to see all those Facebook groups. If it doesn’t cause an outcome, it’s a fairly irrelevant event.
Then an outcome occurs
An outcome describes the benefit that your system or application provides when the events are performed in the given context.
An outcome should have teeth. If a particular error message doesn’t have the exact wording expected, the world will not come to an end. If my credit card gets billed for the books but I don’t get them when I expect, it might.
An Outcome should be Specific, Measurable, Achievable, Relevant and Timeboxed. Ask a QA if you don’t know what this means. QAs are wise and can break anything.
An Outcome should represent the valuable purpose of the events. Instead of checking that a series of menus exist when you navigate to a particular screen, write a scenario that uses those menus and check that the benefits they provide are accessible through them.
Stories and regression tests
It can take quite a while to run scenarios. I sometimes like to turn mine into regression tests by combining them. I like to add contexts, events and outcomes to existing scenarios to better describe the benefits of using any particular feature in any particular context. This may mean that scenarios are related to more than one story. This will help keep them maintainable, and isn’t a bad thing.
I have never found the need to add a context to a scenario half-way through that scenario, even if it’s been created from several others.
I do frequently use one scenario as the context for other scenarios.
Unteaching the business
Sometimes, we accidentally train our business to talk to us about the solutions they’d like, occasionally in the language of software development. In that case, they’ll quite happily discuss the particular steps they need to take in the GUI to achieve the desired outcome, and may even have an idea of the underlying database tables and discuss what ought to be in those tables after the events.
If this happens, try to draw the conversation back to how the data will be used; why it’s valuable to produce that artifact in the first place. You never know; you may find you need to do less work than you thought you did.
People often ask me to review their unit tests, or answer questions on how to write different aspects of tests. I frequently find myself making the same suggestions over and over again. So, if you were thinking of asking one of these questions, maybe these common BDD refactorings will help.
I have 10 different instances of this class or enum. Does that mean 10 tests?
Tests should make things easy to change. What you’re doing is pinning down your code so that it’s hard for people to break it. I know that when your code is changed and the test breaks, people will look at the tests and work out what went wrong; that’s because your tests are brittle. Instead, I’d rather people came and looked at the tests because they’re clear examples of how to use the code and why it’s valuable.
QAs use equivalence partitioning. They say, “If I do this, it’s equivalent to doing these 5 other cases. So I don’t need to do those.” They also look at boundary conditions, where a shift in context starts to produce different behaviour. If we can do this too, we can cut down the number of different examples we need to produce for one aspect of behaviour.
So, for your ten cases, I might have just a few examples. This makes it easy for people to add more examples, and encourages them to understand the boundaries and partitions. We want it to be easy for people to change the code.
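To illustrate, here is a small sketch that borrows the Game of Life survival rule: a live cell survives with two or three live neighbours. There are nine possible neighbour counts, but only three partitions, so four examples at the boundaries describe the behaviour. The SurvivalRule class and the check helper are hypothetical names, not code from any real project.

```java
// Hypothetical rule under test: does a live cell survive to the next generation?
class SurvivalRule {
    static boolean survives(int liveNeighbours) {
        return liveNeighbours == 2 || liveNeighbours == 3;
    }
}

class SurvivalRuleExamples {
    public static void main(String[] args) {
        // Nine possible neighbour counts (0 to 8), but only three partitions:
        // too few, just right, too many. We give examples either side of each boundary.
        check(!SurvivalRule.survives(1), "a cell with 1 neighbour should die");
        check(SurvivalRule.survives(2), "a cell with 2 neighbours should survive");
        check(SurvivalRule.survives(3), "a cell with 3 neighbours should survive");
        check(!SurvivalRule.survives(4), "a cell with 4 neighbours should die");
        System.out.println("All examples passed");
    }

    // Minimal assertion helper, so the sketch needs no test framework.
    static void check(boolean condition, String description) {
        if (!condition) {
            throw new AssertionError(description);
        }
    }
}
```

Anyone who wants to add a fifth example can see where the boundaries are, and which partition their new case belongs to.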
How do I do <this complicated thing> with my mocking framework?
Tests should make things easy to change. If you’re doing something complex with mocks, this is what it looks like:
- Given <this context>
- Expect <this outcome>
- Also <this really complex stuff that you have to read carefully>
- When <this event happens>
- Then <go back to the expectations and read them, because this is where that gets checked>
If the mocking you’re doing is making it harder to change the code, or you don’t have access to something like Mockito, try rolling out your own. This will make it easier to do the things you want in a legible way, and any assertions can happen at the end of the test, along with the rest of the outcomes.
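A minimal sketch of rolling your own, assuming a hypothetical Cowhand that delegates to a CowRepository and a Cow. RecordingCow and ShedWithCows are made-up names for the hand-written doubles; no mocking framework is involved, and the check happens at the end along with the other outcomes.

```java
import java.util.HashMap;
import java.util.Map;

// The roles the class under test collaborates with (hypothetical).
interface Cow {
    void milk();
}

interface CowRepository {
    Cow getCowByName(String name);
}

// The class whose behaviour we want to describe.
class Cowhand {
    private final CowRepository shed;

    Cowhand(CowRepository shed) {
        this.shed = shed;
    }

    void milkCow(String name) {
        shed.getCowByName(name).milk();
    }
}

// Hand-rolled spy: remembers whether it was milked.
class RecordingCow implements Cow {
    boolean wasMilked = false;

    @Override
    public void milk() {
        wasMilked = true;
    }
}

// Hand-rolled stub: a repository that knows only the cows we give it.
class ShedWithCows implements CowRepository {
    private final Map<String, Cow> cows = new HashMap<>();

    ShedWithCows with(String name, Cow cow) {
        cows.put(name, cow);
        return this;
    }

    @Override
    public Cow getCowByName(String name) {
        return cows.get(name);
    }
}

class HandRolledCowhandExample {
    public static void main(String[] args) {
        // Given a shed which knows about one cow
        RecordingCow daisy = new RecordingCow();
        Cowhand cowhand = new Cowhand(new ShedWithCows().with("Daisy", daisy));

        // When I ask the cowhand to milk the cow
        cowhand.milkCow("Daisy");

        // Then the cow should have been milked
        if (!daisy.wasMilked) {
            throw new AssertionError("Daisy should have been milked");
        }
        System.out.println("Daisy was milked");
    }
}
```

The doubles cost a few extra lines, but the example reads straight through from context to event to outcome.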
I’ve called my test should. Is that right?
Tests should make things easier to change. Why is returning true valuable? Because it triggers some business process? All right, then let’s call it should. This will encourage people to read your tests. While you’re at it, you might want to rename the method to the same thing.
Yes, you’ll get duplication between the test name and the code name. That’s because your method is doing what it says on the tin, and you’re providing an example of how to do that.
Now my test is called should. Do I need to write shouldNot?
Tests should make things easier to change. If I have to look in two different places for the examples of behaviour I want to change, then it’s not so easy. So, maybe not.
There are three ways of doing this. I’m guessing that triggering the process is the “happy path”. If this is the only valuable thing that happens, and it always happens, that’s fine – that’s the first way. But we know you were returning true, so there’s something else we need to consider. Is the happy path valuable without the sad path? For instance, a list tells you when it’s empty. That’s not valuable unless it also tells you accurately when it’s not empty. True for empty, false for not – they’re the same piece of behaviour. Neither is valuable without the other.
If you have two tests – one for each – you might want to put them in the same test method. Now you’re describing the behaviour holistically. I know Dave Astels said “one assertion per test”. We’re only looking at one aspect of behaviour. That’s the second way.
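As a sketch of that second way, here is the empty-list behaviour described in a single example method, using java.util.ArrayList. The class name, method name and check helper are assumptions for illustration; in a real project this would probably sit in whatever test framework is already in use.

```java
import java.util.ArrayList;
import java.util.List;

class ListEmptinessExample {
    // One example method describing one aspect of behaviour: reporting emptiness.
    static void shouldTellUsWhetherItIsEmpty() {
        // Given a new list
        List<String> shopping = new ArrayList<>();

        // Then it should tell us that it is empty
        check(shopping.isEmpty(), "a new list should be empty");

        // When we add something to it
        shopping.add("milk");

        // Then it should tell us that it is no longer empty
        check(!shopping.isEmpty(), "a list with an item should not be empty");
    }

    // Minimal assertion helper, so the sketch needs no test framework.
    static void check(boolean condition, String description) {
        if (!condition) {
            throw new AssertionError(description);
        }
    }

    public static void main(String[] args) {
        shouldTellUsWhetherItIsEmpty();
        System.out.println("Example passed");
    }
}
```

There are two checks here, but someone changing how emptiness is reported has only one place to look.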
Of course, you might have a third way too. If you have several exceptional cases, you can list these as exceptions. So you’ll have one shouldNot for each context in which the process is not triggered. Validation is a good example of when this happens; you would normally have one description of the happy path, and one description for each of the exceptions.
I can’t write these automated tests. It’s very difficult to describe how to use my class.
Really? Probably it’s quite a difficult class to use correctly, then. How did you think of that API? Did you think of the class first, then think about how to use it?
Instead, try thinking as if you were the consuming class. Think, “I need something that does this for me. This is how it should look. This is how I want to use it.” Then write some code that does exactly what you want.
Because you’ve thought about how your class is going to be used, it will be easier to describe how to use it in the examples. It will also be easier to understand. So when you do successfully write your tests, you’ll know that your code is easier to change.
Damien Guard introduces the other *DDs.
I particularly like “Psychic Driven Development” and “Golf Driven Design”. Thanks, Damien.
Mike and Gabriel both posted comments to show how the Cowhand example in my last post might look and evolve with different mocking frameworks. They’ve used FluentSpec and Rhino Mocks respectively.
Thanks, Mike and Gabriel! It was a pleasure to find these snippets waiting.
Also, Olof suggested that I include something a little less farm-related, so I’ve added another example with Mockito*, too. Thanks, Olof!
*Mockito’s syntax is now even better than this, but this is the syntax I’m used to.
Some of us have taken to writing comments in our BDD classes to give us Given, When, Then at a unit level.
So, if I’m writing examples for a cowhand, I might write something like this:
// Assumes JUnit 4 and Mockito 1.x (stub/toReturn is the older Mockito syntax)
import static org.mockito.Mockito.*;

import org.junit.Test;

public class CowhandTest {

    @Test
    public void shouldMilkTheCow() {
        // Given a repository which knows about one cow
        CowRepository shed = mock(CowRepository.class);
        Cow daisy = mock(Cow.class);
        stub(shed.getCowByName("Daisy")).toReturn(daisy);

        // Given a cowhand who knows where the cows are kept
        Cowhand cowhand = new Cowhand(shed);

        // When I ask the cowhand to milk the cow
        cowhand.milkCow("Daisy");

        // Then the cow should have been milked
        verify(daisy).milk();
    }
}
(updated) Olof rightly suggests that cows may not be the clearest way to explain this, so here’s another:
// Assumes JUnit 4 and Mockito 1.x (stub/toReturn is the older Mockito syntax)
import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertTrue;
import static org.mockito.Mockito.*;

import org.junit.Test;

public class LoginControllerTest {

    @Test
    public void shouldRedirectSuccessfulLoginsToTheHomePage() {
        // Given an authenticator which allows the user to log in
        Authenticator authenticator = mock(Authenticator.class);
        stub(authenticator.login("Fred", "P@55word")).toReturn(true);

        // When the controller gets Fred's login attempt
        LoginController controller = new LoginController(authenticator);
        HttpResponse response = controller.attemptLogin("Fred", "P@55word");

        // Then the response should redirect us to the home page
        assertTrue(response.isRedirect());
        assertEquals("/home", response.getRedirectUrl().toString());
    }
}
Occasionally I’ll run across something that needs to be explicitly captured in a step as a method, but mostly the comments are sufficient. The audience for the class-level steps is technical, so this works.
You can think of each class as having a stakeholder, or consumer, which is usually another class. There will be something, somewhere in the codebase that uses my Cowhand, or my LoginController (or I probably don’t need to write it yet). The exceptions are GUI classes; their stakeholder is the user.
You’ll notice that I’ve mocked out my collaborators: the CowRepository and the Authenticator. I’ve used Mockito because it doesn’t require expectations to be set up front, which lets me keep the G/W/T flow.
I may not have written the real Cow, CowRepository or Authenticator class yet. This may be the first time I’ve decided that I need one. In that case, by the time I come to code the class, I’ll already have some examples which describe how I expect it to behave.
Back last year, Vlad Gitlevich kindly made a video of Dan and me talking about BDD.
We concentrated almost exclusively on the principles rather than the technology, which means the video is still very relevant. Particularly we talked about how BDD plays with DDD, outside-in, stories and scenarios and using them in conversation with the business, and our own experiences.
See it here!
Thanks, Vlad!
