OO acceptance tests (or behaviour)
I’m sure this can’t be original – most of it has come about from talking to my colleagues, or is nicked from the way JBehave works and my attempts to use it – but I wanted to write it down so that I’ll remember next time. It’s a bit long and techy. Forgive me.
JBehave introduced me to the idea:
- Given a context
- When these events happen
- Expect this outcome to occur.
At the moment, our team writes the context by performing the steps which get us into that context – we run a scenario, which might be the first part of an existing story test. It might be the first part of tests for a lot of stories. This means that we’re duplicating bits of tests. It often means that the same code appears twice in different tests. It certainly takes time to run.
Here are some stories:
- As a sheep farmer and organic wool producer, I want to shear sheep and sell the raw wool so that I can make money.
- As a sheep farmer and organic jumper maker, I want to shear sheep, spin the wool, make jumpers and sell them so that I can make money.
- As a sheep farmer, I want to shear sheep and record the weight of wool sheared so that I can breed the best wool producers.
- As a sheep farmer, I want to shear sheep and record the date on which they were sheared so that I know when to shear them again.
All of them require the shearing of a sheep. I imagine a nice little screen in which you can add the details of your sheep to the database, then pick which of your sheep were sheared, what weight of wool was produced, etc. Filling in these screens for every story doesn’t give you any extra confidence in the code the second time it’s done. Why not just do it once, then pretend that it’s been done for every other story?
So here’s my idea for cleaner, quicker acceptance testing.
Context
We write the first part of an acceptance test. The bit of the test which gets us to this point of the story is a class of its own, and implements a contextual interface. So, for instance, we might test that when we shear a sheep, weigh the wool and tell the app that we’ve put it in the cupboard, we get records that there are 3kg of wool in the cupboard. We could also, if we wanted to, check that one of our sheep was shorn on the 13th August 2005. We would name this test SheepShearingTest, and it might implement the interfaces ThreeKilogramsOfWoolAreInCupboard and BlackieIsShorn, both of which would extend the role of Context. (In JBehave there’s a Context interface). If we want to make these contexts reusable, we can always just call them RawWoolInCupboard and SheepIsShorn and either set parameters which configure them or respond to the configuration in our tests.
At this point in the acceptance test, we can check the state of the domain model – eg: check that we have 3kg of wool in our class representation of the domain. This isn’t part of the story test itself. It might even have a class of its own – RawWoolInCupboardVerifier. This verifier takes a domain model in its constructor, possibly through interfaces only, and has methods to allow you to check the bits you’re interested in. It might even be able to check what shades or species of wool are in the cupboard, what date they were put in there, etc. It’s reusable, so you can put it into lots of tests. It should also be pretty quick to run. Each test should only use the verifications it’s actually interested in.
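Here’s a rough sketch of how a reusable context and its verifier might look in Java. Everything in it apart from the names RawWoolInCupboard and RawWoolInCupboardVerifier is made up for illustration: WoolStore, InMemoryWoolStore, the setUp method and the "raw" cupboard name are my assumptions, and the Context interface here is a stand-in of my own, not JBehave’s.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical slice of the domain model: just enough to hold wool.
interface WoolStore {
    void addRawWool(String cupboard, double kilograms);
    double rawWoolIn(String cupboard);
}

// Simple in-memory implementation, standing in for the real domain.
class InMemoryWoolStore implements WoolStore {
    private final Map<String, Double> kilogramsByCupboard = new HashMap<>();

    public void addRawWool(String cupboard, double kilograms) {
        kilogramsByCupboard.merge(cupboard, kilograms, Double::sum);
    }

    public double rawWoolIn(String cupboard) {
        return kilogramsByCupboard.getOrDefault(cupboard, 0.0);
    }
}

// Stand-in for the Context role: puts the model into a known state.
interface Context {
    void setUp(WoolStore store);
}

// Reusable context, configured by a parameter rather than a fixed 3kg.
class RawWoolInCupboard implements Context {
    private final double kilograms;

    RawWoolInCupboard(double kilograms) {
        this.kilograms = kilograms;
    }

    public void setUp(WoolStore store) {
        store.addRawWool("raw", kilograms);
    }
}

// Verifier: takes the domain model (through its interface) in the
// constructor and offers only the checks a test might want.
class RawWoolInCupboardVerifier {
    private final WoolStore store;

    RawWoolInCupboardVerifier(WoolStore store) {
        this.store = store;
    }

    boolean hasKilograms(double expected) {
        return Math.abs(store.rawWoolIn("raw") - expected) < 1e-9;
    }
}
```

A test would call `new RawWoolInCupboard(3.0).setUp(store)` and then ask the verifier whether `hasKilograms(3.0)` holds.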
Event
The second part of our acceptance test deals with the events which happen – eg: spin the wool. This, we just ‘do’. Whatever we use to do it implements interfaces – for instance, WoolSpinner. The implementing class is part of the production code. If we don’t have code to do this, we can write it as we complete the story, unit testing as appropriate.
Outcome
Running the events on the context gives us an outcome – 10 balls of wool in the spun wool cupboard. We write another checker for this. It might also be in its own class, which implements an interface of its own – SpunWoolInCupboardVerifier. You get the idea. (In JBehave, we also have an Outcome interface).
Now, when we construct our test, we can dependency-inject it with the contexts, the bits of code which actually do the events, and the outcome.
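As a sketch only, the wiring might look something like this. The Cupboards class, the three small interfaces, StoryTest and SimpleWoolSpinner are all hypothetical names of mine (the interfaces are not JBehave’s), and the rule that 300g of raw wool makes one ball is an assumption chosen so that 3kg gives the 10 balls in the example.

```java
// Hypothetical domain model, reduced to two numbers for the sketch.
class Cupboards {
    int rawWoolGrams;
    int ballsOfSpunWool;
}

// Stand-ins for the three roles.
interface Context { void setUp(Cupboards cupboards); }
interface Event { void occurIn(Cupboards cupboards); }
interface Outcome { boolean holdsIn(Cupboards cupboards); }

// Stand-in for the production class that would implement WoolSpinner.
// Assumed spinning rule: 300g of raw wool makes one ball.
class SimpleWoolSpinner implements Event {
    private static final int GRAMS_PER_BALL = 300;

    public void occurIn(Cupboards cupboards) {
        int balls = cupboards.rawWoolGrams / GRAMS_PER_BALL;
        cupboards.ballsOfSpunWool += balls;
        cupboards.rawWoolGrams -= balls * GRAMS_PER_BALL;
    }
}

// The test receives its three parts through the constructor,
// so any of them can be swapped without touching the others.
class StoryTest {
    private final Context given;
    private final Event when;
    private final Outcome then;

    StoryTest(Context given, Event when, Outcome then) {
        this.given = given;
        this.when = when;
        this.then = then;
    }

    boolean passes() {
        Cupboards cupboards = new Cupboards();
        given.setUp(cupboards);
        when.occurIn(cupboards);
        return then.holdsIn(cupboards);
    }
}

class SpinningStoryDemo {
    public static void main(String[] args) {
        StoryTest test = new StoryTest(
                cupboards -> cupboards.rawWoolGrams = 3000,    // given 3kg of raw wool
                new SimpleWoolSpinner(),                       // when the wool is spun
                cupboards -> cupboards.ballsOfSpunWool == 10); // expect 10 balls
        System.out.println(test.passes() ? "passed" : "failed");
    }
}
```

The context and outcome are lambdas here purely to keep the sketch short; in practice they would be the named, reusable classes described above.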
And they’re all reusable. They’re all clean. Best of all, they encourage us to think about the domain model; to write our application code cleanly, around that model; to understand the points at which events converge, and package the application classes appropriately.
Things I really like about this approach
- At any point, you can replace the context of an acceptance test with anything else that matches the context / outcome Verifier – a stub of a domain model, a domain backed by a test database, or a slightly different method of getting into the same state.
- The outcome of one event can be used as the context for another.
- You can even do the same thing with the events.
- The stubs or mocks for a context can be written before the real code to get to that point has been developed – so any part of a system can be produced to an interface without waiting for other parts. The Verifiers help check that the real system does actually produce the expected result. Top-down or bottom-up – doesn’t matter.
- If we have a bug which requires a bit of deviation from the simple story, we can use the same context and outcome classes as the simple story tests do, and just change the events. And as long as we can create the same context without running the application, it doesn’t matter whether we use the real application, or just build up an appropriate domain model as if we really had used the application.
- The Verifiers themselves can be implementations of interfaces – for example, one might just check the domain model; another might check that the database has changed. Which checker you use would depend on whether you’re running real code, or just stubbing the model out, or mocking a system… etc.
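To make the first and fourth points a bit more concrete, here is a small sketch in which the same verifier accepts either a context reached by really doing the shearing or a stub written before that code exists. Flock, ShearedFlock and BlackieIsShornStub are hypothetical names, and the verify method is my own guess at what a SheepIsShorn verifier might offer.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical domain interface: only what the verifier needs to see.
interface Flock {
    boolean isShorn(String sheepName);
}

// Route 1: the state is reached by actually performing the shearing.
class ShearedFlock implements Flock {
    private final Set<String> shorn = new HashSet<>();

    void shear(String sheepName) {
        shorn.add(sheepName);
    }

    public boolean isShorn(String sheepName) {
        return shorn.contains(sheepName);
    }
}

// Route 2: a stub that can be written before the shearing code exists.
class BlackieIsShornStub implements Flock {
    public boolean isShorn(String sheepName) {
        return "Blackie".equals(sheepName);
    }
}

// One verifier for both routes: it only knows about the Flock interface.
class SheepIsShornVerifier {
    private final Flock flock;

    SheepIsShornVerifier(Flock flock) {
        this.flock = flock;
    }

    boolean verify(String sheepName) {
        return flock.isShorn(sheepName);
    }
}
```

Because the verifier depends only on the interface, a test written against the stub should keep passing once the real route replaces it, provided the real code does what the stub promised.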
JBehave is not yet at version 1.0 – it needs a bit of work. The unit behaviour classes are good, but the story runner isn’t done yet. It promises to be good, and to support the framework above. I can’t wait till it’s finished, so I’ll be putting some more effort into it next week.
Acceptance tests: features vs. bugs
Last week, we had a lunchtime get-together to talk about refactoring the functional tests. This week, we had another conversation about “things that were really good” about the project. A little gem came up between them: CardSuites.
A CardSuite is a suite of acceptance tests which describe a story.
The acceptance tests run as fully as possible, from beginning to end. Now, we have a problem, which is that we’re not differentiating much between stories (ie: additions of features) and bugs (where a story has been misunderstood, or doesn’t fully work, or the story has changed slightly due to business requirements). They all get tagged with the universal “Change Request”, and they all get full acceptance tests written for them. Beginning to end. And this is a huge project – it’s no wonder our tests take 6 hours to run.
Most of the bugs, though, involve only a small deviation from existing behaviour. Writing an entire acceptance test seems like overkill. Sometimes a bug occurs because something’s not being tested – for instance, a message goes to two different systems in two different formats, and only one of them’s being checked. Sometimes it occurs because a series of events leads to the domain model being in an odd state. Sometimes it happens because a story is wrong, or being interpreted wrongly. So there are a few ways of automating the test for a bug fix which don’t involve writing an end-to-end test:
- add a check (or mini-test) to an existing story
- change an existing story to match new understanding (this doesn’t mean, say, changing it to test a more complex process and forgetting about the simple one)
- test the bit which deviates from an existing story.
The last is my favourite. Wouldn’t it be great if we could write a partial acceptance test which starts from a point we know we can get to without any bugs, and ends at a point from which we know we can proceed?
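A partial test of that kind might look roughly like the sketch below. The bug is invented for the example (shearing the same sheep twice should not record the wool twice), and ShearingRecords, PartialShearingTest and their methods are hypothetical names, not anything from our project.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical domain class, with the imagined bug already fixed.
class ShearingRecords {
    private final Set<String> shornSheep = new HashSet<>();
    private int woolGrams;

    // Shear a sheep and record the wool; a repeat shearing is ignored.
    void shear(String sheepName, int grams) {
        if (shornSheep.add(sheepName)) {
            woolGrams += grams;
        }
    }

    boolean isShorn(String sheepName) {
        return shornSheep.contains(sheepName);
    }

    int woolGrams() {
        return woolGrams;
    }
}

class PartialShearingTest {
    // Context: built straight on the domain model, no screens involved.
    // This is the point we already know we can reach without bugs.
    static ShearingRecords blackieAlreadyShorn() {
        ShearingRecords records = new ShearingRecords();
        records.shear("Blackie", 3000);
        return records;
    }

    // Only the deviation from the simple story is exercised.
    static boolean repeatShearingAddsNoWool() {
        ShearingRecords records = blackieAlreadyShorn(); // given
        records.shear("Blackie", 3000);                  // when
        return records.woolGrams() == 3000               // expect
                && records.isShorn("Blackie");
    }

    public static void main(String[] args) {
        System.out.println(repeatShearingAddsNoWool() ? "passed" : "failed");
    }
}
```

The existing story test still covers the route through the screens; this one covers only the bit that went wrong.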
Gaming the system
Lots of better people than I have written excellent posts about how, in any system, people will play it to get the best reward. It’s not just software:
- If a teacher is rewarded according to the success rate of his classes, he has less reason to encourage less able students to stay.
- If a council’s budget will be cut unless it spends the whole of it, the council will find ways to spend (waste) the money.
- If a civil servant is paid according to the number of people who work for him, he has no reason to encourage efficiency amongst his staff.
- If the money a criminal can stash away is more than he could earn in his time spent in jail, then crime pays.
And we wonder why the UK is going downhill.
These are the things which prompted me to write this, which are software-related:
- If a customer creates separate budgets for ‘bugs’ and ‘enhancements’, the owner of the enhancements budget has no impetus to keep the bug count low.
- If a customer writes a full specification for a story, the dev team have no reason to hold a conversation with the customer (story cards are placeholders for a conversation, not replacements for it).
- If a bug doesn’t affect a user’s paycheck, working hours or sanity, the user has no encouragement to report it (regardless of how it might affect users in other departments).
Games are fun. I’ll try to think of some more positive examples (or steal them from comments if you’re kind enough to let me.)
When did we stop caring?
Today, we had a lunchtime meeting about our ever-contentious functional test suite.
Who cares about the functional test suite?
- The customers really care about our functional test suite. I know, because when I was on the dev team I had pretty senior people coming up and asking me what coverage we had, and how certain I was about the system working.
- The analysts really care about our functional test suite. Most of our tests check that the stories they’ve written have actually been implemented correctly. When we show an analyst an Abbot test running through the gui, they get to see whether the way we’ve interpreted their story is actually the way they meant it to work.
- The devs, who are nominally responsible for the functional test suite, really care about it. With it in place, they can tell that their fixes aren’t causing bugs in other parts of the system.
- The support staff really care about the functional test suite. With it, we can check that bugs in one version of a system won’t be there in the next one. We can provide interim workarounds with the knowledge that they really will be a temporary measure, just until the next version comes in, and we have more confidence that there won’t be any new bugs introduced.
- The end users, whether they know it or not, really care about the functional test suite. If the functional tests are working, then the application is more likely to work correctly and their lives will be easier.
And the only reason the functional tests aren’t running, and haven’t been for a month or so, is that, big companies being the way they are, the request for a new machine for the new code branch got lost in the pipeline. A few devs thought the customers didn’t care. The other devs thought the rest of the team didn’t care. I (on support) thought the dev team didn’t care. The end-users almost certainly think that support staff don’t care. As soon as some of us got together to talk about who stopped caring, though – whose decision it was to forget about the tests – the answer was:
No one. See above.
And who wants the tests back?
Everyone!
We all really want the functional tests. And there’s got to be a spare machine around the office somewhere. Who cares enough to find one and set it up? We’re not short of volunteers. And suddenly, the ever-contentious functional tests are once again our much beloved functional tests which let us know that everything’s all right, unless there’s a little bit that needs to be fixed. (There might be a few big bits by now, but at least we’ll know, and we care enough to do it.)
So maybe, if it looks as though nobody cares, we should just talk to each other (there’s a surprise). Chances are that if it’s important, people care. Nobody ever actually stops caring, unless they get the idea that nobody else cares either – which is why it’s so important to go out there today, smile, and care about your job.
Agile: another word for unhappiness?
Reading about Neal Ford’s frustration when trying to explain Agile reminded me of this:
“The best way not to be unhappy is not to have a word for it.”
Douglas Adams, The Long Dark Tea-Time of the Soul
I guess if the only thing you’ve ever experienced is Waterfall, the illusion of control that it gives can be comforting. Agile practices don’t do much good for illusions. So if brutal honesty isn’t your thing, you’re probably better off with the old way. Maybe it’ll work, maybe it’ll fail – but at least you’ll have the comfort of never quite knowing why.

