BDD: The campaign against testing

Reading the latest posts on the Yahoo XP group, it seems that good TDDers understand that the word “test” is developer-speak for something a little more complex. “Test”, in Test Driven Development, encapsulates the idea of specification, design, verification of implementation and the ability to confidently refactor. Regression testing itself is almost a by-product of these uses of the word “test”. “Test” is often the first word we think of when doing TDD – you always start with a test – and yet we use it in defiance of its own definition.

I believe that the word “should”, and similar elements of Behaviour Driven Design vocabulary, remove the developer-speak, thus helping to bring the code closer to those people who aren’t privy to the secret language of software. This includes the customers, and anyone who isn’t familiar with TDD.

I believe that the inclusion of developer-speak such as “test”, which no longer means what it says, allows us to easily utilise developer-speak in other ways without noticing. We use other laden terms such as “mock”, “stub”, “functional test”, “…Impl”, etc. We find ourselves free to name classes after the patterns they mimic and the classes with which they interact (ScalpelSheepVisitor) instead of capturing their role succinctly (Veterinarian). We refer to documentation blindly (JIRA345Test.testHappyPath()). We don’t make readable, legible code.

Sometimes, because we can’t read our code properly, we mix up the roles and the responsibilities of our classes (ScalpelSheepVisitor shears the sheep too). We assign responsibilities to classes which don’t really have responsibility for them (SheepVisitor.getCostOf(sheep)), and we verify that these new responsibilities work (testCostOfSheepReturnedSuccessfully) instead of considering whether they’re appropriate.

For people who are already aware of these pitfalls, and who use TDD with care, mindful of what the word “test” really means – BDD will bring little that’s new into your life.

For everyone else – try thinking of these things when you code. Don’t test; think “What should this class do? What are its roles and responsibilities? What does it do to the things I’m going to give to it?” Then write a method which will describe the class’s behaviour. If you need a prefix on the name of the method to make it run then write it, but don’t type those letters until the real name has come to you.

I find that using the word “should” helps me think of the name. I find it helps me consider the roles and responsibilities of the class, and the class’s interaction with others. I find it keeps me away from that distracting, misleading, misnomer: “Test”.
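
Here's a rough sketch of what that naming might look like in plain Java, with no test framework at all. Sheep, Veterinarian and VeterinarianBehaviour are hypothetical classes invented for this example, and the methods return a boolean only to keep the sketch self-contained; with a real framework they would verify instead.

```java
// Hypothetical domain classes, invented for illustration only.
class Sheep {
    private boolean healthy;

    Sheep(boolean healthy) {
        this.healthy = healthy;
    }

    boolean isHealthy() {
        return healthy;
    }

    void markHealthy() {
        healthy = true;
    }
}

class Veterinarian {
    // The vet's one responsibility in this sketch: treat a sheep.
    void treat(Sheep sheep) {
        sheep.markHealthy();
    }
}

// Describes what a Veterinarian should do. If a runner insists on a "test"
// prefix, it can be added once the real name has been found.
class VeterinarianBehaviour {
    boolean shouldLeaveASickSheepHealthyAfterTreatment() {
        Sheep sheep = new Sheep(false);
        new Veterinarian().treat(sheep);
        return sheep.isHealthy();
    }

    boolean shouldLeaveAHealthySheepHealthy() {
        Sheep sheep = new Sheep(true);
        new Veterinarian().treat(sheep);
        return sheep.isHealthy();
    }
}
```

Read aloud, the method names say what the class is for, which testHappyPath() never does.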


Dave Astels’ BDD paper

If you’ve ever been wondering what the difference is between Test Driven Development and Behaviour Driven Design, Dave Astels has written a fantastic paper which explains it all, linked here.

It mentions JBehave, which is still in version 0.0.1 … we’re going to have to do something about that. Soon. I promise.

These are the things I’m learning to love about JBehave while messing around with it:

  • JBehave’s MockObject.proxy() is deprecated. You can now cast the mock directly. (With a minor bug to be fixed soon.)
  • The aggregate context, event and outcome classes have been renamed to Contexts, Events and Outcomes, respectively. Each one can take two or more Context, Event or Outcome objects. This makes them immeasurably easier to use.
  • The whole Context, Event and Outcome framework leads to a more OO framework for story and scenario specifications (no more procedural functional tests!).
  • Verify.that(…) is far easier to use than assertEquals(…), and you don’t have to extend anything to use it.
  • JBehave’s like that. You can take the mocks, or the behaviour framework, or the story runner, and use them all completely independently of each other. If you have a ‘test’ framework, or a mocking system, or a gui harness, which you prefer to JBehave’s, you can just plug in whichever bits of the JBehave APIs you need to fill in the gaps.

That’s quite aside from the shift in thinking that BDD encourages, which is bigger by far than our little framework.


Apologies…

Looks like LiveJournal are messing with RSS (thanks for the help, Ade). So my feeds are a bit messed at the moment. Normal service will be resumed shortly.


Refactoring – an (amateur) etymology

The Yahoo XP group has been having a fairly heated debate about the term “refactoring”. When you refactor, do you change the behaviour of your code, or not? I decided to go have a look on dictionary.com, my ever-ready friend, for a definition.

It didn’t have a definition of “refactor”. It did, though, have “factor”, particularly as relates to the phrase “factor in”:

factor in
    To figure in: We factored sick days and vacations in when we prepared the work schedule.

For me, that implies that the thing associated with the factoring in changes as a result of the new information.

Here’s where Dictionary.com says the word “factor” comes from:

[Middle English factour, perpetrator, agent, from Old French facteur, from Latin factor, maker, from facere, to make.]

We associate the root “fact” with “making things”, like factory. So sometimes I think of “refactor” as “remake”, which usually implies that there was a flaw in the original, or some new requirement, necessitating the remake. That means changing the behaviour.

I guess if you have a “refactoring phase” or you explicitly separate the idea of tidying up the code from remaking it, you might want to insist that refactoring shouldn’t change any of the code’s behaviour. I don’t think of refactoring as a process, and I haven’t found “refactoring phases” useful except as a smell that I’m not doing it enough as I’m going along. I just change things to put new behaviour in, so any “refactoring” I do is usually accompanied by a change in what the code does.

Refactoring techniques, though, are important tools that mustn’t change the behaviour of my code, because otherwise I won’t be able to control the explicit changes which I’m trying to make. I guess that’s where I see the confusion coming in. Refactoring isn’t a process. It’s just another tool, or a set of them, which lets me maintain my relationship with my code. My relationship with my code involves legibility, maintainability and predictability, and that particular behaviour should never change.

Update: An anonymous friend comments:

The ‘factoring’ in re-factoring refers to how something is made up. For example, 12 can be (6 + 6) or (4 + 4 + 4). Refactoring the code is changing it from two 6s to three 4s.

That makes sense; I wondered where the original use (ie: non-behaviour-changing transforms) came from. Aren’t words fun?
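
The commenter's arithmetic can be sketched in code: the same total, made up differently. WoolTally and its two methods are hypothetical names, invented for this example.

```java
// Hypothetical example: two ways of making up the same total.
class WoolTally {
    // Before: twelve kilograms made up as two bags of six.
    static int totalBeforeRefactoring() {
        return 6 + 6;
    }

    // After: the same twelve kilograms made up as three bags of four.
    // The structure has changed; the observable result has not.
    static int totalAfterRefactoring() {
        int total = 0;
        for (int bag = 0; bag < 3; bag++) {
            total += 4;
        }
        return total;
    }
}
```

Anything that only looks at the total can't tell which version it is talking to, and that is the non-behaviour-changing sense of the word.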


How to write shorter responses

Ben on the XP thread wrote something which made me smile. I hadn’t seen it before – a Google search attributes it to various people but I suspect it originates with Mark Twain (Update: Apparently Blaise Pascal said it first, but in French):

“I apologize for the length of this response, but I just don’t have time to write a short one.”

Whoever came up with that gem, I applaud them. Writing a short, pithy response is far harder and takes longer than splurging everything onto a page. This is how I go about writing shorter responses:

  1. I write down all the ideas which the post, email etc. gives me.
  2. I work out which of the ideas are relevant to the conversation. I delete the others.
  3. I work out which bits of text duplicate ideas, and I delete or reword until each piece of text explains one or more unique ideas clearly.
  4. I look at how much text each idea in my post takes up. If the idea isn’t actually worth spending that much text on, I delete the text.
  5. I think about how many people are going to read the post. If the remaining ideas aren’t worth their time, I delete the post.
  6. I try to get to the previous stage as quickly and ruthlessly as possible. It’s like being a pirate on the seas of your own intellect, which isn’t so bad.

Many of my deleted posts are helpful for me to write anyway as they help me think through my thoughts. I don’t always write shorter responses, even when I should.

I have no idea whether this is a short enough post or not, but if you got this far I must be doing something right.


Things we talked about at the XtC last night

(Just so I remember)

Which of your projects made you cry at night, and why?

I suspect that our abilities to define exactly why a project made us cry have been shaped by those projects which didn’t. Lack of communication between teams / leaders, an inability or unwillingness to face reality and / or measure it, and projects of poorly defined or undefined value all made it into this mix.

Use of colour in mind-mapping

Edward de Bono wrote a couple of books: Six Thinking Hats, and Six Action Shoes (one of which I own, but haven’t read yet). Successful mind-mappers sometimes use the six colours as BOIs (Basic Ordering Ideas) in their mind maps, substituting grey for white. Others in the team find it easy to understand the key. Time I read that book.

Performance tests, and how to write stories for them

  • Is it even possible?
  • Knowing when to stop helps.
  • There are different measurements of ‘performance’, eg: processing speed, bandwidth use, caching, etc.
  • It’s still possible to phrase things in terms of “As a <role>, I want <some outcome> so that <some value>”.
  • It’s harder if it’s a pet project and there’s no deadline.

Refactoring the ten commandments

A controversial discussion in which we use agile techniques to remove duplication and make the stone-etched edicts easier and more fun to use.

(XtC)


And in other worlds…

One of my poems, “A Stitch in Time”, is now online at The Fortean Bureau.


Owned!

After some struggling, we managed to get our build file working this morning. “Owned! Devs 1, Build File 0”, I thought.

It’s an odd expression, “Owned”, commonly used by gamers and people who read too much PvP. It’s used to express one’s mastery over an opponent in-game, as in, “Ha! I got you with the rocket in midair! You were everywhere! You’re nothing but pieces! I so owned you!” And so forth, though usually with numbers instead of letters.

In this instance, with the slightly less gory game of “Forcing the Build File to Build Things”, the expression “Owned” seemed surprisingly apt. We had managed to get into a position where we were servicing the build file regularly, checking it and pandering to its quirks. It owned us. We were its slaves, even though we were also its creators. We made it to give us an easier life; to provide us with a service. Somehow the roles got reversed. We’ve coerced it into doing its job for a day, but who knows what will happen tomorrow? Will it still behave? Or will it, like Homer’s monkey in the Simpsons, lie beered up on the couch, spreading its mess across the room while we run around after it fetching it peanuts every half hour?

I’m firmly on the side of test-driven tools, including Ant Builds. A complex build file often gets changed and altered more than any single piece of the code it builds, and it’s just a domain-specific language, after all. I don’t know how to test it, or what alternatives exist, but I intend to find out. Maybe you can tell me. Somewhere out there is a harness I can use to control these things.

How much of your code do you really own – and how much owns you?


OO acceptance tests (or behaviour)

I’m sure this can’t be original – most of it has come about from talking to my colleagues, or is nicked from the way JBehave works and my attempts to use it – but I wanted to write it down so that I’ll remember next time. It’s a bit long and techy. Forgive me.

JBehave introduced me to the idea:

  • Given a context
  • When these events happen
  • Expect this outcome to occur.

At the moment, our team writes the context by performing the steps which get us into that context – we run a scenario, which might be the first part of an existing story test. It might be the first part of tests for a lot of stories. This means that we’re duplicating bits of tests. It often means that the same code appears twice in different tests. It certainly takes time to run.

Here are some stories:

  • As a sheep farmer and organic wool producer, I want to shear sheep and sell the raw wool so that I can make money.
  • As a sheep farmer and organic jumper maker, I want to shear sheep, spin the wool, make jumpers and sell them so that I can make money.
  • As a sheep farmer, I want to shear sheep and record the weight of wool sheared so that I can breed the best wool producers.
  • As a sheep farmer, I want to shear sheep and record the date on which they were sheared so that I know when to shear them again.

All of them require the shearing of a sheep. I imagine a nice little screen in which you can add the details of your sheep to the database, then pick which of your sheep were sheared, what weight of wool was produced, etc. Filling in these screens for every story doesn’t give you any value in terms of code confidence the second time it’s done. Why not just do it once, then pretend that it’s been done for every other story?

So here’s my idea for cleaner, quicker acceptance testing.

Context

We write the first part of an acceptance test. The bit of the test which gets us to this point of the story is a class of its own, and implements a contextual interface. So, for instance, we might test that when we shear a sheep, weigh the wool and tell the app that we’ve put it in the cupboard, we get records that there are 3kg of wool in the cupboard. We could also, if we wanted to, check that one of our sheep was shorn on the 13th August 2005. We would name this test SheepShearingTest, and it might implement the interfaces ThreeKilogramsOfWoolAreInCupboard and BlackieIsShorn, both of which would extend the role of Context. (In JBehave there’s a Context interface). If we want to make these contexts reusable, we can always just call them RawWoolInCupboard and SheepIsShorn and either set parameters which configure them or respond to the configuration in our tests.
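
A minimal sketch of that arrangement, assuming the reusable, parameterised flavour of the contexts. The Context interface here is a hypothetical stand-in of my own, not JBehave's, and shearAndStore() merely pretends to drive the application.

```java
import java.time.LocalDate;

// Hypothetical stand-in for the role of "Context"; not JBehave's own interface.
interface Context {}

// Reusable, parameterised contexts, rather than
// ThreeKilogramsOfWoolAreInCupboard and BlackieIsShorn.
interface RawWoolInCupboard extends Context {
    int kilogramsOfRawWool();
}

interface SheepIsShorn extends Context {
    String shornSheep();

    LocalDate shornOn();
}

// The first part of the acceptance test. Once its steps have run, it can be
// handed to anything which needs either context.
class SheepShearingTest implements RawWoolInCupboard, SheepIsShorn {
    private int kilograms;
    private String sheep;
    private LocalDate date;

    // Stands in for driving the real application: shear the sheep, weigh the
    // wool and put it in the cupboard.
    void shearAndStore(String sheep, int kilograms, LocalDate date) {
        this.sheep = sheep;
        this.kilograms += kilograms;
        this.date = date;
    }

    @Override
    public int kilogramsOfRawWool() {
        return kilograms;
    }

    @Override
    public String shornSheep() {
        return sheep;
    }

    @Override
    public LocalDate shornOn() {
        return date;
    }
}
```

A later story which only cares about the wool can ask for a RawWoolInCupboard and never know that a sheep was involved.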

At this point in the acceptance test, we can check the state of the domain model – eg: check that we have 3kg of wool in our class representation of the domain. This isn’t part of the story test itself. It might even have a class of its own – RawWoolInCupboardVerifier. This verifier takes a domain model in its constructor, possibly through interfaces only, and has methods to allow you to check the bits you’re interested in. It might even be able to check what shades or species of wool are in the cupboard, what date they were put in there, etc. It’s reusable, so you can put it into lots of tests. It should also be pretty quick to run. Each test should only use the verifications it’s actually interested in.
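
Something like this, perhaps. RawWoolCupboard is a hypothetical interface onto the domain model, cut down to the one thing this verifier needs, and the method name is my own invention.

```java
// Hypothetical view of the domain model, reduced to what the verifier needs.
interface RawWoolCupboard {
    int kilogramsOfRawWool();
}

// Reusable checker: takes the domain model in its constructor, through an
// interface only, and offers one method per thing a test might care about.
class RawWoolInCupboardVerifier {
    private final RawWoolCupboard cupboard;

    RawWoolInCupboardVerifier(RawWoolCupboard cupboard) {
        this.cupboard = cupboard;
    }

    void verifyKilograms(int expected) {
        int actual = cupboard.kilogramsOfRawWool();
        if (actual != expected) {
            throw new AssertionError(
                "Expected " + expected + "kg of raw wool in the cupboard, found " + actual + "kg");
        }
    }
}
```

Because the verifier only sees an interface, a one-line stub such as `RawWoolCupboard stub = () -> 3;` is enough to exercise it.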

Event

The second part of our acceptance test deals with the events which happen – eg: spin the wool. This, we just ‘do’. Whatever we use to do it implements interfaces – for instance, WoolSpinner. The implementing class is part of the production code. If we don’t have code to do this, we can write it as we complete the story, unit testing as appropriate.

Outcome

Running the events on the context gives us an outcome – 10 balls of wool in the spun wool cupboard. We write another checker for this. It might also be in its own class, which implements an interface of its own – SpunWoolInCupboardVerifier. You get the idea. (In JBehave, we also have an Outcome interface).

Now, when we construct our test, we can dependency-inject it with the contexts, the bits of code which actually do the events, and the outcome.
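
As a sketch of that construction, here is one way the pieces might be wired together. The Context, Event and Outcome interfaces below are hypothetical stand-ins redefined so the example stands alone (they are not JBehave's), and Cupboards, HandSpinner and the two-balls-per-kilogram figure are invented for illustration.

```java
// Hypothetical domain model shared by the three parts of the scenario.
class Cupboards {
    int kilogramsOfRawWool;
    int ballsOfSpunWool;
}

// Stand-ins for the three roles; not JBehave's own interfaces.
interface Context { void setUp(Cupboards cupboards); }
interface Event { void occurIn(Cupboards cupboards); }
interface Outcome { void verify(Cupboards cupboards); }

// Production-side role: whatever actually spins the wool.
interface WoolSpinner { void spin(Cupboards cupboards); }

class HandSpinner implements WoolSpinner {
    private static final int BALLS_PER_KILOGRAM = 2; // invented figure

    @Override
    public void spin(Cupboards cupboards) {
        cupboards.ballsOfSpunWool += cupboards.kilogramsOfRawWool * BALLS_PER_KILOGRAM;
        cupboards.kilogramsOfRawWool = 0;
    }
}

// The test itself knows nothing except the three roles it is given.
class Scenario {
    private final Context context;
    private final Event event;
    private final Outcome outcome;

    Scenario(Context context, Event event, Outcome outcome) {
        this.context = context;
        this.event = event;
        this.outcome = outcome;
    }

    Cupboards run() {
        Cupboards cupboards = new Cupboards();
        context.setUp(cupboards);   // given
        event.occurIn(cupboards);   // when
        outcome.verify(cupboards);  // expect
        return cupboards;
    }
}

class SpinTheWoolScenario {
    static Scenario build() {
        Context rawWoolInCupboard = cupboards -> cupboards.kilogramsOfRawWool = 5;
        Event spinTheWool = new HandSpinner()::spin;
        Outcome tenBallsOfSpunWool = cupboards -> {
            if (cupboards.ballsOfSpunWool != 10) {
                throw new AssertionError("Expected 10 balls, found " + cupboards.ballsOfSpunWool);
            }
        };
        return new Scenario(rawWoolInCupboard, spinTheWool, tenBallsOfSpunWool);
    }
}
```

Calling `SpinTheWoolScenario.build().run()` runs the whole thing; swapping any one of the three arguments leaves the other two untouched.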

And they’re all reusable. They’re all clean. Best of all, they encourage us to think about the domain model; to write our application code cleanly, around that model; to understand the points at which events converge, and package the application classes appropriately.

Things I really like about this approach

  • At any point, you can replace the context of an acceptance test with anything else that matches the context / outcome Verifier – a stub of a domain model, a domain backed by a test database, or a slightly different method of getting into the same state.
  • The outcome of one event can be used as the context for another.
  • You can even do the same thing with the events.
  • The stubs or mocks for a context can be written before the real code to get to that point has been developed – so any part of a system can be produced to an interface without waiting for other parts. The Verifiers help check that the real system does actually produce the expected result. Top-down or bottom-up – doesn’t matter.
  • If we have a bug which requires a bit of deviation from the simple story, we can use the same context and outcome classes as the simple story tests do, and just change the events. And as long as we can create the same context without running the application, it doesn’t matter whether we use the real application, or just build up an appropriate domain model as if we really had used the application.
  • The Verifiers themselves can be implementations of interfaces – for example, one might just check the domain model; another might check that the database has changed. Which checker you use would depend on whether you’re running real code, or just stubbing the model out, or mocking a system… etc.
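
To illustrate the last point, here is a minimal sketch of one verifier role with two implementations. All the names are hypothetical, and a plain Map stands in for the database so that the example runs on its own.

```java
import java.util.Map;

// Hypothetical verifier role: one question, several ways of answering it.
interface SpunWoolInCupboardVerifier {
    void verifyBalls(int expected);
}

// Hypothetical domain model.
class SpunWoolCupboard {
    private int balls;

    void add(int balls) {
        this.balls += balls;
    }

    int balls() {
        return balls;
    }
}

// Checks the in-memory domain model; suitable when the model is stubbed out.
class DomainModelVerifier implements SpunWoolInCupboardVerifier {
    private final SpunWoolCupboard cupboard;

    DomainModelVerifier(SpunWoolCupboard cupboard) {
        this.cupboard = cupboard;
    }

    @Override
    public void verifyBalls(int expected) {
        if (cupboard.balls() != expected) {
            throw new AssertionError("Domain model holds " + cupboard.balls() + " balls");
        }
    }
}

// Checks what was persisted. The Map is a stand-in for a real database table.
class DatabaseVerifier implements SpunWoolInCupboardVerifier {
    private final Map<String, Integer> table;

    DatabaseVerifier(Map<String, Integer> table) {
        this.table = table;
    }

    @Override
    public void verifyBalls(int expected) {
        int actual = table.getOrDefault("spun_wool_balls", 0);
        if (actual != expected) {
            throw new AssertionError("Database holds " + actual + " balls");
        }
    }
}
```

The test asks for a SpunWoolInCupboardVerifier and whoever builds the test decides which one it gets.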

JBehave is not yet at version 1.0 – it needs a bit of work. The unit behaviour classes are good, but the story runner isn’t done yet. It promises to be good, and to support the framework above. I can’t wait till it’s finished, so I’ll be putting some more effort into it next week.


Acceptance tests: features vs. bugs

Last week, we had a lunchtime get-together to talk about refactoring the functional tests. This week, we had another conversation about “things that were really good” about the project. A little gem came up between them: CardSuites.

A CardSuite is a suite of acceptance tests which describe a story.

The acceptance tests run as fully as possible, from beginning to end. Now, we have a problem, which is that we’re not differentiating much between stories (ie: additions of features) and bugs (where a story has been misunderstood, or doesn’t fully work, or the story has changed slightly due to business requirements). They all get tagged with the universal “Change Request”, and they all get full acceptance tests written for them. Beginning to end. And this is a huge project – it’s no wonder our tests take 6 hours to run.

Most of the bugs, though, involve only a small deviation from existing behaviour. Writing an entire acceptance test seems like overkill. Sometimes a bug occurs because something’s not being tested – for instance, a message goes to two different systems in two different formats, and only one of them’s being checked. Sometimes it occurs because a series of events leads to the domain model being in an odd state. Sometimes it happens because a story is wrong, or being interpreted wrongly. So there are a few ways of automating the test for a bug fix which don’t involve writing an end-to-end test:

  • add a check (or mini-test) to an existing story
  • change an existing story to match new understanding (this doesn’t mean, say, changing it to test a more complex process and forgetting about the simple one)
  • test the bit which deviates from an existing story.

The last is my favourite. Wouldn’t it be great if we could write a partial acceptance test which starts from a point we know we can get to without any bugs, and ends at a point from which we know we can proceed?
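
A rough sketch of what such a partial test might look like, assuming the starting point can be built directly in the domain model. WoolRecords, SecondShearingBehaviour and the bug being imagined (a second shearing of the same sheep) are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical domain model: kilograms of wool recorded per sheep.
class WoolRecords {
    private final Map<String, Integer> kilogramsBySheep = new HashMap<>();

    void recordShearing(String sheep, int kilograms) {
        // Adds to any weight already recorded for this sheep.
        kilogramsBySheep.merge(sheep, kilograms, Integer::sum);
    }

    int kilogramsFrom(String sheep) {
        return kilogramsBySheep.getOrDefault(sheep, 0);
    }
}

class SecondShearingBehaviour {
    // The known-good starting point, built directly rather than by running
    // the whole story from the beginning.
    static WoolRecords blackieAlreadyShornOnce() {
        WoolRecords records = new WoolRecords();
        records.recordShearing("Blackie", 3);
        return records;
    }

    static boolean shouldAddASecondShearingToBlackiesTotal() {
        WoolRecords records = blackieAlreadyShornOnce(); // start here
        records.recordShearing("Blackie", 2);            // only the deviation
        return records.kilogramsFrom("Blackie") == 5;    // stop here
    }
}
```

Nothing before the first shearing or after the total is exercised, so the test runs in the time it takes to do the one thing that's different.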
