
May 14, 2008

If you’re more interested in the results than the conversation, skip to the summary.

Recently, I’ve realised that a lot of BDD has been very dev-focused. There’s a reason for that. Dan’s a dev, I’m a dev and most of the people who helped to evolve BDD are devs.

BDD’s about the interaction between the business and the technical people in software. I want to know how it’s working from the other side. So, I’ve been learning more about the customers, and particularly, more about BAs and their role. I recently had the good fortune to run into Angela Martin at Agile India, where she taught me the art of grouping stories into coherent releases, while still keeping the stories granular enough that progress can be measured and delivery made regular and predictable.

Last night at XtC I met up with Chris Matts (of Real Options fame) who told me we’re doing it all wrong. “It’s not an art, it’s a process,” he said. “You’re focused on the stories; that’s why you think it’s an art form. The focus should be the value you’re releasing.”

“Right”, I say. “So, when we originally planned the stories that would deliver the value, we knew what would contribute to that value. But we’ve lost sight of that. Changes have been made. As devs, all we have are those narratives – the ‘As a <role>, I want <a feature>, so that <some benefit is achieved>’. So we need to work out which of the benefits add up to the value we’re aiming for. That helps us work out which stories we should try to complete for a release. The QAs are helping us work out what’s ready; the BAs are helping us work out what’s important.”

“That’s the problem,” says Chris. “You’re putting the role first. It’s the value that’s most important. Try this: In order to <deliver some value>, as a <role>, I want <some feature>. Instead of working out why people want a feature, and whether it contributes to the value, now we’re working out who needs a feature, then assigning the story. So our stories are much more focused. If all the stories that contribute to a value are ready, we release.”

I guess if we get to the point where we can release with only some of the stories ready, we didn’t break down our values into granular enough elements. And when we want to work out what goes in a release, it’s easy. The word ‘release’ is more meaningful. There’s some untapped money out there – some market share, some cost saving, some battle against a competitor. All the features we produce go towards releasing that value for our customers to use – and it’s the value we’re releasing, not the features.

Thanks, Chris, for that.

I need to also thank Dave for giving me a better understanding of what a value is.

In summary: my preferred narrative now reads:

In order to <achieve some value>
As a <role>
I want <some feature>.

Because, in order for my work to have any meaning, as a dev, I want to know why you want it.

Edit: Chris said he posted this to the Kanban group on Yahoo, but they aren’t responsible for it. Sorry, Chris, didn’t mean to misappropriate! See you on there soon.

Apr 21, 2008

BDD for TDDers


Anthony Bailey and I had a conversation over email about what good, experienced TDDers might get out of BDD.

If you’ve been wondering what all the fuss is about, maybe this will help! Thanks, Anthony, for tidying the conversation up so nicely.

Apr 17, 2008

Peter Bell and I had a great conversation over Skype yesterday, which he’s kindly blogged. We covered test names, and also talked about how to develop libraries using BDD. Again, this is how I do things; it’s not necessarily the only way.

Peter’s mentioned JBehave 2, so the secret’s out now – yes, it’s in progress. It’s not publicly available because it’s just a spike at the moment. We’re starting afresh, learning some of the lessons that made JBehave 1.0 hard to use, taking advantage of JUnit 4’s features, and drawing heavily on the success of RSpec.

Our goals for JBehave 2 include:

  • no internal support for mocking – should be able to use any of the libraries
  • extends JUnit – right-click and run both behaviours and scenarios
  • uses Hamcrest’s matchers
  • uses plain-text scenarios à la RSpec

Here’s a sneak preview of the spike so far.

We have a scenario file, i_can_toggle_a_cell:

Given a 5 by 5 game
When I toggle the cell at (2, 3)
Then the grid should look like
.....
.....
.....
..X..
.....
When I toggle the cell at (2, 4)
Then the grid should look like
.....
.....
.....
..X..
..X..
When I toggle the cell at (2, 3)
Then the grid should look like
.....
.....
.....
.....
..X..

We have a small class in the same package to run this, called ICanToggleACell.java (the Scenario class is a JUnit test):

package com.lunivore.gameoflife;

import org.jbehave.scenario.Scenario;

import com.lunivore.gameoflife.steps.GridSteps;

public class ICanToggleACell extends Scenario {

  @SuppressWarnings("unchecked")
  public ICanToggleACell() {
    super(new GridSteps());
  }
}

And we have steps defined thus:


package com.lunivore.gameoflife.steps;

import static org.hamcrest.CoreMatchers.equalTo;
import static org.jbehave.Ensure.ensureThat;

import org.jbehave.scenario.annotations.Given;
import org.jbehave.scenario.annotations.Then;
import org.jbehave.scenario.annotations.When;
import org.jbehave.scenario.steps.Steps;

import com.lunivore.gameoflife.domain.Game;
import com.lunivore.gameoflife.view.string.StringRenderer;

public class GridSteps extends Steps {

  private Game game;
  private StringRenderer renderer;

  @Given("a $width by $height game")
  public void theGameIsRunning(int width, int height) {
    game = new Game(width, height);
    renderer = new StringRenderer();
    game.setObserver(renderer);
  }

  @When("I toggle the cell at ($column, $row)")
  public void iToggleTheCellAt(int column, int row) {
    game.toggleCellAt(column, row);
  }

  @Then("the grid should look like $grid")
  public void theGridShouldLookLike(String grid) {
    ensureThat(renderer.asString(), equalTo(grid));
  }

 

}

I’m excited that this actually works. The next step is to get appropriate error messages when the scenario fails! We’re working hard to get this out to you as soon as we can; watch this space.

Mar 05, 2008

I’ve just been reading the debate between Bob Martin and Jim Coplien on InfoQ, centred around Bob’s assertion that “nowadays it is irresponsible for a developer to ship a line of code he has not executed in a unit test.”

I have to confess… despite my desire for BDD, I don’t always do automated tests for everything. The place I’m most likely to skip automated tests is when something shows up in a GUI.

That might strike you as odd, if you know that BDD’s outside-in starts from the GUI and works downwards. It’s probably less odd if you also know that automated testing is no substitute for manual testing. Here are a couple of things for which I haven’t written a test.

From Hellbound, the Tetris game in JBehave’s examples:

public class AcceleratingHeartbeatBehaviour extends UsingMiniMock {

    public void shouldBeatAfterElapsedTime() throws Exception {
        <Test code exists>
    }

    public void shouldBeatMoreQuicklyWithEachBeat() {
        // No way of ensuring this with automation.
    }

    public void shouldStopAnyExistingTimerThreadsBeforeStarting() {
        // No way of ensuring this with automation.
    }

    public void shouldMoveImmediatelyToNextWaitingPhaseWhenSkippingABeat() {
        // Nor this.
    }

    public void shouldNotBeatAfterBeingStopped() throws Exception {
        <There's some test code for this one too>
    }
}

It’s not entirely true that there’s no way of ensuring these things with automation (and the main drawback here was the time it took, which spoils the < 5 second regression test suite). One could extract out clock classes, etc. But then, you're sacrificing readability and ease of design - a single class is enough. Why would you do this, anyway? The only way of telling whether the timing of a game is fast enough to be challenging but slow enough to be feasible is to play it!
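For the curious, extracting the clock might look something like the sketch below. Everything in it is an assumption made for illustration: the names `Pause`, `SketchHeartbeat` and `RecordingPause` are hypothetical, and the fixed acceleration factor is a guess at one way a heartbeat could speed up, not what Hellbound actually does.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical seam: the heartbeat asks this to wait instead of sleeping itself. */
interface Pause {
    void forMillis(long millis);
}

/** Sketch of a heartbeat whose interval shrinks by a fixed factor after each beat. */
final class SketchHeartbeat {
    private final Pause pause;
    private final double acceleration; // assumed: 0.5 halves the wait each time
    private long intervalMillis;

    SketchHeartbeat(Pause pause, long initialIntervalMillis, double acceleration) {
        this.pause = pause;
        this.intervalMillis = initialIntervalMillis;
        this.acceleration = acceleration;
    }

    /** Waits for the current interval, notifies the listener, then speeds up. */
    void beat(Runnable listener) {
        pause.forMillis(intervalMillis);
        listener.run();
        intervalMillis = Math.round(intervalMillis * acceleration);
    }
}

/** Test double that records the requested waits without actually waiting. */
final class RecordingPause implements Pause {
    final List<Long> waits = new ArrayList<>();

    @Override
    public void forMillis(long millis) {
        waits.add(millis);
    }
}
```

An automated check can then inspect `waits` instantly instead of sleeping, at the cost of the extra interface and classes. It still says nothing about whether the game feels right to play.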

It's much the same with this, from the same Game of Life and GameFrame class as the last post:

public class GameFrameBehaviour extends Behaviour {

    @Test
    public void shouldHaveAButtonForTheNextGeneration() throws Exception {
        <See last post for this code>
    }

    @Test
    public void shouldCreateCellsInGameCorrespondingToMouseClicks() {
        // I'm not putting any examples here, because in reality the
        // only way to tell if this works is to test it manually!
    }

}

I tried writing an automated test to check that mouse-clicks corresponded to cells appearing in the grid. I got it wrong; the automated test passed but the cells were appearing in the wrong place. I made the same mistakes with Swing’s coordinate system in both the test and the code. This time round I can’t even be bothered; it’s easier just to use it, check that I got it right, and never change the behaviour of the grid GUI again (at least, not without manual retesting).
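To make the trap concrete, here is a minimal sketch of the kind of mapping involved. The `CellLocator` class and its fixed square cell size are hypothetical, not the GameFrame code. The only Swing fact it relies on is that component coordinates start at the top-left corner, with x growing rightwards and y growing downwards.

```java
/** Hypothetical helper: maps a mouse position inside the grid panel to a cell. */
final class CellLocator {
    private final int cellSizeInPixels;

    CellLocator(int cellSizeInPixels) {
        this.cellSizeInPixels = cellSizeInPixels;
    }

    /** x is measured from the left edge, so it selects the column. */
    int columnAt(int x) {
        return x / cellSizeInPixels;
    }

    /** y is measured from the top edge, so it selects the row. */
    int rowAt(int y) {
        return y / cellSizeInPixels;
    }
}
```

A test written by someone who has x and y the wrong way round would agree with code written the same way, which is how the automated test passed while the cells landed in the wrong place.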

There are other places where I’ll skip unit tests too – getters, mostly. I’ll even skip automated system-level tests if there isn’t an appropriate harness. I’ve done this in my own code; I’ve done this at client sites.

Do I feel guilty or unprofessional?

No.

Does it stop me describing how the class behaves?

No. I usually add empty methods to describe an aspect of behaviour that I don’t want to test. At least they get read; they also remind me that I have a missing test, and should use careful inspection if I change the code.

Does it mean that I can get away without testing it altogether?

No. I have to test it manually, from the GUI, instead.

But then, you’re doing that anyway… aren’t you? Because I think Bob would agree on one point with me: nowadays it is irresponsible for a developer to ship a line of code he has not tested, and all your tests are worthless if it doesn’t actually work.

Nov 19, 2007

After my last post, Negin and I were quite pleased that we’d got as far as we had.

So was our Business Analyst. “So, this story that was estimated at 3 days,” she said. “Can I say it’s only taken one?”

“No! We’re not done yet!” I protested.

“Really? It looked like you were almost done on Friday… what happened?”

Oh, well. At least we know it looks good.

Nov 16, 2007

Today, Negin and I paired on a brand new piece of work.

“We’ll need to create this domain object,” she said, “and a database table.”

“I don’t want to do that,” I said. “I’d rather fix the stuff that’s broken.”

She looked puzzled. “What do you mean? We haven’t written any code yet.”

“Well, we know that if you go to the URL, you should see the form. But when I go there I get a 404 error.”

“Well, yes. We haven’t written any code yet.”

“So, it’s broken. It doesn’t work yet. We should fix that.”

So we wired up the container and knocked out a controller. We restarted the server and refreshed the URL. Spring told us we had left out a couple of things. We fixed those.

Negin tapped something into the template and refreshed the URL again. “We have a page. It says HELLOOOOO! across the top. Now what?”

“Well, we got rid of the 404 error. But the page doesn’t look right.”

“Of course not. We haven’t written the form yet.”

“We should fix that.”

We wrote the form. It didn’t look right, so we added the styling. Our business analyst peered over our shoulders at what we were doing. “Looks like you’re doing well. Why doesn’t the drop-down have my data in?”

Negin said, “You’re right. We should fix that. This is fun!”

“It is,” I said. “Don’t you just love that we get paid for this?”

Nov 14, 2007

Crazy like a fox.


At my current client, everyone loves BDD, and everyone starts their tests with the word ‘should’, describing the behaviour of the associated class. I’m currently looking at this code:

public class PrimaryMixingIteratorTest extends EasyMockObjectTestBase{
    public void testShouldIterateLikeAFox() throws Exception {
        //...
    }
}
public class SecondaryMixingIteratorTest extends EasyMockObjectTestBase{
    public void testShouldIterateLikeABadger() throws Exception {
        //...
    }
}

Once I’ve remembered how foxes and badgers iterate, this code might make more sense to me. Remind me to run that ‘should is not a silver bullet’ brown bag soon…

Update: If you tied a fox and a badger together and dropped them into the corner of a square pond, they’d make a splash. Imagine that splashes happened in squares instead of circles, and that the quarter of the concentric square formed by the fox and badger started at the top-right then went to the bottom-right then bottom-left. Now imagine that the fox shouts out which row the splash happens in, and the badger shouts out the columns.

It’s a way of combining the values of two infinitely-sized lists for an arbitrary number of combinations, without loading the lists into memory. Makes so much more sense. Hold on, I’m getting a phone call from the RSPCA…
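Here is my reading of the splash description as a sketch, not the client's code. `SquareShellIterator` is a hypothetical name, and I'm assuming the corner of the pond is row 0, column 0. Each splash runs down the right-hand edge of its square and then back along the bottom edge. The iterator only hands out index pairs, so neither list ever has to be held in memory.

```java
import java.util.Iterator;

/** Walks (row, column) index pairs in ever-larger square shells. */
final class SquareShellIterator implements Iterator<int[]> {
    private int shell = 0; // which concentric square we are on
    private int step = 0;  // position within the current shell, 0 .. 2 * shell

    @Override
    public boolean hasNext() {
        return true; // both lists are treated as infinite
    }

    @Override
    public int[] next() {
        // First leg: down the right-hand edge, top-right to bottom-right.
        // Second leg: along the bottom edge, bottom-right to bottom-left.
        int row = Math.min(step, shell);
        int column = step <= shell ? shell : 2 * shell - step;
        if (step == 2 * shell) {
            shell++; // this shell is finished; move outwards
            step = 0;
        } else {
            step++;
        }
        return new int[] {row, column};
    }
}
```

The first nine pairs cover the whole 3 by 3 corner: (0,0), then (0,1) (1,1) (1,0), then (0,2) (1,2) (2,2) (2,1) (2,0). Any given combination turns up after a finite number of steps, however long the lists are.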

Oct 25, 2007

We write scenarios to help us know when we’re done… not

I’ve been quite happily espousing the idea that we write scenarios in order to help us (devs) know when we’re done, which helps drive our design, and everything else is a lucky by-product.

Dan referred in our OOPSLA tutorial to something he calls ‘Beer Driven Development’; how soon can I get down the pub? As a business analyst, it helps to assume that all devs are horribly lazy and want beer, so if you haven’t specified the scope of the problem, you won’t get that bit of the solution you’re after. Conversely, if you do include some behaviour in the scope of a scenario, you’ll most likely get the simplest thing that could possibly work. This is good, because it cuts down the number of bugs.

There are a couple of other reasons for writing scenarios: regression, and documentation. It turns out (from feedback at OOPSLA) that a lot of people care about automated regression tests. It’s a nice way for devs to know that we’re still done, and as long as the QAs have confidence in the automated tests, they don’t have to manually cover that ground. (In practice I’ve found that they usually do, just to make sure that the tests themselves aren’t faulty, but less often than they would.)

BDD, it turns out, lends itself quite nicely to regression testing.

A scenario is a set of steps

Given <some context>, when <some event occurs> then <some outcomes should result>.

There may be more than one context, event and outcome. It may also be useful to exercise the behaviour a few times in the same scenario, so perhaps there will be more than one event-outcome set.
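As a rough illustration of that shape, here is a tiny fluent builder. `SketchScenario` and its methods are hypothetical names of mine and not the API of JBehave or RSpec. It simply keeps the steps in order, so a scenario can hold several contexts and more than one event-outcome set.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;

/** Hypothetical builder: an ordered list of contexts, events and outcomes. */
final class SketchScenario {
    private final List<Runnable> steps = new ArrayList<>();

    /** Adds a context that sets up the world. */
    SketchScenario given(Runnable context) {
        steps.add(context);
        return this;
    }

    /** Adds an event that exercises the behaviour. */
    SketchScenario when(Runnable event) {
        steps.add(event);
        return this;
    }

    /** Adds an outcome; the scenario fails with the description if it does not hold. */
    SketchScenario then(BooleanSupplier outcome, String description) {
        steps.add(() -> {
            if (!outcome.getAsBoolean()) {
                throw new AssertionError(description);
            }
        });
        return this;
    }

    /** Runs every step in the order it was added. */
    void run() {
        steps.forEach(Runnable::run);
    }
}
```

A scenario that toggles a cell on and then off again would chain `given`, `when`, `then`, `when`, `then` before calling `run()`.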

Scenarios are regression tests once the code that makes them run has been written

Sometimes, though, we like to change scenarios or acceptance tests for the specific purpose of regression testing. This is because the frameworks that we have to test the UI are horribly slow.

I’ve worked on one project where, while running the suite of 500 FitNesse scenarios, the CPU buzzed at 100% and the whole suite took 5 minutes to run. A lot of technical jiggery-pokery was needed to get this to happen, but if you can do it, you don’t need to differentiate between scenarios and regression tests (the one becomes the other as soon as the code that makes it run is in place). There are some frameworks which are faster than others – Simon Stewart’s WebDriver, and the tiny Swing harness packaged with JBehave, which I’m still incredibly proud of – but generally these fast frameworks aren’t mature or full-featured yet. And, even if they are, once you start getting above 500 scenarios you may still want to think about cutting down the length of time it takes to get feedback from your suite.

Acceptance tests are hard to read, which makes it hard to merge them

I’ve seen (and written) a number of acceptance tests which look something like this (only they weren’t using pretty fluent interfaces, so they were even worse):

I can refund a customer’s purchase

TillScreen.addItem("Panatachi Television 3X").pay("2000").with(new CreditCard("2345123478902468")).done();
int originalNoOfTelevisionsInStock = StockScreen.count("Panatachi Television 3X");
Money originalBalance = FinanceScreen.getBalance();

TillScreen.findMostRecentBill("2345123478902468").refundItem("Panatachi Television 3X").with(new CreditCard("2345123478902468")).done();

assertThat(StockScreen.count("Panatachi Television 3X"), eq(originalNoOfTelevisionsInStock));
assertThat(FinanceScreen.getBalance(), eq(originalBalance - 2000));

I can replace a customer’s purchase

TillScreen.addItem("Panatachi Television 3X").pay("2000").with(new CreditCard("2345123478902468")).done();
int originalNoOfTelevisionsInStock = StockScreen.count("Panatachi Television 3X");
Money originalBalance = FinanceScreen.getBalance();

TillScreen.findMostRecentBill("2345123478902468").replace("Panatachi Television 3X").done();

assertThat(StockScreen.count("Panatachi Television 3X"), eq(originalNoOfTelevisionsInStock - 1));
assertThat(FinanceScreen.getBalance(), eq(originalBalance));

Did you actually read those, or did you just skip to this line? Chances are that you just glossed over them. I would. BDD helps clear this up when the scenarios are written, so instead of this we’d have:

I can refund a customer’s purchase

Given that a customer purchased a Panatachi Television 3X
and a Panatachi Television 3X costs $2000
and he paid with credit card number 2345123478902468
When I search for the most recent bill with credit card number 2345123478902468
and I refund the Panatachi Television 3X
Then the stock of Panatachi Television 3X should be unchanged
and we should have $2000 less.

I can replace a customer’s purchase

Given that a customer purchased a Panatachi Television 3X
and a Panatachi Television 3X costs $2000
and he paid with credit card number 2345123478902468
When I search for the most recent bill with credit card number 2345123478902468
and I replace the Panatachi Television 3X
Then the stock of Panatachi Television 3X should be one less
and we should have the same balance.

(If you think it’s impossible to write code this way, take a look at the latest RSpec features, brought to you by David Chelimsky. I’m completely blown away by this, and can’t wait to convert all my breakable toy scenarios to RSpec running on JRuby. Anyway…)

It’s easier to merge scenarios

With BDD, it’s fairly easy to see which contexts, events and outcomes are common between the two scenarios, and we can imagine a situation in which we could exercise both features associated with these stories.

You can see that the contexts of both scenarios start with a customer owning a previously-purchased Panatachi television. In most frameworks, this is accomplished by running the scenario in which a customer purchases a television; we’re using a scenario as the context for another scenario. Running scenarios takes time! Wouldn’t it be great if we could just run this once?

To make these into one scenario, we need the outcome from one to resemble the context of the other; then we don’t need to set up that context again. In the same way that we’d refactor code to look similar, we can refactor the context of the first to be identical to the outcome of the second.

Interestingly, as I come to do this, I realise we’re missing a couple of contexts! They’re implicit in both scenarios – what’s the actual balance? and the actual stock levels? I’ve also swapped the scenarios around, because that’s the order they’ll happen in now:

I can replace a customer’s purchase

Given that a customer purchased a Panatachi Television 3X
and our balance is $23150
and we have 10 Panatachi Television 3Xs in stock
and he paid with credit card number 2345123478902468
When I search for the most recent bill with credit card number 2345123478902468
and I replace the Panatachi Television 3X
Then we should have 9 Panatachi Television 3Xs in stock
and our balance should be $23150.

I can refund a customer’s purchase

Given that a customer purchased a Panatachi Television 3X
and our balance is $23150
and we have 9 Panatachi Television 3Xs in stock
and a Panatachi Television 3X costs $2000
and he paid with credit card number 2345123478902468
When I search for the most recent bill with credit card number 2345123478902468
and I refund the Panatachi Television 3X
Then we should have 9 Panatachi Television 3Xs in stock
and our balance should be $21150

So, now we can merge the two scenarios:

I can replace or refund a customer’s purchase

Given that a customer purchased a Panatachi Television 3X
and our balance is $23150
and we have 10 Panatachi Television 3Xs in stock
and he paid with credit card number 2345123478902468
When I search for the most recent bill with credit card number 2345123478902468
and I replace the Panatachi Television 3X
Then we should have 9 Panatachi Television 3Xs in stock
and our balance should be $23150.
When I search for the most recent bill with credit card number 2345123478902468
and I refund the Panatachi Television 3X
Then we should have 9 Panatachi Television 3Xs in stock
and our balance should be $21150

Trying to do this with code, instead of English, sucks. It’s hard to see the difference between the setup of a context and the occurrence of an event. It’s a lot easier if the code is written in English, to the extent that most people will skip that second step and move straight to the third.

Even so, we’ve lost some of the emphasis. We don’t want to know that the balance is $23150. We want to know that the balance is the same. The implicitness of the two missing contexts was actually quite beautiful. Wouldn’t it be lovely if we could bring those back?

I want some new features!

The scenario I really want looks like this:

I can replace or refund a customer’s purchase

Given that a customer purchased a Panatachi Television 3X
and he paid with credit card number 2345123478902468
When I search for the most recent bill with that credit card number
and I replace the Panatachi Television 3X
Then we should have 1 less Panatachi Television 3X in stock
and our balance should be the same
When I search for the most recent bill with that credit card number
and I refund the Panatachi Television 3X
Then we should have 1 less Panatachi Television 3X in stock
and our balance should be $2000 less.

So, as a scenario writer (aka BA for happy paths, or QA for full regression, or some combination with a dev to help turn it into code):

  • I want outcomes to be able to record the state of the world before the events that will result in those outcomes, so that I can verify comparative outcomes and retain meaningful emphasis (This is possible in JBehave, but horribly ugly and ties the Givens to the Outcomes – I don’t want to do that!)
  • I want all scenario steps to be aware of the world in which they are running, so that they can share aspects of the world instead of duplicating them (‘that credit card number’ instead of ‘2345123478902468’. JBehave already does this, but I’m not sure if RSpec does. Will ask…)
  • I want to be able to associate a scenario with more than one story, so that I can change the scenarios for several stories into a single regression test (RSpec lets you tag things; JBehave doesn’t. Yet.)
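The first wish in that list could be sketched roughly as follows. None of this is a JBehave or RSpec feature; `World`, `ChangedBy` and `ComparativeRunner` are hypothetical names. I've also assumed that the snapshot is taken just before each event, which is only one possible reading of ‘before the events’. Each outcome records the value it cares about, the event fires, and the outcome then checks the difference, so no Given has to spell out the starting balance.

```java
import java.util.function.Consumer;
import java.util.function.ToIntFunction;

/** Hypothetical stand-in for the system that the scenario steps act on. */
final class World {
    int stock;
    int balance;

    World(int stock, int balance) {
        this.stock = stock;
        this.balance = balance;
    }
}

/** An outcome that compares a reading after the event with the same reading before it. */
final class ChangedBy {
    private final ToIntFunction<World> reading;
    private final int expectedChange;
    private int before;

    ChangedBy(ToIntFunction<World> reading, int expectedChange) {
        this.reading = reading;
        this.expectedChange = expectedChange;
    }

    void recordBefore(World world) {
        before = reading.applyAsInt(world);
    }

    boolean holds(World world) {
        return reading.applyAsInt(world) - before == expectedChange;
    }
}

final class ComparativeRunner {
    /** Snapshots every outcome, fires the event, then verifies every outcome. */
    static boolean run(World world, Consumer<World> event, ChangedBy... outcomes) {
        for (ChangedBy outcome : outcomes) {
            outcome.recordBefore(world);
        }
        event.accept(world);
        for (ChangedBy outcome : outcomes) {
            if (!outcome.holds(world)) {
                return false;
            }
        }
        return true;
    }
}
```

With this, ‘our balance should be the same’ becomes `new ChangedBy(w -> w.balance, 0)` and ‘our balance should be $2000 less’ becomes `new ChangedBy(w -> w.balance, -2000)`, whatever the balance happened to be beforehand.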

As a regression tester (aka dev) being asked to save QA some work:

  • I want to be able to run scenarios as well as stories (JBehave doesn’t do this yet).
  • I want to be able to associate a scenario with several features, benefits, etc., so that I can test a feature or check that a benefit is still being provided (see tagging, above.)
  • I want to have the option to check for the outcome of a scenario being used as the context of another scenario, so that I know whether I need to run them or not. (Since a scenario might be used as the context for a scenario that’s being used as a context, etc., at some point we are likely to find ourselves in a world we can use. Dan says this is a ‘backwards-chaining rules engine’ and is quite excited by the idea. He also says this is incredibly dangerous and will only work for Given/When/Then, not Given/When/Then/When/Then, so it should be used with some thought.)
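The last wish in that list might be sketched like this, with all of Dan's caveats still applying. `ChainedScenario` is a hypothetical class and not something JBehave provides. A scenario named as a context is run only if its outcome doesn't already hold, and the same check is applied all the way back up the chain.

```java
import java.util.List;
import java.util.function.BooleanSupplier;

/** Hypothetical scenario that may name another scenario as its context. */
final class ChainedScenario {
    private final String name;
    private final ChainedScenario context;      // null when no prior scenario is needed
    private final BooleanSupplier outcomeHolds; // is the world already in this end state?
    private final Runnable events;

    ChainedScenario(String name, ChainedScenario context,
                    BooleanSupplier outcomeHolds, Runnable events) {
        this.name = name;
        this.context = context;
        this.outcomeHolds = outcomeHolds;
        this.events = events;
    }

    /** Always runs this scenario's events, after making sure its context is in place. */
    void run(List<String> ran) {
        if (context != null) {
            context.ensureOutcome(ran);
        }
        events.run();
        ran.add(name);
    }

    /** Used when this scenario is only a context: skip it if the world is already usable. */
    void ensureOutcome(List<String> ran) {
        if (!outcomeHolds.getAsBoolean()) {
            run(ran);
        }
    }
}
```

The `ran` list is only there so that a caller can see which scenarios actually had to be executed.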

Fortunately, as well as being a scenario writer (at least for my breakable toys) and a regression tester, I’m a JBehave dev, so I have some control over when I get these features. Looks like I have some work to do.

Jun 13, 2007

Someone on the XP thread suggested, “BDD is just TDD done well with different words.”

Here’s my take.

An application is the end-product, and the value of the application is delivered through its behaviour, as perceived through its user interface.

BDD uses descriptions of behaviour and executable examples (or exemplars; examples carefully chosen to illustrate a particular behaviour). These exemplars, whether in code or plain English, are an aid to communication, to driving a clean, simple design, to understanding the domain, to focusing on and delivering the business value, and to giving everyone involved clarity and confidence that the value is being delivered.

Is doing the above merely ‘doing TDD well’? Maybe it is, but I think that the words do help to change the focus; also, my experience (and that of others) is that some people who struggle to do TDD well find BDD relatively easy.

Here are some of the practices I use in BDD.

A story is a narrative from a customer in a particular role, describing a particular feature and a value that will be delivered through the use of that feature. Scenarios associated with the story will be executable, either manually or through automation, once the feature delivers the value.

(This isn’t a practice; it’s just how stories and scenarios work.)

Making the scenarios and examples executable allows us to know when the value has been delivered, at each level of code. Maintaining those examples allows us to continue to have that certainty as other features are developed.

(This isn’t a practice; it’s just how good tests work. But without using the word ‘test’.)

Each level of code in the application delivers value to some other piece of code, through its own behaviour and its own interface (which could just be the externally visible methods, or an actual java-style Interface). Eventually the sum of those values is delivered to the user interface, which delivers the value to the customer.

(This isn’t a practice; it’s just how software works.)

Describing these stories and behaviours in English allows more people to understand the code and talk about it than otherwise might. For developers, it encourages the use of English in place of Geek. This aids communication.

Driving behaviour from the user interface downwards, and focusing on each unit of code, its responsibility to deliver its value to another, and its ability to use other code to deliver value to itself, minimises non-valuable code and allows developers to split responsibilities easily. This results in a clean, simple design.

Capturing common contexts, events and outcomes, and using these as part of the (coded) domain language, helps us to understand the domain.

Coding the story and scenarios (exemplars at the user interface level, automated at the closest possible level to that interface) helps us focus on, and deliver, the desired value.

Using English for executable scenarios and examples means that everyone, including the customer, can understand which values are delivered, which have been lost (either deliberately or otherwise), and which are yet to come. This helps to give everyone clarity and confidence.

(These are some things I practice as part of BDD.)

I believe this is more than just ‘doing TDD well’. I think it’s ‘doing TDD and a number of other practices related to professional software development well’. The five practices above might conceivably be done in addition to TDD, but they’re not part of any definition of TDD I’ve seen.

Mar 16, 2007

Feedback loops

  • Because our customer doesn’t know what he wants, he finds out from the people that want the system. He sometimes gets this wrong.
  • Because I don’t know what to code, I find out from our customer. I sometimes get this wrong.
  • Because I make mistakes while coding, I work with an IDE. My IDE corrects me when I’m wrong.
  • Because I make mistakes while thinking, I work with a pair. My pair corrects me when I’m wrong.
  • Because my pair is human and also makes mistakes, we write unit tests. Our unit tests correct us when we’re wrong.
  • Because we have a team who are also coding, we integrate with their code. Our code won’t compile if we’re wrong.
  • Because our team makes mistakes, we write acceptance tests that exercise the whole system. Our acceptance tests will fail if we’re wrong.
  • Because we make mistakes writing acceptance tests, we get QA to help us. QA will tell us if we’re wrong.
  • Because we forget to run the acceptance tests, we get Cruise Control to run them for us. Cruise Control will tell us if we’re wrong.
  • Because we forget to maintain the acceptance tests, we get QA to check that the system still works. QA will tell us if it’s wrong.
  • Because we only made it work on Henry’s laptop, we deploy the system to a realistic environment. It won’t work if the deployment is wrong.
  • Because we sometimes misunderstand our customer, we showcase the system. Our customer will tell us if we’re wrong.
  • Because our customer sometimes misunderstands the people that want the system, we put the system in production. The people who want it tell us if we’re wrong.
  • Because it costs money to get it wrong, we do all these things as often as we can. That way we are only ever a little bit wrong.