Using BDD with Legacy Systems

One question I keep being asked is, “Can we use BDD with our legacy systems?”

To help answer this, let me give my simplest definition of BDD:

BDD is the art of using examples in conversation to illustrate behaviour.

So, if you want to talk through the behaviour of your system, and use examples to illustrate that behaviour, yes, you can do this.

There are a couple of benefits of this which might be harder to achieve. The biggest is that using examples in conversation helps you explore behaviour, rather than just specifying it. Examples are easy to discuss, but it’s also easy to decide that they don’t matter, or that you can worry about that scenario later, or that you want different behaviour. It’s harder to do that when you’re talking about tests or specifications, and it’s harder still when the behaviour you’re discussing already exists. If you do find yourself talking through the different examples, you’re probably clarifying behaviour. That’s OK; just recognize that you’re doing it, so some of the usual advice we all give around BDD, and particularly around scope management with BDD, won’t apply. Asking questions, though, is still a great idea. You might find all those places where the behaviour is wrong, or annoying, or isn’t needed at all. Using examples is a great way to illustrate bugs, and helps testers out a lot.

The second aspect is the automation. (If you do this, please consider the other post I wrote about things I like to see in place first; these still apply.) Automation is usually harder with legacy systems, because often they weren’t designed with automation in mind. Websites have elements with no identifiers or meaningful classes; Windows applications have complicated pieces with no automation peers; and some Adobe widgets assume that if you’re using automation it must be because of reading disabilities, so they helpfully pop up boxes on your screen to ‘help’ you (thank you for that, Adobe).

But the real reason BDD becomes hard with legacy systems is that, often, the system was designed without talking through the behaviour, and the behaviour itself makes no sense.

I recently tried to retrofit SpecFlow around my little toy pet shop. The pet shop itself was designed just as a way of showcasing different automation elements, so it wasn’t particularly realistic. Because of that, I find it impossible now to have conversations about its behaviour, because its behaviour simply isn’t useful. It isn’t how I would design it if I were actually designing a UI for pet shop software. I can’t even talk to my rubber duck about it. I won’t be able to sensibly fit SpecFlow to this until I can actually change the behaviour to something sensible.

If you’re in one of those unfortunate environments with a bit of a blame culture, BDD will help introduce transparency into the quality of your process – or lack of it. Just so you’re warned. (In my instance it was a sensible trade-off at the time, since I originally wanted automation software, not a pet shop, and it’s my software so it’s my problem. You may not be so lucky.)

Automation on legacy systems can give you a nice safety net for any other changes, so it might be worth trying this for a few key scenarios. Teams, and particularly testers, I’ve worked with have been saved a lot of time in the past by just having one scenario that makes sure the app can start. Automation is particularly useful if your build system is closer to your production system than to your development one, which is frequently the case for legacy systems.
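
As an illustration, a single “can the app start?” check might look something like the Python sketch below. It is only a sketch under assumptions: the function name, the grace period and the stand-in commands are invented for the example, and a real legacy app may need a different signal (a window appearing, a port opening) before it counts as started.

```python
import subprocess
import sys
import time


def app_starts(command, grace_seconds=1.0):
    """Return True if the process is still running after the grace period,
    or has already exited cleanly (exit code 0)."""
    process = subprocess.Popen(command)
    try:
        time.sleep(grace_seconds)
        exit_code = process.poll()  # None means "still running"
        return exit_code is None or exit_code == 0
    finally:
        # Never leave the application running after the check.
        if process.poll() is None:
            process.terminate()
            process.wait()


# Hypothetical stand-ins for a real application:
# one that starts and keeps running, one that fails immediately.
HEALTHY_APP = [sys.executable, "-c", "import time; time.sleep(30)"]
BROKEN_APP = [sys.executable, "-c", "raise SystemExit(1)"]

if __name__ == "__main__":
    print("healthy app starts:", app_starts(HEALTHY_APP))
    print("broken app starts:", app_starts(BROKEN_APP))
```

Run on the build machine, a check like this fails fast when the app cannot even launch there, before anyone spends time on deeper scenarios.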

If you do happen to find an aspect of behaviour that you like and want to capture, then by all means, do BDD. Talk through the examples, stick them in a wiki, automate them if you can, remembering that having conversations is more important than capturing conversations is more important than automating conversations. You might even find out why things behave a certain way, and come to like the existing behaviour better.

Otherwise, you might want to wait until you’re changing the behaviour to something you like.


12 Responses to Using BDD with Legacy Systems

  1. I have found that BDD is a luxury in legacy code that I usually can’t afford at first. This is not to say I’m anti-automated-unit-testing, just the opposite. Allow me to explain.

    I have found 4 benefits of unit testing

    1) specification
    2) feedback
    3) regression
    4) granularity

    Each of these is important (although different individuals will value them differently). I find BDD is most helpful in Specification, particularly in the discovery of specification and in guiding a specification that locks implementation at the appropriate level (meaning that if the test breaks, the user cares).

    In my experience with legacy code, however, I have not usually been able to focus on any of these except #3) Regression. Maybe this is just because I haven’t learnt how yet, and someday we will look back and think “how naive”. But here’s my general approach.

    Create some locking tests.

    Locking tests consist of
    1) do something (consistently)
    2) capture what it produced
    3) verify it is producing the same thing

    I usually do this with some form of approval testing, like ApprovalTests (github.com/approvals, or there is a nice video here: http://www.youtube.com/watch?v=n-JSrvW4MVs).
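
    The three steps can be sketched in a few lines of Python. This is a hand-rolled illustration, not the ApprovalTests library: the function name, the file naming and the legacy_report stand-in are all assumptions made up for the example, and unlike a real approval tool it auto-approves the first result instead of asking a human.

```python
import tempfile
from pathlib import Path


def verify_against_golden_master(name, produce_output, directory):
    """Lock current behaviour: the first run records the output,
    later runs compare against what was recorded."""
    approved = Path(directory) / f"{name}.approved.txt"
    received = Path(directory) / f"{name}.received.txt"

    actual = produce_output()                          # 1) do something (consistently)
    if not approved.exists():
        approved.write_text(actual, encoding="utf-8")  # 2) capture what it produced
        return True
    if approved.read_text(encoding="utf-8") == actual: # 3) verify it is the same thing
        return True
    # Keep the mismatching output so it can be compared in a diff tool.
    received.write_text(actual, encoding="utf-8")
    return False


def legacy_report():
    # Hypothetical stand-in for the legacy code being locked down.
    return "\n".join(f"account {n}: balance {n * 10}" for n in range(3))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        print(verify_against_golden_master("report", legacy_report, workdir))  # records
        print(verify_against_golden_master("report", legacy_report, workdir))  # compares
```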

    These tests will usually forsake the other 3 values. For example:

    #1 Ignoring Spec
    What does this test do? I had 10,000 lines of code that ran the billing cycle. It was called from a very simple updateAccounts() method. It wasn’t repeatable because it modified the database, so I set the database to read-only. This broke the code, but I added a flag to simply log update/create queries to a file instead of executing them. Now it “worked”, creating a *very* long log file. I approved this as the golden master and afterwards could refactor (I got it down to 5,000 lines that week). Now, if asked what this test does, the best I could tell you is that it produces SQL code, and I am very positive it still produces the same SQL that it did before the refactoring started. After the refactoring I can also tell you that in some cases it produced *very* similar but not quite identical code, which is most likely a bug (someone fixed 7 out of 8 places), but I’m not completely sure that it is, so I didn’t go *fixing* it. What do these SQL statements do to the database, and how do they change it? I have no idea. I believe when it’s down to ~800 lines these questions will be better to start asking…
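
    A flag like that could look roughly like the Python sketch below. Everything in it is assumed for illustration (the QueryGateway class, the log_only flag, and the choice of which statements count as writes); the original code is not shown here and was not necessarily structured this way.

```python
# Statements treated as writes; an assumption for this sketch.
WRITE_KEYWORDS = ("INSERT", "UPDATE", "DELETE")


class QueryGateway:
    """Runs reads normally; in log-only mode, records writes instead of executing them."""

    def __init__(self, execute, log_only=False):
        self._execute = execute  # callable that really talks to the database
        self.log_only = log_only
        self.logged = []         # recorded write statements, in call order

    def run(self, sql, params=()):
        is_write = sql.lstrip().upper().startswith(WRITE_KEYWORDS)
        if self.log_only and is_write:
            # Record the statement rather than changing the database.
            self.logged.append(f"{sql} -- params={params!r}")
            return None
        return self._execute(sql, params)
```

    With log_only switched on, joining gateway.logged into one text gives a repeatable output that can be approved and compared between runs.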

    #2 Ignoring feedback
    Tests should be short, right? I am usually quite happy to have a long test over no test. Moreover, if I can only test a checksum (say, an md5 of some data vs. all the data, or the file sizes of a directory instead of the actual contents of the files), so be it. I once locked down the directory listing of hundreds of files being created through a process. The test took 1 1/2 hours initially. I ran it through a code coverage script to iteratively remove run-time scenarios that didn’t affect coverage until it got down to a 5 minute test. This is still insanely long (I want unit tests to be in the no tests
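
    A cheap fingerprint of a directory listing, of the kind described above, might be sketched in Python like this. The function name is made up, and hashing only relative paths and file sizes is an assumption about what “directory listing” covers; it deliberately ignores file contents.

```python
import hashlib
from pathlib import Path


def directory_fingerprint(root):
    """Return an MD5 hex digest of the relative paths and sizes of all files under root."""
    root = Path(root)
    files = [p for p in root.rglob("*") if p.is_file()]
    # Sort by relative path so the result does not depend on traversal order.
    files.sort(key=lambda p: p.relative_to(root).as_posix())

    digest = hashlib.md5()
    for path in files:
        entry = f"{path.relative_to(root).as_posix()}\t{path.stat().st_size}\n"
        digest.update(entry.encode("utf-8"))
    return digest.hexdigest()
```

    The trade-off is the one the comment accepts: a file whose content changes but whose size stays the same goes unnoticed, in exchange for a much cheaper comparison.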

    #4 Ignoring granularity
    Why did that test break? We were working on a PHP system of 6,000 lines of code (I wish I could say a 6,000 line method, but it wasn’t; it was just a block of code, as putting it into a method didn’t seem to interest the original author). We locked the output, a big nasty piece of HTML, and started to refactor. About 20 minutes in, it broke. We were using a diff tool to compare the results, but couldn’t figure out why. So we hit undo a few times. Still broke. A few more. Still broke. Reverted all the way back to the beginning. Still broke. What? We finally figured out that this app (a calendar view) updated every 1/2 hour. The solution: we waited till 11:01, locked the tests, coded till 11:25, then waited until 11:31 and repeated. We worked this way all day. It is worth noting that we did at least get temporal granularity here: I could tell if the line I altered broke something, if not how or why it broke it. But the best insight I got was merely “the test is now red”.

    All of these tests suck compared to real BDD/unit tests. Many of them I discarded after they allowed me to refactor the section. Later, I would seek out more valuable longer-term solutions, and BDD is very useful then, but in legacy code I usually find I have to focus on *better*; *good* is a solution that is just too expensive to be practical just yet.

  4. Giso says:

    “If you’re in one of those unfortunate environments with a bit of a blame culture, BDD will help introduce transparency into the quality of your process – or lack of it. Just so you’re warned.”

    I burst out laughing when I read that. It’s just so true. And people don’t want to know. They _resent_ being confronted with the insight that things they do don’t make sense (and never did).

    I just left a project where we literally spent the last four years trying to transform a central piece of business software from a no-tests bursting-at-the-seams nobody-knows-what-the-hell-the-thing-is-doing nobody-knows-what-the-hell-the-thing-should-be-doing big ball of mud into something that can be understood, is able to support the dreams of business stakeholders without being a patchwork of interdependent hacks, and can be extended easily. I left the project after all this time because I was beginning to fear for my sanity (when outsiders comment on the lack of any intonation while you speak about your work, or on how your laughter has acquired a hysterical quality, that’s a big indicator).

    It was uphill work from the beginning. Resistance came from everywhere, even from within the development team (a fact that amazes me to this day).
    Business: “We need features F1, F2, …, F30. Go-live is to happen within a month.”
    Devs: “But …”
    Business:
    Devs: “Ok, we’ll do F12. Can you explain beyond the feature title how you want the software to behave?”
    Business: “I don’t know what you mean. It’s all in the requirement docs.”
    Devs: “Yeah, but imagine . When the user does what should happen?”
    Business: “Molest me not with your technical questions! Oh, by the way, can you tell me what happens when does and does ?”
    Devs: “Erm, we’re developers. Shouldn’t _you_ know how you conduct your business?”
    Business: “How can we possibly know what _you_ implemented in your software?”
    Devs: “There’s no one left from the original dev team. However, the software _was_ implemented according to _your_ specifications. But ok, we’ll look it up in the code… By the way, this very conversation demonstrates again how urgently we need to change how we do things around here.”
    Business: “Oh well, go ahead then. But be sure to have that feature ready at the end of the month!”
    Devs: “No, we meant ‘we all, you included’. We can’t possibly have that feature ready without your help. And about the other features: it’s not even possible to talk about them because the software was not built to support anything like them.”
    Business: “We don’t know what you mean by this. Leave us alone, we don’t have time for your technical problems…”

    In retrospect, I think the core problem is that people don’t want to be told that they’re doing a bad job. And often enough it’s not even the fault of individuals, but of the whole web of interconnected “bad-jobbery”. After about three and a half years we finally had enough people convinced that the only way forward was to develop a new, up-to-date model of the various domains our software was supporting. We had already refactored parts of the software towards a well-defined software architecture, but couldn’t go on without re-implementing the same old hackish behaviour unless we got hold of that model.

    So, during the last 6 months we, that is, a business analyst and I, actually did sit down with various domain experts and business stakeholders. I used BDD a lot, I used Feature Injection a lot. Between us, we developed that model. Within that little island, it was the happiest time for me there.

    • Giso says:

      Oh, well. I knew I shouldn’t have used those angle brackets … So, some things got filtered out from that conversation. Sorry for that.
