In the previous post in this series on Gherkin, I showed the tools that Gherkin Features provide for requirements tracking, and mentioned the ambitious goals of the Behaviour-driven Development (BDD) movement. In this article, I want to expand on these BDD ideals, show Gherkin Scenarios, and explain how the dream of traceability led to Cucumber, a tool that automates acceptance test execution and traces those tests back to Gherkin Scenarios. We’ll set aside my personal feelings on these tools for a future post.
The behaviour-driven development movement (closely associated with Acceptance-Test-Driven Development, aka ATDD) holds that customers never care about (good) code; they care about behaviours: more code does not matter unless the result is behaviour useful to the client. This translates directly to testing: who cares whether code internals reach 100% test coverage; the point is which external system behaviour is being exercised, and what value is being added to the project.
This highlights that BDD isn’t so much a developer’s movement (as TDD is often portrayed) as a vision for project management: we want to make sure we build the right thing, rather than merely build the software right.
To ensure we build the right thing, BDD tries to bring the three amigos (business, development, and testing) together to spec out the behaviours of the project very early on, with code being simply a means to deliver those behaviours.
But then the ATDD crowd comes in and points out: “all this behaviour you want … how do we prove it’s correctly implemented? The TDD crowd has a point, you know?”
So the BDD crew thought up a dream of traceability: Specify behaviours ahead of time, with the product owner, to the point that acceptance tests are written before any line of code, and have tooling record full traceability of requirements all the way to project completion via automated tests. When the automated acceptance tests are green, the product is complete, ready to ship.
This push for tooling to manage requirements as code is in line with similar movements like infrastructure-as-code, and more recently documentation-as-code. These movements acknowledge that developers have fantastic tools and processes, like version control, continuous integration, and code review, and believe that more parts of the development cycle should align with these practices. First it was sysadmins and frustratingly irreproducible machines, then documentation moving away from Word’s binary files, and now Gherkin bringing requirements traceability tooling.
To see how this works out in practice, let’s come back to Gherkin, the DSL (Domain Specific Language) that BDD created to specify behaviours.
We covered Features before. As you can see, Features aren’t tests: they don’t cover edge cases, and they aren’t sufficient to know precisely what to build. This is where a new Gherkin construct is useful: Scenarios.
Scenarios (sometimes called Examples) give us a way to show how a behaviour works as a testable unit, by breaking it down into precondition, trigger, and outcome.
Scenario: User signs up to an event
  Given an event announced in Discord
  When the user reacts to the event with the "📅" emoji
  Then the user is marked as attending the event
  And an email reminder is sent to the user
Breaking it down:
Given
: Pre-condition, the context the scenario is built around. Any resources required to make the test happen are defined here.

When
: Trigger for the specific behaviour being described.

Then
: Outcome of the feature. Must be an externally assertable behaviour, no prodding at object internals.

And, But
: Syntactic sugar to avoid repetition. “But” is shorthand for “And not”.

Gherkin has more tricks like Scenario Outlines, data tables, Backgrounds, and tags for grouping Features, but I won’t cover them here (beyond the small taste below); you’d be better served reading the Gherkin reference and then going over Gherkin best practices. Suffice it to remember that Gherkin has Features, good for requirements, and Features can be elaborated into Scenarios to create acceptance tests.
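As a small taste of those extra tricks, here is roughly what a Scenario Outline looks like: the steps are templated over an Examples table and run once per row (it also shows But in action). The event names and numbers are invented purely for illustration, building on the sign-up Scenario above.

Scenario Outline: Attendance tracking scales with event size
  Given an event announced in Discord named "<event>"
  When <attendees> users react to the event with the "📅" emoji
  Then <attendees> users are marked as attending the event
  But no user is marked as attending twice

  Examples:
    | event            | attendees |
    | Board game night | 3         |
    | Summer meetup    | 25        |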
It’s important to note that Scenarios are hard to write well: the level of abstraction must be just right, not getting into the weeds of implementation concerns, but also not too disconnected from technical details, which often requires a developer to be around. So you can’t write these just between the marketing department and the customer; you want a dev. To capture the behaviours that add the most business value, a product owner is important here as well. Finally, these must be testable, so a QA member is valuable … and so we come back to the three amigos in a room.
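To make “the right level of abstraction” concrete, here is a contrived contrast; both steps are invented for illustration. The first welds the spec to HTTP plumbing, the second states the behaviour in the domain’s language:

# Too low-level: the spec is coupled to implementation details
When the client sends a POST to "/events/42/attendees" with a valid session token

# About right: domain language a product owner can read and a dev can automate
When the user reacts to the event with the "📅" emoji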
Gherkin-the-syntax was created to serve as English-text blueprints for the specification of acceptance tests. But BDD goes further: we have testing frameworks, so why not use Gherkin-the-text as a machine-enforced spec of the system behaviours, with each line of a Gherkin Scenario mapping to actual test code? This way, the project’s progress can be observed by how many acceptance tests are now green, and how many are still red.
So the tool Cucumber was born in the Ruby community, running acceptance tests by reading Gherkin. Here’s an example of this kind of traceability, taken from pytest-bdd, a Python reimplementation of Cucumber built as an extension of the pytest framework:
Feature: Blog
    A site where you can publish your articles.

    Scenario: Publishing the article
        Given I'm an author user
        And I have an article
        When I go to the article page
        And I press the publish button
        Then I should not see the error message
        And the article should be published  # Note: will query the database
import pytest
from urllib.parse import urljoin

from pytest_bdd import scenario, given, when, then
# The browser fixture here comes from pytest-splinter; splinter raises
# ElementDoesNotExist when an element lookup comes back empty.
from splinter.exceptions import ElementDoesNotExist

# Binds this test to the "Publishing the article" Scenario; the body stays empty
# because pytest-bdd runs the step functions below in order.
@scenario('publish_article.feature', 'Publishing the article')
def test_publish():
    pass

@given("I'm an author user")
def author_user(auth, author):
    auth['user'] = author.user

@given("I have an article", target_fixture="article")
def article(author):
    # create_test_article is a project-specific helper from the pytest-bdd example
    return create_test_article(author=author)

@when("I go to the article page")
def go_to_article(article, browser):
    browser.visit(urljoin(browser.url, '/manage/articles/{0}/'.format(article.id)))

@when("I press the publish button")
def publish_article(browser):
    browser.find_by_css('button[name=publish]').first.click()

@then("I should not see the error message")
def no_error_message(browser):
    with pytest.raises(ElementDoesNotExist):
        browser.find_by_css('.message.error').first

@then("the article should be published")
def article_is_published(article):
    article.refresh()  # Refresh the object in the SQLAlchemy session
    assert article.is_published
As you can see, the decorators sprinkled over the test code give a way to link the tests back to Gherkin, which the tool then uses to report how many behaviours are validated by acceptance tests. In the case of an “all green” test report, the project can be shipped, because all required behaviours have been implemented.
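As an aside, if writing one @scenario stub per Scenario feels like busywork, pytest-bdd also provides a scenarios() helper that binds every Scenario in a feature file (or directory) in one go; a minimal sketch, assuming the same publish_article.feature and step definitions as above:

from pytest_bdd import scenarios

# Generates one pytest test per Scenario found in the feature file,
# reusing the @given/@when/@then step definitions already registered.
scenarios('publish_article.feature')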
Note of course that acceptance tests are not unit tests: acceptance tests say nothing about the internal qualities of a codebase, they just prove its external behaviour. So BDD doesn’t make good code on its own, because code maintainability isn’t a direct objective of BDD.
BDD is a software movement that tries to reconcile all parties involved in project planning around the definition of behaviours in a clear language, called Gherkin. It even pushes this dream into tooling, Cucumber, which enforces these behaviours through acceptance tests that trace back to Gherkin. Requirements-as-code is an admirable goal, and one I’m excited about!
However, as we’ll see in a later post, I believe the tooling doesn’t quite live up to expectations. In the next post in this series on Gherkin, we’ll look at how Cucumber falls flat, and I’ll showcase my personal low-tech solution for grabbing 80% of the value for 20% of the effort.