BDD and Gherkin

Behaviour-Driven Development (BDD) bridges the gap between business requirements and automated tests. Requirements are written in natural language using Gherkin syntax, then automated through step definitions. The same file serves as documentation, acceptance criteria, and executable specification.


The Three Amigos

BDD starts with a conversation between:

  • Product Owner — what behaviour is needed and why
  • Developer — what's technically feasible
  • QA — what could go wrong and what edge cases exist

They discuss concrete examples before writing a line of code. These examples become the Gherkin scenarios.


Gherkin Syntax

Feature: User login

  As a registered user
  I want to log into my account
  So that I can access my personalised dashboard

  Background:
    Given the database contains user "alice@example.com" with password "Secure123!"

  Scenario: Successful login with valid credentials
    Given I am on the login page
    When I enter email "alice@example.com" and password "Secure123!"
    And I click the "Log In" button
    Then I should be redirected to the dashboard
    And I should see the greeting "Welcome back, Alice"

  Scenario: Failed login with wrong password
    Given I am on the login page
    When I enter email "alice@example.com" and password "wrongpassword"
    And I click the "Log In" button
    Then I should see the error "Invalid email or password"
    And I should remain on the login page

  Scenario Outline: Account lockout after repeated failures
    Given I am on the login page
    When I fail to log in <attempts> times with wrong passwords
    Then my account should be <status>

    Examples:
      | attempts | status          |
      | 3        | still active    |
      | 5        | locked out      |
      | 6        | locked out      |

Keywords:

  • Feature — describes the feature being tested (one per file)
  • Background — steps that run before each scenario in the file
  • Scenario — a concrete example (test case)
  • Scenario Outline — parameterised scenario; data from Examples table
  • Given — precondition / initial state
  • When — user action or event
  • Then — expected outcome / assertion
  • And / But — continuation of the previous keyword (avoids repetition)
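
The Scenario Outline above can be sketched in plain Python: the runner substitutes each <placeholder> in the steps with the matching cell from every Examples row, producing one concrete scenario per row. This is a simplified illustration of the mechanism, not a real runner (real runners parse .feature files; here the outline is inlined).

```python
def expand_outline(steps, examples):
    """Return one concrete scenario (a list of steps) per Examples row,
    substituting each <placeholder> with the row's value."""
    scenarios = []
    for row in examples:
        concrete = []
        for step in steps:
            for key, value in row.items():
                step = step.replace(f"<{key}>", value)
            concrete.append(step)
        scenarios.append(concrete)
    return scenarios

outline = [
    "When I fail to log in <attempts> times with wrong passwords",
    "Then my account should be <status>",
]
examples = [
    {"attempts": "3", "status": "still active"},
    {"attempts": "5", "status": "locked out"},
]

for scenario in expand_outline(outline, examples):
    print(scenario)
```

Each row therefore runs as an independent scenario, which is why a failing Examples row points directly at the boundary value that broke.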

Step Definitions

Gherkin scenarios are linked to automation code via step definitions. Each step maps to a regex or string pattern.
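
The matching mechanism can be sketched with the standard library: a parse-style pattern such as 'I enter email "{email}"' is compiled to a regex with named groups, which is how a runner extracts step arguments. A simplified sketch, not any framework's actual implementation:

```python
import re

def pattern_to_regex(pattern):
    """Compile a parse-style step pattern into a regex where each
    {name} placeholder becomes a named capture group."""
    escaped = re.escape(pattern)
    # re.escape turns {email} into \{email\}; rewrite it as a named group
    regex = re.sub(r"\\\{(\w+)\\\}", r"(?P<\1>.+?)", escaped)
    return re.compile(f"^{regex}$")

pattern = pattern_to_regex('I enter email "{email}" and password "{password}"')
match = pattern.match('I enter email "alice@example.com" and password "Secure123!"')
print(match.groupdict())  # the runner passes these as step arguments
```

A plain step with no placeholders compiles to an exact-match regex by the same route.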

Python (pytest-bdd):

from pytest_bdd import given, when, then, parsers, scenarios

# Bind every scenario in the feature file to this test module.
scenarios('features/auth/login.feature')

# The `page` fixture is provided by the pytest-playwright plugin.
@given('I am on the login page')
def navigate_to_login(page):
    page.goto("/login")

@when(parsers.parse('I enter email "{email}" and password "{password}"'))
def enter_credentials(page, email, password):
    page.fill('[data-testid="email"]', email)
    page.fill('[data-testid="password"]', password)

@when('I click the "Log In" button')
def click_login(page):
    page.click('[data-testid="login-btn"]')

@then('I should be redirected to the dashboard')
def assert_dashboard(page):
    page.wait_for_url("/dashboard")

@then(parsers.parse('I should see the greeting "{greeting}"'))
def assert_greeting(page, greeting):
    assert page.locator('[data-testid="greeting"]').text_content() == greeting

JavaScript (Cucumber.js):

const { Given, When, Then } = require('@cucumber/cucumber');

// `this.page` is assumed to be attached to a custom World
// (e.g. in support/world.js) by a Before hook.

Given('I am on the login page', async function() {
  await this.page.goto('/login');
});

When('I enter email {string} and password {string}', async function(email, password) {
  await this.page.fill('[data-testid="email"]', email);
  await this.page.fill('[data-testid="password"]', password);
});

Then('I should be redirected to the dashboard', async function() {
  await this.page.waitForURL('/dashboard');
});

Java (Cucumber + JUnit 5):

@Given("I am on the login page")
public void navigateToLogin() {
    driver.get(baseUrl + "/login");
}

@When("I enter email {string} and password {string}")
public void enterCredentials(String email, String password) {
    driver.findElement(By.id("email")).sendKeys(email);
    driver.findElement(By.id("password")).sendKeys(password);
}

@Then("I should see the error {string}")
public void assertErrorMessage(String expected) {
    WebElement error = driver.findElement(By.className("error-message"));
    assertEquals(expected, error.getText());
}

File Organisation

features/
├── auth/
│   ├── login.feature
│   └── registration.feature
├── checkout/
│   ├── cart.feature
│   └── payment.feature
└── step_definitions/
    ├── auth_steps.py
    └── checkout_steps.py

Gherkin Anti-Patterns

Too many steps / too low level:

# Bad — describes implementation, not behaviour
When I click the email field
And I type "alice@example.com"
And I press Tab
And I click the password field
And I type "Secure123!"
And I click the button with ID "login-submit"

# Good — describes intent
When I log in as "alice@example.com" with password "Secure123!"

Imperative instead of declarative:

  • Imperative: how the user interacts ("click button X", "enter text in field Y")
  • Declarative: what the user wants to accomplish ("log in as alice")

Declarative scenarios survive UI refactors; imperative ones break with every design change.
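
One way to keep the declarative step stable is to push all locator knowledge into a page object, so a redesign touches one class instead of every scenario. A minimal sketch, with a hypothetical LoginPage and a recording FakeDriver standing in for a real browser driver:

```python
class LoginPage:
    """Hypothetical page object: the only place that knows locators."""
    def __init__(self, driver):
        self.driver = driver

    def log_in(self, email, password):
        self.driver.fill('[data-testid="email"]', email)
        self.driver.fill('[data-testid="password"]', password)
        self.driver.click('[data-testid="login-btn"]')

def step_log_in(driver, email, password):
    """The declarative step 'I log in as "..." with password "..."'
    collapses to a single page-object call."""
    LoginPage(driver).log_in(email, password)

class FakeDriver:
    """Records actions instead of driving a browser, for illustration."""
    def __init__(self):
        self.actions = []
    def fill(self, locator, value):
        self.actions.append(("fill", locator, value))
    def click(self, locator):
        self.actions.append(("click", locator))

driver = FakeDriver()
step_log_in(driver, "alice@example.com", "Secure123!")
print(driver.actions)
```

If the login button's locator changes, only LoginPage changes; the scenario text and the step definition are untouched.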

Shared state between scenarios: Each scenario must be independent. Use Background only for setup that every scenario genuinely needs. Don't rely on scenario order.
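
Independence can be enforced by giving each scenario its own copy of the Background state instead of a shared mutable object. A stdlib sketch (the data shape and the fresh_db helper are illustrative, not a framework API):

```python
import copy

# Data the Background step seeds before each scenario.
BACKGROUND_USERS = {"alice@example.com": {"password": "Secure123!", "locked": False}}

def fresh_db():
    """Per-scenario fixture: each scenario gets its own deep copy,
    so mutations cannot leak between scenarios."""
    return copy.deepcopy(BACKGROUND_USERS)

# Scenario A locks the account...
db_a = fresh_db()
db_a["alice@example.com"]["locked"] = True

# ...but Scenario B still sees the clean Background state.
db_b = fresh_db()
print(db_b["alice@example.com"]["locked"])
```

With this shape, scenarios can run in any order, or in parallel, without affecting each other.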


Frameworks

| Language   | Framework        | Gherkin runner |
| Python     | pytest-bdd       | native         |
| Python     | behave           | native         |
| JavaScript | Cucumber.js      | native         |
| JavaScript | Playwright Test  | via plugin     |
| Java       | Cucumber + JUnit 5 | native       |
| Java       | JBehave          | native         |
| C#         | SpecFlow         | native         |
| Ruby       | RSpec + Cucumber | native         |

BDD in the Development Workflow

  1. Discovery — Three Amigos session; write scenarios before coding
  2. Formulation — refine scenarios into Gherkin (precise language matters)
  3. Automation — step definitions link Gherkin to implementation
  4. Execution — run in CI on every PR; failing scenario = failing acceptance criterion

Scenarios should run in CI. A failing scenario is equivalent to a failing unit test. The PR cannot merge until the feature matches its acceptance criteria.
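
The gating logic can be sketched as a plain Python driver (a hypothetical harness, not a real framework API): run every scenario, report failures, and return the non-zero exit code that fails the CI job.

```python
def run_scenarios(scenarios):
    """Hypothetical CI gate: run each scenario, report failures,
    and return the exit code the build should use."""
    failures = []
    for name, scenario in scenarios.items():
        try:
            scenario()
        except AssertionError as exc:
            failures.append((name, str(exc)))
    for name, reason in failures:
        print(f"FAILED acceptance criterion: {name}: {reason}")
    return 1 if failures else 0

def successful_login():
    pass  # all Then steps passed

def account_lockout():
    assert False, "account not locked after 5 attempts"

exit_code = run_scenarios({
    "Successful login": successful_login,
    "Account lockout": account_lockout,
})
# exit_code is non-zero, so the CI job fails and the PR cannot merge
```

Real runners (cucumber-js, behave, pytest) behave the same way at the process level: any failing scenario yields a non-zero exit code, which is what blocks the merge.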


Common Failure Cases

Scenarios written imperatively at UI interaction level
  Why: steps that describe clicking and typing rather than user intent break every time the UI is restyled, and they communicate nothing about the business rule being tested.
  Detect: step definitions contain locators or UI element names (click the button with ID "submit-btn"); scenarios need updating after visual redesigns.
  Fix: rewrite steps at the intent level (When I submit the order); push all locator knowledge into step definitions and Page Objects.

Step definitions that grow into test logic dungeons
  Why: without a clear boundary, business rules accumulate inside step definitions instead of being delegated to helper classes, making them impossible to reuse across scenarios.
  Detect: a single @when or @then function exceeds 30 lines or contains conditionals that branch on the scenario's data.
  Fix: have step definitions call domain helper functions; keep each step under 10 lines of coordination code.

Background setup that not every scenario needs
  Why: a Background block that seeds data for one scenario's happy path makes all other scenarios depend on state they don't need, introducing invisible coupling that causes intermittent failures under parallel execution.
  Detect: removing a Background step causes an apparently unrelated scenario to fail.
  Fix: put setup in Background only if it genuinely applies to every scenario; use per-scenario fixtures for state specific to one or two cases.

Writing Gherkin without a Three Amigos session
  Why: scenarios authored by QA alone capture QA's assumptions, not the product owner's intent or the developer's technical constraints; ambiguities survive until the step definition is written and the bug is already in the code.
  Detect: step definitions require significant interpretation beyond the scenario text, or developers reject scenarios as misrepresenting requirements during review.
  Fix: treat scenario authoring as a synchronous Three Amigos conversation, not an async handoff document.

Connections

Open Questions

  • What testing scenarios does this technique systematically miss?
  • How does this approach need to change when delivery cadence moves to continuous deployment?