Shift-Left Testing

Moving quality activities earlier in the SDLC (from deployment back to design) so defects are caught when they're cheapest to fix.

The Cost of a Bug by Stage

Stage of detection       Relative cost to fix
────────────────────────────────────────────
Requirements             $1
Design                   $5
Development              $10
Unit test                $15
Integration test         $40
System test              $100
UAT                      $300
Production               $1,000 – $10,000+

Source: IBM Systems Sciences Institute (classic but directionally consistent across studies)

Implication: finding a bug in a story's acceptance criteria costs 1/1000th
of finding it in a production incident — invest accordingly.

Where to Shift Left

Traditional order:       Requirements → Design → Build → Test → Deploy
Shifted order:           Test thinking starts at Requirements

Activities that move left:
  Requirements           QA reviews ACs before sprint starts (Three Amigos)
  Design                 QA reviews API contracts and data models before build
  Development            Developers write tests first (TDD) with QA-reviewed ACs
  Build                  Static analysis and security scanning in pre-commit hooks
  Integration            Contract tests run on every PR against provider stubs
  Pre-deploy             E2E smoke suite runs on every main branch push
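The "contract tests on every PR" step can be sketched as a consumer-side check against a recorded provider stub. A minimal sketch, assuming hypothetical field names and a hypothetical orders service:

```python
# Consumer-side contract check: the consumer declares the fields it actually
# reads, and the test fails if the provider stub (a recorded example response)
# drifts away from that contract. All names here are illustrative.

REQUIRED_FIELDS = {
    "order_id": str,
    "status": str,
    "total_pence": int,
}

def check_contract(stub_response: dict) -> list[str]:
    """Return a list of contract violations (empty list = contract holds)."""
    violations = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in stub_response:
            violations.append(f"missing field: {field}")
        elif not isinstance(stub_response[field], expected_type):
            violations.append(f"{field}: expected {expected_type.__name__}")
    return violations

# Recorded provider stub, re-checked on every PR
stub = {"order_id": "ord_123", "status": "confirmed", "total_pence": 8999}
assert check_contract(stub) == []
```

Running this in the PR pipeline catches provider/consumer drift before integration testing, without needing the real provider online.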

Three Amigos — Requirements Quality Gate

Three Amigos session before a story enters the sprint:
  - Developer: "Can I build this? Are there technical constraints?"
  - QA: "How will I test this? What are the edge cases?"
  - Product: "Is this the right thing to build? Does it meet the user need?"

Output: acceptance criteria that are specific, testable, and unambiguous.

Three Amigos AC checklist:
  [ ] Each AC is verifiable (has a clear pass/fail)
  [ ] Each AC specifies the user type (authenticated? admin? guest?)
  [ ] Error cases and boundary values are explicitly defined
  [ ] Non-functional requirements have numbers (< 200ms, > 99.9%)
  [ ] Integration points are named (which service? which field?)
  [ ] "And what if...?" questions answered before sprint starts

Bad AC:   "The checkout should be fast."
Good AC:  "GIVEN a user with items in their cart, WHEN they click Place Order,
           THEN the order is confirmed in < 2 seconds (p99 in staging under 10 RPS)."

Gherkin as a Specification Tool

# Written before implementation — not after
# This becomes both documentation and automated test

Feature: Discount code application

  Background:
    Given a product "Widget Pro" with price £99.99
    And a discount code "SAVE10" that gives 10% off

  Scenario: Valid discount code applied at checkout
    Given I have "Widget Pro" in my cart
    When I apply discount code "SAVE10"
    Then the discount amount is £10.00
    And the total becomes £89.99

  Scenario: Invalid discount code
    Given I have "Widget Pro" in my cart
    When I apply discount code "INVALID"
    Then I see "This code is not valid"
    And the cart total remains £99.99

  Scenario: Expired discount code
    Given I have "Widget Pro" in my cart
    And discount code "EXPIRED01" expired yesterday
    When I apply discount code "EXPIRED01"
    Then I see "This code has expired"

  Scenario: Discount applied to only eligible items
    Given I have "Widget Pro" (£99.99) and "Basic Widget" (£19.99) in my cart
    And discount code "SAVE10" applies only to "Widget Pro"
    When I apply discount code "SAVE10"
    Then the discount is £10.00
    And the total is £109.98
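The scenarios above pin down concrete pricing behaviour before any code exists. A minimal sketch of the logic they specify, assuming hypothetical names and working in pence to avoid float rounding (discount rounds half up to the nearest penny, matching the £10.00 figure in the scenarios):

```python
from datetime import date

# Hypothetical in-memory discount rules; prices and discounts in pence
CODES = {
    "SAVE10": {"percent": 10, "expires": None, "eligible": {"Widget Pro"}},
    "EXPIRED01": {"percent": 10, "expires": date(2000, 1, 1), "eligible": {"Widget Pro"}},
}

def apply_discount(cart: dict[str, int], code: str, today: date) -> tuple[int, str]:
    """Return (total_pence, message) after attempting to apply a discount code."""
    subtotal = sum(cart.values())
    rule = CODES.get(code)
    if rule is None:
        return subtotal, "This code is not valid"
    if rule["expires"] is not None and rule["expires"] < today:
        return subtotal, "This code has expired"
    # Discount only eligible items; round half up to the nearest penny
    discount = sum(
        (price * rule["percent"] + 50) // 100
        for item, price in cart.items()
        if item in rule["eligible"]
    )
    return subtotal - discount, "Discount applied"

cart = {"Widget Pro": 9999, "Basic Widget": 1999}
assert apply_discount(cart, "SAVE10", date(2024, 1, 1)) == (10998, "Discount applied")
```

Because the Gherkin was written first, this implementation is checked against the spec rather than the spec being reverse-engineered from the code.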

Pre-Commit Hooks as Shift-Left Gate

# .pre-commit-config.yaml
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.4.4
    hooks:
      - id: ruff              # linting
      - id: ruff-format       # formatting

  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: v1.10.0
    hooks:
      - id: mypy
        additional_dependencies: [pydantic, sqlalchemy-stubs]

  - repo: https://github.com/PyCQA/bandit
    rev: 1.7.8
    hooks:
      - id: bandit
        args: ["-r", "src/", "--severity-level", "medium"]

  - repo: https://github.com/Yelp/detect-secrets
    rev: v1.5.0
    hooks:
      - id: detect-secrets
        args: ["--baseline", ".secrets.baseline"]

  - repo: local
    hooks:
      - id: unit-tests-fast
        name: Fast unit tests
        entry: pytest tests/unit/ -x -q --timeout=5
        language: system
        types: [python]
        pass_filenames: false

API Contract Review (Design Phase)

# QA reviews OpenAPI spec before implementation begins
# Checklist for API contracts:

def validate_api_contract(spec_path: str) -> list[str]:
    """Lint an OpenAPI spec for shift-left contract review."""
    import yaml

    issues = []
    with open(spec_path) as f:
        spec = yaml.safe_load(f)

    http_methods = {"get", "post", "put", "patch", "delete", "head", "options"}

    for path, path_item in spec.get("paths", {}).items():
        for method, definition in path_item.items():
            if method not in http_methods:
                continue  # skip path-level keys like "parameters" or "summary"
            endpoint = f"{method.upper()} {path}"

            # Must have a 4xx response defined
            responses = definition.get("responses", {})
            if not any(str(code).startswith("4") for code in responses):
                issues.append(f"{endpoint}: no 4xx error response defined")

            # Must have a 5xx or default response defined
            if "500" not in responses and "default" not in responses:
                issues.append(f"{endpoint}: no 500/default error response defined")

            # Every request body content type must declare a schema
            req_body = definition.get("requestBody", {})
            for ctype, media in req_body.get("content", {}).items():
                if "schema" not in media:
                    issues.append(f"{endpoint}: request body ({ctype}) missing schema")

    return issues

# Run in CI before implementation PR is opened
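For reference, a minimal spec fragment that passes those checks; endpoint and schema names here are made up:

```yaml
paths:
  /orders:
    post:
      requestBody:
        content:
          application/json:
            schema:
              $ref: "#/components/schemas/CreateOrder"   # body has a schema
      responses:
        "201":
          description: Order created
        "400":
          description: Validation error                  # 4xx defined
        "500":
          description: Internal error                    # 5xx defined
```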

Shift-Left Metrics

Measure to know if shift-left is working:

Defect escape rate by stage:
  formula: bugs found in stage N+1 / bugs found in stage N
  target:  < 10% escape rate from dev into QA testing

Requirements defect rate:
  formula: ACs changed mid-sprint / total ACs planned
  target:  < 15% (high rate = Three Amigos sessions not deep enough)

Test coverage at PR merge:
  formula: coverage % when PR merges (not at end of sprint)
  target:  > 80% before merge, not retrofitted after

Time from code complete to test complete:
  formula: hours between "dev done" and "QA signed off"
  target:  < 24 hours (indicates test environment and data are ready)
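The escape-rate metric above can be computed directly from bug tickets once each carries a detection-stage field. A sketch with hypothetical counts:

```python
# Bugs counted by the stage at which they were detected (illustrative month of data)
BUGS_BY_STAGE = {
    "requirements": 4,
    "development": 30,
    "qa_testing": 6,
    "production": 1,
}

def escape_rate(found_in_stage: int, found_in_next_stage: int) -> float:
    """Fraction of defects that escaped a stage: next-stage finds / this-stage finds."""
    if found_in_stage == 0:
        return float("inf") if found_in_next_stage else 0.0
    return found_in_next_stage / found_in_stage

dev_to_qa = escape_rate(BUGS_BY_STAGE["development"], BUGS_BY_STAGE["qa_testing"])
assert dev_to_qa == 0.2   # 20%: above the < 10% target, so act on it
```

Tracking this per sprint (rather than once per release) gives an early signal that unit and integration coverage are slipping.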

Common Failure Cases

Pre-commit hook is too slow, so developers bypass it with --no-verify
  Why:    running the full unit test suite on every commit takes more than
          30 seconds, making the hook feel like an obstacle.
  Detect: bypasses leave no trace in git itself; the signal is a rising bug
          rate into QA, or CI failing on checks the hook should have caught.
  Fix:    scope the pre-commit unit test step to tests/unit/ with --timeout=5
          and -x (fail fast), keeping the hook under 15 seconds; run
          integration tests in CI on push instead.

API contract review is skipped because the OpenAPI spec is generated after implementation
  Why:    teams generate the OpenAPI spec from code annotations rather than
          writing it first, so there is nothing to review before
          implementation starts.
  Detect: the spec file is only ever modified in the same commit that
          implements the endpoint, never before.
  Fix:    require the OpenAPI spec to be committed in a separate PR before the
          implementation PR is opened; use the spec as the implementation
          contract, not a by-product.

Three Amigos sessions produce ACs that are not measurable
  Why:    product managers define ACs in business language without specifying
          the user type, data conditions, or pass/fail threshold, leaving QA
          to guess at test specifics.
  Detect: ACs contain words like "fast," "appropriate," or "should work"
          without numerical thresholds or explicit personas.
  Fix:    block Three Amigos sign-off until every AC passes the checklist:
          verifiable outcome, named user type, defined error cases, and
          non-functional requirements with numbers where applicable.

Shift-left metrics show good escape rates but are measured at the wrong boundary
  Why:    teams measure the defect escape rate from QA into production but
          ignore the rate from development into QA, missing the shift-left
          signal entirely.
  Detect: QA still finds large numbers of bugs that unit or integration tests
          should have caught, and the dev-to-QA escape rate is never tracked.
  Fix:    add a "defect detection stage" field to every bug ticket and chart
          the distribution of detection stages monthly; the target is to move
          the peak leftward over time.

Connections

qa-hub · qa/agile-qa · qa/test-planning · qa/bdd-gherkin · qa/defect-prevention · qa/continuous-testing · qa/qa-in-devops

Open Questions

  • What testing scenarios does this technique systematically miss?
  • How does this approach need to change when delivery cadence moves to continuous deployment?