Shift-Left Testing

Moving quality activities earlier in the SDLC (from deployment back to design) so defects are caught when they're cheapest to fix.

The Cost of a Bug by Stage

Stage of detection       Relative cost to fix
────────────────────────────────────────────
Requirements             $1
Design                   $5
Development              $10
Unit test                $15
Integration test         $40
System test              $100
UAT                      $300
Production               $1,000 – $10,000+

Source: IBM Systems Sciences Institute (classic but directionally consistent across studies)

Implication: finding a bug in a story's acceptance criteria costs 1/1000th
of finding it in a production incident — invest accordingly.

Where to Shift Left

Traditional order:       Requirements → Design → Build → Test → Deploy
Shifted order:           Test thinking starts at Requirements

Activities that move left:
  Requirements           QA reviews ACs before sprint starts (Three Amigos)
  Design                 QA reviews API contracts and data models before build
  Development            Developers write tests first (TDD) with QA-reviewed ACs
  Build                  Static analysis and security scanning in pre-commit hooks
  Integration            Contract tests run on every PR against provider stubs
  Pre-deploy             E2E smoke suite runs on every main branch push
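The "contract tests on every PR" step can be sketched as a consumer-side check against a recorded provider stub. A minimal sketch, assuming hypothetical field names and a hypothetical orders service:

```python
# Consumer-side contract check: the consumer declares the fields it actually
# reads, and the test fails if the provider stub (a recorded example response)
# drifts away from that contract. All names here are illustrative.

REQUIRED_FIELDS = {
    "order_id": str,
    "status": str,
    "total_pence": int,
}

def check_contract(stub_response: dict) -> list[str]:
    """Return a list of contract violations (empty list = contract holds)."""
    violations = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in stub_response:
            violations.append(f"missing field: {field}")
        elif not isinstance(stub_response[field], expected_type):
            violations.append(f"{field}: expected {expected_type.__name__}")
    return violations

# Recorded provider stub, re-checked on every PR
stub = {"order_id": "ord_123", "status": "confirmed", "total_pence": 8999}
assert check_contract(stub) == []
```

Running this in the PR pipeline catches provider/consumer drift before integration testing, without needing the real provider online.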

Three Amigos — Requirements Quality Gate

Three Amigos session before a story enters the sprint:
  - Developer: "Can I build this? Are there technical constraints?"
  - QA: "How will I test this? What are the edge cases?"
  - Product: "Is this the right thing to build? Does it meet the user need?"

Output: acceptance criteria that are specific, testable, and unambiguous.

Three Amigos AC checklist:
  [ ] Each AC is verifiable (has a clear pass/fail)
  [ ] Each AC specifies the user type (authenticated? admin? guest?)
  [ ] Error cases and boundary values are explicitly defined
  [ ] Non-functional requirements have numbers (< 200ms, > 99.9%)
  [ ] Integration points are named (which service? which field?)
  [ ] "And what if...?" questions answered before sprint starts

Bad AC:   "The checkout should be fast."
Good AC:  "GIVEN a user with items in their cart, WHEN they click Place Order,
           THEN the order is confirmed in < 2 seconds (p99 in staging under 10 RPS)."

Gherkin as a Specification Tool

# Written before implementation — not after
# This becomes both documentation and automated test

Feature: Discount code application

  Background:
    Given a product "Widget Pro" with price £99.99
    And a discount code "SAVE10" that gives 10% off

  Scenario: Valid discount code applied at checkout
    Given I have "Widget Pro" in my cart
    When I apply discount code "SAVE10"
    Then the discount amount is £10.00
    And the total becomes £89.99

  Scenario: Invalid discount code
    Given I have "Widget Pro" in my cart
    When I apply discount code "INVALID"
    Then I see "This code is not valid"
    And the cart total remains £99.99

  Scenario: Expired discount code
    Given I have "Widget Pro" in my cart
    And discount code "EXPIRED01" expired yesterday
    When I apply discount code "EXPIRED01"
    Then I see "This code has expired"

  Scenario: Discount applied to only eligible items
    Given I have "Widget Pro" (£99.99) and "Basic Widget" (£19.99) in my cart
    And discount code "SAVE10" applies only to "Widget Pro"
    When I apply discount code "SAVE10"
    Then the discount is £10.00
    And the total is £109.98
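The scenarios above pin down concrete pricing behaviour before any code exists. A minimal sketch of the logic they specify, assuming hypothetical names and working in pence to avoid float rounding (discount rounds half up to the nearest penny, matching the £10.00 figure in the scenarios):

```python
from datetime import date

# Hypothetical in-memory discount rules; prices and discounts in pence
CODES = {
    "SAVE10": {"percent": 10, "expires": None, "eligible": {"Widget Pro"}},
    "EXPIRED01": {"percent": 10, "expires": date(2000, 1, 1), "eligible": {"Widget Pro"}},
}

def apply_discount(cart: dict[str, int], code: str, today: date) -> tuple[int, str]:
    """Return (total_pence, message) after attempting to apply a discount code."""
    subtotal = sum(cart.values())
    rule = CODES.get(code)
    if rule is None:
        return subtotal, "This code is not valid"
    if rule["expires"] is not None and rule["expires"] < today:
        return subtotal, "This code has expired"
    # Discount only eligible items; round half up to the nearest penny
    discount = sum(
        (price * rule["percent"] + 50) // 100
        for item, price in cart.items()
        if item in rule["eligible"]
    )
    return subtotal - discount, "Discount applied"

cart = {"Widget Pro": 9999, "Basic Widget": 1999}
assert apply_discount(cart, "SAVE10", date(2024, 1, 1)) == (10998, "Discount applied")
```

Because the Gherkin was written first, this implementation is checked against the spec rather than the spec being reverse-engineered from the code.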

Pre-Commit Hooks as Shift-Left Gate

# .pre-commit-config.yaml
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.4.4
    hooks:
      - id: ruff              # linting
      - id: ruff-format       # formatting

  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: v1.10.0
    hooks:
      - id: mypy
        additional_dependencies: [pydantic, sqlalchemy-stubs]

  - repo: https://github.com/PyCQA/bandit
    rev: 1.7.8
    hooks:
      - id: bandit
        args: ["-r", "src/", "--severity-level", "medium"]

  - repo: https://github.com/Yelp/detect-secrets
    rev: v1.5.0
    hooks:
      - id: detect-secrets
        args: ["--baseline", ".secrets.baseline"]

  - repo: local
    hooks:
      - id: unit-tests-fast
        name: Fast unit tests
        entry: pytest tests/unit/ -x -q --timeout=5
        language: system
        types: [python]
        pass_filenames: false

API Contract Review (Design Phase)

# QA reviews OpenAPI spec before implementation begins
# Checklist for API contracts:

def validate_api_contract(spec_path: str) -> list[str]:
    """Lint an OpenAPI spec for shift-left contract review."""
    import yaml

    issues = []
    with open(spec_path) as f:
        spec = yaml.safe_load(f)

    http_methods = {"get", "post", "put", "patch", "delete", "head", "options"}

    for path, path_item in spec.get("paths", {}).items():
        for method, definition in path_item.items():
            if method not in http_methods:
                continue  # skip path-level keys like "parameters" or "summary"
            endpoint = f"{method.upper()} {path}"

            # Must have a 4xx response defined
            responses = definition.get("responses", {})
            if not any(str(code).startswith("4") for code in responses):
                issues.append(f"{endpoint}: no 4xx error response defined")

            # Must have a 5xx or default response defined
            if "500" not in responses and "default" not in responses:
                issues.append(f"{endpoint}: no 500/default error response defined")

            # Every request body content type must declare a schema
            req_body = definition.get("requestBody", {})
            for ctype, media in req_body.get("content", {}).items():
                if "schema" not in media:
                    issues.append(f"{endpoint}: request body ({ctype}) missing schema")

    return issues

# Run in CI before implementation PR is opened
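For reference, a minimal spec fragment that passes those checks; endpoint and schema names here are made up:

```yaml
paths:
  /orders:
    post:
      requestBody:
        content:
          application/json:
            schema:
              $ref: "#/components/schemas/CreateOrder"   # body has a schema
      responses:
        "201":
          description: Order created
        "400":
          description: Validation error                  # 4xx defined
        "500":
          description: Internal error                    # 5xx defined
```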

Shift-Left Metrics

Measure to know if shift-left is working:

Defect escape rate by stage:
  formula: bugs found in stage N+1 / bugs found in stage N
  target:  < 10% escape rate from dev into QA testing

Requirements defect rate:
  formula: ACs changed mid-sprint / total ACs planned
  target:  < 15% (high rate = Three Amigos sessions not deep enough)

Test coverage at PR merge:
  formula: coverage % when PR merges (not at end of sprint)
  target:  > 80% before merge, not retrofitted after

Time from code complete to test complete:
  formula: hours between "dev done" and "QA signed off"
  target:  < 24 hours (indicates test environment and data are ready)
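The escape-rate metric above can be computed directly from bug tickets once each carries a detection-stage field. A sketch with hypothetical counts:

```python
# Bugs counted by the stage at which they were detected (illustrative month of data)
BUGS_BY_STAGE = {
    "requirements": 4,
    "development": 30,
    "qa_testing": 6,
    "production": 1,
}

def escape_rate(found_in_stage: int, found_in_next_stage: int) -> float:
    """Fraction of defects that escaped a stage: next-stage finds / this-stage finds."""
    if found_in_stage == 0:
        return float("inf") if found_in_next_stage else 0.0
    return found_in_next_stage / found_in_stage

dev_to_qa = escape_rate(BUGS_BY_STAGE["development"], BUGS_BY_STAGE["qa_testing"])
assert dev_to_qa == 0.2   # 20%: above the < 10% target, so act on it
```

Tracking this per sprint (rather than once per release) gives an early signal that unit and integration coverage are slipping.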

Common Failure Cases

Pre-commit hook is too slow, so developers bypass it with --no-verify
  Why:    running the full unit test suite on every commit takes more than
          30 seconds, making the hook feel like an obstacle.
  Detect: bypasses leave no trace in git itself; the signal is a rising bug
          rate into QA, or CI failing on checks the hook should have caught.
  Fix:    scope the pre-commit unit test step to tests/unit/ with --timeout=5
          and -x (fail fast), keeping the hook under 15 seconds; run
          integration tests in CI on push instead.

API contract review is skipped because the OpenAPI spec is generated after implementation
  Why:    teams generate the OpenAPI spec from code annotations rather than
          writing it first, so there is nothing to review before
          implementation starts.
  Detect: the spec file is only ever modified in the same commit that
          implements the endpoint, never before.
  Fix:    require the OpenAPI spec to be committed in a separate PR before the
          implementation PR is opened; use the spec as the implementation
          contract, not a by-product.

Three Amigos sessions produce ACs that are not measurable
  Why:    product managers define ACs in business language without specifying
          the user type, data conditions, or pass/fail threshold, leaving QA
          to guess at test specifics.
  Detect: ACs contain words like "fast," "appropriate," or "should work"
          without numerical thresholds or explicit personas.
  Fix:    block Three Amigos sign-off until every AC passes the checklist:
          verifiable outcome, named user type, defined error cases, and
          non-functional requirements with numbers where applicable.

Shift-left metrics show good escape rates but are measured at the wrong boundary
  Why:    teams measure the defect escape rate from QA into production but
          ignore the rate from development into QA, missing the shift-left
          signal entirely.
  Detect: QA still finds large numbers of bugs that unit or integration tests
          should have caught, and the dev-to-QA escape rate is never tracked.
  Fix:    add a "defect detection stage" field to every bug ticket and chart
          the distribution of detection stages monthly; the target is to move
          the peak leftward over time.

Connections

qa-hub · qa/agile-qa · qa/test-planning · qa/bdd-gherkin · qa/defect-prevention · qa/continuous-testing · qa/qa-in-devops

Open Questions

  • What testing scenarios does this technique systematically miss?
  • How does this approach need to change when delivery cadence moves to continuous deployment?