Feature Flags

Decoupling deployment from release — ship code dark, control visibility independently.

Decoupling deployment from release. Ship code dark, control visibility independently.


Flag Types

Release flag (temporary):
  Purpose: hide incomplete feature until ready
  Lifetime: days to weeks; delete once fully rolled out
  Example: new_checkout_flow = True | False

Experiment flag (A/B test):
  Purpose: test hypothesis with user segment
  Lifetime: days to weeks; delete after analysis
  Example: checkout_button_color = "blue" | "green"

Ops flag (operational):
  Purpose: kill switch for degraded services, rate limits
  Lifetime: permanent; changed at runtime under load
  Example: enable_email_notifications = True | False

Permission flag (permanent):
  Purpose: enable feature for specific user tier
  Lifetime: permanent; managed as entitlement
  Example: advanced_analytics = based on plan

Simple In-Process Flags (No External Service)

# flags.py — environment-driven flags, no SDK dependency
import os
from functools import lru_cache

@lru_cache(maxsize=None)
def get_flags() -> dict[str, bool]:
    """Load flags from environment. Cache for process lifetime."""
    return {
        "new_checkout_flow": os.getenv("FLAG_NEW_CHECKOUT", "false").lower() == "true",
        "enable_recommendations": os.getenv("FLAG_RECOMMENDATIONS", "false").lower() == "true",
        "admin_panel_v2": os.getenv("FLAG_ADMIN_V2", "false").lower() == "true",
    }

def flag(name: str) -> bool:
    return get_flags().get(name, False)

# Usage
if flag("new_checkout_flow"):
    return new_checkout_handler(request)
return legacy_checkout_handler(request)

Unleash (Open Source)

# pip install UnleashClient
from UnleashClient import UnleashClient
from UnleashClient.strategies import Strategy

client = UnleashClient(
    url="https://unleash.myapp.com/api",
    app_name="order-service",
    custom_headers={"Authorization": "Bearer *:default.secret"},
)
client.initialize_client()

def is_enabled(flag_name: str, user_id: str | None = None) -> bool:
    context = {"userId": user_id} if user_id else {}
    return client.is_enabled(flag_name, context)

# Gradual rollout: Unleash "gradual rollout" strategy
# Configures 10% → 25% → 50% → 100% rollout in the Unleash UI.
# SDK handles the consistent hashing (same user always gets same bucket).

# Custom strategy — e.g., enable for specific company
class CompanyStrategy(Strategy):
    name = "company"

    def load_provisioning(self) -> list:
        return self.parameters.get("companies", "").split(",")

    def apply(self, parameters: dict, context: dict) -> bool:
        company_id = context.get("properties", {}).get("companyId")
        return company_id in self.load_provisioning()

LaunchDarkly

# pip install launchdarkly-server-sdk
import ldclient
from ldclient import Config, Context

ldclient.set_config(Config(os.environ["LAUNCHDARKLY_SDK_KEY"]))
ld = ldclient.get()

def evaluate_flag(flag_key: str, user_id: str, default=False) -> bool:
    context = (
        Context.builder(user_id)
        .kind("user")
        .set("email", get_user_email(user_id))
        .set("plan", get_user_plan(user_id))
        .build()
    )
    return ld.variation(flag_key, context, default)

def evaluate_multivariate(flag_key: str, user_id: str, default: str) -> str:
    context = Context.builder(user_id).build()
    return ld.variation(flag_key, context, default)

# A/B test example
button_color = evaluate_multivariate("checkout_button_color", user_id, "blue")
# LD assigns "blue" or "green" consistently per user based on rollout rules

# Flag with targeting rules (configured in LD dashboard):
# IF user.plan == "enterprise" → true
# IF user.email matches "*@internal.com" → true
# ELSE rollout 10% → true

Testing with Feature Flags

# Pattern 1: inject flag as parameter (most testable)
def checkout(user_id: str, cart: Cart, use_new_flow: bool = False) -> Order:
    if use_new_flow:
        return new_checkout_flow(user_id, cart)
    return legacy_checkout_flow(user_id, cart)

def test_new_checkout_flow() -> None:
    order = checkout("user_1", cart, use_new_flow=True)
    assert order.status == "confirmed"

# Pattern 2: mock the flag client
from unittest.mock import patch

def test_feature_flag_off() -> None:
    with patch("myapp.flags.flag", return_value=False):
        response = client.post("/checkout", json=cart_data)
        assert "legacy" in response.json()["flow"]

# Pattern 3: override via environment in integration tests
import pytest

@pytest.fixture
def with_new_checkout(monkeypatch):
    monkeypatch.setenv("FLAG_NEW_CHECKOUT", "true")
    # Clear lru_cache so the new env value is picked up
    from myapp.flags import get_flags
    get_flags.cache_clear()
    yield
    get_flags.cache_clear()

def test_new_checkout_end_to_end(with_new_checkout, client) -> None:
    response = client.post("/checkout", json=cart_data)
    assert response.json()["flow"] == "new"

Gradual Rollout Pattern

# Consistent user bucketing without an SDK
import hashlib

def get_user_bucket(user_id: str, flag_name: str) -> int:
    """Returns 0-99 consistently for a given user+flag combination."""
    key = f"{flag_name}:{user_id}"
    hash_val = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return hash_val % 100

def is_in_rollout(user_id: str, flag_name: str, rollout_pct: int) -> bool:
    """True if user is in the first rollout_pct% bucket."""
    return get_user_bucket(user_id, flag_name) < rollout_pct

# Example rollout schedule (driven by ops flag config):
# Day 1: rollout_pct=1
# Day 3: rollout_pct=10  (watch metrics)
# Day 5: rollout_pct=50
# Day 7: rollout_pct=100 (flag removed, code path becomes default)

Flag Lifecycle Management

Creation → Rollout → Removal (the missing step most teams skip)

Signs a flag needs removal:
  - Flag has been at 100% for > 2 weeks
  - Both branches tested and stable
  - Flag is a release flag (temporary by design)

Removal process:
  1. Set flag to 100% in production (default)
  2. Remove flag evaluation from code (merge the winning path)
  3. Delete dead code path
  4. Archive flag in flag management system
  5. Update tests (remove flag-conditional test paths)

Anti-patterns:
  - Flag spaghetti: flags that depend on other flags
  - Zombie flags: release flags left in code for months
  - Missing cleanup: flagged code becomes permanent tech debt
  - Global singleton: flags evaluated in constructors (hard to test)

Common Failure Cases

Zombie flags accumulating as tech debt Why: release flags are created with no deletion plan; after a feature ships at 100%, the conditional code and both branches remain in production indefinitely. Detect: flags created more than 4 weeks ago that have been at 100% for more than 2 weeks with no open cleanup ticket. Fix: track flag creation date and owner; add an automated lint step that fails the build if a release flag's age exceeds the agreed threshold.

lru_cache holding stale flag values in long-running processes Why: get_flags() cached on first call never re-reads the environment, so a change to an env var has no effect until the process restarts. Detect: changing FLAG_X=true in the environment has no effect; only a process restart picks it up. Fix: either clear the cache explicitly after changing env vars in tests, or use a flag client (Unleash/LaunchDarkly) that polls for changes rather than a process-lifetime cache.

Flag spaghetti: flags that depend on other flags Why: team members combine if flag_a and flag_b and not flag_c inline, creating state space that is impossible to test exhaustively. Detect: a change to one flag breaks behaviour that seemed unrelated; reproducing a bug requires knowing the exact flag combination active at the time. Fix: flags should be independent; if a feature requires multiple flags, use a single enclosing flag and remove the inner ones.

Inconsistent bucket assignment in gradual rollout Why: using a non-deterministic bucketing function (e.g., random.random() < 0.1) means the same user gets a different experience on each request. Detect: a user reports the UI "flickering" between old and new versions on page refresh. Fix: always derive the bucket from a stable hash of flag_name + user_id so assignment is sticky.

Connections

se-hub · cs-fundamentals/api-security · cloud/blue-green-deployment · qa/continuous-testing · qa/shift-left-testing

Open Questions

  • What are the most common misapplications of this concept in production codebases?
  • When should you explicitly choose not to use this pattern or technique?