AI Application Architecture Patterns
Seven blueprints (RAG chatbot, document pipeline, classification routing, agentic loop, multi-agent, eval pipeline, hybrid human-AI) cover 90% of production AI applications — real systems combine two or three.
AI Engineering Learning Path
Four-stage curriculum for software engineers entering AI engineering — Foundations (1-2 weeks), Building (2-3 weeks), Production (2-3 weeks), Advanced (ongoing) — each stage has a concrete project to build.
Data as a System
Data as a first-class system concern — lineage, contracts, freshness, ownership, and consistency across services. Most production bugs are data bugs, not code bugs.
Debug: Agent Loop Not Terminating
Runbook for diagnosing agents that spin indefinitely, hit max iterations, or get stuck on a tool call.
Debug: Agent Not Using Tools
Runbook for diagnosing LLM agents that ignore available tools and answer from memory instead.
Debug: Alert Firing Incorrectly
Runbook for diagnosing alerts that fire when nothing is wrong, or fail to fire when something is.
Debug: API Timeout
Runbook for diagnosing intermittent or consistent API timeouts under load.
Debug: Auth Failing
Runbook for diagnosing 401 and 403 errors, JWT failures, token expiry, and OAuth misconfiguration.
Debug: Cache Inconsistency
Runbook for diagnosing stale, wrong, or missing data caused by cache inconsistency.
Debug: CI Pipeline Failing
Runbook for diagnosing CI pipelines that fail in CI but pass locally.
Debug: Cloud Cost Spike
Runbook for diagnosing unexpected cloud bill spikes and identifying the source of runaway spend.
Debug: CORS Error
Runbook for diagnosing CORS errors blocking browser requests to an API.
Debug: Data Pipeline Failing
Runbook for diagnosing silent data drops, stale outputs, or broken ingestion in data pipelines.
Debug: Database Migration Failing
Runbook for diagnosing DB migrations that stall, lock tables, or break running services.
Debug: Deadlock
Runbook for diagnosing DB or code deadlocks where requests hang indefinitely waiting for locks.
Debug: DNS Resolution Failing
Runbook for diagnosing DNS resolution failures causing services to be unreachable by name.
Debug: Duplicate Writes from Retries
Runbook for diagnosing duplicate records or side effects caused by retries or at-least-once delivery.
Debug: Embedding Quality Degraded
Runbook for diagnosing RAG accuracy drops caused by embedding model changes, stale indexes, or distribution shift.
Debug: Fine-Tuned Model Worse Than Base
Runbook for diagnosing a fine-tuned model that performs worse than the base model or regresses after training.
Debug: Flaky Test in CI
Runbook for diagnosing tests that pass locally but fail intermittently in CI.
Debug: Hallucination in Production
Runbook for diagnosing an LLM confidently returning wrong, fabricated, or outdated information in production.
Debug: High CPU
Runbook for diagnosing a process with CPU pegged at 100% or consistently high under load.
Debug: High Error Rate After Deploy
Runbook for diagnosing a spike in errors or 5xx responses immediately after a deployment.
Debug: Kubernetes Pod Not Starting
Runbook for diagnosing pods stuck in Pending, CrashLoopBackOff, or ImagePullBackOff.
Debug: LLM High Latency
Runbook for diagnosing slow LLM responses, stalling streams, or high time-to-first-token.
Debug: Memory Leak
Runbook for diagnosing process memory that climbs over time and never releases.
Debug: No Logs in Production
Runbook for diagnosing missing or incomplete logs in production when they work locally.
Debug: Prompt Injection Detected
Runbook for diagnosing and responding to prompt injection attacks in LLM applications and agents.
Debug: RAG Pipeline Slow
Runbook for diagnosing RAG pipelines where retrieval is slow, adding seconds of latency before the LLM even starts.
Debug: RAG Returning Wrong Context
Runbook for diagnosing RAG pipelines returning irrelevant, incomplete, or hallucinated answers.
Debug: Scaling Not Triggering
Runbook for diagnosing autoscalers that fail to scale up under load or to scale down after load drops.
Debug: Secret Leaked
Runbook for the first 30 minutes after discovering a leaked credential or secret in a public location.
Debug: Slow Query
Runbook for diagnosing slow database queries causing API latency or timeouts.
Debug: SSL Certificate Error
Runbook for diagnosing TLS handshake failures, expired certificates, and chain errors.
Debug: WebSocket Connection Dropping
Runbook for diagnosing WebSocket connections that drop, fail to connect, or cause reconnect storms.
Debugging Runbooks
Index of 32 production debugging runbooks organised by failure domain. Each runbook is a step-by-step guide for isolating and fixing a specific class of production failure.
Engineering Tradeoffs
The decisions that separate senior engineers — when to cache, when to use RAG vs fine-tuning, when to scale up vs out, when to accept inconsistency.
Getting Started with AI Engineering
First working Anthropic API call in under 10 minutes — API key setup, SDK install, stateless multi-turn conversation pattern, streaming, and the nine most common beginner mistakes.
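The stateless multi-turn pattern that page covers can be sketched in a few lines. This is a minimal illustration, not the page's exact code: the `add_turn` helper is an assumption, though the role/content message shape matches the Anthropic Messages API, which keeps no server-side session state.

```python
# Stateless multi-turn: the API keeps no conversation state, so the
# client resends the full history as `messages` on every request.
def add_turn(history, role, text):
    """Return a new history list with one more message appended."""
    return history + [{"role": role, "content": text}]

history = []
history = add_turn(history, "user", "What is a context window?")
history = add_turn(history, "assistant", "The context window is the ...")
history = add_turn(history, "user", "And why does it matter?")

# `history` is exactly what goes in the `messages` field of the next call.
print(len(history))
```

Forgetting to resend prior turns (and so losing all context) is one of the beginner mistakes the page lists.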
Graph Health Report — 2026-05-08 (vault:lint)
Link density audit across 467 content pages — score 100/100. 3 orphans (broken-link root cause), 81 under-linked, 137 hub pages.
Knowledge Gap Report — 2026-05-08
Ranked list of knowledge gaps relative to active areas — what to research next.
LLM Cost Optimisation
Seven cost levers (prompt caching, model routing, Batch API, prompt compression, output token control, semantic caching, streaming) applied together typically reduce LLM API costs by 60-90% without quality loss.
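One of those levers, semantic caching, can be sketched as below. This is a toy sketch under stated assumptions: real systems compare embedding vectors, while word-overlap (Jaccard) similarity and the 0.6 threshold here are placeholders for illustration.

```python
# Toy semantic cache: reuse a previous answer when a new query is
# "close enough" to a cached one, avoiding a paid API call entirely.
def similarity(a: str, b: str) -> float:
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

class SemanticCache:
    def __init__(self, threshold: float = 0.6):
        self.threshold = threshold
        self.entries: list[tuple[str, str]] = []  # (query, answer)

    def get(self, query: str):
        for cached_query, answer in self.entries:
            if similarity(query, cached_query) >= self.threshold:
                return answer  # cache hit: zero token cost
        return None  # cache miss: caller falls through to the LLM

    def put(self, query: str, answer: str):
        self.entries.append((query, answer))

cache = SemanticCache()
cache.put("what is prompt caching", "Prompt caching reuses ...")
print(cache.get("what is prompt caching?"))  # near-duplicate query
```

The threshold is the key tuning knob: too loose and users get answers to someone else's question, too tight and the hit rate (and the savings) collapses.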
LLM Decision Guide
Opinionated decision tables for every major AI engineering choice — model (Sonnet 4.6 default), embedding (Cohere 65.2 MTEB), vector store (pgvector if on Postgres), agent framework (LangGraph for stateful Python), observability (Langfuse self-hosted), fine-tuning (Axolotl), and the prompting → RAG → fine-tune → agents escalation order.
RAG vs Fine-Tuning
RAG manages knowledge (external, updateable, citable); fine-tuning shapes behaviour (style, terminology, format) — 57% of LLM-deploying organisations use RAG without fine-tuning; the most powerful pattern combines both.
Reasoning Model Patterns
A production decision framework for when to use reasoning models and extended thinking — covering task fit, budget_tokens selection, cross-provider comparison, and cost/latency tradeoffs.
Request Flow Anatomy
The full anatomy of a web request from user to database and back — where latency accumulates, where failures happen, and where retries apply.
Software Engineer to AI Engineer
The fastest path to AI engineering runs through software engineering fundamentals — debugging, APIs, testing, and data thinking transfer directly; the gap is knowing which AI primitives to reach for when.
System Architecture: Three Ms and Four Cs
The Three Ms and Four Cs framework describing the Nexus/Axiom system architecture and workflow.
Technical Communication
The communication layer that separates senior engineers from lead engineers — technical writing, stakeholder translation, ADRs, RFCs, and explaining tradeoffs without losing precision.
Vault Audit Report — 2026-05-03
Vault audit report — rule-based and semantic checks across all 377 pages. 2 broken links, 32 debug runbooks missing from index, 0 frontmatter issues, 3 complementary pairs (all cross-linked), 1 accuracy issue.