Write and run a k6 load test
Write a k6 load test that ramps from 50 to 200 virtual users over 2 minutes, holds at 200 VUs for 3 minutes, then ramps down. Assert that p95 response latency stays below 300ms and error rate stays below 0.1%. Generate a summary report and identify the specific request type causing the bottleneck.
Why this matters
Load testing is the most reliable way to discover performance problems before your users do. p95 latency is the metric that matters for user experience: it is the response time that the slowest 1 in 20 requests exceeds, which at production volume means thousands of affected users. k6 is lightweight enough to run in CI and powerful enough to simulate realistic production load.
Before you start
- k6 installed (brew install k6 or the Windows/Linux equivalent)
- An HTTP API you can load test (your own or a public test API)
- Basic JavaScript familiarity; k6 scripts are JS
- Understanding of what virtual users and requests per second mean
Step-by-step guide
- 1
Write the load test script
Create a k6 script with a default function that makes an HTTP GET request to your endpoint. Add a check that the response status is 200. Add a sleep(1) between requests to simulate realistic user think time. Without the sleep, each VU hammers the server as fast as possible, which is not how real users behave.
```javascript
import http from "k6/http";
import { check, sleep } from "k6";

export default function () {
  const res = http.get("https://your-api.example.com/users");
  check(res, {
    "status is 200": (r) => r.status === 200,
    "response time < 500ms": (r) => r.timings.duration < 500,
  });
  sleep(1); // simulate 1-second think time between requests
}
```
- 2
Define the load profile
Use the stages option to define the ramp: 0 to 50 VUs over 30 seconds, 50 to 200 VUs over 90 seconds, hold at 200 for 3 minutes, ramp down to 0 over 30 seconds. This shape gives the system time to warm up before hitting peak load.
```javascript
export const options = {
  stages: [
    { duration: "30s", target: 50 },  // warm up
    { duration: "90s", target: 200 }, // ramp to peak
    { duration: "3m", target: 200 },  // hold at peak
    { duration: "30s", target: 0 },   // ramp down
  ],
};
```
- 3
Add thresholds
Add thresholds to the options: http_req_duration with p(95) < 300, and http_req_failed with rate < 0.001. k6 will exit with a non-zero code if either threshold is breached; this turns the load test into a pass/fail CI gate rather than just a report generator.
```javascript
export const options = {
  stages: [
    { duration: "30s", target: 50 },
    { duration: "90s", target: 200 },
    { duration: "3m", target: 200 },
    { duration: "30s", target: 0 },
  ],
  thresholds: {
    // p95 latency must stay below 300ms
    http_req_duration: ["p(95)<300"],
    // error rate must stay below 0.1%
    http_req_failed: ["rate<0.001"],
  },
};

// Run: k6 run load-test.js
// Non-zero exit code if any threshold is breached
```
- 4
Run and read the summary
Run the test and read the end-of-test summary carefully. Look at: http_req_duration (p50, p95, p99), http_req_failed (rate), vus_max (did you hit your target VU count?), and iterations (total requests made). If p95 fails, identify when it breached by looking at the time series, not just the aggregate.
```
# Output a JSON summary for CI parsing:
k6 run --out json=results.json load-test.js

# The end-of-test summary in stdout looks like:
http_req_duration......: avg=142ms min=41ms med=128ms max=3.1s p(90)=243ms p(95)=289ms
http_req_failed........: 0.05%  ✓ 6      ✗ 12042
vus_max................: 200
iterations.............: 12048  33.46/s

# Check p(95) against your 300ms threshold.
# Check the http_req_failed rate against 0.001.
```
- 5
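Rather than grepping stdout, a CI step can check the exported summary programmatically. A minimal Node.js sketch, assuming the `--summary-export=summary.json` flag (a real but legacy k6 option; `handleSummary()` is the newer route); a small inline object stands in here for the real file:

```javascript
// Sketch: check k6 thresholds from an exported summary in a CI step (Node.js).
// In a real pipeline, read the file written by:
//   k6 run --summary-export=summary.json load-test.js
// e.g. JSON.parse(require("fs").readFileSync("summary.json", "utf8")).

function checkThresholds(summary) {
  // http_req_duration is a trend metric: percentiles are keyed as "p(95)"
  const p95 = summary.metrics.http_req_duration["p(95)"];
  // http_req_failed is a rate metric: "value" is the failure fraction
  const errRate = summary.metrics.http_req_failed.value;
  const failures = [];
  if (p95 >= 300) failures.push(`p95 ${p95}ms >= 300ms`);
  if (errRate >= 0.001) failures.push(`error rate ${(errRate * 100).toFixed(3)}% >= 0.1%`);
  return failures;
}

// Synthetic example summary (invented numbers, field names follow k6's export):
const summary = {
  metrics: {
    http_req_duration: { "p(95)": 289.4 },
    http_req_failed: { value: 0.0005 },
  },
};

const failures = checkThresholds(summary);
console.log(failures.length === 0 ? "PASS" : "FAIL: " + failures.join("; ")); // prints "PASS"
process.exitCode = failures.length === 0 ? 0 : 1;
```

Setting a non-zero exit code mirrors what k6 itself does on a threshold breach, so the same script works whether the gate runs inside k6 or as a separate pipeline step.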
Identify the bottleneck
Add groups to your script to separate different request types (GET /users vs POST /orders). Run again and compare p95 per group. The group with the highest p95 is your bottleneck. Check whether it is consistent across the run or spikes only at peak load; these have different root causes.
```javascript
import http from "k6/http";
import { check, group, sleep } from "k6";

export default function () {
  group("GET /users", () => {
    const res = http.get("https://your-api.example.com/users");
    check(res, { "users 200": (r) => r.status === 200 });
  });
  sleep(0.5);

  group("POST /orders", () => {
    const res = http.post(
      "https://your-api.example.com/orders",
      JSON.stringify({ product_id: 1, qty: 2 }),
      { headers: { "Content-Type": "application/json" } }
    );
    check(res, { "orders 201": (r) => r.status === 201 });
  });
  sleep(0.5);
}

// To surface p95 per group in the summary, add per-group submetric thresholds
// (group tag values are prefixed with "::", hence the triple colon):
//   thresholds: {
//     "http_req_duration{group:::GET /users}": ["p(95)<300"],
//     "http_req_duration{group:::POST /orders}": ["p(95)<300"],
//   }
// Compare the two p95 values to find the slow request type.
```
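To see not just which group is slow but when it breaches, the raw NDJSON from `k6 run --out json=results.json` can be post-processed. A hedged Node.js sketch: the Point line format and the `::`-prefixed group tag follow k6's JSON output, but the sample lines below are synthetic, and the p95 here uses the simple nearest-rank method rather than k6's exact interpolation:

```javascript
// Sketch: per-group p95 of http_req_duration from k6's NDJSON output
// (one JSON object per line). In practice:
//   const lines = require("fs").readFileSync("results.json", "utf8").split("\n");
// Synthetic sample points standing in for the real file:
const lines = [
  '{"type":"Point","metric":"http_req_duration","data":{"value":120.2,"tags":{"group":"::GET /users"}}}',
  '{"type":"Point","metric":"http_req_duration","data":{"value":131.7,"tags":{"group":"::GET /users"}}}',
  '{"type":"Point","metric":"http_req_duration","data":{"value":480.9,"tags":{"group":"::POST /orders"}}}',
  '{"type":"Point","metric":"http_req_duration","data":{"value":512.3,"tags":{"group":"::POST /orders"}}}',
];

// Nearest-rank p95 over a sorted copy of the samples
function p95(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const idx = Math.max(0, Math.ceil(sorted.length * 0.95) - 1);
  return sorted[idx];
}

// Bucket durations by the group tag (k6 prefixes group names with "::")
const byGroup = {};
for (const line of lines) {
  if (!line.trim()) continue;
  const point = JSON.parse(line);
  if (point.type !== "Point" || point.metric !== "http_req_duration") continue;
  const g = (point.data.tags.group || "").replace(/^::/, "") || "(no group)";
  (byGroup[g] ||= []).push(point.data.value);
}

for (const [g, vals] of Object.entries(byGroup)) {
  console.log(`${g}: p95=${p95(vals)}ms over ${vals.length} requests`);
}
```

The group with the higher p95 is the likelier bottleneck; bucketing the same points by timestamp instead of by group would show whether the breach is constant across the run or appears only at peak VUs.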