Debug a flaky Playwright test
Take a Playwright test that fails roughly 20% of the time due to a timing issue. Use the Playwright trace viewer to identify the exact race condition by examining the action timeline, network waterfall, and DOM snapshots. Fix the flakiness using proper waitFor conditions rather than arbitrary sleep() calls.
Why this matters
Flaky tests are the most corrosive thing in a test suite. They erode trust until the team starts ignoring failures entirely, at which point the suite provides no safety net. Debugging flakiness with the trace viewer rather than adding sleeps is the discipline that keeps a test suite trustworthy at scale.
Before you start
- Playwright installed with TypeScript
- A flaky test (write one deliberately: click a button before an API response has loaded)
- trace: 'on-first-retry' set in playwright.config.ts, with retries set to at least 1 (this mode records a trace only when a failed test is retried)
- Understanding of what a race condition is
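If you do not have a flaky test to hand, the sketch below shows one way to write one deliberately. Everything in it is an assumption made for illustration: the page markup, the https://example.test/ origin, the #status and #result elements, and the delays. The page and the API are served from route handlers so that no server is needed, and roughly one response in five is slower than the test's fixed sleep.

```typescript
// flaky.spec.ts: a deliberately racy test. The page, URLs, and delays are hypothetical.
import { test, expect } from '@playwright/test';

// The button is clickable at once, but its handler only has data to show
// after /api/data has loaded. #status flips to "Ready" at that point.
const html = `
  <button id="submit">Submit</button>
  <p id="status">Loading</p>
  <p id="result"></p>
  <script>
    let data = null;
    fetch('/api/data').then(r => r.json()).then(d => {
      data = d;
      document.getElementById('status').textContent = 'Ready';
    });
    document.getElementById('submit').addEventListener('click', () => {
      document.getElementById('result').textContent = data ? data.message : 'No data';
    });
  </script>`;

test('shows the message after Submit (flaky)', async ({ page }) => {
  // Serve the page and the API from route handlers so no server is needed.
  await page.route('https://example.test/', route =>
    route.fulfill({ contentType: 'text/html', body: html }));
  await page.route('**/api/data', async route => {
    // Assumed latency: roughly one response in five is slow.
    const delayMs = Math.random() < 0.2 ? 600 : 50;
    await new Promise(resolve => setTimeout(resolve, delayMs));
    await route.fulfill({
      contentType: 'application/json',
      body: JSON.stringify({ message: 'Loaded' }),
    });
  });

  await page.goto('https://example.test/');
  await page.waitForTimeout(300); // arbitrary sleep: covers only the fast responses
  await page.getByRole('button', { name: 'Submit' }).click();
  await expect(page.locator('#result')).toHaveText('Loaded');
});
```

Note that the button is never disabled here. Playwright's click() already waits for a button to become enabled, so a test that clicks a disabled button tends to wait rather than fail; the race only appears when the button is clickable before the data behind it is ready.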
Step-by-step guide
- 1
Reproduce the flakiness reliably
Run the test 20 times using: for i in {1..20}; do npx playwright test your-test.spec.ts; done. Record how many runs fail on the first attempt; with retries enabled, a run that fails and then passes on retry is reported as flaky rather than failed, and it still counts. If it fails fewer than 3 times in 20 runs, reduce the sleep between setup and action or remove it entirely; you need the failure to be reproducible enough to debug.
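Counting failures by eye gets tedious, so the repeat-and-count loop can also be scripted. The following is a minimal sketch, not part of Playwright: the file name count-failures.ts and the helpers summarize and runRepeatedly are hypothetical. It passes --retries=0 so that each exit code reflects the first attempt only.

```typescript
// count-failures.ts: run a spec repeatedly and tally failures (hypothetical helper).
import { spawnSync } from 'node:child_process';

// Summarize a list of process exit codes; any non-zero code counts as a failure.
export function summarize(exitCodes: number[]): { runs: number; failures: number; rate: number } {
  const failures = exitCodes.filter(code => code !== 0).length;
  const runs = exitCodes.length;
  return { runs, failures, rate: runs === 0 ? 0 : failures / runs };
}

// Run `npx playwright test <spec> --retries=0` the given number of times,
// one process per run, and collect the exit codes.
export function runRepeatedly(spec: string, times: number): number[] {
  const codes: number[] = [];
  for (let i = 0; i < times; i++) {
    const result = spawnSync('npx', ['playwright', 'test', spec, '--retries=0'], {
      stdio: 'ignore',
    });
    codes.push(result.status ?? 1); // a null status (killed process) counts as a failure
  }
  return codes;
}

// Only runs when a spec path is passed on the command line.
const [spec, times = '20'] = process.argv.slice(2);
if (spec) {
  const { runs, failures, rate } = summarize(runRepeatedly(spec, Number(times)));
  console.log(`${failures}/${runs} runs failed (${(rate * 100).toFixed(0)}%)`);
}
```

Assuming a TypeScript runner such as tsx is available, usage would look like: npx tsx count-failures.ts your-test.spec.ts 20. Playwright's own --repeat-each=20 flag is an alternative that repeats the test within a single run.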
- 2
Open the trace viewer
After a failure, run: npx playwright show-trace test-results/.../trace.zip. The trace viewer shows every action, a DOM snapshot before and after each step, the network waterfall, and the console. Identify which action in the timeline was the last successful one before the failure.
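There is only a trace.zip to open if the configuration recorded one. A minimal sketch of the relevant settings follows; treat it as a fragment to merge into your existing playwright.config.ts, and note that the retry count of 1 is an assumption, not a recommendation.

```typescript
// playwright.config.ts: a minimal sketch; merge with your existing config.
import { defineConfig } from '@playwright/test';

export default defineConfig({
  // 'on-first-retry' records a trace only when a failed test is retried,
  // so at least one retry must be allowed.
  retries: 1,
  use: {
    trace: 'on-first-retry',
    // Alternative, depending on your workflow: 'retain-on-failure' keeps a
    // trace for any failing run, even with retries set to 0.
  },
});
```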
- 3
Identify the race condition
Look at the network waterfall alongside the action timeline. If the test clicked a button before an API call returned, you will see the action happening before the network request completes. If a DOM element was not yet in the expected state, the DOM snapshot for that step will show you what it looked like when the assertion ran.
- 4
Fix with a proper waitFor
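Replace the implicit timing assumption with an explicit wait: await expect(page.getByRole('button', {name: 'Submit'})).toBeEnabled() before clicking, or page.waitForResponse('**/api/data') before asserting on the response's contents. Start waitForResponse before the action that triggers the request and await it afterwards; if you start waiting only after the action, a fast response can arrive first and the wait will time out. The wait should express a semantic condition, not a time duration.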
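As an illustration of condition-based waits, here is a sketch of a test with no sleep in it. The page markup, the https://example.test/ origin, the #status and #result elements, and the delays are all hypothetical, and the page is served from route handlers so the example needs no server. It shows waitForResponse registered before the navigation that triggers the request, followed by a wait on a visible signal that the page has processed the data.

```typescript
// fixed.spec.ts: condition-based waits. The page, URLs, and delays are hypothetical.
import { test, expect } from '@playwright/test';

// The button is clickable at once, but its handler only has data to show
// after /api/data has loaded. #status flips to "Ready" at that point.
const html = `
  <button id="submit">Submit</button>
  <p id="status">Loading</p>
  <p id="result"></p>
  <script>
    let data = null;
    fetch('/api/data').then(r => r.json()).then(d => {
      data = d;
      document.getElementById('status').textContent = 'Ready';
    });
    document.getElementById('submit').addEventListener('click', () => {
      document.getElementById('result').textContent = data ? data.message : 'No data';
    });
  </script>`;

test('shows the message after Submit', async ({ page }) => {
  await page.route('https://example.test/', route =>
    route.fulfill({ contentType: 'text/html', body: html }));
  await page.route('**/api/data', async route => {
    // Assumed latency: roughly one response in five is slow.
    const delayMs = Math.random() < 0.2 ? 600 : 50;
    await new Promise(resolve => setTimeout(resolve, delayMs));
    await route.fulfill({
      contentType: 'application/json',
      body: JSON.stringify({ message: 'Loaded' }),
    });
  });

  // Start listening BEFORE the navigation that triggers the request; a wait
  // registered afterwards can miss a fast response.
  const dataResponse = page.waitForResponse('**/api/data');
  await page.goto('https://example.test/');
  const response = await dataResponse;
  expect(response.ok()).toBe(true);

  // The response arriving does not prove the page has processed it, so also
  // wait for the visible signal that the data is in place.
  await expect(page.locator('#status')).toHaveText('Ready');

  const submit = page.getByRole('button', { name: 'Submit' });
  await expect(submit).toBeEnabled();
  await submit.click();
  await expect(page.locator('#result')).toHaveText('Loaded');
});
```

In a real app either wait may be enough on its own. The visible-state assertion is usually the stricter of the two, because it checks what the user would see rather than what the network delivered.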
- 5
Verify the fix holds
Run the test 50 times. If any failures remain, the root cause is deeper than your fix addresses; return to the trace viewer with the new failure. A properly fixed flaky test should have a 0% failure rate, not a reduced failure rate.