The pyramid is the right shape, but the proportions are negotiable
The "test pyramid" — many unit tests, fewer integration tests, very few end-to-end tests — is correct as a default. It's also a default that gets misapplied. Different products and different teams should have different proportions, and the discipline is in choosing those proportions deliberately.

Unit tests — what they actually validate
A good unit test verifies the behaviour of a small piece of logic in isolation. Unit tests are cheap to run, easy to debug when they fail, and stable.

Where they're worth investing:
- Pure logic — calculations, parsers, transformations
- Algorithm correctness — sorting, deduplication, and their edge-case variants
- Boundary conditions — empty inputs, max sizes, off-by-one situations
Where they're often a waste:
- Mocking-heavy tests of orchestration code — they verify the mock setup, not the system
- Tests of trivial getters and setters
- Tests written purely to hit coverage targets without thinking about behaviour
The metric that matters isn't "% coverage". It's "would this test fail if I broke the behaviour it claims to verify".
Integration tests — the underrated layer
Integration tests verify that components work together as designed — your service against a real database, your handler against a real HTTP framework, your job runner against a real queue.

Where modern teams get the most leverage:
- Service-level integration tests — full HTTP request through a real router with a real database (test container)
- Workflow tests — kick off a multi-step business process, verify final state
- Contract tests — between services, verify the contract holds (Pact-style)
We've moved more of our investment here over the last few years. Modern test containers (Testcontainers, dockertest) make it cheap to spin up real Postgres, Redis, Kafka in CI. The fidelity is worth the slight speed cost.
End-to-end / UI tests — the few that matter
E2E tests run the full system, often through the UI. They're slow, flaky, and expensive. But the few critical user journeys must be E2E-tested:
- "User can sign up, log in, perform the core action, log out"
- "User can complete a purchase"
- "Admin can perform the operation that, if broken, would page someone"
Five to ten E2E tests, well-maintained, cover most of the value. Forty E2E tests are usually a sign of overinvestment in the wrong layer.
Flakiness is a tax on the whole team
A flaky test isn't a noise problem — it's a culture problem. The team that learns to "just rerun it" stops trusting tests in general. Patterns to fight back:
- Quarantine flakiness — flaky tests get isolated, fixed within a sprint, or deleted
- Track flake rate as a CI metric. Improvement is a line item.
- No "we know this one's flaky" comments in test code. Fix it or remove it.
- Async-by-default tests — explicit waits, not sleeps
Test data: the part nobody plans
- Factory functions that produce minimal valid objects for the unit under test
- Builders for complex test data shapes — readable test setup that names its variations
- Database fixtures loaded once per test class / suite when integration tests need realistic data
- Time-mocking infrastructure — don't let `time.Now()` make your tests flaky on Mondays
- Property-based testing for edge-case generation in domains where it fits (parsing, encoding, deterministic logic)
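The factory and builder bullets can be combined in one pattern. Below is a sketch using a hypothetical `Order` type: the factory returns a minimal valid object with a fixed timestamp (no `time.Now()`), and functional options name the variations, so setup reads as "an order, except it's refunded".

```go
package main

import (
	"fmt"
	"time"
)

// Order is a hypothetical domain object used to illustrate test-data builders.
type Order struct {
	ID        string
	Status    string
	Total     int // cents
	CreatedAt time.Time
}

// newOrder returns a minimal valid Order; options name the variations a test cares about.
func newOrder(opts ...func(*Order)) Order {
	o := Order{
		ID:     "order-1",
		Status: "paid",
		Total:  1000,
		// Fixed clock: defaults never depend on when the test runs.
		CreatedAt: time.Date(2026, 1, 2, 0, 0, 0, 0, time.UTC),
	}
	for _, opt := range opts {
		opt(&o)
	}
	return o
}

func refunded(o *Order) { o.Status = "refunded" }
func free(o *Order)     { o.Total = 0 }

func main() {
	base := newOrder()
	edge := newOrder(refunded, free)
	fmt.Println(base.Status, base.Total) // prints: paid 1000
	fmt.Println(edge.Status, edge.Total) // prints: refunded 0
}
```

Each test names only the fields it actually depends on; everything else stays at the factory's minimal valid default, which keeps setup readable and resilient to schema changes.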
The QA function in 2026
QA as a separate "test the developer's work" function has receded. What remains valuable:
- Test strategy across a product or product line — what to invest in where
- Exploratory testing during pre-release — humans find what test code doesn't
- Production validation — the team responsible for monitoring and verifying releases in production
- Customer-issue triage — the QA-adjacent team that owns "we got a bug report, is it real, what's the impact, who fixes it"
The "QA writes the tests, dev writes the code" handoff model is largely gone, and not missed.
One pattern we'd warn about
Coverage-target hunting. "We need 80% coverage" produces tests that hit lines without verifying behaviour. The tests pass; the bug ships anyway.

One pattern that always pays off
A "golden path" test for every critical feature, run as part of every CI build, with explicit ownership. When it breaks, someone is responsible for fixing it before merging anything else.

What's your test layer balance? And — for the property-based testing folks — has anyone found a domain where fuzzing or property tests catch substantially more bugs than example-based unit tests?