Performance Testing (k6/NBomber) (EN)
Performance testing is not "make it fast"; it is about proving that a system meets SLOs under expected and extreme conditions.
At TL/Principal level you should cover:
- what to measure
- how to run experiments safely
- how to interpret results
1) Key concepts
- Latency: response time distribution (p50/p95/p99), not just the average.
- Throughput: requests/sec.
- Error rate: timeouts, 5xx, saturation failures.
- Saturation signals: CPU, memory, GC, thread pool starvation, DB connections.
Golden rule
Always report percentiles and error rate, not just the mean.
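In k6, thresholds can encode this rule directly, so a run fails when a percentile or the error rate regresses. A minimal sketch; the endpoint and limit values are placeholders, not real SLOs:

```javascript
// Report p50/p95/p99 and gate on error rate, not just the mean.
import http from 'k6/http';
import { sleep } from 'k6';

export const options = {
  vus: 10,
  duration: '1m',
  // k6's default summary shows p(90)/p(95); ask for p50 and p99 too.
  summaryTrendStats: ['avg', 'p(50)', 'p(95)', 'p(99)', 'max'],
  thresholds: {
    http_req_duration: ['p(95)<200', 'p(99)<500'], // latency in ms
    http_req_failed: ['rate<0.01'],                // < 1% errors
  },
};

export default function () {
  http.get('https://test.example.com/health'); // placeholder endpoint
  sleep(1);
}
```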
2) Test types
- Load test: expected traffic
- Stress test: beyond expected, find breaking point
- Soak test: long duration, find leaks/slow degradation
- Spike test: sudden burst
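Each of these maps onto a k6 scenario executor. A sketch with illustrative numbers; note that k6 runs scenarios concurrently, so in practice you would run one profile per run or stagger them with startTime:

```javascript
import http from 'k6/http';

export const options = {
  scenarios: {
    // Load: flat, expected traffic.
    load: {
      executor: 'constant-arrival-rate',
      rate: 100, timeUnit: '1s', duration: '10m',
      preAllocatedVUs: 50,
    },
    // Stress: ramp past expected traffic to find the breaking point.
    stress: {
      executor: 'ramping-arrival-rate',
      startRate: 100, timeUnit: '1s', preAllocatedVUs: 300,
      stages: [{ target: 1000, duration: '15m' }],
    },
    // Soak: hold normal load for hours to surface leaks/degradation.
    soak: {
      executor: 'constant-vus',
      vus: 50, duration: '4h',
    },
    // Spike: sudden burst, brief hold, sharp drop.
    spike: {
      executor: 'ramping-vus',
      startVUs: 0,
      stages: [
        { target: 500, duration: '10s' },
        { target: 500, duration: '1m' },
        { target: 0, duration: '10s' },
      ],
    },
  },
};

export default function () {
  http.get('https://test.example.com/api/orders'); // placeholder
}
```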
3) Designing a good experiment
- Define a hypothesis ("p95 < 200 ms at 500 rps"; see the sketch after this list)
- Control variables (same build/config)
- Warm up caches
- Measure from stable state
- Repeat runs
Pitfall: benchmarking in unstable environments (noisy neighbors).
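Putting this together, the hypothesis above translates into a script where warm-up traffic is generated but excluded from the verdict, by scoping thresholds to the steady-state scenario's tag. A sketch with placeholder endpoint and sizing:

```javascript
// Hypothesis: "p95 < 200 ms at 500 rps", measured only in steady state.
import http from 'k6/http';

export const options = {
  scenarios: {
    warmup: {
      executor: 'constant-arrival-rate',
      rate: 100, timeUnit: '1s', duration: '2m',
      preAllocatedVUs: 100,
    },
    steady: {
      executor: 'constant-arrival-rate',
      rate: 500, timeUnit: '1s', duration: '10m',
      preAllocatedVUs: 600,
      startTime: '2m', // begin measuring only after warm-up
    },
  },
  thresholds: {
    // Only steady-state traffic counts against the hypothesis;
    // k6 tags every request with its scenario name automatically.
    'http_req_duration{scenario:steady}': ['p(95)<200'],
    'http_req_failed{scenario:steady}': ['rate<0.01'],
  },
};

export default function () {
  http.get('https://test.example.com/api/orders'); // placeholder
}
```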
4) Tooling (conceptual)
- k6: great for HTTP load; scripted scenarios and built-in thresholds.
- NBomber: good for .NET-native load tests; scenarios are plain .NET code, so any protocol can be exercised.
(This is the theory item; the k6 snippets above and below are minimal sketches, not full implementations.)
5) Interpreting results (lead lens)
When p95 spikes:
- check GC pauses and allocation rate
- check thread pool starvation
- check DB latency and connection pool
- check downstream dependencies
Always correlate:
- traces + logs + metrics
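Correlation is easier if the load generator stamps every request with an ID the server logs. A minimal k6 sketch; the header name, 200 ms cutoff, and endpoint are assumptions for illustration:

```javascript
import http from 'k6/http';
// uuidv4 comes from the official k6 jslib utilities.
import { uuidv4 } from 'https://jslib.k6.io/k6-utils/1.4.0/index.js';

export default function () {
  const id = uuidv4();
  const res = http.get('https://test.example.com/api/orders', {
    headers: { 'X-Correlation-Id': id }, // server should log this
  });
  // Print IDs of slow requests so they can be looked up in traces/logs.
  if (res.timings.duration > 200) {
    console.log(`slow request ${id}: ${res.timings.duration.toFixed(0)}ms`);
  }
}
```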
6) Performance test in CI/CD
TL policy:
- run small smoke performance tests on every PR
- run full suites nightly/on release candidates
- enforce thresholds (fail builds when SLOs regress)
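The enforcement mechanism can be the tool itself: `k6 run` exits non-zero when any threshold fails, so a plain CI step is enough to block the build. A smoke-sized sketch, with assumed limits and a hypothetical staging URL:

```javascript
// PR smoke test: tiny load, strict thresholds. A failed threshold
// makes `k6 run` exit non-zero, which fails the CI step.
import http from 'k6/http';
import { sleep } from 'k6';

export const options = {
  vus: 2,
  duration: '30s',
  thresholds: {
    http_req_duration: ['p(95)<300'],
    http_req_failed: ['rate<0.01'],
  },
};

export default function () {
  http.get('https://staging.example.com/api/health'); // placeholder
  sleep(1);
}
```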
7) Interview angle
Be ready to discuss:
- why p99 matters
- how you avoid false positives
- how you design performance budgets
8) Review checklist
- [ ] SLOs and thresholds defined.
- [ ] Tests report p50/p95/p99 + errors.
- [ ] Experiments are repeatable.
- [ ] Results are correlated with telemetry.
- [ ] Regressions block release.