"The graph shows we improved by 300%." Look again. The Y-axis starts at 90, not zero. The real improvement was 3%. Graphs are powerful communication tools, but they can mislead — intentionally or not. This article teaches you to identify and avoid misleading visualizations.
A graph can lie without containing a single false number.
Distortion Techniques
1. Truncating the Y-axis
Misleading Graph:           Honest Graph:

Latency                     Latency
102 ┤    ╭─╮                200 ┤
101 ┤   ╱   ╲               150 ┤
100 ┤──╯     ╲              100 ┤──────────
 99 ┤         ╲              50 ┤
 98 ┼──────────╯              0 ┼──────────
    Jan  Feb  Mar               Jan  Feb  Mar

"Latency dropped 4%!"       "Latency stable"
Why it misleads:
- Exaggerates small variations
- Makes trivial change look significant
When it's acceptable:
- When variation is genuinely important
- With clear warning that axis is truncated
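In matplotlib, the honest version costs one line. A minimal sketch with made-up data (the series, labels, and figure size are illustrative, not from a real system):

```python
import matplotlib.pyplot as plt

# Illustrative data: monthly latency hovering around 100
months = ["Jan", "Feb", "Mar"]
latency = [100, 102, 98]

fig, (ax_bad, ax_good) = plt.subplots(1, 2, figsize=(8, 3))

# Misleading: matplotlib auto-fits the axis to the data,
# so a ~4% wiggle fills the entire plot area
ax_bad.plot(months, latency)
ax_bad.set_title("Auto-scaled (misleading)")

# Honest: anchor the axis at zero so variation appears in proportion
ax_good.plot(months, latency)
ax_good.set_ylim(bottom=0)
ax_good.set_title("Axis from zero (honest)")

for ax in (ax_bad, ax_good):
    ax.set_ylabel("Latency")
plt.tight_layout()
plt.show()
```

If you genuinely need a truncated axis, set the limits explicitly and say so in the title or a caption, rather than letting auto-scaling decide silently.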
2. Inconsistent scale
Before:                     After:

p95 (ms)                    p95 (ms)
1000 ┤                      500 ┤
 800 ┤    ╭───              400 ┤
 600 ┤   ╱                  300 ┤    ╭───
 400 ┤──╯                   200 ┤   ╱
 200 ┤                      100 ┤──╯
   0 ┼─────                   0 ┼─────

"Before was 800ms!"         "Now it's 300ms!"
Reality: latency really did fall from 800ms to 300ms.
But the two graphs use different scales, so the "after" line sits at the same visual height as the "before" line and the improvement can't be read from the pictures. Plotting both panels on a shared scale fixes this, as the sketch below shows.
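With matplotlib's `sharey` option, the two panels are forced onto one scale. A minimal sketch with illustrative numbers:

```python
import matplotlib.pyplot as plt

# Illustrative before/after p95 series (ms)
before_ms = [400, 500, 800, 800]
after_ms = [100, 150, 300, 300]

# sharey=True puts both panels on one scale:
# equal heights now mean equal values
fig, (ax_before, ax_after) = plt.subplots(1, 2, sharey=True, figsize=(8, 3))
ax_before.plot(before_ms)
ax_before.set_title("Before")
ax_after.plot(after_ms)
ax_after.set_title("After")
ax_before.set_ylabel("p95 (ms)")
ax_before.set_ylim(bottom=0)
plt.tight_layout()
plt.show()
```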
3. Selective time period
Last 3 months:              Last year:

Error %                     Error %
5 ┤                         5 ┤    ╭─╮
4 ┤                         4 ┤   ╱   ╲
3 ┤    ╭──                  3 ┤  ╱     ╲
2 ┤   ╱                     2 ┤ ╱       ╲
1 ┤──╯                      1 ┤╯         ╲──
0 ┼─────────                0 ┼─────────────
  Jan Feb Mar                 Jan   Jul   Jan

"Errors tripled!"           "Back to normal after spike"
4. Aggregation that hides
Daily average:              Hourly:

Latency (ms)                Latency (ms)
200 ┤                       2000 ┤        ╭╮
150 ┤─────────              1500 ┤        ││
100 ┤                       1000 ┤        ││
 50 ┤                        500 ┤────────╯╰────
  0 ┼─────────                 0 ┼──────────────
    Mon Tue Wed                  0h  6h  12h 18h

"Latency stable at 150ms"   "2s spike at 2pm"
5. Wrong metric choice
Throughput:                 Latency:

(req/s)                     (ms)
5000 ┤      ╭───            5000 ┤        ╱
4000 ┤     ╱                4000 ┤       ╱
3000 ┤    ╱                 3000 ┤      ╱
2000 ┤   ╱                  2000 ┤     ╱
1000 ┤──╯                   1000 ┤────╯
   0 ┼─────────                0 ┼─────────
     Load →                      Load →

"System scales well!"       "System saturates at 3000 req/s"
Honest Graphs
Basic rules
1. Y-axis starts at zero:
- Except when justified AND signaled
2. Consistent scales:
- Same scale when comparing periods
3. Adequate time context:
- Period that shows complete pattern
- No cherry-picking start/end
4. Appropriate aggregation:
- Don't hide variance
- Show distribution when relevant
5. Relevant metrics:
- Show what matters for the question
- Include correlated metrics
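If your team plots with matplotlib, the first of these rules can be baked into a small helper applied to every chart. A minimal sketch; the helper name and defaults are ours, not a standard API:

```python
import matplotlib.pyplot as plt

def apply_honest_defaults(ax, ylabel, title):
    """Hypothetical helper: baseline settings for an honest chart."""
    ax.set_ylim(bottom=0)      # rule 1: axis starts at zero
    ax.set_ylabel(ylabel)      # units visible on the axis
    ax.set_title(title)
    ax.grid(True, alpha=0.3)

fig, ax = plt.subplots()
ax.plot(["Jan", "Feb", "Mar"], [100, 102, 98])
apply_honest_defaults(ax, "Latency (ms)", "Checkout latency")
plt.show()
```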
Well-made latency graph
Essential elements:
┌──────────────────────────────────────────┐
│ Checkout Latency - Last 24h              │
│                                          │
│ ms                                       │
│ 500 ┤                          p99       │
│ 400 ┤        ╭╮         ╭╮               │
│ 300 ┤       ╱  ╲       ╱  ╲        p95   │
│ 200 ┤──────╯    ╲─────╯    ╲─────────    │
│ 100 ┤  p50 ──────────────────────────    │
│   0 ┼────────────────────────────────    │
│      0h     6h     12h    18h    24h     │
│                                          │
│ ⚠ Deploy at 2pm | Normal peak: 11am-1pm  │
└──────────────────────────────────────────┘
Includes:
- Multiple percentiles
- Complete period
- Axis starting at zero
- Event annotations
- Context (peak hours)
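A sketch of how such a graph could be produced with numpy and matplotlib. The data is synthetic (log-normal latencies with parameters we chose), and the deploy time comes from the example above:

```python
import matplotlib.pyplot as plt
import numpy as np

# Synthetic data: 100 latency samples per minute over 24h (ms)
rng = np.random.default_rng(7)
samples = rng.lognormal(mean=4.4, sigma=0.5, size=(24 * 60, 100))

# Per-minute percentiles
p50 = np.percentile(samples, 50, axis=1)
p95 = np.percentile(samples, 95, axis=1)
p99 = np.percentile(samples, 99, axis=1)

fig, ax = plt.subplots()
hours = np.arange(24 * 60) / 60
ax.plot(hours, p50, label="p50")
ax.plot(hours, p95, label="p95")
ax.plot(hours, p99, label="p99")
ax.axvline(14, linestyle="--", color="gray", label="Deploy at 2pm")  # event annotation
ax.set_ylim(bottom=0)                 # axis from zero
ax.set_xlabel("Hour of day")
ax.set_ylabel("Latency (ms)")
ax.set_title("Checkout Latency - Last 24h")
ax.legend()
plt.show()
```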
Dashboard that doesn't mislead
Recommended layout:
┌────────────────────────────────────────────┐
│ OVERVIEW - Order System                    │
│                                            │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐     │
│ │   p95    │ │  Errors  │ │Throughput│     │
│ │  180ms   │ │   0.3%   │ │  1.2K/s  │     │
│ │   ↓12%   │ │   ↓50%   │ │   ↑15%   │     │
│ └──────────┘ └──────────┘ └──────────┘     │
│                                            │
│ Latency by Percentile (7 days)             │
│ [graph with p50, p95, p99]                 │
│                                            │
│ Latency Distribution (histogram)           │
│ [shows distribution shape]                 │
│                                            │
│ Correlation: Latency vs Throughput         │
│ [scatter plot with trend line]             │
└────────────────────────────────────────────┘
Includes:
- Numbers and trend (Δ vs previous period)
- Multiple perspectives (time, distribution, correlation)
- Consistent comparison
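The Δ in each card is just a comparison against the previous period. A trivial sketch (the metric values are illustrative):

```python
def delta_pct(current: float, previous: float) -> float:
    """Percentage change vs the previous period."""
    return (current - previous) / previous * 100

# Illustrative values for the p95 card
p95_now, p95_prev = 180.0, 205.0
print(f"p95 {p95_now:.0f}ms ({delta_pct(p95_now, p95_prev):+.0f}%)")
# -> p95 180ms (-12%)
```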
Common Mistakes and Fixes
Bar comparison
❌ Wrong:
Bars with different colors,
no legend, truncated axis
✅ Correct:
Same scale, meaningful colors,
clear legend, axis from zero
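A sketch of the correct version in matplotlib (service names and numbers are invented):

```python
import matplotlib.pyplot as plt

# Invented example: p95 latency per service, before and after a change
services = ["API", "Worker", "DB"]
before_ms = [120, 95, 200]
after_ms = [80, 90, 150]

fig, ax = plt.subplots()
width = 0.35
x = range(len(services))
ax.bar([i - width / 2 for i in x], before_ms, width, label="Before")
ax.bar([i + width / 2 for i in x], after_ms, width, label="After")
ax.set_xticks(list(x))
ax.set_xticklabels(services)
ax.set_ylabel("p95 latency (ms)")   # units on the axis
ax.set_ylim(bottom=0)               # axis from zero
ax.legend()                         # clear legend
plt.show()
```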
Pie chart
❌ Avoid for:
- Many slices (>5)
- Precise comparisons
- Similar values
✅ Use for:
- Proportions of a whole
- Few segments
- When % is more important than absolute value
Trend lines
❌ Wrong:
Straight line on non-linear data
✅ Correct:
- Choose appropriate model
- Show confidence interval
- Don't extrapolate beyond data
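Before drawing a trend line, check whether a straight line is even a reasonable model. A sketch using scipy (the data points are invented and deliberately non-linear):

```python
import numpy as np
from scipy import stats

# Invented, deliberately non-linear data: p95 latency vs load
load = np.array([1000, 2000, 3000, 4000, 5000])
p95_ms = np.array([52, 61, 78, 120, 310])

fit = stats.linregress(load, p95_ms)
print(f"slope={fit.slope:.3f} ms per req/s, r^2={fit.rvalue**2:.2f}")

# Even a decent-looking r^2 can hide structured residuals;
# inspect them before trusting the line, and never extrapolate
# past the last observed load
residuals = p95_ms - (fit.intercept + fit.slope * load)
print(residuals)
```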
Communicating Honestly
For executives
Do:
- Simplify without distorting
- Highlight what matters for decision
- Include minimum necessary context
- Indicate uncertainty
Don't:
- Exaggerate improvements
- Hide problems
- Use misleading scales
- Omit comparison period
For technical teams
Do:
- Show complete distribution
- Include correlations
- Allow drill-down
- Document methodology
Don't:
- Report averages without percentiles
- Aggregate excessively
- Hide outliers
Visualization Checklist
## Before publishing a graph
### Axes
- [ ] Y starts at zero (or justified)?
- [ ] Scales consistent between graphs?
- [ ] Labels clear?
- [ ] Units indicated?
### Data
- [ ] Period is representative?
- [ ] Aggregation is appropriate?
- [ ] Outliers handled correctly?
- [ ] Sample is sufficient?
### Context
- [ ] Baseline indicated?
- [ ] Relevant events annotated?
- [ ] Source clear?
- [ ] Date indicated?
### Honesty
- [ ] Is the first impression correct?
- [ ] Would someone without context understand?
- [ ] Am I showing reality?
Real Examples
Case 1: Improvement report
Misleading version:
"Latency reduced 400%" (a reduction greater than 100% is impossible)
[graph with truncated Y-axis]
Honest version:
"Latency p95 reduced from 250ms to 180ms (-28%)
after query optimization. Baseline: last week.
Measured in staging environment with similar load."
Case 2: Production dashboard
Misleading version:
[Only average latency]
"System healthy: 100ms"
Honest version:
[Percentiles p50/p95/p99]
"p50=80ms, p95=200ms, p99=1.2s
⚠ High p99 indicates issues for 1% of users"
Conclusion
Honest graphs:
- Axes from zero - except when clearly justified
- Consistent scales - for valid comparisons
- Representative period - no cherry-picking
- Adequate aggregation - don't hide variance
- Context included - baselines, events, methodology
Remember: a misleading graph destroys trust. An honest graph, even showing problems, builds credibility.
The goal isn't for the graph to look good. It's for reality to be understood.
This article is part of the series on the OCTOPUS Performance Engineering methodology.