In a monolithic application, a function call is practically instantaneous: nanoseconds. In a distributed system, each call traverses the network, adding milliseconds of latency. That difference of six orders of magnitude changes everything.

This article explores how network latency affects performance and what you can do to minimize its impact.
*The network is the most expensive lie in distributed computing.*
## The Real Cost of the Network

### Latency comparison
| Operation | Typical time |
|---|---|
| L1 cache access | 0.5 ns |
| RAM access | 100 ns |
| SSD read | 150 μs |
| Round-trip same datacenter | 0.5 ms |
| Round-trip same region | 1-5 ms |
| Round-trip cross-region | 50-150 ms |
| Round-trip intercontinental | 100-300 ms |
A network round trip within the same datacenter (0.5 ms) is about 5,000 times slower than a RAM access, and a million times slower than an L1 cache hit.
### The problem multiplies

```
User request
    ↓
API Gateway (1 ms)
    ↓
Service A (2 ms)
    ↓
Service B (2 ms)
    ↓
Database (3 ms)
    ↓
Response: 8 ms of network time alone
```
And that's assuming everything works on the first try.
## Network Latency Components

### 1. Propagation

Time for the physical signal to travel through the medium.

```
Speed of light in fiber ≈ 200,000 km/s
São Paulo → Virginia ≈ 8,000 km
Minimum time: 40 ms (one way)
Round trip: 80 ms (theoretical minimum)
```
You can't beat physics.
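That arithmetic is worth encoding once. A minimal sketch (the constant and function names are mine, not from any library) that computes the theoretical floor for a round trip over fiber:

```javascript
// Theoretical minimum round-trip time over fiber, ignoring every other
// latency component (transmission, processing, queuing).
const FIBER_SPEED_KM_PER_S = 200_000; // light in fiber ≈ 2/3 of c in vacuum

function minRttMs(distanceKm) {
  return (2 * distanceKm / FIBER_SPEED_KM_PER_S) * 1000;
}

minRttMs(8000); // São Paulo ↔ Virginia: 80 ms before anything else is counted
```

No amount of engineering brings a call below this floor; only moving the endpoints closer does.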
### 2. Transmission

Time to put all the bits on the medium.

```
1 MB payload on a 1 Gbps link   = 8 ms
1 MB payload on a 100 Mbps link = 80 ms
```
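The same back-of-envelope math as a sketch (the function name is mine):

```javascript
// Time to serialize a payload onto a link, ignoring propagation and protocol overhead
function transmissionMs(payloadBytes, linkBitsPerSec) {
  return (payloadBytes * 8 * 1000) / linkBitsPerSec;
}

transmissionMs(1_000_000, 1e9); // 1 MB on 1 Gbps   → 8 ms
transmissionMs(1_000_000, 1e8); // 1 MB on 100 Mbps → 80 ms
```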
### 3. Processing

Time spent in routers, firewalls, and load balancers. Each hop adds microseconds to milliseconds.

### 4. Queuing

Time spent waiting when links are congested. It can vary from zero to hundreds of milliseconds.
## Common Problems

### Chatty protocols

Many small calls instead of a few large ones.

```js
// Bad: 100 network calls
const items = [];
for (const id of ids) {
  items.push(await fetch(`/api/items/${id}`).then(r => r.json()));
}
```

```js
// Good: 1 network call
const items = await fetch(`/api/items?ids=${ids.join(',')}`).then(r => r.json());
```
### N+1 in services

The same N+1 problem known from databases, but between services.

```js
// Orders service
const orders = await getOrders(userId);                 // 1 call
for (const order of orders) {
  const customer = await getCustomer(order.customerId); // N calls
}
```
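The usual fix is a batch endpoint on the customers service. A minimal sketch, with `getCustomers` as a hypothetical batch call (stubbed in-memory so the snippet runs standalone):

```javascript
// Hypothetical batch endpoint, stubbed to stand in for one network call
let networkCalls = 0;
async function getCustomers(ids) {
  networkCalls++;
  return ids.map(id => ({ id, name: `customer-${id}` }));
}

async function loadCustomersFor(orders) {
  // Deduplicate ids, then fetch them all in a single round trip
  const customerIds = [...new Set(orders.map(o => o.customerId))];
  const customers = await getCustomers(customerIds);
  return new Map(customers.map(c => [c.id, c]));
}
```

With three orders referencing two customers, this makes one call instead of three.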
### Synchronous chained calls

```
A  →  B  →  C  →  D  →  Database
 5ms   5ms   5ms   5ms   = 20 ms minimum
```

Total latency is the sum of every call in the chain.
### Retry storms

Timeouts trigger retries, retries add load, and the extra load causes more timeouts.

```
Slow service
    ↓
Timeout (2 s)
    ↓
3 retries × 1000 clients = 3000 extra requests
    ↓
Service gets even slower
    ↓
Collapse
```
## Mitigation Strategies

### 1. Reduce calls

**Batching**: group operations.

```js
// Instead of N parallel requests
await Promise.all(ids.map(id => getItem(id)));

// Use a batch API: one request for all ids
await getItems(ids);
```

**Prefetching**: fetch data before you need it.

```js
// While processing page 1, fetch page 2
const page1 = await getPage(1);
const page2Promise = getPage(2); // request already in flight
// ... process page 1 ...
const page2 = await page2Promise;
```
### 2. Parallelism

```js
// Sequential: 15 ms total
const a = await serviceA(); // 5 ms
const b = await serviceB(); // 5 ms
const c = await serviceC(); // 5 ms
```

```js
// Parallel: 5 ms total (assuming no data dependencies)
const [a, b, c] = await Promise.all([
  serviceA(),
  serviceB(),
  serviceC(),
]);
```
### 3. Aggressive caching

Avoid network calls entirely when possible.

```js
const cache = new Map();

async function getUser(id) {
  if (cache.has(id)) return cache.get(id); // cache hit: no network
  const user = await userService.get(id);
  cache.set(id, user);
  return user;
}
```
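One refinement worth knowing: cache the promise instead of the resolved value, so concurrent requests for the same id share a single network call. A sketch, with a stub standing in for the real user service:

```javascript
// Stub standing in for the real user service (assumption for the example)
let networkCalls = 0;
const userService = {
  async get(id) { networkCalls++; return { id }; }
};

const cache = new Map();

function getUser(id) {
  if (!cache.has(id)) {
    const promise = userService.get(id).catch(err => {
      cache.delete(id); // don't cache failures
      throw err;
    });
    cache.set(id, promise); // cached before it resolves: concurrent callers share it
  }
  return cache.get(id);
}
```

Real caches also need a TTL or invalidation strategy; this sketch keeps entries forever.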
### 4. Compression

Reduce the number of bytes transmitted.

```
JSON response:     100 KB
Compressed (gzip):  15 KB
Savings: 85% of bandwidth
```
### 5. Keep-alive connections

Avoid the overhead of establishing a new connection per request.

```
New TCP connection:  ~1-3 ms
Existing connection: ~0 ms overhead
```
### 6. Locality

Keep services that communicate a lot physically close together.

```
Service A and B in the same zone:     0.5 ms
Service A and B in different regions: 50 ms
```
### 7. Asynchronous communication

When the caller doesn't need the response, don't wait for it.

```js
// Synchronous: blocks until the email is sent
await notificationService.send(email);

// Asynchronous: enqueue and move on
messageQueue.publish('send-email', email);
```
## Timeouts and Retries

### Configuring timeouts

A reasonable starting point:

```
Timeout = p99 latency × 2 (or more)
```

- Too short: false positives (healthy requests get cancelled)
- Too long: resources stuck waiting on a dead service
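To actually enforce a timeout on a call, `AbortController` works with `fetch` in browsers and in Node 18+; a minimal sketch:

```javascript
// Abort the request if it hasn't completed within `ms` milliseconds
async function fetchWithTimeout(url, ms) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), ms);
  try {
    return await fetch(url, { signal: controller.signal });
  } finally {
    clearTimeout(timer); // don't leak the timer on success or failure
  }
}
```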
### Retry with backoff

```js
const sleep = ms => new Promise(resolve => setTimeout(resolve, ms));

async function fetchWithRetry(url, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      const res = await fetch(url);
      if (!res.ok) throw new Error(`HTTP ${res.status}`); // fetch doesn't reject on HTTP errors
      return res;
    } catch (e) {
      if (i === maxRetries - 1) throw e;
      await sleep(Math.pow(2, i) * 100); // exponential backoff: 100 ms, 200 ms, 400 ms...
    }
  }
}
```
### Circuit breaker

Stop calling a service that is clearly unhealthy, and fail fast instead.

```js
// Sketch: fail fast once the failure rate crosses a threshold
if (failureRate > 0.5) {
  // Circuit open: fail immediately instead of waiting for a timeout
  throw new Error('Service unavailable');
}
```
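A fuller version tracks failures and recovers after a cooldown. This is a minimal consecutive-failures sketch; real implementations usually track a rolling failure rate and add a half-open state:

```javascript
// Opens after `threshold` consecutive failures, fails fast while open,
// and allows a new attempt once `cooldownMs` has passed.
class CircuitBreaker {
  constructor(threshold = 5, cooldownMs = 10_000) {
    this.threshold = threshold;
    this.cooldownMs = cooldownMs;
    this.failures = 0;
    this.openedAt = 0;
  }

  async call(fn) {
    const open = this.failures >= this.threshold &&
                 Date.now() - this.openedAt < this.cooldownMs;
    if (open) throw new Error('circuit open: failing fast');
    try {
      const result = await fn();
      this.failures = 0; // any success closes the circuit
      return result;
    } catch (err) {
      this.failures++;
      this.openedAt = Date.now();
      throw err;
    }
  }
}
```

Usage: wrap every remote call in `breaker.call(() => fetch(url))` so a struggling dependency fails fast instead of tying up resources.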
## Essential Metrics
| Metric | Why it matters |
|---|---|
| Latency p50, p95, p99 | Real user experience |
| Request rate | Call volume |
| Error rate | Communication failures |
| Retries | Indicates stability issues |
| Connection pool usage | Connection pressure |
| Bytes in/out | Data volume |
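For the latency percentiles in the table, a naive computation over raw samples looks like this (production systems use streaming sketches such as HDR histograms instead of sorting every sample):

```javascript
// Nearest-rank percentile over recorded latency samples (ms)
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  const idx = Math.ceil((p / 100) * sorted.length) - 1;
  return sorted[Math.max(0, idx)];
}

const latencies = [5, 7, 120, 6, 8, 9, 300, 7, 6, 5];
percentile(latencies, 50); // median
percentile(latencies, 90); // tail latency: dominated by the slow outliers
```

Note how the p90 is an order of magnitude above the median here; averages would hide that entirely.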
## Conclusion
The network is an inevitable reality in distributed systems. To minimize its impact:
- Reduce calls — batch, cache, prefetch
- Parallelize — when there's no dependency
- Compress — fewer bytes = less time
- Plan locality — nearby services = lower latency
- Use async — don't wait when you don't need to
- Configure timeouts — don't wait forever
- Monitor — you can't improve what you don't measure
Every network call is a bet. Minimize your bets.