In a monolithic application, a function call is practically instantaneous: nanoseconds. In a distributed system, each call traverses the network, adding milliseconds of latency. That difference of six orders of magnitude changes everything.

This article explores how network latency affects performance and what you can do to minimize its impact.
*The network is the most expensive lie in distributed computing.*
## The Real Cost of the Network

### Latency comparison
| Operation | Typical time |
|---|---|
| L1 cache access | 0.5 ns |
| RAM access | 100 ns |
| SSD read | 150 μs |
| Round-trip same datacenter | 0.5 ms |
| Round-trip same region | 1-5 ms |
| Round-trip cross-region | 50-150 ms |
| Round-trip intercontinental | 100-300 ms |
A network round trip within the same datacenter (0.5 ms) is about 5,000 times slower than a RAM access, and a million times slower than an L1 cache hit.
### The problem multiplies

```
User request
    ↓
API Gateway (1 ms)
    ↓
Service A (2 ms)
    ↓
Service B (2 ms)
    ↓
Database (3 ms)
    ↓
Response: 8 ms of network time alone
```
And that's assuming everything works on the first try.
## Network Latency Components

### 1. Propagation

Time for the physical signal to travel through the medium.

```
Speed of light in fiber ≈ 200,000 km/s
São Paulo → Virginia ≈ 8,000 km
Minimum time: 40 ms (one way)
Round trip: 80 ms (theoretical minimum)
```
You can't beat physics.
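That arithmetic is worth encoding once. A minimal sketch (the constant and function names are mine, not from any library) that computes the theoretical floor for a round trip over fiber:

```javascript
// Theoretical minimum round-trip time over fiber, ignoring every other
// latency component (transmission, processing, queuing).
const FIBER_SPEED_KM_PER_S = 200_000; // light in fiber ≈ 2/3 of c in vacuum

function minRttMs(distanceKm) {
  return (2 * distanceKm / FIBER_SPEED_KM_PER_S) * 1000;
}

minRttMs(8000); // São Paulo ↔ Virginia: 80 ms before anything else is counted
```

No amount of engineering brings a call below this floor; only moving the endpoints closer does.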
### 2. Transmission

Time to put all the bits on the medium.

```
1 MB payload on a 1 Gbps link   = 8 ms
1 MB payload on a 100 Mbps link = 80 ms
```
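The same back-of-envelope math as a sketch (the function name is mine):

```javascript
// Time to serialize a payload onto a link, ignoring propagation and protocol overhead
function transmissionMs(payloadBytes, linkBitsPerSec) {
  return (payloadBytes * 8 * 1000) / linkBitsPerSec;
}

transmissionMs(1_000_000, 1e9); // 1 MB on 1 Gbps   → 8 ms
transmissionMs(1_000_000, 1e8); // 1 MB on 100 Mbps → 80 ms
```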
### 3. Processing

Time spent in routers, firewalls, and load balancers. Each hop adds microseconds to milliseconds.

### 4. Queuing

Time spent waiting when links are congested. It can vary from zero to hundreds of milliseconds.
## Common Problems

### Chatty protocols

Many small calls instead of a few large ones.

```js
// Bad: 100 network calls
const items = [];
for (const id of ids) {
  items.push(await fetch(`/api/items/${id}`).then(r => r.json()));
}
```

```js
// Good: 1 network call
const items = await fetch(`/api/items?ids=${ids.join(',')}`).then(r => r.json());
```
### N+1 in services

The same N+1 problem known from databases, but between services.

```js
// Orders service
const orders = await getOrders(userId);                 // 1 call
for (const order of orders) {
  const customer = await getCustomer(order.customerId); // N calls
}
```
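The usual fix is a batch endpoint on the customers service. A minimal sketch, with `getCustomers` as a hypothetical batch call (stubbed in-memory so the snippet runs standalone):

```javascript
// Hypothetical batch endpoint, stubbed to stand in for one network call
let networkCalls = 0;
async function getCustomers(ids) {
  networkCalls++;
  return ids.map(id => ({ id, name: `customer-${id}` }));
}

async function loadCustomersFor(orders) {
  // Deduplicate ids, then fetch them all in a single round trip
  const customerIds = [...new Set(orders.map(o => o.customerId))];
  const customers = await getCustomers(customerIds);
  return new Map(customers.map(c => [c.id, c]));
}
```

With three orders referencing two customers, this makes one call instead of three.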
### Synchronous chained calls

```
A  →  B  →  C  →  D  →  Database
 5ms   5ms   5ms   5ms   = 20 ms minimum
```

Total latency is the sum of every call in the chain.
### Retry storms

Timeouts trigger retries, retries add load, and the extra load causes more timeouts.

```
Slow service
    ↓
Timeout (2 s)
    ↓
3 retries × 1000 clients = 3000 extra requests
    ↓
Service gets even slower
    ↓
Collapse
```
## Mitigation Strategies

### 1. Reduce calls

**Batching**: group operations.

```js
// Instead of N parallel requests
await Promise.all(ids.map(id => getItem(id)));

// Use a batch API: one request for all ids
await getItems(ids);
```

**Prefetching**: fetch data before you need it.

```js
// While processing page 1, fetch page 2
const page1 = await getPage(1);
const page2Promise = getPage(2); // request already in flight
// ... process page 1 ...
const page2 = await page2Promise;
```
### 2. Parallelism

```js
// Sequential: 15 ms total
const a = await serviceA(); // 5 ms
const b = await serviceB(); // 5 ms
const c = await serviceC(); // 5 ms
```

```js
// Parallel: 5 ms total (assuming no data dependencies)
const [a, b, c] = await Promise.all([
  serviceA(),
  serviceB(),
  serviceC(),
]);
```
### 3. Aggressive caching

Avoid network calls entirely when possible.

```js
const cache = new Map();

async function getUser(id) {
  if (cache.has(id)) return cache.get(id); // cache hit: no network
  const user = await userService.get(id);
  cache.set(id, user);
  return user;
}
```
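One refinement worth knowing: cache the promise instead of the resolved value, so concurrent requests for the same id share a single network call. A sketch, with a stub standing in for the real user service:

```javascript
// Stub standing in for the real user service (assumption for the example)
let networkCalls = 0;
const userService = {
  async get(id) { networkCalls++; return { id }; }
};

const cache = new Map();

function getUser(id) {
  if (!cache.has(id)) {
    const promise = userService.get(id).catch(err => {
      cache.delete(id); // don't cache failures
      throw err;
    });
    cache.set(id, promise); // cached before it resolves: concurrent callers share it
  }
  return cache.get(id);
}
```

Real caches also need a TTL or invalidation strategy; this sketch keeps entries forever.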
### 4. Compression

Reduce the number of bytes transmitted.

```
JSON response:     100 KB
Compressed (gzip):  15 KB
Savings: 85% of bandwidth
```
### 5. Keep-alive connections

Avoid the overhead of establishing a new connection per request.

```
New TCP connection:  ~1-3 ms
Existing connection: ~0 ms overhead
```
### 6. Locality

Keep services that communicate a lot physically close together.

```
Service A and B in the same zone:     0.5 ms
Service A and B in different regions: 50 ms
```
### 7. Asynchronous communication

When the caller doesn't need the response, don't wait for it.

```js
// Synchronous: blocks until the email is sent
await notificationService.send(email);

// Asynchronous: enqueue and move on
messageQueue.publish('send-email', email);
```
## Timeouts and Retries

### Configuring timeouts

A reasonable starting point:

```
Timeout = p99 latency × 2 (or more)
```

- Too short: false positives (healthy requests get cancelled)
- Too long: resources stuck waiting on a dead service
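To actually enforce a timeout on a call, `AbortController` works with `fetch` in browsers and in Node 18+; a minimal sketch:

```javascript
// Abort the request if it hasn't completed within `ms` milliseconds
async function fetchWithTimeout(url, ms) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), ms);
  try {
    return await fetch(url, { signal: controller.signal });
  } finally {
    clearTimeout(timer); // don't leak the timer on success or failure
  }
}
```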
### Retry with backoff

```js
const sleep = ms => new Promise(resolve => setTimeout(resolve, ms));

async function fetchWithRetry(url, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      const res = await fetch(url);
      if (!res.ok) throw new Error(`HTTP ${res.status}`); // fetch doesn't reject on HTTP errors
      return res;
    } catch (e) {
      if (i === maxRetries - 1) throw e;
      await sleep(Math.pow(2, i) * 100); // exponential backoff: 100 ms, 200 ms, 400 ms...
    }
  }
}
```
### Circuit breaker

Stop calling a service that is clearly unhealthy, and fail fast instead.

```js
// Sketch: fail fast once the failure rate crosses a threshold
if (failureRate > 0.5) {
  // Circuit open: fail immediately instead of waiting for a timeout
  throw new Error('Service unavailable');
}
```
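A fuller version tracks failures and recovers after a cooldown. This is a minimal consecutive-failures sketch; real implementations usually track a rolling failure rate and add a half-open state:

```javascript
// Opens after `threshold` consecutive failures, fails fast while open,
// and allows a new attempt once `cooldownMs` has passed.
class CircuitBreaker {
  constructor(threshold = 5, cooldownMs = 10_000) {
    this.threshold = threshold;
    this.cooldownMs = cooldownMs;
    this.failures = 0;
    this.openedAt = 0;
  }

  async call(fn) {
    const open = this.failures >= this.threshold &&
                 Date.now() - this.openedAt < this.cooldownMs;
    if (open) throw new Error('circuit open: failing fast');
    try {
      const result = await fn();
      this.failures = 0; // any success closes the circuit
      return result;
    } catch (err) {
      this.failures++;
      this.openedAt = Date.now();
      throw err;
    }
  }
}
```

Usage: wrap every remote call in `breaker.call(() => fetch(url))` so a struggling dependency fails fast instead of tying up resources.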
## Essential Metrics
| Metric | Why it matters |
|---|---|
| Latency p50, p95, p99 | Real user experience |
| Request rate | Call volume |
| Error rate | Communication failures |
| Retries | Indicates stability issues |
| Connection pool usage | Connection pressure |
| Bytes in/out | Data volume |
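For the latency percentiles in the table, a naive computation over raw samples looks like this (production systems use streaming sketches such as HDR histograms instead of sorting every sample):

```javascript
// Nearest-rank percentile over recorded latency samples (ms)
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  const idx = Math.ceil((p / 100) * sorted.length) - 1;
  return sorted[Math.max(0, idx)];
}

const latencies = [5, 7, 120, 6, 8, 9, 300, 7, 6, 5];
percentile(latencies, 50); // median
percentile(latencies, 90); // tail latency: dominated by the slow outliers
```

Note how the p90 is an order of magnitude above the median here; averages would hide that entirely.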
## Conclusion
The network is an inevitable reality in distributed systems. To minimize its impact:
- Reduce calls — batch, cache, prefetch
- Parallelize — when there's no dependency
- Compress — fewer bytes = less time
- Plan locality — nearby services = lower latency
- Use async — don't wait when you don't need to
- Configure timeouts — don't wait forever
- Monitor — you can't improve what you don't measure
Every network call is a bet. Minimize your bets.