When you load a website, the data doesn't travel directly from a single server to your browser. If it did, users in Tokyo requesting content from a server in Virginia would experience crushing latency: even at the speed of light through fiber, a single round trip takes well over 100 milliseconds, and a typical page load involves dozens of round trips. That alone would make modern web experiences impossible.

Content delivery networks solve this by fundamentally changing where data lives. They maintain thousands of servers positioned at the edges of the internet—in data centers near major population hubs, inside internet exchange points, even colocated within ISP networks. Your request travels a few dozen miles instead of thousands.

This isn't just about speed. CDNs absorb massive amounts of traffic that would otherwise hammer origin servers into oblivion. They transform the internet's geography from a hub-and-spoke model into something closer to a mesh, where content exists simultaneously in hundreds of locations. Understanding how they achieve this requires examining three interconnected systems: cache hierarchy, request routing, and origin protection.

Edge Cache Strategy: The Hot-Cold Architecture

CDN cache architecture follows a tiered model designed around a fundamental truth: not all content is equally popular. A viral video might receive millions of requests per hour. An obscure product image might get requested once a week. Treating these identically wastes resources.

The edge layer sits closest to users—servers deployed in hundreds or thousands of locations worldwide. These caches store hot content: assets requested frequently enough that keeping local copies makes economic sense. The edge tier prioritizes latency reduction above all else. Cache capacity is limited, so eviction policies aggressively remove content that hasn't been accessed recently.
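
At its simplest, that recency-based eviction is least-recently-used (LRU). The sketch below shows the idea, assuming a single-node cache with a fixed entry budget; real edge caches layer frequency and size signals on top, and the class name and interface here are purely illustrative.

```python
from collections import OrderedDict

class EdgeCache:
    """Minimal LRU edge cache: recently requested assets stay, idle ones are evicted."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._store: OrderedDict[str, bytes] = OrderedDict()

    def get(self, key: str) -> bytes | None:
        if key not in self._store:
            return None  # cache miss: the caller falls back to the next tier
        self._store.move_to_end(key)  # mark as most recently used
        return self._store[key]

    def put(self, key: str, value: bytes) -> None:
        if key in self._store:
            self._store.move_to_end(key)
        self._store[key] = value
        if len(self._store) > self.capacity:
            self._store.popitem(last=False)  # evict the least recently used asset
```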

Behind the edge sits a mid-tier cache layer, sometimes called regional caches or PoP aggregators. These servers cover larger geographic areas and store a broader content catalog. When an edge server experiences a cache miss, it queries the mid-tier rather than going directly to origin. This creates a cache hierarchy where popular content gets served from edge, moderately popular content comes from mid-tier, and only cold content requires origin requests.
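
A minimal sketch of that miss path, assuming each tier exposes the same get/put interface as the EdgeCache above and `origin_fetch` stands in for the real upstream request:

```python
def fetch(key, edge, mid_tier, origin_fetch):
    """Resolve a request through the cache hierarchy: edge, then mid-tier, then origin."""
    value = edge.get(key)
    if value is not None:
        return value  # hot content: served straight from the edge

    value = mid_tier.get(key)
    if value is None:
        value = origin_fetch(key)  # cold content: only now does the request touch origin
        mid_tier.put(key, value)   # populate the regional tier first

    edge.put(key, value)           # promote toward the user for future requests
    return value
```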

The decision of what to cache where isn't static. CDNs use sophisticated heuristics combining request frequency, content size, time-to-live headers, and predicted demand patterns. Some systems employ machine learning to pre-position content before demand spikes—pushing videos to edge caches ahead of a scheduled premiere, for instance. The goal is maximizing cache hit ratio at each tier while respecting capacity constraints.
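
One way to picture such a heuristic is a toy admission score that folds those signals together. Every weight and threshold below is invented for illustration, not any particular CDN's formula:

```python
def should_cache(freq_per_hour: float, size_bytes: int,
                 ttl_seconds: int, predicted_demand: float) -> bool:
    """Toy admission heuristic: cache when the expected hits over an object's
    lifetime justify the bytes it would occupy. All weights are illustrative.
    predicted_demand is a forecast multiplier (0 = no extra demand expected)."""
    expected_hits = (freq_per_hour / 3600) * ttl_seconds * (1 + predicted_demand)
    bytes_per_hit = size_bytes / max(expected_hits, 1e-9)
    return bytes_per_hit < 64 * 1024  # admit if each hit "costs" under 64 KiB of cache
```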

Takeaway

Effective caching isn't about storing everything everywhere—it's about placing content where the probability of request multiplied by latency savings exceeds the cost of storage and invalidation.
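
In code, that rule is a single inequality. The units here are whatever the operator prices things in (request probability over some window, milliseconds saved, dollars of storage and invalidation cost); the function is a formalization, not a real system's API:

```python
def worth_caching(p_request: float, latency_saved_ms: float,
                  storage_cost: float, invalidation_cost: float) -> bool:
    """Cache an object only when its expected latency benefit outweighs its cost."""
    return p_request * latency_saved_ms > storage_cost + invalidation_cost
```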

Request Routing: Finding the Optimal Path

Every CDN must solve a deceptively complex problem: when a user requests content, which of potentially thousands of edge servers should respond? The answer involves balancing proximity, server load, availability, and network conditions—all within milliseconds.

DNS-based routing remains the most common approach. When a browser resolves a CDN-hosted domain, the CDN's authoritative nameserver returns different IP addresses based on the requesting resolver's location. A user in Singapore receives an IP pointing to a nearby edge server; a user in Brazil gets routed to São Paulo. This works reasonably well but has limitations. DNS resolvers don't always accurately represent end-user location, and DNS TTLs introduce lag when traffic needs redirecting.
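
The mechanism reduces to returning different answers for the same name. A toy resolver-side handler, with the region table invented for illustration and IPs drawn from the documentation ranges:

```python
# Toy geo-DNS: answer with the edge IP whose region matches where the
# query came from. The table and region keys are invented for illustration.
EDGE_POPS = {
    "ap-southeast": "203.0.113.10",   # Singapore
    "sa-east":      "198.51.100.20",  # São Paulo
    "us-east":      "192.0.2.30",     # Virginia (also the fallback)
}

def resolve(resolver_region: str) -> str:
    """Return a different A record depending on the resolver's location."""
    return EDGE_POPS.get(resolver_region, EDGE_POPS["us-east"])
```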

Anycast routing offers an alternative. Multiple edge servers advertise the same IP address, and BGP routing naturally directs packets to the topologically nearest instance. Anycast provides automatic failover—if a server goes down, BGP withdraws the route and traffic flows elsewhere. The tradeoff is less granular control. You're relying on internet routing, not your own optimization logic.
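
Anycast selection happens in routers, not application code, but its behavior can be simulated: several PoPs advertise the same prefix, the shortest surviving AS path wins, and withdrawing a route triggers failover. A sketch with made-up AS numbers from the documentation range:

```python
def pick_anycast_pop(routes: dict[str, list[str]]) -> str:
    """Simulate BGP best-path selection for one anycast prefix.
    routes maps PoP name -> AS path; the shortest surviving path wins."""
    return min(routes, key=lambda pop: len(routes[pop]))

routes = {
    "tokyo":     ["64500", "64501"],
    "singapore": ["64500", "64502", "64503"],
}
print(pick_anycast_pop(routes))  # tokyo: shorter AS path
del routes["tokyo"]              # simulate a route withdrawal on failure
print(pick_anycast_pop(routes))  # singapore: failover with no DNS change
```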

Modern CDNs typically combine both approaches. DNS routing handles initial assignment, while anycast provides resilience within regions. Some CDNs add a real-time intelligence layer, measuring actual latency between edge servers and user networks through continuous probing. If congestion develops on one path, routing shifts dynamically. The system maintains a constantly updated map of network conditions, making routing decisions that account for current reality rather than static assumptions.
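
A sketch of that selection step, assuming a separate probe process keeps per-PoP latency and load estimates fresh; the 100 ms load penalty is an arbitrary illustrative weight:

```python
from dataclasses import dataclass

@dataclass
class PopHealth:
    probe_rtt_ms: float   # continuously refreshed by active probing
    load: float           # 0.0 (idle) .. 1.0 (saturated)
    healthy: bool

def choose_pop(candidates: dict[str, PopHealth]) -> str:
    """Pick the PoP with the best current conditions, not merely the nearest.
    Raises ValueError if no PoP is healthy (a real system would page someone)."""
    usable = {name: h for name, h in candidates.items() if h.healthy}
    return min(usable, key=lambda n: usable[n].probe_rtt_ms + 100 * usable[n].load)
```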

Takeaway

Request routing is a continuous optimization problem where proximity is necessary but not sufficient—true performance requires real-time awareness of network conditions, server health, and capacity headroom.

Origin Shield Protection: The Consolidation Layer

Without protection, origin servers face a paradoxical threat from their own CDN. Consider what happens during a cache purge or when content becomes suddenly popular. Hundreds of edge servers simultaneously experience cache misses and independently request the same content from origin. This thundering herd problem can overwhelm origin infrastructure precisely when traffic spikes.

Origin shield architecture interposes a consolidation layer between edge caches and the origin server. Instead of edge servers contacting origin directly, they route through designated shield nodes. When multiple edge servers simultaneously miss on the same content, the shield receives all those requests—but makes only one request to origin. The other requests wait, and all receive the same response once it returns.
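
This coalescing is the single-flight pattern. A minimal thread-based sketch, with `origin_fetch` standing in for the real upstream request and no result expiry (a real shield would cache and age these entries):

```python
import threading

class OriginShield:
    """Coalesce concurrent misses for the same key into one origin request."""

    def __init__(self, origin_fetch):
        self._origin_fetch = origin_fetch
        self._lock = threading.Lock()
        self._inflight: dict[str, threading.Event] = {}
        self._results: dict[str, bytes] = {}

    def fetch(self, key: str) -> bytes:
        with self._lock:
            event = self._inflight.get(key)
            leader = event is None
            if leader:                        # first miss: this thread goes to origin
                event = threading.Event()
                self._inflight[key] = event

        if leader:
            self._results[key] = self._origin_fetch(key)  # the only origin request
            with self._lock:
                del self._inflight[key]
            event.set()                       # wake every waiting edge request
        else:
            event.wait()                      # followers block until the leader returns
        return self._results[key]
```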

This request coalescing provides dramatic origin load reduction. A cache purge affecting 500 edge servers no longer generates 500 origin requests. Flash traffic events that would trigger thousands of concurrent origin fetches get consolidated to a manageable volume. The origin sees smooth, predictable traffic patterns regardless of edge cache state.

Shield nodes also provide failover benefits. If an origin server becomes unreachable, shield caches can continue serving stale content while origin recovers—subject to configuration policies. Some CDNs allow shield nodes to serve content past its TTL during origin outages, trading freshness for availability. The shield layer transforms a potentially fragile origin into a more resilient system.
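
In sketch form, that policy looks like the following, where the stale window mirrors the stale-if-error extension from RFC 5861. The entry layout (a value plus its expiry timestamp) and the error type are assumptions for illustration:

```python
import time

def shield_lookup(entry, origin_fetch, stale_if_error_s: float = 3600):
    """Serve fresh content normally; on origin failure, fall back to stale
    content for up to stale_if_error_s past expiry (cf. RFC 5861).
    `entry` is assumed to be a (value, expires_at) tuple, or None."""
    now = time.time()
    if entry is not None:
        value, expires_at = entry
        if now < expires_at:
            return value                      # still fresh: no origin contact
    try:
        return origin_fetch()                 # revalidate or fetch from origin
    except ConnectionError:                   # stand-in for "origin unreachable"
        if entry is not None and now < expires_at + stale_if_error_s:
            return value                      # trade freshness for availability
        raise                                 # nothing servable: propagate the failure
```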

Takeaway

Origin protection isn't an optional feature—it's what allows modest origin infrastructure to serve content to millions of users without being crushed by the traffic patterns their own popularity creates.

CDN architecture embodies a core principle of distributed systems: you can't make the speed of light faster, so you move the data closer. Every layer—edge caches, mid-tier aggregators, origin shields—exists to ensure requests travel the minimum distance necessary.

The engineering decisions cascade. Cache hierarchy determines hit ratios. Request routing determines which cache gets queried. Origin protection determines whether sudden traffic changes cause outages or get absorbed gracefully.

These systems turn the internet from a client-server model into something more sophisticated: a content-aware network where popular data migrates toward demand automatically. Understanding CDN architecture means understanding how modern infrastructure handles global-scale distribution while appearing instantaneous to users.