Every time you load a webpage, stream video, or send an email, your data competes with billions of other flows for limited network bandwidth. Without coordination, this competition would collapse the internet within seconds—routers overwhelmed, packets dropped en masse, throughput plummeting to near zero.

TCP congestion control solves this distributed coordination problem without any central authority. Each sender independently adjusts its transmission rate based on signals from the network, and somehow millions of competing flows converge on fair bandwidth allocation. This emergent behavior isn't accidental—it's the result of carefully engineered algorithms that balance aggressive throughput seeking against conservative collapse avoidance.

Understanding congestion control reveals the fundamental tension in network engineering: every sender wants maximum bandwidth, but collective greed destroys the shared resource. The algorithms that navigate this tension have evolved through decades of real-world deployment, each generation solving problems its predecessors couldn't anticipate.

Window-Based Flow Control: The Throttle That Adapts

TCP controls transmission rate through a congestion window—the amount of unacknowledged data a sender can have in flight at any moment. Unlike a fixed rate limiter, this window expands and contracts based on network feedback, creating a self-adjusting throttle that responds to changing conditions.

The mechanism works through a feedback loop. Each round trip of successful acknowledgments grows the window, allowing more packets in flight. When packets are lost, the window shrinks sharply. This creates additive increase, multiplicative decrease (AIMD) behavior: slow, linear growth during good conditions, aggressive cuts during congestion.
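The core AIMD rule fits in a few lines. This is a minimal illustrative sketch, not a real TCP stack: windows are counted in segments rather than bytes, and slow start below ssthresh is omitted.

```python
def aimd_update(cwnd: float, loss_detected: bool) -> float:
    """One round trip's worth of AIMD congestion-window adjustment."""
    if loss_detected:
        return max(1.0, cwnd / 2)   # multiplicative decrease: halve the window
    return cwnd + 1.0               # additive increase: +1 segment per RTT

cwnd = 10.0
cwnd = aimd_update(cwnd, loss_detected=False)  # grows to 11.0
cwnd = aimd_update(cwnd, loss_detected=True)   # halves to 5.5
```

The asymmetry is the point: gains accumulate one segment per round trip, while a single loss erases half the window at once.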

The sliding window also prevents buffer overflow at intermediate routers. Each router has finite queue space—when queues fill, packets drop. By limiting outstanding data to the congestion window size, TCP prevents any single flow from overwhelming network buffers. The window essentially represents the sender's current estimate of available network capacity.

This distributed approach means no router needs to track individual flows or enforce fairness policies. Each TCP sender independently probes for available bandwidth, backs off when it detects congestion, and gradually increases again. The mathematical properties of AIMD guarantee that competing flows converge toward equal bandwidth shares over time, achieving fairness without explicit coordination.
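The convergence claim can be seen in a toy two-flow simulation. All parameters here are hypothetical, and losses are assumed synchronized (both flows halve together) to keep the sketch short:

```python
def simulate(w1: float, w2: float, capacity: float, rtts: int):
    """Two AIMD flows sharing a link of `capacity` segments per RTT."""
    for _ in range(rtts):
        if w1 + w2 > capacity:
            w1, w2 = w1 / 2, w2 / 2   # multiplicative decrease for both
        else:
            w1, w2 = w1 + 1, w2 + 1   # additive increase for both
    return w1, w2

# Start wildly unequal; AIMD drags the shares together over time.
w1, w2 = simulate(40.0, 2.0, capacity=50.0, rtts=200)
```

Additive increase preserves the gap between the two windows, but every multiplicative decrease halves it, so the gap shrinks geometrically while both flows keep probing upward.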

Takeaway

TCP's congestion window acts as a distributed admission control system—by limiting outstanding data based on network feedback, millions of independent senders collectively avoid overwhelming shared infrastructure without any central coordinator.

Loss Detection: Reading Network Signals

TCP must detect congestion before it can respond, but networks don't send explicit "slow down" messages. Instead, TCP infers congestion from two signals: timeouts when acknowledgments don't arrive within expected intervals, and duplicate acknowledgments when receivers signal gaps in received data.

Timeout-based detection is the fallback mechanism. TCP maintains a retransmission timer based on measured round-trip times. When this timer expires without an acknowledgment, TCP assumes severe congestion—the packet or its acknowledgment was lost somewhere in transit. The response is aggressive: reset the congestion window to minimum and begin slow start from scratch.
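The retransmission timer is derived from smoothed RTT statistics. This sketch follows the standard estimator specified in RFC 6298 (the Jacobson/Karels algorithm), with its usual gains and 1-second floor:

```python
ALPHA, BETA = 1 / 8, 1 / 4  # RFC 6298 smoothing gains

def update_rto(srtt, rttvar, sample):
    """Fold one RTT measurement (seconds) into the estimator; return new state."""
    if srtt is None:                       # first measurement
        srtt, rttvar = sample, sample / 2
    else:
        rttvar = (1 - BETA) * rttvar + BETA * abs(srtt - sample)
        srtt = (1 - ALPHA) * srtt + ALPHA * sample
    rto = max(1.0, srtt + 4 * rttvar)      # RFC 6298 minimum of 1 second
    return srtt, rttvar, rto

srtt, rttvar, rto = update_rto(None, None, 0.100)
# first sample: srtt=0.100, rttvar=0.050, rto=max(1.0, 0.3)=1.0
```

The `4 * rttvar` term is why jittery paths get generous timeouts: the timer tracks variance, not just the mean, so a late ACK on a noisy link isn't mistaken for congestion.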

Duplicate acknowledgment detection enables much faster response. When a receiver gets an out-of-order packet, it immediately re-sends its last acknowledgment. Three duplicate ACKs indicate a single packet loss rather than severe congestion—the network is still delivering subsequent packets, just with a gap. This triggers fast retransmit: resend the missing packet immediately without waiting for timeout.

Fast retransmit pairs with fast recovery to maintain throughput during isolated losses. Instead of collapsing the window to minimum, TCP halves the window and continues transmission. This distinction matters enormously: timeout-triggered recovery can reduce throughput by 90%, while fast recovery might only cost 50%. On lossy wireless links, this improvement transformed TCP from nearly unusable to reasonably performant.
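The two loss responses can be sketched as a small state machine. This is illustrative only (segments rather than bytes, state transitions without actual retransmission); the class and field names are hypothetical:

```python
DUP_ACK_THRESHOLD = 3  # third duplicate ACK triggers fast retransmit

class Sender:
    def __init__(self, cwnd=16.0):
        self.cwnd = cwnd
        self.ssthresh = 64.0
        self.dup_acks = 0

    def on_dup_ack(self):
        """Duplicate ACK: isolated loss suspected; halve and keep sending."""
        self.dup_acks += 1
        if self.dup_acks == DUP_ACK_THRESHOLD:
            self.ssthresh = max(2.0, self.cwnd / 2)
            self.cwnd = self.ssthresh       # fast recovery: continue at half rate

    def on_timeout(self):
        """Timeout: severe congestion assumed; restart from slow start."""
        self.ssthresh = max(2.0, self.cwnd / 2)
        self.cwnd = 1.0                     # collapse to minimum window
        self.dup_acks = 0
```

Both paths cut ssthresh in half; the difference is where the window restarts: fast recovery resumes at half the old window, while a timeout drops all the way to one segment and rebuilds through slow start.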

Takeaway

The shift from timeout-only detection to duplicate-ACK fast retransmit represents a fundamental insight: distinguishing isolated packet loss from systemic congestion enables proportionate responses that maintain throughput while still protecting the network.

Algorithm Evolution: From Tahoe to BBR

TCP Tahoe, released in 1988, established the foundational mechanisms: slow start for initial bandwidth probing, congestion avoidance for steady-state operation, and loss detection via timeouts and duplicate-ACK fast retransmit. But Tahoe treated every loss as catastrophic, resetting to slow start regardless of cause. On networks with occasional random loss, performance suffered dramatically.

TCP Reno added fast recovery in 1990 (fast retransmit already existed in Tahoe, which still fell back to slow start after it), distinguishing between isolated losses and severe congestion. This change substantially improved throughput on real-world networks. Reno remained the dominant algorithm for over a decade, but struggled with high-bandwidth, high-latency networks where its conservative window growth couldn't fill available capacity.

CUBIC, developed in 2006 and now Linux's default, replaced Reno's linear window growth with a cubic function of the time since the last loss. The window grows steeply at first, flattens as it approaches the previous loss point, then accelerates again to probe for additional capacity. Because growth depends on elapsed time rather than round trips, CUBIC can efficiently utilize the high-capacity, long-distance links that Reno couldn't fill.
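CUBIC's growth curve is given explicitly in RFC 8312: W(t) = C(t - K)³ + W_max, where K is the time needed to climb back to W_max after a loss. A direct transcription, using the RFC's constants:

```python
C, BETA = 0.4, 0.7  # RFC 8312 constants: scaling factor and decrease factor

def cubic_window(t: float, w_max: float) -> float:
    """Target window t seconds after a loss that occurred at window w_max."""
    k = ((w_max * (1 - BETA)) / C) ** (1 / 3)  # time to return to w_max
    return C * (t - k) ** 3 + w_max

# At t=0 the window is BETA * w_max (the post-loss reduction); at t=k it
# has plateaued back at w_max; beyond k the convex tail probes upward.
```

The plateau around W_max is what makes CUBIC gentle near the capacity it last measured, while the steep tails let it recover and probe quickly far from that point.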

Google's BBR, introduced in 2016, represents a philosophical shift. Rather than using loss as the primary congestion signal, BBR explicitly measures delivery bandwidth and round-trip time, then paces sending at the estimated bottleneck rate. This approach avoids the bufferbloat problem, where loss-based algorithms fill router queues and add latency. BBR maintains high throughput while keeping queues short—a significant improvement for latency-sensitive applications.
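BBR's model-based approach can be sketched as follows. This is a heavily simplified illustration, not BBR itself: real BBR uses windowed max/min filters and cycles its gains, and the function and variable names here are hypothetical. The gain values (pacing gain 1 at steady state, cwnd gain 2) follow the published BBR design.

```python
def bbr_estimates(samples):
    """samples: list of (delivery_rate_bytes_per_s, rtt_s) measurements."""
    btl_bw = max(rate for rate, _ in samples)   # bottleneck bandwidth estimate
    rt_prop = min(rtt for _, rtt in samples)    # propagation RTT estimate
    bdp = btl_bw * rt_prop                      # bandwidth-delay product: bytes in flight
    pacing_rate = 1.0 * btl_bw                  # steady-state pacing gain of 1
    cwnd = 2.0 * bdp                            # cwnd gain of 2: headroom for delayed ACKs
    return bdp, pacing_rate, cwnd

samples = [(1.2e6, 0.045), (1.0e6, 0.040), (1.1e6, 0.050)]
bdp, pacing, cwnd = bbr_estimates(samples)  # bdp ~= 48000 bytes
```

The key idea: max delivery rate and min RTT are measured at different moments, because a full queue inflates RTT exactly when bandwidth is fully utilized. Sending at exactly the bottleneck rate with roughly one BDP in flight keeps the pipe full without building a standing queue.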

Takeaway

Each TCP congestion control algorithm emerged from specific network problems its predecessors couldn't solve—understanding this evolution reveals that "best" depends entirely on your network characteristics, whether you're optimizing for throughput, latency, or fairness.

TCP congestion control demonstrates that complex global coordination can emerge from simple local rules. Each sender follows the same algorithm, responds to the same signals, and the network self-organizes into fair bandwidth allocation without central planning.

The engineering insight runs deeper than networking. Stable distributed systems require feedback mechanisms that punish aggressive behavior more than they reward it. AIMD's multiplicative decrease ensures that overshooting costs more than undershooting gains, creating convergent rather than divergent dynamics.

Modern algorithms continue evolving as network characteristics change—but the fundamental principles remain. Whether you're designing congestion control, distributed databases, or any shared-resource system, TCP's forty-year evolution offers proven patterns for engineering fairness under pressure.