I've tuned TCP buffers on probably two dozen servers at this point, and every time I do it I have to re-derive the same math because I can never remember where I wrote it down. This is that writeup.

The BDP formula

Optimal buffer size = Bandwidth × RTT. This is the bandwidth-delay product (BDP). For a 1 Gbps link with 20ms RTT:

BDP = 1,000,000,000 bits/s × 0.020 s = 20,000,000 bits = 2.5 MB
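
A quick way to redo that arithmetic for other links is plain shell arithmetic; divide by 8 to go from bits to bytes:

# 1 Gbps in bits/s, divided by 8 for bytes, times a 20ms RTT (20/1000 s)
echo $(( 1000000000 / 8 * 20 / 1000 ))   # 2500000 bytes = 2.5 MB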

The default Linux rmem_max is 212992 bytes, about 208 KB. On a 1 Gbps link with 20ms RTT that covers roughly 8% of the BDP, so you're leaving 92% of your pipe empty. For a 10 Gbps link the BDP is 25 MB, and the default covers less than 1% of it.
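
Before changing anything, it's worth seeing what the box is actually running with. sysctl reads the relevant keys directly:

# Current limits, in bytes (tcp_rmem/tcp_wmem print as min default max)
sysctl net.core.rmem_max net.core.wmem_max
sysctl net.ipv4.tcp_rmem net.ipv4.tcp_wmem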

But bigger isn't always better

The catch is memory. Each socket's buffer lives in kernel memory. If you have 100,000 concurrent connections and each socket has a 4 MB buffer, that's 400 GB of potential buffer space. The kernel only allocates pages as data actually queues in a socket, but the worst case still matters: total TCP memory is capped by net.ipv4.tcp_mem, and a box that hits that ceiling throttles every connection on it.
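
The global ceiling is easy to check. net.ipv4.tcp_mem holds three thresholds (low, pressure, max) measured in pages rather than bytes, so multiply by the page size to get real numbers:

# Global TCP memory thresholds, in pages
sysctl net.ipv4.tcp_mem
# Page size in bytes (typically 4096)
getconf PAGESIZE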

For a proxy server with many short-lived connections, small buffers (128–256 KB) are often better — they reduce memory pressure and let more connections coexist. I use large buffers (2–8 MB) only on servers doing bulk transfers or streaming.
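
When I'm deciding which profile a machine falls into, I look at what its sockets are actually using. ss can dump per-socket memory; the skmem field shows how much each connection has allocated against its receive and send buffer limits:

# Per-socket buffer usage for established TCP connections
ss -tm state established | head -40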

The settings I actually use

# /etc/sysctl.d/tcp.conf
net.core.rmem_max = 67108864
net.core.wmem_max = 67108864
net.ipv4.tcp_rmem = 4096 262144 67108864
net.ipv4.tcp_wmem = 4096 262144 67108864
net.ipv4.tcp_congestion_control = bbr
net.core.default_qdisc = fq

The tcp_rmem and tcp_wmem values are min/default/max. The kernel auto-tunes within this range. Setting max to 64 MB gives room on high-BDP paths without committing that memory everywhere.
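
Dropping the file into /etc/sysctl.d makes it persist across reboots; to apply it immediately:

# Reload everything under /etc/sysctl.d (and the other standard locations)
sudo sysctl --system
# Or load just this one file
sudo sysctl -p /etc/sysctl.d/tcp.conf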

BBR is the other thing worth enabling. CUBIC treats every loss as a congestion signal and backs off, so on paths with queuing or even modest loss, BBR, which models the path's bandwidth and RTT instead, typically outperforms it by a wide margin. The fq qdisc supplies the packet pacing BBR relies on: older kernels need it for pacing to work at all, and even on newer kernels that can pace from the TCP layer it remains the recommended pairing.
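
Two quick checks after switching: make sure the kernel actually offers BBR (it's built as a module on most distros), and make sure live connections are using it:

# Should list bbr; if it doesn't, try: sudo modprobe tcp_bbr
sysctl net.ipv4.tcp_available_congestion_control
# Established sockets should report bbr and a pacing rate
ss -ti state established | grep -m 5 bbr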