Networking 101: TCP In More Depth
Understanding TCP Connections and Flow Control

Charlie Schluting
Wednesday, September 24, 2008 06:29:50 PM
Last week's introduction to
TCP promised that this article would enlighten, entertain and obviate all other
documentation. Well, the last one isn't quite possible in this much space, but let's go
ahead and take a look at TCP operational issues, now that we know a little about what TCP
actually is.
We said that TCP gets "connected" before any data can be sent. To make that work, the
side that initiates a TCP connection will send a SYN (remember the Flags field) packet
first. This is simply a packet with no data, and the SYN flag turned on. If the other
side wants to talk on the port it received the SYN on, it will send back a SYN+ACK: SYN
and ACK flags set, and the ACK number set to acknowledge the first packet. Then, to
confirm receipt of the SYN+ACK, the initiator sends one final ACK. The SYN, SYN+ACK,
ACK sequence is called the three-way handshake. After that happens, the connection is
established. The connection will remain active unless it times out or until either side
sends a FIN.
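
To make the numbering in the handshake concrete, here is a minimal sketch that models the three segments as plain tuples. It is not a real TCP stack: the `Segment` type, the `three_way_handshake` function and the initial sequence numbers 1000 and 5000 are all made up for illustration. The one protocol fact it relies on is that a SYN consumes one sequence number, so each side acknowledges the other's initial sequence number plus one.

```python
from typing import List, NamedTuple, Optional

SEQ_MODULUS = 2 ** 32  # sequence numbers are 32-bit values and wrap around


class Segment(NamedTuple):
    """Hypothetical, stripped-down view of a TCP segment (illustration only)."""
    sender: str
    flags: str
    seq: int
    ack: Optional[int]  # None when the ACK flag is not set


def three_way_handshake(client_isn: int, server_isn: int) -> List[Segment]:
    """Return the SYN, SYN+ACK, ACK exchange for the given initial sequence numbers."""
    # 1. The initiator sends a SYN carrying its initial sequence number (ISN).
    syn = Segment("client", "SYN", client_isn, None)
    # 2. The listener answers with its own ISN and acknowledges the SYN.
    syn_ack = Segment("server", "SYN+ACK", server_isn, (syn.seq + 1) % SEQ_MODULUS)
    # 3. The initiator acknowledges the server's SYN; the connection is established.
    ack = Segment("client", "ACK", (client_isn + 1) % SEQ_MODULUS,
                  (syn_ack.seq + 1) % SEQ_MODULUS)
    return [syn, syn_ack, ack]


for segment in three_way_handshake(client_isn=1000, server_isn=5000):
    print(segment)
```

In a real program you never build these packets yourself; the operating system performs the whole exchange when an application calls `connect()`.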
Closing a TCP connection can be done from either side, and requires that both sides
send a FIN to close their channel of communication. One side can close before the other,
or they can both close at the same time. So, when one side sends a FIN, the other side
ACKs it and, when it is ready to close its own side, sends a FIN of its own (often
combined as a single FIN+ACK). The side that sent the first FIN then ACKs the second
FIN, and when that ACK arrives the other side knows that the connection is closed.
There is no way for the side that sent the first FIN to get an ACK back for that last
ACK. You might want to reread that now. The side that initially closed the connection
therefore enters the TIME_WAIT state, in case the other side didn't really get the ACK
and thinks the connection is still open. Typically, this lasts one to two minutes.
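
The idea that each direction is closed separately is easier to see from the sockets API. The sketch below is an illustration under stated assumptions, not production code: it runs both ends of a connection on the loopback address in one process, and the helper names `read_until_fin` and `half_close_demo` are invented for this example. Calling `shutdown(SHUT_WR)` makes the operating system send a FIN for one direction while the other direction stays usable.

```python
import socket


def read_until_fin(sock: socket.socket) -> bytes:
    """Read until recv() returns b'', which is how the peer's FIN shows up."""
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            return b"".join(chunks)
        chunks.append(chunk)


def half_close_demo():
    """Open a loopback TCP connection, half-close it, then close the other half."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.settimeout(5)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.settimeout(5)
    client.connect(listener.getsockname())  # the kernel does the handshake here
    server, _ = listener.accept()
    server.settimeout(5)
    try:
        client.sendall(b"request")
        client.shutdown(socket.SHUT_WR)    # client's FIN: "I have nothing more to send"
        request = read_until_fin(server)   # server sees end-of-stream
        server.sendall(b"response")        # server-to-client direction is still open
        server.close()                     # server's FIN closes the second direction
        response = read_until_fin(client)
        return request, response
    finally:
        client.close()
        server.close()                     # closing an already closed socket is harmless
        listener.close()


print(half_close_demo())
```

Here the client is the side that sends the first FIN, so on a real system it is the client's end of the connection that lingers in TIME_WAIT afterwards.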
And we've come to our first problem. If someone, say an attacker, leaves half-open or
half-closed connections on your Web server, this could be bad news. Memory is used up
with each connection, and opening thousands of bogus TCP connections could bring a server
to its knees. Of course, you can't really adjust the TCP timers without affecting the
proper operation of TCP. If you've ever heard of a TCP SYN attack, this is what it means.
To limit the damage, most operating systems cap the number of half-open connections.
In Linux, for example, the default has commonly been 256, though it varies between systems.
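
If you want to check the limit on your own machine, a small sketch like the following will do. It assumes a Linux system that exposes the setting at `/proc/sys/net/ipv4/tcp_max_syn_backlog`; the function name `read_syn_backlog` is made up for this example, and on other systems it simply returns `None`.

```python
from pathlib import Path
from typing import Optional

# Linux-specific location of the half-open (SYN) backlog limit.
SYN_BACKLOG_PATH = Path("/proc/sys/net/ipv4/tcp_max_syn_backlog")


def read_syn_backlog(path: Path = SYN_BACKLOG_PATH) -> Optional[int]:
    """Return the configured SYN backlog limit, or None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        # Not Linux, no permission, or unexpected file contents.
        return None


limit = read_syn_backlog()
if limit is None:
    print("SYN backlog limit not available on this system")
else:
    print(f"Half-open connection limit: {limit}")
```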
Now, since we promised to talk about the everlasting flow control problems, let's get
into windowing. TCP uses "positive ACK with retransmission" to guarantee reliability. The
sender will wait a certain amount of time, and if it doesn't get back an ACK for the
packet it sent, it retransmits it. There are a bazillion timers in TCP, by the way; this
is just another one. The concept of ACKs is important to flow control, because the TCP
sliding window protocol makes the ping-pong nature of ACKs efficient. If TCP were to send
a packet and wait for every ACK, throughput would be limited to one packet per round trip.
Ideally, we can send many packets at once, and then get back an ACK for all of them,
probably piggybacked on more data from the other side. But how do we know how much to
send? Well, the TCP window size controls how much data can be held in the "sent but
not ACKed" state. If the window is large, we can send many packets without
waiting for an ACK. On the surface, this doesn't look like flow control, but it certainly
is.
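
A bit of arithmetic shows why the window matters so much. The sketch below uses a deliberately simplified model that is an assumption of this example, not how a real stack behaves: the sender transmits a full window, waits one round trip for the ACK, and repeats, with transmission time and packet loss ignored. The function names are invented for illustration.

```python
import math


def round_trips_needed(total_packets: int, window_packets: int) -> int:
    """Round trips needed when one full window is sent per round trip."""
    if window_packets < 1:
        raise ValueError("a zero window means nothing can be sent")
    return math.ceil(total_packets / window_packets)


def transfer_time(total_packets: int, window_packets: int, rtt_seconds: float) -> float:
    """Approximate transfer time under the simplified model above."""
    return round_trips_needed(total_packets, window_packets) * rtt_seconds


# Compare "send one, wait for the ACK" with progressively larger windows.
for window in (1, 10, 50):
    trips = round_trips_needed(100, window)
    seconds = transfer_time(100, window, rtt_seconds=0.05)
    print(f"window={window:>2} packets: {trips:>3} round trips, about {seconds:.2f} s")
```

With a window of one packet, the time is dominated by waiting. Every increase in the window removes round trips, which is exactly the efficiency the sliding window buys.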
The receiving side is the one that controls the window size. If it says zero, then the
sender cannot send any more data at all. If the window only has room for one segment,
then we're back to the simple "send and wait for ACK" protocol. If the last window size was zero, the sender
will send a probe to figure out when the window is open again. If the sender never gets
an ACK, it just keeps trying until, you guessed it, a timer expires. Remember, the window
size is a 16-bit field in the TCP header. And if you want a window size (in bytes) larger
than 16 bits will accommodate, there's also a TCP option, "window scale", that allows it to
be multiplied by a power-of-two scale factor. Without a very large window size, TCP has
little hope of filling a gigabit link unless the round-trip time is tiny. You should now
be better prepared to understand the gigabit tuning article, too.
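
To put numbers on the gigabit claim, here is a rough sketch. The 10 ms round-trip time is an assumed example value, and the function names are made up for illustration. The calculation uses the rule that a sender can have at most one window of data in flight per round trip, plus the window scale option's limit of 14 bit shifts from the TCP extensions RFC.

```python
MAX_UNSCALED_WINDOW = 65535   # largest value a 16-bit window field can hold
MAX_WINDOW_SHIFT = 14         # largest shift the window scale option allows


def max_throughput_bps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on throughput: one window of data per round trip."""
    return window_bytes * 8 / rtt_seconds


def bandwidth_delay_product(link_bps: float, rtt_seconds: float) -> int:
    """Bytes that must be in flight to keep the link full."""
    return round(link_bps * rtt_seconds / 8)


def window_shift_needed(target_window_bytes: int) -> int:
    """Smallest window scale shift whose maximum window covers the target."""
    for shift in range(MAX_WINDOW_SHIFT + 1):
        if (MAX_UNSCALED_WINDOW << shift) >= target_window_bytes:
            return shift
    raise ValueError("target exceeds the largest window TCP can advertise")


rtt = 0.010                 # assumed 10 ms round trip
gigabit = 1_000_000_000     # 1 Gbit/s

unscaled_mbps = max_throughput_bps(MAX_UNSCALED_WINDOW, rtt) / 1e6
needed = bandwidth_delay_product(gigabit, rtt)
print(f"Unscaled window tops out near {unscaled_mbps:.1f} Mbit/s")
print(f"Filling the link needs about {needed} bytes in flight")
print(f"Window scale shift required: {window_shift_needed(needed)}")
```

Under these assumptions the plain 16-bit window manages only a small fraction of a gigabit, which is why the window scale option exists.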
On the subject of TCP flow control, we can't neglect mention of the Nagle algorithm.
What would happen if you had a large TCP window over a telnet connection? You'd type a
command, then wait and wait and wait for a response. This is a major problem for
real-time applications. Furthermore, telnet can add to congestion, since a 1-byte packet
will include 40 bytes of header (20 for IP and 20 for TCP). RFC 896 defines the Nagle
algorithm to attempt to abolish tiny packets. The idea is that we should give data a
chance to pile up before sending, to be more efficient. It says that a connection can
have only one unacknowledged small segment outstanding; further small writes are held
back until that ACK arrives or enough data piles up to fill a full-sized segment. Telnet
and interactive SSH connections turn this off with the TCP_NODELAY socket option, so that
you can get an immediate response when you press a key.
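
Turning Nagle off is a one-line socket option. The following minimal sketch uses Python's standard `socket` module; the helper name `make_low_latency_socket` is invented for this example, and a real application would go on to connect the socket and handle errors.

```python
import socket


def make_low_latency_socket() -> socket.socket:
    """Create a TCP socket with the Nagle algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY = 1 tells the stack to send small segments immediately
    # instead of waiting for the previous one to be acknowledged.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


sock = make_low_latency_socket()
nodelay = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
print("Nagle disabled:", nodelay != 0)
sock.close()
```

This trades efficiency for responsiveness, so it suits interactive traffic far better than bulk transfers.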
Of course, we've neglected so many things about TCP. With the understanding from these
two articles, however, you should be prepared to understand other literature that assumes
you know TCP already. Congestion control, which is different from flow control, wasn't
covered here. You may want to read the TCP RFC (RFC 793) if you're truly interested in knowing how
it all works, in excruciating detail.
In a Nutshell
- TCP is very intelligent about flow control, which makes it extremely versatile and
useful for many applications.
- Flow control in TCP means "how much can I send without waiting for an ACK?" This is
the TCP window.
- Learning about congestion control is left as an exercise for the reader. Note that TCP
starts slow, then speeds up. This isn't always optimal.
When he's not writing for Enterprise Networking Planet or riding his motorcycle, Charlie Schluting is the Associate Director of Computing Infrastructure at Portland State University. Charlie also operates OmniTraining.net, and recently finished Network Ninja, a must-read for every network engineer.
Article courtesy of Enterprise Networking Planet