Documente Academic
Documente Profesional
Documente Cultură
1. Introduction
At the Transport Layer (equivalent to Layer 4 in the OSI model), two protocols exist:
TCP (Transfer Control Protocol) - breaks information into datagrams and sends them, carrying out resends, if required, and reassembles received datagrams, it gives 'reliable' delivery, a connection-oriented service between applications.
UDP (User Datagram Protocol) - does the same as TCP but it does not carry out any checking or resending of datagrams, so it is described as 'unreliable', a connectionless service (See UDP).
2. TCP Header
TCP allows clients to run concurrent applications using different port numbers and at full-duplex thereby giving a multiplexing ability. TCP labels each octet of data with a Sequence Number and a series of octets form a Segment, the sequence number of the first octet in the segment is called the Segment Sequence Number. TCP provides reliability with ACK packets and Flow Control using the technique of a Sliding Window. During the setup of a TCP connection the maximum segment size is determined based on the lowest MTU across the network. The TCP header looks like this:
y y
Source and Destination ports - this identifies the upper layer applications using the connection. Sequence Number - this 32-bit number ensures that data is correctly sequenced. Each byte of data is assigned a sequence number. The first byte of data by a station in a particular TCP header will have its sequence number in this field, say 58000. If this packet has 700 bytes of data in it then the next packet sent by this station will have the sequence number of 58000 + 700 + 1 = 58701.
Acknowledgment Number - this 32-bit number indicates the next sequence number that the sending device is expecting from the other station.
y y y
HLEN - gives the number of 32 bit words in the header. Sometimes called the Data Offset field. Reserved - always set to 0. Code bits - these are flags that indicate the nature of the header. They are: o o o URG - Urgent Pointer ACK - Acknowledgement PSH - Push function, causes the TCP sender to push all unsent data to the receiver rather than sends segments when it gets around to them i.e. when the buffer is full. o o o RST - Reset the connection SYN - Synchronise sequence numbers FIN - End of data
Window - indicates the range of acceptable sequence numbers beyond the last segment that was successfully received. It is the allowed number of octets that the sender of the ACK is willing to accept before an acknowledgement.
Urgent Pointer - shows the end of the urgent data so that interrupted data streams can continue. When the URG bit is set, the data is given priority over other data streams.
Option - mainly only the TCP Maximum Segment Size (MSS) sometimes called Maximum Window Size or Send Maximum Segment Size (SMSS). A segment is a series of data bytes within a TCP header.
3. Port Numbers
Applications open Port numbers, used by TCP and UDP to keep tabs of different communications occurring around the network. Generally, port numbers below 255 were originally for public applications (Assigned Internet protocol numbers); 255 is reserved. Port numbers 256 to 1023 are for saleable applications by various manufacturers and are considered as 'Privileged', 'Well-Known' or Extended Assigned port numbers. Port numbers above 1024 (1024 is reserved) are not regulated, are considered as Unprivileged, or Registered, and these ports are commonly free to be used used by clients talking to Well-Known port numbers. Applications open port numbers (the TCP/IP model differs from the OSI model in that the Application layer sits straight on top of layer 4) and communicate to each other via these port numbers. A telnet server with IP address 10.1.1.1 uses port number 23, however if two clients operating from IP address 10.1.1.2 attach themselves to the server the server needs to distinguish between the two conversations. This is achieved by the clients randomly picking two port numbers above 1023, say 1024 and 1025. The client connection is referenced as a Socket and is defined as the IP address plus the port number, e.g. 10.1.1.1.TCP.1025 and 10.1.1.1.TCP.1026. The server socket is 10.1.1.1.TCP.23. This is how TCP multiplexes different connections. The following table lists some commonly used port numbers:
TCP Application Port Number FTP Telnet SMTP HTTP UDP DNS Bootp TFTP NTP SNMP 53 67/68 69 123 161 20 (Data), 21 (Control, or Program) 23 25 80
RFC 793 (TCP) and RFC 1323 (TCP Extensions) describe TCP in detail whilst both RFC 1500 and RFC 1700 define the Well-Known port numbers for both TCP and UDP. A good list can also be found at http://www.iana.org/assignments/port-numbers.
4. Sequence Numbers
Each octet has its own sequence number so that each one can be acknowledged if necessary. In practice octets are acknowledged in batches, the size of which is determined by the window size (see below). The sequence number is a 32-bit binary number, although very large there is a finite number range that is used (0 to 232-1), whereby it cycles back to zero. In order to track the sequence numbers required for checking, the arithmetic has to be performed as modulo 232.
5. TCP Operation
5.1 Three-way Handshake
If a source host wishes to use an IP application such as active FTP for instance, it selects a port number which is greater than 1023 and connects to the destination station on port 21. The TCP connection is set up via three-way handshaking:
This begins with a SYN (Synchronise) segment (as indicated by the code bit) containing a 32bit Sequence number A called the Initial Send Sequence (ISS) being chosen by, and sent from, host 1. This 32-bit sequence number A is the starting sequence number of the data in that packet and increments by 1 for every byte of data sent within the segment, i.e. there is a sequence number for each octet sent. The SYN segment also puts the value A+1 in the first octet of the data.
Host 2 receives the SYN with the Sequence number A and sends a SYN segment with its own totally independent ISS number B in the Sequence number field. In addition, it sends an increment on the Sequence number of the last received segment (i.e. A+x where x is the number of octets that make up the data in this segment) in its Acknowledgment field. This Acknowledgment number informs the recipient that its data was received at the other end and it expects the next segment of data bytes to be sent, to start at sequence number A+x. This stage is aften called the SYN-ACK. It is here that the MSS is agreed.
Host 1 receives this SYN-ACK segment and sends an ACK segment containing the next sequence number (B+y where y is the number of octets in this particular segment), this is
called Forward Acknowledgement and is received by Host 2. The ACK segment is identified by the fact that the ACK field is set. Segments that are not acknowledged within a certain time span, are retransmitted. TCP peers must not only keep track of their own initiated Sequence numbers but also those Acknowledgment numbers of their peers. Closing a TCP connection is achieved by the initiator sending a FIN packet. The connection only closes when an ACK has been sent by the other end and received by the initiator. Maintaining a TCP connection requires the stations to remember a number of different parameters such as port numbers and sequence numbers. Each connection has this set of variables located in a Transmission Control Block (TCB).
6. Sliding Window
6.1 Buffers
Buffers are used at each end of the TCP connection to speed up data flow when the network is busy. Flow Control is managed using the concept of a Sliding Window. A Window is the maximum number of unacknowledged bytes that are allowed in any one transmission sequence, or to put it another way, it is
the range of sequence numbers across the whole chunk of data that the receiver (the sender of the window size) is prepared to accept in its buffer. The receiver specifies the current Receive Window size in every packet sent to the sender. The sender can send up to this amount of data before it has to wait for an update on the Receive Window size from the receiver. The sender has to buffer all its own sent data until it receives ACKs for that data. The Send Window size is determined by whatever is the smallest between the Receive Window and the sender's buffer. When TCP transmits a segment, it places a copy of the data in a retransmission queue and starts a timer. If an acknowledgment is not received for that segment (or a part of that segment) before the timer runs out, then the segment (or the part of the segment that was not acknowledged) is retransmitted.
1. The current sequence number of the TCP sender is y. 2. The TCP receiver specifies the current negotiated window size x in every packet. This often specified by the operating system or the application, otherwise it starts at 536 bytes. 3. The TCP sender sends a datagram with the number of data bytes equal to the receiver's window size x and waits for an ACK from the receiver. The window size can be many thousands of bytes! 4. The receiver sends an ACK with the value y + x i.e. acknowledging that the last x bytes have been received OK and the receiver is expecting another transmission of bytes starting at byte y + x. 5. After a successful receipt, the window size increases by an additional x, this is called the Slow Start for new connections. 6. The sender sends another datagram with 2x bytes, then 3x bytes and so on up to the MSS as indicated in the TCP Options. 7. If the receiver has a full buffer, then the window size is reduced to zero. In this state, the window is said to be Frozen and the sender cannot send any more bytes until it receives a datagram from the receiver with a window size greater than zero. 8. If the data fails to be received as determined by the timer which is set as soon as data is set until receipt of an ACK, then the window size is cut by half e.g. from 4x to 2x. Failure could be due to congestion e.g. a full buffer on the receiver, or faults on the media. 9. On the next successful transmission, the slow ramp up starts again.
An 8KB window size would take 32ms to be transmitted on a 2Mbps serial link ((8192 * 8)/2048000 = 0.032s). The RTT is therefore 64ms. So for every 64ms, 8KB is transmitted because packets can only be sent for 32ms of that time as we are having to await for ACKs i.e. we are not able to use the full capability of the bandwidth. Multiply this up and we find that an 8KB window gives us a maximum data throughput of 8192 * 8 * 1000/64 = 1024000bps (1Mbps), irrespective of the potential speed of the link.
An 8KB window size would take 400ms on a satellite link one way. The RTT is therefore 800ms. So for every 800ms, 8KB is transmitted because packets can only be sent for 400ms of that time as we are having to await for ACKs i.e. again we are not able to use the full capability of the bandwidth. Multiply this up and we find that an 8KB window gives us a maximum data throughput of 8192 * 8 * 1000/800 = 81920bps or about 82kbps, irrespective of the potential speed of the link. This is because of the enormous delay.
An 8KB window size would take 7s to be transmitted on a 9600bps serial link ((8192 * 8)/9600 = 6.83s). Most of the 8KB window will be buffered because of the serialisation delay as bits are sent much more slowly.
The TCP 16-bit window size field allows a maxmimum size of 65535 bytes for the window size so 64KB can be sent every RTT. For a satellite link with 800ms RTT the maximum throughput with this maximum sized window is given by 65535 * 8 * 1000/800 = 655350bps or about 660Kbps. An expensive 2Mbps satellite link would not be fully utilised. One of the TCP Options allows you to scale the window size up to a 30-bit field this is the Window Scale Option described in RFC 1323. It would be preferable to have a window size appropriate to the size of the link. There would be less buffering, the ACKs would return more quickly and more of the bandwidth would be used. Ideally you are looking for a Window Size >= Bandwidth * RTT. So a 128Kbps serial line with a RTT of 40ms would require a Window size of at least 128000/8 * 0.04 = 640 bytes. Similarly, a 2Mbps link with a 20ms RTT would need a window size of at least 2000000/8 * 0.02 = 5000 bytes. So a 128Kbps satellite link with a RTT of 800ms would require a Window size of at least 128000/8 * 0.8 = 12800 bytes. A technique such as this (although more complex) is used by the Packeteer product that spoofs the TCP connections between client and server and modifies the window sizes according to the characteristics of the links between them.
Type of segment 160.221.172.250 SYN Seq.no. 17768656 (next seq.no. 17768657) Ack.no. 0 Window 8192 LEN = 0 bytes SYN-ACK
160.221.73.26
Seq.no. 82980009 (next seq.no. 82980010) Ack.no. 17768657 Window 8760 LEN = 0 bytes
ACK
Seq.no. 17768657 (next seq.no. 17768657) Ack.no. 82980010 Window 8760 LEN = 0 bytes
Seq.no. 17768657 (next seq.no. 17768729) Ack.no. 82980010 Window 8760 LEN = 72 bytes of data Seq.no. 82980010 (next seq.no. 82980070) Ack.no. 17768729 Window 8688
LEN = 60 bytes of data Seq.no. 17768729 (next seq.no. 17768885) Ack.no. 82980070 Window 8700 LEN = 156 bytes of data Seq.no. 82980070 (next seq.no. 82980222) Ack.no. 17768885 Window 8532 LEN = 152 bytes of data FIN Seq.no. 17768885 (next seq.no. 17768886) Ack.no. 82980222 Window 8548 LEN = 0 bytes FIN-ACK Seq.no. 82980222 (next seq.no. 82980223) Ack.no. 17768886 Window 8532 LEN = 0 bytes ACK Seq.no. 17768886 (next seq.no. 17768886) Ack.no. 82980223 Window 8548 LEN = 0 bytes The value of LEN is the length of the TCP data which is calculated by subtracting the IP and TCP header sizes from the IP datagram size.
1. The session begins with station 160.221.172.250 initiating a SYN containing the sequence number 17768656 which is the ISS. In addition, the first octet of data contains the next sequence
number 17768657. There are only zeros in the Acknowledgement number field as this is not used in the SYN segment. The window size of the sender starts off as 8192 octets as assumed to be acceptable to the receiver. 2. The receiving station sends both its own ISS (82980009) in the sequence number field and acknowledges the sender's sequence number by incrementing it by 1 (17768657) expecting this to be the starting sequence number of the data bytes that will be sent next by the sender. This is called the SYN-ACK segment. The receiver's window size starts off as 8760. 3. Once the SYN-ACK has been received, the sender issues an ACK that acknowledges the receiver's ISS by incrementing it by 1 and placing it in the acknowledgement field (82980010). The sender also sends the same sequence number that it sent previously (17768657). This segment is empty of data and we don't want the session just to keep ramping up the sequence numbers unnecessarily. The window size of 8760 is acknowledged by the sender. 4. From now on ACKs are used until just before the end of the session. The sender now starts sending data by stating the sequence number 17768657 again since this is the sequence number of the first byte of the data that it is sending. Again the acknowledgement number 82980010 is sent which is the expected sequence number of the first byte of data that the receiver will send. In the above scenario, the sender is intitially sending 72 bytes of data in one segment. The network analyser may indicated the next expected sequence number in the trace, in this case this will be 17768657 + 72 = 17768729. The sender has now agreed the window size of 8760 and uses it itself. 5. The receiver acknowledges the receipt of the data by sending back the number 17768729 in the acknowledgement number field thereby acknowledging that the next byte of data to be sent will begin with sequence number 17768729 (implicit in this is the understanding that sequence numbers up to and including 17768728 have been successfully received). Notice that not every byte needs to be acknowledged. The receiver also sends back the sequence number of the first byte of data in its own segment (82980010) that is to be sent. The receiver is sending 60 bytes of
data. The receiver subtracts 72 bytes from its previous window size of 8760 and sends 8688 as its new window size. 6. The sender acknowledges the receipt of the data with the number 82980070 (82980010 + 60) in the acknowledgement number field, this being the sequence number of the next data byte expected to be received from the receiver. The sender sends 156 bytes of data starting at sequence number 17768729. The sender subtracts 60 bytes from its previous window size of 8760 and sends the new size of 8700. 7. The receiver acknowledges receipt of this data with the number 17768885 (17768729 + 156) since it was expecting it, and sends 152 bytes of data beginning with the sequence number 82980070. The receiver subtracts 156 bytes from the previous window size of 8688 and sends the new window size of 8532. 8. The sender acknowledges this with the next expected sequence number 82980070 + 152 = 82980222 and sends the expected sequence number 17768885 in a FIN because at this point the application wants to close the session. The sender subtracts 152 bytes from its previous window size of 8700 and sends the new size of 8548. 9. The receiver sends an FIN-ACK acknowledging the FIN and increments the acknowledgement sequence number by 1 to 17768886 which is the number it will expect on the final ACK. In addition the receiver sends the expected sequence number 82980223. The window size remains at 8532 as no data was received from the sender's FIN. 10. The final ACK is sent by the sender confirming the sequence number 17768886 and acknowledges receipt of 1 byte with the acknowledgement number 82980223. The window size finishes at 8548 and the TCP connection is now closed. From the above you can see that if you have applications where data flow is largely unidirectional, you can have a scenario where there could be a long series of ACKs where the sequence numbers are the same as far as the data receiver is concerned. Also, you may have a frozen window whilst the application catches up which means that in the meantime acknowledgements are sent by the receiver with a window size of 0 until buffer space is freed up and an acknowledgement is sent with the window size ramped up again, thereby allowing the sender to send data again and the sequence numbers start increasing again. The above example is a clean straightforward bi-directional data transfer session, however you often have multiple TCP sessions to sort through using different ports and sequence numbers, plus in any one session segments could be resent, sent in a row or the window is frozen due to the stack buffer being full
all of which can make it interesting tracking sequence numbers. Be aware that the ACK only has to acknowledge the last sequence number received, so if four segments have been sent in a row, only one ACK is required. If sequence numbers do not arrive then the whole segment is lost with all the bytes of data within it, plus any segments that may have been sent in a row before the lost segment. You will notice in the above example that the window size steadily decreased, this indicates that no data had been processed off the TCP stack by the time the session had finished. On a longer session you should see the window size creep up again as the buffer is emptied by the application. In the example the window sizes could easily be followed because the segment packets followed each other, however most often acknowledgements do not always follow and may be acknowledging more than one segment, this makes it more tricky to follow. The above description details the simplest case of TCP connections, however you can get more complex scenarios where simultaneous connections are set up, or segments get lost or resent. The judicious use of RST (Reset) helps clean these connections up. You can follow step by step these different scenarios in RFC 793.