
1

User Datagram Protocol (UDP)


Transmission Control Protocol (TCP)
Malathi Veeraraghavan (original set by Jörg Liebeherr)
2
Orientation
TCP and UDP are end-to-end protocols
Which means they are executed on the end points (hosts)
[Figure: two hosts, each running Application over TCP/UDP over IP, attached to Network 1 and Network 2 respectively; an IP router between them runs only IP on top of the Network 1 and Network 2 protocols]
3
Transport Protocols in the Internet
The Internet supports 2 transport protocols:
UDP - User Datagram Protocol
datagram oriented
unreliable, connectionless
simple
unicast and multicast
useful for multimedia applications
used for control protocols: network management (SNMP), routing (RIP), naming (DNS), etc.
TCP - Transmission Control Protocol
stream oriented
reliable, connection-oriented
complex
only unicast
used for data applications: web (HTTP), email (SMTP), file transfer (FTP), remote terminal sessions (e.g., with SecureCRT), etc.
4
UDP - User Datagram Protocol
UDP extends the host-to-host delivery service of IP to an application process-to-application process delivery service
It does this by multiplexing and demultiplexing packets from multiple application-to-application communication sessions
[Figure: applications on two hosts communicate through UDP over IP; the IP routers along the path run only IP]
5
UDP packet format
Packet layout: IP header (20 bytes) | UDP header (8 bytes) | UDP data (payload)
UDP header fields (bit positions 0-15 and 16-31 of each 32-bit word):
Source Port Number | Destination Port Number
UDP message length | Checksum
Port numbers identify sending and receiving applications (processes).
Maximum port number is $2^{16} - 1 = 65{,}535$
Message Length is between 8 bytes (i.e., data field can be empty) and
65,535 bytes (length of UDP header and data in bytes)
Checksum is for UDP header and UDP data
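As a rough illustration of the 8-byte header described above, the following Python sketch packs and parses the four 16-bit fields. The function names `build_udp_header` and `parse_udp_header` are hypothetical, and the checksum is simply passed in rather than computed.

```python
import struct

# Network byte order, four unsigned 16-bit fields:
# source port, destination port, message length, checksum.
UDP_HEADER_FORMAT = "!HHHH"
UDP_HEADER_LEN = struct.calcsize(UDP_HEADER_FORMAT)  # 8 bytes


def build_udp_header(src_port, dst_port, payload, checksum=0):
    """Pack a UDP header; the length field covers header plus data."""
    length = UDP_HEADER_LEN + len(payload)
    return struct.pack(UDP_HEADER_FORMAT, src_port, dst_port, length, checksum)


def parse_udp_header(datagram):
    """Return (src_port, dst_port, length, checksum) from the first 8 bytes."""
    return struct.unpack(UDP_HEADER_FORMAT, datagram[:UDP_HEADER_LEN])


payload = b"hello"
datagram = build_udp_header(40001, 53, payload) + payload
print(parse_udp_header(datagram))
```

With an empty payload the length field is 8, which matches the minimum message length stated on the slide.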
6
Port Numbers
UDP (and TCP) use port numbers to identify applications
Port numbers are 16 bits long, so they range from 0 to 65,535 on each host (UDP and TCP have separate port spaces).

[Figure: IP demultiplexes incoming packets to TCP or UDP based on the Protocol field in the IP header; TCP and UDP each demultiplex to user processes based on port number]
7
TCP
Service offered by TCP
Byte stream service
Method for sharing link bandwidth among many flows
TCP Header
TCP Connection Establishment and Termination
Flow control
Error control
Congestion control
8
TCP = Transmission Control Protocol
Provides a reliable unicast end-to-end byte stream over an
unreliable internetwork.


[Figure: two TCP endpoints exchange a byte stream in each direction across the IP internetwork]
9
Byte Stream Service
To the lower layers, TCP handles data in "segments"
To the higher layers, TCP handles data as a sequence of bytes and does not preserve boundaries between the application's writes
So: higher layers do not know about the beginning and end of segments!
[Figure: the sending application writes 100 bytes, then 20 bytes, into TCP's queue of bytes to be transmitted; TCP carries them in segments; the receiving application reads 40 bytes three times from TCP's queue of bytes that have been received]
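The byte-stream behavior in this example can be sketched with a toy queue. This is only an illustration of the service model, not of how a real TCP stack is built; the class name `ByteStream` is hypothetical.

```python
class ByteStream:
    """Toy model of one direction of a TCP connection: bytes in, bytes out."""

    def __init__(self):
        self._buffer = bytearray()

    def write(self, data):
        # Writes are appended to the queue; their boundaries are not recorded.
        self._buffer.extend(data)

    def read(self, max_bytes):
        # The reader gets up to max_bytes, regardless of how they were written.
        chunk = bytes(self._buffer[:max_bytes])
        del self._buffer[:max_bytes]
        return chunk


stream = ByteStream()
stream.write(b"a" * 100)  # 1. write 100 bytes
stream.write(b"b" * 20)   # 2. write 20 bytes
reads = [stream.read(40) for _ in range(3)]  # three reads of 40 bytes
print([len(r) for r in reads])
```

The third read returns the last 20 bytes of the first write followed by all 20 bytes of the second write, so the boundary between the two writes is invisible to the reader.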
10
TCP is reliable
Byte stream is broken up into chunks which are called segments
Detecting errors/losses:
TCP has checksums for header and data. Segments with invalid
checksums are discarded
Timeout mechanism waiting for ACK
Correcting errors/losses:
Retransmit when packet loss/error is detected
11
Main source of "lost" segments
When a buffer in a line card of an IP router is full, new
incoming packets headed for an output port on that line card
will simply be dropped
recall packet-based multiplexing and packet switch design
As new communication sessions start up, there is potential for
congestion, i.e., buffer overflows
12
TCP Congestion control: important function
TCP at the sending host constantly adjusts its sending rate based on its
assessment of the path congestion
Conflicting goals:
Use high sending rate to lower file-transfer delay
But if every TCP sender sends at a high rate (greedy), this can cause
buffer overflows
TCP senders are constantly trying to find the optimal rate.
This is called congestion control.
Added into TCP in 1988
Van Jacobson and Karels proposed a congestion control scheme in
response to a series of congestion collapses on the Internet starting in
1986
13
An example to show impact of congestion control
Traceroute to University of Oregon
Understand that a "flow" (TCP session) can pass through
many IP routers
Obtain a quick understanding of today's network topology
University (UVa)
Regional network (MAX)
Backbone (core) network (Internet2)
A "formula" for capturing effect of congestion and losses on
Throughput
File-transfer delay
14
Traceroute from UVa to Univ. of Oregon
<rivanna.ee.virginia.edu> traceroute www.uoregon.edu
traceroute to www.uoregon.edu (128.223.142.89), 30 hops max, 38 byte packets
1 gilmer-router-all.acc.Virginia.EDU (128.143.10.254) 0.405 ms 0.308 ms 0.355 ms
2 carruthers-6509a-x.misc.Virginia.EDU (128.143.222.46) 27.092 ms 2.091 ms 0.480 ms
3 new-internet-x.misc.Virginia.EDU (128.143.222.93) 0.483 ms 0.466 ms 0.983 ms
4 nwv-vortex-i2.misc.Virginia.EDU (192.35.48.34) 3.857 ms 3.847 ms 3.856 ms
5 max-router-mclean.104.networkvirginia.net (192.70.138.102) 4.106 ms 4.070 ms 3.977 ms
6 i2-lvl3.maxgigapop.net (206.196.178.46) 4.107 ms 4.090 ms 3.980 ms
7 so-0-0-0.0.rtr.atla.net.internet2.edu (64.57.28.6) 28.093 ms 17.582 ms 20.971 ms
8 so-0-2-0.0.rtr.hous.net.internet2.edu (64.57.28.43) 40.958 ms 40.938 ms 40.953 ms
9 so-3-0-0.0.rtr.losa.net.internet2.edu (64.57.28.44) 82.188 ms 72.924 ms 72.943 ms
10 vl-101.xe-0-0-0.core0-gw.pdx.oregon-gigapop.net (198.32.165.65) 94.794 ms 94.787 ms 94.797 ms
11 vl-105.uonet9-gw.eug.oregon-gigapop.net (198.32.165.92) 97.415 ms 97.402 ms 97.419 ms
12 ge-5-2.uonet1-gw.uoregon.edu (128.223.3.1) 98.182 ms 97.899 ms 97.795 ms
13 www.uoregon.edu (128.223.142.89) 97.546 ms 97.533 ms 97.422 ms
15
University, regional and core networks
University: UVa
http://odin.itc.virginia.edu/~cricket/cricket/mini-graph.cgi?target=%2Frouter-interfaces
see internet-1 and vl20 I2/Abilene via VORTEX (interface to MAX)
Regional: MAX (Mid-Atlantic Crossroads)
http://wiki.maxgigapop.net/twiki/bin/view/MAX/MRTGGraphs
see link to Internet2
Backbone: Internet2 traffic monitoring web site
http://atlas.grnoc.iu.edu/atlas.cgi?map_name=Internet2%20IP%20Layer
click on any link between routers
Regional: Oregon Gigapop
University: Univ. of Oregon
16
Example showing impact of congestion control
File transfer delay is determined by
bottleneck link rate, $r = \min(r_1, r_2, r_3, r_4, r_5)$
packet loss rate on end-to-end path, p

round-trip time (RTT)
[Figure: end-to-end path through four IP routers over links with rates $r_1$ to $r_5$, with other links attached to each router; the RTT spans a data segment in one direction and its ACK in the other]
17
Throughput - approximate formula for large transfers
Throughput is the effective rate, $r_{\text{effective}}$
Parameters
r: Bottleneck link rate
RTT: Round-Trip Time
MSS: Maximum Segment Size (maximum size of TCP payload)
p: Packet loss on the path
The macroscopic behavior of the TCP congestion avoidance algorithm
by Mathis, Semke, Mahdavi & Ott in Computer Communication Review,
27(3), July 1997
$$r_{\text{effective}} = \min\left(r,\ \frac{MSS}{RTT} \cdot \frac{1}{\sqrt{p}}\right)$$
18
File-transfer delay (approximation)
For large files:
File-transfer delay: D (sec)
File size: S (bits)
Throughput: $r_{\text{effective}}$ (b/s)
$$D = \frac{S}{r_{\text{effective}}}$$
As before, interpret the word "delay" as the time taken to transfer the file.
Cannot use this for small file transfers
19
In-class exercise
On a given path, if
packet loss rate = 1% (moderate load)
round-trip time = 50ms (wide-area session: DC to LA)
bottleneck link rate = 100Mbps (Fast Ethernet)
What is the throughput?
How long will it take to transfer a 100MB file?
What is the throughput if the RTT is 1ms, while the rest of the
parameters stay the same?
What if in the wide-area (first) scenario, packet loss rate is 0?
MSS = 1460B; with 20-byte TCP header and 20-byte IP header, we
get the MTU limit of 1500B for an Ethernet frame
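One way to work through the exercise is the sketch below. It assumes the throughput approximation is r_effective = min(r, MSS / (RTT * sqrt(p))) and the delay approximation is D = S / r_effective, and it assumes 1 MB = 10^6 bytes. The function names are hypothetical.

```python
import math


def effective_rate(r_bps, mss_bytes, rtt_s, p):
    """r_effective = min(r, MSS / (RTT * sqrt(p))), in bits per second."""
    if p == 0:
        return r_bps  # no loss: only the bottleneck link limits the rate
    return min(r_bps, (mss_bytes * 8) / (rtt_s * math.sqrt(p)))


def transfer_delay(file_bits, rate_bps):
    """D = S / r_effective; an approximation for large files only."""
    return file_bits / rate_bps


MSS = 1460              # bytes
LINK = 100e6            # 100 Mbps bottleneck link
FILE_BITS = 100e6 * 8   # 100 MB, assuming 1 MB = 10^6 bytes

wan = effective_rate(LINK, MSS, 0.050, 0.01)
print("wide-area throughput (b/s):", wan)
print("100 MB transfer delay (s):", transfer_delay(FILE_BITS, wan))
print("RTT = 1 ms throughput (b/s):", effective_rate(LINK, MSS, 0.001, 0.01))
print("lossless throughput (b/s):", effective_rate(LINK, MSS, 0.050, 0.0))
```

Under these assumptions the wide-area case is loss-limited at roughly 2.3 Mbps, while the 1 ms RTT case and the lossless case are both limited by the 100 Mbps link.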
20
TCP
Service offered by TCP
TCP Header
TCP Connection Establishment and Termination
Flow control
Error control
Congestion control
21
TCP Format
Segment layout: IP header (20 bytes) | TCP header (20 bytes plus options) | TCP data (optional)
TCP header fields (bit positions 0-15 and 16-31 of each 32-bit word):
Source Port Number | Destination Port Number
Sequence number (32 bits)
Acknowledgment number (32 bits)
header length (4 bits) | Flags | window size
Options (if any)
TCP segments have a 20-byte header plus options, with >= 0 data bytes
Fields shaded in the original figure are omitted for this class
22
TCP header fields - Port Numbers
Port Number:
A port number identifies the endpoint of a connection.
A pair <IP address, port number> identifies one endpoint of a
connection.
Two pairs <client IP address, client port number> and
<server IP address, server port number> identify a TCP
connection.
[Figure: two hosts, each with Applications over TCP over IP, connected by an IP network; one host's applications use ports 23, 104, and 80, the other's use ports 7, 16, and 80]
23
TCP header fields - Sequence Number
Sequence Number (SeqNo):
Sequence number is 32 bits long.
So the range of SeqNo is $0 \le \text{SeqNo} \le 2^{32} - 1$ (about 4.3 Gbytes)
The sequence number carried in the TCP header of a segment is the position of the first byte of the segment in the overall byte stream (TCP segments are variable length)
Initial Sequence Number (ISN) of a connection is set
during connection establishment

[Figure: byte stream split into Segment 1 (Seq. No. 1, bytes 1 to 500), Segment 2 (Seq. No. 501, bytes 501 to 1500), and Segment 3 (Seq. No. 1501, bytes 1501 to 1550)]
24
TCP header fields - Ack. No.
Acknowledgment Number (AckNo):
Acknowledgments are piggybacked, i.e., a segment from A -> B contains an acknowledgement for a segment sent in the B -> A direction
The AckNo in the B -> A segment header contains the SeqNo for the next segment expected at B for the A -> B flow
Example: The acknowledgment for a 1500-byte segment
with the sequence number 0 is AckNo=1500
A host uses the AckNo field to send acknowledgements.
If a host sends an AckNo in a segment it sets the ACK
flag
25
TCP header fields - Ack. No. Contd.
Example:
Sender sends two segments with bytes 1..1500 and
1501..3000, but receiver only gets the second segment.
What is the sequence number of the first segment?
What is the sequence number of the second segment?
What is the ACK number sent in response by the
receiver when it receives the second segment?
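A small sketch of cumulative acknowledgment may help with this example. It assumes the receiver acknowledges only the contiguous prefix of the byte stream it has received; the function name `cumulative_ack` and the (seq_no, length) representation are hypothetical.

```python
def cumulative_ack(next_expected, received):
    """Return the AckNo after the given (seq_no, length) segments arrive."""
    for seq_no, length in sorted(received):
        if seq_no > next_expected:
            break  # a gap: later bytes cannot be acknowledged yet
        next_expected = max(next_expected, seq_no + length)
    return next_expected


# Only the second segment (bytes 1501..3000) has arrived.
print(cumulative_ack(1, [(1501, 1500)]))
# Both segments (bytes 1..1500 and 1501..3000) have arrived.
print(cumulative_ack(1, [(1501, 1500), (1, 1500)]))
```

In the first call the AckNo stays at the sequence number of the missing first segment; once the gap is filled, the AckNo jumps past both segments.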
26
TCP header fields - Header Length
Header Length (4 bits):
Length of header in 32-bit words
Note that TCP header has variable length (minimum of 20
bytes)


27
TCP header fields - Flags
Flag bits (3 other flags omitted for this class)
ACK: Acknowledgement Number is valid
SYN: Synchronize sequence numbers
Sent in the first packet when opening a connection
FIN: Sender is finished with sending
Used for closing a connection
Both sides of a connection must send a FIN
28
TCP header fields
Window Size:
Each side of the connection advertises its receiving
window size
also called flow window or advertised window
Window size is the maximum number of bytes that a
receiver can accept (i.e., space left in the receive buffer).
Maximum window size is $2^{16} - 1 = 65535$ bytes
29
TCP header fields - Options
Options - an important one included for this class:
Maximum Segment Size (MSS):
Sets the maximum length of the TCP segment payload
(DATA field) where application data is carried
This option can only appear in a SYN segment






Maximum Segment Size option layout: kind=2 (1 byte) | len=4 (1 byte) | maximum segment size (2 bytes)
30
TCP
Service offered by TCP
TCP Header
TCP Connection Establishment and Termination
Flow control
Error control
Congestion control
31
Connection Management in TCP

Opening a TCP Connection
Closing a TCP Connection
32
TCP Connection Establishment
TCP uses a three-way handshake to open a connection:
(1) ACTIVE OPEN: Client sends a segment with
SYN bit set
port number of client, port number of server
initial sequence number (ISN) of client
(2) PASSIVE OPEN: Server responds with a segment with
SYN bit set
initial sequence number of server
ACK for ISN of client
(3) Client acknowledges by sending a segment with:
ACK for ISN of server
33
Three-Way Handshake
[Figure: three-way handshake between aida.virginia.edu and mng.virginia.edu]
aida -> mng: SYN (SeqNo = x)
mng -> aida: SYN (SeqNo = y, AckNo = x + 1)
aida -> mng: ACK (AckNo = y + 1)
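The numbering in the handshake can be sketched as follows. This models only the header fields discussed on these slides; the dictionary layout and the function name `three_way_handshake` are hypothetical, and sequence numbers are assumed to wrap modulo 2^32 because the field is 32 bits long.

```python
SEQ_SPACE = 2 ** 32  # sequence numbers are 32 bits long


def three_way_handshake(client_isn, server_isn):
    """Return the three handshake segments as dictionaries of header fields."""
    syn = {"flags": {"SYN"}, "seq": client_isn}
    syn_ack = {
        "flags": {"SYN", "ACK"},
        "seq": server_isn,
        "ack": (client_isn + 1) % SEQ_SPACE,  # ACK for the client's ISN
    }
    ack = {
        "flags": {"ACK"},
        "seq": (client_isn + 1) % SEQ_SPACE,
        "ack": (server_isn + 1) % SEQ_SPACE,  # ACK for the server's ISN
    }
    return [syn, syn_ack, ack]


for segment in three_way_handshake(client_isn=1000, server_isn=5000):
    print(segment)
```

The SYN consumes one sequence number, which is why each side acknowledges the other's ISN plus one.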
34
First data segment sequence number
The data segment following the three-way handshake will
start with the sequence number following that of the SYN
segment
Example: if aida sends a data segment first to mng
[Figure: first data segments after the handshake between aida.virginia.edu and mng.virginia.edu]
aida -> mng: Data (SeqNo = x + 1, AckNo = y + 1), size of payload = S bytes
mng -> aida: Data (SeqNo = y + 1, AckNo = x + 1 + S)
35
TCP Connection Termination
Each end of the data flow must be shut down independently
(half-close)
If one end is done it sends a FIN segment. This means that
no more data will be sent

Four steps involved:
(1) X sends a FIN to Y (active close)
(2) Y ACKs the FIN,
(at this time: Y can still send data to X)
(3) and Y sends a FIN to X (passive close)
(4) X ACKs the FIN.
36
TCP
Service offered by TCP
TCP Header
TCP Connection Establishment and Termination
Flow control
Error control
Congestion control
37
TCP flow control
Flow Control: How to prevent the sender
from overrunning the receiver buffer?
Flow Control in TCP
TCP implements sliding window flow control
Window size is usually sent within acknowledgements.
38
Window Management in TCP
The receiver returns two parameters to the sender in an ACK:
AckNo (32 bits)
window size, win (16 bits)
The interpretation is:
I am ready to receive new data with
SeqNo = AckNo, AckNo+1, ..., AckNo+Win-1
Receiver can acknowledge data without increasing the advertised window
Receiver can change the window size without acknowledging data
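A minimal sketch of how a sender could turn the two returned parameters into a sending allowance is shown below. It assumes the receiver accepts SeqNo values from AckNo to AckNo+Win-1, ignores sequence-number wraparound, and uses the hypothetical name `usable_window`.

```python
def usable_window(ack_no, win, next_seq_to_send):
    """Bytes the sender may still send, given the latest AckNo and window.

    Anything already sent beyond AckNo counts against the advertised window.
    """
    in_flight = next_seq_to_send - ack_no
    return max(0, win - in_flight)


print(usable_window(ack_no=2048, win=2048, next_seq_to_send=2048))
print(usable_window(ack_no=4096, win=0, next_seq_to_send=4096))
print(usable_window(ack_no=4096, win=1024, next_seq_to_send=4096))
```

A window of 0 blocks the sender until a later ACK advertises a non-zero window, even if the AckNo itself does not change.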
39
Sliding Window: Example



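[Figure: sliding window example with a 4K receiver buffer]
Sender sends 2K of data (SeqNo=0); receiver buffer holds 2K of 4K
Receiver returns AckNo=2048 Win=2048
Sender sends 2K of data (SeqNo=2048); receiver buffer holds 4K of 4K
Receiver returns AckNo=4096 Win=0; sender is blocked
Receiver later returns AckNo=4096 Win=1024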
40
Sliding Window: In-class example
[Figure: in-class example between Sender and Receiver; the recoverable labels follow]
MSS = 1024 bytes; buffer of 4K bytes
Max-Data(seq n): maximum-sized data segment with sequence number n
win 4096
Max-Data(seq 1)
ack 1025 win 3072: how many more maximum-sized segments can it send now?
Max-Data(seq 1025): how many segments can it send now?
Max-Data(seq 2049)
Max-Data(seq 3073)
41
TCP
Service offered by TCP
TCP Header
TCP Connection Establishment and Termination
Flow control
Error control
Congestion control
42
TCP error control
ARQ scheme with positive cumulative ACKs
Already seen that the header has sequence numbers and
acknowledgment numbers
Now, we will focus on the retransmission timer.
How long should the sender wait for an ACK after sending a
segment before it concludes that the segment is lost and
retransmits it?


43
TCP Retransmission Timer
Retransmission Timer:
The setting of the retransmission timer is crucial for
efficiency
Timeout value too small -> results in unnecessary
retransmissions
Timeout value too large -> long waiting time before a
retransmission can be issued

A problem is that the delays in the network are not fixed
Therefore, the retransmission timers must be adaptive
44
Measuring TCP Retransmission Timers
Setup: ftp session from aida.virginia.edu to rigoletto.virginia.edu
Transfer file from aida to rigoletto
Unplug Ethernet cable in the middle of file transfer
45
tcpdump Trace
10:42:01.704681 aida.40001 > rigoletto.ftp-data: . 161189:162649(1460) ack 1 win 17520
10:42:01.705603 aida.40001 > rigoletto.ftp-data: . 162649:164109(1460) ack 1 win 17520
10:42:01.706753 aida.40001 > rigoletto.ftp-data: . 164109:165569(1460) ack 1 win 17520
10:42:02.741764 aida.40001 > rigoletto.ftp-data: . 161189:162649(1460) ack 1 win 17520
10:42:05.741788 aida.40001 > rigoletto.ftp-data: . 161189:162649(1460) ack 1 win 17520
10:42:11.741828 aida.40001 > rigoletto.ftp-data: . 161189:162649(1460) ack 1 win 17520
10:42:23.741951 aida.40001 > rigoletto.ftp-data: . 161189:162649(1460) ack 1 win 17520
10:42:47.742176 aida.40001 > rigoletto.ftp-data: . 161189:162649(1460) ack 1 win 17520
10:43:35.742587 aida.40001 > rigoletto.ftp-data: . 161189:162649(1460) ack 1 win 17520
10:44:39.743140 aida.40001 > rigoletto.ftp-data: . 161189:162649(1460) ack 1 win 17520
10:45:43.743702 aida.40001 > rigoletto.ftp-data: . 161189:162649(1460) ack 1 win 17520
10:46:47.744271 aida.40001 > rigoletto.ftp-data: . 161189:162649(1460) ack 1 win 17520
10:47:51.752138 aida.40001 > rigoletto.ftp-data: . 161189:162649(1460) ack 1 win 17520
10:48:55.745547 aida.40001 > rigoletto.ftp-data: . 161189:162649(1460) ack 1 win 17520
10:49:59.746123 aida.40001 > rigoletto.ftp-data: . 161189:162649(1460) ack 1 win 17520
10:51:03.745839 aida.40001 > rigoletto.ftp-data: R 165569:165569(0) ack 1 win 17520
Sequence no.
46
Interpreting the Measurements
The interval between retransmission attempts in seconds is:
1.03, 3, 6, 12, 24, 48, 64, 64, 64, 64, 64, 64, 64.
Time between retransmissions is doubled each time (Exponential Backoff Algorithm)
Timer is not increased beyond 64 seconds
TCP gives up after 13th attempt and 9 minutes (total timeout, tcp_ip_abort_interval is 2 mins in Solaris and can be programmed by administrator - 9 mins is the commonly used old timeout value)
[Figure: plot of seconds (0 to 600) versus transmission attempts (0 to 12)]
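The doubling-with-a-cap pattern in these measurements can be sketched as a small generator. This is an illustration only: the function name `backoff_intervals` is hypothetical, and the sequence is started at the 3-second interval because the first measured interval (1.03 s) does not follow the doubling pattern.

```python
def backoff_intervals(first_interval, attempts, cap=64.0):
    """Yield retransmission intervals that double each time, up to a cap."""
    interval = first_interval
    for _ in range(attempts):
        yield interval
        interval = min(interval * 2, cap)


# Starting from 3 seconds with a 64-second cap, as in the trace.
print(list(backoff_intervals(3.0, 8)))
```

The generated values 3, 6, 12, 24, 48, 64, 64, 64 match the measured intervals after the first retransmission.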
47
Adaptive mechanism
The retransmission mechanism of TCP is adaptive
The retransmission timers are set based on round-trip time (RTT) measurements
that TCP performs
[Figure: timeline of Segments 1 to 5 and their ACKs (ACK for Segment 1, ACK for Segment 2 + 3, ACK for Segment 4, ACK for Segment 5), with three RTT measurements marked RTT #1, RTT #2, and RTT #3]
The RTT is based on the time difference between segment transmission and ACK
But:
TCP does not ACK each segment
Can't start a second RTT measurement if timing on one segment is in progress
Each connection has only one timer
48
Computation of RTO in adaptive scheme
Retransmission timer is set to a Retransmission Timeout (RTO) value.
RTO is calculated based on the RTT measurements.
The RTT measurements are smoothed by the following estimators A (smoothed mean RTT) and D (smoothed mean deviation of RTT), where M is the newly measured RTT and Err is computed with the old value of A:

$$Err = M - A$$
$$A \leftarrow A(1-g) + gM$$
$$D \leftarrow D(1-h) + h|Err|$$
$$RTO = A + 4D$$
The gains are set to h = 1/4 and g = 1/8
In the formula for computing the new smoothed mean RTT A, 0.125 times the newly measured value (M) is added to 0.875 times the old smoothed value of A
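The update rules can be written as a short function. This sketch assumes the estimators are A (smoothed mean RTT) and D (smoothed mean deviation) with gains g = 1/8 and h = 1/4, that Err is computed against the old value of A, and that RTO = A + 4D; the function name `update_rto` is hypothetical.

```python
def update_rto(a, d, m, g=1 / 8, h=1 / 4):
    """Apply one RTT measurement m; return the new (A, D, RTO)."""
    err = m - a                     # error against the old smoothed mean
    a = a * (1 - g) + g * m         # smoothed mean RTT
    d = d * (1 - h) + h * abs(err)  # smoothed mean deviation
    return a, d, a + 4 * d


# Initial values A = 1, D = 1 and a measured RTT of 2.
print(update_rto(a=1.0, d=1.0, m=2.0))  # (1.125, 1.0, 5.125)
```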
49
Karn's Algorithm
If an ACK for a retransmitted segment is received, the sender cannot tell if the ACK belongs to the original or the retransmission.
The RTT measurement started for the original transmission should be terminated. There will be no RTT measurement for the original or retransmitted segment
Therefore A and D cannot be updated when the ACK is received, and hence no new RTO computation at this point.
Don't confuse this with the RTO being doubled when the segment is retransmitted following the exponential doubling rule.
[Figure: a segment times out and is retransmitted; when the ACK arrives, the RTT could be measured from either the original or the retransmission (RTT ?), so RTT measurement is suspended and the RTO is doubled]
50
In-class exercise
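Assume A=1, D=1 (initial values)
[Figure: Segment 1 is sent with RTO = 5 and its ACK returns after RTT = 2; Segment 2 is sent and lost (X), then retransmitted; ACK for Segment 2 + 3 arrives. The RTO values after the first ACK, at the retransmission, and after the final ACK are left as "RTO = ?"]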
51
RTO computation (adaptive)
Assume A=1, D=1 (initial values)
Err = 2 - 1 = 1 (since M, the measured RTT, is 2)
$A = 0.875 \times 1 + 0.125 \times 2 = 1.125$
$D = 1 + 0.25 \times (1 - 1) = 0.75 \times 1 + 0.25 \times 1 = 1$
$RTO = A + 4D = 1.125 + 4 = 5.125$
This is why in the figure below when segment 2 is lost, it is retransmitted after 5.125 sec.
[Figure: Segment 1 and its ACK (RTT = 2); Segment 2 is lost (X) and retransmitted after RTO = 5.125; ACK for Segment 2]
52
RTO computation (doubling)
Assume A=1, D=1 (initial values)
[Figure: the same exchange as the in-class exercise, annotated with the answers]
Before Segment 1 is sent: RTO = A + 4D = 5
After ACK for Segment 1 (RTT = 2): RTO = A + 4D = 5.125 (adaptive: new A = 1.125; D = 1)
Segment 2 is lost (X) and retransmitted after 5.125 sec, since that is the retransmission timeout value; RTO = 10.25 (doubling)
After ACK for Segment 2 + 3: RTO = 10.25 (Karn's algorithm: retransmitted segment, hence no RTT measurement)
53
In-class exercise
At $t_1$: RTO = 6 sec; A = 2; D = 1
Just after $t_2$: RTO = ?
At $t_3$: RTO = ?
[Figure: timeline from $t_1$ to $t_9$ showing a SYN that times out and is retransmitted, a SYN + ACK, an ACK, Segments 1 to 6 with ACKs for Segments 1, 2, 3, and 4, three RTT measurements (RTT #1, RTT #2, RTT #3), and an interval labeled 3 sec]
54
Thus there are two schemes for determining RTO
and two schemes for controlling RTT measurement
RTO
Exponential backoff (RTO doubling) if a segment is
retransmitted
Adaptive RTO as a function of RTT (A+4D)
RTT measurement
Karn's algorithm
no RTT measurement on retransmitted segment
Cannot start a second RTT measurement if timing on one segment is in progress
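The interaction of these schemes can be sketched in one small class. It assumes RTO = A + 4D with gains g = 1/8 and h = 1/4, doubling on every timeout, and no update of A, D, or RTO when the ACK is for a retransmitted segment (Karn's algorithm). The class name `RetransmissionTimer` is hypothetical, and the 64-second cap seen in the trace is left out for brevity.

```python
class RetransmissionTimer:
    """Sketch of the RTO rules: adaptive update, doubling, Karn's algorithm."""

    def __init__(self, a=1.0, d=1.0, g=1 / 8, h=1 / 4):
        self.a, self.d, self.g, self.h = a, d, g, h
        self.rto = a + 4 * d

    def on_ack(self, measured_rtt, was_retransmitted):
        if was_retransmitted:
            return  # Karn's algorithm: no RTT sample, RTO stays as it is
        err = measured_rtt - self.a
        self.a = self.a * (1 - self.g) + self.g * measured_rtt
        self.d = self.d * (1 - self.h) + self.h * abs(err)
        self.rto = self.a + 4 * self.d

    def on_timeout(self):
        self.rto *= 2  # exponential backoff


timer = RetransmissionTimer()  # A = 1, D = 1
history = [timer.rto]
timer.on_ack(measured_rtt=2.0, was_retransmitted=False)
history.append(timer.rto)
timer.on_timeout()  # segment 2 is lost and retransmitted
history.append(timer.rto)
timer.on_ack(measured_rtt=None, was_retransmitted=True)
history.append(timer.rto)
print(history)  # [5.0, 5.125, 10.25, 10.25]
```

The printed values follow the RTO sequence of the doubling example with A = 1 and D = 1.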
55
TCP
Service offered by TCP
TCP Header
TCP Connection Establishment and Termination
Flow control
Error control
Congestion control
56
What is congestion control?
In a connectionless packet-switched network, recall that no
reservations are made for bandwidth before data is sent
This can lead to input and output queues at packet switches
(IP routers on an Internet path) filling up and consequently
packets being dropped.
Such a condition is referred to as congestion.
Congestion control mechanisms are needed to recover the
network from a congested state.
57
Congestion Control
TCP implements congestion control at the sender
The sender has two parameters for congestion control:
Congestion Window (cwnd; Initial value is MSS bytes)
Threshold Value (ssthresh; Initial value is 65536 bytes)

The number of outstanding segments is set as follows:
Number of outstanding segments = MIN (flow window, congestion window)


flow window: available space in the receive-side buffer
MSS: Maximum Segment Size (set with option field in TCP header)
ssthresh: Slow Start Threshold
outstanding: sent but not yet acknowledged
58
Slow Start
For every ACK received, the TCP sender increases cwnd by the number of bytes acknowledged
If flow window is very large, cwnd determines how many segments can be sent
Waiting for an ACK decreases effective rate
[Figure: slow start timeline]
cwnd = 1xMSS: send segment 1; receive ACK for segment 1
cwnd = 2xMSS: send segments 2 and 3; receive ACK for segments 2 + 3
cwnd = 4xMSS: send segments 4, 5, 6, and 7; receive ACK for segments 4+5+6+7
cwnd = 8xMSS
59
Normal operation of Slow Start / Congestion Avoidance
If cwnd <= ssthresh then
/* Slow Start Phase */
Each time an ACK is received for a Max-size segment:
cwnd = cwnd + MSS
else /* cwnd > ssthresh */
/* Congestion Avoidance Phase */
Each time an ACK is received for a Max-size segment:
cwnd = cwnd + MSS * MSS / cwnd + MSS / 8
endif
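The rule above translates almost directly into code. The sketch below assumes cwnd, ssthresh, and MSS are all in bytes and that every ACK is for a max-size segment; the function name `cwnd_after_ack` is hypothetical.

```python
def cwnd_after_ack(cwnd, ssthresh, mss):
    """Return cwnd after one ACK for a max-size segment."""
    if cwnd <= ssthresh:
        return cwnd + mss                     # slow start phase
    return cwnd + mss * mss / cwnd + mss / 8  # congestion avoidance phase


cwnd = 512
for _ in range(5):  # five ACKs while cwnd <= ssthresh
    cwnd = cwnd_after_ack(cwnd, ssthresh=2560, mss=512)
print(cwnd)
print(cwnd_after_ack(cwnd, ssthresh=2560, mss=512))
```

With MSS = 512 and ssthresh = 2560, slow start still applies when cwnd equals ssthresh, so cwnd reaches 3072 before the first congestion avoidance step, which then adds about 149 bytes.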


60
In-class example
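(MSS = 512; flow window = 5120)
[Figure: segment and ACK timeline starting at cwnd=512; ssthresh=2560. Labeled values along the timeline: cwnd=1536, cwnd=2048, cwnd=2560, three values left as "cwnd=?", and the point marked "Enter congestion avoidance". Question: how many max-sized segments can be sent now?]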
61
Computation of cwnd on previous slide
Up to and including ack 2561, this TCP connection is in slow
start, and cwnd is increased by 1 MSS bytes each time an
ACK is received.
Note that when cwnd = ssthresh, slow start is still applied.
Hence when ack 2561 is received, cwnd = 2560+512 = 3072.
When the last ack shown on the previous slide is received, the TCP connection is in congestion avoidance since cwnd > ssthresh. Therefore, cwnd = cwnd + MSS * MSS / cwnd + MSS / 8 = 3072 + 512 * 512 / 3072 + 512 / 8 = 3072 + 85.33 + 64 = 3221.33, which these slides round up to 3222

62
Growth of cwnd
Assume that ssthresh = 4
[Figure: plot of cwnd (in segments, 0 to 14) versus roundtrip times (t=0 to t=7), with the ssthresh level marked]
63
Slow Start: impacts delay of small files
[Figure: Client/Server timeline showing the SYN and ACK exchange of connection setup, then packets 1 to 17 sent in slow-start rounds one RTT apart; the figure also carries the labels "T", "pkt", and "sp"]
When the number of outstanding packets equals BDP (Bandwidth-delay product), the transfer will reach a streaming state: no idle periods between packet transmissions
64
Small file transfers
Ethernet frame, carrying an IP datagram, which
transports a TCP segment: 1518 bytes
Ethernet header + trailer: 18 bytes
Ethernet MTU: 1500 bytes
IP header: 20 bytes (w/o options)
IP payload: 1480 bytes
TCP header: 20 bytes (w/o options)
TCP payload: 1460 bytes
Small file transfer delay (contrast with large
file transfer throughput equation)
Assume MSS = 1460 bytes
Assume lossless transfer and that the receive buffer is
not a bottleneck
Link rate: 1 Gbps
Emission delay for one Ethernet frame: 12 µs (1518 bytes at 1 Gbps)
if RTT = 50ms, then at what cwnd value will data start
streaming?
ACK arrives in time for sender to send a new frame
What is the minimum file size needed to reach this cwnd?
What is the file transfer delay of a 10KB file?
65
66
When congestion occurs: Congestion Avoidance Algorithm
When congestion occurs (indicated by timeout),
ssthresh is set to half the current window size (the
minimum of the flow window (FW) and cwnd):
ssthresh = min(cwnd,FW) / 2 but at least 2MSS
cwnd is changed according to:
cwnd = 1 MSS bytes (in case of timeout only)
When new data is acknowledged, cwnd is increased
according to whether it is in slow start or CA
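The reaction to a timeout can be sketched as below, assuming all quantities are in bytes and using the rule ssthresh = min(cwnd, FW) / 2 with a floor of 2 MSS and cwnd reset to 1 MSS. The function name `on_timeout` is hypothetical.

```python
def on_timeout(cwnd, flow_window, mss):
    """Return (new_cwnd, new_ssthresh) after a retransmission timeout."""
    ssthresh = max(min(cwnd, flow_window) / 2, 2 * mss)  # at least 2 MSS
    return mss, ssthresh  # cwnd restarts at 1 MSS


print(on_timeout(cwnd=8192, flow_window=5120, mss=512))
print(on_timeout(cwnd=1024, flow_window=5120, mss=512))
```

In the first call the flow window is the smaller of the two, so ssthresh becomes 2560; in the second call half the window would be 512, so the floor of 2 MSS = 1024 applies.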
67
Improved scheme for reacting to congestion: Fast retransmit/Fast recovery algorithm
When the sender receives three duplicate ACKs, it assumes congestion has occurred and reacts (instead of waiting for the retransmission timer to time out)
This algorithm is called Fast retransmit/Fast recovery
Details not included in this class.
68
Accelerated retransmissions (Fast retransmit)
TCP allows accelerated retransmissions (Fast Retransmit)
If receiver gets a segment out of order, it sends an ack with
the expected sequence number. If sender receives one or
two duplicate ACKs, it thinks segments are misordered.
When expected segment is received at receiver, it sends
the correct ACK. But if the third duplicate ACK is received
at sender, it assumes lost segments and retransmits
immediately without waiting for expiry of retransmission
timer. Hence it is called fast retransmit.
69
Fast Retransmit
After the third duplicate ACK (meaning fourth ACK) is received by the sender, it transmits a single segment without waiting for a timeout to expire.
[Figure: timeline with Data (100:200), repeated ACK 100 segments, and the retransmitted Data (100:200)]
70
Fast Recovery (segsize s MSS)

1. If 3rd duplicate ACK (this means fourth ACK with same ack no.) is received:
ssthresh = min(cwnd, receiver's advertised window)/2
cwnd = ssthresh + 3 * segsize; then retransmit segment
Reason: TCP receiver has to issue an ACK every time it receives a new segment. Therefore when the sender receives 3 duplicate ACKs it implies that three segments got through the network successfully; therefore it inflates the cwnd.
2. For each additional duplicate ACK received:
cwnd = cwnd + segsize
and transmit a segment if allowed by new value of cwnd
3. When an ACK arrives that acknowledges new data, set cwnd = ssthresh (this should be the ACK for the retransmission from step 1); additionally, it will ack intermediate segments between the lost packet and receipt of the third duplicate ACK, so set cwnd = cwnd + segsize; now in CA phase
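The three steps can be sketched as three small functions. The function names are hypothetical. One assumption goes beyond the slide text: ssthresh is rounded down to a whole number of segments, which is what reproduces the value ssthresh = 1536 used in the duplicate-ACK example for cwnd = 3222 and segsize = 512 (the unrounded formula would give 1611).

```python
def on_third_duplicate_ack(cwnd, advertised_window, segsize):
    """Return (cwnd, ssthresh) after the third duplicate ACK."""
    # Assumption: half the window, rounded down to whole segments.
    segments = min(cwnd, advertised_window) // (2 * segsize)
    ssthresh = segments * segsize
    return ssthresh + 3 * segsize, ssthresh  # inflate cwnd by 3 segments


def on_additional_duplicate_ack(cwnd, segsize):
    """Each further duplicate ACK inflates cwnd by one segment."""
    return cwnd + segsize


def on_new_ack(ssthresh, segsize):
    """ACK for new data: deflate cwnd to ssthresh, then add one segment."""
    return ssthresh + segsize


cwnd, ssthresh = on_third_duplicate_ack(3222, 5120, 512)
print(cwnd, ssthresh)
print(on_new_ack(ssthresh, 512))
```

With these inputs the sketch yields cwnd = 3072 and ssthresh = 1536 at the third duplicate ACK, and cwnd = 2048 after the ACK for new data.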
71
Example: duplicate ACKs
(congestion avoidance algorithm and fast retransmit/recovery algorithm)
In case of duplicate ACKs, both congestion avoidance algorithm and fast
retransmit/recovery algorithms apply
[Figure: sender/receiver timeline; the recoverable labels follow]
cwnd=3222; ssthresh=2560
PSH 3073:3585 (512) ack 10 is lost (X)
PSH 3585:4097 (512) ack 10
PSH 4097:4609 (512) ack 10
PSH 4609:5121 (512) ack 10
ack 3073 is received four times; the values shown beside the duplicate ACKs are:
cwnd=3222; ssthresh=1536
cwnd=3222; ssthresh=1536
cwnd=1536+3*512=3072; ssthresh=1536
Retransmission: PSH 3073:3585 (512) ack 10
ack 5121: cwnd=ssthresh=1536; then cwnd=2048
For the reason for the last cwnd increase to 2048, see the last case in Fig. 21.11 of Stevens
72
Sending rate (cwnd vs. time)
A typical plot of cwnd for a TCP connection:
[Figure: plot of cwnd versus time]
73
Relate cwnd to effective rate (bits/sec)
In the first round, effective rate = 1MSS/RTT
Sends one segment and waits for ACK
In the second round, effective rate = 2MSS/RTT
Since cwnd = 2, it sends 2 segments and waits for ACK
This control is used to manage the rate at which senders
send data into the network
Sender keeps increasing the sending rate to fully exploit
available bandwidth
If a segment is lost, sender interprets this as "congestion"
in the network, and so drops its sending rate by decreasing
cwnd
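The relation between cwnd and effective rate in each round can be sketched as follows, assuming cwnd is counted in max-size segments and that one window is sent per RTT. The function name `round_rate_bps` and the sample values (MSS = 1460 bytes, RTT = 50 ms) are illustrative assumptions.

```python
def round_rate_bps(cwnd_segments, mss_bytes, rtt_s):
    """Effective rate in bits/sec when cwnd segments are sent per RTT."""
    return cwnd_segments * mss_bytes * 8 / rtt_s


for cwnd_segments in (1, 2, 4, 8):
    rate = round_rate_bps(cwnd_segments, mss_bytes=1460, rtt_s=0.050)
    print(cwnd_segments, round(rate))
```

Doubling cwnd doubles the rate for that round, which is why decreasing cwnd after a loss directly lowers the sending rate.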
74
Test your understanding
Sending side:
1. what is the formula for determining how many segments a TCP sender is allowed to send?
2. what does a sender do when it receives an ACK?
updates cwnd and RTO, if an RTT measurement was taken
3. how does it update its RTO?
4. what happens when a loss is detected at the sender by a retransmission timer time-out?
how does it change the RTO?
how does it change the cwnd?
is any other parameter changed; if so, how? ssthresh
Receiving side:
1. what values are set in the acknowledgment number and window size header fields?
In a bidirectional flow, each end will have both a "sending side" and a "receiving side"
75
Supplemental reading
"TCP/IP Illustrated Vol. 1: The Protocols" by W.R. Stevens
Chapter 11: UDP
Chapter 17: TCP
Chapter 18: TCP Connection Establishment and
Termination (up to Section 18.5)
Chapter 20: Bulk data flow
Chapter 21: TCP timeout and retransmission (up to
Section 21.6)
