
Fundamentals

Circuit-Switching

Packet-Switching

Virtual-Circuit-Switching

Excursion

Data Communications and Networking


COSC 264
Introduction to Communication Networks
Dr. Andreas Willig (1)
Dr. Muhammad Asad Arfeen (2)
(1) Dept. of Computer Science and Software Engineering, University of Canterbury, Christchurch
(2) Dept. of Computer and Information Systems Engineering, NED University of Engineering & Technology, Karachi

UoC, 2014


Outline

Fundamentals
Circuit-Switching
Packet-Switching
Virtual-Circuit-Switching
Excursion


About this Module

Goals:
Understand the reasons for creating networks
Know major design issues
Understand basic operation modes of networks
This module is based on [9, Chap. 10, 11] and [6]


Fundamentals of Networks

So far we have mostly studied communication between two stations, using a single channel
How can we communicate in a larger population of users / stations / terminals? Say: all persons in a country?
Fundamental aim of networks:
  Provide good reachability and service quality to users ...
  ... but do so at reasonable costs for the users ...
  ... and give revenue to network providers
Networks become more useful to users as the number of users increases

Key question
How to build and operate large-scale networks?


A First Look at Network Design

We first consider a very simple model of a network: a simple communication graph
  Nodes represent stations / switching elements
  Edges represent direct communication links
Design goal: given a number N of stations, create links so as to make the graph connected!

Question
What are desirable additional objectives for such a graph and what are important constraints?


First Option: Fully Meshed Network

Each of N stations has a separate link/cable to each other station
Total number of links: $N(N-1)/2$
Advantage: cutting one link does not disable communication with other partners
Disadvantage: costs!!
Disadvantage: poor scalability!!


Second Option: A Minimum Spanning Tree

Total number of links: $N-1$
Advantage: minimal number of links to enable communication between all nodes
Disadvantage: one link failure can partition the network
Disadvantage: stations see signals for communications they are not involved in => security issue!
Disadvantage: stations need the ability to forward foreign communications => additional complexity


Third Option: A Star Network

Total number of links: N
One additional network element: a switching element
Advantage: one link failure does not partition the remaining network
Disadvantage: the switching element is now a single point of failure


Fourth Option: A Partially Meshed Network

Total number of links: between $N$ and $N(N-1)/2$
One link failure does not partition the network
One switching element might fail without compromising connectivity (much)


A First Conclusion

Conclusion
By carefully designing partially meshed networks we can
achieve a good balance between costs (# of links, # of
switching elements) and network properties like resilience
against link failures or switching element failures
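
The link counts and failure behaviour of the options above can be checked with a small sketch. It is only an illustration: the station numbering, the chain used as "a" spanning tree, and the node name `"hub"` for the switching element are assumptions made for this example, not part of the slides.

```python
from itertools import combinations

def full_mesh(n):
    # Every pair of the n stations gets its own link: n(n-1)/2 links.
    return list(combinations(range(n), 2))

def chain_tree(n):
    # One possible spanning tree (a simple chain 0-1-...-(n-1)): n-1 links.
    return [(i, i + 1) for i in range(n - 1)]

def star(n):
    # n stations, each linked to one extra switching element "hub": n links.
    return [(i, "hub") for i in range(n)]

def is_connected(nodes, links):
    # Depth-first search over an undirected graph.
    nodes = set(nodes)
    adj = {v: set() for v in nodes}
    for a, b in links:
        adj[a].add(b)
        adj[b].add(a)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for w in adj[v] - seen:
            seen.add(w)
            stack.append(w)
    return seen == nodes

def survives_any_single_link_failure(nodes, links):
    # True if the graph stays connected no matter which single link is cut.
    return all(
        is_connected(nodes, links[:i] + links[i + 1:])
        for i in range(len(links))
    )
```

For six stations this gives 15, 5 and 6 links for the full mesh, the chain and the star respectively. The full mesh survives any single link failure, the chain does not, and in the star a single link failure cuts off exactly one station while the others stay connected.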


A More Realistic Model

Links are labeled with available capacities (e.g. measured in Mbit/s)
Links could also be labeled with other quantities, e.g. delays, transmission costs, ..., or any combination of these
End nodes are labeled as A, B, C, ...
Missing in this figure, but nonetheless present, is the generated traffic demand


A More Realistic Model (2)

Suppose that nodes E and C want to communicate at a sustained rate of 3 Mbit/s
A possible route (which has to be identified!!) is shown


A More Realistic Model (3)

Now suppose that nodes D and B also want to communicate at a sustained rate of 3 Mbit/s
Again, a suitable route has to be identified
  Why this (the green) one?
  What is implicitly assumed here?
How would route setup considerations change if the communication flows are bursty instead of sustained?
Do you know of an example of bursty traffic?


Important Tasks in Networking

A primary task in any network is to identify suitable routes
  Subject to constraints on capacities, allowable delay, ...
  Respecting existing traffic flows
  Remember setting up the route from D to B
  This is called routing
A second task is to avoid traffic overload situations in specific network areas, or at least to react properly to them
  This is called congestion control
  The switching element close to B and C will likely not accept much more traffic
  Do you know of a real-life example of congestion control?
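
As a rough illustration of routing under capacity constraints, the sketch below searches for a fewest-hop route that only uses links with enough residual capacity, then reserves the demanded rate along it. The topology in the usage note (switch names `s1`, `s2`, `s3` and all capacities) is hypothetical, since the figure from the slides is not reproduced here; fewest-hop search is just one simple choice of routing objective.

```python
from collections import deque

def find_route(capacity, src, dst, demand):
    """Fewest-hop route using only links whose residual capacity >= demand.

    capacity maps frozenset({u, v}) to the residual capacity in Mbit/s.
    Returns the route as a list of nodes, or None if no route exists.
    """
    adj = {}
    for link, cap in capacity.items():
        if cap >= demand:
            u, v = tuple(link)
            adj.setdefault(u, []).append(v)
            adj.setdefault(v, []).append(u)
    parent = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1]
        for v in sorted(adj.get(u, [])):
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return None

def reserve(capacity, path, demand):
    # Subtract the demand from every link on the route.
    for u, v in zip(path, path[1:]):
        capacity[frozenset((u, v))] -= demand
```

With a hypothetical network in which the direct link `s1`-`s2` has only 4 Mbit/s, a first 3 Mbit/s flow from E to C takes that link, and a second 3 Mbit/s flow from D to B must then be routed around it. This is the sense in which route selection has to respect existing traffic flows.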


Important Tasks in Networking (2)

A third important task is to avoid overflowing a receiver with data from the sender
  This is called flow control
A fourth important task is to deal with errors:
  Transmitted data can be garbled or lost => error control
  Stations, switching elements or links fail => resiliency


Further Important Aspects of Networking

Network management (short-to-medium term)
  React to link / station failures
  Monitor network usage and performance to identify bottlenecks, problems, ...
  Collect information necessary for billing and accounting
Network design and strategy (long term)
  Decide on deployment of new links / switching elements
    Often as reaction to changing user communities / usage patterns, or new applications
  Decide on additional traffic control mechanisms
    Example: blocking P2P traffic
  Decide on billing / pricing schemes


Some Common Communication Patterns

Unicast

Broadcast
Multicast


Communication Patterns - Unicast

Only two nodes in the network are involved
One is the transmitter, the other the receiver, but nodes can also have both roles
Examples: phone connections, viewing a web page


Communication Patterns - Broadcast

One node acts as sender, all others as receivers
Examples: radio, TV, ...


Communication Patterns - Multicast

One node acts as sender, several (but not all) others as receivers
Often, in multicast groups all nodes can act as sender
Examples: Internet chat, phone conferences


Traffic: A Key Network Design Factor

Any network is expected to carry a certain class of traffic
Examples:
  The POTS (plain old telephone system) carries voice traffic
  The Internet carries WWW, P2P, interactive video, ...

Important point
The design of a network is strongly influenced by the traffic it is supposed to carry!


Voice/CBR Traffic

In modern telephone systems voice is transmitted digitally: the analog voice signal is A/D-converted with fixed sampling rate and resolution
  ISDN: 8 kHz sampling rate, 8 bit resolution = 64 kbit/s
  Cellular phone systems add voice compression, e.g. the voice coders in GSM generate between 4.75 and 12 kbit/s
The voice data is generated at a fixed rate
  This is not always true: some systems transmit nothing when the speaker is not active, we then have on-off behaviour
  This is called a constant bit rate (CBR) data stream
Other examples of CBR data: CBR video, periodic sensor measurements, ...
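
The ISDN figure follows directly from sampling rate times resolution. A minimal sketch of that arithmetic (the function names are chosen for this example only):

```python
def pcm_bit_rate(sampling_rate_hz, bits_per_sample):
    # Uncompressed rate of a sampled signal in bit/s.
    return sampling_rate_hz * bits_per_sample

def call_volume_bytes(bit_rate, duration_s):
    # Data volume of a CBR stream that runs for duration_s seconds.
    return bit_rate * duration_s // 8

isdn_rate = pcm_bit_rate(8000, 8)                 # 64000 bit/s = 64 kbit/s
three_minutes = call_volume_bytes(isdn_rate, 180)  # 1440000 bytes
```

A three-minute call at 64 kbit/s therefore carries about 1.44 MB of voice data in each direction.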


Voice/CBR Traffic (2)

This fixed rate must be supported by the network, otherwise users get dissatisfied with voice quality
Network must support bounded round-trip delay, otherwise users get annoyed (boundary is at 250 ms)
  See [5] for information on acceptable voice transmission
Phone calls last about three minutes on average


Circuit Switching

A good way to support voice / CBR traffic is to set up a dedicated connection or circuit between end points for the duration of the connection
  Think of this as a private cable
The lifetime of a connection encompasses three phases:
  Connection setup: identify route, set aside resources (buffers, processing capacity, bandwidth) in switching elements and links, so that resources are guaranteed
  Connection usage: use the established connection to transmit CBR data; the pre-reserved resources guarantee that this connection is not influenced by other connections
  Connection teardown: free the reserved resources
  How do you trigger these steps in the POTS?
Switching elements in CS networks are called switches


Circuit Switching - Important Properties

A routing decision is made only once (at connection setup) and never/rarely modified
A connection has its resources guaranteed
Any bandwidth not used by a connection cannot be re-used by other connections; depending on data rate dynamics this can result in poor utilization
Connection setup takes time; it does not pay off when only very little data needs to be transmitted
Connection setup may fail when no route or not enough resources are available in the network
The most complex step is connection setup; connection usage is a low-complexity task for switches


Circuit Switching - Admission Control

An important part of connection setup is that switching elements check whether enough resources are available for the new connection without compromising the resources already granted to existing connections
The outcome can be positive or negative; the result is signalled to the user wishing to set up the connection
This procedure is called admission control
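
A minimal sketch of admission control follows, assuming bandwidth is the only resource being reserved (real switches also account for buffers and processing capacity). The class and function names are made up for this illustration.

```python
class Link:
    def __init__(self, capacity_kbps):
        self.capacity = capacity_kbps
        self.reserved = 0  # sum of the rates granted to existing connections

    def can_admit(self, rate_kbps):
        # Admit only if existing reservations remain fully honoured.
        return self.reserved + rate_kbps <= self.capacity

def setup_connection(route_links, rate_kbps):
    """Reserve rate_kbps on every link of the route, or nothing at all."""
    if not all(link.can_admit(rate_kbps) for link in route_links):
        return False  # negative outcome, signalled back to the user
    for link in route_links:
        link.reserved += rate_kbps
    return True

def teardown_connection(route_links, rate_kbps):
    # Free the resources that were reserved at connection setup.
    for link in route_links:
        link.reserved -= rate_kbps
```

On a route whose bottleneck link offers 128 kbit/s, two 64 kbit/s calls are admitted and a third is rejected until one of the first two is torn down.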


A Closer Look at Switches

Switches are conceptually separated into two logical units
The switch controller performs complex processing during connection setup (e.g. routing, signaling, resource reservation, switching fabric configuration)
The switching fabric is used by established connections; it forwards data from inputs to outputs
(compare [9, Fig. 10.7])


Switches - Signaling

Signaling: control information exchanged to set up or tear down a circuit
In the POTS certain signaling events between your phone and the next office / switch (e.g. off-hook, dialing) use the same channel as the speech data would use
Nowadays, inter-switch signaling often uses another channel / network than the speech data

Definition
In in-band signaling user-data and signaling data share the same communications channel, in out-of-band signaling user-data and signaling data are transmitted on different channels / networks.


Space Switching Fabrics

Fabric has N inputs, N outputs
Signal paths are physically separated from each other (divided in space)
For each of the $N^2$ possible connections a small switch exists at a crosspoint, which is closed when a connection is established, otherwise it is open
For each input line at most one crosspoint can be closed
Problem: space switches for large N are costly!!!
Problem: crosspoint utilization is 1/N at most
(compare [9, Fig. 10.5])


Space Switching Fabrics (2)

Fabric has N inputs, N outputs, but is composed of several interconnected smaller fabrics
This kind of design is called a three-stage space switch
The number of crosspoints is much smaller than $N^2$
During connection setup a free path through the switching fabrics has to be found
(compare [9, Fig. 10.6])
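
To see how much smaller the crosspoint count can get, the sketch below compares a full $N \times N$ space switch with a three-stage design. The parametrisation is an assumption taken from the usual textbook layout, not from the slides: the N inputs are split into N/n groups of n lines, the first stage has N/n fabrics of size n x k, the middle stage has k fabrics of size (N/n) x (N/n), and the third stage mirrors the first.

```python
def crossbar_crosspoints(N):
    # Full space switch: one crosspoint per input/output pair.
    return N * N

def three_stage_crosspoints(N, n, k):
    """Crosspoints of a three-stage space switch (assumed layout, see text).

    N: total inputs (= outputs), n: lines per first-stage fabric,
    k: number of middle-stage fabrics.
    """
    if N % n != 0:
        raise ValueError("n must divide N")
    groups = N // n
    first_and_third = 2 * groups * (n * k)   # N/n fabrics of n x k, twice
    middle = k * groups * groups             # k fabrics of (N/n) x (N/n)
    return first_and_third + middle
```

For N = 200, n = 20 and k = 4 this gives 2000 crosspoints instead of 40000. The price is that a free path through the middle stage may not always exist.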


Space Switching Fabrics (3)

Definition
A switch is non-blocking when at all times a free output port
can be reached from any input port. A switch is blocking when
it does not have this property.
The (full) space switch is non-blocking
The three-stage space switch is blocking
Find an example!!
What would happen if we remove the middle stage?


Switches - Further Remarks

Design of three-stage space switches must fulfill constraints on blocking probability
  Depends on assumptions about the call arrival process
  Much literature available (e.g. [2])
There are also other types of switches, e.g. time-division switches: the switching fabric uses a high-speed link with time subdivided into time slots, and each connection receives regular time slots


Data Traffic

Many data applications naturally have time-varying rates
  Called Variable-Bit-Rate (VBR) or bursty traffic
WWW is an example: users alternate between clicking and thinking/reading, and for a significant fraction of time no data is transferred at all


Data Traffic over Circuit-Switched Networks

CS-networks are not well suited to VBR traffic:
  Reserved rate is not well utilized during traffic pauses
  Reserved rate might be too small during traffic peaks
  No re-use of underutilized connections by other connections

Conclusion
A more flexible networking mechanism is needed for data traffic!!!


Packet Switching

Data flows are segmented into packets
Packets are the basic unit of transmission, not connections
A packet consists of:
  A packet header containing meta-information about the packet, e.g. address fields (see below)
  The packet payload
  Possibly a packet trailer for error detection / correction
Packets are transmitted individually and independently
There is no notion of a connection, packets can be sent immediately without having to set up any state / resource reservation in the network
Analogy: letter transfer in postal network, envelopes correspond to packet headers
The Internet is a packet-switched network!
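
The header / payload / trailer structure can be sketched as follows. The header layout (16-bit source and destination addresses, a 32-bit sequence number, a 16-bit payload length) is a hypothetical format invented for this example and does not correspond to any real protocol; the trailer is a CRC-32 computed with Python's `zlib.crc32`.

```python
import struct
import zlib

# Hypothetical header: source, destination, sequence number, payload length.
HEADER = struct.Struct("!HHIH")

def make_packets(data, src, dst, max_payload):
    """Segment a byte string into independent packets."""
    packets = []
    for seq, offset in enumerate(range(0, len(data), max_payload)):
        payload = data[offset:offset + max_payload]
        header = HEADER.pack(src, dst, seq, len(payload))
        trailer = struct.pack("!I", zlib.crc32(header + payload))
        packets.append(header + payload + trailer)
    return packets

def parse_packet(packet):
    """Return (src, dst, seq, payload, checksum_ok) for one packet."""
    header = packet[:HEADER.size]
    src, dst, seq, length = HEADER.unpack(header)
    payload = packet[HEADER.size:HEADER.size + length]
    (crc,) = struct.unpack("!I", packet[HEADER.size + length:HEADER.size + length + 4])
    return src, dst, seq, payload, crc == zlib.crc32(header + payload)
```

Because every packet carries its own addresses and sequence number, the receiver can reassemble the data even if the packets arrive out of order, and the trailer lets it detect garbled packets.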


Packet Switching - Consequences

Let us call a sequence of packets between the same source-destination pair a flow
Each packet is routed individually, different packets in the same flow can take different routes
Each switching element (called router in packet-switched networks) makes a routing decision for each packet
Each packet must include information facilitating routing, e.g. header fields for source and destination address
Routers do not attempt to identify flows, nor do they store any per-flow state
Packets do not necessarily arrive in the same order as they have been sent (packet reordering) [3]
Many flows can share a link, bandwidth not utilized by one flow can be used by others
  => This is called statistical multiplexing


Packet Switching - Consequences (2)

Lack of resource reservation means: there are no(t many) guarantees for packet delivery
  Internet/IP best effort service: packet is delivered, maybe
  IP's lack of guarantees is compensated (in part) by TCP

Important Point
Routers in packet-switched networks perform more complex processing during information transfer than switching fabrics in circuit-switched networks!


Packet Switching - Congestion

Since flow data rates and routes often cannot be predicted in advance, routers buffer some packets to prevent packet dropping in temporary overload situations
Routers only have a finite amount of memory, and when an overload situation persists, packet dropping is inevitable; this is called congestion
Important question: which packets to drop?
Congestion control schemes either try to avoid congestion or to deal with it in a graceful manner, e.g. by:
  Subjecting packet flows to admission control
  Sending signals to traffic sources to reduce data rates
  Making good decisions about which packets to drop (e.g. based on priorities); applications must help here!!
  Modifying the pricing during congestion


Packet Switching - Choice of Packet Size

Packet overheads (header, trailer) have a fixed size
Payload size is variable (within bounds)
Tradeoff:
  Small payload size leads to high overhead
  Small payload size leads to reduced susceptibility to errors
Packet size limits can be technology- or application-driven
  Example: too long packets might block important alarm messages for an unacceptably long time (blocking times)
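
The tradeoff can be made concrete with two small formulas. The sketch assumes independent bit errors with a fixed bit error rate, which is a simplifying model chosen for this example and not something the slides specify.

```python
def overhead_fraction(overhead_bytes, payload_bytes):
    # Share of each transmitted packet that is header/trailer, not payload.
    return overhead_bytes / (overhead_bytes + payload_bytes)

def packet_error_probability(overhead_bytes, payload_bytes, bit_error_rate):
    # Probability that at least one bit of the packet is garbled,
    # assuming independent bit errors (a modelling assumption).
    bits = 8 * (overhead_bytes + payload_bytes)
    return 1 - (1 - bit_error_rate) ** bits

def blocking_time_s(packet_bytes, link_rate_bps):
    # Time a link is occupied by one packet; an urgent packet arriving
    # just after transmission starts has to wait this long.
    return 8 * packet_bytes / link_rate_bps
```

With 40 bytes of overhead, a 40-byte payload wastes half of the link capacity, whereas a 1460-byte payload wastes under 3 percent. The larger packet is, however, more likely to be hit by a bit error and occupies the link longer.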


A Closer Look at Routers

(compare [6, Fig. 2.15])


A Closer Look at Routers (2)

For each new packet the router must make a routing decision
Generally, the routing decision can be based on many factors:
  Source address, destination address
  Packet priorities, type of data carried (video, web, ...)
  Link states of various candidate outputs
  Queue lengths at various candidate outputs
  Reachability of destination at various candidate outputs
But: this is too complex for high-performance routers processing millions of packets per second


A Closer Look at Routers (3)

Traditional Internet approach:
  Routing is based on destination addresses only
  Routers consult a forwarding table, giving for each destination address (range) the identity of the outgoing link
  Forwarding table lookup can be made fast [10]
The forwarding table is populated as a result of a routing protocol, which is executed by the router's control unit collaboratively with other routers
  The routing protocol operates at relatively large timescales (seconds, minutes), so the forwarding table is stable for many packets in succession
This distinction between routing and forwarding is an example of separation between mechanism and strategy
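
A destination-based forwarding table can be sketched with Python's standard `ipaddress` module. The sketch uses longest-prefix matching, the rule IP routers apply when several address ranges cover the same destination. The linear scan is for readability only (fast lookups use specialised data structures or hardware), and the prefixes and link names are invented for this example.

```python
import ipaddress

class ForwardingTable:
    def __init__(self):
        self.entries = []  # list of (network, outgoing link)

    def add(self, prefix, out_link):
        # In a real router these entries are installed by the routing protocol.
        self.entries.append((ipaddress.ip_network(prefix), out_link))

    def lookup(self, destination):
        """Return the outgoing link of the longest matching prefix, or None."""
        addr = ipaddress.ip_address(destination)
        best = None
        for network, out_link in self.entries:
            if addr in network:
                if best is None or network.prefixlen > best[0].prefixlen:
                    best = (network, out_link)
        return best[1] if best else None

table = ForwardingTable()
table.add("0.0.0.0/0", "link0")     # default route
table.add("10.0.0.0/8", "link1")
table.add("10.1.0.0/16", "link2")
```

Here `table.lookup("10.1.2.3")` selects `link2`, because the /16 entry is more specific than the /8 entry and the default route. The per-packet work is a pure table lookup (mechanism), while the table contents are decided elsewhere (strategy).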


State of Affairs

Circuit-switching:
  Can give guaranteed bandwidths
  No statistical multiplexing, no re-use of resources
  Data forwarding is low-complexity operation for switches
  Routing is done only once
Packet-switching:
  Cannot (easily) give any guarantees
  Allows statistical multiplexing
  Data forwarding is higher-complexity operation for routers
  Routing is done for every packet
Can we marry these approaches and have their benefits without having their problems?


State of Affairs (2)

No!!


State of Affairs (3)

. . . at least not fully . . .


Virtual-Circuit-Switching

Virtual circuit switching is a combination of packet- and circuit-switching
Major characteristics:
  It is connection-oriented, i.e. a connection needs to be established before data transfer can commence
  Information in a connection is transmitted in packets
  A connection can accommodate VBR traffic
  Statistical multiplexing capabilities are offered!
  All packets in a connection follow the same path, packet processing in switches is simpler than for routers
  Often there are no (strict) bandwidth guarantees for a connection
VCS technologies: ATM [9, Chap. 11], [7], [4], (G)MPLS (specified amongst others in RFCs 3031, 3032)


Simplified Timeline for Circuit-Switching


Simplified Timeline for Packet-Switching


Simplified Timeline for Virtual-Circuit-Switching


VC Setup and Usage

The approach to set up and use virtual circuits is common in many VCS technologies
We will explain here a caricature of the label setup process in MPLS; a similar technique has been adopted in ATM
Given: a source station S, a destination D and a set of intermediate switches $s_1, s_2, \ldots, s_n$ along the route between S and D
A label is a link-local unique connection identifier of fixed width (e.g. 20 bit in MPLS shim header)


Idealized VC Connection Setup

The following procedure establishes a path and a reverse path between S and D
Station S starts the process, allocates an unused label $l_0$ for the link from S to $s_1$ and includes this in a setup-request packet
Switch $s_1$ now:
  determines its incoming link for this request, here: $i_1$
  determines the next switch towards D (here: $s_2$) and the outgoing link to it, here: $o_1$
  allocates an unused label $l_1$ for the link $o_1$
  replaces the label $l_0$ in the setup-request by $l_1$
  stores the quadruple $(l_0, i_1, o_1, l_1)$ in a forwarding table
Switches $s_2, \ldots, s_n$ behave similarly


Idealized VC Connection Setup (2)

Destination receives setup-request with label $l_n$ and:
  stores it as pertaining to the connection between S and D
  creates a setup-response, sends it to switch $s_n$ with label $l_n$
Switch $s_n$ behaves as follows:
  It determines the incoming link, which here is just $o_n$
  It consults the forwarding table with key $(o_n, l_n)$, finds the output port $i_n$ and label $l_{n-1}$
  It re-writes the label $l_n$ in the setup-response packet to $l_{n-1}$ and forwards the packet over link $i_n$ back to switch $s_{n-1}$
Switches $s_{n-1}, s_{n-2}, \ldots, s_1$ behave similarly
Upon arrival of the setup-response at S the connection is established


VC Connection Usage

Suppose S wants to transmit a packet to D
S prepends the label $l_0$ to the packet and sends it to $s_1$
Switch $s_1$ now:
  determines the incoming port $i_1$ for that packet
  consults the forwarding table with key $(i_1, l_0)$ and finds output port $o_1$ and output label $l_1$
  rewrites the label $l_0$ in the packet with $l_1$, and
  forwards the packet via output $o_1$.
The other switches behave similarly

Key point
The switches perform a very simple operation, often much simpler than what routers have to do!!!
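
The idealized setup and usage procedure can be sketched as a small simulation. It is a caricature under stated assumptions: port names are plain strings, labels are allocated as per-port counters, and the forward and reverse directions are kept in two separate dictionaries instead of one table of quadruples. None of this is prescribed by MPLS or ATM.

```python
class Switch:
    def __init__(self, name):
        self.name = name
        self.forward = {}     # (in_port, in_label)  -> (out_port, out_label)
        self.reverse = {}     # (out_port, out_label) -> (in_port, in_label)
        self.next_label = {}  # per-port counter used to pick unused labels

    def alloc_label(self, port):
        label = self.next_label.get(port, 0)
        self.next_label[port] = label + 1
        return label

    def handle_setup(self, in_port, in_label, out_port):
        # Allocate an unused label for the outgoing link and store the mapping.
        out_label = self.alloc_label(out_port)
        self.forward[(in_port, in_label)] = (out_port, out_label)
        self.reverse[(out_port, out_label)] = (in_port, in_label)
        return out_label

    def swap(self, in_port, in_label):
        # Data-path operation: one table lookup, then rewrite the label.
        return self.forward[(in_port, in_label)]

def setup_vc(switches, ports, first_label):
    """ports[k] = (in_port, out_port) of switch k on the route from S to D."""
    label = first_label                      # l_0, chosen by station S
    for sw, (in_port, out_port) in zip(switches, ports):
        label = sw.handle_setup(in_port, label, out_port)
    return label                             # l_n, as seen by destination D

def send(switches, ports, first_label):
    """Return the sequence of labels a data packet carries hop by hop."""
    labels = [first_label]
    for sw, (in_port, _) in zip(switches, ports):
        _, label = sw.swap(in_port, labels[-1])
        labels.append(label)
    return labels
```

After `setup_vc` has run once, `send` needs only a dictionary lookup per switch. That is the simple per-packet operation the key point refers to; all route computation happened during setup.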


ATM - Asynchronous Transfer Mode

Developed in the 1980s and 1990s [7], [4]
It is defined in ITU-T standards, e.g. I.150, I.432
Nowadays mainly used in high-speed backbone networks
Aim: integrated broadband network for voice and data
Major characteristics:
  Virtual-circuit-switched
  Transmission based on fixed-size packets called cells
  Different types of connections are offered (CBR real-time, VBR real-time and non-real-time), some of them with network-internal resource reservation
  => Quality-of-Service support!
  ATM performs admission control


ATM - Cell

This is an ATM cell at the boundary between user terminal and first switch; in-network cells have no GFC, but a longer VPI
GFC: control traffic flow rates
PT: distinguishes between user and management cells, which may or may not have experienced congestion
CLP = Cell Loss Priority (guides which cells should preferably be discarded upon congestion)
VPI and VCI together make up a label similar to the MPLS label
ATM switches can also operate on the VPI field only
No payload checksum??
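
Since the cell figure is not reproduced here, the following sketch shows how the fields named above fit into the 5-byte header at the user-network boundary: GFC 4 bits, VPI 8 bits, VCI 16 bits, PT 3 bits, CLP 1 bit, followed by the 8-bit HEC. The function names are made up for this example, and the HEC value is simply carried along rather than computed.

```python
def build_uni_header(gfc, vpi, vci, pt, clp, hec=0):
    """Pack the fields into 5 bytes; hec is a placeholder, not computed."""
    if not (0 <= gfc < 16 and 0 <= vpi < 256 and 0 <= vci < 65536
            and 0 <= pt < 8 and 0 <= clp < 2 and 0 <= hec < 256):
        raise ValueError("field out of range")
    word = (gfc << 28) | (vpi << 20) | (vci << 4) | (pt << 1) | clp
    return word.to_bytes(4, "big") + bytes([hec])

def parse_uni_header(header):
    """Split a 5-byte header back into its fields."""
    if len(header) != 5:
        raise ValueError("an ATM cell header has 5 bytes")
    word = int.from_bytes(header[:4], "big")
    return {
        "gfc": (word >> 28) & 0xF,
        "vpi": (word >> 20) & 0xFF,
        "vci": (word >> 4) & 0xFFFF,
        "pt": (word >> 1) & 0x7,
        "clp": word & 0x1,
        "hec": header[4],
    }
```

The VPI/VCI pair plays the role of the label in the virtual-circuit procedure: a switch looks it up per input port and rewrites it before forwarding the cell.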


ATM - Physical Layer

Data rates in I.432: 622.08, 155.52, 51.84, 25.6 Mbps
ATM is a point-to-point technology
Standard I.432 specifies two transmission schemes
Cell-based physical layer:
  An ATM interface emits cells continuously
  Problem: receiver must identify cell boundaries (synchronization)
  Receiver hunts bit by bit for a bit subsequence where the HEC fits the previous 32 bits
  Once a matching HEC is found, synchronization is achieved
  Bit errors in the header do not only invalidate the cell, but could also lead to loss of synchronization
In an SDH-based physical layer ATM cells are embedded into SDH frames
  SDH/SONET is an optical transmission technology with a TDMA structure [8]


Packet- or Cell-Switches

Cell switches operate on fixed-size packets, called cells (like in ATM)
Packet switches / routers operate on variable-length packets (like IP routers have to)
But: many modern IP routers internally work as cell switches by splitting IP packets into a sequence of cells
All following explanations refer to cell-switching, but can be equally applied to packet switching
The following is partly based on [6, Sec. 1.2]


Cell-Switches

We now discuss a few issues in cell switching, considering an example
A cell switch has N input lines, N output lines and a switching matrix in between
The switch operates in a time-slotted manner, the slot time is equal to the cell transmission time
Cell arrivals and cell departures are synchronized to slot boundaries


Cell-Switches (2)

Our example is a $4 \times 4$ switch (i.e. N = 4)
Inputs and outputs are numbered from 1 to 4, respectively
Inputs and outputs have the same rate
Cells are marked e.g. as a3, where a identifies the cell and 3 identifies the output link
Assumption: switch is non-blocking


Cell-Switches - The Need for Queues

In each time slot at most one cell can arrive per input link
These cells can have destination conflicts: up to N arriving cells can be destined for the same output during one slot
Output serves only one cell, where to store excess cells?
Two approaches:
  Input queueing: input links are equipped with FIFO buffers, and the head-of-queue cell is transferred to its output only when it is the only cell for the given output, or if it has been randomly selected among all cells for the same output
  Output queueing: output links are equipped with FIFO buffers, all input cells towards the same output are transferred in parallel


Cell Switches - Output Queueing

Implementation of output queueing requires that the buffer memory of the output can accommodate up to N cells during a slot time
Limitations in memory speed create bounds for N

Conclusion
Output-queued switches have limited scalability!!


Cell Switches - Input Queueing

In this example cell c4 (the only cell for output 4!) has to wait until cells a1 and b3 have been transferred, but a1 suffers from destination conflicts
  Called head-of-line (HOL) blocking
Requirements in memory access speed are much more relaxed than for output-queueing


Excursion: Throughput of Input-Queued Switches

HOL blocking reduces throughput by how much?
Assumptions (from [6, Chap. 10]):
  Input links are saturated, i.e. always have a new cell
  Output links of each cell are selected randomly (uniform distribution, independent of all other cells)
$T_N \in [0, 1]$ denotes the throughput, i.e. average number of cells leaving any output for an $N \times N$ switch
Result:
  $\lim_{N \to \infty} T_N = 2 - \sqrt{2} \approx 0.586$
This indicates switch capacity, i.e. maximum cell arrival rate for which average queue lengths are bounded
Output-queued switches achieve $T_N = 1$ for any technically feasible N
Question: can you think of an improvement of input-queueing?
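
The throughput limit can be approximated with a short Monte Carlo simulation under the stated assumptions (saturated inputs, uniformly random destinations, random selection among conflicting head-of-line cells). The function name, the seed handling and the slot counts are choices made for this sketch.

```python
import random

def simulate_input_queued(n, slots, seed=0):
    """Estimate T_N: average cells leaving one output per slot."""
    rng = random.Random(seed)
    # Only the head-of-line cell of each saturated input matters;
    # hol[i] is the output that input i's HOL cell is destined for.
    hol = [rng.randrange(n) for _ in range(n)]
    delivered = 0
    for _ in range(slots):
        contenders = {}
        for inp, out in enumerate(hol):
            contenders.setdefault(out, []).append(inp)
        for out, inputs in contenders.items():
            winner = rng.choice(inputs)      # one cell per output per slot
            hol[winner] = rng.randrange(n)   # next cell moves to the head
            delivered += 1
        # Losing inputs keep their HOL cell and block everything behind it.
    return delivered / (n * slots)
```

For N = 2 the estimate lies close to 0.75, and for larger N it falls towards $2 - \sqrt{2} \approx 0.586$. The loss comes entirely from cells that could have been delivered but sit behind a blocked head-of-line cell.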


An Unspoken Assumption

So far we have implicitly assumed that:

Assumption
Switching elements / routers do not modify the user data:
either the data is forwarded as it is, or it is thrown away entirely!
The relatively new field of network coding modifies this
assumption by letting routers compute new packets out of two
or more other packets [1], [11], [12]


Network Coding Example: Traditional Approach

Given: two wireless stations A and B and a router in between
A and B cannot hear each other, but both can hear the router (and vice versa)
A wants to transmit a message x to B, and B wants to transmit message y to A
x and y have the same length
Router simply forwards packets, four packets required in total


Network Coding Example: Network Coding

Instead of simply forwarding the received packets, the router computes a new packet as x XOR y and broadcasts the resulting packet to A and B
A can compute y as x XOR (x XOR y), i.e. by XORing its own packet with the one broadcast by the router
Similarly, B can compute x as y XOR (x XOR y)
Only three packets required!
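The XOR exchange can be sketched in a few lines. The message contents and the helper name `xor_bytes` are made up for this illustration; the only requirement taken from the slides is that x and y have the same length.

```python
def xor_bytes(a, b):
    """Bytewise XOR of two equal-length messages."""
    if len(a) != len(b):
        raise ValueError("messages must have the same length")
    return bytes(p ^ q for p, q in zip(a, b))

x = b"hello B!"                  # message from A (packet 1: A -> router)
y = b"hi to A!"                  # message from B (packet 2: B -> router)
coded = xor_bytes(x, y)          # packet 3: router broadcasts x XOR y

y_at_a = xor_bytes(x, coded)     # A removes its own message and obtains y
x_at_b = xor_bytes(y, coded)     # B removes its own message and obtains x
print(y_at_a, x_at_b)
```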


Bibliography

[1] Rudolf Ahlswede, Ning Cai, Shuo-Yen Robert Li, and Raymond W. Yeung. Network information flow. IEEE Transactions on Information Theory, 46(4):1204-1216, July 2000.
[2] John C. Bellamy. Digital Telephony. John Wiley and Sons, Chichester, UK, third edition, 2000.
[3] Jon C. R. Bennett, Craig Partridge, and Nicholas Shectman. Packet Reordering is Not Pathological Network Behaviour. IEEE/ACM Transactions on Networking, 7(6):789-798, December 1999.
[4] Martin DePrycker. Asynchronous Transfer Mode - The solution for Broadband-ISDN. Prentice Hall, New York, London, second edition, 1994.
[5] Olivier Hersent, David Gurle, and Jean-Pierre Petit. IP Telephony: Packet-based multimedia communications systems. Addison-Wesley, Harlow / England, London, 2000.
[6] Anurag Kumar, D. Manjunath, and Joy Kuri. Communication Networking: An Analytical Approach. Morgan Kaufmann Publishers, San Francisco, 2004.
[7] Raif O. Onvural. Asynchronous Transfer Mode Networks - Performance Issues. Artech House, Boston, London, second edition, 1995.
[8] Mike Sexton and Andy Reid. Broadband Networking: ATM, SDH and SONET. Artech House, Boston, 1997.
[9] William Stallings. Data and Computer Communications. Prentice Hall, Englewood Cliffs, New Jersey, fourth edition, 2006.
[10] George Varghese. Network Algorithmics: An Interdisciplinary Approach to Designing Fast Networked Devices. Morgan Kaufmann Publishers, San Francisco, 2004.
[11] Raymond W. Yeung. Information Theory and Network Coding. Springer, New York, 2008.
[12] Raymond W. Yeung, Shuo-Yen Robert Li, Ning Cai, and Zhen Zhang. Network Coding Theory. now Publishers Inc., Boston / Delft, 2006.

Protocol Layering

Elements of Service and Protocol Design

Data Communications and Networking


COSC 264
Network Architectures and Protocol Basics
Dr. Andreas Willig (1)
Dr. Muhammad Asad Arfeen (2)
(1) Dept. of Computer Science and Software Engineering, University of Canterbury, Christchurch
(2) Dept. of Computer and Information Systems Engineering, NED University of Engineering & Technology, Karachi

UoC, 2014


Outline

Protocol Layering
The Concept of Layering
The OSI Reference Model
The TCP/IP Reference Model
Elements of Service and Protocol Design
Service Primitives
A few Standard Protocol Mechanisms


About this Module

We look at architectures for packet-switched networks


Goals:
Understand protocol layering and two reference models
Understand the concepts of services, protocols and the
relationship between both
This module is based on [8, Chap. 2], [4]
Further references: [3], [2], [9], [1], [5]


Networking Software
The Internet and POTS are among the most complex

technical systems, they require vast amounts of software


Structuring principles help to organize networking
software to achieve:

Modularity and software re-use


Independence of network technologies (Transparency)
Separation of concerns
Correctness

Layering
A key structuring principle for networking software is layering:
the functionality is decomposed into a chain of layers so that
layer N offers services (through an interface) to layer N + 1
and itself is only allowed to use services offered by layer N - 1.


Layering Concepts


Layering Concepts (2)

A layer N offers an N-service interface


The next higher layer N + 1 is only allowed to use the N-interface, but not any of the lower interfaces (e.g. the N - 1 interface); this applies to all layers!

The N-interface offers services at service access points (SAPs)

The N-interface can offer several SAPs; this allows multiplexing between different layer N + 1 protocols or different layer N + 1 connections or sessions

Example: POTS offers voice and fax services, uses telephone sockets as SAPs


Layering Concepts (3)

The layer N-service is implemented through an N-protocol

The N-protocol makes direct use of N - 1 services
The N-protocol makes no assumption whatsoever about what is on layer N + 1
It exchanges protocol data units (PDUs) with a peer N-protocol entity: it constructs these PDUs itself and hands them over to its local N - 1 layer to deliver them to the peer N-protocol entity (which in turn receives them from its local N - 1 layer)


General Layout of a PDU

The N-PDU is constructed by the N-protocol entity


It carries the data handed over by layer N + 1 for transmission, also referred to
as user data or N-SDU (service data unit)

The sending N-protocol entity adds an N-protocol header which carries control
information (e.g. sequence numbers, addresses, flags) important for the
N-protocol but not the receiving N + 1 layer

It might furthermore add an N-protocol trailer (usually a checksum)


The receiving N-protocol entity removes the N header and trailer and hands over
the N + 1 data to its local layer N + 1 entity


Layered PDU Processing

An N-PDU is treated as payload / user data by the N - 1 layer

Each layer adds its own header and trailer before handing down to the lower layer
Receiving layer removes its header / trailer before handing the payload to the upper layer
How would you support efficient PDU processing in operating systems?
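Header, trailer and nesting can be illustrated with a small sketch. The PDU layout used here (a 1-byte layer id, a 2-byte sequence number and a CRC-32 trailer) is a hypothetical format chosen for the example, not one defined in these slides.

```python
import struct
import zlib

HEADER = struct.Struct("!BH")   # hypothetical header: layer id, sequence number
TRAILER = struct.Struct("!I")   # hypothetical trailer: CRC-32 over header + SDU

def build_pdu(layer, seq, sdu):
    """Wrap an SDU (bytes handed down from the layer above) into a PDU."""
    header = HEADER.pack(layer, seq)
    trailer = TRAILER.pack(zlib.crc32(header + sdu))
    return header + sdu + trailer

def parse_pdu(pdu):
    """Check the trailer, strip header and trailer, return (layer, seq, sdu)."""
    header = pdu[:HEADER.size]
    sdu = pdu[HEADER.size:-TRAILER.size]
    (crc,) = TRAILER.unpack(pdu[-TRAILER.size:])
    if crc != zlib.crc32(header + sdu):
        raise ValueError("checksum mismatch")
    layer, seq = HEADER.unpack(header)
    return layer, seq, sdu

app_data = b"user data"
pdu4 = build_pdu(4, 7, app_data)   # layer-4 PDU carries the application data
pdu3 = build_pdu(3, 1, pdu4)       # the whole layer-4 PDU is payload for layer 3
print(len(app_data), len(pdu4), len(pdu3))
```

Each layer adds a fixed 7 bytes in this sketch, which makes the per-layer overhead directly visible.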


About Interfaces
Interfaces specify a service that a certain layer offers
Example:
The socket interface offers reliable, in-sequence and
byte-oriented data transfer through an interface somewhat
resembling a file system interface
The TCP/IP protocol stack implements this service
Applications just use the socket interface and are not
concerned with the operation of the TCP protocol

Important Point
Standardized interfaces allow higher layers to ignore the
operation and properties of lower layers
Several (not only networking) standard documents tend to

specify interfaces, and not procedures / algorithms
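As an illustration of the socket interface hiding TCP, here is a minimal sketch using Python's standard `socket` module over the loopback interface. The helper names `echo_once` and `demo`, and the upper-casing reply, are invented for the example; note that the application code never touches sequence numbers, retransmissions or segments.

```python
import socket
import threading

def echo_once(server_sock):
    """Accept one connection and reply with the upper-cased request."""
    conn, _addr = server_sock.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())

def demo(message=b"hello"):
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.bind(("127.0.0.1", 0))          # port 0: let the OS pick a free port
        srv.listen(1)
        port = srv.getsockname()[1]
        worker = threading.Thread(target=echo_once, args=(srv,))
        worker.start()
        with socket.create_connection(("127.0.0.1", port)) as cli:
            cli.sendall(message)            # file-like write
            reply = cli.recv(1024)          # file-like read
        worker.join()
    return reply

print(demo())
```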


Cross-Layer Approaches

How many layers should a protocol stack have?


Having many layers . . .
. . . can lead to clean and modular software
. . . incurs more overhead (processing, headers)
Existing protocol stack architectures have no more than five to seven layers, some have fewer

Recently, cross-layer designs have become popular:
Non-adjacent layers are allowed to exchange information
and use each other's services
Done for performance reasons
But: can lead to unwanted interactions between layers [5]


The OSI Seven Layer Model

OSI = Open Systems Interconnection


Set of standards and protocols created
by ISO

See [9]
The model was not commercially
successful, but helped greatly to clarify
networking architectures and
discussion


The OSI Seven Layer Model A Second View

Lowest three layers exchange PDUs between physically connected hosts


Upper four layers exchange protocol messages between end hosts (perhaps
over several intermediate nodes, called routers)

This already hints at a network architecture where end nodes are interconnected
through routers!


OSI RM Physical Layer

Concerned with transmission of digital data (e.g. bits,

bytes) over a physical medium


Often involves specification of:
Cable types (wired) or frequencies / bandwidth (wireless)
Connectors
Electrical specifications
Modulation / demodulation and signal specification
Carrier- or bit synchronization methods


OSI RM Link Layer


Task: (reliable) transfer of messages over physical link
Link layer messages are often called frames
Often involves specification of:
Framing:
delineation of frame start and end
choice of frame size
Error control (e.g. coding or retransmission-based)
Coding is also often regarded as a PHY functionality
Medium access control
distributes right to send on shared channel to several
participants
often considered as a separate sub-layer of link layer
Flow control
Avoid overwhelming a slow receiver with too much data


OSI RM Network Layer

Concerned with:
Providing a link technology-independent abstraction of
entire network to higher layers
Addressing and routing
End-to-end delivery of messages
Network- and higher-layer messages are called packets
Often involves specification of:
Addressing formats
Exchange of routing information and route computation
Depending on technology: establishment, maintenance and
teardown of connections


OSI RM Transport Layer

Concerned with:
(reliable, in-sequence, transparent) end-to-end data transfer
programming abstractions (interface) to higher layers
Often involves specification of:
Error-control procedures (Question: why again?)
Flow control procedures
Congestion control procedures
Protect network against overloading
Can also be considered a network-layer issue


OSI RM: Session and Presentation Layer


Session layer:
Concerned with establishing communication sessions
between applications
A session can involve several transport layer connections in
parallel or sequentially
A session might control the way in which two partners
interact, for example enforce that partners take turns
speaking
Presentation layer:
Translates between different representations of data types
used on different end hosts
Example: host A uses little-endian ints, host B big-endian
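The byte-order example can be made concrete with Python's standard `struct` module. The integer value is arbitrary; the sketch shows how the same 32-bit integer is laid out on a little-endian and a big-endian host, and what happens when the bytes are read with the wrong convention.

```python
import struct

value = 0x0A0B0C0D

little = struct.pack("<I", value)   # layout on a little-endian host
big = struct.pack(">I", value)      # layout on a big-endian host (network byte order)

# Reading big-endian bytes as if they were little-endian gives a different number
misread = struct.unpack("<I", big)[0]

print(little.hex(), big.hex(), hex(misread))
```

Agreeing on one representation on the wire is what removes the ambiguity between the two hosts.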


OSI RM Application Layer

Contains application support functions or functionalities

used in many applications


Examples:
File transfer services
Directory services
Transaction processing support (e.g. two-phase commit)


The TCP/IP Reference Model

This model is used in the Internet


The Internet follows the so-called
end-to-end principle: Layers 3 and
below are kept simple, most
complexity resides in transport layer


The TCP/IP Reference Model A Second View

This reference model also uses a network architecture where end nodes (called
hosts) are interconnected through routers!


The Application Layer

Consists of applications using services of transport layer


Accesses transport layer through socket interface
There are well-known application-layer protocols, e.g.:
SMTP (email)
HTTP (web)
FTP (file transfer)
RTP (real-time video and audio)


The Transport Layer


Provides end-to-end communications to applications
Offers its services through socket interface
Standard transport layer protocols:
TCP: reliable, in-sequence byte-stream transfer
UDP: unreliable, un-ordered message transfer

but other protocols can be used as well (e.g. SCTP)


SAPs are called ports, used for application multiplexing

Several applications / processes can use transport service


One application is bound to one port
Ports are identified by numbers
The PDUs generated by TCP / UDP are called segments
TCP / UDP segments include the port number
TCP / UDP receiver delivers incoming segment to the
application denoted by the port number


The Transport Layer (2)

TCP has mechanisms for:


Error control (retransmission-based) and in-order delivery
Flow control
Congestion control
UDP has none of these features
TCP and UDP hand over segments to the Internet layer


The Internet Layer

This is a key part of the TCP/IP reference model


Uses IP (Internet Protocol), its PDUs are called datagrams
All higher-layer segments are encapsulated in datagrams
The IP protocol:
specifies an addressing scheme
provides end-to-end delivery of datagrams (forwarding)
does not specify how routing is done, left to dedicated
protocols
has no mechanisms for error, flow and congestion control
can send IP datagram over any network interface


The Internet Layer (2)

Everything over IP, IP over everything


The Physical and Network Interface Layer

The physical layer is similar to the PHY of the OSI RM


The Network Interface Layer:
Accepts IP datagrams and delivers them over physical link
Receives IP datagrams and delivers them to local IP layer
Includes medium access control, framing, address
resolution
Might also include link-layer error- and flow control


End-to-End Principle

The Internet layer protocol (IP) is very simple


Transport protocols run only in the hosts
This is the end-to-end principle [7]:
Keep routers simple
Realize reliability, sequencing etc. only in end hosts


The Five Elements of a Protocol


Compare [4]
A protocol specification should explain:
the service provided
the assumptions about the environment in which it operates
the vocabulary of messages (PDUs) used by the protocol
the encoding (binary representation) of the messages
the procedure rules guarding the consistency of exchanges
of messages and service primitives
The rules are the hardest to get right

Important Point
A protocol implements a service. Higher layers only use the
service and are not exposed to the internals of the protocol!


Why is Protocol Engineering so complex?


Network protocols run as a distributed system of entities

(computers, routers) exchanging messages


Networking protocols need mechanisms to deal with:

Loss of messages
Failure of network links
Crash of entities
Large differences in processing speeds among entities
Incompatible data representations
Equipment from different vendors
Random delays in message transmission
Errors induced into messages (e.g. by the channel)
...

Many of these are non-issues in software running on single

computer!


Protocol Engineering
Major steps in protocol engineering
Design of service and protocol
Initial performance evaluation
Formal specification of service and protocol design
Often as finite state automata, e.g. in SDL language
Validation: is the (formal) design doing the right things?
Does it fulfill the service?
Presence of deadlocks? Liveness?
Can undesirable situations occur?
Protocol implementation
Verification and testing: does the implementation

correspond to the formal specification?


Performance tuning, e.g. optimization of parameters

See [6], [4]


Service Providers and Service Users


An N-protocol implements an N-service
Stated differently: the N-protocol is the N-service provider!
An N+1-protocol (or the application) is the N-service user
Service provider and user:
talk to each other through service primitives
have to obey rules in the usage of services
Example: before a telephone can use a "send voice data"
service, it must have used the connection setup service

Stated differently: service provider and user also run a protocol for exchanging service primitives!

Standard service primitives for a service S:

S.request
S.indication
S.response
S.confirm


Confirmed Service
Service user at A issues an S.request service primitive, possibly carrying user data
The service provider for S (a protocol) generates one or more PDUs and sends them to host B
Service user at B is informed about A's service request through an S.indication primitive
Service user at B prepares response (possibly with data), gives it to local service provider through S.response
B's response is made known to A's service user through S.confirm primitive
Key point: response comes from B's service user!

Do you know an example?
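The order of primitives in a confirmed service can be traced with a toy model. Everything here (the `Provider` class, the callback standing in for B's service user, the tuple used as a PDU) is hypothetical scaffolding; real providers exchange PDUs over a network rather than calling functions directly.

```python
class Provider:
    """Toy service provider for a service S between host A and host B."""

    def __init__(self):
        self.trace = []   # primitives in the order they occur

    def request(self, data, user_at_b):
        self.trace.append(("S.request", data))       # issued by A's service user
        pdu = ("DATA", data)                         # provider builds a PDU for B
        self.trace.append(("S.indication", pdu[1]))  # B's service user is informed
        reply = user_at_b(pdu[1])                    # B's service user decides the answer
        self.trace.append(("S.response", reply))     # handed to B's provider
        self.trace.append(("S.confirm", reply))      # delivered to A's service user
        return reply

provider = Provider()
answer = provider.request("call 555-0100", lambda req: "accepted: " + req)
print([name for name, _ in provider.trace])
```

The reply is produced by the callback, i.e. by B's service user, which is the defining property of the confirmed service.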


Unconfirmed Service

Service user at A issues an


S.request primitive

Service provider for S generates


one or more PDUs and sends
them to host B

Service user at B is informed


through an S.indication primitive

Service user at A has no clue


whether service request has
reached B

Do you know an example?


Confirmed Delivery Service

Roughly similar to confirmed service
Key difference: it is B's service provider generating a response, not B's service user!
Thus, A's service user has no information about the behaviour of B's service user

Do you know an example?


Confirmed Transmission Service

Roughly similar to the


unconfirmed service

Key difference: service user at A


gets confirmation that any PDUs
related to its service request
have indeed been sent

Do you know an example?


Multiplexing
Multiplexing allows data from several N-SAPs to be transmitted over a single (N - 1)-SAP
When several N-SAPs are used in parallel, the N-protocol entity needs to make scheduling decisions to decide which N-SAP to serve next
Sending N-entity needs to include an SAP identifier in the N-PDU to allow the receiver entity to deliver an incoming N-PDU to the right SAP
Example: TCP supports several SAPs through port numbers, port numbers are part of the TCP header
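A minimal sketch of multiplexing with an SAP identifier in each PDU follows. Round-robin is just one possible scheduling decision, and the class and function names, as well as the SAP numbers 80 and 25, are illustrative assumptions.

```python
from collections import deque

class Multiplexer:
    """Serves several N-SAPs over one lower-layer SAP, round-robin."""

    def __init__(self, sap_ids):
        self.queues = {sap: deque() for sap in sap_ids}
        self.order = deque(sap_ids)

    def submit(self, sap, sdu):
        self.queues[sap].append(sdu)

    def next_pdu(self):
        """Return (sap_id, sdu) for the next non-empty SAP, or None if all are empty."""
        for _ in range(len(self.order)):
            sap = self.order[0]
            self.order.rotate(-1)          # this SAP goes to the back of the line
            if self.queues[sap]:
                return (sap, self.queues[sap].popleft())
        return None

def demultiplex(pdus):
    """Receiver side: use the SAP identifier to deliver each SDU to the right SAP."""
    delivered = {}
    for sap, sdu in pdus:
        delivered.setdefault(sap, []).append(sdu)
    return delivered

mux = Multiplexer([80, 25])
mux.submit(80, b"a1")
mux.submit(80, b"a2")
mux.submit(25, b"b1")
sent = []
while (pdu := mux.next_pdu()) is not None:
    sent.append(pdu)
print(sent, demultiplex(sent))
```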


Splitting

An N-entity can transmit data received from higher layers via N-SAP over several (N - 1)-SAPs
Allows transmission of data over several channels to increase throughput and / or reliability through parallel transmission
N-entity needs to make scheduling decisions on which (N - 1)-SAP to use for a given PDU
Additional mechanisms for sequencing might become necessary


Fragmentation and Reassembly


PDUs often have a limited size; on the lower layers this is usually for physical reasons
To make PDU sizes transparent to higher layers, an N-layer can accept large N-SDUs and partition the data into several N-PDUs (fragments), each having its own header, and transmit them separately
Fragments must be numbered to allow the receiver to re-assemble them correctly

Question: How should the


receiver deal with losses of
PDUs?

Disadvantage: higher overhead
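Numbered fragments and their re-assembly can be sketched as follows. The fragment header used here (fragment index plus total fragment count) is an assumption for the example, and returning `None` for an incomplete set is only one possible answer to the question about lost PDUs.

```python
def fragment(sdu, max_payload):
    """Split an SDU into (index, total, chunk) fragments of at most max_payload bytes."""
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    chunks = [sdu[i:i + max_payload] for i in range(0, len(sdu), max_payload)] or [b""]
    total = len(chunks)
    return [(index, total, chunk) for index, chunk in enumerate(chunks)]

def reassemble(fragments):
    """Rebuild the SDU from fragments in any order; None if any fragment is missing."""
    if not fragments:
        return None
    total = fragments[0][1]
    by_index = {index: chunk for index, _total, chunk in fragments}
    if set(by_index) != set(range(total)):
        return None
    return b"".join(by_index[i] for i in range(total))

frags = fragment(b"abcdefghij", 4)
print(frags)
print(reassemble(list(reversed(frags))))   # order does not matter
print(reassemble(frags[:-1]))              # a lost fragment is detected
```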


Blocking and Deblocking


Sometimes higher layers
produce very small N-SDUs

Instead of putting each N-SDU


into separate N-PDU, transmitter
waits until several N-SDUs are
present (blocking) and puts
them into one N-PDU to save
overhead

Receiver entity decomposes


received N-PDU (deblocking)
and delivers several N-SDUs to
higher layers, this requires
markers in the N-PDU
separating the N-SDUs

Question: when should sender


stop collecting N-SDUs and
send an N-PDU?
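One simple way to realize the markers separating N-SDUs is a length prefix in front of each SDU. The 2-byte length field below is an assumption made for this sketch; it limits each SDU to 65535 bytes.

```python
import struct

def block(sdus):
    """Pack several small SDUs into one PDU payload, each preceded by its length."""
    return b"".join(struct.pack("!H", len(sdu)) + sdu for sdu in sdus)

def deblock(payload):
    """Recover the individual SDUs from a blocked PDU payload."""
    sdus, pos = [], 0
    while pos < len(payload):
        (length,) = struct.unpack_from("!H", payload, pos)
        pos += 2
        sdus.append(payload[pos:pos + length])
        pos += length
    return sdus

small_sdus = [b"a", b"", b"xyz"]
payload = block(small_sdus)
print(len(payload), deblock(payload))
```

With one shared header instead of three, the saving grows with the header size, at the price of two marker bytes per SDU.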


Sequence Numbers
An N-entity can maintain a sequence number
For each newly constructed PDU the sequence number is

written into the N-PDU header, afterwards the sequence


number is incremented
Sequence numbers allow the receiver to:
Detect duplicate PDUs (and drop them)
Detect lost PDUs (and possibly request their retransmission

from sender)
Put N-PDUs back in the right order when network has

reordered them
Implementation issues:
Sequence number space is finite, wrap-arounds need to be
handled
Choice of initial sequence number
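The receiver-side uses of sequence numbers, including wrap-around, can be sketched with modular arithmetic. The 8-bit sequence space and the rule "anything more than half the space behind the expected number counts as old" are assumptions made for this illustration.

```python
MODULUS = 256   # hypothetical 8-bit sequence number space

class Receiver:
    def __init__(self, first_expected=0):
        self.expected = first_expected
        self.buffer = {}      # out-of-order PDUs waiting for a gap to close
        self.delivered = []   # data handed to the higher layer, in order

    def receive(self, seq, data):
        offset = (seq - self.expected) % MODULUS   # distance ahead, modulo wrap-around
        if offset >= MODULUS // 2 or seq in self.buffer:
            return "duplicate"                     # old or already buffered: drop
        self.buffer[seq] = data
        while self.expected in self.buffer:        # deliver as far as possible
            self.delivered.append(self.buffer.pop(self.expected))
            self.expected = (self.expected + 1) % MODULUS
        return "in-order" if offset == 0 else "gap"

rx = Receiver(first_expected=254)
results = [
    rx.receive(254, "a"),
    rx.receive(0, "c"),     # 255 is missing: gap detected across the wrap-around
    rx.receive(255, "b"),   # fills the gap, "b" and "c" are delivered in order
    rx.receive(255, "b"),   # late copy is recognized as a duplicate
]
print(results, rx.delivered)
```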


Bibliography

[1] Mung Chiang, Steven H. Low, A. Robert Calderbank, and John C. Doyle. Layering as Optimization Decomposition: A Mathematical Theory of Network Architectures. Proceedings of the IEEE, 95(1):255-312, January 2007.
[2] Douglas E. Comer. Internetworking with TCP/IP: Principles, Protocols and Architecture, volume 1. Prentice Hall, Upper Saddle River, New Jersey, fifth edition, 2006.
[3] John Day. Patterns in Network Architecture: A Return to Fundamentals. Prentice Hall, Upper Saddle River, New Jersey, 2008.
[4] Gerard J. Holzmann. Design and Validation of Computer Protocols. Prentice Hall, Englewood Cliffs, 1992.
[5] Vikas Kawadia and P. R. Kumar. A Cautionary Perspective on Cross-Layer Design. IEEE Wireless Communications, 12(1):3-11, February 2005.
[6] Miroslav Popovic. Communication Protocol Engineering. CRC Press, Boca Raton, Florida, 2006.
[7] Jerome H. Saltzer, David P. Reed, and David D. Clark. End-to-end arguments in system design. ACM Transactions on Computer Systems, 2(4):277-288, November 1984.
[8] William Stallings. Data and Computer Communications. Prentice Hall, Englewood Cliffs, New Jersey, fourth edition, 2006.
[9] Hubert Zimmermann. OSI Reference Model: The ISO Model of Architecture for Open Systems Interconnection. IEEE Transactions on Communications, 28(4):425-432, April 1980.

LANs

MAC

Bridges, Switches

Data Communications and Networking


COSC 264
Local Area Networks
Introduction
Dr. Andreas Willig (1)
Dr. Muhammad Asad Arfeen (2)
(1) Dept. of Computer Science and Software Engineering, University of Canterbury, Christchurch
(2) Dept. of Computer and Information Systems Engineering, NED University of Engineering & Technology, Karachi

UoC, 2014


Outline
LANs
Introduction
LAN Protocol Architecture
Topologies
MAC
Fundamentals
Orthogonal Schemes: FDMA, TDMA, SDMA, CDMA
Random Access Protocols
Other Schemes
Bridges, Switches
Repeaters and Hubs
Bridges and Switches


Preliminaries

The following slides are based mainly on [44], [38]


The older, but still very good book [7] covers some related

topics, book is available in pdf format here:


http://web.mit.edu/dimitrib/www/datanets.html


Local Area Networks (LANs)


LANs are packet-switched networks
Packets are often called frames in LAN context
They have limited geographical extension, usually at most about 1 km
Offer a shared transmission medium to multiple stations
Often controlled by only one owner / administrative entity
Offer low cost for station attachment
Support higher rates than usually experienced over

wide-area networks
Some application areas:
Connect desktop computers to share files, emails, . . .
Allow several computers to share printers, file servers, . . .
Interactive video or telephony between local users


LANs vs. WANs

WAN = Wide Area Network:


have national, continental or global geographical extension
typically controlled by several administrative entities
Often use high-capacity fibers for long-haul links
WAN Examples:
Internet
POTS
In the Internet, LANs are an elementary unit
Internet = Network of Networks!
LANs are attached to Routers, Routers are interconnected
via other LANs or via point-to-point connections


LAN Protocol Architecture

LAN standards typically specify the following layers:


Physical layer (PHY), specifying transmission media,
network adapters, data rates and network topologies
Medium Access Control (MAC) sublayer, specifying how
stations share the transmission medium
Optional: Link layer (often called logical link control),
providing error-control, flow-control, etc.
This is the protocol architecture followed by the IEEE

standards, which are the dominant LAN standards


The IEEE 802 Standards Series


IEEE = Institute of Electrical and Electronics Engineers
IEEE is a professional association, not a national or

international standards body


Nonetheless, ISO adopted several IEEE standards
Some important standards series (there are many more):
IEEE 802.1: Overview, Bridging, Network Management,
IEEE 802.2: Logical Link Control (LLC) [35]
IEEE 802.3: Ethernet [20]
IEEE 802.11: Wireless LAN [22]
IEEE 802.15: Wireless Personal Area Network (WPAN),
incl. Bluetooth [21], IEEE 802.15.4 [29], . . .
See http://standards.ieee.org/getieee802


The IEEE 802 Standards Series (2)


The PHY
The physical layer is responsible for:
Encoding / modulation and demodulation / decoding
Preamble generation and removal
Determination of frame boundaries
Preambles:
allow receiver to acquire carrier- and bit-/symbol
synchronization
Use a well-known symbol pattern
Some mechanisms for frame-boundary determination:
Surrounding idle times on the medium
Unique start- and end-delimiters in frames
Length fields
Coding violations
PHY specification in IEEE standards includes transmission

media, connectors, data rates, used signals, . . .


The PHY: Example Frame Structure

Example PHY frame structure (from IEEE 802.11, [22, Fig. 15-1]):
SYNC field is the preamble
Start-frame-delimiter (SFD) indicates start of PHY frame
Signal field indicates modulation used in MPDU part
Service field is reserved for future use
Length field indicates length of frame (in µs)
CRC covers only the PHY header


The PHY Transmission Media


Various transmission media, wired and wireless, are used

in IEEE 802 standards


Wired transmission media:
Twisted-Pair Cabling (e.g. switched Ethernet), up to 10 Gb/s

supported in Ethernet
Baseband coaxial cable (e.g. in classical Ethernet)
Broadband coaxial cable (e.g. cable TV)
Optical fibers, commercially available dense WDM systems

support up to 80 channels with 10 - 100 Gb/s each [33]


Wireless transmission media:
Radio-frequencies (e.g. 2.4 GHz or 5.2 / 5.75 GHz bands),
IEEE 802.11n supports up to 600 Mb/s
Infrared, IrDA supported 1 Gb/s over very short distances


The MAC

MAC will be covered shortly ...


Logical Link Control LLC

LLC: link layer protocol specification for IEEE 802.x

standards
LLC focuses on frame transmission to direct, single-hop

neighbours
Major responsibilities:

Error control
Flow control
Framing
Service provisioning to higher layers


LLC Error Control

Frames can be corrupted due to:


Thermal noise in the receiver
Too low signal strength at the receiver
Collisions on the channel
Interference from external transmitters
Faulty processing
...
Many applications don't tolerate erroneously received data
Countermeasure: Error-control [32], [30], [7], [31]
Error control will be discussed in more detail later


LLC Flow Control


Goal: protecting a slow receiver from having the sender

transmit data at a too high rate


Receiver could have a slow processor, little memory, . . .
A few approaches:
Explicit signaling:
Binary signaling: "You may / may not send me data", for
example the RR/RNR (Receiver Ready, Receiver Not
Ready) frames in HDLC
Credit-based flow control: "You can send me up to a hundred
bytes before I start to drop", used for example in TCP
Implicit signaling:
Delay or suppress error-control feedback
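Credit-based flow control can be sketched from the sender's point of view. The class below is a simplified illustration with invented names; it counts credit in bytes and is not a model of TCP's actual window mechanism.

```python
class CreditSender:
    """Sender that never transmits more bytes than the receiver has granted."""

    def __init__(self, credit):
        self.credit = credit   # bytes the receiver currently allows
        self.sent = []         # chunks actually put on the link

    def send(self, data):
        """Send as much of data as the credit allows; return the unsent remainder."""
        allowed, rest = data[:self.credit], data[self.credit:]
        if allowed:
            self.sent.append(allowed)
            self.credit -= len(allowed)
        return rest

    def grant(self, more):
        """Receiver has freed buffer space and grants additional credit."""
        self.credit += more

sender = CreditSender(credit=100)
rest = sender.send(b"x" * 150)     # only 100 bytes may go out
blocked = sender.send(rest)        # no credit left: nothing is sent
sender.grant(30)
rest_after_grant = sender.send(blocked)
print(len(rest), len(blocked), len(rest_after_grant), sender.credit)
```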


Bus Topology

Stations are attached via tap line to bus


Bus is a broadcast medium, signals are heard by all stations
Parallel transmissions collide, all packets are garbled
Frames include destination address field to indicate intended receiver
Terminating resistances prevent signal reflections
Bus length usually limited to a few dozens / hundreds of m, e.g. to
prevent excessive signal attenuation
By cascading several buses (using branch points) more complex
broadcast topologies can be built (e.g. trees)
Examples: classical Ethernet, Token Bus (IEEE 802.4 [17])


Ring Topology
Network consists of stations, relays
and point-to-point links between
relays

Relays pass through frames almost immediately, they act as repeaters,
they also copy signals to the attached station and inject the station's
signals into the ring

Data flows in well-defined direction


Ring size often too small to carry
more than one frame at a time,
access to ring regulated by MAC
scheme

Examples: Token Ring (IEEE 802.5 [18]), Fiber Distributed Data Interface
(FDDI, ANSI Std. X3T9.5), Resilient Packet Ring (IEEE 802.17)


Star Topology

Stations are attached to a central


unit, often with separate cables for
transmit and receive directions

Central unit can act either as a


repeater or as a bridge/switch

A repeater copies incoming signals


to all other outputs

A bridge/switch copies incoming


frame only to output where
destination is

Discussed in more depth later


Wireless Topologies

Wireless transmission media are much more prone to errors than wired media [39], [52]

Bit-error rates of up to 10^-2 are not uncommon

Wireless links are influenced by propagation environment


Doors, moving faces in the surrounding, walls, . . .

Important Property
Wireless transmission channels are time-varying, the links are
volatile and unpredictable, there is usually no static wireless
topology


Outline

LANs
MAC
Bridges, Switches


Outline

LANs
MAC
Fundamentals
Orthogonal Schemes: FDMA, TDMA, SDMA, CDMA
Random Access Protocols
Other Schemes
Bridges, Switches


Medium Access Control (MAC)


MAC protocols are schemes which allow a number of

users to coordinate the access to a common channel


MAC protocols are a vast subject, some references (barely the tip of the iceberg!) are:
[14], [41], [34], [10], [53], [47], [2], [26], [28], [11], [48], [25], [49], [50], [4], [6], [3], [12]
The MAC layer is often regarded as a separate sub-layer

between PHY and link-layer


This view is supported by the fact that the MAC has a

distinguished task not covered by any other layer


MAC protocols are heavily influenced by the properties of

the underlying transmission medium


MAC Definition
We are given:
A number of users / stations wishing to communicate
A shared communications channel / resource that can only
be used by one station at a time
No other means for information exchange between stations

Definition
MAC protocols are rules by which distributed stations
coordinate access to a common channel to share it efficiently
and in a manner satisfying given performance requirements
Example: 100 blind persons in a room: how to distribute the
right to talk?


Important Assumptions

The shared channel is a broadcast medium, i.e.

transmission of one station is heard by all other stations


Not necessarily true for wireless transmission media

In case of parallel transmissions all contending

transmissions are garbled, i.e. cannot be reliably decoded


Not necessarily true for wireless transmission media
Often not true for CDMA systems, also not in OFDMA


MAC Design Desiderata


Small medium access delay: time between arrival of packet

to empty station and start of successful transmission


Depends on overheads: collisions, waiting times, . . .
For lightly loaded medium a small access delay is desirable
Hard real-time applications require bounded access delay

In real-time applications: support for packet priorities, i.e.

distinction between important and less important packets


Local priorities: station makes local decisions, but A's
important packets can be blocked by B's unimportant ones


Global priorities: all stations reach consensus (how?) about

which station has most important packet


Fairness and fair re-use of unused resources
Efficiency: low overhead, high throughput
Stability: increasing load should not decrease throughput

Note: you usually do not get all of these at the same time . . .


MAC vs Duplexing

Channel duplexing schemes occur naturally when

full-duplex operation is required, e.g. in voice conversations


These schemes allow a station to separate its transmitted

signals from its received signals


Difference between duplexing and the MAC problem:
MAC coordinates transmissions of multiple users
Duplexing coordinates parallel transmission and reception

for a single user


But: MAC mechanisms can be used to implement

duplexing


Orthogonal Schemes

In orthogonal schemes the behavior of one station does

not influence the behavior / throughput / transmission


success / . . . of other stations
The four main (mostly) orthogonal schemes are:

FDMA = Frequency Division Multiple Access


TDMA = Time Division Multiple Access
SDMA = Space Division Multiple Access
CDMA = Code Division Multiple Access

We will not discuss SDMA and CDMA


Frequency Division Multiple Access (FDMA)


FDMA (2)
The given channel bandwidth is subdivided into N

sub-channels
Between the sub-channels and at the fringe of the channel
there are guard bands:
Reduction of adjacent-channel interference

A sub-channel is exclusively assigned to a station i on a

long-term basis for transmission of data, no other station


is allowed to transmit on this channel
To receive data, a station must:
Either possess one separate receiver for each channel, or
have a single tunable receiver that must be switched to a

specific channel before data can be received on it


Problems: coordination/rendez-vous, tuning times


FDMA (3)
If the total available bandwidth is B b/s, station i is assigned 1/N of
B on a long-term basis (neglecting guard bands)

Medium access delay for a new packet arriving to an

empty station i is always zero, since i can start


transmission immediately without risk of collision
If a packet has size B/N bits, its transmission takes exactly one second, i.e.:

$$\text{Transmission Delay} = 1\,\text{s}$$

where transmission delay is the time until transmission of a
frame completes (measured after arrival to empty station)


FDMA Advantages

N stations can transmit in parallel


There is no need for time synchronization between the N

transmitters


FDMA Disadvantages

Need for N receivers or tunable receivers increases

system complexity
Frequency synchronization required
There is no re-use, i.e. channels unused by one station

cannot be used by others

Conclusion
FDMA is good for CBR but bad for VBR traffic


Time Division Multiple Access (TDMA)


TDMA (2)

Each station uses the whole frequency band (except some

guard bands at the fringe of the spectrum), but only at


certain times:

Time is subdivided into superframes of duration TSF


Each superframe is subdivided into N time-slots
There are short guard times between time slots
One or more time slots are exclusively and long-term
assigned to a station i for transmission

Stations must be time-synchronized to avoid overlapping

transmissions, guard times are required to compensate


(small) synchronization errors


Access and Transmission Delay in TDMA

Neglecting guard times, each station gets the full channel
bandwidth B b/s for a fraction 1/N of the time


Assume that:

station i owns one time slot


TSF = 1 second
a time-slot suffices to transmit B/N bits
a packet of B/N bits arrives at a random time at empty station i

Medium access delay = waiting time until station i's next slot starts; on average:

$$\text{Access Delay} = \frac{T_{SF}}{2} = 0.5\,\text{s}$$


Access and Transmission Delay in TDMA (2)


The time to transmit the packet is 1/N seconds
Assuming no channel errors we have:

$$\text{Transmission Delay} = \text{Access Delay} + \frac{1}{N}\,\text{s} = 0.5\,\text{s} + \frac{1}{N}\,\text{s} \le 1\,\text{s}$$

for N > 2 this is a strict inequality

Conclusion
In this example in TDMA we start later and finish sooner than
with FDMA!!
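
The FDMA and TDMA delay figures from this example can be reproduced with a short Python sketch. The function names and the sample values (B = 1 Mb/s, N = 10) are illustrative assumptions; the TDMA function assumes one slot per superframe for the station, a packet that fits into one slot, and a uniformly distributed arrival time.

```python
def fdma_delays(bandwidth_bps, n_stations, packet_bits):
    """Return (access delay, transmission delay) in seconds for FDMA."""
    rate = bandwidth_bps / n_stations      # each station owns 1/N of B
    return 0.0, packet_bits / rate         # no waiting, but a slow sub-channel


def tdma_delays(bandwidth_bps, packet_bits, t_sf=1.0):
    """Return (mean access delay, mean transmission delay) in seconds for TDMA."""
    access = t_sf / 2                      # average wait for the next own slot
    return access, access + packet_bits / bandwidth_bps


B, N = 1_000_000, 10
packet = B // N                            # a packet of B/N bits
print("FDMA:", fdma_delays(B, N, packet))
print("TDMA:", tdma_delays(B, packet))
```

For these values FDMA starts immediately but needs a full second, while TDMA waits half a second on average and then finishes within a further 1/N seconds.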


TDMA Advantages and Disadvantages


Advantages:
It is easier to achieve asymmetric bandwidth assignments
in TDMA than in FDMA: using multiple time-slots is much
simpler than transmitting on multiple frequencies in parallel
TDMA tends to have better transmission delays than FDMA
No tunable receivers required
Disadvantages:
Tight time-synchronization between stations required
High expected access delay even in otherwise idle systems
Not possible to re-use unused time slots

Conclusion
TDMA is good for CBR but bad for VBR traffic


Orthogonal Schemes: Discussion


These schemes separate users perfectly from each other
Typically they assign resources (frequency, time) to users

on longer timescales, which may be permanent or for the


duration of a call
In so-called demand-assignment schemes resources are

also exclusively assigned to stations, but on much shorter


timescales (e.g. duration of a data burst in data traffic) [24]

Conclusion
In orthogonal schemes resource (de)allocation is considered a
rare event, in demand-assignment (DA) schemes not. DA
schemes must be much more efficient in signaling resource
(de)allocation.


Random Access Protocols


Random Access protocols:
do not attempt to reserve channel resources for longer time
do not require a central station or (much) shared state
do not access the medium at predictable times
often have low complexity (e.g. only little signaling, if any)
typically involve some random element
Random access protocols are used standalone and also

as building blocks for more complex protocols, e.g.:


ALOHA / slotted ALOHA is used for signaling bandwidth

requests in demand-assignment protocols


In GSM a mobile uses slotted ALOHA to request call setup

Important Point
Random access protocols accept risk of collisions to save
coordination overhead and have overall improved efficiency!


ALOHA / Slotted ALOHA

ALOHA [1] is one of the earliest MAC protocols, developed in 1970 at the University of Hawaii


Assumptions:
N uncoordinated transmitters
One receiver (e.g. a base station)
If two packets overlap at receiver, a collision occurs


The Pure ALOHA Protocol

When a new packet arrives at an empty station:


a checksum is computed and appended to the frame
the frame is then transmitted immediately, there is no
coordination with other stations
an acknowledgement timer is started
The receiver sends an immediate ack upon successful reception of a packet;
upon collisions or transmission errors it remains quiet
If the transmitter receives an ack, the frame is removed and the ack timer is canceled


The Pure ALOHA Protocol (2)


When ack timer expires, transmitter enters backoff mode:
The transmitter chooses a random backoff time
It waits for this time without further action
At backoff timer expiry the frame is re-transmitted
The ack timer is set again, backoff mode is left
Question: why is the backoff time chosen randomly?
When the number of failed trials exceeds a threshold, the

frame is dropped
The precise choice of random distribution for backoff times
is critical for delay, throughput and stability! [13], [27]
Often the random distribution depends on the number of

subsequent collisions seen by the frame


An example backoff strategy will be discussed later!

When a new packet arrives to backlogged station, it is

stored in queue and served after all previous packets
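
The transmitter behaviour described on these two slides can be sketched in Python as follows. This is only an illustration: `send_frame` and its parameters are hypothetical names, the caller-supplied `transmit` function stands in for "send the frame and wait for the ack timer", and the doubling backoff window is an assumed strategy, not one prescribed by the slides.

```python
import random


def send_frame(transmit, max_attempts=5, rng=None):
    """Try to deliver one frame; return (delivered, attempts, total_backoff).

    `transmit` sends the frame and returns True if an ack arrived before
    the ack timer expired.
    """
    rng = rng or random.Random()
    total_backoff = 0.0
    for attempt in range(1, max_attempts + 1):
        if transmit():
            return True, attempt, total_backoff      # ack received, frame done
        if attempt < max_attempts:
            # Ack timer expired: wait a random time before the next attempt.
            # Assumption: the backoff window doubles with every failed attempt.
            total_backoff += rng.uniform(0, 2 ** attempt)
    return False, max_attempts, total_backoff        # threshold exceeded: drop


# Example: the first two transmissions collide, the third one succeeds.
outcomes = iter([False, False, True])
print(send_frame(lambda: next(outcomes), rng=random.Random(1)))
```

Because each station draws its own random waiting time, two stations whose frames collided are unlikely to retransmit at exactly the same moment again.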


Advantages of Pure ALOHA

Quite simple to implement


If network load is small:
new frames are sent immediately ⇒ access delay is zero
they can use the full channel bandwidth
and the probability of collision at the receiver is low

Conclusion
For low network loads most packets can have the minimum
possible transmission delay


Disadvantages of Pure ALOHA


Consider stations A and B sending frames of same length:

When B starts its frame during the (two frame times long)
vulnerability period of A's frame, the frames collide


When more stations are active / load is increased, the

collision probability increases


ALOHA cannot distinguish between collisions and channel

errors destroying a frame


Throughput

Suppose all packets have the same length, so that the packet transmission time is the same for every packet
Define throughput as the (average) number of successfully received packets
per packet transmission time, in the absence of channel errors

Which of these (idealized) curves is the throughput of ALOHA? And which one
the throughput of FDMA? How about the other curves?


Slotted ALOHA

Slotted ALOHA is similar to ALOHA, but:


Time is subdivided into time slots
A time slot is sufficient to accommodate frame
transmission, two round-trip times and ack transmission
All stations are time-synchronized
Any frame transmission has to start at slot boundary

Conclusion
The vulnerability period is reduced to one time-slot, slotted ALOHA
has better throughput
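
The effect of the shorter vulnerability period can be quantified with the classical textbook throughput formulas, which are not derived on these slides. Assuming that the offered load G (transmission attempts per frame time, including retransmissions) is Poisson distributed, pure ALOHA achieves S = G·e^(-2G) and slotted ALOHA S = G·e^(-G). The function names in this sketch are illustrative.

```python
import math


def pure_aloha_throughput(g):
    """Successful frames per frame time; vulnerability period of two frame times."""
    return g * math.exp(-2 * g)


def slotted_aloha_throughput(g):
    """Successful frames per slot; vulnerability period of one slot."""
    return g * math.exp(-g)


for load in (0.25, 0.5, 1.0, 2.0):
    print(load, round(pure_aloha_throughput(load), 3),
          round(slotted_aloha_throughput(load), 3))
```

Under this model pure ALOHA peaks at G = 0.5 with S = 1/(2e) and slotted ALOHA at G = 1 with S = 1/e, i.e. twice the maximum throughput.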


The CSMA-Family of Protocols


CSMA = Carrier Sense Multiple Access [25], [49]
Common assumption: all stations can determine the state

of the medium (almost) instantaneously:


Busy: at least one station is currently transmitting
Idle: no station is transmitting

This operation is called carrier-sensing (CS) or


clear-channel assessment (CCA)
Common approach: Listen-before-talk
Before station transmits a frame, it performs CS operation
If channel is busy, station defers transmission according to

one of several possible strategies


The maximum number of deferrals (or backoffs) a station

might experience for a frame is often bounded


CSMA protocols do not eliminate collisions completely, but

reduce their rate or their impact


The CSMA-Family of Protocols (2)


CSMA approach is especially useful when medium

propagation time is small compared to packet length, since


other stations notice transmission quickly after it started
Typically satisfied in LANs, propagation delay is small
By this, collisions can only occur when two stations start

transmitting at almost the same time (time difference


smaller than propagation delay)
When propagation time is large compared to packet length,

the sender might have already stopped transmission when


receiver senses busy carrier for first time
Example: multi-access satellite configurations
Here LBT is almost useless, ALOHA is reasonable


Nonpersistent CSMA
If a station senses a busy medium, it:
draws a backoff time from a given random distribution
defers from channel activities during backoff time, and
then senses the medium again and starts over
If the station detects an idle medium, it starts transmitting

immediately
In case of a collision again a backoff time is chosen and
process starts over
Question: how to diagnose collisions?

Performance problem: with high probability a medium is

idle for some time after transmission has finished, this


lowers utilization


p-persistent CSMA
Let p ∈ (0, 1) be a parameter known to all stations
If a station senses a busy medium, it defers until the end of

the ongoing transmission, when medium becomes idle


A station divides time on idle medium in small time slots
At the beginning of a time slot a station performs a random
experiment: with probability p it starts transmission, with
probability 1 − p it defers for one further slot
When station defers, it checks medium during remaining
slot time: when another station started transmission, station
waits for end of this transmission and starts over
Time slot just large enough to accommodate these activities
In case of collision, process starts over
Question: How would you choose p?
Performance problem: again, medium will be idle for some

time after transmission has finished
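
One way to approach the question of how to choose p is the following sketch. It assumes that n stations are waiting when the medium becomes idle and that each decides independently; a slot then carries a successful transmission if exactly one station starts. The function name and the grid search are illustrative choices, and in practice n is usually unknown to the stations.

```python
def single_sender_probability(n_stations, p):
    """Probability that exactly one of n waiting stations transmits in a slot."""
    return n_stations * p * (1 - p) ** (n_stations - 1)


n = 4
grid = [k / 100 for k in range(1, 100)]
best_p = max(grid, key=lambda p: single_sender_probability(n, p))
print(best_p, single_sender_probability(n, best_p))
```

Under these assumptions the success probability is largest for p = 1/n: a larger p causes more collisions, a smaller p leaves more slots idle.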


1-persistent CSMA / CSMA-CD


CSMA-CD was chosen for classical Ethernet
If a station senses a busy medium, it defers until end of

ongoing transmission
When medium becomes idle, station sends unconditionally
This avoids idle times after previous transmission
But if two or more stations start, we surely have collision

and we need collision resolution procedure


While transmitting, sender tests channel for collisions
In case of a collision:
Transmission is aborted
A jamming signal is sent to inform all stations about collision
A collision resolution procedure is started, e.g.:
backoff schemes (used in Ethernet and WLANs)
tree algorithms [8]


Polling Protocols
Abstract view on polling systems:
One central station / base station / hub
N clients / stations, each having a packet queue
The hub has two basic tasks:
Query state of packet queues (e.g. # backlogged packets)
Grant bandwidth to stations based on results
Querying a station should be less costly than data

transmission (e.g. in terms of bandwidth)


Polling protocols are used to support time-bounded

services or minimum guaranteed bandwidth services e.g.


in the IEEE 802.11 WLAN w/ HCCA
References: [15], [9], [16], [42], [43], [45], [46]


Polling Protocols (3)

Two logical channels are used for querying stations and

data transfer, but both can be mapped to one physical


channel
A station must be known to the hub in order to be polled
⇒ a registration protocol (e.g. ALOHA) is needed!


Polling protocols can differ in:
Polling sequence: in which sequence are stations polled
Service types: may a station send one or multiple packets

after it has been granted bandwidth?


Querying mechanisms: how is a station / a group of

stations queried?


Polling Protocols: Polling Sequences

Round robin: all stations are polled one after another in a circular fashion
⇒ fair to all stations
Table-driven polling: hub has an arbitrary list specifying
polling sequence; if the end of the list is reached, the hub
starts over
Allows uneven bandwidth distributions
Inflexible when traffic demands change
Often used in hard real-time systems


Polling Protocols: Service Types

k -limited / time-limited service: a station is allowed to

transmit at most k packets or for at most t seconds


Exhaustive service: a station may transmit packets as long

as its queue is nonempty


Gated service: a station may send as many packets as are
in the queue in the moment the grant / bandwidth
assignment is received
packets arriving during a station's service have to wait for the next round
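
The difference between the three service types can be made concrete with a small Python sketch. The function `packets_served` and its parameters are hypothetical; as a simplification, the number of packets arriving during the service period is passed in as a fixed value, and the time-limited variant is omitted.

```python
def packets_served(discipline, queued_at_grant, arrivals_during_service=0, k=1):
    """Number of packets a station sends after receiving one grant."""
    if discipline == "k-limited":
        # At most k packets, no matter how long the queue is.
        return min(k, queued_at_grant + arrivals_during_service)
    if discipline == "exhaustive":
        # Keep sending until the queue is empty, including new arrivals.
        return queued_at_grant + arrivals_during_service
    if discipline == "gated":
        # Only packets already queued when the grant was received.
        return queued_at_grant
    raise ValueError(f"unknown discipline: {discipline}")


for name in ("k-limited", "exhaustive", "gated"):
    print(name, packets_served(name, queued_at_grant=5,
                               arrivals_during_service=2, k=3))
```

For a station with five queued packets and two further arrivals, the 3-limited service sends three packets, the gated service five and the exhaustive service all seven.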


Polling Protocols: Querying Mechanisms

Separate polling:
the hub sends short poll packet to one station i
if station has nonempty queue, it starts transmitting
if station has empty queue, it remains quiet or returns extra
NULL packet to hub
Separate polling has significant but constant overhead,

causes comparably long medium access delays for low


loads


Polling Protocols: Querying Mechanisms (2)


Group testing is discussed in [51], [5], [8]
Instead of sending a poll request to one station at a time, it

is sent to a group of stations


Any station from group may answer with request packet:
If no station answers, hub tests the next group
if one station answers, hub grants bandwidth to this station
if more than one station answers, a collision resolution

procedure is invoked; examples:


query each station in the group separately
split group in two and group-test each sub-group separately

Group testing shortens medium access delays in case of

light network load if groups are large
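
The splitting variant of collision resolution can be sketched as a recursive procedure. This is an idealized illustration under stated assumptions: the hub can always tell apart silence, a single answer and a collision, and `group_test` with its arguments is a hypothetical name, not taken from the cited references.

```python
def group_test(group, backlogged):
    """Return (stations granted bandwidth, in order; number of group polls)."""
    responders = [s for s in group if s in backlogged]
    if not responders:
        return [], 1                      # silence: hub moves on to the next group
    if len(responders) == 1:
        return responders, 1              # single answer: grant bandwidth
    mid = len(group) // 2                 # collision: split the group in two
    granted_left, polls_left = group_test(group[:mid], backlogged)
    granted_right, polls_right = group_test(group[mid:], backlogged)
    return granted_left + granted_right, 1 + polls_left + polls_right


stations = list(range(8))
print(group_test(stations, backlogged=set()))    # idle network
print(group_test(stations, backlogged={2, 5}))   # two backlogged stations
```

With eight stations, an idle network is detected with a single poll and two backlogged stations in different halves are found with three polls, whereas separate polling would always need eight polls per round.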


Polling Protocols: Querying Mechanisms (3)

Piggybacking: if a station sends a data packet, it indicates
its request for further bandwidth in the packet's header
⇒ no need for extra signalling packets!
Piggybacking is often used as an additional mechanism to

one of the other mechanisms


Piggybacking alone is not sufficient, since a station cannot

request bandwidth for new packets arriving to empty queue


Polling Protocols: Querying Mechanisms (4)

The query mechanisms described so far have two

important properties:
Querying is initiated by the base station (except

piggybacking)
They are deterministic: for each station there is an upper

bound on the time before it can send its request


Polling Protocols: Querying Mechanisms (5)

Reservation protocols [40]: another frequent case is that

querying is initiated by a station:


station sends reservation request packet to hub
hub either accepts or rejects the request indicated by

appropriate response packets


one example setup is that all stations have to share a

common signalling channel, e.g. using the ALOHA protocol,


while the data channel can be used exclusively
Reservation protocols work well if signalling is a rare event
Examples: MASCARA [36] and DQRUMA [24]


Token-Passing Protocols
Assumption: stations are attached to a broadcast medium
The right to initiate data transmissions is passed between

stations using special token frames, no central station


After receiving a token a station may send data for limited
time, afterwards passes token to its successor
In absence of errors this guarantees a bounded medium

access time, required in real-time applications


The stations form a logical token passing ring which

constitutes predecessor / successor relationships


All stations must have consistent view on logical ring

Organization of the logical ring requires ring maintenance

mechanisms (involving special control frames)


Token-Passing Protocols: Problems

Maintaining a logical ring can be hairy, if:


not all stations can hear each other (partially meshed
topology)
mobility is involved (variable topology)
Loss of token / control frames can create severe problems

[23], [55]
Token-passing protocols tend to be very complex


Coupling LANs

It is often required to couple existing LANs, e.g. upon

merging different departments


Coupling requires special coupling devices
Types of coupling devices:
PHY layer: repeaters, hubs
MAC layer: bridges, layer-2 switches
Network layer: router
Application layer: gateway
See [37], [54]


Outline

LANs
MAC
Bridges, Switches
Repeaters and Hubs
Bridges and Switches


Repeaters

A repeater amplifies a signal on the analog level


Any noise present in the signal is amplified as well
Repeaters add their own noise
Repeaters are not at all visible to any protocol or
modulation scheme
They can create slight delay (order of µs and less)


Regenerating Repeaters

A regenerating repeater demodulates an incoming signal

symbol-per-symbol and modulates it again


No interpretation whatsoever of protocol fields is done
Especially, no error checking / error correction is done,

regeneration can introduce errors


Hubs

A hub is a centralized repeater, it


broadcasts signals incoming on one
port to all other ports

No interpretation of the incoming


frame is done, none of its fields is
evaluated

All stations are attached with one


transmit and one receive line

A hub creates a logical bus on a


physical star

Hubs can be cascaded


It may be regenerative or not
Question: Advantage over bus?


Bridges
Bridges interconnect LANs on the MAC layer

Important Point
Bridges understand and interpret fields related to the MAC
protocol (e.g. address fields), repeaters / hubs do not!
Nowadays they mostly connect LANs of the same type (i.e.
Ethernet ↔ Ethernet), but bridges connecting LANs of
different types (e.g. Ethernet ↔ Token Ring) also exist(ed)
We focus on bridges interconnecting broadcast Ethernets
Important: Ethernet frames carry an Ethernet source

address and an Ethernet destination address in their frame


header
A bridge can connect several LANs
Bridges can be cascaded


Basic Operation

When bridge receives frame from


LAN A, it checks the frame for
correctness, buffers it and checks the
MAC destination address (dst)

dst on LAN A: bridge does nothing


dst on LAN B or dst unknown:
bridge transmits frame on LAN B,
following the rules of the MAC
protocol!

Same in direction B → A
Bridge does not modify any frame,
nor does it encapsulate them

Stations need not be aware of the


presence of bridges (transparency)


Basic Operation (2)

How does bridge know which station


is in which LAN?

From reading a frame's source address field (src) the bridge can
learn on which bridge interface a source can be reached

When a bridge receives a frame with


dst not having been observed so
far, it unconditionally re-transmits the
frame on all interfaces except
incoming one

The latter approach is dangerous


when several bridges are used and
loops are present (see below)


Some Reasons to Use Bridges

Reliability: by keeping LANs separated and only

interconnected by a bridge, failures in one LAN do not


affect others
Performance: by carefully evaluating addresses, bridges
can confine traffic local to one LAN to that very LAN,
enabling parallel local transmissions in different LANs
Repeaters / hubs cannot do this traffic separation!

Security: similarly, traffic local to one LAN cannot be

eavesdropped in the other LAN


Excursion: Encapsulating Bridges

Bridges can be interconnected


through third-party link, e.g. serial
line, microwave link, the Internet, . . .

Third-party link requires own frame


format

Approach: a bridge encapsulates a


frame from LAN A into a frame
appropriate for third-party link

Receiving bridge decapsulates


frame and puts it on LAN B

When is this useful?


A Larger Network Example

Packets between Stn1 and Stn7


traverse two bridges

Between Stn1 and Stn5 two different paths exist!

Providing different paths is useful to provide fault-tolerance and
load-balancing

Only one bridge (Br1 or Br7) should


forward packet from Stn1 to Stn5 to
avoid duplicate packets on LAN E

We have a routing problem!

(Example taken from [44, Fig. 15.10])


A Larger Network Example (2)

Fixed routing approach: each bridge


possesses for each incoming
interface a table indicating whether a
frame to dst should be forwarded
and to which outgoing interface / LAN

Problem: table needs to be


recomputed and re-distributed upon
every change in topology

Does not scale well to large


installations

(Example taken from [44, Fig. 15.10])


Spanning Tree Approach

The spanning tree approach addresses automatic

construction and maintenance of forwarding tables by


bridges in larger LAN installations
It is specified in IEEE 802.1D [19]
Algorithm consists of three elements:
Frame forwarding
Address learning
Loop resolution


Spanning Tree Approach: Frame Forwarding

For each port / attached LAN a bridge maintains two pieces of information:
A forwarding table
A flag indicating if port is in blocking or forwarding state

Forwarding table contains:


all MAC addresses which can be reached (directly or
indirectly) by sending to this port
A timer for each stored MAC address
Example: for bridge Br2 (see previous slides)
all stations in LANs A, B, D, and E on upper port
all stations in LANs C, F and G on lower port


Spanning Tree Approach: Frame Forwarding (2)

Suppose a bridge receives frame to dst on port x


Bridge checks forwarding tables on all other ports than x
If dst is not found, bridge sends frame to all ports in

forwarding state except x


If dst is found on port y and port y is in forwarding state,

the frame is sent to port y , otherwise (y not in forwarding


state) the frame is dropped


Spanning Tree Approach: Address Learning

Suppose a frame from station src arrives on port x


The bridge then checks:
If no entry for src exists in the forwarding table for x, then it
is added and a timer is started
If already an entry for src exists on port x and the timer is
running, the timer is canceled and re-started
If the timer for src at port x expires, the entry for src and

its timer are deleted from forwarding table of x


This usage of timers is called soft state!
Why is this done?
Soft-state is a fairly common mechanism in many protocols!
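
Frame forwarding and address learning with soft state can be combined in one minimal Python sketch. It is not the IEEE 802.1D implementation: the class `LearningBridge`, the 300-second ageing time and the explicit `now` argument are assumptions for illustration, and a single table mapping each address to a port replaces the per-port tables of the slides. Frames whose destination is known on the incoming port are not forwarded, as described under Basic Operation.

```python
class LearningBridge:
    """Transparent bridge with address learning and ageing timers (sketch)."""

    def __init__(self, ports, ageing_time=300.0):
        self.forwarding = {port: True for port in ports}  # False means blocking
        self.table = {}              # MAC address -> (port, expiry time)
        self.ageing_time = ageing_time

    def receive(self, src, dst, in_port, now):
        """Process one frame; return the list of ports it is sent out on."""
        # Address learning: (re)start the timer for the source address.
        self.table[src] = (in_port, now + self.ageing_time)
        # Soft state: entries whose timer has expired are deleted.
        self.table = {addr: (port, expiry)
                      for addr, (port, expiry) in self.table.items()
                      if expiry > now}
        entry = self.table.get(dst)
        if entry is None:
            # Unknown destination: flood on all forwarding ports except in_port.
            return sorted(port for port, fwd in self.forwarding.items()
                          if fwd and port != in_port)
        out_port = entry[0]
        if out_port == in_port or not self.forwarding[out_port]:
            return []                # local traffic or blocked port: drop
        return [out_port]


bridge = LearningBridge(["A", "B", "C"])
print(bridge.receive("s1", "s2", "A", now=0))   # s2 unknown: flooded
print(bridge.receive("s2", "s1", "B", now=1))   # s1 was learned on port A
```

Because entries disappear unless they are refreshed, a station that is moved to another LAN or switched off does not leave a stale entry behind for long, and the bridge needs no explicit removal message.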


Spanning Tree Approach: Loop Resolution


Suppose initially all forwarding tables
are empty

At time t0 station Stn1 sends a frame


to Stn2 using LAN A

Both bridges receive frame in parallel


and forward to LAN B, bridge Br1 at
time t1 and bridge Br2 at time t2 > t1

At time t1 bridge Br2 receives a frame with Stn1's source address on
LAN B, at time t2 the same happens to Br1

Effect: Stn2 receives the same frame


two times

Effect: Br1 and Br2 have Stn1


included in forwarding tables for both
ports!!

Question: what happens if next Stn2 sends a frame to Stn1?

(Example taken from [44, Fig. 15.11])


Spanning Tree Approach: Loop Resolution (2)


To avoid this kind of loop, IEEE 802.1D specifies the spanning tree protocol


Approach:
Each bridge is equipped with an individual MAC address
A cost value is administratively assigned to each bridge
Bridges run a dedicated protocol among each other,

exchanging information about network topology


When topology is fully discovered, a minimum-weight

(related to per-bridge costs) spanning tree is computed


In the computation LANs are taken as vertices and bridges
are taken as edges in the graph!!
Spanning-tree computation can result in removal of

possible connections, the corresponding bridge ports are


set to state blocking
Tree is re-calculated upon changes in topology
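To make the graph view concrete (LANs as vertices, bridges as edges), here is a minimal centralized sketch using Kruskal's algorithm. It is not the distributed protocol of IEEE 802.1D; the function name and the tuple layout `(cost, name, lan_a, lan_b)` are assumptions chosen for illustration.

```python
def min_spanning_bridges(lans, bridges):
    """Return (active, blocked) bridge names for a minimum-weight spanning tree.

    bridges is a list of (cost, name, lan_a, lan_b) tuples.
    """
    parent = {lan: lan for lan in lans}

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    active, blocked = [], []
    for cost, name, a, b in sorted(bridges):
        ra, rb = find(a), find(b)
        if ra == rb:
            blocked.append(name)   # would close a loop: block it
        else:
            parent[ra] = rb        # joins two components: keep it
            active.append(name)
    return active, blocked
```

For the two-bridge example with LAN A and LAN B, calling `min_spanning_bridges(["A", "B"], [(1, "Br1", "A", "B"), (2, "Br2", "A", "B")])` keeps the cheaper bridge Br1 and blocks Br2, which removes the loop.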

Layer-2 Switches

A switch is a centralized element, forwarding frames only to the correct output port
Stations are attached to the switch via point-to-point links with separate transmit/receive lines (full-duplex)
No broadcast medium anymore!
A switch is able to process frames to distinct destinations in parallel; switches can therefore increase network capacity as compared to LANs with a broadcast medium
Frames arriving in parallel for the same destination are buffered (output buffering)
Switches are transparent to stations
Nowadays almost all Ethernet installations use switches

Layer-2 Switches (2)


Difference to a hub:
A hub builds a broadcast medium, only one station can
transmit at a time without collisions
A switch can accept up to N parallel transmissions, where
N is the number of stations in the LAN
Each attached station can receive packets at the full
medium capacity

Important Point
Switches remove the shared medium assumption and the
need for a MAC, but now the stations contend for the resources
of the switch (switching capacity, buffer memory)!

Layer-2 Switches (3)

Operation modes of switches:
Store-and-forward: the switch receives a packet fully, checks the frame for correctness, reads off the destination address, determines the outgoing port and transmits the packet there
Cut-through: the switch starts forwarding to the output port right after having read the address field (which appears very early in most frame formats!), without a check for frame correctness
Switches can be cascaded and more complex infrastructures (including loops) can be built
Switches often incorporate the same loop removal technique as bridges
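The latency difference between the two modes can be estimated with a one-line formula: the switch has to wait for a certain number of bytes before it can start forwarding. The numbers in the usage note (100 Mbps, a 1518-byte frame, 14 bytes read before a cut-through decision) are illustrative assumptions, not values from the slides.

```python
def forwarding_delay_us(bytes_before_forwarding, rate_mbps):
    """Time in microseconds the switch waits before it can start forwarding."""
    # bits / (Mbit/s) gives microseconds
    return bytes_before_forwarding * 8 / rate_mbps

# Store-and-forward must receive the whole frame first.
store_and_forward = forwarding_delay_us(1518, 100)
# Cut-through only needs the bytes up to the destination address
# (assumed here: 8 bytes preamble/SOF plus 6 bytes address).
cut_through = forwarding_delay_us(14, 100)
```

Under these assumptions store-and-forward adds roughly 121 microseconds per switch, cut-through only about 1 microsecond, at the price of forwarding corrupted frames.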

Bibliography

[1] Norman Abramson.
Development of the ALOHANET.
IEEE Transactions on Information Theory, 31(2):119–123, March 1985.
[2] Norman Abramson, editor.
Multiple Access Communications: Foundations for Emerging Technologies.
IEEE Press, New York, 1993.
[3] Norman Abramson.
Multiple Access in Wireless Digital Networks.
Proceedings of the IEEE, 82(9):1360–1370, September 1994.
[4] Ian F. Akyildiz, Janise McNair, Loren Carrasco, and Ramon Puigjaner.
Medium access control protocols for multimedia traffic in wireless networks.
IEEE Network Magazine, 13(4):39–47, 1999.
[5] Mostafa H. Ammar and George N. Rouskas.
On the performance of protocols for collecting responses over a multiple-access channel.
IEEE Transactions on Communications, 43(2):412–420, February 1995.
[6] Guiseppe Anastasi, Luciano Lenzini, Enzo Mingozzi, Andreas Hettich, and Andreas Krämling.
MAC protocols for wideband wireless local access: Evolution towards wireless ATM.
IEEE Personal Communications, 5(5):53–64, October 1998.

[7] D. Bertsekas and R. Gallager.
Data Networks.
Prentice Hall, Englewood Cliffs, New Jersey, 1987.
[8] J. I. Capetanakis.
Tree Algorithm for Packet Broadcast Channels.
IEEE Transactions on Information Theory, 25(5):505–515, September 1979.
[9] Charles Chien, Mani B. Srivastava, Rajeev Jain, Paul Lettieri, Vipin Aggarwal, and Robert Sternowski.
Adaptive Radio for Multimedia Wireless Links.
IEEE Journal on Selected Areas in Communications, 17(5):793–813, May 1999.
[10] Lou Dellaverson and Wendy Dellaverson.
Distributed channel access on wireless ATM links.
IEEE Communications Magazine, 35(11):110–113, November 1997.
[11] Andras Farago, Andrew D. Myers, Violet R. Syrotiuk, and Gergely V. Zaruba.
Meta-MAC Protocols: Automatic Combination of MAC Protocols to Optimize Performance for Unknown Conditions.
IEEE Journal on Selected Areas in Communications, 18(9):1670–1681, September 2000.
[12] Robert G. Gallager.
A Perspective on Multiaccess Channels.
IEEE Transactions on Information Theory, 31(2):124–142, March 1985.

[13] Jonathan Goodman, Albert G. Greenberg, Neal Madras, and Peter March.
Stability of binary exponential backoff.
Journal of the ACM, 35(3):579–602, 1988.
[14] Ajay Chandra V. Gummalla and John O. Limb.
Wireless medium access control protocols.
IEEE Communications Surveys and Tutorials, 3(2):2–15, 2000.
http://www.comsoc.org/pubs/surveys.
[15] Boudewijn R. Haverkort.
Performance of Computer Communication Systems: A Model Based Approach.
John Wiley and Sons, Chichester / New York, 1998.
[16] A. Hoffmann, R. J. Haines, and A. H. Aghvami.
Performance analysis of a token based MAC protocol with asymmetric polling strategy (TOPO) for indoor radio local area networks under channel outage conditions.
In Proc. International Conference on Communications (ICC), pages 1306–1311, New Orleans, Louisiana, 1994. IEEE.
[17] IEEE.
802.4 Token-passing Bus Access Method, 1985.
[18] IEEE.
802.5 Token Ring Access Method and Physical Layer Specifications, 1985.
[19] IEEE Computer Society.
802.1D: IEEE Standard for Local and Metropolitan Area Networks - Media Access Control (MAC) Bridges, June 2004.
[20] IEEE Computer Society.
IEEE Standard for Information technology - Telecommunications and information exchange between systems - Local and metropolitan area networks - Specific requirements - Part 3: Carrier sense multiple access with collision detection (CSMA/CD) access method and physical layer specifications, December 2005.
[21] IEEE Computer Society, sponsored by the LAN/MAN Standards Committee.
IEEE Standard for Information technology - Telecommunications and information exchange between systems - Local and metropolitan area networks - Specific requirements - Part 15.1: Wireless Medium Access Control (MAC) and Physical Layer (PHY) Specifications for wireless personal area networks (WPANs), June 2005.
[22] IEEE Computer Society, sponsored by the LAN/MAN Standards Committee.
IEEE Standard for Information technology - Telecommunications and Information Exchange between Systems - Local and Metropolitan Area Networks - Specific Requirements - Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications, 2007.
[23] Hong ju Moon, Hong Seong Park, Sang Chul Ahn, and Wook Hyun Kwon.
Performance Degradation of the IEEE 802.4 Token Bus Network in a Noisy Environment.
Computer Communications, 21:547–557, 1998.

[24] Mark J. Karol, Z. Liu, and K.Y. Eng.
An efficient demand-assignment multiple access protocol for wireless (ATM) networks.
Wireless Networks, 1(3), 1995.
[25] Leonard Kleinrock and Fouad A. Tobagi.
Packet switching in radio channels: Part I - carrier sense multiple access models and their throughput-/delay-characteristic.
IEEE Transactions on Communications, 23(12):1400–1416, 1975.
[26] J. F. Kurose, M. Schwartz, and Y. Yemini.
Multiple-access protocols and time-constrained communication.
ACM Computing Surveys, 16:43–70, March 1984.
[27] Byung-Jae Kwak, Nah-Oak Song, and Leonard E. Miller.
Performance Analysis of Exponential Backoff.
IEEE/ACM Transactions on Networking, 13(2):343–355, April 2005.
[28] S. S. Lam.
Multiaccess protocols in computer communications. Volume I: Principles.
In W. Chon, editor, Principles of Communication and Network Protocols, pages 114–155. Prentice-Hall, Englewood Cliffs, NJ, 1983.
[29] LAN/MAN Standards Committee of the IEEE Computer Society.
IEEE Standard for Information technology - Telecommunications and information exchange between systems - Local and metropolitan area networks - Specific requirements - Part 15.4: Wireless Medium Access Control (MAC) and Physical Layer (PHY) Specifications for Low Rate Wireless Personal Area Networks (LR-WPANs), September 2006.
Revision of 2006.
[30] Shu Lin and Daniel J. Costello.
Error Control Coding.
Prentice-Hall, Englewood Cliffs, New Jersey, second edition, 2004.
[31] Shu Lin, Daniel J. Costello, and Michael J. Miller.
Automatic-Repeat-Request Error-Control Schemes.
IEEE Communications Magazine, 22(12):5–17, December 1984.
[32] Hang Liu, Hairuo Ma, Magda El Zarki, and Sanjay Gupta.
Error control schemes for networks: An overview.
MONET - Mobile Networks and Applications, 2(2):167–182, 1997.
[33] Biswanath Mukherjee.
Optical WDM Networks.
Optical Networks Series. Springer, New York, 2006.
[34] Andrew D. Myers and Stefano Basagni.
Wireless media access control.
In Ivan Stojmenovic, editor, Handbook of Wireless Networks and Mobile Computing, pages 119–143. John Wiley & Sons, New York, 2002.
[35] The Editors of IEEE 802.
IEEE 802.2, ISO/IEC 8802-2: Local Area Networks: Logical Link Control, 1989.

[36] Nikos Passas, Sarantis Paskalis, Dimitri Vali, and Lazaros Merakos.
Quality-of-service-oriented medium access control for wireless ATM networks.
IEEE Communications Magazine, 35(11):42–50, November 1997.
[37] Radia Perlman.
Interconnections, Second Edition: Bridges, Routers, Switches and Internetworking Protocols.
Addison-Wesley, Reading, Massachusetts, 1999.
[38] Larry L. Peterson and Bruce S. Davie.
Computer Networks: A Systems Approach.
Morgan Kaufmann, San Francisco, fourth edition, 2007.
[39] Theodore S. Rappaport.
Wireless Communications: Principles and Practice.
Prentice Hall, Upper Saddle River, NJ, USA, 2002.
[40] Izhak Rubin.
Access-Control Disciplines for Multi-Access Communication Channels: Reservation and TDMA Schemes.
IEEE Transactions on Information Theory, 25(5):516–536, September 1979.
[41] Izhak Rubin.
Multiple access methods for communications networks.
In Jerry D. Gibson, editor, The Communications Handbook, pages 622–649. CRC Press / IEEE Press, Boca Raton, Florida, 1996.

[42] Izhak Rubin and L. F. M. de Moraes.
Message Delay Analysis for Polling and Token Multiple-Access Schemes for Local Communication Networks.
IEEE Journal on Selected Areas in Communications, 1(5):935–947, 1983.
[43] Oran Sharon and Eitan Altman.
An efficient polling MAC for wireless LANs.
IEEE/ACM Transactions on Networking, 9(4):439–451, August 2001.
[44] William Stallings.
Data and Computer Communications.
Prentice Hall, Englewood Cliffs, New Jersey, fourth edition, 2006.
[45] Hideaki Takagi.
Analysis of Polling Systems.
MIT Press, Cambridge, Massachusetts, 1986.
[46] Hideaki Takagi.
Queueing analysis of polling models: an update.
In Hideaki Takagi, editor, Stochastic Analysis of Computer and Communication Systems, pages 267–318. Elsevier, Amsterdam, 1990.
[47] Andrew S. Tanenbaum.
Computer Networks.
Prentice-Hall, Englewood Cliffs, New Jersey, third edition, 1997.
[48] Fouad A. Tobagi.
Multiaccess protocols in packet communications systems.
IEEE Transactions on Communications, 28:468–488, 1980.
[49] Fouad A. Tobagi and Leonard Kleinrock.
Packet switching in radio channels: Part II - the hidden terminal problem in CSMA and busy-tone solutions.
IEEE Transactions on Communications, 23(12):1417–1433, 1975.
[50] Fouad A. Tobagi and Leonard Kleinrock.
Packet switching in radio channels: Part III - polling and (dynamic) split-channel reservation multiple access.
IEEE Transactions on Communications, 24(8):832–845, August 1976.
[51] Don Towsley and J. K. Wolf.
On adaptive tree polling algorithms.
IEEE Transactions on Communications, 32(12):1294–1298, 1984.
[52] David Tse and Pramod Viswanath.
Fundamentals of Wireless Communications.
Cambridge University Press, Cambridge, UK, 2005.
[53] Harmen R. van As.
Media access techniques: The evolution towards terabit/s LANs and MANs.
Computer Networks and ISDN Systems, 26:603–656, 1994.
[54] George Varghese and Radia Perlman.
Transparent Interconnection of Incompatible Local Area Networks Using Bridges.
IEEE Journal on Selected Areas in Communications, 8(1):1565–1575, January 1990.
[55] Andreas Willig and Adam Wolisz.
Ring stability of the PROFIBUS token passing protocol over error prone links.
IEEE Transactions on Industrial Electronics, 48(5):1025–1033, October 2001.

Introduction

PHY

Half-Duplex Ethernet

Data Communications and Networking
COSC 264
Local Area Networks
Ethernet

Dr. Andreas Willig (1)
Dr. Muhammad Asad Arfeen (2)
(1) Dept. of Computer Science and Software Engineering, University of Canterbury, Christchurch
(2) Dept. of Computer and Information Systems Engineering, NED University of Engineering & Technology, Karachi

UoC, 2014

Outline

Introduction

PHY

Half-Duplex Ethernet

Ethernet

Ethernet is a packet-switched LAN technology
Ethernet became the dominating wired LAN technology, early competitors like Token-Ring vanished
Standardized by IEEE ([1] and various amendments)
Its name refers to the Ether that was believed to be the necessary medium for electromagnetic wave propagation
It offers:
High data rates (10 Mbps, 100 Mbps, 1 Gbps, 10 Gbps, 100 Gbps)
Cheap and mature components
Large market penetration, many different vendors
These slides are mainly based on [3, Chap. 16]

Brief History of Ethernet


1973: Developed at Xerox PARC by Metcalfe / Boggs
1976: first publication by Metcalfe / Boggs
1980: IEEE 802 working group adopted Ethernet
1981: IEEE 802.3 standard working group established
1983: 10Base2 Cheapernet standard published
1985: published as ISO standard 8802/3
1991: 10BaseT standard, adopting twisted-pair cabling
1995: Fast Ethernet standard with 100 Mbps published
1999: Gigabit Ethernet approved
2002: 10 Gb Ethernet approved
2010: 40 and 100 Gb Ethernet approved (IEEE 802.3ba)

Frame format

The original Ethernet had a slightly different frame format
Minimum payload size is 46 bytes, smaller messages must be padded, i.e. filled up with zeros
Maximum payload size is 1500 bytes
SOF = Start of Frame
FCS = Frame Check Sequence, it is a CRC checksum
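The payload size rules can be written down as a small helper. This is a sketch only; the function name `pad_payload` is hypothetical, and the limits 46 and 1500 are the ones quoted above.

```python
MIN_PAYLOAD = 46     # bytes, from the slide
MAX_PAYLOAD = 1500   # bytes, from the slide

def pad_payload(payload: bytes) -> bytes:
    """Return the payload, filled up with zero bytes to the minimum size."""
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("payload exceeds the maximum Ethernet payload size")
    # bytes.ljust leaves payloads of 46 bytes or more unchanged
    return payload.ljust(MIN_PAYLOAD, b"\x00")
```

Note that after padding the receiver can no longer tell from the frame alone how long the original message was; this is one use of the length interpretation of the Length/Type field.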

Ethernet Addresses

Address fields are 48 bits long
Each Ethernet adapter has its own address, typically burned into adapter hardware
Addresses are required to be globally unique
Each vendor gets its own address range and assigns unique addresses from that range
Nowadays adapters with programmable addresses are available
Address representation as six colon-separated bytes in hexadecimal representation, e.g.: 00:0c:29:10:fb:f3
Special addresses and address ranges:
Broadcast address: FF:FF:FF:FF:FF:FF
Addresses with first bit set to 1 but different from the broadcast address are multicast addresses
Addresses with first bit set to 0 are unicast addresses
(The "first bit" is the first bit transmitted on the wire, i.e. the least-significant bit of the first byte)
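A short sketch of how an address in the colon-separated notation could be classified. The function name `classify_mac` is hypothetical. The "first bit" is taken to be the first bit on the wire, which for Ethernet is the least-significant bit of the first byte.

```python
def classify_mac(address: str) -> str:
    """Classify a colon-separated MAC address as broadcast, multicast or unicast."""
    octets = bytes(int(part, 16) for part in address.split(":"))
    if len(octets) != 6:
        raise ValueError("expected six colon-separated bytes")
    if octets == b"\xff" * 6:
        return "broadcast"
    # Least-significant bit of the first byte distinguishes group addresses.
    return "multicast" if octets[0] & 0x01 else "unicast"
```

For the example address from the slide, `classify_mac("00:0c:29:10:fb:f3")` yields `"unicast"`.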

The Length/Type Field

When the L/T field carries a value ≥ 0x0600 it is interpreted as a type field, indicating the higher-layer protocol that is encapsulated in the Ethernet frame
The type field therefore provides protocol multiplexing
When the L/T field carries a value ≤ 1500, it indicates the length of the payload, then the type field is assumed to be in the first two bytes of the payload

Type field   Protocol
0x0800       IPv4
0x0806       ARP
0x809B       AppleTalk
0x86DD       IPv6
0x8863       PPPoE Discovery
0x8864       PPPoE Session
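The interpretation rule for the L/T field can be sketched as follows. The function name and the returned tuples are illustrative choices; the table holds the type values listed on the slide. Values between 1501 and 0x05FF are covered by neither rule and are reported as undefined here.

```python
ETHERTYPES = {
    0x0800: "IPv4",
    0x0806: "ARP",
    0x809B: "AppleTalk",
    0x86DD: "IPv6",
    0x8863: "PPPoE Discovery",
    0x8864: "PPPoE Session",
}

def interpret_length_type(value: int):
    """Interpret the 16-bit Length/Type field of an Ethernet frame."""
    if value >= 0x0600:
        # Type field: selects the encapsulated higher-layer protocol.
        return ("type", ETHERTYPES.get(value, "unknown"))
    if value <= 1500:
        # Length field: number of payload bytes.
        return ("length", value)
    return ("undefined", value)
```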

Ethernet PHYs

The Ethernet standard specifies several different physical layers, using different transmission media and adapters:
Coaxial cables
Twisted pair cables
Fiber cables
The standard also specifies several transmission rates: 10 Mbps, 100 Mbps, 1 Gbps, 10 Gbps, 40 Gbps, 100 Gbps
Notation:
<Data rate> <Signaling> <Max. Segment Length>
where:
Data rate is given in Mbps
Maximum segment length is given in hundreds of meters, or uses letters for special types of media
Signaling is always BASE, referring to baseband transmission
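The naming scheme can be decoded mechanically. The following parser is a sketch with a hypothetical function name and result layout; it treats a numeric suffix as the nominal segment length in hundreds of meters (the nominal value is a rounding for some PHYs, e.g. 10BASE2 segments are limited to 185 m) and any other suffix as a medium designation.

```python
import re

def parse_phy(name: str) -> dict:
    """Split a PHY name such as '10BASE5' or '100BASE-TX' into its parts."""
    match = re.fullmatch(r"(\d+)BASE-?(\w+)", name.upper())
    if match is None:
        raise ValueError(f"not a recognised PHY name: {name!r}")
    rate, suffix = int(match.group(1)), match.group(2)
    if suffix.isdigit():
        # Numeric suffix: nominal maximum segment length in units of 100 m.
        return {"rate_mbps": rate, "max_segment_m": int(suffix) * 100, "medium": None}
    # Letter suffix: designates a special medium type.
    return {"rate_mbps": rate, "max_segment_m": None, "medium": suffix}
```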

Topologies

Ethernet supports fundamentally two types of topologies:


Broadcast topologies
Switched Ethernet

Broadcast Topologies

All stations share capacity of the medium, MAC is needed


Stations operate in half-duplex mode
In Ethernet:
Bus topology (with coaxial cable)
Hub-based topology, using different types of cables
Only used in the 10 Mbps PHYs
Referred to as half-duplex Ethernet in the following

Switched Topologies

Stations are attached to (cascaded) switches via point-to-point links, no MAC needed on those links
The MAC is nonetheless used, since it is present anyway
Stations operate in full-duplex mode
Parallel transmissions possible, contention occurs for resources in switches (e.g. buffer memory)
Used for 100 Mbps and higher PHYs
Referred to as switched Ethernet in the following

Ethernet PHYs: A Few Examples

10BASE5: 10 Mbps, 500 m segment length (coaxial cable)
10BASE-T: 10 Mbps, using unshielded twisted-pair (UTP) cables in star topology
10BASE-F: 10 Mbps, fibre-optic cables with star topology (repeaters as central elements)
100BASE-TX: 100 Mbps, shielded twisted-pair (STP) cabling, star topology, or Category 5 UTP
There are many, many more PHY specifications . . .

Ethernet PHYs: Channel Coding Examples

The 10 Mbps Ethernet variants use baseband transmission and Manchester coding (self-clocking!)
The 100 Mbps variant (100BASE-X) uses baseband transmission and a more efficient 4b5b channel code:
Blocks of four user bits are mapped to blocks of five bits
The 4b → 5b mapping is constructed so that sufficiently many transitions between 0 and 1 are present, allowing the receiver to track symbol-/bit synchronization
One 1 Gbps variant (1000BASE-X) uses the 8b10b channel code
There are also other channel codes in use
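Why 4b5b is called "more efficient" becomes visible when comparing code rates. The sketch below encodes bits with Manchester coding and computes the ratio of user bits to channel symbols. The polarity convention (1 as low-then-high, 0 as high-then-low) is an assumption of this sketch; the opposite convention is also in use.

```python
def manchester_encode(bits):
    """Map every bit to two half-bit signal levels with a mid-bit transition."""
    levels = []
    for bit in bits:
        # Assumed convention: 1 -> low, high; 0 -> high, low.
        levels.extend((0, 1) if bit else (1, 0))
    return levels

def code_rate(user_bits, channel_symbols):
    """Fraction of the channel symbols that carries user data."""
    return user_bits / channel_symbols

manchester_rate = code_rate(1, 2)   # two levels per user bit
rate_4b5b = code_rate(4, 5)         # five code bits per four user bits
rate_8b10b = code_rate(8, 10)
```

Every Manchester bit contains a transition, which is what makes the code self-clocking, but only half of the signalling capacity carries user data; 4b5b and 8b10b keep enough transitions while carrying 80% user data.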

Ethernet PHYs: Summary

There is broadcast-/half-duplex Ethernet and switched Ethernet
There is a great selection of media types and speeds to choose from

Half-duplex MAC Protocol

Half-duplex Ethernet uses CSMA/CD


Carrier-Sense-Multiple-Access with Collision-Detection
This is a 1-persistent CSMA with specific collision
resolution procedure
Major assumptions:
Parallel transmissions lead to collisions
Stations monitor the medium for collision while transmitting
Ethernet implements this by checking the voltage on the
medium, a collision is inferred when this voltage exceeds the
voltage applied by the transmitter alone

MAC Protocol

1. Suppose a new packet arrives at the MAC of a station, set coll to zero
2. Station performs a carrier-sense operation
3. When the medium is idle, transmission starts immediately
4. When the medium is busy:
4.1 Listen until the channel becomes idle again
4.2 Start transmitting
5. While transmitting, check for collision
6. If a collision is detected:
6.1 Abort frame transmission, send a short jamming signal
6.2 Increase counter coll, counting # of subsequent collisions
6.3 If coll > 16 then drop frame and set coll to zero
6.4 Wait for random time (the backoff time), then go to step 2
7. If no collision is detected, transmit until end of frame, set coll to zero
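The steps above can be sketched as a loop. The three callables `medium_idle`, `transmit` and `wait_slots` are hypothetical hooks standing in for the hardware, and the backoff window `2 ** min(10, coll)` is the truncated binary exponential backoff rule used by Ethernet.

```python
import random

def csma_cd_send(medium_idle, transmit, wait_slots, rng=random):
    """Try to send one frame; return True on success, False if it is dropped.

    medium_idle() -> bool : carrier sense (hypothetical hook)
    transmit()    -> bool : True if the frame went out without collision;
                            on collision it is assumed to abort and jam
    wait_slots(k)         : wait for k slot times
    """
    coll = 0                               # step 1
    while True:
        while not medium_idle():           # steps 2 to 4: wait for idle medium
            pass
        if transmit():                     # steps 5 and 7
            return True
        coll += 1                          # step 6.2
        if coll > 16:                      # step 6.3
            return False
        window = 2 ** min(10, coll)        # backoff window size
        wait_slots(rng.randrange(window))  # step 6.4
```

The sketch is 1-persistent: as soon as the medium is sensed idle, the station transmits with probability one.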

Computation of Backoff Time

Draw a random integer number uniformly from the interval

$\{0, 1, \ldots, 2^{\min\{10, \mathrm{coll}\}} - 1\}$

which is called the backoff window
The actual backoff time is computed by multiplying this integer with the pre-defined slot time
Slot time is a common parameter, just large enough to cover the maximum round-trip time and small processing delays

Important Point
The backoff window size and therefore the average backoff time doubles after each collision, until 10 collisions have been observed! This is called the (truncated) binary exponential backoff algorithm!
Question: why this algorithm?
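A numerical sketch of the backoff computation. The slot time of 51.2 microseconds (512 bit times at 10 Mbps) is an assumption added here for illustration; the function names are hypothetical.

```python
import random

SLOT_TIME_US = 51.2  # assumed: 512 bit times at 10 Mbps

def backoff_window(coll: int) -> int:
    """Number of values in the backoff window after coll collisions."""
    return 2 ** min(10, coll)

def backoff_time_us(coll: int, rng=random) -> float:
    """Draw a random backoff time in microseconds."""
    return rng.randrange(backoff_window(coll)) * SLOT_TIME_US

def mean_backoff_slots(coll: int) -> float:
    """Mean of a uniform draw from 0 .. window - 1, in slots."""
    return (backoff_window(coll) - 1) / 2
```

The window grows as 2, 4, 8, ... up to 1024 and stays there for collisions 10 through 16. The mean is (window - 1) / 2 slots, so "doubles" holds approximately and becomes more accurate for larger windows.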

Minimum Frame Size / Slot Time

Minimum frame size (64 bytes for 10 Mbps Ethernet) is chosen to ensure that a transmitter can indeed detect a collision on a cable of given length
Rationale: a frame must be long enough that it is still being transmitted after the maximum round-trip delay plus small processing times introduced by repeaters / hubs
Requirement for maximum round-trip delay can be understood from worst-case example (next slide)

Important Point
Smaller slot times / minimal frame sizes can be achieved by restricting the maximum length of an Ethernet segment!
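The relation between round-trip delay and minimum frame size is simple arithmetic: the frame must last at least as long as the worst-case round trip. The helper names below are hypothetical.

```python
import math

def frame_duration_us(frame_bytes: int, rate_mbps: float) -> float:
    """Transmission time of a frame in microseconds."""
    return frame_bytes * 8 / rate_mbps

def min_frame_bytes(rate_mbps: float, round_trip_us: float) -> int:
    """Smallest frame that is still on the wire after one round trip."""
    return math.ceil(rate_mbps * round_trip_us / 8)
```

A 64-byte frame at 10 Mbps lasts 64 * 8 / 10 = 51.2 microseconds, so collisions can be detected as long as the round-trip delay including repeater processing stays below that value. Halving the permitted round-trip delay (shorter segments) halves the required minimum frame size.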

Minimum Frame Size (2)

(Figure: worst-case example for the maximum round-trip delay)

Half-duplex MAC Protocol: Important Properties

Very short medium access delays under light load
Under very heavy load and for many stations the collision rate increases
Rule of thumb: Ethernet should not be operated beyond 50-60% load!
Important problem: the capture effect [2]

Half-duplex MAC Protocol: Capture Effect

Suppose two stations A and B have many packets in their queue and start at the same time with collision counters nA = nB = 0
They start at the same time, experience a collision and both set nA = nB = 1
Station A draws a backoff slot of zero, B draws one
As a result:
A wins contention and starts transmitting
A sets nA = 0, whereas nB = 1
Now A has another packet, both stations transmit after A has finished its first packet, and experience a collision

Half-duplex MAC Protocol: Capture Effect (cont'd)

Result:
A sets nA = 1
B sets nB = 2
Station A therefore has a higher likelihood to win the next contention cycle
If A indeed wins, it sets nA = 0 whereas nB = 2
This is repeated, after each iteration the likelihood of A winning the contention increases
After 16 trials node B throws away the packet
⇒ frame delivery cannot be guaranteed!!!

Question
Can you imagine a protocol modification that circumvents this problem?
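How strongly the odds shift towards A can be computed exactly by enumerating all pairs of backoff draws. The function below is a sketch with a hypothetical name; it assumes both stations draw independently and uniformly from their backoff windows and that the smaller draw wins.

```python
from fractions import Fraction

def win_probability(n_a: int, n_b: int) -> Fraction:
    """Probability that A draws a strictly smaller backoff slot than B."""
    window_a = 2 ** min(10, n_a)
    window_b = 2 ** min(10, n_b)
    wins = sum(1 for a in range(window_a) for b in range(window_b) if a < b)
    return Fraction(wins, window_a * window_b)
```

With equal counters nA = nB = 1, A wins outright with probability 1/4 (and so does B, the remaining cases are ties and collide again). With nA = 1 and nB = 2 the probability for A rises to 5/8, with nB = 3 to 13/16, which illustrates how A captures the channel.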

Bibliography

[1] IEEE Computer Society.
IEEE Standard for Information technology - Telecommunications and information exchange between systems - Local and metropolitan area networks - Specific requirements - Part 3: Carrier sense multiple access with collision detection (CSMA/CD) access method and physical layer specifications, December 2005.
[2] K. K. Ramakrishnan and Henry Yang.
The Ethernet capture effect: Analysis and solution.
In Proc. 19th Conference on Local Computer Networks (LCN'94), Minneapolis, USA, October 1994.
[3] William Stallings.
Data and Computer Communications.
Prentice Hall, Englewood Cliffs, New Jersey, fourth edition, 2006.

Introduction

Error-Detecting Codes, Checksums

Error-Correcting Codes

ARQ Methods

Data Communications and Networking
COSC 264
Error Control

Dr. Andreas Willig (1)
Dr. Muhammad Asad Arfeen (2)
(1) Dept. of Computer Science and Software Engineering, University of Canterbury, Christchurch
(2) Dept. of Computer and Information Systems Engineering, NED University of Engineering & Technology, Karachi

UoC, 2014

Outline

Introduction
Error-Detecting Codes, Checksums
    Introduction
    Arithmetic in GF(2)
    CRC Checksums
Error-Correcting Codes
    Fundamental Considerations
    Block Codes
    Further Comments
ARQ Methods
    Alternating Bit Protocol
    Go-Back-N
    Selective Repeat

About This Module

Goals:
Understand that bit- and packet-errors are a fact of life in
virtually all networks
Know the two fundamental types of error-control strategies
Literature:
Books on coding: [13], [17], [18], [4], [20], [16]
Interesting papers on coding: [5], [7], [23], [28]
ARQ and general error control: [15], [12], [11], [14]
This module is in parts based on [13]

Some Reasons for Transmission Errors

Individual bits in a packet can be wrong because of thermal noise coupled with weak signal strength, interference, deliberate jamming, . . .
A crashing router / switch / station loses all packets currently in its memory
Packets are routed in the wrong direction
Packets are dropped because of congestion
Packets are dropped because of insufficient resources at the receiver

Important Point
Transmission errors are a fact of life and present a significant design challenge for a network and a distributed application!

Three Ways to Deal With Errors

1. Correct them:
Requires ability to detect presence of errors
This is the domain of open- and closed-loop error control
2. Ignore them:
A few wrong pixels in a JPEG image very often won't hurt
3. Conceal them:
. . . or you replace them by the average of neighboring pixels
Error concealment works only for certain types of user data

Types of Errors

On the link-layer level:


Packet fully lost (e.g. due to failure in preamble acquisition)
Packet truncated (e.g. loss of synchronization)
Bits in packet are flipped
On the end-to-end level in a wide-area network:
Packets are lost
Packets are duplicated
Packets are delayed for very long time
Packets are reordered
Bits in packet are flipped

Error Control Schemes

Error-control schemes try to detect and correct errors


They are typically employed:
On the link-layer, protecting single-hop transmissions
On the transport-layer, protecting multi-hop transmissions
They all involve, in one way or the other, redundancy!

Question
Suppose in a wide-area multihop network we have error-control
on every link. Is this sufficient or do we additionally need
end-to-end error control mechanisms?

Open-Loop Error Control

Transmitter does not receive any feedback from the receiver about transmission outcome
Major option for transmitter: equip packets with redundant data, allowing receiver to correct bit errors
This is called error-control coding or forward error correction (FEC)
Often applied in situations where feedback is infeasible:
Real-time voice (POTS!) and video applications
When very long links are involved (space probes)!

Closed-Loop Error Control


ARQ = Automatic Repeat reQuest [3], [11], [15], [26]
Here transmitter gets (and uses!!) feedback from receiver
Basic idea:
Transmitter augments each packet with a checksum
Receiver uses checksum to check the integrity of a received
packet and provides transmitter with appropriate feedback
If needed, the transmitter re-transmits packet
Hybrid-ARQ schemes couple ARQ with coding, e.g.:
Couple ARQ with a coding scheme that is active all the time
Transmit first packet without coding, but upon receiving
negative feedback switch on coding, increase coding
strength / redundancy with every further retransmission

Packet Error Rate vs. Bit Error Rate

Suppose the channel only flips bits
Each bit is flipped independently of others with probability $p \in (0, 1)$, called bit error rate (BER)
A packet with n bits is regarded as erroneous when at least one bit is flipped
The packet error rate (PER) is thus:

$P(p, n) = 1 - (1 - p)^n$

Figure shows PER for varying BER (computed as $10^{-b}$, with b being the BER exponent on x-axis) for packets with 100 bits overhead (header, trailer) and varying # of user bits
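The PER formula is easy to evaluate numerically. The sketch below uses the same setup as the figure (100 bits of overhead plus a number of user bits); the function names and the example values in the usage note are chosen here for illustration.

```python
def packet_error_rate(p: float, n: int) -> float:
    """PER for n bits with independent bit errors of probability p."""
    return 1.0 - (1.0 - p) ** n

def per_for_packet(ber_exponent: float, user_bits: int, overhead_bits: int = 100) -> float:
    """PER for BER = 10**(-ber_exponent) and a packet with fixed overhead."""
    return packet_error_rate(10.0 ** (-ber_exponent), user_bits + overhead_bits)
```

For example, with a BER of 10^-3 and 1000 user bits the packet has 1100 bits and the PER is about 0.67, i.e. roughly two out of three packets are erroneous. At a BER of 10^-6 the same packet has a PER of about 0.0011.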

Packet Error Rate vs. Bit Error Rate: Observations

For fixed BER the PER increases with packet size


Longer packets are more likely to be erroneous
Shorter packets have relatively more overhead
For fixed packet size PER increases with increasing BER
When BER becomes too bad, almost no packet is
transmitted correctly

Error Characteristics

The error model used before (bits are independent and have same error probability) is extremely simplistic
In reality errors are often bursty:
In an error burst of order x two successive erroneous bits are separated by less than x bits
Example: when a wireless transmitter transmits at 11 Mbps and an interference burst of 200 µs occurs, 2200 bits are affected by the burst
The practically observed error rates depend on the transmission medium
Fibre-optical cables have BERs in the order of $10^{-15}$ and less
With wireless links BERs in the order of $10^{-3} \ldots 10^{-2}$ can be frequently observed
Error modeling is a vast field (e.g. [1], [10], [8], [27])

Introduction

With error-detection coding redundancy is added to a packet so that:
Certain error patterns can be detected reliably
Other error patterns can be detected with high probability
No information about the position of errors in a packet can be inferred (and hence no correction can be done)
Error-detecting codes serve as a packet checksum and are used in ARQ schemes:
Tx computes checksum as function of header and data
Tx attaches checksum to packet and transmits both
Receiver computes own checksum (with same algorithm) over (received) header and data
Receiver compares own checksum with the one received
If equal, packet is accepted and positive feedback is sent
Otherwise, negative feedback is provided to Tx
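The transmitter and receiver steps can be sketched with the CRC-32 function from Python's standard `zlib` module. The function names and the choice of a 4-byte big-endian trailer are assumptions of this sketch, not a specific protocol's frame layout.

```python
import zlib

def attach_checksum(packet: bytes) -> bytes:
    """Tx side: compute the checksum over header and data and append it."""
    return packet + zlib.crc32(packet).to_bytes(4, "big")

def checksum_ok(frame: bytes) -> bool:
    """Rx side: recompute the checksum and compare with the received one."""
    body, received = frame[:-4], frame[-4:]
    return zlib.crc32(body).to_bytes(4, "big") == received
```

A receiver would send positive feedback when `checksum_ok` returns `True` and negative feedback otherwise. The check tells the receiver that something is wrong, but not which bits were flipped.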

Position of Checksum Fields

Often (especially for MAC- and link layer protocols) the packet checksum is appended to the packet
Reason: checksum computation can be done on the fly (while transmitting) by the transmitter hardware, but for this it must have seen all the bits of a packet, leaving appending as the only option
Examples: Ethernet, Wireless LAN
Some other protocols (e.g. TCP) store the checksum field in the header

Parity Check

Data stream is subdivided into small blocks, e.g. bytes
Even parity: one additional bit is appended to a byte so that the total number of 1s in data and parity bits is even, e.g.:
  0100 1101 becomes 0100 1101 0
  0101 1101 becomes 0101 1101 1
Odd parity: similar, but total number of 1s is odd
Properties of parity check codes:
  All odd numbers of bit errors are reliably detected
  All even numbers of bit errors reliably go undetected
Parity check codes have traditionally been used on serial interfaces, e.g. RS-232, where a parity bit has been appended to each byte
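
A minimal Python sketch of even parity, reusing the two example bytes from above. Bits are kept as strings for readability; the function names are chosen for this illustration only.

```python
def even_parity_bit(bits: str) -> str:
    """Return the bit that makes the total number of 1s even."""
    return str(bits.count("1") % 2)


def add_even_parity(bits: str) -> str:
    """Append the even-parity bit to a block of data bits."""
    return bits + even_parity_bit(bits)


def parity_ok(codeword: str) -> bool:
    """An even-parity codeword is valid iff its number of 1s is even."""
    return codeword.count("1") % 2 == 0


def flip(codeword: str, position: int) -> str:
    """Model a bit error at the given position."""
    flipped = "1" if codeword[position] == "0" else "0"
    return codeword[:position] + flipped + codeword[position + 1:]
```

Flipping one bit of a codeword makes parity_ok fail, while flipping two bits restores an even count and the error goes unnoticed.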

Outline

Introduction
Error-Detecting Codes, Checksums
  Introduction
  Arithmetic in GF(2)
  CRC Checksums
Error-Correcting Codes
ARQ Methods

Introduction

The following explains the arithmetic in GF(2), a simple algebraic structure that:
  Is familiar (maybe in other guises) to computer scientists
  Is the foundation for the computation of CRC checksums
The presentation is mainly based on [13]

Field Axioms

Definition
We are given a non-empty set $F$ and two operations
$$+ : F \times F \to F, \qquad \cdot : F \times F \to F$$
for which the following holds (field axioms):
Associativity: For all $x \in F$, $y \in F$ and $z \in F$ we have:
$$x + (y + z) = (x + y) + z$$
$$x \cdot (y \cdot z) = (x \cdot y) \cdot z$$
Commutativity: For all $x \in F$ and $y \in F$ we have:
$$x + y = y + x$$
$$x \cdot y = y \cdot x$$

Field Axioms (2)

Definition
Additive identity: there exists an element $0 \in F$, called identity, such that for each $x \in F$ we have
$$x + 0 = 0 + x = x$$
Multiplicative identity: there exists an element $1 \in F$, called identity, such that for each $x \in F$ we have
$$x \cdot 1 = 1 \cdot x = x$$
Additive inverse: for every $x \in F$ there exists an element $y \in F$ such that
$$x + y = 0$$
This element is denoted as $-x$. This element is unique.
Multiplicative inverse: for every $x \in F$ with $x \neq 0$ there exists an element $y \in F$ such that
$$x \cdot y = 1$$
This element is denoted as $x^{-1}$. This element is unique.

Field Axioms (3)

Definition
Distributivity: For all $x \in F$, $y \in F$ and $z \in F$ we have
$$x \cdot (y + z) = (x \cdot y) + (x \cdot z)$$

Examples of Fields

Infinite fields: e.g. the rational numbers $\mathbb{Q}$, the real numbers $\mathbb{R}$ and the complex numbers $\mathbb{C}$
There are also finite fields, called Galois Fields
  In honor of Évariste Galois, a French mathematician

GF(2): Modulo-2 Arithmetic

Set $F = \{0, 1\}$
Define the addition $+$ as follows:

x  y  x+y
0  0   0
0  1   1
1  0   1
1  1   0

This is precisely an XOR!!
You can check that $0 = 1 + 1 \bmod 2$ holds
Define the multiplication $\cdot$ as follows:

x  y  x·y
0  0   0
0  1   0
1  0   0
1  1   1

Check that these operations satisfy the field axioms!!
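
Because GF(2) has only two elements, the field axioms can be checked exhaustively. The following Python sketch does this, assuming addition is implemented as XOR and multiplication as AND; the function names are illustrative.

```python
from itertools import product

F = (0, 1)


def add(x, y):
    return x ^ y  # XOR, matches the addition table


def mul(x, y):
    return x & y  # AND, matches the multiplication table


def satisfies_field_axioms() -> bool:
    for x, y, z in product(F, repeat=3):
        if add(x, add(y, z)) != add(add(x, y), z):
            return False  # associativity of addition violated
        if mul(x, mul(y, z)) != mul(mul(x, y), z):
            return False  # associativity of multiplication violated
        if mul(x, add(y, z)) != add(mul(x, y), mul(x, z)):
            return False  # distributivity violated
    for x, y in product(F, repeat=2):
        if add(x, y) != add(y, x) or mul(x, y) != mul(y, x):
            return False  # commutativity violated
    for x in F:
        if add(x, 0) != x or mul(x, 1) != x:
            return False  # identities violated
        if not any(add(x, y) == 0 for y in F):
            return False  # no additive inverse
        if x != 0 and not any(mul(x, y) == 1 for y in F):
            return False  # no multiplicative inverse
    return True
```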

GF(2): Modulo-2 Arithmetic (2)

The additive identity is 0
The multiplicative identity is 1
Additive inverses:
  0 is the inverse of 0
  1 is the inverse of 1
The latter gives rise to the unfamiliar-looking (but true) identity
$$1 + 1 = 0 = 1 - 1$$
and also to
$$0 - 1 = 1$$
This means: subtraction is the same as addition!

Polynomials over GF(2)

Definition
Let $n \in \mathbb{N}$. A polynomial $f(X)$ with one variable $X$ and with coefficients from GF(2) is of the form
$$f(X) = f_n X^n + f_{n-1} X^{n-1} + \ldots + f_1 X + f_0 = \sum_{k=0}^{n} f_k X^k$$
where all $f_i$ are from GF(2). The polynomial is said to have degree $n$ when $f_n = 1$, otherwise it has a degree smaller than $n$. The degree of $f(X)$ is denoted as $\deg f(X)$.
There are $2^n$ polynomials over GF(2) with degree $n$

Addition of Polynomials over GF(2)

Let $f(X) = \sum_{k=0}^{n} f_k X^k$ and $g(X) = \sum_{k=0}^{m} g_k X^k$ be two polynomials
For adding these polynomials, we add the coefficients of the same power according to the rules of GF(2) arithmetic. For $m \le n$ we get
$$f(X) + g(X) = (f_0 + g_0) + (f_1 + g_1) X + \ldots + (f_m + g_m) X^m + f_{m+1} X^{m+1} + \ldots + f_n X^n$$
Example:
$$\begin{array}{rl}
  & 1 \cdot X^5 + 0 \cdot X^4 + 1 \cdot X^3 + 1 \cdot X^2 + 0 \cdot X + 1 \\
+ & 1 \cdot X^5 + 1 \cdot X^4 + 0 \cdot X^3 + 1 \cdot X^2 + 0 \cdot X + 1 \\
= & 0 \cdot X^5 + 1 \cdot X^4 + 1 \cdot X^3 + 0 \cdot X^2 + 0 \cdot X + 0
\end{array}$$
or, using a shorthand notation (which ignores the formal variable $X$):
$$\begin{array}{rl}
  & 101101 \\
+ & 110101 \\
= & 011000
\end{array}$$
Under GF(2) addition and subtraction of polynomials are really the same!!
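
In software, a polynomial over GF(2) is commonly stored as an integer whose bit i holds the coefficient of X^i, so that addition becomes a bitwise XOR. A small sketch under that representation (the helper names are illustrative):

```python
def gf2_poly_add(f: int, g: int) -> int:
    """Add two GF(2) polynomials stored as bit masks (bit i = coefficient of X^i)."""
    return f ^ g  # coefficient-wise addition modulo 2


def to_shorthand(f: int, width: int) -> str:
    """Render a polynomial in the shorthand notation with a fixed number of digits."""
    return format(f, f"0{width}b")


f = 0b101101  # X^5 + X^3 + X^2 + 1
g = 0b110101  # X^5 + X^4 + X^2 + 1
total = gf2_poly_add(f, g)
```

Adding g a second time returns f, which mirrors the statement that addition and subtraction coincide.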

Multiplication of Polynomials over GF(2)

The rule for multiplication can be obtained from expanding $f(X) \cdot g(X)$ and summing up (according to the rules of GF(2)!!) the terms for the same power
We have
$$f(X) \cdot g(X) = c_0 + c_1 X + c_2 X^2 + \ldots + c_{n+m} X^{n+m}$$
with
$$\begin{aligned}
c_0 &= f_0 g_0 \\
c_1 &= f_0 g_1 + f_1 g_0 \\
c_2 &= f_0 g_2 + f_1 g_1 + f_2 g_0 \\
    &\;\;\vdots \\
c_k &= f_0 g_k + f_1 g_{k-1} + \ldots + f_k g_0 \\
    &\;\;\vdots \\
c_{n+m} &= f_n g_m
\end{aligned}$$
where the right-hand sides are computed according to GF(2) rules

Multiplication of Polynomials over GF(2) (2)

We can multiply polynomials in a way similar to multiplying numbers
Using the shorthand notation we have for example:
[Worked example: long multiplication of two polynomials in shorthand notation]
Rules for Polynomials over GF(2)

Commutativity: for all polynomials $f(X)$ and $g(X)$:
$$f(X) + g(X) = g(X) + f(X)$$
$$f(X) \cdot g(X) = g(X) \cdot f(X)$$
Associativity: for all polynomials $f(X)$, $g(X)$ and $h(X)$:
$$f(X) + [g(X) + h(X)] = [f(X) + g(X)] + h(X)$$
$$f(X) \cdot [g(X) \cdot h(X)] = [f(X) \cdot g(X)] \cdot h(X)$$
Distributivity: for all polynomials $f(X)$, $g(X)$ and $h(X)$:
$$f(X) \cdot [g(X) + h(X)] = [f(X) \cdot g(X)] + [f(X) \cdot h(X)]$$

Polynomial Division over GF(2)

Let $f(X)$ and $g(X)$ be two polynomials with $\deg g(X) > 0$ and $\deg f(X) \ge \deg g(X)$
When $f(X)$ is divided by $g(X)$ we get a unique pair of polynomials $q(X)$ and $r(X)$ such that
$$f(X) = q(X) \cdot g(X) + r(X)$$
where $r(X)$ is called the remainder and $q(X)$ is called the quotient, and furthermore $\deg r(X) < \deg g(X)$ holds
Compare Euclid's algorithm!!
We can divide polynomials in a way very similar to the long-division technique for dividing integer numbers

Polynomial Division over GF(2) (2)

One example:
$$\begin{array}{rrrrrrrl}
X^6 & + X^5 & + X^4 &       &       & + X & + 1 & : (X^3 + X + 1) = X^3 + X^2 \\
(X^6 &      & + X^4 & + X^3) &      &     &     & \\ \hline
    & X^5   &       & + X^3 &       & + X & + 1 & \\
    & (X^5  &       & + X^3 & + X^2) &    &     & \\ \hline
    &       &       &       & X^2   & + X & + 1 &
\end{array}$$
And using the shorthand notation the same example becomes:
$$\begin{array}{l}
1110011 : 1011 = 1100 \\
1011000 \\ \hline
0101011 \\
0101100 \\ \hline
0000111
\end{array}$$
So the quotient is 1100 (or $X^3 + X^2$) and the remainder is 111 (or $X^2 + X + 1$)
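
The long division above can be sketched in Python by repeatedly XORing a shifted copy of the divisor onto the dividend. Polynomials are assumed to be stored as integers (bit i = coefficient of X^i); the function name is chosen for this illustration.

```python
def gf2_poly_divmod(f: int, g: int) -> tuple:
    """Divide f by g over GF(2); return (quotient, remainder) as bit masks."""
    if g == 0:
        raise ZeroDivisionError("division by the zero polynomial")
    quotient = 0
    deg_g = g.bit_length() - 1
    # Cancel the leading term of f until its degree drops below deg g
    while f.bit_length() - 1 >= deg_g:
        shift = (f.bit_length() - 1) - deg_g
        quotient |= 1 << shift
        f ^= g << shift  # subtraction equals addition equals XOR
    return quotient, f


dividend = 0b1110011  # X^6 + X^5 + X^4 + X + 1
divisor = 0b1011      # X^3 + X + 1
q, r = gf2_poly_divmod(dividend, divisor)
```

For the example polynomials this yields the quotient 1100 and the remainder 111 in shorthand notation.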


CRC: Computation and Transmission

We are given $k$ data bits, represented as a polynomial $d_k(X)$ with $k$ coefficients (i.e. maximum degree is $k - 1$)
We are furthermore given a so-called generator polynomial $g(X)$, for example of degree 16 or 32
We compute $n - k = \deg g(X)$ check bits as follows:
  Compute the new polynomial $X^{n-k} \cdot d_k(X)$
    This amounts to shifting the bits of $d_k(X)$ to the left by $n - k$ places, filling up with zeros for the smallest coefficients
  Perform a polynomial division of $X^{n-k} \cdot d_k(X)$ by $g(X)$, giving quotient $q(X)$ and remainder $r(X)$:
  $$X^{n-k} \cdot d_k(X) = q(X) \cdot g(X) + r(X)$$
  Transmit the message $m(X) = X^{n-k} \cdot d_k(X) + r(X)$
Crucial property: $m(X)$ is evenly divisible by $g(X)$!!
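
The transmitter side of the CRC can be sketched in Python as follows. Polynomials are assumed to be stored as integers (bit i = coefficient of X^i). The toy generator X^3 + X + 1 and the 4-bit data word are assumptions made for this illustration; real protocols use generators of degree 16 or 32.

```python
def gf2_mod(f: int, g: int) -> int:
    """Remainder of the GF(2) polynomial division f / g."""
    deg_g = g.bit_length() - 1
    while f.bit_length() - 1 >= deg_g:
        f ^= g << ((f.bit_length() - 1) - deg_g)
    return f


def crc_encode(data: int, generator: int) -> int:
    """Shift the data left by deg g places and add the division remainder."""
    n_check = generator.bit_length() - 1   # number of check bits, n - k
    shifted = data << n_check              # multiply by X^(n-k)
    remainder = gf2_mod(shifted, generator)
    return shifted ^ remainder             # the check bits fill the zeros


GENERATOR = 0b1011   # toy generator X^3 + X + 1 (assumption)
DATA = 0b1101        # four data bits
message = crc_encode(DATA, GENERATOR)
```

The transmitted message is 1101001 in shorthand notation: the data bits followed by the three check bits 001.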

CRC: Receiver Behaviour

As a result of transmission, the channel possibly modifies transmitted $m(X)$ to received $\tilde{m}(X)$
The receiver:
  Divides received $\tilde{m}(X)$ by $g(X)$
  If the remainder is zero, the message is accepted as correct
  Otherwise, the message is rejected as erroneous
Some comments:
  Errors are not detected when their bit pattern (taken as a polynomial) is evenly divisible by $g(X)$
  All odd numbers of errors can be detected when $g(X)$ can be evenly divided by $X + 1$
  All error bursts of length less than $n - k$ are detected
  As an approximation, error bursts of length $r \ge n - k$ are not detected with probability $1/2^r$
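
A small Python sketch of the receiver check, including one detected and one undetected error pattern. The toy generator X^3 + X + 1 and the codeword 1101001 (which is divisible by that generator) are assumptions made for this illustration.

```python
def gf2_mod(f: int, g: int) -> int:
    """Remainder of the GF(2) polynomial division f / g (bit i = coefficient of X^i)."""
    deg_g = g.bit_length() - 1
    while f.bit_length() - 1 >= deg_g:
        f ^= g << ((f.bit_length() - 1) - deg_g)
    return f


def crc_accept(received: int, generator: int) -> bool:
    """Accept the received message iff the division remainder is zero."""
    return gf2_mod(received, generator) == 0


GENERATOR = 0b1011          # toy generator X^3 + X + 1 (assumption)
sent = 0b1101001            # a valid codeword for this generator

single_bit_error = sent ^ 0b0000100       # one flipped bit
undetectable = sent ^ (GENERATOR << 1)    # error pattern divisible by g(X)
```

The second error pattern is a multiple of the generator, so the corrupted message still divides evenly and is wrongly accepted.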

A Justification

The procedure rests on the fact that $m(X) = X^{n-k} \cdot d_k(X) + r(X)$ is evenly divisible by $g(X)$
This is true since:
$$\begin{aligned}
m(X) &= X^{n-k} \cdot d_k(X) + r(X) \\
     &= g(X) \cdot q(X) + r(X) + r(X) \\
     &= g(X) \cdot q(X)
\end{aligned}$$
which is certainly divisible without remainder by $g(X)$
Can you justify the steps?

CRC: Some Standard Polynomials

Quoted from [24, Sect. 6.3]

Name        Specification
CRC-12      $X^{12} + X^{11} + X^3 + X^2 + X + 1$
CRC-16      $X^{16} + X^{15} + X^2 + 1$
CRC-CCITT   $X^{16} + X^{12} + X^5 + 1$
CRC-32      $X^{32} + X^{26} + X^{23} + X^{22} + X^{16} + X^{12} + X^{11} + X^{10} + X^8 + X^7 + X^5 + X^4 + X^2 + X + 1$
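
Implementations usually store such a generator as a bit mask. The following Python sketch converts the exponent lists of the table into integers (bit i = coefficient of X^i); the helper name and the dictionary are made up for this illustration.

```python
def poly_from_exponents(exponents) -> int:
    """Build the bit mask of a GF(2) polynomial from the exponents of its terms."""
    value = 0
    for e in exponents:
        value |= 1 << e
    return value


STANDARD_POLYNOMIALS = {
    "CRC-12": [12, 11, 3, 2, 1, 0],
    "CRC-16": [16, 15, 2, 0],
    "CRC-CCITT": [16, 12, 5, 0],
    "CRC-32": [32, 26, 23, 22, 16, 12, 11, 10, 8, 7, 5, 4, 2, 1, 0],
}

masks = {name: poly_from_exponents(exps)
         for name, exps in STANDARD_POLYNOMIALS.items()}
```

Each mask has one more bit than the number of check bits, since a generator of degree 16 has 17 coefficients.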

Summary and Important Points

CRCs can be easily implemented in hardware
They can also be implemented efficiently in software (linear time in the number of message bits) [19], [21], [9]

Important Point
There is always a residual error rate, i.e. some probability that errors are not detected by a CRC checksum!!
Intuitive justification: when the number of check bits is smaller than the number of message bits, several messages are mapped to the same checksum. An error pattern turning one of these messages into another one cannot be detected
In many cases the residual error probability (especially for 32-bit CRCs) is pretty small and errors can be neglected
But: see [25]

Outline

Introduction
Error-Detecting Codes, Checksums
Error-Correcting Codes
  Fundamental Considerations
  Block Codes
  Further Comments
ARQ Methods

Coding and Decoding: An Abstract View

The channel is capable of transmitting channel symbols from a finite set, e.g. $\{0, 1\}$
A coder maps source messages $w$ to sequences $x^n$ of channel symbols
The channel possibly introduces errors and outputs a sequence $y^n$ of channel symbols
The decoder applies a decoding function to $y^n$ and produces an estimate $\hat{w}$ of the transmitted message $w$
When $\hat{w} \neq w$ we have a decoding error

Code Rate

Suppose that:
  the binary representation of a source message requires $k$ bits
  the coder produces $n > k$ channel bits

Definition
The ratio $k/n$ is called the code rate. It expresses the amount of redundancy added by the code to the user data.
This is not the most general definition, see [6]

Goal of Coding Theory

A key goal of coding theory is to find codes that achieve very small probabilities of decoding errors while being reasonably efficient, i.e. have a high code rate!
Often, stronger codes (i.e. having lower probabilities of decoding errors) tend to have lower code rates

Important Point
You have to pay for reliability with overhead!

A First Code: Repetition Coding

Consider a channel with the following properties:
  The channel symbols are bits
  Each bit is flipped with probability $p < \frac{1}{2}$ and transmitted correctly with probability $1 - p$
  Bit errors are independent of each other
The transmitter wishes to transmit a one-bit message $w$
Coding scheme: the message bit is repeated $n$ times, i.e.
$$0 \mapsto \underbrace{000 \ldots 00}_{n \text{ times } 0}, \qquad 1 \mapsto \underbrace{111 \ldots 11}_{n \text{ times } 1}$$
where $n$ is an odd number

A First Code: Repetition Coding (2)

Suppose $n = 5$ and the receiver gets:
  01011
What is the most likely source message? 0 or 1?

A First Code: Repetition Coding (3)

It can be shown that [17, Sect. 2.5]:
  The decoding rule which minimizes the decoding error probability is the majority voting rule
  Output the number that occurs most often in the received sequence
The probability $\varepsilon$ of a decoding error under the majority voting rule is:
$$\varepsilon = 1 - \sum_{i=0}^{\frac{n-1}{2}} \binom{n}{i} p^i (1 - p)^{n-i}$$
There is a problem: if we let our desired $\varepsilon$ shrink to zero, the value $n$ required to reach this grows without bound
This means: arbitrarily reliable decoding is not possible with repetition coding with a strictly positive code rate
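
Majority voting and the resulting decoding error probability can be sketched in Python as follows. The function names are illustrative, and n is assumed to be odd, as in the coding scheme of the repetition code.

```python
from math import comb


def majority_decode(received: str) -> str:
    """Output the bit that occurs most often (n = len(received) is assumed odd)."""
    return "1" if received.count("1") > len(received) // 2 else "0"


def repetition_error_prob(p: float, n: int) -> float:
    """Decoding error probability: 1 minus the probability of at most (n-1)/2 bit errors."""
    correct = sum(comb(n, i) * p**i * (1 - p)**(n - i)
                  for i in range((n - 1) // 2 + 1))
    return 1 - correct


decoded = majority_decode("01011")   # three of the five received bits are 1
rates_and_errors = [(1 / n, repetition_error_prob(0.1, n)) for n in (1, 3, 5, 7)]
```

The list pairs each code rate 1/n with its error probability for p = 0.1: the error probability falls as n grows, but the code rate falls with it.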

Shannon's Noisy Channel Coding Theorem

Can we do better than this? Can we transmit with arbitrarily small decoding error probabilities at strictly positive rates?
One of the initial [22] achievements of information theory is the confirmation that this is indeed possible
More precisely:
  For a given channel there exists a number $C$ giving the capacity of this channel
  For any code rate $R < C$ and any given desired probability of decoding error $\varepsilon$, it is possible to construct a code having the given rate $R$ that has a decoding error probability less than $\varepsilon$

A Second Code: Parity Check Coding

In this example code a block of 9 user data bits is arranged in a $3 \times 3$ matrix
For each matrix row an even parity bit is computed (bits $c_1, c_2, c_3$)
For each matrix column an even parity bit is computed (bits $c_4, c_5, c_6$)
In the lower right corner a parity bit for the row and column parities is computed (bit $c_7$)
What happens after single-bit errors?
What happens after two-bit errors?
What happens after three-bit errors?
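
To experiment with these questions, the two-dimensional parity code can be sketched in Python. Two assumptions are made for this illustration: the corner bit is taken as the parity of the row parities (which always equals the parity of the column parities), and the correction routine only handles a single error among the nine data bits, with the parity bits received correctly.

```python
def encode(data):
    """Return (row parities c1..c3, column parities c4..c6, corner bit c7)."""
    rows = [sum(row) % 2 for row in data]
    cols = [sum(col) % 2 for col in zip(*data)]
    corner = sum(rows) % 2  # equals sum(cols) % 2
    return rows, cols, corner


def correct_single_data_error(data, rows, cols):
    """Flip the data bit at the crossing of the failing row and failing column."""
    bad_rows = [i for i, row in enumerate(data) if sum(row) % 2 != rows[i]]
    bad_cols = [j for j, col in enumerate(zip(*data)) if sum(col) % 2 != cols[j]]
    fixed = [list(row) for row in data]
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        fixed[bad_rows[0]][bad_cols[0]] ^= 1
    return fixed


original = [[1, 0, 1],
            [0, 1, 1],
            [0, 0, 1]]
rows, cols, corner = encode(original)

received = [list(row) for row in original]
received[1][2] ^= 1  # single-bit error in the data block
repaired = correct_single_data_error(received, rows, cols)
```

A single data-bit error violates exactly one row parity and one column parity, which pinpoints its position.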

Basic Operation

In a block code:
  The user data stream is segmented into blocks of $k$ bits
  Each $k$-bit block is encoded independently of other blocks to an $n$-bit codeword ($n > k$)
  The code rate is $k/n$
The set of all possible source words has size $2^k$
The set of all possible words in the code space has size $2^n$
  Out of these the code uses only $2^k$ out of $2^n$ elements
  These are called valid codewords
In many practical codes the set of codewords has some algebraic structure, it could be a group or a vector space

Effect of Channel Errors

Channel errors can turn:
  a valid codeword into a word from the set of $2^n - 2^k$ unused codewords, then the decoder must guess which was the transmitted codeword
  a valid codeword into another valid codeword; this cannot be detected by the code
When facing an unused codeword $y$, decoders essentially look for the valid codeword that is closest to $y$
  What can you do about errors that the decoder cannot detect?
  How can we define closest? What notion of distance do we use?
  Why do we choose the closest?

The Hamming Distance

Definition
We are given two words $y_1$ and $y_2$ of $n$ bits length each. The Hamming Distance of $y_1$ and $y_2$ is defined as the number of bit positions in which $y_1$ and $y_2$ differ; it is denoted as $d(y_1, y_2)$.
The Hamming Distance of the Code, $d_H$, is the minimum Hamming Distance among all possible pairs of valid codewords.
Note: $d(y_1, y_2)$ can be computed by XORing $y_1$ and $y_2$ and counting the 1s in the result
Example:
  y1         = 1011 0010 0101 1111
  y2         = 1011 1010 1101 1011
  y1 XOR y2  = 0000 1000 1000 0100
which implies that $d(y_1, y_2) = 3$
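
The XOR-and-count rule translates directly into Python when the words are stored as integers. The function names are chosen for this sketch; code_distance computes the Hamming distance of a code by checking all pairs of codewords.

```python
from itertools import combinations


def hamming_distance(y1: int, y2: int) -> int:
    """XOR the two words and count the 1s in the result."""
    return bin(y1 ^ y2).count("1")


def code_distance(codewords) -> int:
    """Minimum Hamming distance among all pairs of distinct codewords."""
    return min(hamming_distance(a, b) for a, b in combinations(codewords, 2))


y1 = 0b1011_0010_0101_1111
y2 = 0b1011_1010_1101_1011
d = hamming_distance(y1, y2)
```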

Decoding Rule

Given the notion of Hamming distance we can formulate the following decoding rule:

Decoding Rule
Let $x_1, x_2, \ldots, x_{2^k}$ denote the set of valid codewords. After receiving a codeword $y$, the decoder outputs the $x_i$ that has the smallest Hamming distance to $y$ (ties are broken randomly), i.e. we have:
$$\hat{x} = \arg\min_{x_i \in \{x_1, x_2, \ldots, x_{2^k}\}} d(y, x_i)$$
The adoption of an algebraic structure (groups, vector space) for the set of valid codewords often allows us to simplify the computation of $\hat{x}$ as compared to an exhaustive traversal of all valid codewords
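
The exhaustive traversal mentioned above can be sketched in a few lines of Python. Codewords are kept as bit strings; the small example code with four codewords is an assumption made for this illustration, and ties are broken by list order rather than randomly.

```python
def hamming_distance(a: str, b: str) -> int:
    """Number of positions in which two equally long bit strings differ."""
    return sum(x != y for x, y in zip(a, b))


def decode(received: str, codewords) -> str:
    """Return the valid codeword closest to the received word.

    min() returns the first minimum, so ties go to the earlier codeword.
    """
    return min(codewords, key=lambda c: hamming_distance(received, c))


CODE = ["000000", "111000", "000111", "111111"]  # hypothetical code with d_H = 3

corrected = decode("110000", CODE)   # one bit error in 111000
unchanged = decode("000111", CODE)   # a valid codeword decodes to itself
```

The cost of this approach grows with the number of valid codewords, which is why structured codes with cheaper decoders are preferred in practice.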

Some Observations

Suppose we have a code with Hamming distance $d_H$
Error patterns with no more than $d_H - 1$ errors can be detected reliably
  Why?
Error patterns with no more than $\frac{d_H - 1}{2}$ errors can be corrected reliably
  Why?

Practical Block Codes

Block codes are still widely used, although better codes exist
Most popular classes:
  Reed-Solomon (RS) codes
  Bose-Chaudhuri-Hocquenghem (BCH) codes

Importance-based Coding

The approach is quite simple: for many data types (e.g. speech) not all source data bits have the same importance
  Hence there is no need to encode all data bits in the same way
Approach: Apply powerful coding (more redundancy) to important data, less powerful coding to less important data!
Some of the data types suitable for importance-based coding are:
  Speech and audio
  Video
The (human) receiver of these data types does not recognize small distortions in video frames / speech samples

Importance-based Coding (2)

Example: video coding with MPEG 2/4:
  Some bits describe global properties (metadata) of a video stream, e.g. resolution, color depth, color palettes; these bits are quite important
  Other bits (DCT coefficients) describe picture contents; the low-frequency parts of a picture (large areas of the same color) are more important than the high-frequency parts (sharp edges)
  DCT = discrete cosine transform, has some similarities to discrete Fourier transforms
Importance-based coding is also applied in GSM speech transmission

ARQ Protocols

ARQ = Automatic Repeat reQuest [3], [11], [15], [26]
Basic idea:
  The transmitter augments each frame with a checksum
  The receiver uses the checksum to check the integrity of a packet and provides the transmitter with feedback
  If needed, the transmitter retransmits the packet

Important Point
ARQ schemes are feedback-based or closed-loop schemes. They provide redundancy (retransmissions) only upon negative feedback; on excellent channels they do not have significant overhead

ARQ Protocols (2)

Some immediate consequences:
  Transmitter must buffer packets for possible retransmission
  Feedback channel needs bandwidth as well
  Even for very few bit errors the whole packet is retransmitted
ARQ protocols differ:
  in the number of allowed outstanding frames / unacknowledged frames
  in the buffering requirements at receiver / transmitter
  in the way feedback is provided (positive / negative acknowledgement frames, timers)
  in their maximum throughput under error conditions
We discuss basic ARQ schemes, lots of variations!!

Assumptions

In the following we assume one transmitter, one receiver and a channel in between
  Data flows from transmitter to receiver
  Transmitter always has a new message available
Packets are required to be delivered to the higher layers of the receiver:
  Reliably (i.e. at least once)
  Without duplication (at most once)
  In-sequence
Nomenclature:
  Message: refers to the higher-layer data
  Packet: refers to the packet sent by the error-control protocol, includes the message data and header information

Assumptions (2)

Packet header includes
  Source address
  Destination address
  Error-control information (depends on protocol)
Packet also contains a checksum, which is perfect
Upon receiving a packet the receiver:
  verifies the checksum and drops the packet silently in case of failure
  checks the destination address and drops the packet when it is not destined to this receiver (address filtering)
ARQ protocols also use acknowledgement packets, called ACKs, which include:
  Source address
  Destination address
  Error-control information (depends on protocol)

Reliable and Semi-Reliable Delivery

Truly reliable delivery requires a potentially unbounded number of retransmissions
Protocols restricting the number of retransmissions are called semi-reliable
Almost all practical link- or transport-layer protocols are semi-reliable

Outline

Introduction
Error-Detecting Codes, Checksums
Error-Correcting Codes
ARQ Methods
  Alternating Bit Protocol
  Goback-N
  Selective Repeat

Alternating-Bit Protocol

The alternating bit protocol (ABP) [2] is the simplest of the serious ARQ protocols
  It is also often referred to as send-and-wait
Properties:
  guarantees in-sequence delivery if round-trip time is bounded and timeout is chosen appropriately
  simple to implement
  requires one buffer at transmitter and one buffer at receiver
  is reasonably efficient over links with propagation delays smaller than the packet transmission time
The packet header contains one extra field:
  seqno, having a width of one bit

ABP: Transmitter

[Figure: ABP transmitter]

ABP: Transmitter Comments

next is the transmitter's view of the current sequence number
tx-request is issued by higher layers to request message transmission
  tx-request is accepted when the transmitter is idle
  tx-request is declined when the transmitter is busy
confirm primitives are sent to higher-layer protocols
Timeout value must be large enough to accommodate:
  propagation delay (twice)
  transmission time for ACK packet
  processing times
Finding good timeouts is easy for a single hop and hard for a multi-hop network
  Why?
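
The interplay of transmitter, receiver and timeouts can be illustrated with a small deterministic Python simulation. This is a sketch under assumptions, not the state machines from the slides: the receiver is assumed to acknowledge every correctly received packet (including duplicates), losses are scripted through the hypothetical parameters lost_data and lost_acks, and a timeout is modelled simply as "try again".

```python
def run_abp(messages, lost_data=(), lost_acks=()):
    """Simulate ABP over a lossy channel.

    lost_data / lost_acks contain the 0-based indices of the data packet
    transmissions / acknowledgements that the channel drops.
    Returns (messages delivered to the higher layer, data transmissions).
    """
    delivered = []
    expected = 0       # receiver's expected sequence number
    next_seqno = 0     # transmitter's current sequence number
    attempts = 0       # counts data packet transmissions
    acks_sent = 0      # counts acknowledgements sent by the receiver
    for message in messages:
        while True:
            data_lost = attempts in lost_data
            attempts += 1
            if data_lost:
                continue               # timeout, retransmit
            if next_seqno == expected:
                delivered.append(message)
                expected ^= 1          # flip the alternating bit
            # otherwise: duplicate, discarded but still acknowledged
            ack_lost = acks_sent in lost_acks
            acks_sent += 1
            if ack_lost:
                continue               # timeout, retransmit
            next_seqno ^= 1
            break
    return delivered, attempts


result = run_abp(["a", "b", "c"], lost_data={1}, lost_acks={0})
```

In this run the first acknowledgement and the first retransmission are lost, so message "a" is sent three times but delivered only once.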

ABP: Receiver

[Figure: ABP receiver]

ABP: First Example

[Figure: ABP first example]

ABP: Second Example

[Figure: ABP second example]

ABP: Properties

Consider a packet of 1000 bits length, sent without gaps
  Medium bitrate is 100 Mbps
  Speed of light is 200,000 km/s in a cable
  Conclusion: packet has a geographical length of 2000 m
Assume a link between Berlin/Germany and Christchurch has 22,000 km length
  Conclusion: 11,000 packets could be transmitted back to back before the first one arrives in Berlin, 22,000 before the first feedback returns to Christchurch
But: ABP allows only one packet at a time in transit

Important Point
ABP is inefficient over long fat pipes, i.e. links with a large bandwidth-delay product!!
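
The numbers of this example can be reproduced with a short Python calculation. The utilization estimate at the end is an added simplification: it ignores the ACK transmission time and all processing times.

```python
PACKET_BITS = 1000
BITRATE_BPS = 100e6          # 100 Mbps
SIGNAL_SPEED_MPS = 200_000e3 # 200,000 km/s in a cable
LINK_LENGTH_M = 22_000e3     # 22,000 km

t_tx = PACKET_BITS / BITRATE_BPS            # packet transmission time in s
packet_length_m = t_tx * SIGNAL_SPEED_MPS   # geographical packet length in m
t_prop = LINK_LENGTH_M / SIGNAL_SPEED_MPS   # one-way propagation delay in s

packets_one_way = t_prop / t_tx             # packets fitting on the link
packets_round_trip = 2 * t_prop / t_tx      # packets sent before feedback returns

# ABP sends one packet, then idles for a full round trip
abp_utilization = t_tx / (t_tx + 2 * t_prop)
```

With these values the link is busy for roughly 10 microseconds out of every 220 milliseconds, a utilization well below 0.01 percent.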

ABP: Properties (2)

If the propagation delay is bounded and the timeout value is appropriately chosen, then:
  ABP guarantees delivery
  ABP delivers in-sequence
  ABP avoids duplicate data delivery at receiver
ABP does not work correctly if the network between transmitter and receiver may store packets (including acks) for arbitrarily long times
  Question: Can you give example scenarios?

Goback-N protocol

Goback-N combats the inefficiency of ABP by allowing N outstanding frames
  N is also called window size
  outstanding = not yet acknowledged
GB-N can be regarded as a generalization of ABP; stated differently: ABP is actually a GB-1
TCP originally used a variant of GB-N, recent versions have selective acknowledgements
Each packet is equipped with a sequence number, taken from a finite sequence number space
  Sequence numbers go from 0 to max_seqno (included)

GBN: Operation

The transmitter has a buffer for N packets; as long as this buffer is not full, it accepts packets from its higher layers, appends them to the buffer and transmits them asap
Transmitter maintains variable next_seqno, initialized to zero
Receiver maintains variable expected_seqno, initialized to zero
If the receiver receives a packet with a seqno equal to expected_seqno:
  expected_seqno is incremented modulo max_seqno + 1
  The packet is delivered to higher layers
  An acknowledgement is sent with expected_seqno

GBN: Operation (2)

If the receiver receives a packet with a different seqno:
  The packet is discarded
  An acknowledgement is sent with expected_seqno, or no ack is sent at all (to let the transmitter time out)
If the transmitter gets a new packet from the application and the window is not full:
  The packet is equipped with the sequence number next_seqno
  next_seqno is incremented modulo max_seqno + 1
  The packet is transmitted asap (after any other not yet transmitted packet, that is)
  A timer is started for this packet after its transmission

GBN: Operation (3)

If the transmitter receives an ack packet with sequence number ack_seqno:
  All packets in the window with sequence number smaller than ack_seqno are regarded as acknowledged and are removed from the window, opening it for further packets
  The corresponding timers are canceled
If the transmitter gets a timeout for the oldest packet in the window:
  All pending timers are canceled
  This packet and all subsequent packets are retransmitted
  For each retransmitted packet a new timer is started immediately after its transmission
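
The Goback-N rules above can be sketched as two small Python classes. This is an illustration under assumptions: timers and the channel are left out (a timeout is triggered by calling on_timeout), the receiver always answers with an acknowledgement carrying expected_seqno, and "smaller than ack_seqno" is interpreted modulo the sequence number space. The class and method names are made up for this sketch.

```python
from collections import deque


class GbnReceiver:
    def __init__(self, max_seqno):
        self.modulus = max_seqno + 1
        self.expected_seqno = 0
        self.delivered = []

    def on_packet(self, seqno, payload):
        """Deliver in-sequence packets, discard others; return the ack seqno."""
        if seqno == self.expected_seqno:
            self.delivered.append(payload)
            self.expected_seqno = (self.expected_seqno + 1) % self.modulus
        return self.expected_seqno


class GbnSender:
    def __init__(self, window_size, max_seqno):
        assert max_seqno + 1 > window_size  # sequence space larger than N
        self.window_size = window_size
        self.modulus = max_seqno + 1
        self.next_seqno = 0
        self.window = deque()  # outstanding (seqno, payload), oldest first

    def send(self, payload):
        """Return the packet to transmit, or None if the window is full."""
        if len(self.window) >= self.window_size:
            return None
        packet = (self.next_seqno, payload)
        self.window.append(packet)
        self.next_seqno = (self.next_seqno + 1) % self.modulus
        return packet

    def on_ack(self, ack_seqno):
        """Remove all packets before ack_seqno from the window."""
        if not self.window:
            return
        oldest = self.window[0][0]
        n_acked = (ack_seqno - oldest) % self.modulus
        if n_acked <= len(self.window):  # otherwise a stale ack, ignore it
            for _ in range(n_acked):
                self.window.popleft()

    def on_timeout(self):
        """Timeout of the oldest packet: retransmit the whole window."""
        return list(self.window)
```

In a run where one packet is lost, the receiver discards every later packet until the sender times out and resends the whole window.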

GBN: Example

[Figure: GBN example]

Size of Sequence Number Space

The sequence number space must be larger than the window size N
To see this, we use a counterexample where the size of the sequence number space is just N:
  Assume the receiver's seqno is expected_seqno=0
  Assume the transmitter sends N packets with seqnos from 0 to max_seqno
  The receiver receives all of them and sends acknowledgements; at the end of this its seqno is again expected_seqno=0
  All acks are lost
  The sender times out and sends all packets again
  The receiver accepts all packets as new packets, since they have the expected sequence numbers, resulting in duplicates!
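
The counterexample can be replayed with a few lines of Python. The function below models only the receiver's acceptance test; its name and parameters are made up for this sketch.

```python
def deliver_with_all_acks_lost(window_size, seq_space_size):
    """Send one full window, lose all acks, retransmit the window once.

    Returns the payloads the Goback-N receiver hands to its higher layer.
    """
    expected_seqno = 0
    delivered = []
    packets = [(i % seq_space_size, f"msg{i}") for i in range(window_size)]
    for _ in range(2):  # original transmission, then retransmission
        for seqno, payload in packets:
            if seqno == expected_seqno:
                delivered.append(payload)
                expected_seqno = (expected_seqno + 1) % seq_space_size
    return delivered


too_small = deliver_with_all_acks_lost(window_size=3, seq_space_size=3)
large_enough = deliver_with_all_acks_lost(window_size=3, seq_space_size=4)
```

With a sequence number space of size N the receiver delivers every message twice; with size N + 1 the retransmissions no longer match expected_seqno and are discarded.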

GBN: Properties

GB-N can fill up a long fat pipe if N is chosen large enough
The transmitter needs N buffers, the receiver only a single buffer to accept an incoming packet
If the network provides an upper bound on packet delays, the protocol ensures:
  in-sequence delivery
  reliable delivery
  no duplicates at the receiver
If a packet fails, this packet and all subsequent packets are retransmitted, even if the latter were correctly received

Important Point
If the packet error rate (PER) is small, Goback-N is reasonably efficient, but for higher PERs the protocol retransmits many correctly received packets and becomes inefficient

GBN: Properties (2)

With no bound on packet delays, correctness is harmed by:
  old data packets when their seqno equals the receiver's expected_seqno
  old ack packets when their seqno fits into the transmitter's current window

Remark
Protocols running over a large network use a large sequence number space: TCP has 32-bit seqnos, and IP additionally has a mechanism to kill packets that are too old (TTL field)

Selective Repeat Protocol

Selective-Repeat (SR) works similarly to Goback-N, but there are differences:
  Receiver also has N buffers and maintains a sliding window, giving the range of currently accepted sequence numbers
  The receiver buffers an out-of-order packet, given that its sequence number fits into the current window, but does not yet deliver it to higher layers
  A packet arriving at the lower end of the sliding window is directly delivered to the higher layers along with all buffered packets immediately following the received packet; after this the receiver advances its window accordingly, shifting out the packets just delivered and creating space for new packets

Selective Repeat Protocol (2)

Some acknowledgement options:
  Receiver sends individual positive acknowledgements
  Receiver sends negative acknowledgements (NACK) only, indicating missing frames
  Receiver regularly sends a bitmap indicating the reception status of its window elements
Upon timer expiration or reception of a NACK the transmitter retransmits only the indicated frame
The sequence number space must have a size of at least 2N
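
The receiver side of Selective Repeat can be sketched in Python as follows. This illustration assumes the first acknowledgement option (an individual positive ack for every correctly received packet, including old duplicates); the class name and attributes are made up for this sketch.

```python
class SrReceiver:
    def __init__(self, window_size, seq_space_size):
        assert seq_space_size >= 2 * window_size
        self.window_size = window_size
        self.seq_space_size = seq_space_size
        self.base = 0        # lower end of the sliding window
        self.buffer = {}     # seqno -> payload for out-of-order packets
        self.delivered = []

    def on_packet(self, seqno, payload):
        """Buffer or deliver the packet; return the seqno to acknowledge."""
        offset = (seqno - self.base) % self.seq_space_size
        if offset < self.window_size:
            self.buffer[seqno] = payload
            # Deliver the packet at the lower window end together with all
            # buffered packets that immediately follow it
            while self.base in self.buffer:
                self.delivered.append(self.buffer.pop(self.base))
                self.base = (self.base + 1) % self.seq_space_size
        # Packets outside the window are old duplicates: not buffered
        return seqno


rx = SrReceiver(window_size=2, seq_space_size=4)
rx.on_packet(1, "b")   # out of order, buffered only
rx.on_packet(0, "a")   # fills the gap, both packets are delivered
rx.on_packet(0, "a")   # retransmitted duplicate, ignored
```

After these three calls the higher layer has received "a" and "b" exactly once, in sequence.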

Introduction

Error-Detecting Codes, Checksums

Error-Correcting Codes

ARQ Methods

Bibliography

Acknowledgement Variations
Positive acknowledgements:
Receiver receives packet with seqno s
Receiver sends ACK packet with seqno s
Semantics: "I have successfully received this packet s"
Modified semantics for cumulative positive acknowledgements: "I have successfully received this packet s and all previous packets"
Negative acknowledgement (example):
Receiver receives packet with seqno s, the previously received packet had seqno s − 2
Receiver sends NACK packet with seqno s − 1
Semantics: "I have not received packet s − 1"
More generally, a NACK is issued when the receiver notices that some packet has not been successfully received
Can you imagine other methods to detect a failure when there is no gap in the sequence numbers?
NACKs can also be coupled with cumulative ACKs
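
The gap-based NACK rule can be illustrated as follows. This is a sketch under simplifying assumptions: sequence numbers are unbounded (no wrap-around), the receiver sends an individual ACK per packet, and it issues a NACK for every sequence number it skipped over. The function name `feedback_for` is hypothetical.

```python
def feedback_for(received_seqnos):
    """Feedback packets a receiver would emit for a stream of data
    packets: one ACK per received packet, one NACK per detected gap."""
    feedback = []
    expected = 0                       # next seqno the receiver hopes to see
    for s in received_seqnos:
        # Every seqno between the expected one and s was skipped.
        for missing in range(expected, s):
            feedback.append(("NACK", missing))
        feedback.append(("ACK", s))
        expected = max(expected, s + 1)
    return feedback


# Packet 2 is lost: the receiver sees 3 directly after 1.
result = feedback_for([0, 1, 3])
```

Note that this rule only detects a loss once a later packet arrives, which is one reason to think about other detection methods, as the question on the slide suggests.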


Discussion: FEC and ARQ


FEC does not require a feedback channel, ARQ does
Open-loop FEC has:
constant overhead even without errors on the channel
constant throughput (if all errors can be corrected)
constant delay
variable residual error rate
ARQ has:
variable overhead: retransmissions occur only in case of
errors
variable delays (due to retransmissions)
very low residual error rate (determined by CRC quality and
allowed number of retransmissions)

Bibliography

Haowei Bai and Mohammed Atiquzzaman.
Error modeling schemes for fading channels in wireless communications: A survey.
IEEE Communications Surveys and Tutorials, 5(2):2–9, 2003.
http://www.comsoc.org/livepubs/surveys.

K.A. Bartlett, R.A. Scantlebury, and P.T. Wilkinson.
A note on reliable full-duplex transmission over half duplex lines.
Communications of the ACM, 12(5):260ff, 1969.

D. Bertsekas and R. Gallager.
Data Networks.
Prentice Hall, Englewood Cliffs, New Jersey, 1987.

Ezio Biglieri.
Coding for Wireless Channels.
Springer, New York, 2005.

Leon Cohen.
The history of noise on the 100th anniversary of its birth.
IEEE Signal Processing Magazine, 22(11):20–26, November 2005.

Thomas M. Cover and Joy A. Thomas.
Elements of Information Theory.
John Wiley & Sons, New York, second edition, 2006.

Victor DeBrunner, Linda DeBrunner, Longji Wang, and Sridhar Radhakrishnan.
Error control and concealment for image transmission.
IEEE Communications Surveys and Tutorials, 3(1), 2000.
http://www.comsoc.org/livepubs/surveys.

E. O. Elliot.
Estimates of error rates for codes on burst-noise channels.
Bell Systems Technical Journal, 42:1977–1997, September 1963.

David C. Feldmeier.
Fast Software Implementation of Error Detection Codes.
IEEE/ACM Transactions on Networking, 6(6):640–651, December 1995.

E. N. Gilbert.
Capacity of a burst-noise channel.
Bell Systems Technical Journal, 39:1253–1265, September 1960.

David Haccoun and Samuel Pierre.
Automatic repeat request.
In Jerry D. Gibson, editor, The Communications Handbook, pages 181–198. CRC Press / IEEE Press, Boca Raton, Florida, 1996.

Samir Kallel.
Efficient hybrid ARQ protocols with adaptive forward error correction.
IEEE Transactions on Communications, 42(2):281–289, February 1994.

Shu Lin and Daniel J. Costello.
Error Control Coding.
Prentice-Hall, Englewood Cliffs, New Jersey, second edition, 2004.

Shu Lin, Daniel J. Costello, and Michael J. Miller.
Automatic-Repeat-Request Error-Control Schemes.
IEEE Communications Magazine, 22(12):5–17, December 1984.

Hang Liu, Hairuo Ma, Magda El Zarki, and Sanjay Gupta.
Error control schemes for networks: An overview.
MONET Mobile Networks and Applications, 2(2):167–182, 1997.

David J. C. MacKay.
Information Theory, Inference, and Learning Algorithms.
Cambridge University Press, Cambridge, UK, 2003.

Arnold M. Michelson and Allen H. Levesque.
Error-Control Techniques for Digital Communication.
John Wiley and Sons, New York, 1985.

Robert H. Morelos-Zaragoza.
The Art of Error Correcting Coding.
John Wiley & Sons, Chichester, UK, second edition, 2004.

Tenkasi V. Ramabadran and Sunil S. Gaitonde.
A Tutorial on CRC Computations.
IEEE Micro, 8(4):62–75, August 1988.

Tom Richardson and Ruediger Urbanke.
Modern Coding Theory.
Cambridge University Press, Cambridge, Massachusetts, 2008.

Dilip V. Sarwate.
Computation of Cyclic Redundancy Checks via Table Look-Up.
Communications of the ACM, 31(8):1008–1013, August 1988.

Claude E. Shannon.
A mathematical theory of communication.
Bell Systems Technical Journal, 27:379–423, 623–656, July, October 1948.

Bernard Sklar.
A primer on turbo code concepts.
IEEE Communications Magazine, 35(12):94–102, December 1997.

William Stallings.
Data and Computer Communications.
Prentice Hall, Englewood Cliffs, New Jersey, fourth edition, 2006.

Jonathan Stone, Michael Greenwald, Craig Partridge, and James Hughes.
Performance of checksums and CRCs over real data.
IEEE/ACM Transactions on Networking, 6(5):529–543, 1998.

Andrew S. Tanenbaum.
Computer Networks.
Prentice-Hall, Englewood Cliffs, New Jersey, third edition, 1997.

H.S. Wang and N. Moayeri.
Finite State Markov Channel - A Useful Model for Radio Communication Channels.
IEEE Transactions on Vehicular Technology, 44(1):163–171, February 1995.

Yao Wang and Qin-Fan Zhu.
Error Control and Concealment for Video Communication: A Review.
Proceedings of the IEEE, 86(5):974–997, May 1998.

Introduction Shortest-Path Algorithms Distance-Vector Protocols Link-State Protocols Discussion and Further Topics Bibliograp

Data Communications and Networking
COSC 264
Introduction to Routing

Dr. Andreas Willig (1)
Dr. Muhammad Asad Arfeen (2)
(1) Dept. of Computer Science and Software Engineering, University of Canterbury, Christchurch
(2) Dept. of Computer and Information Systems Engineering, NED University of Engineering & Technology, Karachi

UoC, 2014


Outline
Introduction
Fundamentals
Some Unusual Routing Protocols
Shortest-Path Algorithms
Bellman-Ford Algorithm
Dijkstra Algorithm
Distance-Vector Protocols
Protocol Operation
Problems
Link-State Protocols
Protocol Operation
Discussion and Further Topics
Discussion of DV and LS Protocols
Further Topics


On this Module

Goals:
Understand the distinction between routing algorithms and protocols
Understand distance-vector (DV) and link-state (LS) routing protocols and their issues
We explain the basic concepts of DV and LS routing, not any particular routing protocol built on these paradigms
This module is mainly based on [6], some parts also on [7, Chap. 12] and [5]


An Example Network
We model a network as a graph $G = (V, E)$
For simplicity, we define $V = \{1, 2, \ldots, N\}$
Vertices $v \in V$ correspond to stations / routers
$E \subseteq V \times V$
Edges $e \in E$ correspond to direct links, we write $(i, j)$ or $i \to j$ for a link between $i \in V$ and $j \in V$
Edges are labeled with non-negative numbers


An Example Network (2)

Edge labels (called metrics) can be interpreted differently
Interpretation as costs, e.g.:
Delay
Monetary transmission costs
Geographical distance
Please note that cost is a more generic term than any of the mentioned examples
Interpretation as available resources, e.g.:
Number of available phone trunks
Currently available capacity, given the set of flows that already use this link


Fundamental task of routing

Given a network $G = (V, E)$, a fixed source node $s \in V$ and a destination node $d \in V$
An m-hop path between source and destination is a sequence of edges $(i_0, i_1), (i_1, i_2), \ldots, (i_{m-1}, i_m)$, so that $i_0 = s$ and $i_m = d$
Alternative notation: $i_0 \to i_1 \to \ldots \to i_m$

Major task
For each source-destination pair in a network identify one or
more paths that are optimal (or at least of reasonable quality) in
some pre-defined sense.


Shortest-Path Routing

Goal
The goal of shortest-path routing algorithms is to select the
path that has the smallest total cost. To compute the total cost
of a path the costs of all its links are added.
Routing in the Internet uses shortest-path routing
The details are more complicated, though . . .
Special case: minimum-hop routing is obtained when all link costs are the same (e.g. = 1)
Shortest-path routing is also known as least-cost routing
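
The additive path cost can be written down directly. The sketch below assumes a network stored as a dictionary from directed links `(i, j)` to their cost. The link costs are hypothetical example values, and `path_cost` is an illustrative helper name.

```python
INF = float("inf")


def path_cost(cost, path):
    """Total cost of a path given as a node sequence i0, i1, ..., im.

    cost maps a directed link (i, j) to its metric; a missing link
    counts as infinitely expensive, so an invalid path costs infinity.
    """
    total = 0
    for i, j in zip(path, path[1:]):
        total += cost.get((i, j), INF)
    return total


# Hypothetical link costs for a small network.
cost = {(1, 2): 1, (2, 3): 2, (3, 6): 1, (1, 3): 5}

via_2 = path_cost(cost, [1, 2, 3, 6])   # 1 + 2 + 1
direct = path_cost(cost, [1, 3, 6])     # 5 + 1

# Minimum-hop routing: with all link costs equal to 1 the path cost
# is simply the number of hops.
unit = {link: 1 for link in cost}
hops = path_cost(unit, [1, 2, 3, 6])
```

With these example costs the three-hop path is cheaper than the two-hop path, which shows that shortest-path and minimum-hop routing can disagree unless all link costs are equal.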


Widest-Path Routing

Some networks judge the cost or value of a path based on a non-additive property
Example: dynamic call routing in POTS:
Assume that for each network link we maintain the residual capacity available on this link
The residual capacity of a path is defined as the minimum residual capacity of all the links on it
The goal is then to find the path with the maximum residual capacity, since routing a new phone call on this path is least likely to create a bottleneck for future calls
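
One possible way to compute such a path is a variant of Dijkstra's algorithm in which the path value is the minimum link capacity instead of the sum of link costs. The slides do not prescribe a particular algorithm, so the following is a sketch only. The function name `widest_path` and the example capacities are hypothetical.

```python
import heapq


def widest_path(capacity, source, dest):
    """Return (bottleneck, path) maximizing the minimum residual capacity.

    capacity maps each node to a dict {neighbor: residual capacity}.
    Returns (0, []) when dest cannot be reached.
    """
    best = {source: float("inf")}     # best known bottleneck per node
    pred = {source: None}
    heap = [(-best[source], source)]  # max-heap emulated with negated widths
    done = set()
    while heap:
        neg_width, u = heapq.heappop(heap)
        if u in done:
            continue                  # stale heap entry
        done.add(u)
        if u == dest:
            break
        for v, cap in capacity.get(u, {}).items():
            width = min(-neg_width, cap)   # bottleneck of the path via u
            if width > best.get(v, 0):
                best[v] = width
                pred[v] = u
                heapq.heappush(heap, (-width, v))
    if dest not in best:
        return 0, []
    path, node = [], dest
    while node is not None:           # walk predecessors back to the source
        path.append(node)
        node = pred[node]
    return best[dest], path[::-1]


# Hypothetical residual capacities (e.g. free phone trunks per link).
capacity = {"A": {"B": 5, "C": 3}, "B": {"D": 2}, "C": {"D": 3}, "D": {}}
width, path = widest_path(capacity, "A", "D")
```

In this example the path through B starts with more free capacity but is limited by the link B-D, so the path through C is the wider one.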


Judgement Criteria
Effectiveness: when a route between two nodes exists, the routing algorithm / protocol should be able to find it
Correctness: computed routes should be valid paths that contain no cycles (i.e. are loop-free)
Simplicity: routing algorithms / protocols should be computationally simple and require only little information exchange among routers
Robustness: a routing protocol must be able to cope with:
link or station failures
newly established links or stations
changes in link metrics
congestion situations
by establishing new routes when old ones become infeasible or are no longer optimal


Judgement Criteria (2)

Stability: a routing protocol should not recompute everything upon minor changes in the network
There is a tension between stability and robustness!
Fairness: all users should be treated in the same way
Optimality: different criteria, depending on perspective:
Provider perspective: the network should carry as many connections / packet flows as possible (maximizes revenue)
User perspective: generated routes should be short, fast and offer good throughput


Classification Criteria for Routing Protocols


Decision time (i.e. when routing decisions are made):
For each packet (timescale of ms and below)
For each session / call (timescale of seconds to minutes)
At network configuration time (timescale of months / years)
Decision place:
Originating node / source (Source routing)
Central node (centralized routing)
Each node (distributed routing)
Information used in decision:
None
Local information (to a station)
Information from adjacent nodes (and local information)
Information from all nodes along a route
Information from all nodes in the network


Classification Criteria for Routing Protocols (2)


Information update frequency:
Periodic
Upon major load change or topology change
Both
Information update initiation:
Push: a node transmits information updates to its neighbors
on its own initiative
Pull: a node asks its neighbors (using a request-response exchange) for new information updates
Information exchange channel:
In-band: routing information is exchanged on the same network / channel as user data
Example: in the Internet, routing protocols run on top of IP
Out-of-band: routing information is transmitted on a separate network, with no resource sharing at all with user data
Example: in the POTS, routing messages are exchanged over a separate network using the SS7 protocol


Routing Algorithms and Routing Protocols

A routing algorithm solves the routing problem in a centralized fashion, assuming full network information is available
A routing protocol embeds a routing algorithm into a real networking context:
It operates in a distributed environment
It incorporates explicit information exchange among nodes
Information exchange takes time and might fail, the protocol must consider these possibilities
A routing protocol mainly specifies which information is exchanged between stations (and when), it is not necessarily tied to any specific routing algorithm


Forwarding Table

A forwarding table within a router maps each destination address to either:
an outgoing interface (next-hop routing)
a full route to the destination, which is then added to a packet (source routing) and obeyed by all nodes on the path
Routing in the Internet uses next-hop routing
The forwarding table:
results from the execution of the routing protocol (dynamic routing), or can be static / preconfigured (static routing)
is changed on relatively large timescales, e.g. upon topology changes, load changes or changes in metrics
is consulted for every packet
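
The split between rare table updates and per-packet lookups can be illustrated with a minimal next-hop forwarding table. The class name `Router`, its methods and the destination and interface names are all hypothetical. Real routers match address prefixes rather than exact destination names.

```python
class Router:
    """Next-hop forwarding: destination address -> outgoing interface."""

    def __init__(self, static_routes=None):
        # Static routing: entries preconfigured by an administrator.
        self.table = dict(static_routes or {})

    def install(self, dest, interface):
        """Dynamic routing: the routing protocol changes an entry (rare)."""
        self.table[dest] = interface

    def forward(self, dest):
        """Per-packet lookup (frequent); None means there is no route."""
        return self.table.get(dest)


router = Router({"d1": "if0"})   # one preconfigured route
router.install("d2", "if1")      # one route learned from a routing protocol
```

The lookup in `forward` is deliberately trivial: all route computation happens when the table is changed, not when a packet is forwarded.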


Hot-Potato Routing

Rule: a router transmits a packet on the output link currently having the shortest queue / highest capacity / . . .
Benefits:
computationally simple
uses local information
no exchange of routing information required
Drawbacks:
No guarantees at all can be given


Randomized Routing

Rule: a router transmits a packet on a randomly chosen outgoing interface
Benefits:
Computationally simple
no exchange of routing information required
In a finite connected network this algorithm is guaranteed to hit the destination node with probability one in the absence of link errors, congestion, etc.
Drawbacks:
Actual paths taken can be very long
Delays can be very long
Many hops give many opportunities for losing the packet


Flooding
Rule: a router transmits a packet on all interfaces, except the one it was received on
A packet can arrive multiple times at a destination or router
Router uses a unique packet identifier (e.g. source address and seqno) to avoid delivering duplicate data
Routers need these identifiers to avoid forwarding a packet more than once (i.e. avoiding self-amplifying explosion)
Benefits:
simplicity, robustness, stability, . . .
requires no routing computations, no information exchange
Drawbacks:
Extreme waste of resources
Security issue: all stations in the network get the packet
Flooding can be an interesting option in networks where data is transmitted only very rarely
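
Flooding with duplicate suppression can be simulated in a few lines. The sketch follows a single packet, so the per-router record of packet identifiers collapses to the set `seen` of routers that already handled it. The function name `flood` and the topology are hypothetical.

```python
from collections import deque


def flood(links, source):
    """Simulate flooding of one packet.

    links maps each node to the list of its neighbors.
    Returns (set of nodes reached, number of link transmissions).
    """
    seen = {source}                      # routers that already handled the packet
    transmissions = 0
    queue = deque((source, nbr) for nbr in links[source])
    while queue:
        sender, node = queue.popleft()
        transmissions += 1               # one copy crosses the link sender -> node
        if node in seen:
            continue                     # known identifier: do not forward again
        seen.add(node)
        for nbr in links[node]:
            if nbr != sender:            # all interfaces except the incoming one
                queue.append((node, nbr))
    return seen, transmissions


# Hypothetical network with five links: 1-2, 1-3, 2-3, 2-4, 3-4.
links = {1: [2, 3], 2: [1, 3, 4], 3: [1, 2, 4], 4: [2, 3]}
reached, transmissions = flood(links, source=1)
```

Every node is reached, but the packet is transmitted more often than there are links, which illustrates the waste of resources named among the drawbacks.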


Introduction

We discuss two shortest-path routing algorithms:
Bellman-Ford
Dijkstra
There are other algorithms available, e.g. Floyd-Warshall
See also [4]
Both algorithms are centralized, i.e. they require that the full information about network topology (nodes, edges and metrics) is available to the algorithm
Both algorithms play a prominent role in Internet routing


Some Notations
Given a network $G = (V, E)$ with $N = |V|$ stations
$i \in V$ and $j \in V$ refer to some generic nodes / stations in the network
$d_{i,j}$ is the direct link cost / metric between i and j, with:
$0 \le d_{i,j} < \infty$ when i and j are adjacent nodes
$d_{i,j} = \infty$ when i and j are non-adjacent nodes
$\hat{D}_{i,j}$ represents the total cost of the minimum cost path from i to j in the Bellman-Ford algorithm, over one or multiple hops, according to i's current knowledge
$\bar{D}_{i,j}$ represents the same thing for Dijkstra's algorithm
$N_i$ represents the set of nodes adjacent to node i, i.e. $N_i = \{k \in V : (i, k) \in E\}$


Some Notations (2)

In this example we have $d_{4,6} = 15$, but $\hat{D}_{4,6} = \bar{D}_{4,6} = 2$ (by choosing the path $4 \to 3 \to 6$)
Furthermore, $d_{1,6} = \infty$, but $\hat{D}_{1,6} = 3$ (by choosing the path $1 \to 4 \to 3 \to 6$)
We have $N_5 = \{3, 4, 6\}$


Basic Idea
B-F follows the dynamic programming principle [1], [2]
We want to find a shortest route from $s \in V$ to $d \in V$
The following equations must be satisfied:

$$\hat{D}_{s,s} = 0$$
$$\hat{D}_{s,d} = \min_{k \in N_d} \left( \hat{D}_{s,k} + d_{k,d} \right), \quad \text{for } s \neq d$$

Explanation: suppose node s already knows its least costs $\hat{D}_{s,k}$ to the neighbors $k \in N_d$ of d, then s's least cost to d is the minimum over all neighbors k of the costs $\hat{D}_{s,k}$ plus the direct costs $d_{k,d}$
But: how is $\hat{D}_{s,k}$ known and how to turn this into an algorithm?


Approach

In the Bellman-Ford algorithm the costs between a source node s and all destination nodes d are computed iteratively
The iterations are over the number of hops that packets can take
Change in notation: $\hat{D}^{(h)}_{s,d}$ gives the smallest cost among all paths between s and d that have at most h hops


A Simplified Bellman-Ford Algorithm

// Computes for a fixed node s the distances and the routing
// tree to all other nodes. Graph can be directed or undirected
// D(h)[s,d] denotes $\hat{D}^{(h)}_{s,d}$, d[k,d] denotes $d_{k,d}$
// initialization
D(0)[s,s] = 0; pred[s] = s;
for all d with d ≠ s do
    D(0)[s,d] = ∞; pred[d] = NULL;
// loop over all numbers of hops
for h = 1 ... N do:
    D(h)[s,s] = 0
    for all d ≠ s do:
        D(h)[s,d] = min over k ∈ N_d of ( D(h-1)[s,k] + d[k,d] )
        pred[d]   = argmin over k ∈ N_d of ( D(h-1)[s,k] + d[k,d] )


A Second Version

// Computes for a fixed node s the distances and the routing
// tree to all other nodes. Graph is assumed to be directed
// D[s,d] denotes $\hat{D}_{s,d}$, d[v,w] denotes $d_{v,w}$
// initialization
D[s,s] = 0; pred[s] = s;
for all d with d ≠ s do
    D[s,d] = ∞; pred[d] = NULL;
// loop over all numbers of hops
for h = 1 to N − 1 do:
    foreach (v, w) ∈ E do
        when D[s,v] + d[v,w] < D[s,w] then
            D[s,w] = D[s,v] + d[v,w];
            pred[w] = v
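
The second version translates almost line by line into Python. The sketch below is an illustration, not the authors' implementation: it adds an early exit that the pseudocode does not have, and a helper `path_from_pred` that walks the `pred` array backwards. The example graph is a hypothetical four-node excerpt, with link costs chosen to agree with the values quoted on the notation slide ($d_{4,6} = 15$, path $1 \to 4 \to 3 \to 6$ of cost 3).

```python
INF = float("inf")


def bellman_ford(nodes, edges, s):
    """Edge-relaxation Bellman-Ford for source s.

    nodes: iterable of node ids.
    edges: dict mapping a directed link (v, w) to its cost.
    Returns (dist, pred) as dictionaries keyed by node id.
    """
    dist = {d: INF for d in nodes}
    pred = {d: None for d in nodes}
    dist[s], pred[s] = 0, s
    for _ in range(len(dist) - 1):           # h = 1 .. N-1
        changed = False
        for (v, w), cost in edges.items():
            if dist[v] + cost < dist[w]:
                dist[w] = dist[v] + cost
                pred[w] = v
                changed = True
        if not changed:                      # early exit, not in the pseudocode
            break
    return dist, pred


def path_from_pred(pred, s, d):
    """Walk pred backwards from d to s; returns [] if d is unreachable."""
    if pred[d] is None:
        return []
    path = [d]
    while path[-1] != s:
        path.append(pred[path[-1]])
    return path[::-1]


# Hypothetical undirected links, stored as two directed edges each.
links = {(1, 4): 1, (4, 3): 1, (3, 6): 1, (4, 6): 15}
edges = {**links, **{(w, v): c for (v, w), c in links.items()}}
dist, pred = bellman_ford({1, 3, 4, 6}, edges, s=1)
route = path_from_pred(pred, 1, 6)
```

The `pred` entries form a tree rooted at s, so following them from any destination back to s yields the least-cost route in reverse order.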


Comments

The runtime of the algorithm for a single source is $O(|V| \cdot |E|)$, for a whole network it becomes $O(|V|^2 \cdot |E|)$
The algorithm shown is a simplified version of Bellman-Ford that can handle only non-negative weights
The algorithm can be extended to handle negative weights as well, as long as no negative cycles are contained
It is a centralized algorithm, complete network information ($d_{i,j}$) must be available at execution time

Question
What is pred good for? What can you read off from it?


Basic Idea
Dijkstra's algorithm is restricted to graphs with non-negative weights
It is greedy: in every situation it makes the choice that is currently the best, without regard to future situations
Here:
The algorithm maintains a list S of nodes that have not yet been considered
In each step it removes the node $k \in S$ to which the source s has the smallest known distance $\bar{D}_{s,k}$
For each neighbor x of k it is then checked if a path through k to x is shorter than the best so-far known path to x
Remember that nodes are numbered from 1 to N, i.e. $V = \{1, \ldots, N\}$


The Dijkstra Algorithm


// Computes for a fixed node s the distances and the routing
// tree to all other nodes. Graph is assumed to be directed
// D[s,d] denotes $\bar{D}_{s,d}$, d[k,j] denotes $d_{k,j}$
// initialization
S = V \ {s}
D[s,s] = 0; pred[s] = s
for all d ∈ S do
    D[s,d] = d[s,d]; pred[d] = NULL;
    when d[s,d] < ∞
        pred[d] = s
// main loop
while S ≠ ∅ do
    k = argmin over m ∈ S of D[s,m]
    S = S \ {k}
    for j ∈ N_k do
        when D[s,k] + d[k,j] < D[s,j]
            D[s,j] = D[s,k] + d[k,j]
            pred[j] = k


Comments

The worst-case runtime of Dijkstra's algorithm for a single node is $O(N^2)$, but can be better for sparse graphs
Dijkstra cannot handle negative metrics
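
For sparse graphs, a common way to improve on the $O(N^2)$ bound is to select the next node with a priority queue. The following sketch uses Python's `heapq` module for this and runs in $O((N + |E|) \log N)$. It is one possible implementation, not the one from the slides: instead of the set S of unconsidered nodes it keeps the complementary set `done`. The example graph is the same hypothetical four-node excerpt of the notation example.

```python
import heapq

INF = float("inf")


def dijkstra(adj, s):
    """Dijkstra's algorithm with a binary heap.

    adj maps every node to a dict {neighbor: non-negative link cost}.
    Returns (dist, pred) as dictionaries keyed by node id.
    """
    dist = {v: INF for v in adj}
    pred = {v: None for v in adj}
    dist[s], pred[s] = 0, s
    heap = [(0, s)]
    done = set()                       # complement of the set S on the slide
    while heap:
        d_k, k = heapq.heappop(heap)   # node with smallest known distance
        if k in done:
            continue                   # stale heap entry, already processed
        done.add(k)
        for j, cost in adj[k].items():
            if d_k + cost < dist[j]:   # path through k to j is shorter
                dist[j] = d_k + cost
                pred[j] = k
                heapq.heappush(heap, (dist[j], j))
    return dist, pred


# Hypothetical excerpt of the example network (undirected links).
adj = {
    1: {4: 1},
    3: {4: 1, 6: 1},
    4: {1: 1, 3: 1, 6: 15},
    6: {3: 1, 4: 15},
}
dist, pred = dijkstra(adj, s=4)
```

From node 4 the direct link to node 6 costs 15, but the algorithm finds the detour over node 3 with total cost 2.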


Introduction

The presentation here follows [6, Chap. 3]
We want to turn the centralized Bellman-Ford algorithm into a distributed protocol where nodes communicate only with adjacent nodes
For such a protocol we have to clarify:
What kind of messages do nodes send to their neighbors, what information is carried in them?
How often / when should this information be sent?


Basic Approach
Previously we stated that the costs for the least cost routes between source s and destination d satisfy:

$$\hat{D}_{s,s} = 0$$
$$\hat{D}_{s,d} = \min_{k \in N_d} \left( \hat{D}_{s,k} + d_{k,d} \right), \quad \text{for } s \neq d$$

Observe here that $\hat{D}_{s,k}$ refers to a neighbor k of the destination d
A similar relationship must also hold for neighbors of s:

$$\hat{D}_{s,s} = 0$$
$$\hat{D}_{s,d} = \min_{k \in N_s} \left( d_{s,k} + \hat{D}_{k,d} \right), \quad \text{for } s \neq d$$

What have we gained here?


Basic Approach (2)


In the second form node s just uses information that:
it has itself (s knows $N_s$ and $d_{s,k}$ for direct neighbors)
it can obtain from direct neighbors ($\hat{D}_{k,d}$, for $k \in N_s$)
This gives a hint: each node i transmits a vector, containing all values $\hat{D}_{i,d}(t)$ it knows at time t, to its neighbors
By this rule, node s receives $\hat{D}_{k,d}(t)$ from all its neighbors k
Upon receiving a value $\hat{D}_{k,d}(t)$ from a neighbor k, node s can re-compute its own least-cost path to d by checking whether $\hat{D}_{k,d}(t) + d_{s,k}$ is smaller than the currently known least cost; if so, node s stores k as next-hop for d
Some time later node s in turn transmits its own vector with all $\hat{D}_{s,d}(t)$, propagating changes further to downstream nodes
A node sends an updated version of its vector each time it receives new information, or periodically (or both)
Over time, a node s receives sufficient vectors to have a view on the whole network


Basic Approach (3)


Note that initially a node s does not need to know the whole network, it only needs $N_s$ and $d_{s,k}$ for all $k \in N_s$
Node addressing is handled outside the protocol
For each destination address d, node s stores the next hop and the least cost / distance
Since nodes transmit a vector containing their distances $\hat{D}_{s,d}$ to all known destinations d, this approach is called distance-vector protocol
Some of the Internet's routing protocols are based on this approach, for example:
RIP and RIP-2
BGP (uses variant of DV, named path vector routing)


DV Protocol Details
In the following, let $\hat{D}^{s}_{k,d}(t)$ be the minimum cost from node k to node d, as it is available to s at time t
Rationale for time dependency: it may take time for node k to inform node s about changes in its least-cost values
Possible reasons:
Processing delays at k, transmission delays
Node k transmits new cost vectors to some other node v before transmitting it to s

A protocol message sent by node i at time t has format:

Here:
Id=i indicates that i is the sender
a record Dst=d,Cost=c indicates that node i's current least-cost path to node d has total costs of c
Node i includes such records for all destinations d it is currently aware of


DV Protocol Details (2)

// Initialization
// Node is configured with unique node id, e.g. s
// D[s,d] denotes $\hat{D}_{s,d}$, d[s,k] denotes $d_{s,k}$
D[s,s] = 0
foreach k ∈ N_s do
    D[s,k] = d[s,k]
    nexthop[k] = k
// periodic transmission of own table to neighbors (DV messages)
on receiving transmit timer do
    transmit towards each k ∈ N_s the list of currently \
        known destinations and their most recent least costs
    restart transmit timer


DV Protocol Details (3)


// reception of DV update from neighbors
// Ds[k,d](t) denotes $\hat{D}^{s}_{k,d}(t)$
on receiving DV message from neighbor k at time t do
    when message contains new destination d then
        D[s,d] = d[s,k] + Ds[k,d](t)
        nexthop[d] = k
    foreach destination d mentioned in k's DV message do
        store Ds[k,d](t)
    foreach destination d known to s do
        when nexthop[d] = k then
            D[s,d] = d[s,k] + Ds[k,d](t)
    // route computation
    foreach destination d known to s do
        foreach neighbor m do
            when d[s,m] + Ds[m,d](t) < D[s,d] then
                D[s,d] = d[s,m] + Ds[m,d](t)
                nexthop[d] = m


DV Protocol Details (4)

// special cases
// a link to a neighbor goes down / changes cost
on link cost change towards neighbor k do:
    d[s,k] = ∞    // link failure, cost change also possible
    foreach destination d do
        when nexthop[d] = k then
            D[s,d] = ∞    // alternatively: fresh route computation
    transmit towards each m ∈ N_s the list of currently \
        known destinations and their most recent least costs
// a link to a neighbor comes up
on link creation towards neighbor k do:
    update d[s,k]
    transmit towards each m ∈ N_s the list of currently \
        known destinations and their most recent least costs
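
The event handlers above can be condensed into a small Python class. This is a sketch under simplifying assumptions, not the protocol as specified: timers and real messages are omitted, a DV message is modelled as a plain dictionary from destination to cost, and every event triggers a fresh route computation (the alternative mentioned in the pseudocode comment). The names `DVNode`, `vector`, `on_vector` and `on_link_down` are hypothetical.

```python
INF = float("inf")


class DVNode:
    """One distance-vector node s (illustrative, no timers or messages)."""

    def __init__(self, node_id, link_costs):
        self.id = node_id
        self.link = dict(link_costs)             # d_{s,k} for all neighbors k
        self.heard = {k: {} for k in self.link}  # last vector received from k
        self.dist = {}
        self.nexthop = {}
        self.recompute()

    def vector(self):
        """Content of a DV message: destination -> current least cost."""
        return dict(self.dist)

    def on_vector(self, k, vec):
        """Store the vector received from neighbor k, then recompute."""
        if k in self.link:
            self.heard[k] = dict(vec)
            self.recompute()

    def on_link_down(self, k):
        """Link to neighbor k failed: forget k and recompute all routes."""
        self.link.pop(k, None)
        self.heard.pop(k, None)
        self.recompute()

    def recompute(self):
        """Minimize d_{s,m} + D_{m,d} over all neighbors m, per destination d."""
        dests = set(self.link)
        for vec in self.heard.values():
            dests |= set(vec)
        dests.discard(self.id)
        dist, nexthop = {self.id: 0}, {self.id: None}
        for d in dests:
            cost, hop = min(
                ((c + (0 if m == d else self.heard[m].get(d, INF)), m)
                 for m, c in self.link.items()),
                default=(INF, None),
            )
            if cost < INF:
                dist[d], nexthop[d] = cost, hop
        self.dist, self.nexthop = dist, nexthop


n1 = DVNode(1, {2: 1})          # node 1 with one neighbor at cost 1
n2 = DVNode(2, {1: 1, 3: 2})    # node 2 with neighbors 1 and 3
n1.on_vector(2, n2.vector())    # node 1 learns about node 3 via node 2
```

After receiving the vector of node 2, node 1 knows a route to node 3 with cost 1 + 2 = 3 and next hop 2, although it has no direct link to node 3.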


DV Protocol Details (5)

How to detect a link failure?


First option: check for sustained lack of periodic DV
messages
Second option: send separate hello messages frequently,
check for lack of answers
How to detect a cost change for a link?
Depends on the precise link metric
Example: delays depend on lengths of output queues


Comments

The distances to destinations d as seen from source s evolve over time, they change with message reception and link failures or link establishments
The period for periodic distance vector transmissions has influence on convergence time
A delay between detecting a link failure and transmitting an updated distance vector can have significant influence on protocol operation


Convergence Time

Consider the above example network
Assume that all nodes are switched on at the same time t = 0
Immediately after being switched on, each node informs its neighbors about its presence
Each node transmits its distance vector message every 60 seconds
After receiving the distance vector messages, the shortest path computation takes one second


Convergence Time Example Timeline


Time t = 0 s: all nodes are activated and send their initial DV message
Time t = 1 s: nodes finished their computations, result:

    Node  Dst  Cost  Outg
    1     1    0     -
    1     2    1     1-2
    2     1    1     2-1
    2     2    0     -
    2     3    2     2-3
    3     2    2     3-2
    3     3    0     -
    3     6    1     3-6
    6     3    1     6-3
    6     6    0     -

Time t = 60 s: all nodes broadcast their DV messages
Time t = 61 s: nodes finished their computations, result:

    Node  Dst  Cost  Outg
    1     1    0     -
    1     2    1     1-2
    1     3    3     1-2
    2     1    1     2-1
    2     2    0     -
    2     3    2     2-3
    2     6    3     2-3
    3     1    3     3-2
    3     2    2     3-2
    3     3    0     -
    3     6    1     3-6
    6     2    3     6-3
    6     3    1     6-3
    6     6    0     -

Time t = 120 s: all nodes broadcast their DV messages
Time t = 121 s: nodes finished their computations, result:

    Node  Dst  Cost  Outg
    1     1    0     -
    1     2    1     1-2
    1     3    3     1-2
    1     6    4     1-2
    2     1    1     2-1
    2     2    0     -
    2     3    2     2-3
    2     6    3     2-3
    3     1    3     3-2
    3     2    2     3-2
    3     3    0     -
    3     6    1     3-6
    6     1    4     6-3
    6     2    3     6-3
    6     3    1     6-3
    6     6    0     -


Convergence Time Summary

In this example it took 121 s for node 1 to learn the full network!!
If we take any DV transmission operation as the beginning of a round, it took three rounds to converge
If the network diameter is K hops, it would take K rounds to converge
Cure:
    You cannot avoid needing K rounds
    You can shorten a round by shortening the transmission period
    Problem: more DV messages, more overhead
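
The round count can be checked with a small simulation. This is only a sketch under assumptions: the example network is taken to be the chain 1-2-3-6 with link costs 1, 2 and 1 (read off the routing tables above, since the figure is not reproduced here), all nodes transmit synchronously, and next-hop links are not tracked. The function name dv_rounds is hypothetical.

```python
INF = float("inf")


def dv_rounds(links, max_rounds=100):
    """Run synchronous distance-vector rounds on an undirected network.

    links maps (u, v) -> cost. Returns (rounds_until_stable, tables),
    where tables[n][dst] is the least cost known to node n.
    """
    neighbors = {}
    for (u, v), cost in links.items():
        neighbors.setdefault(u, {})[v] = cost
        neighbors.setdefault(v, {})[u] = cost

    # Initially every node only knows itself (cost 0).
    tables = {n: {n: 0} for n in neighbors}
    rounds = 0
    while rounds < max_rounds:
        # All nodes transmit the vector they hold at the start of the round.
        sent = {n: dict(t) for n, t in tables.items()}
        changed = False
        for n, nbrs in neighbors.items():
            for m, link_cost in nbrs.items():
                for dst, d in sent[m].items():
                    if link_cost + d < tables[n].get(dst, INF):
                        tables[n][dst] = link_cost + d
                        changed = True
        if not changed:
            break  # this round brought no news: converged
        rounds += 1
    return rounds, tables


if __name__ == "__main__":
    rounds, tables = dv_rounds({(1, 2): 1, (2, 3): 2, (3, 6): 1})
    print(rounds, tables[1])
```

With a 60 s period each round costs a full minute, which is where the 121 s in the example comes from.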


Routing Loops Example


Suppose at time t0 routing has converged, all nodes have correct routing tables
At time t1 link 3-6 fails
At time t2 node 3 updates its routing table entry D_{3,6} = ∞
At time t3 node 2 sends a DV message to node 3, including D_{2,6} = 3
At time t4 both nodes 2 and 3 perform a routing computation for all known destinations and update their routing tables


Routing Loops Example (2)


Time t0: routing tables after convergence (only routes to node 6):

    Node  Dst  Cost  Outg
    2     6    3     2-3
    3     6    1     3-6

Time t2: routing tables after node 3 recognized link failure to node 6:

    Node  Dst  Cost  Outg
    2     6    3     2-3
    3     6    ∞     3-6

Time t3: node 3 receives a DV message from node 2 that includes D_{2,6} = 3

Time t4: routing tables after both node 2 and 3 performed routing computation:

    Node  Dst  Cost  Outg
    2     6    3     2-3
    3     6    5     3-2

And here we have a loop!! Node 2 forwards packets for node 6 to node 3, and node 3 sends them back to node 2.


Routing Loops Summary


Routing loops can occur in DV protocols, e.g. after a link failure or a major increase in a link metric
In this example, the loop would not have occurred if node 3 had updated its table and transmitted an updated vector immediately after t1 and before time t3
However, in a distributed environment such race conditions cannot be entirely removed
A more rigorous solution (called Diffusing Update Algorithm) has been incorporated into the EIGRP protocol (see [6, Chap. 3], [3])
EIGRP = Enhanced Interior Gateway Routing Protocol, used in the Internet


Routing Loops Time-To-Live Approach


In networks where routing loops can occur, packets must somehow be prevented from circulating forever
A simple mechanism: time-to-live fields
    A packet source adds to its packets a specific header field, e.g. called time-to-live (TTL)
    The TTL parameter either indicates a physical time that the packet is allowed to circulate in the network, or it indicates the maximum number of hops that a packet may take
    Here we assume that TTL indicates max number of hops
    A router behaves as follows:
        It reads the TTL field from an incoming packet
        If the TTL field is one and the router cannot directly reach the final destination, the packet is dropped
        Otherwise, the TTL field is decremented, written back into the packet (with additional checksum re-calculation, if necessary) and the packet is forwarded further
This approach has been adopted by the IP protocol
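
The router behaviour above can be sketched in a few lines. The names Packet, forward and hops_in_loop are hypothetical and only serve illustration; treating a TTL below one like a TTL of one is a defensive assumption, not something the slides state.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Packet:
    dst: str
    ttl: int  # remaining hop budget


def forward(packet: Packet, directly_reachable: set) -> Optional[Packet]:
    """Apply the TTL rule at one router.

    Returns the packet to send on, or None if the packet is dropped.
    """
    if packet.ttl <= 1 and packet.dst not in directly_reachable:
        return None  # budget used up and destination not attached: drop
    return Packet(packet.dst, packet.ttl - 1)


def hops_in_loop(initial_ttl: int) -> int:
    """Count how often a packet is forwarded inside a routing loop
    (no router on the loop can reach the destination directly)."""
    packet = Packet(dst="6", ttl=initial_ttl)
    hops = 0
    while True:
        nxt = forward(packet, directly_reachable=set())
        if nxt is None:
            return hops
        packet = nxt
        hops += 1


if __name__ == "__main__":
    print(hops_in_loop(8))
```

A packet caught in a loop is forwarded a bounded number of times and then disappears, instead of circulating forever.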


Count-to-Infinity Example

Time t0: network has converged
Time t1: cost on link A-B increases from 4 to 100, node B detects this
(see [5, Sec. 4.5])


Count-to-Infinity Example (2)


Time t0: routing tables after convergence:

    Node  Dst  Cost  Outg
    A     A    0     -
    A     B    4     A-B
    A     C    5     A-B
    B     A    4     B-A
    B     B    0     -
    B     C    1     B-C
    C     A    5     C-B
    C     B    1     C-B
    C     C    0     -

Time t1: node B detects the link cost change and computes a new path to A (taking into account its knowledge that C can offer a path of length 5 to A), giving:

    Node  Dst  Cost  Outg
    A     A    0     -
    A     B    4     A-B
    A     C    5     A-B
    B     A    6     B-C
    B     B    0     -
    B     C    1     B-C
    C     A    5     C-B
    C     B    1     C-B
    C     C    0     -

We have a routing loop now!!

Time t2: node B informs node C via DV message that its new cost to A is 6, node C re-calculates cost and route to A (which is via B), as:

    Node  Dst  Cost  Outg
    A     A    0     -
    A     B    4     A-B
    A     C    5     A-B
    B     A    6     B-C
    B     B    0     -
    B     C    1     B-C
    C     A    7     C-B
    C     B    1     C-B
    C     C    0     -


Count-to-Infinity Example (3)


Continuation:
    Node C informs node B about its new cost (which is now 7) and subsequently node B re-calculates its cost to 8
    Node B informs node C about its new cost (which is now 8) and subsequently node C re-calculates its cost to 9
    and so on, and so on
    The procedure stops when node B announces a cost of 50, which leads C to adopt the direct link C-A to A
This behaviour is known as the count-to-infinity problem
One cure:
    Split-horizon approach: when transmitting a DV message on a link, include updates only for nodes for which the link is not the next-hop link!
    Convince yourself that this solves the present example
    But: it does not solve the problem in general!
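
The slow climb of the costs can be reproduced with a short script. This is a sketch under assumptions: the direct link C-A is taken to have cost 50 (the figure is not reproduced here, the value is inferred from the stopping point named above), nodes B and C take turns, and only their costs towards A are modelled. The function name count_to_infinity is hypothetical.

```python
def count_to_infinity(cost_ab=100, cost_bc=1, cost_ca=50, c_to_a=5):
    """Alternate B and C recomputing their least cost to A.

    c_to_a is the (stale) cost C holds when B detects the change.
    Returns the list of announcements as (node, announced cost to A).
    """
    announcements = []
    while True:
        # B chooses between its direct link and the path via C.
        b_to_a = min(cost_ab, cost_bc + c_to_a)
        announcements.append(("B", b_to_a))
        # C chooses between its direct link and the path via B.
        new_c = min(cost_ca, cost_bc + b_to_a)
        if new_c == c_to_a:
            break  # C has nothing new to announce: stable
        c_to_a = new_c
        announcements.append(("C", c_to_a))
    return announcements


if __name__ == "__main__":
    history = count_to_infinity()
    print(history[:4], "...", history[-3:])
    print(len(history), "announcements in total")
```

Every announcement corresponds to one DV message, so a single cost change keeps the loop alive for dozens of message exchanges.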


Another Problem with DV Protocols

Suppose you are an evil or incompetent person and have root access to a router running a DV protocol
Can you imagine a way in which, by sending well-formed DV messages, you can corrupt routing in parts of the network?


Outline

Introduction
Shortest-Path Algorithms
Distance-Vector Protocols
Link-State Protocols
Protocol Operation
Discussion and Further Topics


The Dijkstra Algorithm


// Computes for a fixed node s the distances and the routing
// tree to all other nodes. Graph is assumed to be directed
// initialization
S = V \ {s}
D_{s,s} = 0; pred[s] = s
for all d ∈ S do
    D_{s,d} = d_{s,d}; pred[d] = NULL;
    when d_{s,d} < ∞
        pred[d] = s
// main loop
while S ≠ ∅ do
    k = argmin_{m ∈ S} D_{s,m}
    S = S \ {k}
    for j ∈ N_k do
        when D_{s,k} + d_{k,j} < D_{s,j}
            D_{s,j} = D_{s,k} + d_{k,j}
            pred[j] = k
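
For experimentation, the pseudocode can be turned into a short runnable version. This is one possible sketch, not the reference implementation of the course: it uses a priority queue instead of the linear argmin scan, and it assumes non-negative link costs and a graph given as a dictionary in which every node appears as a key.

```python
import heapq


def dijkstra(graph, s):
    """Least costs and predecessor tree from node s.

    graph[k][j] is the cost d_{k,j} of the directed link k -> j.
    Returns (dist, pred), where pred[j] is the node before j on the
    shortest path from s (None if j is unreachable).
    """
    dist = {v: float("inf") for v in graph}
    pred = {v: None for v in graph}
    dist[s] = 0
    pred[s] = s
    heap = [(0, s)]
    settled = set()
    while heap:
        d_k, k = heapq.heappop(heap)  # k = argmin over unsettled nodes
        if k in settled:
            continue  # stale heap entry
        settled.add(k)
        for j, cost in graph[k].items():
            if d_k + cost < dist[j]:
                dist[j] = d_k + cost
                pred[j] = k
                heapq.heappush(heap, (dist[j], j))
    return dist, pred


if __name__ == "__main__":
    # Chain 1-2-3-6 with costs 1, 2, 1 in both directions (assumed example).
    graph = {
        1: {2: 1},
        2: {1: 1, 3: 2},
        3: {2: 2, 6: 1},
        6: {3: 1},
    }
    print(dijkstra(graph, 1))
```

Following pred backwards from a destination yields the path, and the first hop on that path is the entry for the forwarding table.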


Basic Idea
The Dijkstra algorithm needs the d_{i,j} as its main input
Stated differently, the Dijkstra algorithm must know:
    the links (i.e. their start and end node)
    and their costs / their state (up, down)
which is equivalent to knowing the whole topology
In a distributed environment, this is not known a priori to a node, but has to be discovered by use of a protocol
Since such protocols exchange link information or link states, they are known as link-state protocols
Individual nodes then maintain a link-state database, collecting all the d_{i,j} values


Some Assumptions

Dissemination of link states is based on flooding, which is assumed to be reliable
    Means: each node in the network eventually gets the information
    But no guarantees are given as to when this happens
    (Efficient) Implementation of reliable flooding is challenging
Upon receiving new link-state information, a node performs the shortest-path computation (e.g. using Dijkstra) locally, based on its current link-state database
A node has a mechanism to detect the costs and cost changes of its outgoing links


Link State Advertisement (LSA) Messages

SrcNode identifies the source of a link (its one end)
DstNode identifies the destination of a link (its other end)
Alternatively, we give a LinkID of the kind 1 → 2 to specify the link
Cost identifies the current link cost d_{i,j}
Seqno is a sequence number
TTL indicates the remaining time the LSA is valid (different meaning than in the TTL mechanism discussed before)

LSA messages are generated by each node for each of its outgoing links, and these are flooded reliably

An example:


LSA Generation and Database Maintenance


A node generates and floods LSA messages both:
    Periodically, and
    Upon topology / cost changes (triggered update)
Periodic updates:
    Period should be smaller than the initial value of the TTL field, so that an entry is refreshed before it expires
    A node receiving an LSA stores the received data in its link-state database, recomputes routes and initializes a timer with the value of the TTL field for this link
    When this timer reaches 0:
        the link-state information is purged from the database
        routes are re-computed without the purged link
        neighbors are informed (flooding!) with a special LSA with TTL set to 0, forcing them to purge the link-state information as well
    The chosen period has significant impact on overhead!!
Updates upon topology / cost changes:
    Whenever a node detects a change in cost / a link failure / a new link, it immediately floods an LSA


The Need for Sequence Numbers

Suppose that LSAs consist only of SrcNode, DstNode, Cost
Time t0: node 1 generates LSA [1, 1 → 2, 1] and sends this to nodes 2 and 4
Time t1: link 1 → 2 fails, node 1 generates new LSA [1, 1 → 2, ∞] and sends it to node 4
Time t2: node 2 forwards LSA [1, 1 → 2, 1] to node 4 (as part of the flooding process)
(see [6, Fig. 3.10])

Problem
How can node 4 tell which LSA is correct?


The Need for Sequence Numbers (2)


To circumvent this problem, sequence numbers are used
More precisely:
    A node i maintains for each outgoing link o a separate sequence number s_o
    Whenever i notices that the cost of link o has changed, it sends a new LSA, includes the current value of s_o in it and increments s_o afterwards (for periodic updates s_o does not necessarily need to be updated)
    Any other node k receiving an LSA for a link i → j stores the received sequence number
    When node k receives an LSA from i for i → j, it checks whether the contained seqno is larger than the stored one
    If so, the new cost is extracted from the LSA and stored
    Otherwise, the LSA is dropped
Note: there is no coupling between sequence numbers at different nodes (not even when sharing the same link)
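
The acceptance rule at a receiving node can be sketched as follows. The class names LSA and LinkStateDB are hypothetical, TTL handling and wraparound of sequence numbers are left out, and an LSA for a not yet known link is always accepted (an assumption the rule above implies but does not spell out).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LSA:
    link: tuple   # (src, dst), e.g. (1, 2) for the link 1 -> 2
    cost: float
    seqno: int


class LinkStateDB:
    """Link-state database of one node, keyed by link."""

    def __init__(self):
        self._entries = {}

    def receive(self, lsa: LSA) -> bool:
        """Store the LSA if it is newer than what is known.

        Returns True if accepted (and hence to be flooded further),
        False if the LSA is dropped as old or duplicate.
        """
        stored = self._entries.get(lsa.link)
        if stored is not None and lsa.seqno <= stored.seqno:
            return False
        self._entries[lsa.link] = lsa
        return True

    def cost(self, link):
        return self._entries[link].cost


if __name__ == "__main__":
    # Node 4 in the example: the failure report (seqno 1) arrives
    # before the delayed copy of the old LSA (seqno 0).
    db = LinkStateDB()
    print(db.receive(LSA((1, 2), float("inf"), seqno=1)))
    print(db.receive(LSA((1, 2), 1, seqno=0)))
    print(db.cost((1, 2)))
```

Dropping old LSAs also terminates flooding: a node forwards an LSA only the first time it sees that sequence number.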


The Sequence Number Space should be LARGE!!


A SeqNo field in a frame is of finite width, say b bits
    Sequence number space is S = {0, 1, 2, ..., 2^b − 1}
    At the end of S the numbers wrap around
For b = 3 we have S = {0, 1, 2, ..., 7}
    Suppose a node receives two LSAs for a link, the first with SeqNo=7, the other with SeqNo=2
    Not clear whether SeqNo=2 was generated before or after SeqNo=7!!
To (almost) resolve this ambiguity:
    Sequence number space should be large, e.g. b = 32 or b = 64, essentially removing the need for wraparound
    The TTL mechanism purges old link-state information


Sequence Numbers and Node Crashes


Nodes increment seqno after each LSA transmission
Nodes can crash
How should a node recovering from a crash choose an initial sequence number that is larger than what is present in the network?
First approach:
    Initialize sequence number to 0
    Wait for a time of at least maximum TTL to ensure that all old information has been purged
    Then send an LSA with sequence number 0
Second approach (Resynchronization):
    Ask a neighbor for its most recent copy of the node's own LSA message
    Extract seqno from this and increment it
    Send an updated LSA message with the new seqno
    A recovered / new node also asks the neighbor for a copy of its link-state database to be able to compute routes quickly (i.e. much quicker than the maximum TTL parameter)


The Hello Protocol

One mechanism for checking the state of a link is a periodic packet exchange with the neighbor
    Hello packets
    Link / neighbor is considered dead after several consecutive failed hello exchanges
    Period can be significantly smaller than maximum TTL
The initial query of a neighbor's link-state database is also considered as part of the hello protocol


Discussion / Comments

Each node performs route computations independently
Because of finite propagation speed of link state changes, inconsistent views and routing loops can occur as well
But: the flooding adopted by LS tends to resolve inconsistencies much faster than in DV protocols
LS protocol requires periodic flooding of LSA messages, one for each link
    Significant overhead
    Reduce overhead by making period large, so that triggered updates have major share of all updates


Outline
Introduction
Shortest-Path Algorithms
Distance-Vector Protocols
Link-State Protocols
Discussion and Further Topics
Discussion of DV and LS Protocols
Further Topics


Differences between DV and LS Protocols


Information maintained by nodes:
DV: nodes know all other nodes and costs to reach them
LS: nodes know all links and their costs
Protocol information messages:
DV: carry per-node information
LS: carry per-link information
Communication partners:
DV: nodes talk only to their neighbors
LS: nodes inform the whole network
Therefore:
LS has much more overhead than DV!!
LS can propagate new information much quicker, shortens
the time for which inconsistencies can exist!


DV and LS Protocols Scalability


Both protocols have scalability issues:
    Nodes have to know all nodes / links in a network
    This does not scale well to large networks like the Internet!
Therefore, realistic protocols like OSPF are hierarchical:
    Network is partitioned into areas
    A router in one area needs to know:
        How to reach each node in its own area
        How to reach other areas (but not individual nodes in them!)
    Flooding of link states is restricted to own area
    The hierarchy can have more levels
Furthermore: routing is not done for the entire Internet, but only within smaller pieces (autonomous systems)
    Internet is partitioned into autonomous systems (AS)
    One AS is usually owned by one entity, which decides about routing within the AS
    Routing across different AS is handled by BGP


DV and LS Protocols Reliable Information Exchange

We have seen that both protocols allow for periods with inconsistent views on the network
    One consequence: routing loops!
Allowing loss of distance vector messages or LSA messages would exacerbate consistency problems
Therefore, many practical DV / LS protocols assume or define a mechanism for reliable delivery of protocol control messages (including DV messages and LSA messages)


k Shortest-Path Algorithms
It is sometimes useful to have not only a single (the best!) connection between source and destination, but several, for example k paths
    One usage: backup paths for fault tolerance
    Another usage: load balancing over paths
You typically do not want some k paths, but the k best ones
One heuristic approach for finding link-disjoint paths:
    Start w/ full network G0, identify best path P1 (e.g. Dijkstra)
    Compute network G1 by removing links of P1 from G0
    Identify best path P2 in G1, and so on . . .
Variations:
    Remove one link of P1 at a time from G0
    Remove all intermediate nodes of P1 (and their direct links) from G0, this creates node-disjoint paths
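
The link-disjoint heuristic can be tried out with the following sketch. The function names shortest_path and link_disjoint_paths are hypothetical, and removing both directions of every used link is an assumption that fits undirected links. Being a heuristic, the procedure is not guaranteed to find the largest possible set of disjoint paths.

```python
import heapq


def shortest_path(graph, s, t):
    """Dijkstra from s to t. Returns (cost, [nodes]) or None."""
    heap = [(0, s, [s])]
    seen = set()
    while heap:
        d, u, path = heapq.heappop(heap)
        if u == t:
            return d, path
        if u in seen:
            continue
        seen.add(u)
        for v, c in graph.get(u, {}).items():
            if v not in seen:
                heapq.heappush(heap, (d + c, v, path + [v]))
    return None


def link_disjoint_paths(graph, s, t, k):
    """Find up to k link-disjoint paths by repeated link removal."""
    g = {u: dict(nbrs) for u, nbrs in graph.items()}  # working copy G_i
    paths = []
    for _ in range(k):
        found = shortest_path(g, s, t)
        if found is None:
            break  # no further path left in the reduced network
        paths.append(found)
        _, path = found
        for u, v in zip(path, path[1:]):
            g[u].pop(v, None)              # remove link u -> v
            g.get(v, {}).pop(u, None)      # and v -> u, if present
    return paths


if __name__ == "__main__":
    graph = {
        "S": {"A": 1, "B": 2},
        "A": {"S": 1, "T": 1},
        "B": {"S": 2, "T": 2},
        "T": {"A": 1, "B": 2},
    }
    for cost, path in link_disjoint_paths(graph, "S", "T", k=3):
        print(cost, path)
```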


Max-Flow Problems

Given a source node S and a destination node T , link labels denote capacity
S wishes to transmit a continuous flow of constant rate to T
Assume that source can use all links it wishes
Assume that flows on a link can be split arbitrarily, e.g. a router can split a flow of
rate x to two outputs with rate x/2 each
What is the maximum rate at which S can transmit data?
Read about the Ford-Fulkerson algorithm, implement it and use it to solve this
problem!!

Bibliography

[1] Dimitri P. Bertsekas. Dynamic Programming and Optimal Control, Volume 1. Athena Scientific, Belmont, Massachusetts, 3rd edition, 2005.
[2] Dimitri P. Bertsekas. Dynamic Programming and Optimal Control, Volume 2. Athena Scientific, Belmont, Massachusetts, 3rd edition, 2007.
[3] J. J. Garcia-Luna-Aceves. Loop-free routing using diffusing computation. IEEE/ACM Transactions on Networking, 1(1):130-141, February 1993.
[4] Bernhard Korte and Jens Vygen. Combinatorial Optimization: Theory and Algorithms. Springer, Berlin, third edition, 2005.
[5] James F. Kurose and Keith W. Ross. Computer Networking: A Top-Down Approach Featuring the Internet. Addison-Wesley, Boston, fourth edition, 2001.
[6] Deepankar Medhi and Karthikeyan Ramasamy. Network Routing: Algorithms, Protocols, and Architectures. Morgan Kaufmann, San Francisco, California, 2007.
[7] William Stallings. Data and Computer Communications. Prentice Hall, Englewood Cliffs, New Jersey, fourth edition, 2006.

The Internet

IPv4

IP Helper Protocols

Data Communications and Networking
COSC 264
IP and Related Protocols

Dr. Andreas Willig (1), Dr. Muhammad Asad Arfeen (2)
(1) Dept. of Computer Science and Software Engineering, University of Canterbury, Christchurch
(2) Dept. of Computer and Information Systems Engineering, NED University of Engineering & Technology, Karachi

UoC, 2014


Outline
The Internet
IPv4
Packet Format
IP Addressing
IP Forwarding and Routing
Fragmentation and Reassembly
IP Helper Protocols
ARP
ICMP
DNS


About This Module

Goals of this Module:
    Get a first idea of the Internet
    Get to know the IP protocol and important support protocols
Useful references:
    The bible on TCP/IP: [12] (old, but still great!)
    Other references: [4], [11, Part V]
    Internet protocols are published as requests-for-comment (RFC) by the Internet Engineering Task Force (IETF), you can access them via: http://www.ietf.org/rfc.html
Most of these slides are based on [12]


The Internet
The Internet is a packet-switched network
It is a network of networks:
It consists of many different networks, connected by routers
The networks or links can be of any technology:

Ethernet
Optical point-to-point links
Wireless LAN
...
Carrier pigeons (RFC 1149)

It is really large:
The Internet Systems Consortium estimates about 900
million stations (called hosts) as of July 2012
See http://www.isc.org/solutions/survey
It has a fairly complex topology [1]


The Internet (2)


The end-to-end principle [10]:
    Most intelligent functions should be performed in end points, not in the network
    For example:
        The network knows how to deliver packets (using routers)
        All functions making this delivery reliable are performed in the end host, e.g. by the TCP protocol
        There is no network-layer mechanism for reliable delivery
    Keep the routers simple!
The Internet is standardized by the IETF, the standards are called RFCs
    IETF = Internet Engineering Task Force (www.ietf.org)
    RFC = Request for Comments
For the design philosophy see [3]


The Hourglass Model for the Internet Protocol Stack

Everything over IP, IP over everything


Introduction

IP is specified in RFC 791 and many followup RFCs
It is the network layer protocol of the Internet
Some terminology:
    IP packets are called datagrams
    End stations are called hosts
    IP routers are called routers
IP addresses are assigned to network interfaces:
    When a host has three Ethernet adapters, it has three IP addresses, one for each adapter
    Since most hosts have only one adapter, we speak of the IP address of that host


IP Service Best Effort

Basic IP service is datagram delivery


This service is:
Connectionless: no connection or shared state is set up
before datagram delivery starts
Unacknowledged: IP does not use acknowledgements
Unreliable: on IP level no retransmissions are carried out
Unordered: IP does not guarantee in-sequence delivery [2]
This kind of guarantee-nothing service is called best effort


Packet Format


Packet Format (2)


Where applicable (e.g. addresses), the header uses big-endian byte ordering (also called network byte order)
The HdrLen field:
    Specifies the length of the IP header as a number of 32-bit words
    If the Options field does not use a multiple of 32 bits, a Padding field is used to fill up to a multiple of 32 bits
    When HdrLen > 5, an Options field is present
The TOS/DSCP field:
    TOS = Type Of Service, DSCP = DiffServ Code Point
    Allows marking packets for differentiated treatment to achieve Quality-of-Service (QoS), e.g. to express priorities
    DiffServ [6] is a framework for Internet QoS, another one is IntServ [13]
    Most routers ignore the TOS/DSCP field


Packet Format (3)


The TotalLength field:
    Gives the total length of the datagram in bytes (i.e. up to 65535)
    Can be modified during fragmentation and reassembly
    The TotalLength field is part of the IP header, since some technologies (Ethernet!) pad frames up to a minimum frame size and do not reverse this
The Identification field:
    Uniquely identifies each datagram sent by a host / interface
    Incremented by the source host before sending a new datagram
    Routers do not touch this field
The Flags field:
    Contains two flags relevant for fragmentation and reassembly (DF, Don't Fragment, and MF, More Fragments)


Packet Format (4)


The FragmentOffset field:
is used for fragmentation and reassembly
gives the offset of the current fragment within entire
datagram, in multiples of eight bytes
The HeaderChecksum field:
Is calculated over IP header only, not the data (TCP, UDP
etc. have their own checksums to cover their data)
The Time-To-Live field:
gives upper limit to number of routers a packet can traverse
decremented by each router, forces re-computation of
header checksum
when TTL reaches one and packet cannot be directly
delivered to destination, datagram is discarded and sender
is notified with ICMP message
Typical initial values: 32 or 64
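
Why a TTL decrement forces a checksum re-computation can be seen in a small sketch. The checksum is the 16-bit ones' complement of the ones' complement sum over the header words (as described in RFC 1071); the function names are hypothetical, and the sample header is a commonly used textbook example, not taken from these slides. The code assumes a header without options and a TTL greater than zero.

```python
import struct


def ipv4_header_checksum(header: bytes) -> int:
    """Checksum over the header bytes.

    Called on a header whose checksum field is zero, it returns the
    value to store. Called on a complete valid header, it returns 0.
    """
    if len(header) % 2:
        header += b"\x00"
    total = 0
    for (word,) in struct.iter_unpack("!H", header):
        total += word
    while total >> 16:                     # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def decrement_ttl(header: bytes) -> bytes:
    """What a router does: lower TTL by one, then redo the checksum."""
    h = bytearray(header)
    h[8] -= 1                              # TTL is byte 8 of the header
    h[10:12] = b"\x00\x00"                 # clear old checksum (bytes 10-11)
    struct.pack_into("!H", h, 10, ipv4_header_checksum(bytes(h)))
    return bytes(h)


if __name__ == "__main__":
    hdr = bytes.fromhex("45000073000040004011b861c0a80001c0a800c7")
    print(hex(ipv4_header_checksum(hdr)))  # valid header
    print(decrement_ttl(hdr).hex())
```

Only the header is covered, so the payload does not have to be touched when a router updates the TTL.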


Packet Format (5)

The Protocol field indicates the higher-layer protocol that generated the payload
This field provides protocol multiplexing
Some values shown in table:

    Protocol field   Protocol
    0x01             ICMP
    0x02             IGMP
    0x04             IP-in-IP Encapsulation
    0x06             TCP
    0x11             UDP


Packet Format (6)

The SourceAddress/DestinationAddress fields:
    SrcAddr indicates the initial sender of the datagram
    DstAddr indicates the intended final receiver of the datagram
    Are 32 bits wide
The Options field:
    Contains header fields for optional IP features
    One example option: source routing
    Options are rarely used, we will not consider them any further
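
To see the fields side by side, the fixed 20-byte part of the header can be unpacked as follows. This is a sketch with hypothetical names (IPv4Header, parse_ipv4_header); options are ignored, and the sample bytes used in the usage part are an illustrative header, not one from these slides.

```python
import socket
import struct
from typing import NamedTuple


class IPv4Header(NamedTuple):
    version: int
    hdr_len_words: int          # HdrLen, in 32-bit words
    tos: int
    total_length: int           # bytes
    identification: int
    dont_fragment: bool         # DF flag
    more_fragments: bool        # MF flag
    fragment_offset_bytes: int  # field value times eight
    ttl: int
    protocol: int
    checksum: int
    src: str
    dst: str


def parse_ipv4_header(data: bytes) -> IPv4Header:
    """Decode the first 20 bytes of an IPv4 datagram."""
    # "!" selects network byte order (big endian).
    (ver_ihl, tos, total_len, ident, flags_frag,
     ttl, proto, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", data[:20])
    return IPv4Header(
        version=ver_ihl >> 4,
        hdr_len_words=ver_ihl & 0x0F,
        tos=tos,
        total_length=total_len,
        identification=ident,
        dont_fragment=bool(flags_frag & 0x4000),
        more_fragments=bool(flags_frag & 0x2000),
        fragment_offset_bytes=(flags_frag & 0x1FFF) * 8,
        ttl=ttl,
        protocol=proto,
        checksum=checksum,
        src=socket.inet_ntoa(src),
        dst=socket.inet_ntoa(dst),
    )


if __name__ == "__main__":
    raw = bytes.fromhex("45000073000040004011b861c0a80001c0a800c7")
    print(parse_ipv4_header(raw))
```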


IP Address Representation
IP addresses have a width of 32 bits
They are supposed to be worldwide unique
This is not really true anymore with NAT . . .
IP addresses are written in dotted-decimal notation, e.g.:

130.149.49.77
where decimal (!) numbers are separated by dots
They have an internal structure:

<network-id> <host-id>
where:
<network-id> denotes a network (e.g. an Ethernet)
<host-id> refers to a host within this network

The <host-id> must only be unique w.r.t. its network


Important Points
Important Point
A host address is tied to its location in the network, i.e. it is
coupled to network topology. When a host switches to another
network, it obtains another address and ongoing connections
(TCP!) are disrupted. IP therefore has no direct support for
mobility!!

Important Point
IP routing is mostly concerned with networks, i.e. forwarding
tables in routers mostly store <network-id>s. It is the
responsibility of the last router on a path to deliver an IP
datagram to a directly connected host.


Classful Addressing
Initially so-called classful addressing was used
IP addresses are subdivided into four classes:
    Class-A addresses: 7-bit network-id, 24-bit host-id, i.e. 128 class-A networks with at most 2^24 − 2 ≈ 16.7 million hosts in each network
    Class-B addresses: 14-bit network-id, 16-bit host-id
    Class-C addresses: 21-bit network-id, 8-bit host-id
    Class-D addresses: 28-bit multicast group address


Classful Addressing (2)

For each network-id there are two special host-ids:
    Host-id with all zeros refers to the network as such
    Host-id with all ones is the broadcast address of this network
Example:
    130.149.0.0 refers to a class-B network
    130.149.255.255 is the broadcast address of this network
    130.149.49.123 refers to a particular host in that network


Classful Addressing Discussion

The three unicast classes (A, B, C) support networks of only a few distinct sizes
Problems:
    With the growth of the Internet class-B addresses were quickly exhausted, but many of the requesting organizations did not have anywhere near 65534 hosts, so these networks were often poorly utilized
    With the allocation of class-C addresses the routing tables in Internet core routers quickly became large, which slows down packet processing!


Subnetting
Suppose that an organization:
    has a class-B network address, say 130.149.0.0
    has its network internally divided into several LANs,
    and wants to couple these by routers
First option: allocate a class-C address for each LAN
    requires additional addresses
    increases size of routing tables in core routers
Second option: use class-B address externally, subdivide this network internally
    The whole class-B network is seen by all external networks only through one border router and one IP address
    All internal networks are allocated addresses like 130.149.x.0 with an eight-bit host part
    All internal routers and the border router know the internal network structure and the networks


Classless Inter-Domain Routing

CIDR = Classless Inter-Domain Routing
Introduced 1993, specified in RFCs 1518, 1519, 4632
Goal was to address the problems of classful addressing by giving more fine-grained network sizes
CIDR runs in conjunction with more modern routing protocols like OSPF, RIP-2 or BGP
In CIDR a network is specified by two values:
    A 32-bit network address
    A 32-bit network mask (netmask)


CIDR Netmask

For a given 32-bit IP address the netmask specifies which bits belong to the network-id and which bits belong to the host-id
The netmask consists of 32 bits, the left k bits are ones, the remaining bits are zeros
Examples:

    Netmask                                Shorthand
    11111111.11110000.00000000.00000000    /12
    11111111.11111111.00000000.00000000    /16
    11111111.11111111.11100000.00000000    /19
    11111111.11111111.11111110.00000000    /23


CIDR Netmask (2)


Example: we are given the host address 192.168.40.3 and the netmask /24, then the host's network address can be computed as:

      11000000.10101000.00101000.00000011    192.168.40.3
AND   11111111.11111111.11111111.00000000    /24
      11000000.10101000.00101000.00000000    192.168.40.0

The same example, now with netmask /21:

      11000000.10101000.00101000.00000011    192.168.40.3
AND   11111111.11111111.11111000.00000000    /21
      11000000.10101000.00101000.00000000    192.168.40.0

In both examples the network addresses are the same, but the networks are of different size
To fully specify a network, one gives both network address and netmask, e.g.:
192.168.40.0/21
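
The bitwise AND can be reproduced with Python's standard `ipaddress` module. This is only a sketch; `network_address` is a helper name chosen for this example.

```python
import ipaddress


def network_address(host: str, k: int) -> str:
    """AND the host address with the /k netmask and return the result."""
    addr = int(ipaddress.IPv4Address(host))
    mask = (0xFFFFFFFF << (32 - k)) & 0xFFFFFFFF
    return str(ipaddress.IPv4Address(addr & mask))


if __name__ == "__main__":
    for k in (24, 21):
        net = ipaddress.ip_network(f"{network_address('192.168.40.3', k)}/{k}")
        # Same network address for both masks, but a different network size.
        print(net, "contains", net.num_addresses, "addresses")
```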


CIDR Netmask (3)

In the network 192.168.40.64/28 there are 14 addresses available:
The netmask leaves four bits for the host-id, i.e. 16 values
The value 0000 yields the network address itself and is not assigned to a host
The value 1111 is the broadcast address for this network


Supernetting
Suppose an organization has been allocated 16 networks of size /24 with contiguous network addresses, e.g.:

130.149.64.0/24
130.149.65.0/24
...
130.149.79.0/24

With supernetting:
these networks are summarized under the network address
130.149.64.0/20
Routers outside any of these networks only have an entry
for 130.149.64.0/20 instead of 16 entries
Can you figure out the formal conditions under which supernetting is allowed?
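
As an experiment for the question above, the standard-library function `ipaddress.collapse_addresses` can be used to see when a set of /24 networks merges into a single entry. The second, shifted range (65 to 80) is an example chosen here for contrast and does not come from the slides.

```python
import ipaddress


def summarize(third_octets):
    """Collapse the /24 networks 130.149.<x>.0 into as few entries as possible."""
    nets = [ipaddress.ip_network(f"130.149.{x}.0/24") for x in third_octets]
    return list(ipaddress.collapse_addresses(nets))


# The 16 contiguous networks from the slide: 130.149.64.0/24 ... 130.149.79.0/24
aligned = summarize(range(64, 80))

# Also 16 contiguous networks, but starting at 65 instead of 64
shifted = summarize(range(65, 81))

if __name__ == "__main__":
    print("aligned:", aligned)
    print("shifted:", shifted)
```

Comparing the two results gives a hint about the role that the number of networks and the alignment of the first network address play.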


Reserved IP address blocks


Address Block      Current Usage
10.0.0.0/8         Private-use IP networks
127.0.0.0/8        Host loopback network
169.254.0.0/16     Link-local for point-to-point links (e.g. dialup)
172.16.0.0/12      Private-use IP networks
192.168.0.0/16     Private-use IP networks

(from: [8])

Private-use IP addresses are often used for broadband clients or by NAT boxes
The traditional loopback address of a host is 127.0.0.1, but any address from the 127.0.0.0/8 network serves the same purpose
Packets with private addresses are not routed in the public Internet, only within the provider network

Outline

The Internet
IPv4
Packet Format
IP Addressing
IP Forwarding and Routing
Fragmentation and Reassembly
IP Helper Protocols

Simplified Packet Processing

Simplified Packet Processing (2)


Packet processing chain is followed in routers and hosts
Incoming packets are checked for correctness and stored in IP input queue; correctness includes:
right value in IP version field
correct IP header checksum
Next, packet options are processed:
Options are rarely used
Special case: source routing option, then packet is delivered to IP output stage
Next, it is checked if packet is destined to this host / router or to broadcast address
If so, protocol demultiplexing is carried out:
The protocol field in IP header is checked for its value
Packet payload is delivered to the software entity implementing the indicated higher-layer protocol
Packet is not processed any further!


Simplified Packet Processing (3)


If the packet is not destined to this host/router:
If packet forwarding is not enabled, the packet is dropped
Otherwise:
Check if packet is destined to a directly reachable station (e.g. on same Ethernet); if so, deliver packet directly
If packet is not destined to directly reachable station, consult forwarding table to determine next hop / outgoing interface
Decrement TTL value, drop packet when it reaches zero
Recompute packet header checksum (why?)
Hand packet over to outgoing interface
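
The TTL and checksum steps can be tried out on a raw 20-byte IPv4 header. The following is a minimal sketch that assumes a header without options; `forward_header` is a name invented for this example, and the checksum routine is the usual 16-bit ones'-complement sum.

```python
import struct
from typing import Optional


def internet_checksum(data: bytes) -> int:
    """16-bit ones'-complement of the ones'-complement sum of all 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def forward_header(header: bytes) -> Optional[bytes]:
    """Return the header to send on, or None if the packet has to be dropped."""
    if internet_checksum(header) != 0:   # a valid header sums to zero
        return None
    ttl = header[8]                      # TTL is the ninth byte of the header
    if ttl <= 1:                         # TTL would reach zero: drop
        return None
    new = bytearray(header)
    new[8] = ttl - 1
    new[10:12] = b"\x00\x00"             # checksum field counts as zero
    struct.pack_into("!H", new, 10, internet_checksum(bytes(new)))
    return bytes(new)


if __name__ == "__main__":
    hdr = bytes.fromhex("45000073000040004011b861c0a80001c0a800c7")
    print(forward_header(hdr).hex())
```

Running it shows that the checksum bytes of the outgoing header differ from the incoming ones, which is relevant to the "why?" on the slide.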

Forwarding table is maintained by a routing daemon, i.e. a process executing a routing protocol
Note that datagrams to be routed can come from local applications or from other hosts via IP input queue
Linux commands to inspect / modify forwarding table:
netstat
route


Forwarding Table Contents


Each entry in the forwarding table contains:
Destination IP address, which, depending on the value of a flag, can be either:
a full host address (i.e. non-zero host-id)
a network address, with netmask
Information about next hop, either:
IP address of next-hop router (must be directly reachable)
IP address of directly-connected network (network address)
Flags:
A flag telling whether destination IP is host or network
A flag telling whether next hop is a router or directly attached network
Specification of outgoing interface


Forwarding
From forwarding table structure it is clear that a host / router does not know the full path, but only next hop
Forwarding table lookup for a packet with destination IP address dst proceeds in three stages:
First look for an entry that is a full-host address matching dst; if found, send packet to indicated next hop / outgoing interface and stop processing
This is not used very often
Next look for an entry that is a network address matching dst; if found, send packet to indicated next hop / outgoing interface and stop processing
Finally look for special default entry; if found, send packet to indicated next hop (the default router) and stop processing
Otherwise drop packet, send ICMP message back to original sender of datagram
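
The three lookup stages can be sketched as follows. The `Entry` layout, the example table and all addresses in it are illustrative assumptions. The slides do not say what happens when several network entries match; this sketch assumes the most specific one (longest prefix) is preferred.

```python
import ipaddress
from typing import List, NamedTuple, Optional


class Entry(NamedTuple):
    kind: str          # "host", "net" or "default" (plays the role of the flag)
    destination: str   # host address, or network in address/prefix notation
    next_hop: str      # router address, or "direct" for an attached network
    interface: str


def lookup(table: List[Entry], dst: str) -> Optional[Entry]:
    addr = ipaddress.ip_address(dst)
    # Stage 1: entry that is a full host address matching dst
    for e in table:
        if e.kind == "host" and ipaddress.ip_address(e.destination) == addr:
            return e
    # Stage 2: entry that is a network address matching dst
    # (assumption: the longest matching prefix wins)
    nets = [e for e in table
            if e.kind == "net" and addr in ipaddress.ip_network(e.destination)]
    if nets:
        return max(nets, key=lambda e: ipaddress.ip_network(e.destination).prefixlen)
    # Stage 3: the special default entry
    for e in table:
        if e.kind == "default":
            return e
    return None        # caller drops the packet and sends an ICMP message


TABLE = [
    Entry("host", "130.149.49.99", "130.149.49.1", "eth0"),
    Entry("net", "130.149.49.0/24", "direct", "eth0"),
    Entry("net", "130.149.0.0/16", "130.149.49.254", "eth0"),
    Entry("default", "0.0.0.0/0", "130.149.49.1", "eth0"),
]

if __name__ == "__main__":
    for dst in ("130.149.49.99", "130.149.49.5", "130.149.7.7", "8.8.8.8"):
        print(dst, "->", lookup(TABLE, dst))
```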


Forwarding Tables in Hosts


Most end hosts leverage the default route mechanism:
An end host can differentiate between packets to local destinations and to all other destinations
Question: suppose an end host has address 130.149.49.77 and is part of a /24 network. How does it check whether a destination address dst belongs to another host in the same network?
Packets to local destinations are delivered directly
Packets to all other destinations are sent to default router
Therefore, forwarding tables in end hosts can be made out of just two entries:
One entry for the local network
The default route
The default route must be configured
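
One way to experiment with the question above is to compare network addresses using the standard `ipaddress` module. The function names and the default-router address 130.149.49.1 are assumptions made for this sketch.

```python
import ipaddress


def is_local(own_address: str, prefix_len: int, dst: str) -> bool:
    """True if dst lies in the same network as own_address/prefix_len."""
    own_net = ipaddress.ip_interface(f"{own_address}/{prefix_len}").network
    return ipaddress.ip_address(dst) in own_net


def choose_next_hop(own_address: str, prefix_len: int, dst: str,
                    default_router: str) -> str:
    """Two-entry forwarding table of an end host: local network, else default."""
    return dst if is_local(own_address, prefix_len, dst) else default_router


if __name__ == "__main__":
    for dst in ("130.149.49.12", "130.149.50.12"):
        print(dst, "->", choose_next_hop("130.149.49.77", 24, dst, "130.149.49.1"))
```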


Forwarding Tables in Routers

Most routers at the border of the Internet only have routing tables for a subset of all networks attached to the Internet; for all other networks they rely on default routers
Some routers in the core:
do not have a default router
are the default routers of other routers
must know (almost) all the Internet networks


Routing: Interior and Exterior Gateway Protocols

Applying any distance-vector or link-state routing protocol to the whole Internet is hardly feasible:
Routing tables in routers would become too large
Internet dynamics would keep routing protocols busy all the time (triggered updates)
The Internet therefore is subdivided into autonomous systems (AS):
An AS is administered by one authority
An AS has a unique identifier (originally 16 bits; 32-bit AS numbers are specified in RFC 6793)
Examples: a University campus, a corporation
Routing protocols that route . . .
within an AS are called interior gateway protocols
across AS are called exterior gateway protocols


Routing: Interior Gateway Protocols

An AS can choose any of the interior gateway protocols to determine routes between routers in the same AS
Some interior gateway protocols:
RIP and RIP-2 (Routing Information Protocol):
Defined in RFC 1058 and RFC 1388
Both are distance-vector protocols, metric is hop-count
RIP-2 contains improvements to address DV problems
EIGRP (Enhanced Interior Gateway Routing Protocol):
Defined in [5], used in Cisco routers
Loop-free distance-vector protocol
OSPF (Open Shortest-Path First):
Defined in RFC 1247
It is a link-state protocol
Note that all these protocols use shortest-path algorithms


Routing: Exterior Gateway Protocols

Exterior gateway protocols (EGP) are used between routers belonging to different AS
Major EGP: BGP (Border Gateway Protocol)
Defined in RFCs 1267 and 1268
BGP-V4: RFC 4271
A BGP router stores for networks in foreign AS a number of paths towards this network:
Such a path lists all intermediate AS, not individual routers
These paths are not determined based on costs only, but also based on a policy
Policies are specified based on political or economical considerations


On the Choice of Packet Size


The link-layer technologies underlying IP offer many different maximally allowed packet sizes, e.g.:
Ethernet: 1500 bytes
Gigabit Ethernet (with jumbo frames): 9000 bytes
IEEE 802.11 WLAN: 2312 bytes
ISDN: 576 bytes
This maximum size is also known as maximum transmission unit (MTU)
Higher-layer protocols (TCP, UDP) and applications should not be required to know these maximal sizes:
One reason: software hygiene, separation of concerns
Another reason: it is not well defined:
Different packets of the same flow can take different routes
A packet can traverse different link technologies while in transit
Even if all packets go the same route, this route can change due to link failures / restores


Fragmentation and Reassembly


IP hides this from upper layers, offers own maximum message length of 65515 bytes to higher layers
65515 = 65535 - 20, 20 bytes is minimum size of IP header
To cope with smaller MTUs:
Sender IP instance partitions message into fragments
Fragment size is chosen as MTU of outgoing link
Each fragment is transmitted individually as a full IP packet, with header information specifying that this is a fragment and giving the position of fragment in whole message
IP instance at destination buffers received fragments, re-assembles message and delivers to higher layers

Question
Would it be useful to have intermediate IP routers perform reassembly?


Fragmentation and Reassembly (2)


In addition, every intermediate router can, when necessary for transmission on next hop:
fragment a full message
further fragment a fragment


When the destination receives the first fragment, it:
Allocates buffer large enough for whole message
Starts a timer
When all fragments arrive before timer expiration:
Timer is canceled
Re-assembled packet is handed over to higher layers
Buffer is de-allocated
When timer expires before all fragments have arrived:
The already received fragments are dropped, buffer is freed
ICMP message (type 11, code 1) is sent to source host


Some Details
Every message handed over to IP from higher layers has its own identifier
See identification field in IP header
All fragment datagrams belonging to same message have:
A full IP header
The same identification field
A TotalLength field reflecting the fragment size
Different values for FragmentOffset field (reflecting the start of the present fragment within the whole message):
FragmentOffset specifies offset in multiples of 64 bits
The MF (more-fragments) bit set, except for the last fragment, which has non-zero FragmentOffset
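
How FragmentOffset, TotalLength and the MF bit fit together can be seen in a small calculation. The sketch below assumes a 20-byte header without options and a hypothetical helper name `fragment`. Since offsets count 8-byte units, every fragment except the last carries a multiple of 8 data bytes.

```python
from typing import List, NamedTuple


class Fragment(NamedTuple):
    offset_units: int     # FragmentOffset, in multiples of 8 bytes (64 bits)
    total_length: int     # TotalLength: header plus data of this fragment
    more_fragments: bool  # MF bit


def fragment(payload_len: int, mtu: int, header_len: int = 20) -> List[Fragment]:
    """Split a payload of payload_len bytes for a link with the given MTU."""
    max_data = (mtu - header_len) // 8 * 8   # largest multiple of 8 that fits
    if max_data <= 0:
        raise ValueError("MTU too small to carry any data")
    result = []
    pos = 0
    while pos < payload_len:
        size = min(max_data, payload_len - pos)
        more = pos + size < payload_len
        result.append(Fragment(pos // 8, header_len + size, more))
        pos += size
    return result


if __name__ == "__main__":
    # 4000 bytes of payload over a link with an MTU of 1500 bytes
    for frag in fragment(4000, 1500):
        print(frag)
```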


Some Details: The DF bit

By setting the DF (don't fragment) bit in the IP header a source node forbids fragmentation by intermediate routers
When a router receives a datagram with DF set, it:
Checks whether outgoing link for this packet has an MTU large enough to transmit the packet
If so, the packet is transmitted onto next hop
If not, the router drops the datagram and returns an ICMP datagram to original IP source
ICMP with type 3 (destination unreachable) and code 4 (fragmentation required, but DF set)


Some Details: The DF bit (2)

Question
How could you use this for the sender to determine the path
MTU, defined as the smallest MTU of all links along a path
between source and destination?


Fragmentation and Reassembly: Discussion

Fragmentation/Reassembly creates significant overhead:
Several datagrams transmitted per message, each one having full IP header
Reassembly adds significant complexity to receiver
Upon loss of single fragment the whole message is possibly re-transmitted by higher layers (TCP!)
Fragmentation and reassembly complicates operation of application-level firewalls, since these also must implement reassembly logic:
Application-level firewalls look at user data of packets
When user data is spread over several fragments, the firewall must collect them all
Exception: the part of user data that is of interest is known to fit in the first fragment

Outline

The Internet
IPv4
IP Helper Protocols
ARP
ICMP
DNS


Address Resolution Protocol (ARP)

IP addresses only have a meaning to IP and higher layers
In an Ethernet, stations have own 48-bit MAC addresses
An Ethernet station picks up a packet only if the destination MAC address matches its own MAC address (ignoring broadcast/multicast); IP addresses and other packet contents are ignored
An IP address is assigned to an Ethernet adapter

Important Question
How do other stations know to which MAC address a given IP
address refers, i.e. to which station an IP packet must be sent
(encapsulated in Ethernet packet)?


Address Resolution Protocol (ARP) (2)

ARP provides a binding service: it determines MAC address for given IP address
ARP is specified in RFC 826
ARP is not restricted to Ethernet MACs, but in general is geared towards LANs with broadcast capabilities
ARP is dynamic:
The MAC address for a given IP address does not need to be statically configured, but the protocol provides a mechanism to determine this on-the-fly
Advantage: nodes can be moved or equipped with new MAC adapters without any re-configuration
Disadvantage: a separate protocol is needed, bringing additional complexity and requiring some bandwidth
There is also a protocol that lets stations find an IP address for given MAC address, this is called RARP (Reverse ARP)


Basic Operation of ARP


Suppose that:
We have two stations A and B attached to the same Ethernet, having the following addresses:

             MAC                  IP
Station A    11:11:11:11:11:11    130.149.49.11
Station B    22:22:22:22:22:22    130.149.49.22

Both A and B are in the same IP network 130.149.49.0/24, which is an Ethernet network
Station A wishes to send an IP packet to address 130.149.49.22 and does not yet have any information about the corresponding MAC address
Each station maintains an ARP Cache, which stores the mappings from IP to MAC addresses that the station currently knows about


Basic Operation of ARP (2)


Station A broadcasts an ARP-request message (displayed in Wireshark as arp who-has), indicating:
A's own IP and MAC address
B's IP address
Broadcasting means: packet is sent to Ethernet broadcast address!
Any host C having an IP address other than 130.149.49.22 simply drops the ARP-request packet
Upon receiving the ARP request, host B (with IP address 130.149.49.22) performs the following actions:
It stores a binding between A's IP and MAC address in its own ARP cache
It responds with an ARP-reply packet that includes:
B's MAC and IP address
A's MAC and IP address
ARP reply is unicast to A's MAC address (Why no broadcast?)


Basic Operation of ARP (3)


Upon receiving ARP response from B, station A stores a binding between B's IP and MAC address in its ARP cache
This procedure is called address resolution
ARP does not make any retransmissions in case the ARP request is not answered; this is left to higher layers
If a station wants to send an IP packet to a local destination with address xx.xx.xx.xx, it:
first checks the ARP cache whether a binding for xx.xx.xx.xx can be found
If so, the packet is encapsulated in an Ethernet frame and directed to the MAC address found in the ARP cache
Otherwise, the address resolution procedure is started and the packet is sent when the result is available


The ARP Cache

The entries in an ARP cache are soft-state, entries are typically removed 20 minutes after their creation
Why?
Some implementations restart the timer after each reference to an ARP cache entry
Under Linux you can inspect your ARP cache with the command:
/usr/sbin/arp -a
The path to the arp command can vary between systems
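
The soft-state behaviour of the cache can be modelled in a few lines. This is a toy model, not how any operating system implements its ARP cache; the class name `ArpCache`, its parameters and the injectable clock are assumptions for illustration.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class ArpCache:
    """IP -> MAC bindings that silently expire after `lifetime` seconds."""

    def __init__(self, lifetime: float = 20 * 60, refresh_on_use: bool = False,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.lifetime = lifetime
        self.refresh_on_use = refresh_on_use   # restart timer on each reference
        self.clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}

    def store(self, ip: str, mac: str) -> None:
        self._entries[ip] = (mac, self.clock() + self.lifetime)

    def lookup(self, ip: str) -> Optional[str]:
        entry = self._entries.get(ip)
        if entry is None:
            return None
        mac, expires = entry
        if self.clock() >= expires:            # entry timed out: forget it
            del self._entries[ip]
            return None
        if self.refresh_on_use:
            self._entries[ip] = (mac, self.clock() + self.lifetime)
        return mac


if __name__ == "__main__":
    cache = ArpCache()
    cache.store("130.149.49.22", "22:22:22:22:22:22")
    print(cache.lookup("130.149.49.22"))
    print(cache.lookup("130.149.49.33"))   # unknown: address resolution needed
```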


The ARP Frame Format

(See [12, Sect. 4])

HardType determines the type of MAC addresses used, 0x0001 for Ethernet 48-bit addresses
ProtType determines the higher-layer protocol for which address resolution needs to be done, value 0x0800 for IP
HardSize and ProtSize specify the size (in bytes) of the hardware and protocol addresses; they are 6 and 4 for Ethernet and IP
Op distinguishes between ARP-request and ARP-reply, and some other types (RARP is covered as well)
The remaining four fields are the mentioned address fields
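
The field layout can be made concrete by packing an ARP message with Python's `struct` module. The sketch assumes Ethernet and IPv4, uses the station addresses from the earlier example, and fills the unknown target MAC address of the request with zeros; `build_arp` is a name made up here. The Op values 1 (request) and 2 (reply) are those of RFC 826.

```python
import socket
import struct

ARP_REQUEST = 1
ARP_REPLY = 2


def mac_to_bytes(mac: str) -> bytes:
    return bytes.fromhex(mac.replace(":", ""))


def build_arp(op: int, sender_mac: str, sender_ip: str,
              target_mac: str, target_ip: str) -> bytes:
    """Pack HardType, ProtType, HardSize, ProtSize, Op and the four addresses."""
    return struct.pack(
        "!HHBBH6s4s6s4s",
        0x0001,                       # HardType: Ethernet
        0x0800,                       # ProtType: IP
        6, 4,                         # HardSize, ProtSize in bytes
        op,
        mac_to_bytes(sender_mac), socket.inet_aton(sender_ip),
        mac_to_bytes(target_mac), socket.inet_aton(target_ip),
    )


if __name__ == "__main__":
    request = build_arp(ARP_REQUEST, "11:11:11:11:11:11", "130.149.49.11",
                        "00:00:00:00:00:00", "130.149.49.22")
    print(len(request), "bytes:", request.hex())
```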


Introduction
ICMP = Internet Control Message Protocol
Specified in RFC 792
This protocol:
Accompanies IP by allowing routers or destination hosts to
inform sender about unusual situations, including:
There is no route to the destination
Destination host exists, but is not reachable
Fragmentation required but DF set
Operates on top of IP, i.e. ICMP messages are encapsulated in regular IP datagrams
Does not add any additional mechanisms (like error control) to the IP service
IP sending host must not rely on ICMP messages


Message Format

type and code specify actual ICMP message type and sub-type
checksum covers ICMP header and data; while it is computed, the checksum field itself is taken as zero
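
As an illustration of the checksum rule, the sketch below builds an echo request (type 8, code 0). It assumes the usual echo layout in which type, code and checksum are followed by a 16-bit identifier and a 16-bit sequence number; the function names are chosen for this example only.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit ones'-complement of the ones'-complement sum of all 16-bit words."""
    if len(data) % 2:
        data += b"\x00"                  # pad to a whole number of words
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(identifier: int, sequence: int, payload: bytes = b"") -> bytes:
    # First pass: checksum field set to zero
    draft = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence) + payload
    checksum = internet_checksum(draft)
    # Second pass: same message with the computed checksum filled in
    return struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence) + payload


if __name__ == "__main__":
    msg = build_echo_request(identifier=1, sequence=1, payload=b"ping")
    print(msg.hex())
    # A receiver recomputes the sum over the whole message and expects zero
    print(internet_checksum(msg))
```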


Some type/code Combinations


type  code  Meaning
0     0     Echo reply
3     0     Destination network unreachable
3     1     Destination host unreachable
3     2     Destination protocol unreachable
3     3     Destination port unreachable
3     4     Fragmentation required, but DF bit set
3     6     Destination network unknown
3     7     Destination host unknown
4     0     Source quench (Congestion control)
8     0     Echo request
11    0     TTL expired in transit
11    1     Fragment reassembly time exceeded

There are many more, e.g. for router advertisements, information about malformed IP packets, etc.
It is implementation-dependent which ICMP messages are generated
ICMP messages are often suppressed by firewalls, otherwise too much information about internal network structures could be revealed


Some type/code Combinations (2)


Source-quench (type=4, code=0):
generated by an IP router when it has to drop a packet
because of congestion
Intention is to let source host throttle its rate
TTL expiration (type=11, code=0):
generated by an IP router when it drops a packet because
its TTL value reached zero
Fragment reassembly timeout (type=11, code=1):
Generated by destination when not all fragments of a
message have been received within timeout
Used to invite higher-layer protocol at sending host to
re-transmit message
IP itself does not perform any retransmission!


Some type/code Combinations (3)


The destination-unreachable messages (type=3):
code=0 (destination network unreachable) and code=1 (destination host unreachable): generated when:
router finds that the cost to reach a non-directly connected host is infinite (e.g. after link failure)
router could not deliver datagram to directly connected host
code=2 (protocol unreachable): IP datagram refers to non-existent higher-layer protocol in destination (compare protocol-type field)
code=3 (port unreachable): used with TCP / UDP
code=6 (destination network unknown) and code=7 (destination host unknown): generated when:
a router could not determine a next-hop to a non-directly connected host or network
In these messages the first 32 bits of the variable ICMP message part are 0, the following bytes contain the IP header and the first few bytes of the offending IP datagram


Names vs. Addresses


Names denote / refer to things
In general: persons, cats, ships, . . .
In networks: nodes, networks, data, transactions, . . .
Often, but not always: names are unique
Addresses: information needed to find these things
Street address, IP address, MAC address
Often, but not always, unique
Addresses often have hierarchical structure to support their
intended use, e.g. in routing protocols
Binding services: map names to addresses or vice versa (also called name resolution)
Example: DNS maps www.canterbury.ac.nz to 132.181.2.23
See [9, 7] for more about names and addresses


The Domain Name System (DNS)

Initial specifications in RFC 1034 and RFC 1035
DNS is responsible for mapping human-readable names to addresses, it is a binding service
DNS is used solely by applications, it has no role in the TCP/UDP/IP protocols themselves
It has a distributed implementation:
It consists of several name servers, which assist end hosts in mapping a name to an address
No name server has the full knowledge of all bindings that exist in the Internet
Besides mapping names to IP addresses it has additional functions:
It can return an email server address for a given host
It can manage alias names for hosts
It is also possible to perform reverse lookup, i.e. mapping IP addresses to names


The DNS Name Space

(compare [12, Fig. 14.1])


The DNS Name Space (2)


The name space is hierarchical:
Arranged as a tree made of nodes
Each node has label of up to 63 characters
The domain name of any node is the (unique) list of all labels that connect it with the unnamed root
All immediate children of a node must have distinct names
In the written representation a full host name is represented by its name, followed by its domain; all labels are separated by "."
Example:
www.canterbury.ac.nz
Here:
www is the host name
canterbury.ac.nz is its domain name


DNS Zones

A zone is a sub-tree of the namespace that is administered separately
Example: ac.nz
A zone can be sub-divided into further zones, e.g. there could be zones:
canterbury.ac.nz
massey.ac.nz
For each zone multiple nameservers must be provided by the administrative owner of that zone


DNS Nameservers
A nameserver keeps a table of all name → IP-address mappings in a zone
When new host is added, administrator allocates name and IP address and enters them into table
When host is removed, table entry is deleted as well
There are primary nameservers and secondary nameservers:
These are independent and redundant servers
Reason: fault tolerance
A primary nameserver reads the mapping table from a file
A secondary nameserver reads mappings from primary nameserver (zone transfer)
Secondary nameservers update their tables regularly against a primary nameserver
A nameserver can handle several zones


DNS Client Side


Applications in hosts are DNS clients
A DNS resolver library is linked to an application
Under UNIX:
see man page for gethostbyname for a C binding to the resolver
nslookup is a command-line version of the resolver
The resolver reads a configuration file (often found under /etc/resolv.conf), which contains a line like
nameserver 130.149.14.12
The resolver uses the nameserver(s) specified in /etc/resolv.conf to perform name resolution


DNS Query Process


The host hands over a name to its local resolver
Example: www.canterbury.ac.nz
The resolver library sends a request to its nameserver
The nameserver:
Checks if the requested name is in its zone table
If so, it returns a response to the resolver, which includes the name → IP-address binding
Otherwise, it contacts a root name server
Currently there are 13 known root servers
The nameserver must know IP addresses of all of them as part of its configuration
The root server returns name and address of a nameserver responsible for the top-level domain of the request
Here: nz


DNS Query Process (2)

Continuation:
It next connects to the nameserver for zone nz, which
returns name and address of the nameserver for zone
ac.nz
It next connects to the nameserver for zone ac.nz which
returns name and address of the nameserver for zone
canterbury.ac.nz
It next connects to the nameserver for zone
canterbury.ac.nz which then returns the IP address for
host www.canterbury.ac.nz
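
The chain of referrals can be imitated with a toy model in which every nameserver is a dictionary. Nothing here talks to real DNS servers: the server names (`root`, `ns.nz`, ...) and the data layout are invented for this sketch, and only the final address is the one quoted earlier in these slides.

```python
from typing import Dict, List, Optional, Tuple

# Each toy nameserver knows some hosts of its zone and referrals to sub-zones.
SERVERS: Dict[str, Dict[str, Dict[str, str]]] = {
    "root": {"hosts": {}, "referrals": {"nz": "ns.nz"}},
    "ns.nz": {"hosts": {}, "referrals": {"ac.nz": "ns.ac.nz"}},
    "ns.ac.nz": {"hosts": {},
                 "referrals": {"canterbury.ac.nz": "ns.canterbury.ac.nz"}},
    "ns.canterbury.ac.nz": {"hosts": {"www.canterbury.ac.nz": "132.181.2.23"},
                            "referrals": {}},
}


def resolve(name: str, start: str = "root") -> Tuple[Optional[str], List[str]]:
    """Follow referrals from `start`; return (address or None, servers asked)."""
    server = start
    asked = [server]
    while True:
        data = SERVERS[server]
        if name in data["hosts"]:
            return data["hosts"][name], asked
        # Zones this server can refer us to and that contain the name
        zones = [z for z in data["referrals"]
                 if name == z or name.endswith("." + z)]
        if not zones:
            return None, asked
        server = data["referrals"][max(zones, key=len)]
        asked.append(server)


if __name__ == "__main__":
    address, asked = resolve("www.canterbury.ac.nz")
    print(address, "via", " -> ".join(asked))
```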


DNS Caching

A nameserver is required to store a name → IP-address mapping for a time that is indicated in the response of the final nameserver (caching)
Reason: when same name needs to be resolved short time later, it is not necessary to again involve all nameservers, the query can be handled from cache
Load reduction on root name servers

Bibliography

[1] David Alderson, Lun Li, Walter Willinger, and John C. Doyle.
Understanding Internet Topology: Principles, Models and Validation.
IEEE/ACM Transactions on Networking, 13(6):1205-1218, December 2005.
[2] Jon C. R. Bennett, Craig Partridge, and Nicholas Shectman.
Packet Reordering is Not Pathological Network Behaviour.
IEEE/ACM Transactions on Networking, 7(6):789-798, December 1999.
[3] David D. Clark.
The design philosophy of the DARPA internet protocols.
ACM Computer Communication Review, 18(4):106-114, August 1988.
[4] Douglas E. Comer.
Internetworking with TCP/IP - Principles, Protocols and Architecture, volume 1.
Prentice Hall, Englewood Cliffs, New Jersey, third edition, 1995.
[5] J. J. Garcia-Luna-Aceves.
Loop-free routing using diffusing computation.
IEEE/ACM Transactions on Networking, 1(1):130-141, February 1993.
[6] Kalevi Kilkki.
Differentiated Services for the Internet.
Macmillan Technical Publishing, Indianapolis, 1999.
[7] James F. Kurose and Keith W. Ross.
Computer Networking - A Top-Down Approach Featuring the Internet.
Addison-Wesley, Boston, third edition, 2001.
[8] Deepankar Medhi and Karthikeyan Ramasamy.
Network Routing - Algorithms, Protocols, and Architectures.
Morgan Kaufmann, San Francisco, California, 2007.
[9] J. H. Saltzer.
Naming and binding of objects.
In R. Bayer, R. M. Graham, and G. Seegmüller, editors, Operating Systems - An Advanced Course, Lecture Notes in Computer Science, pages 99-208. Springer, 1978.
[10] Jerome H. Saltzer, David P. Reed, and David D. Clark.
End-to-end arguments in system design.
ACM Transactions on Computer Systems, 2(4):277-288, November 1984.
[11] William Stallings.
Data and Computer Communications.
Prentice Hall, Englewood Cliffs, New Jersey, fourth edition, 2006.
[12] W. Richard Stevens.
TCP/IP Illustrated Volume 1 - The Protocols.
Addison-Wesley, Boston, Massachusetts, 1995.
[13] Paul P. White and Jon Crowcroft.
The integrated services in the internet: State of the art.
Proceedings of the IEEE, 85(12):1934-1946, December 1997.
