Chapter 1
Application Layer
This chapter presents some of the important functions of the application layer. Mainly we discuss DNS, Email and the WWW.
1.1 INTRODUCTION
The application layer is the outermost layer of the TCP/IP architecture. This layer is responsible for many of the user applications such as WWW, Email, FTP, DNS, etc. In this chapter the reader will get a basic understanding of the concepts behind some popular application layer functions. The lower layers of the TCP/IP model provide support for transport. However, some support functions are still needed at the application layer which are essential for applications to run. One of the most important of these is DNS.
1.2 DNS

We come across many applications that primarily depend upon the IP address. There is a need to map an IP address to a common, generic name, so ASCII names were introduced in place of raw IP addresses. For example, in abc@cs.sjce.ac.in, the name cs.sjce.ac.in is mapped to an IP address. However, such a mapping should not result in conflicts, and a centralized system cannot work because of the large number of host names all over the world. DNS (the Domain Name System) was introduced to alleviate these problems.
- It is a hierarchical scheme.
- It employs domain-based naming.
- It uses a distributed database.
- It primarily maps host names and e-mail servers to IP addresses.
- It is specified in RFC 1034 and RFC 1035.
Working of DNS
An application program calls a library procedure called the Resolver with its domain name as a parameter. The Resolver sends a UDP packet to the local DNS server. The DNS server searches its table and returns the IP address that matches the domain name. Armed with that, the program can establish a TCP connection or send UDP packets.
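The resolver exchange above can be sketched as a toy lookup against an in-memory table. This is only an illustration: a real server holds resource records and answers queries over UDP port 53, and the names and addresses below are made-up examples.

```python
# Toy sketch of a DNS server's table lookup, as seen by a resolver.
# The addresses are from the RFC 5737 documentation range, not real hosts.
DNS_TABLE = {
    "cs.sjce.ac.in": "192.0.2.10",
    "cs.yale.edu": "192.0.2.20",
}

def resolve(domain_name):
    """Return the IP address matching the domain name, or None."""
    # Domain names are case insensitive, and a trailing dot is ignored.
    return DNS_TABLE.get(domain_name.lower().rstrip("."))

print(resolve("CS.sjce.ac.in"))
```

Note that the lookup normalizes case first, mirroring the rule that Edu, EDU and edu name the same domain.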
- Each top-level domain is further divided into subdomains.
- Each subdomain is further divided into one or more levels of subdomains.
- Top-level domains can be split into two major classes: generic and country.
(Figure: a portion of the DNS name space. Generic top-level domains include int, edu, gov, mil, org and net; country domains include jp and us. Below them sit subdomains such as yale under edu, with cs and eng below it and ai, linda, robot further down, and ac, co under jp.)
Domain names are hierarchically arranged, with components separated by periods, e.g. cs.yale.edu in abc@cs.yale.edu and cs.sjce.ac.in in xyz@cs.sjce.ac.in.
- Domain names are case insensitive: Edu, EDU and edu are the same.
- Each component name can be up to 63 characters, and a full path name should not exceed 255 characters.
- Naming follows organizational boundaries, not physical networks.
Resource Records:
- Every domain has a set of resource records associated with it.
- For every enquiry, the Resolver is supplied with resource records.
- Thus the primary function of DNS is to map domain names onto resource records.
- A resource record has five components.
- Domain Name
- Time to live
- Class
- Type
- Value

I. Domain name: tells the domain to which this record applies. It is the primary search key.
II. Time to live: indicates how stable the record is. The most stable records have 86400 (the number of seconds in one day); unstable records have a small value such as 60 (one minute).
III. Class: for Internet information its value is IN; other codes are used for other applications.
IV. Type: tells what kind of record this is:

Type    Meaning              Value
SOA     Start of Authority   Parameters for this zone
A       IP address           32-bit integer
MX      Mail Exchange        Domain willing to accept e-mail
NS      Name Server          Name of a server for this domain
CNAME   Canonical name       Domain name
PTR     Pointer              Alias for an IP address
HINFO   Host description     CPU/OS
TXT     Text                 ASCII text
1) SOA: Gives the name of the name server for the zone.
2) A: Gives the IP address.
3) MX: Gives details about the mail server name.
4) NS: Gives the name server.
5) CNAME: Helps in connecting DNS entries, e.g. cs.mit.edu 86400 IN CNAME lcs.mit.edu
6) PTR: Pointer to another name.
7) HINFO: OS/CPU details.

Name Server

The DNS name space is divided into non-overlapping zones. Each zone contains some part of the tree and a name server holding the information about it. Zones have one primary server and one or more secondary servers. Consider the example shown below for cs.stanford.edu.
(Figure: the zone hierarchy ROOT, EDU, STANFORD, CS for cs.stanford.edu.)
We have the domain names edu, stanford.edu and cs.stanford.edu. A domain name may be fully qualified or partially qualified.

FQDN: a fully qualified domain name contains the full name of a host, with all labels.
PQDN: a partially qualified domain name contains only some of the labels.
A DOMAIN is a subtree of the domain name space.
The domain name space is huge, so it is distributed among many DNS servers. (Figure: a root server with arpa, edu, com and us servers below it; the com server holds entries such as mcgrawhill.com and cisco.com.)
Iterative Resolution
(Figure 1.5: Iterative resolution. A client at stanford.edu sends numbered queries, first to the edu server and then to the mit.edu server, until the name is resolved.)
Whenever a DNS server receives a query packet, it checks whether it is an authority for the name; if so, it sends the answer to the Resolver. If it is not, it sends the IP address of another server which it thinks can resolve the query. The client then sends the DNS request to the new DNS server; if that server is an authority it gives the IP address, else it sends the IP address of yet another DNS server. This process is called iterative resolution. The same idea is shown in Figure 1.5.
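The iterative process can be sketched as a loop in which each server either answers authoritatively or refers the client to another server. The server names, referral tables and address below are made up purely for illustration.

```python
# Toy iterative resolution. Each "server" either knows the answer
# (it is the authority) or returns a referral to the next server to ask.
SERVERS = {
    "root":            {"referral": {"edu": "edu-server"}},
    "edu-server":      {"referral": {"stanford.edu": "stanford-server"}},
    "stanford-server": {"answer": {"cs.stanford.edu": "192.0.2.80"}},
}

def iterative_resolve(name, server="root"):
    while True:
        srv = SERVERS[server]
        if name in srv.get("answer", {}):
            return srv["answer"][name]        # authoritative answer
        # Otherwise follow the referral whose zone is a suffix of the name.
        for zone, next_server in srv.get("referral", {}).items():
            if name.endswith(zone):
                server = next_server
                break
        else:
            return None                       # no server could resolve it

print(iterative_resolve("cs.stanford.edu"))
```

The key point the loop captures is that the client, not the servers, drives each step: every reply either ends the search or names the next server to contact.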
Caching
Whenever a DNS server gets a query for a name resolution that is not in its domain, it obtains the server's IP address and caches it. Whenever a similar query is encountered later, it first checks the cache and returns the answer, which increases the speed. TTL is a time in seconds for which the server may cache the information. After this time the information is invalid and any query must be sent again to the authoritative server.
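The TTL rule above can be sketched as a small cache class. This is a simplified illustration: the clock is passed in explicitly so the expiry logic is easy to follow, whereas a real server would read the system time.

```python
class DnsCache:
    """Cache of name -> (address, expiry time); entries die after TTL seconds."""
    def __init__(self):
        self.entries = {}

    def put(self, name, address, ttl, now):
        self.entries[name] = (address, now + ttl)

    def get(self, name, now):
        if name in self.entries:
            address, expires = self.entries[name]
            if now < expires:
                return address          # still valid: answer from cache
            del self.entries[name]      # stale: must re-query the authority
        return None

cache = DnsCache()
cache.put("cs.yale.edu", "192.0.2.20", ttl=60, now=0)
print(cache.get("cs.yale.edu", now=30))   # within the 60-second TTL
print(cache.get("cs.yale.edu", now=90))   # past the TTL: entry discarded
```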
1.3 EMAIL
Architecture and Services
Electronic mail is the most widely used tool in the present world for fast and reliable communication. It is based on RFC 822. From an architectural point of view it has two components:

- User Agents (UA)
- Message Transfer Agents (MTA)
User agents are local programs that provide a command-based, menu-based or graphical method for interacting with the email system. Message transfer agents are system daemons (processes that run in the background) that move e-mail through the system.
Composition
Some user agents have a menu- or icon-driven interface while others use commands.
Sending an E-mail:
User must provide a message, destination address and other data.
- Messages are generated using built-in editors.
- The destination address follows the format user@dns-address.
- Instead of entering the full address, one can use an alias.
- Email can also be sent to a mailing list.
Reading Email: When a user agent is invoked, it looks at the mailbox and displays a summary of the messages.
(Figure: a mailbox summary display, with columns such as Flags and Bytes for each message.)
Each line of the display is extracted from the mail envelope or the corresponding header. The user can use any of the following commands:

R: Reply
C: Compose
D: Delete
F: Forward
E: Exit
The message transfer system is concerned with relaying messages from the originator to the recipient. The simplest way to do this is to establish a transport connection from the source machine to the destination machine and then just transfer the message.
The first three lines are from telnet telling you what it is doing. The last line is from the SMTP server on the remote machine announcing its willingness to talk to you and accept e-mail.
POP3
Unfortunately, this solution creates another problem: how does the user get the email from the ISP's message transfer agent? The solution is to create another protocol that allows user agents (on client PCs) to contact the message transfer agent (on the ISP's machine) and allow email to be copied from the ISP to the user. One such protocol is POP3 (Post Office Protocol Version 3), which is described in RFC 1939. The situation that used to hold (both sender and receiver having a permanent connection to the Internet) is illustrated in Fig. 1.19.
(Figure 1.19: the user agent on the sending host passes mail to its message transfer agent, which relays it across the Internet via SMTP to the message transfer agent on the receiving host; both hosts have a permanent connection.)
POP3 begins when the user starts the mail reader. The mail reader calls up the ISP (unless there is already a connection) and establishes a TCP connection with the message transfer agent at port 110. Once the connection has been established, the POP3 protocol goes through three states in sequence:

1. Authorization
2. Transactions
3. Update

The authorization state deals with having the user log in. The transaction state deals with the user collecting the emails and marking them for deletion from the mailbox. The update state actually causes the emails to be deleted.

This behavior can be observed by typing something like:

telnet mail.isp.com 110

where mail.isp.com represents the DNS name of your ISP's mail server. Telnet establishes a TCP connection to port 110, on which the POP3 server listens. Upon accepting the TCP connection, the server sends an ASCII message announcing that it is present. Usually it begins with +OK followed by a comment. An example scenario is shown in Fig. 7-16, starting after the TCP connection has been established. As before, the lines marked C: are from the client (user) and those marked S: are from the server (the message transfer agent on the ISP's machine).

During the authorization state, the client sends over its user name and then its password. After a successful login, the client can send over the LIST command, which causes the server to list the contents of the mailbox, one message per line, giving the length of each message. The list is terminated by a period. Then the client can retrieve messages using the RETR command and mark them for deletion with DELE. When all messages have been retrieved (and possibly marked for deletion), the client gives the QUIT command to terminate the transaction state and enter the update state. When the server has deleted all the messages, it sends a reply and breaks the TCP connection.
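The three-state sequence can be sketched as a tiny server-side state machine. This is a deliberately simplified model (for instance it skips the USER command and the exact reply formats); a real server follows RFC 1939 precisely.

```python
class Pop3Session:
    """Simplified POP3 session: authorization -> transaction -> update."""
    def __init__(self, password, mailbox):
        self.state = "authorization"
        self.password = password
        self.mailbox = list(mailbox)
        self.deleted = set()

    def command(self, verb, arg=None):
        if self.state == "authorization":
            if verb == "PASS" and arg == self.password:
                self.state = "transaction"
                return "+OK logged in"
            return "-ERR authorization first"
        if self.state == "transaction":
            if verb == "LIST":
                return "+OK " + str(len(self.mailbox)) + " messages"
            if verb == "RETR":
                return "+OK " + self.mailbox[int(arg) - 1]
            if verb == "DELE":
                self.deleted.add(int(arg) - 1)   # only MARK for deletion
                return "+OK marked for deletion"
            if verb == "QUIT":
                # Update state: deletions actually happen only now.
                self.mailbox = [m for i, m in enumerate(self.mailbox)
                                if i not in self.deleted]
                self.state = "update"
                return "+OK bye"
        return "-ERR"

s = Pop3Session("secret", ["hello", "meeting at 4"])
s.command("PASS", "secret")
s.command("DELE", "1")
print(s.command("QUIT"), "-", len(s.mailbox), "message left")
```

Note how DELE only marks a message: the mailbox shrinks only when QUIT moves the session into the update state, exactly as described above.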
While it is true that the POP3 protocol supports the ability to download a specific message or set of messages and leave them on the server, most email programs just download everything and empty the mailbox. This behavior means that in practice, the only copy is on the user's hard disk. If that crashes, all email may be lost permanently.
1.4 WWW

The web browser:
- It is a powerful GUI that presents the required information in an attractive way to the user.
- The first graphical browser was developed by Marc Andreessen of the University of Illinois in 1993 and was named Mosaic.
- Netscape is another browser (1994).
- IE is the default Windows-based browser.
Architectural Overview
From the user's point of view, the WWW is a collection of web documents (web pages), or simply pages.
- Each page contains links to other pages, which may be present elsewhere on machines in the Internet.
- The idea of one page pointing to another page is called hypertext.
- Pages are viewed with a program called a browser; the browser fetches the requested page, formats it and displays it.
- On a web page, strings of text which are underlined are links to other web pages. These are called hyperlinks.
- Page fetching is done by the browser. The user's work is only to click the mouse button on the required link.
(Figure: pages on servers such as abc.com and xyz.com with hyperlinks between them; each server runs a web server process with its own disk.)
The entire process of obtaining a web page on the client machine in response to a click on a URL falls into two major parts:
1. The dynamics happening on the client machine.
2. The dynamics happening on the server machine.
These are described below.
The web browser is essentially an HTML interpreter. Browsers have many buttons that provide facilities for easy navigation, e.g.
- Previous page
- Next page
- Bookmark, etc.

Not all web pages contain HTML:
- Web pages may also include PDF, JPEG, MP3 or MPEG data.
- So a general approach is used to represent them: along with a web page, the server also sends additional information about the page. This information uses the MIME format (RFC 1341), Multipurpose Internet Mail Extensions.
- Whenever the browser encounters a format it cannot display readily, it consults its MIME table to understand how to display the page.
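Python's standard mimetypes module performs the same kind of mapping a browser's MIME table does, from a file name to a MIME type that tells the browser how to handle the content:

```python
import mimetypes

# Map file names to MIME types, the way a browser consults its MIME
# table to decide how to render content it cannot display natively.
for name in ["report.pdf", "photo.jpeg", "song.mp3", "page.html"]:
    mime, _encoding = mimetypes.guess_type(name)
    print(name, "->", mime)
```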
Use of plug in
A plug-in is a code module that the browser fetches from a special directory on the disk and installs as an extension to itself. Thus plug-ins run inside the browser, so they have access to the current page and can display it. A plug-in is removed when it is no longer needed.
(Figure: the browser runs as a single process. The browser's base code exposes an interface used by the plug-in, and the plug-in exposes an interface used by the browser.)
- Some code is specific to each browser's plug-in interface.
- Plug-ins come pre-installed on Windows; on Unix the installer is a shell script.
Helper application
This is another way to display MIME documents. A helper application is a separate process. Helper applications are large programs that do not have any interface with the browser; they accept the name of a file and simply open it, e.g. Adobe Acrobat or Word. So a URL can point directly to a PDF or a Word document. Other helper applications include Adobe Photoshop (image/x-photoshop) and RealOne Player (audio/mp3).
The browser can also fetch local files. For local files it depends heavily on the file extension rather than on MIME types.
Each processing module verifies the cache and responds if the file exists; otherwise it invokes a disk search, caches the file and sends it to the client. At any instant of time, out of k modules, k - x modules may be free to take requests while x modules are in the queue waiting for disk access or a cache search. If the number of disks is increased, the speed can be enhanced.
(Figure: a front end receives incoming requests and dispatches them to the processing modules.)
Each module does the following:
1. Resolve the name of the web page requested, e.g. http://www.cisco.com has no file name; the default is index.html.
2. Authenticate the client, because some pages are not available to the public.
3. Perform access control on the client: check whether there are any restrictions on it.
4. Perform access control on the web page: check access restrictions on the page itself.
5. Check the cache.
6. Fetch the requested page.
7. Determine the MIME type.
8. Take care of miscellaneous odds and ends (building a user profile, etc.).
9. Return the reply to the client.
10. Make an entry in the server log.
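The main path of one processing module can be sketched as below. This is only an outline: the authentication and access-control steps are stubbed out, and the cache, disk and log are plain in-memory objects standing in for the real components.

```python
import mimetypes

def handle_request(url_path, cache, disk, log):
    """One module's handling of a request, following the numbered steps."""
    name = url_path or "index.html"        # step 1: resolve the page name
    # Steps 2-4 (authentication, access control) are omitted in this sketch.
    if name in cache:                      # step 5: check the cache
        body = cache[name]
    else:
        body = disk.get(name)              # step 6: fetch the page from disk
        if body is None:
            log.append(("404", name))      # step 10: log the failure
            return None
        cache[name] = body
    mime, _ = mimetypes.guess_type(name)   # step 7: determine the MIME type
    log.append(("200", name))              # step 10: server log entry
    return (mime, body)                    # step 9: return the reply

cache, log = {}, []
disk = {"index.html": "<html>hi</html>"}
print(handle_request("", cache, disk, log))
```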
If too many requests come in each second, the CPU will not be able to handle the processing load, irrespective of the number of disks used in parallel. The solution is to add more machines, with replicated disks. This is called a server farm. A front end still accepts requests and sprays them over the machines, rather than over multiple threads, to reduce the load on any one machine. The individual machines are again multithreaded, with multiple disks.
Note that the cache is local to each machine. The TCP connection should terminate at the processing node, not at the front end.
URL
A URL needs to specify:
1. What is the page called?
2. Where is the page located?
3. How can the page be accessed?

Examples: http://www.sjce.ac.in and http://www.sjce.ac.in/~tnn
Some of the common URL schemes include http, ftp, file, mailto, telnet, etc.
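The three questions above map directly onto the parts of a URL, which Python's urllib.parse splits out:

```python
from urllib.parse import urlparse

url = urlparse("http://www.sjce.ac.in/~tnn")
print(url.scheme)   # how the page can be accessed (the protocol)
print(url.netloc)   # where the page is located (DNS name of the host)
print(url.path)     # what the page is called (path on that host)
```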
1.5 SUMMARY
In this chapter we presented technologies pertaining to some popular application layer functions. We provided details about DNS, Email and the WWW. More details on these and other applications such as HTTP, FTP and Telnet can be read from the reference books.
1.6 QUESTIONS
1. List the functions of the application layer.
2. What is DNS?
3. Explain the working of DNS with an example.
4. What is iterative resolution? Give an example.
5. What are the functions of the user agent in the Email architecture?
6. Discuss the sending and receiving process in Email.
7. Write a brief note on SMTP.
8. Discuss the working of POP3 in an Email system. What are its limitations?
9. What is WWW?
10. Discuss the architecture of the WWW.
11. Explain the client-side and server-side events when a user clicks on a URL.
Chapter 2
Routing Protocols
The objectives of this chapter are to:
- discuss the two types of connections for effecting datagram transfer between networks
- discuss direct and indirect routing
- discuss different routing protocols
2.1 INTRODUCTION
One of the main objectives of the network layer is to deliver packets to the destination. The delivery of packets is often accomplished using either a connection-oriented or a connectionless network service. In a connection-oriented approach, the network layer protocol first makes a connection with the network layer protocol at the remote site before sending a packet. When the connection is established, a sequence of packets from the same source to the same destination can be sent one after another. In this case, there is a relationship between packets: they are sent on the same path, where they follow each other. A packet is logically connected to the packet traveling before it and to the packet traveling after it. When all packets of a message have been delivered, the connection is terminated. In a connection-oriented approach, the decision about the route of a sequence of packets with the same source and destination addresses can be made only once, when the connection is established. The network device will not compute the route again and again for each arriving packet. In a connectionless situation, the network protocol treats each packet independently, with each packet having no relationship to any other packet. The packets in a message may not travel the same path to their destination. The Internet Protocol (IP) is a connectionless protocol. It handles each packet transfer in a separate way. This means each
packet may travel through different networks before reaching its destination network. Thus the packets move through heterogeneous networks using the connectionless IP protocol.
(Figure 2.1: direct delivery, with Host 1, Host 2 and Host 3 attached to the same network.)
The sender can easily determine whether the delivery is direct. It extracts the network address of the destination (by masking off the host bits of the address) and compares this address with the addresses of the networks to which it is connected. If a match is found, the delivery is direct. In direct delivery, the sender uses the destination IP address to find the destination physical address; the IP software then hands the packet, together with the destination physical address, to the data link layer for actual delivery. In practice, a protocol called the Address Resolution Protocol (ARP) dynamically maps an IP address to the corresponding physical address. It is to be noted that the IP address is a four-byte code whereas the physical address is a six-byte code. The physical address is also called the MAC address, Ethernet address or hardware address.

When the network part of the IP address does not match any network address to which the host is connected, the packet is delivered indirectly. In an indirect delivery, the packet goes from router to router until it reaches the one connected to the same physical network as its final destination (Figure 2.2).
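The masking test described above is exactly what Python's ipaddress module implements, so the direct/indirect decision can be sketched in a few lines. The addresses used are documentation examples, not real networks.

```python
import ipaddress

def is_direct_delivery(destination_ip, connected_networks):
    """Direct if the destination's network matches a directly attached one."""
    dest = ipaddress.ip_address(destination_ip)
    # Masking off the host bits and comparing network addresses is what
    # the 'in' containment test performs for each attached network.
    return any(dest in net for net in connected_networks)

nets = [ipaddress.ip_network("192.0.2.0/24")]    # networks this host is on
print(is_direct_delivery("192.0.2.55", nets))    # same network: direct
print(is_direct_delivery("198.51.100.9", nets))  # different: via a router
```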
- Note that a delivery always involves one direct delivery and zero or more indirect deliveries.
- Note also that the last delivery is always a direct delivery.

In an indirect delivery, the sender uses the destination IP address and a routing table to find the IP address of the next router to which the packet should be delivered. The sender then uses ARP to find the physical address of that next router. Note that in direct delivery, the address mapping is between the IP address of the final destination and the physical address of the final destination; in an indirect delivery, the address mapping is between the IP address of the next router and the physical address of the next router.
(Figure 2.2: Indirect delivery. Hosts 1-6 are attached to Networks 1, 2 and 3, which are interconnected by routers; a packet passes from router to router until it reaches the network of its final destination.)
Routing tables are used in routers. The routing table contains a list of the IP addresses of neighboring routers. When a router has received a packet to be forwarded, it looks at this table to find the route to the final destination. However, this simple solution is impractical today in an internetwork such as the Internet, because the number of entries in the routing table would make table lookups inefficient. Several techniques can make the size of the routing table manageable and also handle issues such as security.
Next-hop routing
One technique to make the contents of a routing table smaller is called next-hop routing. In this technique, the routing table holds only the address of the next hop instead of information about the complete route. The routing tables of different routers are thereby kept consistent with each other.
Network-specific routing
A second technique to make the routing table smaller and the searching process simpler is called network-specific routing. Here, instead of having an entry for every host connected to the same physical network, we have only one entry that defines the address of the network itself. In other words, we treat all hosts connected to the same network as a single entity. For example, if 1,000 hosts are attached to the same network, only one entry exists in the routing table instead of 1,000.
Host-specific routing
In host-specific routing, the host address is given in the routing table. The idea of host-specific routing is the inverse of network-specific routing. Here efficiency is sacrificed for other advantages: although it is not efficient to put a host address in the routing table, there are occasions on which the administrator wants more control over routing. Host-specific routing is a good choice for certain purposes, such as checking the route or providing security measures.
Default routing
Another technique used to simplify routing is default routing. In Figure 6.6, host A is connected to a network with two routers. Router R1 is used to route packets to hosts connected to network N2. However, for the rest of the Internet, router R2 should be used. So instead of listing all networks in the entire Internet, host A can just have one entry called the default route (network address 0.0.0.0). A host or a router keeps a routing table, with an entry for each destination, to route IP packets. The routing table can be either static or dynamic.
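A next-hop table combining network-specific entries with a default route can be sketched with longest-prefix matching: the most specific matching entry wins, and 0.0.0.0/0 matches everything as a last resort. The networks and router names below are illustrative.

```python
import ipaddress

# (network, next hop) entries; 0.0.0.0/0 is the default route.
ROUTING_TABLE = [
    (ipaddress.ip_network("192.0.2.0/24"), "direct"),
    (ipaddress.ip_network("198.51.100.0/24"), "R1"),
    (ipaddress.ip_network("0.0.0.0/0"), "R2"),     # default route
]

def next_hop(destination_ip):
    dest = ipaddress.ip_address(destination_ip)
    # Longest-prefix match: the most specific matching network wins,
    # so the default /0 entry is chosen only when nothing else matches.
    matches = [(net.prefixlen, hop) for net, hop in ROUTING_TABLE if dest in net]
    return max(matches)[1]

print(next_hop("198.51.100.7"))   # network-specific entry
print(next_hop("203.0.113.9"))    # falls through to the default
```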
A static routing table contains information entered manually. The administrator enters the route for each destination into the table. Once created, the table cannot update automatically when there is a change in the internet; it must be altered manually by the administrator. A static routing table can be used in a small internet that does not change very much, or in an experimental internet for troubleshooting. It is not a good strategy to use a static routing table in a big internet such as the Internet.
Routing Updates
RIP sends routing-update messages at regular intervals and when the network topology changes. When a router receives a routing update that includes changes to an entry, it updates its routing table to reflect the new route. The metric value for the path is increased by 1, and the sender is indicated as the next hop. RIP routers maintain only the best route (the route with the lowest metric value) to a destination. After updating its routing table, the router immediately begins transmitting routing updates to inform other
network routers of the change. These updates are sent independently of the regularly scheduled updates that RIP routers send.
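The update rule just described, add 1 to the neighbor's advertised metric and keep only the best route, can be sketched as a single distance-vector merge step. This is a simplification for illustration, not a full RIP implementation (no timers, no packet format).

```python
def rip_update(table, neighbor, advertised):
    """Merge a neighbor's advertised routes into our routing table.

    table:      {destination: (metric, next_hop)}
    advertised: {destination: metric} as received from the neighbor
    Returns the set of destinations whose route changed.
    """
    changed = set()
    for dest, metric in advertised.items():
        new_metric = min(metric + 1, 16)     # 16 means unreachable
        old = table.get(dest)
        # Accept the route if it is new, cheaper, or comes from the
        # router we already use as next hop (its metric is authoritative).
        if old is None or new_metric < old[0] or old[1] == neighbor:
            if old != (new_metric, neighbor):
                table[dest] = (new_metric, neighbor)
                changed.add(dest)
    return changed

table = {"netA": (5, "R3")}
rip_update(table, "R2", {"netA": 2, "netB": 1})
print(table)   # netA now cheaper via R2; netB newly learned via R2
```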
RIP Timers
RIP uses numerous timers to regulate its performance. These include a routing-update timer, a route-timeout timer, and a route-flush timer. The routing-update timer clocks the interval between periodic routing updates. Generally, it is set to 30 seconds, with a small random amount of time added whenever the timer is reset. This is done to help prevent congestion, which could result from all routers simultaneously attempting to update their neighbors. Each routing table entry has a route-timeout timer associated with it. When the route-timeout timer expires, the route is marked invalid but is retained in the table until the route-flush timer expires.
Packet Formats
The following section focuses on the IP RIP and IP RIP 2 packet formats illustrated in Figures 2.3 and 2.4. Each illustration is followed by descriptions of the fields illustrated.
An IP RIP Packet Consists of Nine Fields
The following descriptions summarize the IP RIP packet format fields illustrated in Figure 2.3:

- Command: Indicates whether the packet is a request or a response. The request asks that a router send all or part of its routing table. The response can be an unsolicited regular routing update or a reply to a request. Responses contain routing table entries. Multiple RIP packets are used to convey information from large routing tables.
- Version number: Specifies the RIP version used. This field can signal different, potentially incompatible versions.
- Zero: This field is not actually used by RFC 1058 RIP; it was added solely to provide backward compatibility with pre-standard varieties of RIP. Its name comes from its default value: zero.
- Address-family identifier (AFI): Specifies the address family used. RIP is designed to carry routing information for several different protocols. Each entry has an address-family identifier to indicate the type of address being specified. The AFI for IP is 2.
- Address: Specifies the IP address for the entry.
- Metric: Indicates how many internetwork hops (routers) have been traversed in the trip to the destination. This value is between 1 and 15 for a valid route, or 16 for an unreachable route.
Figure 2.4 : An IP RIP 2 Packet Consists of Fields Similar to Those of an IP RIP Packet
The following descriptions summarize the IP RIP 2 packet format fields illustrated in Figure 2.4:

- Command: Indicates whether the packet is a request or a response. The request asks that a router send all or part of its routing table. The response can be an unsolicited regular routing update or a reply to a request. Responses contain routing table entries. Multiple RIP packets are used to convey information from large routing tables.
- Version: Specifies the RIP version used. In a RIP packet implementing any of the RIP 2 fields or using authentication, this value is set to 2.
- Unused: Has a value set to zero.
- Address-family identifier (AFI): Specifies the address family used. RIPv2's AFI field functions identically to RFC 1058 RIP's AFI field, with one exception: if the AFI for the first entry in the message is 0xFFFF, the remainder of the entry contains authentication information. Currently, the only authentication type is a simple password.
- Route tag: Provides a method for distinguishing between internal routes (learned by RIP) and external routes (learned from other protocols).
- IP address: Specifies the IP address for the entry.
- Subnet mask: Contains the subnet mask for the entry. If this field is zero, no subnet mask has been specified for the entry.
- Next hop: Indicates the IP address of the next hop to which packets for the entry should be forwarded.
- Metric: Indicates how many internetwork hops (routers) have been traversed in the trip to the destination. This value is between 1 and 15 for a valid route, or 16 for an unreachable route.
OSPF

OSPF was derived from several research efforts, including BBN's SPF algorithm developed in 1978 for the ARPANET (a packet-switching network developed in the early 1970s by BBN), Dr. Radia Perlman's research on fault-tolerant broadcasting of routing information (1988), BBN's work on area routing (1986), and an early version of OSI's Intermediate System-to-Intermediate System (IS-IS) routing protocol.

OSPF has two primary characteristics. The first is that the protocol is open, which means that its specification is in the public domain; the OSPF specification is published as Request For Comments (RFC) 1247. The second principal characteristic is that OSPF is based on the SPF algorithm, which is sometimes referred to as the Dijkstra algorithm, named for the person credited with its creation.

OSPF is a link-state routing protocol that calls for the sending of link-state advertisements (LSAs) to all other routers within the same hierarchical area. Information on attached interfaces, metrics used, and other variables is included in OSPF LSAs. As OSPF routers accumulate link-state information, they use the SPF algorithm to calculate the shortest path to each node. As a link-state routing protocol, OSPF contrasts with RIP and IGRP, which are distance-vector routing protocols: routers running a distance-vector algorithm send all or a portion of their routing tables in routing-update messages to their neighbors.
Routing Hierarchy
Unlike RIP, OSPF can operate within a hierarchy. The largest entity within the hierarchy is the autonomous system (AS), which is a collection of networks under a common administration that share a common routing strategy. OSPF is an intra-AS (interior gateway) routing protocol, although it is capable of receiving routes from and sending routes to other ASes. An AS can be divided into a number of areas, which are groups of contiguous networks and attached hosts. Routers with multiple interfaces can participate in multiple areas. These routers, which are called Area Border Routers, maintain separate topological databases for each area. A topological database is essentially an overall picture of networks in relationship to routers. The topological database contains the collection of LSAs received from all routers in the same area. Because routers within the same area share the same information, they have identical topological databases. The term domain is sometimes used to describe a portion of the network in which all routers have identical topological databases; domain is frequently used interchangeably with AS. An area's topology is invisible to entities outside the area. By keeping area topologies separate, OSPF passes less routing traffic than it would if the AS were not partitioned. Area partitioning creates two different types of OSPF routing, depending on whether the source and the destination are in the same or different areas. Intra-area routing occurs when the source and destination are in the same area; inter-area routing occurs when they are in different areas.
An OSPF backbone is responsible for distributing routing information between areas. It consists of all Area Border Routers, networks not wholly contained in any area, and their attached routers. Figure 2.5 shows an example of an internetwork with several areas. In the figure, routers 4, 5, 6, 10, 11, and 12 make up the backbone. If Host H1 in Area 3 wants to send a packet to Host H2 in Area 2, the packet is sent to Router 13, which forwards the packet to Router 12, which sends the packet to Router 11. Router 11 then forwards the packet along the backbone to Area Border Router 10, which sends the packet through two intra-area routers (Router 9 and Router 7) to be forwarded to Host H2. The backbone itself is an OSPF area, so all backbone routers use the same procedures and algorithms to maintain routing information within the backbone that any area router would. The backbone topology is invisible to all intra-area routers, as are individual area topologies to the backbone. Areas can be defined in such a way that the backbone is not contiguous. In this case, backbone connectivity must be restored through virtual links. Virtual links are configured between any backbone routers that share a link to a nonbackbone area and function as if they were direct links.
AS border routers running OSPF learn about exterior routes through exterior gateway protocols (EGPs), such as Exterior Gateway Protocol (EGP) or Border Gateway Protocol (BGP), or through configuration information.
SPF Algorithm
The Shortest Path First (SPF) routing algorithm is the basis for OSPF operations. When an SPF router is powered up, it initializes its routing-protocol data structures and then waits for indications from lower-layer protocols that its interfaces are functional. After a router is assured that its interfaces are functioning, it uses the OSPF Hello protocol to acquire neighbors, which are routers with interfaces to a common network. The router sends hello packets to its neighbors and receives their hello packets. In addition to helping acquire neighbors, hello packets also act as keepalives to let routers know that other routers are still functional. On multiaccess networks (networks supporting more than two routers), the Hello protocol elects a designated router and a backup designated router. Among other things, the designated router is responsible for generating LSAs for the entire multiaccess network. Designated routers allow a reduction in network traffic and in the size of the topological database. When the link-state databases of two neighboring routers are synchronized, the routers are said to be adjacent. On multiaccess networks, the designated router determines which routers should become adjacent. Topological databases are synchronized between pairs of adjacent routers. Adjacencies control the distribution of routing-protocol packets, which are sent and received only on adjacencies. Each router periodically sends an LSA to provide information on a router's adjacencies or to inform others when a router's state changes. By comparing established adjacencies to link states, failed routers can be detected quickly, and the network's topology can be altered appropriately. From the topological database generated from LSAs, each router calculates a shortest-path tree, with itself as root. The shortest-path tree, in turn, yields a routing table.
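The SPF calculation itself is Dijkstra's algorithm run over the link-state database. A compact sketch on a small, made-up three-router topology:

```python
import heapq

def shortest_path_tree(links, root):
    """Dijkstra over a link-state database {router: {neighbor: cost}}.
    Returns {router: cost of the shortest path from root}."""
    dist = {root: 0}
    heap = [(0, root)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > dist.get(node, float("inf")):
            continue                          # stale heap entry; skip
        for neighbor, link_cost in links[node].items():
            new_cost = cost + link_cost
            if new_cost < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_cost
                heapq.heappush(heap, (new_cost, neighbor))
    return dist

# A made-up area: R1-R2 cost 1, R2-R3 cost 2, R1-R3 cost 5.
links = {
    "R1": {"R2": 1, "R3": 5},
    "R2": {"R1": 1, "R3": 2},
    "R3": {"R1": 5, "R2": 2},
}
print(shortest_path_tree(links, "R1"))
```

From R1, the path to R3 via R2 (cost 3) beats the direct link (cost 5), which is exactly the kind of decision the shortest-path tree encodes before it is turned into a routing table.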
Packet Format
All OSPF packets begin with a 24-byte header, as illustrated in Figure 2.6.
The following descriptions summarize the header fields illustrated in Figure 2.6.
- Version number: Identifies the OSPF version used.
- Type: Identifies the OSPF packet type as one of the following:
  - Hello: Establishes and maintains neighbor relationships.
  - Database description: Describes the contents of the topological database. These messages are exchanged when an adjacency is initialized.
  - Link-state request: Requests pieces of the topological database from neighbor routers. These messages are exchanged after a router discovers (by examining database-description packets) that parts of its topological database are outdated.
  - Link-state update: Responds to a link-state request packet. These messages are also used for the regular dispersal of LSAs. Several LSAs can be included within a single link-state update packet.
  - Link-state acknowledgment: Acknowledges link-state update packets.
- Packet length: Specifies the packet length, including the OSPF header, in bytes.
- Router ID: Identifies the source of the packet.
- Area ID: Identifies the area to which the packet belongs. All OSPF packets are associated with a single area.
- Checksum: Checks the entire packet contents for any damage suffered in transit.
- Authentication type: Contains the authentication type. All OSPF protocol exchanges are authenticated. The authentication type is configurable on a per-area basis.
- Authentication: Contains authentication information.
- Data: Contains encapsulated upper-layer information.
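The fixed 24-byte layout described above can be decoded with Python's struct module. A sketch, assuming the standard OSPFv2 field widths (1-byte version and type, 2-byte length, 4-byte router and area IDs, 2-byte checksum and authentication type, 8 bytes of authentication data):

```python
import struct

OSPF_HEADER = struct.Struct("!BBH4s4sHH8s")   # network byte order, 24 bytes

def parse_ospf_header(packet):
    """Split the 24-byte OSPF header off the front of a packet."""
    version, ptype, length, router_id, area_id, checksum, auth_type, _auth = \
        OSPF_HEADER.unpack(packet[:OSPF_HEADER.size])
    return {
        "version": version,      # 2 for OSPFv2
        "type": ptype,           # 1=Hello ... 5=Link-state acknowledgment
        "length": length,        # whole packet, header included
        "router_id": ".".join(str(b) for b in router_id),
        "area_id": ".".join(str(b) for b in area_id),
        "checksum": checksum,
        "auth_type": auth_type,
    }

# A hand-built Hello header: version 2, type 1, length 24, router 1.1.1.1, area 0
raw = OSPF_HEADER.pack(2, 1, 24, bytes([1, 1, 1, 1]), bytes(4), 0, 0, bytes(8))
print(parse_ospf_header(raw))
```

The checksum and authentication fields are carried through unverified here; a real implementation would validate both before processing the packet.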
OSPF can calculate separate routes for each of the three IP TOS bits (the delay, throughput, and reliability bits). For example, if the IP TOS bits specify low delay, low throughput, and high reliability, OSPF calculates routes to all destinations based on this TOS designation.

IP subnet masks are included with each advertised destination, enabling variable-length subnet masks. With variable-length subnet masks, an IP network can be broken into many subnets of various sizes. This provides network administrators with extra network-configuration flexibility.
BGP is a very robust and scalable routing protocol, as evidenced by the fact that BGP is the routing protocol employed on the Internet. At the time of this writing, the Internet BGP routing tables number more than 90,000 routes. To achieve scalability at this level, BGP uses many route parameters, called attributes, to define routing policies and maintain a stable routing environment.

In addition to BGP attributes, classless interdomain routing (CIDR) is used by BGP to reduce the size of the Internet routing tables. For example, assume that an ISP owns the IP address block 195.10.x.x from the traditional Class C address space. This block consists of 256 Class C address blocks, 195.10.0.x through 195.10.255.x. Assume that the ISP assigns a Class C block to each of its customers. Without CIDR, the ISP would advertise 256 Class C address blocks to its BGP peers. With CIDR, BGP can supernet the address space and advertise one block, 195.10.x.x. This block is the same size as a traditional Class B address block. The class distinctions are rendered obsolete by CIDR, allowing a significant reduction in the BGP routing tables.

BGP neighbors exchange full routing information when the TCP connection between neighbors is first established. When changes to the routing table are detected, the BGP routers send to their neighbors only those routes that have changed. BGP routers do not send periodic routing updates, and BGP routing updates advertise only the optimal path to a destination network.
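The supernetting arithmetic in the example can be checked with Python's ipaddress module: the 256 Class C (/24) blocks 195.10.0.x through 195.10.255.x collapse into the single aggregate 195.10.0.0/16.

```python
import ipaddress

# The 256 former Class C (/24) blocks owned by the ISP
blocks = [ipaddress.ip_network(f"195.10.{i}.0/24") for i in range(256)]

# collapse_addresses merges contiguous prefixes into the shortest covering set
aggregate = list(ipaddress.collapse_addresses(blocks))
print(aggregate)   # [IPv4Network('195.10.0.0/16')]
```

One advertised prefix instead of 256: this is exactly the routing-table reduction CIDR provides.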
BGP Attributes
Routes learned via BGP have associated properties that are used to determine the best route to a destination when multiple paths exist to a particular destination. These properties are referred to as BGP attributes, and an understanding of how BGP attributes influence route selection is required for the design of robust networks. This section describes the attributes that BGP uses in the route selection process:
- Weight
- Local preference
- Multi-exit discriminator
- Origin
- AS_path
- Next hop
- Community
Weight Attribute
Weight is a Cisco-defined attribute that is local to a router. The weight attribute is not advertised to neighboring routers. If the router learns about more than one route to the same destination, the route with the highest weight will be preferred. In Figure 2.8, Router A is receiving an advertisement for network
172.16.1.0 from routers B and C. When Router A receives the advertisement from Router B, the associated weight is set to 50. When Router A receives the advertisement from Router C, the associated weight is set to 100. Both paths for network 172.16.1.0 will be in the BGP routing table, with their respective weights. The route with the highest weight will be installed in the IP routing table.
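The weight comparison in this scenario can be sketched directly. The router names and weight values below are taken from the example above; the table structure itself is illustrative, not a real BGP implementation:

```python
# Advertisements for 172.16.1.0 held in Router A's BGP table, keyed by the
# neighbor they were learned from, each carrying its locally set weight.
bgp_table = {
    "172.16.1.0": [
        {"from": "RouterB", "weight": 50},
        {"from": "RouterC", "weight": 100},
    ],
}

def best_by_weight(prefix):
    """Pick the path with the highest weight for installation in the IP table."""
    return max(bgp_table[prefix], key=lambda path: path["weight"])

print(best_by_weight("172.16.1.0"))   # the Router C path, weight 100
```

Both paths stay in the BGP table; only the winner is installed in the IP routing table.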
Origin Attribute
The origin attribute indicates how BGP learned about a particular route. The origin attribute can have one of three possible values:
- IGP: The route is interior to the originating AS. This value is set when the network router configuration command is used to inject the route into BGP.
- EGP: The route is learned via the Exterior Border Gateway Protocol (EBGP).
- Incomplete: The origin of the route is unknown or learned in some other way. An origin of incomplete occurs when a route is redistributed into BGP.
The origin attribute is used for route selection and will be covered in the next section.
AS_path Attribute
When a route advertisement passes through an autonomous system, the AS number is added to an ordered list of AS numbers that the route advertisement has traversed. Figure 2.11 shows the situation in which a route is passing through three autonomous systems. AS 1 originates the route to 172.16.1.0 and advertises this route to AS 2 and AS 3, with the AS_path attribute equal to {1}. AS 3 will advertise back to AS 1 with AS_path attribute {3,1}, and AS 2 will advertise back to AS 1 with AS_path attribute {2,1}. AS 1 will reject these routes when its own AS number is detected in the route advertisement. This is the mechanism that BGP uses to detect routing loops. AS 2 and AS 3 propagate the route to each other with their AS numbers added to the AS_path attribute. These routes will not be installed in the IP routing table because AS 2 and AS 3 are learning a route to 172.16.1.0 from AS 1 with a shorter AS_path list.
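The prepend-and-reject mechanism described above can be sketched in a few lines (AS numbers from the example; the functions are illustrative, not a real BGP implementation):

```python
def advertise(my_as, as_path):
    """Prepend our AS number before passing the route to a neighbor."""
    return [my_as] + as_path

def receive_advertisement(my_as, as_path):
    """Reject the route if our own AS number already appears in its AS_path."""
    if my_as in as_path:
        return None                  # routing loop detected: discard the route
    return as_path

# AS 1 originates 172.16.1.0 with AS_path {1}; AS 3 advertises it back as {3, 1}
path_from_as3 = advertise(3, [1])
print(path_from_as3)                           # [3, 1]
print(receive_advertisement(1, path_from_as3)) # None: AS 1 sees itself and rejects
print(receive_advertisement(2, path_from_as3)) # [3, 1]: AS 2 accepts the route
```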
Next-Hop Attribute
The EBGP next-hop attribute is the IP address that is used to reach the advertising router. For EBGP peers, the next-hop address is the IP address of the connection between the peers. For IBGP, the EBGP next-hop address is carried into the local AS, as illustrated in Figure 2.12.
Router C advertises network 172.16.1.0 with a next hop of 10.1.1.1. When Router A propagates this route within its own AS, the EBGP next-hop information is preserved. If Router B does not have routing information regarding the next hop, the route will be discarded. Therefore, it is important to have an IGP running in the AS to propagate next-hop routing information.
Community Attribute
The community attribute provides a way of grouping destinations, called communities, to which routing decisions (such as acceptance, preference, and redistribution) can be applied. Route maps are used to set the community attribute. Predefined community attributes are listed here:
- no-export: Do not advertise this route to EBGP peers.
- no-advertise: Do not advertise this route to any peer.
- internet: Advertise this route to the Internet community; all routers in the network belong to it.
Figure 2.13 illustrates the no-export community. AS 1 advertises 172.16.1.0 to AS 2 with the community attribute no-export. AS 2 will propagate the route throughout AS 2 but will not send this route to AS 3 or any other external AS.
In Figure 2.14, AS 1 advertises 172.16.1.0 to AS 2 with the community attribute no-advertise. Router B in AS 2 will not advertise this route to any other router.
Figure 2.15 demonstrates the internet community attribute. There are no limitations to the scope of the route advertisement from AS 1.
1. If the path specifies a next hop that is inaccessible, drop the update.
2. Prefer the path with the largest weight.
3. If the weights are the same, prefer the path with the largest local preference.
4. If the local preferences are the same, prefer the path that was originated by BGP running on this router.
5. If no route was originated, prefer the route that has the shortest AS_path.
6. If all paths have the same AS_path length, prefer the path with the lowest origin type (where IGP is lower than EGP, and EGP is lower than incomplete).
7. If the origin codes are the same, prefer the path with the lowest MED attribute.
8. If the paths have the same MED, prefer the external path over the internal path.
9. If the paths are still the same, prefer the path through the closest IGP neighbor.
10. Prefer the path with the lowest IP address, as specified by the BGP router ID.
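The decision process is, in effect, an ordered sequence of tie-breakers. A simplified sketch (attributes as plain dictionary fields, and only a subset of the steps; real BGP implementations carry far more state):

```python
ORIGIN_RANK = {"igp": 0, "egp": 1, "incomplete": 2}   # lower is preferred

def best_path(paths):
    """Apply the main BGP tie-breakers, in order, to a list of candidate paths."""
    candidates = [p for p in paths if p["next_hop_reachable"]]  # step 1
    key = lambda p: (
        -p["weight"],                  # largest weight wins
        -p["local_pref"],              # then largest local preference
        len(p["as_path"]),             # then shortest AS_path
        ORIGIN_RANK[p["origin"]],      # then lowest origin type
        p["med"],                      # then lowest MED
        p["router_id"],                # final tie-breaker: lowest router ID
    )
    return min(candidates, key=key)

# The two routes from question 10 below: identical except for AS_path length
paths = [
    {"next_hop_reachable": True, "weight": 0, "local_pref": 100,
     "as_path": [2345, 86, 51], "origin": "igp", "med": 0, "router_id": "10.0.0.1"},
    {"next_hop_reachable": True, "weight": 0, "local_pref": 100,
     "as_path": [2346, 51], "origin": "igp", "med": 0, "router_id": "10.0.0.2"},
]
print(best_path(paths)["as_path"])    # [2346, 51]: the shorter AS_path wins
```

Encoding the steps as a sort key makes the ordering explicit: an earlier attribute only falls through to the next when the values tie exactly.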
2.9 SUMMARY
In this chapter we presented an overview of the routing process in an internetwork. Direct and indirect routing were discussed, along with some of the techniques available for making the routing table more efficient. We presented three important routing protocols: RIP, OSPF, and BGP. Sufficient detail is provided on these protocols because they are the default standards currently used in the Internet.
2.10 QUESTIONS
1. What is routing? Discuss direct and indirect routing.
2. Discuss different approaches used to make the routing table more efficient.
3. Name RIP's various stability features.
4. What is the purpose of the timeout timer?
5. What two capabilities are supported by RIP 2 but not RIP?
6. What is the maximum network diameter of a RIP network?
7. When using OSPF, can you have two areas attached to each other where only one AS has an interface in Area 0?
8. Area 0 contains five routers (A, B, C, D, and E), and Area 1 contains three routers (R, S, and T). Router S is the ABR. What routers does Router T know exist?
9. Can IBGP be used in place of an IGP (RIP, IGRP, EIGRP, OSPF, or IS-IS)?
10. Assume that a BGP router is learning the same route from two different EBGP peers. The AS_path information from peer 1 is {2345,86,51}, and the AS_path information from peer 2 is {2346,51}. What BGP attributes could be adjusted to force the router to prefer the route advertised by peer 1?
11. Can BGP be used only by Internet service providers?
12. If a directly connected interface is redistributed into BGP, what value will the origin attribute have for this route?
Chapter 3
Multimedia Networking
- Understanding the limitations of the best-effort service rendered by the network layer
- Possible solutions when we want killer applications such as video conferencing, video on demand, and Internet telephony to run on the existing Internet
- Some of the important protocols available to handle such applications
3.1 INTRODUCTION
We have been experiencing the impact of digital multimedia technology. This means we are getting familiar with many applications which are becoming a part of our life, for example:

- Streaming video
- IP telephony
- Internet radio
- Teleconferencing
- Interactive games
- Virtual networks
Clearly these are new killer applications that have grown beyond the basic applications such as e-mail, the Web, FTP, and Telnet.
It is to be noted that the Internet is the largest dynamic network, and it works on the simple concept of best-effort service. This means that the network makes its best effort to deliver packets but does not guarantee their final delivery to their respective destinations. While conventional e-mail, Web commerce, and other off-line applications have no problem with this, real-time applications with huge volumes of data suffer many limitations. In other words, multimedia applications are sensitive to end-to-end delay and delay variation but can tolerate occasional loss of data.

In this chapter we will examine how multimedia applications can be designed to make the best of the best-effort Internet, which provides no end-to-end delay guarantees. We will also examine a number of activities that are currently under way to extend the Internet architecture to provide explicit support for the service requirements of multimedia applications.

We know that timing considerations and tolerance of data loss are particularly important for networked multimedia applications. Timing considerations are important because many multimedia applications are highly delay-sensitive. We will see shortly that in many multimedia applications, packets that incur a sender-to-receiver delay of more than a few hundred milliseconds are essentially useless. On the other hand, networked multimedia applications are for the most part loss-tolerant: occasional loss only causes occasional glitches in the audio/video playback, and these losses can often be partially or fully concealed. These delay-sensitive but loss-tolerant characteristics are clearly different from those of elastic applications such as the Web, e-mail, FTP, and Telnet. For elastic applications, long delays are annoying but not particularly harmful, and the completeness and integrity of the transferred data is of paramount importance.
- Stored media. The multimedia content has been prerecorded and is stored at the server. As a result, a user may pause, rewind, fast-forward, or index through the multimedia content. The time from when a client makes a request until playout begins should be on the order of one to ten seconds for acceptable responsiveness.
- Streaming. In a streaming stored audio/video application, a client begins playout of the audio/video within a few seconds after it begins receiving the file from the server. This means that the client will be playing out audio/video from one location in the file while it is receiving later parts of the file from the server. This technique, known as streaming, avoids having to download the entire file (and incurring a potentially long delay) before beginning playout. There are many streaming multimedia products, such as RealPlayer, QuickTime, and Media Player.
- Continuous playout. Once playout of the multimedia content begins, it should proceed according to the original timing of the recording. This places critical delay constraints on data delivery. Data must be received from the server in time for its playout at the client. Although stored media applications have continuous playout requirements, their end-to-end delay constraints are nevertheless less stringent than those for live, interactive applications such as Internet telephony and video conferencing.
However, live audio/video distribution is more often accomplished through multiple separate unicast streams. As with streaming stored multimedia, continuous playout is required, although the timing constraints are less stringent than for real-time interactive applications. Delays of up to tens of seconds from when the user requests the delivery/playout of a live transmission to when playout begins can be tolerated.
Quality can suffer, however, when intervening links are congested (such as congested transoceanic links). Internet phone and real-time interactive video have, to date, been less successful than streaming stored audio/video. Indeed, real-time interactive voice and video impose rigid constraints on packet delay and packet jitter. Packet jitter is the variability of packet delay within the same packet stream. Real-time voice and video can work well in regions where bandwidth is plentiful, and hence delay and jitter are minimal. But quality can deteriorate to unacceptable levels as soon as the real-time voice or video packet stream hits a moderately congested link.

The design of multimedia applications would certainly be more straightforward if there were some sort of first-class and second-class Internet services, whereby first-class packets were limited in number and received priority service in router queues. Such a first-class service could be satisfactory for delay-sensitive applications. But to date, the Internet has mostly taken an egalitarian approach to packet scheduling in router queues. All packets receive equal service; no packets, including delay-sensitive audio and video packets, receive special priority in the router queues.

So for the time being we have to live with best-effort service. But given this constraint, we can make several design decisions and employ a few tricks to improve the user-perceived quality of a multimedia networking application. For example, we can send the audio and video over UDP, and thereby circumvent TCP's low throughput when TCP enters its slow-start phase. We can delay playback at the receiver by 100 msecs or more in order to diminish the effects of network-induced jitter. We can timestamp packets at the sender so that the receiver knows when the packets should be played back. For stored audio/video we can pre-fetch data during playback when client storage and extra bandwidth are available.
We can even send redundant information in order to mitigate the effects of network-induced packet loss.
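The timestamp-plus-playout-delay trick can be sketched as follows. The 100 ms delay value comes from the text above; the packet timings are illustrative:

```python
PLAYOUT_DELAY = 0.100   # seconds of deliberate buffering to absorb jitter

def playout_time(sender_timestamp):
    """Schedule a packet for playback a fixed delay after it was generated."""
    return sender_timestamp + PLAYOUT_DELAY

# Audio packets generated every 20 ms at the sender; arrival times at the
# receiver jitter around generation time plus network delay.
packets = [
    {"ts": 0.000, "arrival": 0.050},
    {"ts": 0.020, "arrival": 0.092},   # late, but still before its playout time
    {"ts": 0.040, "arrival": 0.145},   # misses its 0.140 deadline: discarded
]
for p in packets:
    p["played"] = p["arrival"] <= playout_time(p["ts"])
print([p["played"] for p in packets])   # [True, True, False]
```

A larger playout delay tolerates more jitter at the cost of a longer pause before playback begins, which is why interactive applications keep it small.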
With these new scheduling policies, not all packets get equal treatment; instead, those that reserve (and pay) more get more.
3. In order to honor reservations, the applications must give the network a description of the traffic that they intend to send into the network. The network must then police each application's traffic to make sure that it abides by the description. Finally, the network must have a means of determining whether it has sufficient available bandwidth to support any new reservation request.

These mechanisms, when combined, require new and complex software in the hosts and routers as well as new types of services. At the other extreme, some researchers argue that it isn't necessary to make any fundamental changes to best-effort service and the underlying Internet protocols. Instead they advocate a laissez-faire approach:
- As demand increases, the ISPs (both top-tier and lower-tier ISPs) will scale their networks to meet the demand. Specifically, ISPs will add more bandwidth and switching capacity to provide satisfactory delay and packet-loss performance within their networks. The ISPs will thereby provide better service to their customers (users and customer ISPs), translating to higher revenues through more customers and higher service fees.
- ISPs can also install caches in their networks, which bring stored content (Web pages as well as stored audio and video) closer to the users, thereby reducing the traffic in the higher-tier ISPs. Content distribution networks (CDNs) replicate stored content at the edges of the Internet. Given that a large fraction of the traffic flowing through the Internet is stored content (Web pages, MP3s, video), CDNs can significantly alleviate the traffic loads on the ISPs and the peering interfaces between ISPs. Furthermore, CDNs provide a differentiated service to content providers: content providers that pay for a CDN service can deliver content faster and more effectively.
- To deal with live streaming traffic (such as a sporting event) that is being sent to millions of users simultaneously, multicast overlay networks can be deployed. A multicast overlay network consists of servers scattered throughout the ISP network (and potentially throughout the entire Internet). These servers and the logical links between them collectively form an overlay network, which multicasts traffic from the source to the millions of users. Unlike network-layer (IP) multicast, overlay networks multicast at the application layer. For example, the source host might send the stream to three overlay servers; each of the overlay servers may forward the stream to three more overlay servers; the process continues, creating a distribution tree on top of the underlying IP network of routers and hosts. By multicasting popular live traffic through overlay networks, overall traffic loads in the Internet can be further reduced.
Between the reservation camp and the laissez-faire camp there is yet a third camp: the so-called differentiated services camp. This camp wants to make relatively small changes at the network and transport layers, and introduce simple pricing and policing schemes at the edge of the network (that is, at the interface between the user and the user's ISP). The idea is to introduce a small number of traffic
classes (possibly just two classes), assign each datagram to one of the classes, give datagrams different levels of service according to their class in the router queues, and charge users according to the class of packets that they are sending into the network.
Audio Compression
A continuously varying analog audio signal (which could emanate from speech or music) is normally converted to a digital signal as follows:
- The analog audio signal is first sampled at some fixed rate, for example, at 8,000 samples per second. The value of each sample is an arbitrary real number.
- Each of the samples is then rounded to one of a finite number of values. This operation is referred to as quantization. The finite values, called quantization values, typically number a power of two, for example, 256 quantization values.
- Each of the quantization values is represented by a fixed number of bits. For example, if there are 256 quantization values, then each value, and hence each sample, is represented by one byte.
- Each of the samples is converted to its bit representation. The bit representations of all the samples are concatenated together to form the digital representation of the signal.
As an example, if an analog audio signal is sampled at 8,000 samples per second and each sample is quantized and represented by 8 bits, then the resulting digital signal will have a rate of 64,000 bits per second.
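The arithmetic of PCM encoding is easy to verify. A sketch of the sample-and-quantize steps (the sine-wave input is purely illustrative):

```python
import math

SAMPLE_RATE = 8000        # samples per second (telephone-quality speech)
BITS_PER_SAMPLE = 8       # 256 quantization values

def quantize(sample, bits=BITS_PER_SAMPLE):
    """Map a sample in [-1.0, 1.0] to one of 2**bits integer levels."""
    levels = 2 ** bits
    level = int((sample + 1.0) / 2.0 * (levels - 1))
    return max(0, min(levels - 1, level))

# One second of a 440 Hz tone, sampled and quantized to one byte per sample
signal = [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]
encoded = [quantize(s) for s in signal]

print(SAMPLE_RATE * BITS_PER_SAMPLE)   # 64000 bits per second, as in the text
print(44100 * 16)                      # 705600: mono CD audio (705.6 kbps)
print(44100 * 16 * 2)                  # 1411200: stereo CD audio (1.411 Mbps)
```

The last two lines reproduce the CD-audio figures discussed in the next paragraph.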
This digital signal can then be converted back (that is, decoded) to an analog signal for playback. However, the decoded analog signal is typically different from the original audio signal. By increasing the sampling rate and the number of quantization values, the decoded signal can better approximate the original analog signal. Thus, there is a clear trade-off between the quality of the decoded signal and the storage and bandwidth requirements of the digital signal.

The basic encoding technique that we just described is called pulse code modulation (PCM). Speech encoding often uses PCM, with a sampling rate of 8,000 samples per second and eight bits per sample, giving a rate of 64 kbps. The audio compact disc (CD) also uses PCM, with a sampling rate of 44,100 samples per second with 16 bits per sample; this gives a rate of 705.6 kbps for mono and 1.411 Mbps for stereo.

A bit rate of 1.411 Mbps for stereo music exceeds most access rates, and even 64 kbps speech exceeds the access rate for a dial-up modem user. For these reasons, PCM-encoded speech and music are rarely used in the Internet. Instead, compression techniques are used to reduce the bit rates of the stream. Popular compression techniques for speech include GSM (13 kbps), G.729 (8 kbps), and G.723.3 (both 6.4 and 5.3 kbps), and also a large number of proprietary techniques, including those used by RealNetworks.
MP3
A popular compression technique for near-CD-quality stereo music is MPEG 1 layer 3, more commonly known as MP3. MP3 encoders typically compress to rates of 96 kbps, 128 kbps, and 160 kbps, and produce very little sound degradation. When an MP3 file is broken up into pieces, each piece is still playable. This headerless file format allows MP3 music files to be streamed across the Internet (assuming the playback bit rate and speed of the Internet connection are compatible). The MP3 compression standard is complex, using psychoacoustic masking, redundancy reduction, and bit-reservoir buffering.
Video Compression
A video is a sequence of frames, with frames typically being displayed at a constant rate, for example at 24 or 30 frames per second. An uncompressed, digitally encoded image consists of an array of pixels, with each pixel encoded into a number of bits to represent luminance and color. Video has two types of redundancy:
- Spatial redundancy is the redundancy within a given image. For example, an image that consists of mostly white space can be efficiently compressed.
- Temporal redundancy reflects repetition from image to subsequent image. If, for example, an image and the subsequent image are exactly the same, there is no reason to re-encode the subsequent image; it is more efficient simply to indicate during encoding that the subsequent image is exactly the same.
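Temporal redundancy can be illustrated with naive frame differencing: when a frame repeats its predecessor exactly, the encoder emits a cheap marker instead of the frame. This is a pure illustration of the idea, nothing like real MPEG encoding:

```python
def encode_frames(frames):
    """Replace each frame identical to its predecessor with a 'SAME' marker."""
    out, prev = [], None
    for frame in frames:
        out.append("SAME" if frame == prev else frame)
        prev = frame
    return out

# Three frames; the middle one is an exact repeat and costs almost nothing
frames = [b"frame-A", b"frame-A", b"frame-B"]
print(encode_frames(frames))   # [b'frame-A', 'SAME', b'frame-B']
```

Real codecs generalize this: instead of requiring an exact repeat, they encode the (usually small) difference between a frame and a prediction from its neighbors.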
The MPEG compression standards are among the most popular compression techniques. These include MPEG 1 for CD-ROM-quality video (1.5 Mbps), MPEG 2 for high-quality DVD video (3-6 Mbps), and MPEG 4 for object-oriented video compression. The MPEG standard draws heavily from the JPEG standard for image compression by exploiting temporal redundancy across images in addition to the spatial redundancy exploited by JPEG. The H.261 video compression standards are also very popular in the Internet. In addition there are numerous proprietary schemes, including Apple's QuickTime and RealNetworks' encoders.
- Decompression. Audio/video is almost always compressed to save disk storage and network bandwidth. A media player must decompress the audio/video on the fly during playout.
- Jitter removal. Packet jitter is the variability of source-to-destination delays of packets within the same packet stream. Since audio and video must be played out with the same timing with which they were recorded, a receiver will buffer received packets for a short period of time to remove this jitter.
- Error correction. Due to unpredictable congestion in the Internet, a fraction of the packets in the packet stream can be lost. If this fraction becomes too large, user-perceived audio/video quality becomes unacceptable. To this end, many streaming systems attempt to recover from losses by either (1) reconstructing lost packets through the transmission of redundant packets, (2) having the client explicitly request retransmission of lost packets, or (3) masking loss by interpolating the missing data from the received data.
The media player has a graphical user interface with control knobs. This is the actual interface that the user interacts with. It typically includes volume controls, pause/resume buttons, sliders for making temporal jumps in the audio/video stream, and so on. Plug-ins may be used to embed the user interface of the media player within the window of the Web browser. For such embeddings, the browser reserves screen space on the current Web page, and it is up to the media player to manage the screen space. But whether appearing in a separate window or within the browser window (as a plug-in), the media player is a program that is being executed separately from the browser.
The case of video can be a little more tricky, because the audio and video parts of the video may be stored in two different files; that is, they may be two different objects in the Web server's file system. In this case, two separate HTTP requests are sent to the server (over two separate TCP connections for HTTP/1.0), and the audio and video files arrive at the client in parallel. It is up to the client to manage the synchronization of the two streams. It is also possible that the audio and video are interleaved in the same file, so that only one object need be sent to the client. To keep our discussion simple, for the case of video we assume that the audio and video are contained in one file. An architecture for audio/video streaming is shown in Figure 3.1. In this architecture:
Figure 3.1 The client (Web browser and media player) and the Web server
- The browser process establishes a TCP connection with the Web server and requests the audio/video file with an HTTP request message.
- The Web server sends the audio/video file to the browser in an HTTP response message. The content-type header line in the HTTP response message indicates a specific audio/video encoding.
- The client browser examines the content type of the response message, launches the associated media player, and passes the file to the media player. The media player then renders the audio/video file.
Although this approach is very simple, it has a major drawback: the media player (that is, the
helper application) must interact with the server through a Web browser as an intermediary: the entire object must be downloaded before the browser passes the object to a helper application. The resulting delay before playout can begin is typically unacceptable for audio/video clips of moderate length. For this reason, audio/video streaming implementations typically have the server send the audio/video file directly to the media player process. In other words, a direct socket connection is made between the server process and the media player process. As shown in Figure 3.2, this is typically done by making use of a meta file, a file that provides information (for example, URL or type of encoding) about the audio/video file that is to be streamed. A direct TCP connection between the server and the media player is obtained as follows:

1. The user clicks on a hyperlink for an audio/video file.
2. The hyperlink does not point directly to the audio/video file, but instead to a meta file. The meta file contains the URL of the actual audio/video file. The HTTP response message that encapsulates the meta file includes a content-type header line that indicates the specific audio/video application.
3. The client browser examines the content-type header line of the response message, launches the associated media player, and passes the entire body of the response message (that is, the meta file) to the media player.
4. The media player sets up a TCP connection directly with the HTTP server. The media player sends an HTTP request message for the audio/video file into the TCP connection.
5. The audio/video file is sent within an HTTP response message to the media player. The media player streams out the audio/video file.

The importance of the intermediate step of acquiring the meta file is clear: when the browser sees the content type of the file, it can launch the appropriate media player, and thereby have the media player contact the server directly.
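The hand-off from browser to media player can be sketched with a tiny parser. The meta-file format here (a single line holding the media URL) and the hostname are entirely hypothetical; real products such as RealPlayer define their own meta-file formats:

```python
def parse_meta_file(body):
    """Extract the audio/video URL the browser hands to the media player."""
    return body.strip()

def media_player_request(meta_body):
    """Build the request the player would send directly to the media server."""
    url = parse_meta_file(meta_body)
    return f"GET {url} HTTP/1.0"

# Hypothetical meta-file body, delivered to the browser with an audio/video
# content type and passed on to the launched media player (steps 2 and 3).
meta = "http://media.example.com/song.ra\n"
print(media_player_request(meta))  # GET http://media.example.com/song.ra HTTP/1.0
```

The point of the sketch is the indirection: the browser only ever sees the small meta file, while the bulky media flows over a separate connection that the player opens itself.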
We have just learned how a meta file can allow a media player to communicate directly with a Web server that stores an audio/video file. Yet many companies that sell products for audio/video streaming do not recommend the architecture we just described. This is because the architecture has the media player communicate with the server over HTTP and hence also over TCP. HTTP is often considered insufficiently rich to allow for satisfactory user interaction with the server; in particular, HTTP does not easily allow a user (through the media player) to send pause/resume, fast-forward, and temporal jump commands to the server.
Figure 3.2 Web server sends audio/video directly to the media player
In the architecture of Figure 3.3, there are many options for delivering the audio/video from the streaming server to the media player. A partial list of the options is given below.

1. The audio/video is sent over UDP at a constant rate equal to the drain rate at the receiver (which is the encoded rate of the audio/video). For example, if the audio is compressed using GSM at a rate of 13 kbps, then the server clocks out the compressed audio file at 13 kbps. As soon as the client receives compressed audio/video from the network, it decompresses the audio/video and plays it back.
Figure 3.3 The audio/video file is requested by the client and sent by the streaming server
2. This is the same as Option 1, but the media player delays playout for two to five seconds in order to eliminate network-induced jitter. The client accomplishes this task by placing the compressed media that it receives from the network into a client buffer, as shown in Figure 3.4. Once the client has pre-fetched a few seconds of the media, it begins to drain the buffer. For this, and the previous option, the fill rate x(t) is equal to the drain rate d, except when there is packet loss, in which case x(t) is momentarily less than d.
3. The media is sent over TCP. The server pushes the media file into the TCP socket as quickly as it can; the client (that is, the media player) reads from the TCP socket as quickly as it can and places the compressed video into the media player buffer. After an initial two-to-five-second delay, the media player reads from its buffer at a rate d and forwards the compressed media to decompression and playback. Because TCP retransmits lost packets, it has the potential to provide better sound quality than UDP. On the other hand, the fill rate x(t) now fluctuates with packet loss, TCP congestion control, and window flow control. In fact, after packet loss, TCP congestion control may reduce the instantaneous rate to less than d for long periods of time. This can empty the client buffer and introduce undesirable pauses into the output of the audio/video stream at the client. For the third option, the behavior of x(t) will very much depend on the size of the client buffer (which is not to be confused with the TCP receive buffer). If this buffer is large enough to hold all of the media file (possibly with disk storage), then TCP will make use of all the instantaneous bandwidth available to the connection, so that x(t) can become much larger than d. If x(t) becomes much larger than d for long periods of time, then a large portion of the media is pre-fetched into the client, and subsequent client starvation is unlikely. If, on the other hand, the client buffer is small, then x(t) will fluctuate around the drain rate d, and the risk of client starvation is much larger.
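The interplay between the fill rate x(t), the drain rate d, and the amount of pre-fetched media can be seen in a toy simulation; the 13 kbps rate echoes the GSM example above, while the loss pattern and prefetch sizes are made up for illustration.

```python
# Toy simulation of the client buffer: fill rate x(t) versus drain rate d.
# If congestion keeps x(t) below d for long enough, the buffer empties and
# playback stalls; a large prefetch makes starvation unlikely.

def simulate(fill_rates, d, prefetch):
    """Count the time steps at which the client buffer underflows."""
    buffered = prefetch            # media pre-fetched before playout starts
    stalls = 0
    for x in fill_rates:
        buffered += x              # x(t): data arriving from the network
        if buffered >= d:
            buffered -= d          # steady drain at the encoded rate d
        else:
            stalls += 1            # starvation: nothing left to play out
            buffered = 0
    return stalls

d = 13                                       # kbps, as in the GSM example
steady = [13] * 20                           # x(t) == d throughout
congested = [13] * 5 + [4] * 10 + [13] * 5   # TCP backs off after loss
```

With the steady trace, `simulate(steady, d, prefetch=26)` reports no stalls; the congested trace stalls with a small prefetch but not when a large portion of the media has been pre-fetched.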
Figure 3.4 Client buffer being filled at rate x(t) and drained at rate d (pre-fetched video data flows on to decompression and playout)
future or past point of time, fast-forwarding playback visually, rewinding playback visually, and so on. This functionality is similar to what a user has with a DVD player when watching a DVD video or with a CD player when listening to a music CD. To allow a user to control playback, the media player and server need a protocol for exchanging playback control information. Real-time streaming protocol (RTSP), defined in RFC 2326, is such a protocol. Before getting into the details of RTSP, let us first indicate what RTSP does not do.
l RTSP does not define compression schemes for audio and video.
l RTSP does not define how audio and video are encapsulated in packets for transmission over a network; encapsulation for streaming media can be provided by RTP or by a proprietary protocol. (RTP is discussed in Section 6.4.) For example, RealNetworks audio/video servers and players use RTSP to send control information to each other, but the media stream itself can be encapsulated in RTP packets or in some proprietary data format.
l RTSP does not restrict how streamed media is transported; it can be transported over UDP or TCP.
l RTSP does not restrict how the media player buffers the audio/video. The audio/video can be played out as soon as it begins to arrive at the client, it can be played out after a delay of a few seconds, or it can be downloaded in its entirety before playout.
So if RTSP doesn't do any of the above, what does it do? RTSP is a protocol that allows a media player to control the transmission of a media stream. As mentioned above, control actions include pause/resume, repositioning of playback, fast-forward, and rewind. RTSP is an out-of-band protocol. In particular, the RTSP messages are sent out-of-band, whereas the media stream, whose packet structure is not defined by RTSP, is considered in-band. RTSP messages use a different port number, 554, from the media stream. The RTSP specification (RFC 2326) permits RTSP messages to be sent over either TCP or UDP. Recall that the file transfer protocol (FTP) also uses the out-of-band notion. In particular, FTP uses two client/server pairs of sockets, each pair with its own port number: one client/server socket pair supports a TCP connection that transports control information; the other client/server socket pair supports a TCP connection that actually transports the file. The RTSP channel is in many ways similar to FTP's control channel.
[Figure: RTSP interaction between client and server. The client issues an HTTP GET for the meta file, then SETUP and PLAY requests; the media stream flows until PAUSE and TEARDOWN.]
C: SETUP rtsp://audio.example.com/twister/audio RTSP/1.0
   CSeq: 1
   Transport: rtp/udp; compression; port=3056; mode=PLAY

S: RTSP/1.0 200 OK
   CSeq: 1
   Session: 4231
C: PLAY rtsp://audio.example.com/twister/audio.en/lofi RTSP/1.0
   Range: npt=0-
   CSeq: 2
   Session: 4231

S: RTSP/1.0 200 OK
   CSeq: 2
   Session: 4231

C: PAUSE rtsp://audio.example.com/twister/audio.en/lofi RTSP/1.0
   Range: npt=37
   CSeq: 3
   Session: 4231

S: RTSP/1.0 200 OK
   CSeq: 3
   Session: 4231

C: TEARDOWN rtsp://audio.example.com/twister/audio.en/lofi RTSP/1.0
   CSeq: 4
   Session: 4231

S: RTSP/1.0 200 OK
   CSeq: 4
   Session: 4231
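The shape of these requests can be generated mechanically; the sketch below is an illustrative fragment, not a complete RFC 2326 client, and the URL and port values simply mirror the example above.

```python
# Sketch of RTSP request formatting: the client bumps CSeq on every
# request and echoes the server-assigned Session id once it has one.

class RtspClient:
    def __init__(self, url):
        self.url = url
        self.cseq = 0          # incremented for each new request
        self.session = None    # assigned by the server's SETUP reply

    def _request(self, method, extra_headers):
        self.cseq += 1
        lines = [f"{method} {self.url} RTSP/1.0", f"CSeq: {self.cseq}"]
        lines += extra_headers
        if self.session is not None:
            lines.append(f"Session: {self.session}")
        return "\r\n".join(lines) + "\r\n\r\n"

    def setup(self, client_port):
        # Transport names the client port the media should be sent to.
        return self._request(
            "SETUP", [f"Transport: rtp/udp; port={client_port}; mode=PLAY"])

    def play(self, npt="0-"):
        # Range asks for playback from normal play time (npt) onwards.
        return self._request("PLAY", [f"Range: npt={npt}"])

client = RtspClient("rtsp://audio.example.com/twister/audio")
setup_msg = client.setup(3056)    # carries CSeq: 1, no Session yet
client.session = 4231             # as returned in the server's 200 OK
play_msg = client.play()          # carries CSeq: 2 and Session: 4231
```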
It is interesting to note the similarities between HTTP and RTSP. All request and response messages are in ASCII text, the client employs standardized methods (SETUP, PLAY, PAUSE, and so on), and the server responds with standardized reply codes. One important difference, however, is that the RTSP
server keeps track of the state of the client for each ongoing RTSP session. For example, the server keeps track of whether the client is in an initialization state, a play state, or a pause state (see the programming assignment for this chapter). The session and sequence numbers, which are part of each RTSP request and response, help the server keep track of the session state. The session number is fixed throughout the entire session; the client increments the sequence number each time it sends a new message; the server echoes back the session number and the current sequence number. As shown in the example, the client initiates the session with the SETUP request, providing the URL of the file to be streamed and the RTSP version. The setup message includes the client port number to which the media should be sent, and it also indicates that the media should be sent over UDP using the packetization protocol RTP. Notice that in this example, the player chose not to play back the complete presentation, but instead only the low-fidelity portion of the presentation. The RTSP protocol is actually capable of doing much more than described in this brief introduction. In particular, RTSP has facilities that allow clients to stream toward the server (for example, for recording). RTSP has been adopted by RealNetworks, one of the industry leaders in audio/video streaming.
Packet Loss
Consider one of the UDP segments generated by our Internet phone application. The UDP segment is encapsulated in an IP datagram. As the datagram wanders through the network, it passes through buffers (that is, queues) in the routers in order to access outbound links. It is possible that one or more of the buffers in the route from sender to receiver is full and cannot admit the IP datagram. In this case, the IP datagram is discarded, never to arrive at the receiving application. Loss could be eliminated by sending the packets over TCP rather than over UDP. Recall that TCP retransmits packets that do not arrive at the destination. However, retransmission mechanisms are often considered unacceptable for interactive real-time audio applications such as Internet phone, because they increase end-to-end delay. Furthermore, due to TCP congestion control, after packet loss the transmission rate at the sender can be reduced to a rate that is lower than the drain rate at the receiver. This can have a severe impact on voice intelligibility at the receiver. For these reasons, almost all existing Internet phone applications run over UDP and do not bother to retransmit lost packets.
But losing packets is not necessarily as disastrous as one might think. Indeed, packet loss rates between 1 and 20 percent can be tolerated, depending on how the voice is encoded and transmitted, and on how the loss is concealed at the receiver. For example, forward error correction (FEC) can help conceal packet loss. We'll see below that with FEC, redundant information is transmitted along with the original information so that some of the lost original data can be recovered from the redundant information. Nevertheless, if one or more of the links between sender and receiver is severely congested, and packet loss exceeds 10-20 percent, then there is really nothing that can be done to achieve acceptable sound quality. Clearly, best-effort service has its limitations.
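One simple FEC scheme of the kind alluded to here transmits, after every group of n packets, a redundant packet formed by XORing the group; any single lost packet in the group can then be rebuilt from the survivors and the parity packet. A minimal sketch (the packet contents are arbitrary two-byte chunks):

```python
# XOR-parity FEC: one redundant packet per group conceals a single loss.

def xor_parity(packets):
    """Byte-wise XOR of all packets in the group (the redundant packet)."""
    parity = bytearray(len(packets[0]))
    for pkt in packets:
        for i, b in enumerate(pkt):
            parity[i] ^= b
    return bytes(parity)

def recover(received, parity):
    """Rebuild the single missing packet (marked None) in a group."""
    survivors = [p for p in received if p is not None]
    return xor_parity(survivors + [parity])

group = [b"\x01\x02", b"\x10\x20", b"\x0a\x0b"]
parity = xor_parity(group)           # sent along with the group
lost = [group[0], None, group[2]]    # the middle packet was dropped
rebuilt = recover(lost, parity)      # equals the dropped b"\x10\x20"
```

The price is extra bandwidth (one parity packet per group) and extra playout delay (the receiver must wait for the whole group), which is why such schemes trade group size against expected loss rate.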
End-to-End Delay
End-to-end delay is the accumulation of transmission, processing, and queuing delays in routers; propagation delays in the links; and end-system processing delays. For highly interactive audio applications, such as Internet phone, end-to-end delays smaller than 150 milliseconds are not perceived by a human listener; delays between 150 and 400 milliseconds can be acceptable but are not ideal; and delays exceeding 400 milliseconds can seriously hinder the interactivity in voice conversations. The receiving side of an Internet phone application will typically disregard any packets that are delayed more than a certain threshold, for example, more than 400 milliseconds. Thus, packets that are delayed by more than the threshold are effectively lost.
Packet Jitter
A crucial component of end-to-end delay is the random queuing delay in the routers. Because of these varying delays within the network, the time from when a packet is generated at the source until it is received at the receiver can fluctuate from packet to packet. This phenomenon is called jitter. As an example, consider two consecutive packets within a talk spurt in our Internet phone application. The sender sends the second packet 20 msec after sending the first packet. But at the receiver, the spacing between these packets can become greater than 20 msec. To see this, suppose the first packet arrives at a nearly empty queue at a router, but just before the second packet arrives at the queue a large number of packets from other sources arrive at the same queue. Because the first packet suffers a small queuing delay and the second packet suffers a large queuing delay at this router, the first and second packets become spaced apart by more than 20 msec. The spacing between consecutive packets can also become less than 20 msec. To see this, again consider two consecutive packets within a talk spurt. Suppose the first packet joins the end of a queue with a large number of packets, and the second packet arrives at the queue before packets from other sources arrive at the queue. In this case, our two packets find themselves one right after the other in the queue. If the time it takes to transmit a packet on the router's outbound link is less than 20 msec, then the first and second packets become spaced apart by less than 20 msec.
If the receiver ignores the presence of jitter and plays out chunks as soon as they arrive, then the resulting audio quality can easily become unintelligible at the receiver. Fortunately, jitter can often be removed by using sequence numbers, timestamps, and a playout delay, as discussed below.
l Prefacing each chunk with a sequence number. The sender increments the sequence number by one for each of the packets it generates.
l Prefacing each chunk with a timestamp. The sender stamps each chunk with the time at which the chunk was generated.
l Delaying playout of chunks at the receiver. The playout delay of the received audio chunks must be long enough so that most of the packets are received before their scheduled playout times. This playout delay can either be fixed throughout the duration of the audio session or vary adaptively during the audio session's lifetime. Packets that do not arrive before their scheduled playout times are considered lost and forgotten; as noted above, the receiver may use some form of speech interpolation to attempt to conceal the loss.
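Putting the timestamps and the playout delay together, the receiver can decide per chunk whether it arrived before its scheduled playout time. A minimal sketch with a fixed playout delay (the 100 msec delay and the arrival times are illustrative):

```python
# Fixed-delay playout: chunk i is scheduled at (generation time + delay);
# a chunk that arrives after its scheduled time is treated as lost.

PLAYOUT_DELAY = 100  # msec, fixed for the whole session in this sketch

def classify(chunks, delay=PLAYOUT_DELAY):
    """chunks: list of (seq, generated_ms, arrived_ms) triples.
    Returns (sequence numbers played, sequence numbers discarded)."""
    playable, late = [], []
    for seq, generated, arrived in chunks:
        if arrived <= generated + delay:
            playable.append(seq)   # held in the buffer until playout time
        else:
            late.append(seq)       # missed its deadline: counted as lost
    return playable, late

# Chunks generated every 20 msec; the second one is badly delayed.
chunks = [(1, 0, 40), (2, 20, 130), (3, 40, 90)]
played, lost = classify(chunks)
```

An adaptive receiver would instead re-estimate the delay from the observed jitter between talk spurts, shrinking it when the network is calm.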
Specifically, streaming of stored audio/video can tolerate significantly larger delays. Indeed, when a user requests an audio/video clip, the user may find it acceptable to wait five seconds or more before playback begins. And most users can tolerate similar delays after interactive actions such as a temporal jump within the media stream. This greater tolerance for delay gives the application developer greater flexibility when designing stored media applications.
RTP Basics
RTP typically runs on top of UDP. The sending side encapsulates a media chunk within an RTP packet, then encapsulates the packet in a UDP segment, and then hands the segment to IP. The receiving side extracts the RTP packet from the UDP segment, then extracts the media chunk from the RTP packet, and then passes the chunk to the media player for decoding and rendering. As an example, consider the use of RTP to transport voice. Suppose the voice source is PCM-encoded (that is, sampled, quantized, and digitized) at 64 kbps. Further suppose that the application collects the encoded data in 20 msec chunks, that is, 160 bytes in a chunk. The sending side precedes
each chunk of the audio data with an RTP header that includes the type of audio encoding, a sequence number, and a timestamp. The RTP header is normally 12 bytes. The audio chunk along with the RTP header form the RTP packet. The RTP packet is then sent into the UDP socket interface. At the receiver side, the application receives the RTP packet from its socket interface. The application extracts the audio chunk from the RTP packet and uses the header fields of the RTP packet to properly decode and play back the audio chunk. If an application incorporates RTP, instead of a proprietary scheme, to provide payload type, sequence numbers, or timestamps, then the application will more easily interoperate with other networked multimedia applications. For example, if two different companies develop Internet phone software and they both incorporate RTP into their product, there may be some hope that a user using one of the Internet phone products will be able to communicate with a user using the other Internet phone product. In Section 6.4.3 we'll see that RTP is often used in conjunction with the Internet telephony standards. It should be emphasized that RTP in itself does not provide any mechanism to ensure timely delivery of data or provide other quality-of-service guarantees; it does not even guarantee delivery of packets or prevent out-of-order delivery of packets. Indeed, RTP encapsulation is seen only at the end systems. Routers do not distinguish between IP datagrams that carry RTP packets and IP datagrams that don't. RTP allows each source (for example, a camera or a microphone) to be assigned its own independent RTP stream of packets. For example, for a video conference between two participants, four RTP streams could be opened: two streams for transmitting the audio (one in each direction) and two streams for transmitting the video (again, one in each direction).
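The 12-byte header described here (for the common case of version 2 with no padding, no extension, and no CSRC entries) can be packed with struct; the SSRC value below is arbitrary, chosen only for illustration.

```python
# Building an RTP packet: 12-byte header followed by the media chunk.
import struct

def make_rtp_packet(payload_type, seq, timestamp, ssrc, chunk):
    vpxcc = 2 << 6                  # version=2, padding=0, ext=0, CC=0
    m_pt = payload_type & 0x7F      # marker bit 0 plus 7-bit payload type
    header = struct.pack("!BBHII", vpxcc, m_pt, seq, timestamp, ssrc)
    return header + chunk

# One 20 msec chunk of 64 kbps PCM audio is 160 bytes; payload type 0
# identifies PCM mu-law audio.
chunk = bytes(160)
pkt = make_rtp_packet(payload_type=0, seq=1, timestamp=160,
                      ssrc=0x1234, chunk=chunk)   # 172 bytes in total
```

The receiver reverses the process with `struct.unpack` on the first 12 bytes and uses the payload type, sequence number, and timestamp exactly as described above.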
However, many popular encoding techniques, including MPEG 1 and MPEG 2, bundle the audio and video into a single stream during the encoding process. When the audio and video are bundled by the encoder, then only one RTP stream is generated in each direction. RTP packets are not limited to unicast applications. They can also be sent over one-to-many and many-to-many multicast trees. For a many-to-many multicast session, all of the session's senders and sources typically use the same multicast group for sending their RTP streams. RTP multicast streams belonging together, such as audio and video streams emanating from multiple senders in a video conference application, belong to an RTP session.
3.13 SUMMARY
In this chapter we have covered a wealth of information on multimedia data transport across the Internet. In particular, we looked into audio and video streaming, the limitations of the present Internet, and the removal of certain drawbacks that allow the existing Internet to carry multimedia information. Some of the protocols used for real-time streaming were also presented.
3.14 QUESTIONS
1. What is multimedia? Give examples of multimedia data.
2. What is audio?
3. What is video?
4. What is streaming?
5. List the drawbacks of the current Internet in carrying multimedia data.
6. How can the existing Internet be made to carry multimedia data?
7. Why do audio and video need to be compressed? Explain.
8. Explain the audio streaming process. What is a streaming server?
9. What are the limitations of best-effort service? Explain.
10. Discuss the features of the Real-time Transport Protocol.
11. Explain how the helper application gets the data from a streaming server.
Chapter 4
In this chapter we present yet another upcoming technology that is making an impact on the way we use modern computer-based devices. We focus on the technologies related to the wireless local area network (WLAN). The main objectives include:
l Overview of the different forms of signals and their characteristics.
l The necessity of wireless LANs.
l WLAN system architecture, protocols, and standards.
l MAC management issues and functions for WLANs.
4.1 INTRODUCTION
As the number of portable computing and communication devices grows, so does the demand to connect them to the outside world. Even the very first portable telephones had the ability to connect to other telephones. The first portable computers did not have this capability, but soon afterward, modems became commonplace. To go on-line, these computers had to be plugged into a telephone wall socket. Requiring a wired connection to the fixed network meant that the computers were portable, but not mobile. To achieve true mobility, portable computers need to use radio (or infrared) signals for communication. In this manner, dedicated users can read and send email while driving or boating. A system of portable computers that communicate by radio can be regarded as a wireless LAN. As the name suggests, a wireless LAN is one that makes use of a wireless transmission medium. Until relatively recently, wireless LANs were little used; the reasons for this included high prices, low data
rates, occupational safety concerns, and licensing requirements. As these problems have been addressed, the popularity of wireless LANs has grown rapidly. Wireless LANs have been developed over the last 30 years. ALOHANET, the first operating wireless network, was implemented in Hawaii in 1971. It was started as a research project of the University of Hawaii. It allowed seven campuses across four islands to communicate via satellite with a central computer. The protocol used for ALOHA went through multiple iterations before a good throughput was achieved. Ham radio operators developed terminal node controllers (TNCs) in the 1980s, which they used to connect their computers to the ham radio network. The TNCs modulated the computer signal and used packet switching to transmit the data. Ham radio associations began sponsoring forums for the development of wireless WANs in the early 1980s. In the mid-1980s, the FCC authorized public use of the Industrial, Scientific, and Medical (ISM) frequency bands. The ISM band is designated for short-range, low-power devices; therefore, licensing is not required to manufacture or use equipment operating in this range. This move by the FCC encouraged the development of wireless LAN components. Early development, as with most new technology, resulted in a lot of proprietary wireless equipment. This equipment was also expensive, which prevented widespread use. In the late 1980s, commercial industry standards development began for wireless LANs. The Institute of Electrical and Electronics Engineers (IEEE) 802 Working Group created the 802.11 Working Group to develop wireless LAN standards. They defined the physical and media access control specifications. As time has progressed, the initial standards were finalized and extended to cover multiple frequencies and access speeds. Equipment prices are now falling and performance is increasing. Wireless LANs have become a viable solution both in homes and in industry.
The mixing, or modulating, of intelligence with the carrier frequency comes in various forms. Common methods are AM, CCK, PBCC, FM, BPSK, and QPSK.
Carriers
If you tune the radio in your home to 103.9 FM, you will receive the same station all the time. In the US, this is because the FCC regulates this range of frequencies. However, the frequency bands used for wireless LANs, both the 2.4 and 5 GHz ranges, are unlicensed; there is no ownership of any one frequency. Interference could become a problem if fixed carrier frequencies were used. To overcome this problem, carrier frequencies are constantly changed via several approaches. The major approach used in wireless is called spread spectrum: the height of the carrier is reduced (suppressed carrier), and the carrier frequency is constantly changed within a predefined range and with a pattern known by both the receiver and the transmitter.
Spread Spectrum Methods
Frequency Hopping Spread Spectrum (FHSS) uses a pseudo-random carrier hop method. In theory, FHSS is more secure because of the difficulty involved in predicting and capturing carriers generated in pseudo-random patterns.
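The pseudo-random hop idea can be illustrated with a seeded generator: transmitter and receiver derive the same channel sequence from a shared seed, while anyone without the seed sees an unrelated pattern. The 79 channels echo the FHSS channelization mentioned later; everything else here is a toy.

```python
# Toy FHSS hop pattern: a shared seed yields a shared channel sequence.
import random

def hop_sequence(seed, n_channels=79, n_hops=10):
    rng = random.Random(seed)      # same seed -> same hop pattern
    return [rng.randrange(n_channels) for _ in range(n_hops)]

tx_hops = hop_sequence(seed=42)    # transmitter's pattern
rx_hops = hop_sequence(seed=42)    # receiver follows along exactly
outsider = hop_sequence(seed=7)    # wrong seed: a different pattern
```

Real FHSS systems use standardized hop sets rather than a general-purpose PRNG, but the principle, both ends stepping through the same pre-agreed pseudo-random pattern, is the same.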
In Figure 4.3, we see the output of a spread spectrum system. Notice how the carrier moves back and forth. There are several approaches to spread spectrum; these approaches comprise different 802.11 standards.
Bandwidth
Bandwidth alone should not be the deciding factor in equipment purchase and installation. In a wired environment, many devices share the same wires. In a wireless environment, many devices share the same radio spectrum. However, with the use of spread-spectrum technology, the resources are reused many times over. It is said that bigger is better, so more bandwidth is better, right? It may not be. In wired networks, the wire's rated clock speed is sometimes confused with traffic throughput. Because Ethernet uses CSMA/CD with statistical multiplexing, the general rule is to design networks in which the throughput does not exceed 30% of the rating, so an Ethernet-based 10 Mbps link would have an average throughput of 3 Mbps. But what if I need more bandwidth for killer applications? We have been waiting for that killer application for some time now. VoIP, videoconferencing, and even on-line interactive training courses use much less bandwidth than one would think. An interactive videoconference uses around 2 Mbps of stream bandwidth.
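The 30% rule above is just arithmetic, but it is worth making explicit:

```python
# Usable design throughput for a CSMA/CD link planned at 30% utilization.

def design_throughput(link_rate_mbps, utilization=0.30):
    """Average throughput to design for, given the link's rated speed."""
    return link_rate_mbps * utilization
```

For a 10 Mbps Ethernet link this gives 3 Mbps, matching the figure in the text; the same rule applied to a 100 Mbps link would give 30 Mbps.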
Now that we have discussed some of the basic concepts of wireless communications, let's take a look at the differences between 802.11a, b, and g. The wireless 802.11 standard is a top-level standard that has been divided into several subsections, including 802.11a, 802.11b, and 802.11g. The 802.11 umbrella covers the sub-committee standards 802.11a, b, and g, along with any other 802.11 standards. There has been more than just the IEEE committee work on wireless standards. Thinking that it could improve both marketing and product quality, a consortium called Bluetooth was formed. Bluetooth's promoters include 3Com, Ericsson, IBM, Intel, Microsoft, Motorola, Nokia, and Toshiba, as well as hundreds of associate and adapter member companies. In Table 4.1, we see a comparison between the different 802.11 and Bluetooth standards. Standards a and b were approved at the same time, but products supporting 802.11b, being less expensive to make, have flooded the market. It should be noted that 802.11b operates in the 2.4 GHz range, with an operational bandwidth of 11 Mbps. Notice that the 802.11a standard operates at 5 GHz with an operational bandwidth of 54 Mbps. These two standards are not compatible.
Table 4.1: Wireless LAN Standards

                  802.11a      802.11b                802.11g      Bluetooth
Data Rate         54-72 Mbps   11 Mbps                54 Mbps      721/56 Kbps
Frequency         5 GHz        2.4 GHz                2.4 GHz      2.4 GHz
Modulation        OFDM         DSSS/CCK               DSSS/PBCC    FHSS
Channels          12/8         11/3                   11/3         79 (1 MHz wide)
Bandwidth (MHz)   300          83.5 (22 MHz/channel)  83.5         83.5
Available Power   40-800 mW    100 mW                 100 mW       100 mW
Flexibility: Within radio coverage, nodes can communicate without further restriction. Radio waves can penetrate walls, and senders and receivers can be placed anywhere.
Planning: Only wireless networks allow for communication without previous planning; any wired network needs wiring plans.
Robustness: Wireless networks can survive disasters. Networks requiring a wired infrastructure will typically break down completely in such cases.
Industry         Applications
Retail           Portable point-of-sale, wireless order entry
Financial        Replicated branches, temporary audit workgroups
Medical          Mobile nursing stations, patient record tracking
Transportation   Remote mobile customer service
Education        Mobile classrooms
Manufacturing    Real-time data collection, inventory management
Government       Wireless office automation
Residential      Personal area networks, wireless home networks
Warehousing      Networking forklift trucks

Table 4.2: Applications of Wireless LANs in Industry
Quality of Service (QoS): WLANs typically offer lower quality than wired networks. The main reasons are lower bandwidth due to limitations in radio transmission (e.g., only 1-10 Mbps), higher error rates due to interference (e.g., 10^-4 instead of 10^-10 for fiber optics), and higher delay/delay variation.
l Cost: High-speed Ethernet adapters cost in the range of some tens of pounds, while wireless LAN adapters, e.g., as PC Cards, range from 100 pounds.
l Proprietary solutions: Due to slow standardization procedures, many companies have come up with proprietary solutions offering standardized functionality plus many enhanced features. However, these additional features only work in a homogeneous environment.
l Restrictions: Several government and non-government institutions worldwide regulate the operation and restrict frequencies to minimize interference. Consequently, it takes a very long time to establish global solutions such as IMT-2000. WLANs are limited to low-power senders and certain license-free frequency bands.
l Safety and security: Using radio waves for data transmission might interfere with other high-tech equipment in, e.g., hospitals. Additionally, the open radio interface makes eavesdropping much easier in WLANs than, e.g., in the case of fiber optics.
The main advantages of infrared technology are its simple and extremely cheap senders and receivers, which are integrated into almost all mobile devices available today. PDAs, laptops, notebooks, mobile phones, etc. have an Infrared Data Association (IrDA) interface. Version 1.0 of this industry standard implements data rates of up to 115 kbps, while IrDA 1.1 defines higher data rates of 1.152 and 4 Mbps. No licenses are required for infrared technology, and shielding is very simple. Furthermore, electrical devices do not interfere with infrared transmission. The disadvantage of infrared transmission is its low bandwidth compared to other LAN technologies. Typically, IrDA devices are internally connected to a serial port, limiting transfer rates to 115 kbps. Infrared is also quite easily shielded: it cannot penetrate walls or other obstacles, and for good transmission quality and high data rates a line of sight (LOS), i.e., a direct connection, is typically needed.
There are many networks that use radio transmission, e.g. GSM at 900, 1,800 and 1,900 MHz, DECT at 1,880 MHz etc.
Advantages of radio transmission include the long-term experience gained with radio transmission for wide area networks (e.g., microwave links) and mobile cellular phones. Radio transmission can cover larger areas and can penetrate (thinner) walls, furniture, plants, etc. Thus, radio typically does not need a LOS if the frequencies are not too high (at very high frequencies, radio waves behave more and more like light). Current radio-based products offer higher transmission rates (e.g., 10 Mbps) than infrared. Disadvantages are that shielding is not so simple, so radio transmission can interfere with other senders, and electrical devices can destroy data transmission via radio. Additionally, radio transmission is only permitted in certain frequency bands. Very limited ranges of license-free bands are available worldwide, and those available are typically not the same in all countries.
WLAN technologies:
1. IEEE 802.11: infrared and radio both 2. HIPERLAN: radio only 3. Bluetooth: radio only
[Figure: The seven-layer OSI model (application, presentation, session, transport, network, data link, physical) adapted for LANs, with HTTP as an example application-layer protocol and the IEEE 802.11 standards providing the lower layers of an example wireless LAN implementation.]
System architecture
Wireless networks can exhibit two different basic system architectures: infrastructure-based and ad hoc. Figure 4.7 shows the components of the infrastructure and wireless parts as specified for IEEE 802.11.
Several nodes, called stations (STAi), are connected to an access point (AP). Stations are terminals with access mechanisms to the wireless medium and radio contact to the AP. The stations and the AP that are within the same radio coverage form a basic service set (BSSi). The example shows two BSSs, BSS1 and BSS2, which are connected via a distribution system. A distribution system connects several BSSs via the APs to form a single network and thereby extends the wireless coverage area. This network is now called an extended service set (ESS). Furthermore, the distribution system connects the wireless networks via the APs with a portal, which forms the internetworking unit to other LANs. The architecture of the distribution system is not specified further in IEEE 802.11. It could consist of bridged IEEE LANs, wireless links, or any other networks. However, distribution system services are defined in the standard. The APs support roaming; the distribution system then handles data transfer between the different APs. Furthermore, APs provide synchronization within a BSS, support power management, and can control medium access to support time-bounded service. IEEE 802.11 also allows the building of ad hoc networks between stations, thus forming one or more BSSs, as shown in Figure 4.8.
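The terminology here (stations associating with an AP to form a BSS, and a distribution system joining BSSs into an ESS) can be captured in a tiny data model; the station and AP names are illustrative only, and this models naming relationships, not radio behavior.

```python
# Toy model of 802.11 infrastructure terms: STA, AP, BSS, ESS.

class BSS:
    """A basic service set: one AP plus the stations in its coverage."""
    def __init__(self, ap_name):
        self.ap = ap_name
        self.stations = []

    def associate(self, station):
        self.stations.append(station)

class ESS:
    """An extended service set: BSSs joined by a distribution system."""
    def __init__(self, *bss_list):
        self.bss_list = list(bss_list)

    def members(self):
        return [sta for bss in self.bss_list for sta in bss.stations]

bss1, bss2 = BSS("AP1"), BSS("AP2")
for sta in ("STA1", "STA2", "STA3"):
    bss1.associate(sta)
bss2.associate("STA4")
ess = ESS(bss1, bss2)   # one network spanning both radio cells
```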
In this case, a BSS comprises a group of stations using the same radio frequency. Stations STA1, STA2, and STA3 are in BSS1; STA4 and STA5 are in BSS2. This means, for example, that STA3 can communicate directly with STA2 but not with STA5. Several BSSs can be kept separate either via the distance between the BSSs or by using different carrier frequencies.
The MAC management supports the association and re-association of a station to an access point and roaming between different access points. Furthermore, it controls authentication mechanisms, encryption, synchronization of a station with regard to an access point, and power management to save battery power. MAC management also maintains the MAC management information base (MIB). The main tasks of the PHY management include channel tuning and PHY MIB maintenance. Finally, station management interacts with both management layers and is responsible for additional higher-layer functions.
PHYSICAL LAYER
IEEE 802.11 supports three different physical layers: one layer based on infrared and two layers on the basis of radio transmission. All PHY variants include the provision of the clear channel assessment signal (CCA). The PHY layer offers a service access point (SAP) with 1 or 2 Mbps transfer rate to the MAC layer.
Synchronization: The PLCP preamble starts with an 80-bit synchronization pattern. This pattern is used for synchronization of potential receivers and signal detection by the CCA.
l Start frame delimiter (SFD): These 16 bits indicate the start of the frame and thus provide frame synchronization.
l PLCP_PDU length word (PLW): The first field of the PLCP header indicates the length of the payload in bytes, including the 32-bit CRC at the end of the payload. PLW can range between 0 and 4,095.
l PLCP signaling field (PSF): Only one bit is currently specified in this 4-bit field, indicating the data rate of the payload (1 or 2 Mbit/s).
Header error check (HEC): The PLCP header is protected by a 16-bit checksum with the standard ITU-T generator polynomial G(x) = x^16 + x^12 + x^5 + 1.
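The HEC computation can be sketched with a straightforward bitwise CRC over the generator polynomial above (0x1021, initial value 0). This is an illustration only; the header bytes used here are made up, not taken from the standard.

```python
def crc16(data: bytes) -> int:
    """Bitwise CRC-16 with G(x) = x^16 + x^12 + x^5 + 1 (0x1021), init 0."""
    crc = 0x0000
    for byte in data:
        crc ^= byte << 8              # bring the next byte into the high bits
        for _ in range(8):
            if crc & 0x8000:          # high bit set: shift and subtract G(x)
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

header = b"\x0f\xf0\x12\x34"          # hypothetical header bytes
hec = crc16(header)
# Appending the checksum makes the CRC of the whole sequence zero,
# which is how a receiver can verify the protected header.
assert crc16(header + hec.to_bytes(2, "big")) == 0
```

The zero-remainder check works because, with initial value 0, the transmitted checksum is exactly the division remainder that the receiver's computation cancels out.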
Synchronization: The first 128 bits are not only used for synchronization, but also for gain setting, energy detection (for the CCA), and frequency offset compensation.
Start frame delimiter (SFD): This 16-bit field is used for synchronization at the beginning of a frame.
Signal: Only two values have been defined for this field to indicate the data rate of the payload.
Service: This field is reserved for future use.
Length: 16 bits are used for length indication of the payload.
Header error check (HEC): The signal, service and length fields are protected by this checksum using the ITU-T CRC-16 standard polynomial.
Infrared
This PHY layer, based on infrared (IR) transmission, uses near-visible light at 850-950 nm, which is not regulated apart from safety restrictions (when using lasers instead of LEDs). The standard does not require a line of sight between sender and receiver, but should also work with diffuse light. This allows for point-to-multipoint communication. The maximum range is about 10 m if no sunlight or heat source interferes with the transmission. Typically, such a network will only work in buildings, e.g., classrooms, meeting rooms etc. Frequency reuse is very simple: a wall is more than enough to shield one IR-based IEEE 802.11 network from another. Table 4.3 summarizes the various features of the spread spectrum and infrared specifications.
Feature                       Spread Spectrum
Frequency                     2.4 - 2.4835 GHz; 5.725 - 5.825 GHz
Maximum coverage              30 - 250 m, 4500 m^2
Line of sight requirement     No, but in practice the radio waves penetrate only one concrete wall
Transmit power                DSSS: 1 - 100 mW; FHSS: 10 - 100 mW
Interbuilding use             Possible with antenna
Rated speed                   20% to 50% (% of 10 Mbps wire)
Table 4.3 : Features of spread spectrum & infrared specifications
DCF inter-frame spacing (DIFS): This parameter denotes the longest waiting time and thus the lowest priority for medium access. This waiting time is used for the asynchronous data service within a contention period.
PCF inter-frame spacing (PIFS): A waiting time between DIFS and SIFS (and thus a medium priority) is used for the time-bounded service. That is, an access point polling other nodes only has to wait PIFS for medium access.
Short inter-frame spacing (SIFS): The shortest waiting time for medium access (and thus the highest priority) is defined for short control messages, such as acknowledgements of data packets or polling responses.
If the medium is sensed idle for at least the duration of DIFS, a node can access the medium at once. This allows for short access delay under light load. But as soon as more and more nodes try to access the medium, additional mechanisms are needed. If the medium is busy, nodes have to wait for the duration of DIFS, entering a contention phase afterwards. Each node now chooses a random backoff time within a contention window and additionally delays medium access for this random amount of time. As soon as a node senses that the channel is busy, it has lost this cycle and has to wait for the next chance, i.e., until the medium is idle again for at least DIFS. But if the randomized additional waiting time for a node is over and the medium is still idle, the node can access the medium immediately. The additional waiting time is measured in multiples of slots. Slot time is derived from the medium propagation delay, transmitter delay and other PHY-dependent parameters. To provide fairness, IEEE 802.11 adds a backoff timer. Again, each node selects a random waiting time within the range of the contention window. As soon as the counter expires, the node accesses the medium. Deferred stations do not choose a randomized backoff time again but continue to count down, so stations that have already waited longer have an advantage over newly entering stations.
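The countdown behaviour of deferred stations can be sketched as a toy model in slot units (this is an illustration of the fairness rule, not the standard's state machine):

```python
def contend(backoffs: dict) -> tuple:
    """One DCF contention round in slot units: all stations count down
    together; the station whose backoff expires first wins the medium,
    and the others freeze their remaining backoff for the next round."""
    winner = min(backoffs, key=backoffs.get)
    elapsed = backoffs[winner]
    remaining = {s: b - elapsed for s, b in backoffs.items() if s != winner}
    return winner, remaining

# Station C drew the shortest backoff (2 slots) and sends first; A and B
# resume the next round with their residual values, which is what gives
# long-waiting stations the advantage over newly entering ones.
winner, rest = contend({"A": 3, "B": 5, "C": 2})
```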
Figure 4.14 explains the basic access mechanism of IEEE 802.11 for five stations trying to send a packet at the marked points in time. Station3 has the first request from a higher layer to send a packet; it waits for DIFS and accesses the medium, i.e., sends the packet. Station1, station2, and station5 have to wait at least until the medium is idle for DIFS again after station3 has stopped sending. Now all three stations choose a backoff time within the contention window and start counting down their backoff timers.

Still, the access scheme has problems under heavy or light load. Depending on the size of the contention window (CW), the random values can either be too close together, causing too many collisions, or too high, causing unnecessary delay. The contention window starts with a size of, e.g., CWmin = 7. Each time a collision occurs, indicating a higher load on the medium, the contention window doubles up to a maximum of, e.g., CWmax = 255 (the window can take on the values 7, 15, 31, 63, 127, and 255). The larger the contention window is, the greater the resolution power of the randomized scheme: it is less likely that two stations choose the same random backoff time using a large CW. However, under a light load, a small CW ensures shorter access delays. This algorithm is also called exponential backoff and is already familiar from IEEE 802.3 CSMA/CD in a similar version.
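The doubling of the contention window after each collision can be written down directly, using the CWmin = 7 and CWmax = 255 values from the text:

```python
CW_MIN, CW_MAX = 7, 255

def next_cw(cw: int) -> int:
    """Exponential backoff: after a collision the window roughly doubles
    (2*cw + 1 keeps it of the form 2^k - 1), capped at CW_MAX."""
    return min(2 * cw + 1, CW_MAX)

cw, sizes = CW_MIN, [CW_MIN]
while cw < CW_MAX:
    cw = next_cw(cw)
    sizes.append(cw)
# sizes == [7, 15, 31, 63, 127, 255], the sequence given in the text
```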
Figure 4.15 shows a sender accessing the medium and sending its data. But now the receiver answers directly with an acknowledgement (ACK). The receiver accesses the medium after waiting only for the duration of SIFS, so no other station can access the medium in the meantime and cause a collision. The other stations have to wait for DIFS plus their backoff time. This acknowledgement ensures the correct reception of the frame on the MAC layer, which is especially important in error-prone environments such as wireless connections. If no ACK is returned, the sender automatically retransmits the frame. But now the sender has to wait again and compete for the access right.
Figure 4.16 illustrates the use of RTS and CTS. After waiting for DIFS, the sender can issue a request to send (RTS). The RTS packet is thus not given any higher priority compared to other data packets. The RTS packet includes the receiver of the data transmission to come and the duration of the whole data transmission.
This duration specifies the time interval necessary to transmit the whole data frame and the acknowledgement related to it. Every node receiving this RTS now has to set its net allocation vector (NAV) in accordance with the duration field. The NAV then specifies the earliest point at which the station can try to access the medium again. If the receiver of the data transmission receives the RTS, it answers with a clear to send (CTS) message after waiting for SIFS. This CTS packet contains the duration field again, and all stations receiving this packet from the receiver of the intended data transmission have to adjust their NAV. The latter set of receivers need not be the same as the first set receiving the RTS packet. Now all nodes within receiving distance around sender and receiver are informed that they have to wait before accessing the medium. Basically, this mechanism reserves the medium for one sender exclusively. Finally, the sender can send the data after SIFS. The receiver waits for SIFS after receiving the data packet and then acknowledges whether the transfer was correct. The transmission has then been completed, the NAV in each node marks the medium as free again, and the standard cycle can start over.

However, the mechanism of fragmenting a user data packet into several smaller parts should be transparent for a user. Furthermore, the MAC layer should have the possibility of adjusting the transmission frame size to the current error rate on the medium. Therefore, the IEEE 802.11 standard specifies a fragmentation mode. Again, a sender can send an RTS control packet to reserve the medium after a waiting time of DIFS. This RTS packet now includes only the duration for the transmission of the first fragment and the corresponding acknowledgement. The receiver answers with a CTS, again including the duration of the transmission up to the acknowledgement, and a certain set of nodes receives this CTS message and sets its NAV.
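A node's NAV handling on hearing an RTS or CTS can be sketched as a simple maximum over announced reservation ends (time units are arbitrary here; this is an illustrative model, not the standard's exact procedure):

```python
def update_nav(nav_end: int, now: int, duration: int) -> int:
    """On receiving an RTS or CTS, a node extends its NAV to the end of
    the announced reservation, but never shortens an existing one."""
    return max(nav_end, now + duration)

def may_access(nav_end: int, now: int) -> bool:
    """The medium may only be contended for once the NAV has expired."""
    return now >= nav_end

nav = 0
nav = update_nav(nav, 100, 500)   # RTS with duration 500 heard at t=100
# nav == 600: the node defers until t=600 before contending again
```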
As shown in figure 6.10, the sender can now send the first data frame, frag1, after waiting only for SIFS. The new aspect of this fragmentation mode is that another duration value is included in the frame frag1. This duration field reserves the medium for the duration of the transmission comprising the second fragment and its acknowledgement. Several nodes may receive this reservation and adjust their NAV. The receiver of frag1 answers directly after SIFS with the acknowledgement packet ACK1, including the reservation for the next transmission, as shown in figure 6.10. If frag2 were not the last frame of this transmission, it would also include a new duration for the third consecutive transmission. Being the last fragment here, the receiver acknowledges it without reserving the medium again. After ACK2, all nodes can compete for the medium again after having waited for DIFS.
With the point coordination function (PCF), a point coordinator in the access point controls medium access and polls the nodes. Ad hoc networks cannot use this function and thus provide no QoS, but only best effort, in IEEE 802.11 WLANs. The point coordinator in the access point splits the access time into super frame periods as shown in figure 4.17. A super frame comprises a contention-free period and a contention period. The contention period can be used for the two access mechanisms presented above. Figure 4.18 also shows several wireless stations and the stations' NAV.
At time t0 the contention-free period of the super frame should theoretically start, but another station is still transmitting data. This means that PCF also defers to DCF, and thus, the start of the super frame may be postponed. The only possibility of avoiding variation is not to have any contention period at all. After the medium has been idle until t1, the point coordinator has to wait for PIFS before accessing the medium. Since PIFS is smaller than DIFS, no other station can start sending earlier.
The point coordinator now sends data D1 downstream to the first wireless station. This station can answer at once after SIFS. After waiting for SIFS again, the point coordinator can poll the second station by sending D2. This station may answer upstream to the coordinator with its own data. Polling continues with the third node. This time the node has nothing to answer and, thus, the point coordinator will not receive a packet after SIFS. After waiting for PIFS, the coordinator can resume polling the stations. Finally, the point coordinator can issue an end marker (CFend), indicating that the contention period may start again. Using PCF automatically sets the NAV, preventing other stations from sending. In the example, the contention-free period planned initially would have been from t0 to t3. However, the point coordinator finished polling earlier, thus shifting the end of the contention-free period to t2. At t4, the cycle starts again with the next super frame.
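The polling loop described above can be sketched as follows. The SIFS/PIFS values and the station list are illustrative; the point is the control flow: a reply comes back after SIFS, a silent station costs the coordinator an extra PIFS wait, and a CF end marker closes the round.

```python
SIFS, PIFS = 10, 30  # waiting times in microseconds; illustrative values only

def poll_round(stations):
    """One contention-free polling round of the point coordinator.
    `stations` is a list of (name, has_data_to_send) pairs."""
    events = []
    for name, has_data in stations:
        events.append(("poll", name))
        if has_data:
            events.append(("reply_after_SIFS", name))
        else:
            events.append(("resume_after_PIFS", name))
    events.append(("CF_end", None))  # end marker reopens the contention period
    return events

timeline = poll_round([("sta1", True), ("sta2", True), ("sta3", False)])
```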
MAC Frames
The figure 4.19 shows the basic structure of an IEEE 802.11 MAC data frame.
Frame control: This field indicates the protocol version, the type of the frame (management, control, data), whether the frame has been fragmented, privacy information, and the two DS bits (distribution system bits), which indicate the meaning of the four address fields in the frame.
Duration/ID: For the virtual reservation mechanism using RTS/CTS, and during fragmentation, the duration field contains a value indicating the period of time for which the medium is occupied.
Address 1 to 4: The four address fields contain standard IEEE 802 MAC addresses (48 bits each), as they are known from other 802.x LANs. The meaning of each address depends on the DS bits in the frame control field.
Sequence control: Due to the acknowledgement mechanism it may happen that frames are duplicated. Therefore, a sequence number is used to filter duplicates.
Data: The MAC frame may contain arbitrary data (up to 2,312 bytes), which is transferred transparently from the sender to the receiver(s).
Checksum (CRC): Finally, a 32-bit checksum is used to protect the frame, as is common procedure in all 802.x networks.
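The layout of the frame control field can be illustrated by unpacking its subfields with bit operations. The bit positions follow the standard's ordering (version in bits 0-1, type in bits 2-3, subtype in bits 4-7, then the To DS/From DS bits); the sample value below is constructed for illustration.

```python
def parse_frame_control(fc: int) -> dict:
    """Split a 16-bit frame control value into its main subfields."""
    return {
        "version":  fc       & 0b11,
        "type":    (fc >> 2) & 0b11,     # 0 = management, 1 = control, 2 = data
        "subtype": (fc >> 4) & 0b1111,
        "to_ds":   (fc >> 8) & 0b1,      # the two DS bits select the
        "from_ds": (fc >> 9) & 0b1,      # meaning of the address fields
    }

# a data frame (type 2) sent towards the distribution system (To DS = 1)
fields = parse_frame_control((2 << 2) | (1 << 8))
```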
MAC frames can be transmitted between mobile stations, between mobile stations and an access point, and between access points over a distribution system.
i) Synchronization:
Each node of an 802.11 network maintains an internal clock. To synchronize the clocks of all nodes, IEEE 802.11 specifies a timing synchronization function (TSF). Synchronized clocks are needed for power management, for coordination of the PCF, and for synchronization of the hopping sequence in an FHSS system. Using PCF, the local timer of a node can predict the start of a super frame, i.e., the contention-free and contention periods. FHSS physical layers need the same hopping sequences for all nodes to be able to communicate within a BSS.

Within a BSS, timing is conveyed by the periodic transmission of a beacon frame. A beacon contains a timestamp and other management information used for power management and roaming. The timestamp is used by a node to adjust its local clock. The node is not required to hear every beacon to stay synchronized; however, from time to time internal clocks should be adjusted. The transmission of a beacon frame is not always periodic, but may be deferred if the medium is busy.

Within infrastructure-based networks, the AP performs synchronization by transmitting the periodic beacon signal, whereas all other wireless nodes adjust their local timers to the timestamp. This is shown in figure 4.20. The AP is not always able to send its beacon B periodically if the medium is busy. However, the AP always tries to schedule transmissions according to the expected beacon interval (target beacon transmission time), i.e., beacon intervals are not shifted if one beacon is delayed. The timestamp of a beacon always reflects the real transmit time, not the scheduled time.
For ad hoc networks, the situation is slightly more complicated as they do not have an AP for beacon transmission. In this case, each node maintains its own synchronization timer and starts the transmission of a beacon frame after the beacon interval. Figure 4.21 shows an example where multiple stations try to send their beacon. However, the standard random backoff algorithm is also applied to the beacon frames, and thus typically only one beacon wins. All other stations now adjust their internal clocks according to the received beacon and suppress their beacons for this cycle. If a collision occurs, the beacon is lost. In this scenario, the beacon intervals can be shifted slightly in time because all clocks may vary, and thus also the start of a beacon interval from a node's point of view. However, after synchronization all nodes again have the same consistent view.
ii) Power Management

The basic idea of IEEE 802.11 power management is to switch off the transceiver whenever it is not needed. Since the power management cannot know in advance when the transceiver has to be active for a specific packet, it has to wake up the transceiver periodically. Switching off the transceiver should be transparent to existing protocols and should be flexible enough to support different applications. However, throughput can be traded off for battery life: longer sleeping periods save battery power but reduce average throughput, and vice versa. The basic idea of power saving includes two states for a station, sleep and awake, and buffering of data at senders: a sender has to buffer data if the destination station is asleep. The sleeping station, on the other hand, has to wake up periodically and stay awake for a certain time. During this time, all senders can announce the destinations of their buffered data frames. If a station detects that it is a destination of a buffered packet, it has to stay awake until the transmission takes place. Waking up at the right moment requires the timing synchronization function (TSF): all stations have to wake up or be awake at the same time. Figure 4.22 shows an example with an access point and one station.
Power management in infrastructure-based networks is much simpler than in ad hoc networks. In the latter case, there is no AP to buffer data in one location; instead, each station needs the ability to buffer data if it wants to communicate with a power-saving station. All stations now announce a list of buffered frames during a period when they are all awake. Destinations are announced using ad hoc traffic indication maps (ATIMs); the announcement period is called the ATIM window.
Figure 4.23 shows a simple ad hoc network with two stations. Again, the beacon interval is determined by a distributed function (different stations may send the beacon). Due to this synchronization, all stations within the ad hoc network wake up at the same time. All stations stay awake for the ATIM interval, as shown in the first steps, and go to sleep again if no frame is buffered for them. In the third step, station1 has data buffered for station2. This is indicated in an ATIM transmitted by station1. Station2 acknowledges this ATIM and stays awake for the transmission. After the ATIM window, station1 can transmit the data frame, and station2 acknowledges its receipt. In this case, the stations stay awake until the next beacon.
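The sleep/awake decision of a power-saving station in this ad hoc scheme can be sketched as follows; the beacon interval and ATIM window lengths are made-up numbers chosen only to illustrate the rule.

```python
def is_awake(t: int, beacon_interval: int, atim_window: int,
             announced_for_me: bool) -> bool:
    """A station is awake during the ATIM window at the start of every
    beacon interval; outside that window it stays awake only if an ATIM
    announced buffered data for it in the current interval."""
    in_atim_window = (t % beacon_interval) < atim_window
    return in_atim_window or announced_for_me

# with a 100-unit beacon interval and a 20-unit ATIM window, a station
# with no announced traffic sleeps from t=20 to t=99 in every interval
awake_now = is_awake(50, beacon_interval=100, atim_window=20,
                     announced_for_me=False)
```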
iii) Roaming
Typical wireless networks within buildings require more than just one access point to cover all rooms. Depending on the solidity and material of the walls, one AP has a transmission range of 10-20 m if transmission is to have a decent quality. If a user walks around with a wireless station, the station has to move from one AP to another to provide uninterrupted service. Moving between APs is called roaming. The steps for roaming between APs are the following:
A station notices that the current link quality to its AP1 is too poor. The station then starts scanning for another AP. Scanning involves the active search for another BSS and can also be used for setting up a new BSS in the case of ad hoc networks. IEEE 802.11 specifies scanning on single or multiple channels and differentiates between passive scanning and active scanning. Passive scanning means listening into the medium to find other networks, i.e., receiving the beacon of another network issued by the synchronization function within an AP. Active scanning comprises sending a probe on each channel and waiting for a response. Beacon and probe responses contain the information necessary to join the new BSS.
The station then selects the best AP for roaming based on, e.g., signal strength, and sends an association request to the selected AP2. The new AP2 answers with an association response. If the response is successful, the station has roamed to the new AP2; otherwise, the station has to continue scanning for new APs. The AP accepting an association request indicates the new station in its BSS to the distribution system (DS). The DS then updates its database, which contains the current location of the wireless stations. This database is needed for forwarding frames between different BSSs, i.e., between the different APs controlling the BSSs, which combine to form an ESS.
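The AP selection step can be sketched as picking the strongest candidate from the scan results. The hysteresis margin below is an assumption added to avoid oscillating between two nearly equal APs; it is not part of the steps described above.

```python
def select_ap(scan: dict, current: str, hysteresis: float = 5.0) -> str:
    """Roam to the strongest AP (signal strength in dBm) only if it
    beats the current AP by at least `hysteresis` dB."""
    best = max(scan, key=scan.get)
    if best != current and scan[best] >= scan.get(current, float("-inf")) + hysteresis:
        return best
    return current

# AP2 is clearly stronger than the current AP1, so the station roams
new_ap = select_ap({"AP1": -80.0, "AP2": -60.0}, current="AP1")
```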
4.11 SUMMARY
In this chapter we presented the definition of signals and their characteristics. Specifically, we introduced the concepts of modulation, carrier signal, noise and bandwidth. These serve as the basis for WLANs. We have touched upon the different IEEE standards used in wireless applications. The architecture and protocols of WLANs are covered in detail. Some related topics in the MAC layer and power management are also discussed.
4.12 QUESTIONS
1. What are WLANs?
2. What is modulation?
3. What is a carrier signal?
4. Define SNR.
5. What is BW?
6. Compare 802.11a, 802.11b, 802.11g and Bluetooth.
7. List the advantages and disadvantages of WLANs.
8. Compare infrared and radio transmission.
9. Discuss the architecture of WLAN.
10. Briefly explain the WLAN protocol architecture.
11. Write a note on DSSS.
Chapter 5
5.1 INTRODUCTION
Up until the mid-1970s, cryptography was an arcane science practised largely by government and military security experts. A more serious attempt at control occurred in 1980, when the NSA (National Security Agency) funded the American Council on Education to examine the issue with a view to persuading Congress to give it legal control of publications in the field of cryptography. As the eighties progressed, pressure focused more on the practice than the study of cryptography. This gave rise to the wide use of cryptography in all fields of computing as well as the Internet.

With the introduction of the computer, the need for automated tools for protecting files and other information stored on the computer became evident. This is especially the case for a shared system such as a time-sharing system, and the need is even more acute for systems that can be accessed over a public telephone or data network. The generic name for the collection of tools designed to protect data and to thwart hackers is computer security. The second major change that affected security is the introduction of distributed systems and the use of networks and communications facilities for carrying data between terminal user and computer and also between computers. Network security measures are needed to protect data during their transmission.
Cryptography is the science of securing data, whereas cryptanalysis is the science of analyzing and breaking secure communication. Cryptanalysts are also called attackers. Classical cryptanalysis involves an interesting combination of analytical reasoning, application of mathematical tools, pattern finding, patience and determination.
Security attack: An action that compromises the security of information owned by an organization.
Security mechanism: A mechanism that is designed to detect, prevent or recover from a security attack.
Security service: A service that enhances the security of the data processing systems and the information transfers of an organization. The services are intended to counter security attacks, and they make use of one or more security mechanisms to provide the service.
knowledge of the encryption algorithms used. The four general categories of attack are shown in figure 5.1.
Fig. 5.1: Security threats — (b) interruption, (c) interception, (d) modification, (e) fabrication
Interruption: An asset of the system is destroyed or becomes unavailable or unusable. This is an attack on availability.
Interception: An unauthorized party gains access to an asset. This is an attack on confidentiality. The unauthorized party could be a person, a program, or a computer.
Modification: An unauthorized party not only gains access to but tampers with an asset. This is an attack on integrity.
Fabrication: An unauthorized party inserts counterfeit objects into the system. This is an attack on authenticity.
Attacks are mainly categorized into passive and active attacks (figure 5.2).
Passive attack: In this attack, the goal of the opponent is to obtain information that is being transmitted. There exist two types of passive attack: release of message contents and traffic analysis.
96
The release of message contents is easily understood. A telephone conversation, an electronic mail message, or a transferred file may contain sensitive or confidential information. It is necessary to prevent the opponent from learning the contents of these transmissions.
The second type of passive attack, traffic analysis, is more subtle. Suppose that we had a way of masking the contents of messages or other information traffic so that opponents, even if they captured a message, could not extract the information from it. The common technique for masking contents is encryption. Even with encryption protection in place, an opponent might still be able to observe the pattern of the messages. The opponent could determine the location and identity of communicating hosts and could observe the frequency and length of messages being exchanged. This information might be useful in guessing the nature of the communication that was taking place.
Passive attacks are very difficult to detect because they do not involve any alteration of the data. The emphasis in dealing with passive attacks is on prevention of the attack rather than detection.
Active attacks
These attacks involve some modification of the data stream or the creation of a false stream, and they are divided into four categories: masquerade, replay, modification of messages and denial of service.

Masquerade: This takes place when one entity pretends to be a different entity. A masquerade usually includes one of the other forms of active attack, i.e., replay, modification of messages or denial of service.
Replay: This involves the passive capture of a data unit and its subsequent retransmission to produce an unauthorized effect.
Modification of messages: This means that some portion of a message is altered, or that messages are delayed or reordered, to produce an unauthorized effect.
Denial of service: This prevents or inhibits the normal use or management of communications facilities. This attack may have a specific target: for example, an entity may suppress all messages directed to a particular destination. Another form of service denial is the disruption of an entire network, either by disabling the network or by overloading it with messages so as to degrade performance.

Active attacks present the opposite characteristics of passive attacks: they are difficult to prevent but easier to detect. Prevention is difficult because it would require physical protection of all communications facilities and paths at all times. Instead, the goal is to detect active attacks and to recover from any disruption or delays caused by them. Because detection has a deterrent effect, it may also contribute to prevention.
general security services that encompass the various functions required of an information security facility. Security services can be classified as follows:
Confidentiality: This is the main service offered by cryptography. It ensures that the information in a computer system and transmitted information are accessible only for reading by authorized parties. This type of access includes printing, displaying and other forms of disclosure, including simply revealing the existence of an object.
Authentication: This ensures that the origin of a message or electronic document is correctly identified, with an assurance that the identity is not false.
Integrity: Ensures that only authorized parties are able to modify computer system assets and transmitted information. Modification includes writing, changing, changing status, deleting, creating, and delaying or replaying transmitted messages.
Non-repudiation: This requires that neither the sender nor the receiver of a message be able to deny the transmission.
Access control: Requires that access to information resources be controlled by or for the target system.
Availability: Requires that computer system assets be available to authorized parties when needed.
A security-related transformation on the information to be sent.
Some secret information shared by the two principals and, it is hoped, unknown to the opponent.
A trusted third part may be needed to achieve secure transmission. For example, a third party may be responsible for distributing the secret information to the two principals while keeping it from any opponent.
Or a third party may be needed to arbitrate disputes between the two principals concerning the authenticity of a message transmission.

The above general model shows that there are four basic tasks in designing a particular security service:
1. Design an algorithm for performing the security-related transformation. The algorithm should be such that an opponent cannot defeat its purpose.
2. Generate the secret information to be used with the algorithm.
3. Develop methods for the distribution and sharing of the secret information.
4. Specify a protocol to be used by the two principals that makes use of the security algorithm and the secret information to achieve a particular security service.

A general model of other situations, which reflects a concern for protecting an information system from unwanted access, is shown in figure 5.4. Hackers are persons who attempt to penetrate systems that can be accessed over a network. The hacker can be someone who, with no malign intent, simply gets satisfaction from breaking and entering a computer system. Or, the intruder can be a disgruntled employee who wishes to do damage, or a criminal who seeks to exploit computer assets for financial gain.
[Figure 5.4: Network access security model — an opponent reaches the information system over an access channel; a gatekeeper function guards the computing resources (data, processes, software)]
Information access threats: interception or modification of data on behalf of users who should not have access to that data.
The security mechanisms needed to cope with unwanted access fall into two broad categories, as shown in figure 5.4. The first category might be termed a gatekeeper function. It includes password-based login procedures that are designed to deny access to all but authorized users, and screening logic that is designed to detect and reject viruses and similar attacks. Once an unwanted user or unwanted software gains access, the second line of defense consists of a variety of internal controls that monitor activity and analyze stored information in an attempt to detect the presence of unwanted intruders.
The process of disguising a message in such a way as to hide its substance is called encryption. An encrypted message is called ciphertext. The process of turning ciphertext back into plaintext is called decryption, as shown below.
M or P denotes the message or plaintext. It can be a stream of bits, a text file, a bitmap, a stream of digitized voice, or a digital video image. As far as a computer is concerned, M is simply binary data. The plaintext can be intended for either transmission or storage; in either case, it is the input to be encrypted. C denotes the ciphertext, which is also binary data. The size of C can sometimes be the same as that of M, or it may be larger. The encryption function E operates on M to produce C, written mathematically as E(M) = C. In the reverse process, the decryption function D operates on C to produce M, i.e., D(C) = M. The whole point of encrypting and then decrypting a message is to recover the original plaintext. Both the encryption and decryption operations use a key (i.e., they are dependent on the key, a fact denoted by the K subscript), so the functions become:

Ek(M) = C
Dk(C) = M
Dk(Ek(M)) = M
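The relations Ek(M) = C, Dk(C) = M and Dk(Ek(M)) = M can be demonstrated with a toy repeating-key XOR cipher. This is for illustration only; such a cipher is not secure, and the message and key below are made up.

```python
def xor_cipher(data: bytes, key: bytes) -> bytes:
    """For XOR, encryption and decryption are the same keyed operation:
    applying the key a second time recovers the original bytes."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

M = b"attack at dawn"
K = b"secret"
C = xor_cipher(M, K)            # Ek(M) = C: the ciphertext is binary data
assert xor_cipher(C, K) == M    # Dk(Ek(M)) = M: the plaintext is recovered
```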
[Figure: Model of a conventional cryptosystem — a message source produces plaintext X; the encryption algorithm uses key K (delivered from a key source over a secure channel) to produce ciphertext Y for the destination's decryption algorithm; a cryptanalyst observing Y produces estimates X1 and K1]
A source produces a message in plaintext, X = [X1, X2, ..., XM]. The elements of X are letters in some finite alphabet. Traditionally, the alphabet consisted of the 26 letters; nowadays, the binary alphabet {0, 1} is typically used. For encryption, a key of the form K = [K1, K2, ..., KJ] is generated. If the key is generated at the message source, then it must also be provided to the destination by means of some secure channel. (There is also the possibility that a third party could generate the key and securely deliver it to both source and destination.) With the message X and the encryption key K as input, the encryption algorithm forms the ciphertext Y = [Y1, Y2, ..., YN]. We can write this as Y = Ek(X). This notation indicates that Y is produced by using encryption algorithm E as a function of the plaintext X, with the specific function determined by the value of the key K. The intended receiver, in possession of the key, is able to invert the transformation: X = Dk(Y). An opponent, observing Y but not having access to K or X, may attempt to recover X or K or both. If the opponent knows the encryption (E) and decryption (D) algorithms, he may try to recover X by generating a plaintext estimate X1; if he wants to read future messages as well, an attempt is made to recover K by generating an estimate K1.
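The model can be made concrete with a shift (Caesar) substitution cipher over the 26-letter alphabet. Here the key K is simply the shift amount, an illustrative stand-in for a real key; the sample message is made up.

```python
import string

ALPHABET = string.ascii_lowercase

def encrypt(X: str, K: int) -> str:
    """Y = Ek(X): substitute each letter by the one K places later."""
    return "".join(ALPHABET[(ALPHABET.index(ch) + K) % 26] if ch in ALPHABET else ch
                   for ch in X)

def decrypt(Y: str, K: int) -> str:
    """X = Dk(Y): invert the transformation with the same key."""
    return encrypt(Y, -K)

Y = encrypt("meet at dawn", 3)      # the channel carries only Y
assert decrypt(Y, 3) == "meet at dawn"
```

An opponent observing Y but not K would have to try candidate keys or exploit letter statistics, which is exactly the cryptanalyst's position in the model.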
5.8 CRYPTOGRAPHY
Cryptographic systems are generically classified along three independent dimensions: 1. The type of operations used for transforming plaintext to ciphertext: All encryption algorithms are based on two general principles: substitution, in which each element in the plaintext (bit, letter, group of bits or letters) is mapped into another element, and transposition, in which elements in the plaintext are rearranged.
2. The number of keys used: If both sender and receiver use the same key, the system is referred to as symmetric, single-key, secret-key, or conventional encryption. If the sender and receiver each use a different key, the system is referred to as asymmetric, two-key, or public-key encryption. 3. The way in which the plaintext is processed: A block cipher processes the input one block of elements at a time, producing an output block for each input block; a stream cipher processes the input elements continuously, producing output one element at a time.
5.8.1 Cryptanalysis
The whole point of cryptography is to keep the plaintext (or the key, or both) secret from opponents (also called adversaries, attackers, interceptors, interlopers, intruders, or simply the enemy). The process of attempting to discover X (the message) or K (the key) or both is known as cryptanalysis. There are four general types of cryptanalytic attacks. 1. Ciphertext-only attack: The cryptanalyst has the ciphertext of several messages, all of which have been encrypted using the same encryption algorithm and key. The cryptanalyst's job is to recover the plaintext of as many messages as possible, or better yet to deduce the key, in order to decrypt other messages encrypted with the same key. Given: C1 = Ek(P1), C2 = Ek(P2), ..., Ci = Ek(Pi) Deduce: Either P1, P2, ..., Pi, K; or an algorithm to infer Pi+1 from Ci+1 = Ek(Pi+1) 2. Known plaintext attack: The cryptanalyst has access to the ciphertext as well as the plaintext of several messages. The cryptanalyst's job is to deduce the key (or keys) used to encrypt the messages, or an algorithm to decrypt any new messages encrypted with the same key (or keys). Given: P1, C1 = Ek(P1); P2, C2 = Ek(P2); ...; Pi, Ci = Ek(Pi) Deduce: Either K, or an algorithm to infer Pi+1 from Ci+1 = Ek(Pi+1) 3. Chosen plaintext attack: The cryptanalyst not only has access to the ciphertext and associated plaintext for several messages, but also chooses the plaintext that gets encrypted. This is more powerful than a known plaintext attack, because the cryptanalyst can choose specific plaintext blocks to encrypt, which might yield more information about the key. Given: P1, C1 = Ek(P1); P2, C2 = Ek(P2); ...; Pi, Ci = Ek(Pi), where the cryptanalyst gets to choose P1, P2, ..., Pi Deduce: Either K, or an algorithm to infer Pi+1 from Ci+1 = Ek(Pi+1) 4. Adaptive chosen plaintext attack: This is a special case of the chosen plaintext attack, in which the cryptanalyst can choose and modify the plaintext that is encrypted based on the results of previous encryptions. The ciphertext-only attack is the easiest to defend against, because the opponent has the least amount of information to work with. In many cases, however, the analyst is able to capture one or more plaintext messages as well as their encryptions. For example, a file encoded in the PostScript format always begins with the same pattern, or there may be a standardized header or banner on an electronic funds transfer message, and so on. These are examples of known plaintext. From this knowledge, the analyst may be able to deduce the key on the basis of the way in which the known plaintext is transformed. In a chosen plaintext attack, the analyst goes further and chooses messages with patterns that can be expected to reveal the structure of the key.
5.9 STEGANOGRAPHY
Steganography hides the secret message inside other messages. Typically the sender writes an innocuous message and then conceals a secret message on the same piece of paper. Historical tricks include invisible inks, tiny pin punctures on selected characters, minute differences between handwritten characters, pencil marks on typewritten characters, grilles which cover most of the message except for a few characters, and so on. Some examples are listed below:
• Character marking: Selected letters of printed or typewritten text are overwritten in pencil. The marks are ordinarily not visible unless the paper is held at an angle to bright light.
• Invisible ink: A number of substances can be used for writing but leave no visible trace until heat or some chemical is applied to the paper.
• Pin punctures: Small pin punctures on selected letters are ordinarily not visible unless the paper is held up in front of a light.
• Typewriter correction ribbon: Used between lines typed with a black ribbon, the results of typing with the correction tape are visible only under a strong light.
The advantage of steganography is that it can be employed by parties who have something to lose should the very fact of their secret communication (not just its content) be discovered. But it has disadvantages compared with encryption: steganography requires a lot of overhead to hide a few bits of information, and once the system is discovered it becomes useless. These drawbacks can be mitigated by first encrypting the message and then hiding it using steganography, so that the secrecy of the information is maintained even if the hiding place is found.
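As a toy illustration of hiding a few bits in a larger innocuous carrier, the following sketch embeds a secret message in the least significant bits of a byte sequence (a stand-in for image or audio sample data). All names and the sample data are hypothetical:

```python
def hide(cover: bytearray, secret: bytes) -> bytearray:
    """Embed each bit of `secret` in the least significant bit of one cover byte."""
    out = bytearray(cover)
    for i, byte in enumerate(secret):
        for j in range(8):
            bit = (byte >> (7 - j)) & 1
            pos = i * 8 + j
            out[pos] = (out[pos] & 0xFE) | bit   # clear LSB, then set it to the bit
    return out

def reveal(stego: bytes, length: int) -> bytes:
    """Recover `length` secret bytes by reading the LSBs back in order."""
    secret = bytearray()
    for i in range(length):
        byte = 0
        for j in range(8):
            byte = (byte << 1) | (stego[i * 8 + j] & 1)
        secret.append(byte)
    return bytes(secret)

cover = bytearray(range(256)) * 2          # stand-in for image sample data
stego = hide(cover, b"meet at dawn")
assert reveal(stego, 12) == b"meet at dawn"
```

Note the overhead: eight cover bytes are consumed for every secret byte, which illustrates why steganography hides few bits relative to the carrier size.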
In the classical encryption techniques there are four types of substitution ciphers:
1. A simple substitution cipher, or monoalphabetic cipher, is one in which each character of the plaintext is replaced with a corresponding character of ciphertext. The cryptograms in newspapers are simple substitution ciphers.
2. A homophonic substitution cipher is like a simple substitution cryptosystem, except that a single character of plaintext can map to one of several characters of ciphertext.
3. A polygram substitution cipher is one in which blocks of characters are encrypted in groups.
4. A polyalphabetic substitution cipher is made up of multiple simple substitution ciphers. For example, there might be five different simple substitution ciphers used; the particular one used changes with the position of each character of plaintext.
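A simple substitution (monoalphabetic) cipher from item 1 can be sketched in a few lines of Python; the ciphertext alphabet used as the key here is an arbitrary illustrative choice:

```python
import string

# Hypothetical key: the ciphertext alphabet in an arbitrary fixed order.
PLAIN = string.ascii_lowercase
CIPHER = "qwertyuiopasdfghjklzxcvbnm"

ENC = str.maketrans(PLAIN, CIPHER)   # each plaintext letter maps to one ciphertext letter
DEC = str.maketrans(CIPHER, PLAIN)   # the inverse mapping for decryption

ct = "hello world".translate(ENC)
assert ct == "itssg vgksr"
assert ct.translate(DEC) == "hello world"
```

Because each plaintext letter always maps to the same ciphertext letter, the single-letter frequency distribution of the language survives intact, which is exactly what cryptanalysis exploits.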
Caesar Cipher
This is the most famous substitution algorithm, in which each plaintext character is replaced by the character three positions to the right, modulo 26 (i.e., A is replaced by D, B by E, and so on). For example:
Plain:  m e e t  m e  t o m o r r o w
Cipher: P H H W  P H  W R P R U U R Z
Note that the alphabet is wrapped around, so that the letter following Z is A. The complete transformation can be listed as follows:
Plain:  a b c d e f g h i j k l m n o p q r s t u v w x y z
Cipher: D E F G H I J K L M N O P Q R S T U V W X Y Z A B C
If a numerical value is assigned to each letter (a = 0, b = 1, ..., z = 25), then the algorithm is expressed as follows. For each plaintext letter p, substitute the ciphertext letter C:
C = E(p) = (p + 3) mod 26
A shift may be of any amount, so the general Caesar algorithm is
C = E(p) = (p + k) mod 26
where k takes a value in the range 1 to 25. The decryption algorithm is
p = D(C) = (C - k) mod 26
If the given ciphertext is known to be a Caesar cipher, then a brute-force cryptanalysis is easily performed by simply trying all 25 possible keys. The table below shows the results of applying this strategy to the example ciphertext; the plaintext leaps out at key 3.
Plaintext:  meet me after the toga party
Ciphertext: PHHW PH DIWHU WKH WRJD SDUWB
KEY   PHHW   PH   DIWHU   WKH   WRJD   SDUWB
 1    oggv   og   chvgt   vjg   vqic   rctva
 2    nffu   nf   bgufs   uif   uphb   qbsuz
 3    meet   me   after   the   toga   party
 4    ldds   ld   zesdq   sgd   snfz   ozqsx
 5    kccr   kc   ydrcp   rfc   rmey   nyprw
 ...  (and so on through key 25)
Only the row for key 3 yields intelligible English.
The brute-force cryptanalysis is possible because of the following three characteristics:
• The encryption and decryption algorithms are known.
• There are only 25 keys to try.
• The language of the plaintext is known and easily recognizable.
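These three characteristics make the attack mechanical. The following sketch encrypts the chapter's example with k = 3 and then brute-forces the result by trying all 25 keys (function names are illustrative):

```python
def caesar_encrypt(text: str, k: int) -> str:
    """C = (p + k) mod 26, letters only (a = 0 ... z = 25); other characters pass through."""
    return "".join(
        chr((ord(c) - ord('a') + k) % 26 + ord('a')) if c.isalpha() else c
        for c in text.lower()
    )

def caesar_decrypt(text: str, k: int) -> str:
    """p = (C - k) mod 26, implemented as a forward shift by 26 - k."""
    return caesar_encrypt(text, 26 - k)

ct = caesar_encrypt("meet me after the toga party", 3)
assert ct == "phhw ph diwhu wkh wrjd sduwb"

# Brute force: only 25 keys, so try them all and look for readable English.
candidates = {k: caesar_decrypt(ct, k) for k in range(1, 26)}
assert candidates[3] == "meet me after the toga party"
```

In practice the attacker scans the 25 candidate lines by eye (or with a dictionary check); only the correct key produces recognizable language.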
Two principal methods are used in substitution ciphers to lessen the extent to which the structure of the plaintext survives in the ciphertext: one approach is to encrypt multiple letters of plaintext at a time, and the other is to use multiple cipher alphabets. The best-known multiple-letter encryption cipher is the Playfair, which treats digrams in the plaintext as single units and is based on a 5 x 5 matrix of letters constructed using a keyword:
M O N A  R
C H Y B  D
E F G I/J K
L P Q S  T
U V W X  Z
In this case, the keyword is monarchy. The matrix is constructed by filling in the letters of the keyword (minus duplicates) from left to right and from top to bottom, and then filling in the remainder of the matrix with the remaining letters in alphabetic order. The letters I and J count as one letter. Plaintext is encrypted two letters at a time, according to the following rules: 1. Repeating plaintext letters that would fall in the same pair are separated with a filler letter, such as x, so that balloon would be enciphered as ba lx lo on. 2. Plaintext letters that fall in the same row of the matrix are each replaced by the letter to the right, with the first element of the row circularly following the last. For example, ar is encrypted as RM. 3. Plaintext letters that fall in the same column are each replaced by the letter beneath, with the top element of the column circularly following the last. For example, mu is encrypted as CM. 4. Otherwise, each plaintext letter in a pair is replaced by the letter that lies in its own row and in the column occupied by the other plaintext letter. Thus, hs becomes BP and ea becomes IM (or JM, as the encipherer wishes). The Playfair cipher is a great advance over simple monoalphabetic ciphers. For one thing, whereas there are only 26 letters, there are 26 x 26 = 676 digrams, so that identification of individual digrams is more difficult.
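The four rules above can be sketched directly in Python with the keyword monarchy. This is a minimal illustration (the filler letter and helper names are illustrative, and edge cases such as a double x are not handled):

```python
def build_square(keyword: str) -> list:
    """Build the 5x5 Playfair square as a row-major list of 25 letters; I and J share a cell."""
    seen, square = set(), []
    for ch in keyword + "abcdefghiklmnopqrstuvwxyz":
        ch = "i" if ch == "j" else ch
        if ch not in seen:
            seen.add(ch)
            square.append(ch)
    return square

def encrypt_pair(square, a, b):
    ra, ca = divmod(square.index(a), 5)
    rb, cb = divmod(square.index(b), 5)
    if ra == rb:    # rule 2: same row, take the letter to the right (wrapping)
        return square[ra*5 + (ca+1) % 5] + square[rb*5 + (cb+1) % 5]
    if ca == cb:    # rule 3: same column, take the letter beneath (wrapping)
        return square[((ra+1) % 5)*5 + ca] + square[((rb+1) % 5)*5 + cb]
    # rule 4: rectangle, keep own row, take the other letter's column
    return square[ra*5 + cb] + square[rb*5 + ca]

def playfair_encrypt(keyword: str, plaintext: str) -> str:
    square = build_square(keyword)
    text = plaintext.lower().replace("j", "i").replace(" ", "")
    pairs, i = [], 0
    while i < len(text):
        a = text[i]
        b = text[i+1] if i + 1 < len(text) else "x"
        if a == b:                  # rule 1: split doubles with the filler 'x'
            pairs.append((a, "x")); i += 1
        else:
            pairs.append((a, b)); i += 2
    return "".join(encrypt_pair(square, a, b) for a, b in pairs).upper()

assert playfair_encrypt("monarchy", "ar") == "RM"   # same-row example from the text
assert playfair_encrypt("monarchy", "mu") == "CM"   # same-column example from the text
```

With this square, "balloon" is split into ba lx lo on exactly as in rule 1 and encrypts to IBSUPMNA.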
The Hill cipher is another multiple-letter cipher. It takes m successive plaintext letters and substitutes for them m ciphertext letters. For m = 3, the system can be written in matrix form as C = KP mod 26, with

K = | 17 17  5 |
    | 21 18 21 |
    |  2  2 19 |
where C and P are column vectors of length 3, representing the ciphertext and plaintext respectively, and K is a 3 x 3 matrix, representing the encryption key. Operations are performed mod 26. For example, consider the plaintext "paymoremoney" and use the encryption key
K = | 17 17  5 |
    | 21 18 21 |
    |  2  2 19 |
The first three letters of the plaintext, "pay", are represented by the vector (15 0 24). Then K(15 0 24) = (375 819 486) mod 26 = (11 13 18) = LNS. Continuing in this fashion, the ciphertext for the entire plaintext is LNSHDLEWMTRW. Decryption requires the inverse of the matrix K. The inverse K-1 of a matrix K is defined by the equation KK-1 = K-1K = I, where I is the identity matrix: all zeros except for ones along the main diagonal from upper left to lower right. The inverse of a matrix does not always exist, but when it does, it satisfies the preceding equation. In this case, the inverse is
K-1 = |  4  9 15 |
      | 15 17  6 |
      | 24  0 17 |

This is easily verified:

KK-1 = | 17 17  5 | |  4  9 15 |   | 443 442 442 |          | 1 0 0 |
       | 21 18 21 | | 15 17  6 | = | 858 495 780 | mod 26 = | 0 1 0 |
       |  2  2 19 | | 24  0 17 |   | 494  52 365 |          | 0 0 1 |
It is easily seen that if the matrix K-1 is applied to the ciphertext, the plaintext is recovered. To explain how the inverse of a matrix is determined, we make an exceedingly brief excursion into linear algebra; the interested reader should consult a text on that subject for greater detail. For any square (m x m) matrix, the determinant is the sum of all the products that can be formed by taking exactly one element from each row and exactly one element from each column, with certain of the product terms preceded by a minus sign. For a 2 x 2 matrix the determinant is K11K22 - K12K21. For a 3 x 3 matrix the determinant is K11K22K33 + K21K32K13 + K31K12K23 - K31K22K13 - K21K12K33 - K11K32K23. If a square matrix A has a nonzero determinant, then the inverse of the matrix is computed as [A-1]ji = (-1)^(i+j) (Dij) / det(A), where Dij is the subdeterminant formed by deleting the ith row and jth column of A, and det(A) is the determinant of A. For our purposes, all arithmetic is done mod 26. In general terms, the Hill system can be expressed as follows:

C = Ek(P) = KP mod 26
P = Dk(C) = K-1 C mod 26 = K-1 KP = P
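The Hill equations above can be checked directly. This sketch encrypts "paymoremoney" with the 3 x 3 key K from the worked example and decrypts with the inverse K-1, doing the matrix arithmetic mod 26 by hand:

```python
K = [[17, 17,  5],
     [21, 18, 21],
     [ 2,  2, 19]]

K_INV = [[ 4,  9, 15],
         [15, 17,  6],
         [24,  0, 17]]

def apply_key(matrix, block):
    """Multiply a 3x3 key matrix by a 3-letter block (column vector), mod 26."""
    return [sum(matrix[r][c] * block[c] for c in range(3)) % 26 for r in range(3)]

def hill(matrix, text):
    """Apply the Hill transformation block by block (text length must be a multiple of 3)."""
    nums = [ord(c) - ord('a') for c in text.lower()]
    out = []
    for i in range(0, len(nums), 3):
        out.extend(apply_key(matrix, nums[i:i+3]))
    return "".join(chr(n + ord('a')) for n in out).upper()

ct = hill(K, "paymoremoney")
assert ct == "LNSHDLEWMTRW"              # matches the worked example
assert hill(K_INV, ct) == "PAYMOREMONEY" # K-1 recovers the plaintext
```

Because decryption is just another Hill transformation with K-1 in place of K, the same `hill` function serves both directions.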
As with Playfair, the strength of the Hill cipher is that it completely hides single-letter frequencies. Indeed, with Hill, the use of a larger matrix hides more frequency information: a 3 x 3 Hill cipher hides not only single-letter but also two-letter frequency information. Although the Hill cipher is strong against a ciphertext-only attack, it is easily broken with a known plaintext attack. For an m x m Hill cipher, suppose we have m plaintext-ciphertext pairs, each of length m. We label the pairs Pj = (p1j, p2j, ..., pmj) and Cj = (c1j, c2j, ..., cmj) such that Cj = KPj for 1 <= j <= m and for some unknown key matrix K. Now define two m x m matrices X = (pij) and Y = (cij) whose columns are the Pj and Cj. Then we can form the matrix equation Y = KX. If X has an inverse, we can determine K = YX-1. If X is not invertible, a new version of X can be formed with additional plaintext-ciphertext pairs until an invertible X is obtained.
A pure transposition cipher is easily recognized because it has the same letter frequencies as the original plaintext. For the type of columnar transposition just shown, cryptanalysis is fairly straightforward: lay out the ciphertext in a matrix and play around with the column positions. The transposition cipher can be made significantly more secure by performing more than one stage of transposition.
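A minimal columnar transposition sketch: the message is written row by row under a keyword and the columns are read off in alphabetic order of the key letters. The keyword and message here are illustrative:

```python
def transpose_encrypt(key: str, plaintext: str) -> str:
    """Write the message row by row under the key, then read the columns
    in the order given by the alphabetic rank of the key letters."""
    cols = len(key)
    padded = plaintext + "x" * (-len(plaintext) % cols)   # pad the last row
    order = sorted(range(cols), key=lambda i: key[i])     # column read-out order
    return "".join(padded[c::cols] for c in order)        # each slice is one column

ct = transpose_encrypt("zebra", "wearediscovered")
assert ct == "eodasreiercewdv"
```

Note that every plaintext letter appears in the ciphertext, just rearranged, so the letter-frequency fingerprint of the language is preserved exactly as described above.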
5.11 SUMMARY
This chapter covered an introduction to cryptography, security services, security mechanisms, and different types of attacks. The role of the gatekeeper in the network was also discussed. We examined the conventional encryption model, cryptanalysis, steganography, and several types of classical ciphers in detail.
5.12 QUESTIONS
1. What is cryptography?
2. Explain cryptographic algorithms.
3. Explain the different types of attacks.
4. Explain briefly the security mechanisms.
5. Explain the conventional encryption model.
6. What is steganography?
5.13 REFERENCES
William Stallings, Cryptography and Network Security, 3rd Edition, Pearson Education India, 2005.