
Chapter 1

Web Essentials: Clients, Servers, and Communication


Essential Elements of WWW

Web Browsers: used to surf the web

Web Servers: used to supply information to the browser

Computer Networks: support browser-server communication

Hypertext Transfer Protocol (HTTP): used for the bulk of web communication

The Internet

Technical origin: ARPANET (late 1960s)

Started by the US Department of Defense (DoD)

Run by ARPA (Advanced Research Projects Agency) as research on computer networking

One of the earliest attempts to network heterogeneous, geographically dispersed computers

1969: 4 computers at 4 sites running 4 different operating systems

Email first available on ARPANET in 1972 (and quickly very popular!)

ARPANET access was limited to select DoD-funded organizations

Open-access networks

Regional U.S. University networks (e.g., SURAnet)

1982: Southeastern Universities Research Association Network (SURAnet), University of Maryland

CSNET for CS departments not on ARPANET

Computer Science Network, funded by the US National Science Foundation

1983 MILNET- 113 nodes connected Universities and Organizations involved in DoD
sponsored research

ARPANET developed FTP (file transfer) and SMTP (email operations)

TCP/IP: host-to-host communication within a LAN

1982: computers on outside networks connected through modems (asynchronous communication) to send email between ARPANET and CSNET

Many institutions connected through PhoneNet (modem) to perform email operations

- Dial-up (less expensive, with delays)
- Dedicated / leased connection (more expensive, no delay)

NSFNET (1985-1995)

Primary purpose: connect supercomputer centers

Secondary purpose: provide backbone to connect regional networks

The Regional Networks Connected with NSFNET backbone are:


SURAnet
NYSERNet(Ithaca)
JvNCnet (Princeton)
SDSCnet (San Diego)

NSFNET backbone initially operated at 56 kbit/s, about the maximum speed of a home dial-up line

1988: speed upgraded to 1.5 Mbit/s (T1); 1991: 45 Mbit/s (T3)

1988: networks in Canada and France connected to NSFNET; NSFNET overtakes ARPANET

1990: NSFNET is the center of the Internet, a collection of computers connected via the public backbone and communicating across networks using the TCP/IP communication protocol.

The Internet

1990: commercial Internet dial-up first offered (commercial traffic increases network usage, leading to reduced unit costs through economies of scale and a less expensive network for research and education)

1991: the Internet is used as a conduit of information by scientists at research institutions, and for entertainment and commerce

Backbone initially supplied by NSFNET; privately funded (through ISP fees) beginning in 1995

ISPs (Internet Service Providers): private telecommunication firms that connect directly to the Internet backbone and are paid by end users

Definition: the Internet is a collection of computers that can communicate with one another using TCP/IP over an open, global communication network

Basic Internet Protocols

Communication protocol: how computers talk; a detailed specification of how communication between two computers will be carried out in order to serve some purpose

An Internet protocol specifies:

High-level behavior of software implementing the protocol

Low-level details (the specific fields of information contained in a communication message, the order in which these fields appear, the number of bits in each field, and how these bits should be represented)

Internet protocols developed as part of ARPANET research

ARPANET began using TCP/IP in 1982

Designed for use both within local area networks (LANs) and between networks

Important protocols:

TCP/IP

UDP, DNS and domain names

Higher-level protocols

Transmission Control Protocol/Internet Protocol (TCP/IP)

Two different protocols combined to support services such as web browsing, file downloads, and access to remote databases

IP

IP is the fundamental protocol defining the Internet (as the name implies!)

IP address: (Key element of IP)

32-bit number (in IPv4)

Written as a sequence of four decimal numbers separated by periods, e.g. 192.0.34.166

Each decimal number represents one byte of the IP address

Each device on the Internet has one or more IP addresses
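A minimal sketch of the dotted-decimal notation above, showing how the four decimal bytes pack into a single 32-bit number. The function name dotted_to_int is hypothetical, chosen only for this illustration; the standard-library ipaddress module is used as a cross-check.

```python
import ipaddress


def dotted_to_int(addr):
    """Pack the four decimal bytes of an IPv4 address into one 32-bit integer."""
    parts = [int(p) for p in addr.split(".")]
    if len(parts) != 4 or any(not 0 <= p <= 255 for p in parts):
        raise ValueError(f"not a dotted-decimal IPv4 address: {addr!r}")
    value = 0
    for byte in parts:
        # Shift the bytes collected so far left by 8 bits and append the next one.
        value = (value << 8) | byte
    return value


example = "192.0.34.166"
as_int = dotted_to_int(example)
print(example, "->", as_int, hex(as_int))

# Cross-check against the standard library's own conversion.
assert as_int == int(ipaddress.IPv4Address(example))
```

Each decimal number ends up in its own byte of the result, most significant byte first.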

IP Function

IP software function: transfer data from source device to destination device

IP source software creates a packet representing the data

Header: source and destination IP addresses, length of data, etc.

Data itself

If the destination is on the same network, send the packet directly to the destination

If the destination is on another network, IP software sends the packet to a gateway (a device that connects to more than one network)

E.g.: source (University of Delaware) -> SURAnet -> gateway -> backbone -> destination (San Diego)

Route: the sequence of computers that a packet travels through from source to destination

BGP-4 (Border Gateway Protocol): passes network connectivity information between gateways so that each can choose a good next hop

Limitations of IP:

No guarantee of packet delivery (IP software computes a checksum, and corrupted packets can simply be dropped)

Communication is one-way (source to destination)

Transmission Control Protocol (TCP)

TCP is a higher-level protocol that extends IP to provide additional functionality

Adds reliable communication, based on the concept of a connection, on top of IP

Provides a guarantee that packets are delivered

Provides two-way (full-duplex) communication

TCP also adds the concept of a port

Used to communicate with many different applications on a machine

TCP header contains a port number representing an application program on the destination computer

Some port numbers have standard meanings

Example: port 25 is normally used for email transmitted using the Simple Mail Transfer Protocol (SMTP)

Port numbers are assigned by IANA (Internet Assigned Numbers Authority):

0-1023: applications run by the system at boot-up with administrative permissions
1024-65535: user applications, available first-come-first-served to any application
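The two port ranges above can be sketched as a small lookup. This is only an illustration: describe_port and the WELL_KNOWN table are hypothetical names, and the table lists just the standard ports mentioned in these notes (25, 53, 80, 443), not the full IANA registry.

```python
# Illustrative subset of standard port assignments (not the full IANA list).
WELL_KNOWN = {25: "SMTP", 53: "DNS", 80: "HTTP", 443: "HTTPS"}


def describe_port(port):
    """Classify a port number into the two ranges described in the notes."""
    if not 0 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    kind = "system port" if port <= 1023 else "user port"
    service = WELL_KNOWN.get(port)
    return f"{port}: {kind}" + (f" ({service})" if service else "")


for p in (25, 80, 8080):
    print(describe_port(p))
```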

TCP/IP splits long messages into shorter ones for transport over the Internet and transparently reassembles them on the receiving side (fragmentation and reassembly)

User Datagram Protocol (UDP)

Alternative to TCP. Like TCP in that it:

Builds on IP

Provides port concept

Unlike TCP in that it:

Does not provide a two-way connection concept

Does not provide guaranteed delivery

Advantage of UDP vs. TCP:

Speed for simple tasks

UDP is lightweight (less functionality, less overhead), so faster for short, one-time messages

TCP is heavyweight by comparison

Domain Name System (DNS)

An Internet application that runs over UDP

DNS is the phone book for the Internet

Provides a mechanism to map back and forth between host names and IP addresses

DNS often uses UDP for communication

Many DNS servers are available (listening on port 53)

A computer's DNS client software (which converts a host name to an IP address) uses UDP to send a message to a DNS server

Host names

Sequence of Labels separated by dots, e.g., www.example.org

Final label in the host name is top-level domain

Generic: .com, .org, .edu, .biz , etc.

Country-code: .us, .il, etc.

Top level domains are divided into sub domains / second-level domains, which can be
further divided into sub domains, etc.

E.g., in www.example.com, example is a second-level domain

A host name plus domain name information is called the fully qualified domain
name (FQDN) of the computer

Eg: www.example.com is the FQDN

Above, www is the local host name, example is a second-level domain, and .com is the top-level domain
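A small sketch of the label structure described above, assuming a host name is simply split on dots. The function name split_host_name and the keys of the returned dictionary are hypothetical, introduced only for illustration.

```python
def split_host_name(fqdn):
    """Split a fully qualified domain name into its dot-separated labels."""
    labels = fqdn.rstrip(".").lower().split(".")
    if len(labels) < 2 or any(not label for label in labels):
        raise ValueError(f"expected at least two non-empty labels: {fqdn!r}")
    return {
        "labels": labels,
        "top_level": labels[-1],        # final label, e.g. com
        "second_level": labels[-2],     # label just before the top-level domain
        # Leftmost label is the local host name when there are more than two labels.
        "local_name": labels[0] if len(labels) > 2 else None,
    }


print(split_host_name("www.example.com"))
```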

User-level tool to query Internet DNS: nslookup provides command-line access to DNS (on most systems)

Lookup: find the IP address for a given FQDN

nslookup www.example.org

Reverse lookup: find a host name for a given IP address

nslookup 192.0.34.166

A single IP address can be associated with multiple domain names

(E.g.: 192.0.34.166 - www.example.org, www.example.com)

Even though multiple fully qualified names are associated with an IP address, only one of the names is returned by a reverse lookup. This is called the canonical name (www.example.com) of the host; the others are referred to as aliases

Analogy to Telephone Network

Internet ~ the physical telephone network (provides basic communication


infrastructure)

TCP ~ calling someone who answers, having a conversation, and hanging up

UDP ~ calling someone and leaving a message

DNS ~ directory assistance

Higher-level Protocols

These protocols are used to communicate once a TCP connection has been established

Many protocols build on TCP

Telephone analogy: TCP specifies how we initiate and terminate the phone
call, but some other protocol specifies how we carry on the actual
conversation

Some examples:

SMTP (transfer email between different mail servers)

FTP (file transfer between machines)

Telnet (type commands in one computer and execute on remote computer)

HTTP (communication between web servers and web browsers; transfer of Web documents)

IP (key component in the definition of Internet)

HTTP (key component in the definition of WWW)

World Wide Web


History

1979: Usenet newsgroups for public sharing of information

FTP: large file sharing between source and destination

First Internet chat software: IRC (Internet Relay Chat), with private and public chat facilities

As public information grew, the need to locate that information also grew

Technologies for information management and search on the Internet developed

Information management technologies: the Web was one of several systems for organizing Internet-based information

1990: Gopher information servers (provide a simple hierarchical view of documents)

WAIS (Wide Area Information Servers): indexing and retrieving information

Archie: searching online information available via FTP

Two types of software: server (provides information to other Internet systems) and client (accesses information provided by a server)

Client and server communicate over the Internet using HTTP, a communication protocol built on top of TCP/IP

Distinctive features of the Web:

Support for hypertext (text containing links)

Communication via Hypertext Transfer Protocol (HTTP)

Document representation using Hypertext Markup Language (HTML)

Gopher has links but its documents are plain text; Archie and WAIS provide no support for links; HTML provides hyperlinks, page layout facilities, and inline graphics

Definition :
The World Wide Web is the collection of machines (Web servers) on the
Internet (collection of machines globally connected via IP) that provide
information, particularly HTML documents, via HTTP.

Machines that access information on the Web are known as Web clients. A Web
browser is software used by an end user to access the Web.

Hypertext Transfer Protocol (HTTP)

HTTP is a communication protocol: a detailed specification of how web servers and clients should communicate

HTTP is based on the request-response communication model:

Client sends a request

Server sends a response

HTTP is a stateless protocol:

The protocol does not require the server to remember anything about the client
between requests.

Normally implemented over a TCP connection (80 is IANA standard port number for
HTTP)

Typical browser-server interaction:

User enters Web address in browser

Browser uses DNS to locate IP address

Browser opens TCP connection to server

Browser sends HTTP request over connection

Server sends HTTP response to browser over connection

Browser displays body of response in the client area of the browser window

The information transmitted using HTTP is often entirely text

Can use the Internet's Telnet protocol to simulate a browser request and view the server response

HTTP Request Message

Structure of the request message:

start line

header field(s) (one or more)

blank line

Message body (optional)

Start line

Example: GET / HTTP/1.1

Consists of three parts, with a single space separating adjacent parts:

HTTP request method

Request-URI (portion of the web address)

HTTP version
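The three-part start line can be sketched as a tiny parser. This is an illustration only, assuming the parts are separated by exactly one space as stated above; the names StartLine and parse_start_line are hypothetical.

```python
from typing import NamedTuple


class StartLine(NamedTuple):
    method: str
    request_uri: str
    version: str


def parse_start_line(line):
    """Split an HTTP request start line into method, Request-URI and version."""
    parts = line.rstrip("\r\n").split(" ")
    if len(parts) != 3:
        raise ValueError(f"expected three space-separated parts: {line!r}")
    method, request_uri, version = parts
    if not version.startswith("HTTP/"):
        raise ValueError(f"unexpected HTTP version field: {version!r}")
    return StartLine(method, request_uri, version)


print(parse_start_line("GET / HTTP/1.1"))
```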

HTTP Version

Initial version of HTTP was HTTP/0.9

HTTP/1.0 was the first version documented in an Internet RFC (Request for Comments); HTTP/1.1 is the current Internet Draft Standard [RFC-2616]

HTTP/1.1 is supported by all operational browsers and servers

A newer version in the future would change the version value in the start line

Request URI

Second portion of start line

Concatenating http://, the value of the Host header field (e.g. www.example.org), and the Request-URI forms a string called a Uniform Resource Identifier (URI)

It identifies the location of a resource on the web

URI

URIs are of two types:

1. Uniform Resource Name (URN)

Can be used to identify resources with unique names, such as books (which have unique ISBNs)

It has three colon-separated parts, e.g. urn:ISBN:0-1404-4417-3

First part is the scheme name (urn)
Second part is the namespace identifier (ISBN)
Third part is the namespace-specific string (0-1404-4417-3)

2. Uniform Resource Locator (URL)

Specifies the location at which a resource can be found

In addition to http, some other URL schemes are https, ftp, mailto, telnet and file

Request-URI

A Uniform Resource Identifier (URI) consists of two parts:

- scheme (case-insensitive, conventionally written in lower case)
- scheme-dependent part, whose form depends on the scheme (for http, the rest of the web address)

Syntax: scheme : scheme-dependent-part

Ex: in http://www.example.com/ the scheme is http

The Request-URI is the portion of the requested URI that follows the host name (which is supplied by the required Host header field)

Ex: / is the Request-URI portion of http://www.example.com/
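The scheme : scheme-dependent-part syntax can be sketched by splitting at the first colon. The helper names split_uri and split_urn are hypothetical; the scheme is lower-cased here on the assumption that schemes are compared without regard to case.

```python
def split_uri(uri):
    """Split a URI into (scheme, scheme-dependent part) at the first colon."""
    scheme, colon, rest = uri.partition(":")
    if not colon or not scheme:
        raise ValueError(f"no scheme found in {uri!r}")
    return scheme.lower(), rest


def split_urn(urn):
    """Split a URN into (namespace identifier, namespace-specific string)."""
    scheme, rest = split_uri(urn)
    if scheme != "urn":
        raise ValueError(f"not a URN: {urn!r}")
    namespace_id, colon, specific = rest.partition(":")
    if not colon:
        raise ValueError(f"URN needs three colon-separated parts: {urn!r}")
    return namespace_id, specific


print(split_uri("http://www.example.com/"))
print(split_urn("urn:ISBN:0-1404-4417-3"))
```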

Request methods

Common request methods:

GET

Used when a link is clicked, an address is typed in the browser's location bar, or the browser downloads images for display within an HTML document

No body in a request with the GET method

POST

Used when a submit button is clicked on a form

Form information is contained in the body of the request

HEAD

Requests that only header fields (no body) be returned in the response

OPTIONS

Returns a list of the HTTP methods that may be used to access the resource

PUT

Stores the body of the message on the server for future use

DELETE

Requests that the server delete the resource associated with this Request-URI

TRACE

Returns a copy of the complete HTTP request message; used primarily for test purposes

HTTP/1.1 also defines a CONNECT method, which can be used to create certain types of secure connections


Header fields and MIME types

A header field name starts at the first character of a line

Header field syntax:

field-name : field-value

Field names (e.g. Host) are not case sensitive

A field value may continue on multiple lines by starting each continuation line with one or more spaces or tabs

White space is allowed to precede or follow the field value

Field values may contain MIME types, quality values, and wildcard characters (*s)

Multipurpose Internet Mail Extensions (MIME) [RFC-2045]

A standard for passing a variety of types of information, including graphics and applications, through email and other Internet message protocols

The content type of a message is specified by two case-insensitive strings

MIME content type syntax:

top-level type / subtype

Examples: text/html, image/jpeg

[IANA-MIME] lists the currently registered MIME types

Private / unregistered MIME types are indicated by a subtype prefix of x- or X-

HTTP Quality Values and Wildcards

Quality values indicate preferences

Syntax: a string of the form ;q=num appended to an item, where num is between 0 and 1

Example header field with quality values:

Accept: text/xml,text/html;q=0.9,
 text/plain;q=0.8,image/jpeg,
 image/gif;q=0.2,*/*;q=0.1

A quality value applies to the item it follows; an item with no quality value has the default quality 1

The higher the value, the higher the preference

Wildcard: */* matches all possible MIME types

Note the use of the wildcard to specify quality 0.1 for any MIME type not specified earlier
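A sketch of how a server might read such an Accept field. It assumes RFC 2616 semantics, in which a q parameter belongs to the single media range it follows and a media range with no q parameter defaults to quality 1. The function name parse_accept is hypothetical, and parameters other than q are ignored.

```python
def parse_accept(value):
    """Return (media range, quality) pairs, most preferred first."""
    prefs = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        media_range, *params = [p.strip() for p in item.split(";")]
        quality = 1.0  # default when no q parameter is given
        for param in params:
            name, _, val = param.partition("=")
            if name.strip().lower() == "q":
                quality = float(val)
        prefs.append((media_range.lower(), quality))
    # sorted() is stable, so items of equal quality keep their header order.
    return sorted(prefs, key=lambda pair: pair[1], reverse=True)


header = ("text/xml,text/html;q=0.9,text/plain;q=0.8,"
          "image/jpeg,image/gif;q=0.2,*/*;q=0.1")
for media_range, quality in parse_accept(header):
    print(f"{quality:.1f}  {media_range}")
```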

Top level MIME types

application

audio

image

message

model

multipart

text

video

Common MIME content types

text/html: HTML document

image/gif: image in Graphics Interchange Format

image/jpeg: image in Joint Photographic Experts Group format

text/plain: human-readable text with no embedded formatting information

application/octet-stream: arbitrary binary data (may be executable)

application/x-www-form-urlencoded: data sent from a web form to a web server for processing

HTTP Request

Common header fields:

Host: host name from URL (required)

User-Agent: type of browser sending request

Accept: MIME types of acceptable documents

Connection: the value close tells the server to close the connection after a single request/response

Content-Type: MIME type of the (POST) body, normally application/x-www-form-urlencoded

Content-Length: bytes in body

Referer: URL of document containing link that supplied URI for this HTTP
request
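To tie the start line, header fields, blank line and optional body together, here is a minimal sketch that assembles a request message as bytes. The function build_request is hypothetical; it assumes HTTP/1.1, CRLF line endings, and that Content-Length is added only when a body is present.

```python
def build_request(method, request_uri, host, headers=None, body=b""):
    """Assemble start line, header fields, blank line and optional body."""
    lines = [f"{method} {request_uri} HTTP/1.1", f"Host: {host}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    if body:
        lines.append(f"Content-Length: {len(body)}")
    # An empty line (CRLF CRLF) separates the header fields from the body.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


get_message = build_request("GET", "/", "www.example.org")
post_message = build_request(
    "POST", "/search", "www.example.org",
    headers={"Content-Type": "application/x-www-form-urlencoded"},
    body=b"t=win&s=chess",
)
print(get_message.decode("ascii"))
print(post_message.decode("ascii"))
```

The path /search in the POST example is made up for the illustration.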

HTTP Response

Structure of the response:

status line

header field(s)

blank line

Message body (optional)

Status line

Example: HTTP/1.1 200 OK

Three fields separated by white space:

HTTP version: the version used by the server software when formatting the response

Numeric status code: indicates the type of response

Three-digit decimal number
First digit: the general class of the status code
Last two digits: the specific status within that class

Reason phrase (intended for human use)

Status code

Three-digit number

First digit is the class of the status code:

1 = Informational (provides information to the client)

2 = Success (request has been successfully processed)

3 = Redirection (an alternate URL is supplied)

4 = Client Error (the client's request is not valid)

5 = Server Error (an error occurred during server processing of a valid client request)

Other two digits provide additional information

Common HTTP/1.1 status codes

200 OK: request processed normally

301 Moved Permanently: URI for the requested resource has changed

307 Temporary Redirect: URI for the requested resource has temporarily changed

401 Unauthorized: resource is password protected and the user has not provided a valid password

403 Forbidden: resource is present on the server but is read protected

404 Not Found: resource is not present

500 Internal Server Error: server software detected an internal failure
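The status line and the status-code classes can be sketched as follows. The names parse_status_line, status_class and STATUS_CLASSES are hypothetical; the class is taken from the first digit of the three-digit code, as described above.

```python
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def parse_status_line(line):
    """Split a status line into (HTTP version, numeric code, reason phrase)."""
    # The reason phrase may itself contain spaces, so split at most twice.
    version, code, reason = line.rstrip("\r\n").split(" ", 2)
    if len(code) != 3 or not code.isdigit():
        raise ValueError(f"status code must be three digits: {code!r}")
    return version, int(code), reason


def status_class(code):
    """Map a status code to its general class using the first digit."""
    return STATUS_CLASSES[code // 100]


for line in ("HTTP/1.1 200 OK", "HTTP/1.1 404 Not Found"):
    version, code, reason = parse_status_line(line)
    print(code, reason, "->", status_class(code))
```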


Common header fields:

Connection, Content-Type, Content-Length

Date: date and time at which response was generated (required)

Location: alternate URI if status is redirection

Last-Modified: date and time the requested resource was last modified on the
server

Expires: date and time after which the client's copy of the resource will be out-of-date

ETag: a unique identifier for this version of the requested resource (changes if
resource changes)

Cache Control: Client Caching

A cache is a repository for copies of information that originates elsewhere

A copy of the information is placed in a cache in order to improve system performance (for example, a small, high-speed hardware cache holds some of the data from RAM, which is slower than cache memory)

A cache is a local copy of information obtained from some other source

Most web browsers use a cache to store requested resources so that subsequent requests for the same resource will not necessarily require an HTTP request/response

Ex: an icon appearing multiple times in a Web page


Cache advantages

(Much) faster than HTTP request/response

Less network traffic

Less load on server

Cache disadvantage

Cached copy of resource may be invalid (inconsistent with remote version)

Client Caching

Validating a cached copy of a resource:

1. Send an HTTP HEAD request, which returns only the status line and header portion of the response. If the response contains a Last-Modified time that precedes the value of the Date header field returned with the cached resource, the cached copy is valid; otherwise the cached copy is invalid and the browser should send a normal GET request for the resource.

2. Check the ETag header: compare the ETag returned by a HEAD request with the ETag of the cached resource. If the ETag values match, the cached copy is valid.

3. If the server can determine in advance the earliest time at which a resource will change, it can return that time in an Expires header. As long as the Expires time has not been reached, the client may use the cached copy of the resource without validating it with the server. If no Expires header was sent with the response, the client can use a heuristic algorithm to estimate a value for Expires.

Ex: Expires = 0.01 * (Date - Last-Modified) + Date
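The heuristic can be worked through with concrete times. This sketch reads the formula as 0.01 times the interval between Last-Modified and Date, added to Date; the function names and the sample timestamps are assumptions made up for the illustration.

```python
from datetime import datetime, timedelta, timezone


def estimate_expires(date, last_modified, factor=0.01):
    """Heuristic: Expires = factor * (Date - Last-Modified) + Date."""
    if last_modified > date:
        raise ValueError("Last-Modified must not be later than Date")
    return date + factor * (date - last_modified)


def is_fresh(now, expires):
    """A cached copy may be used without validation until Expires is reached."""
    return now < expires


# Hypothetical header values: the resource was last modified 100 days
# before the response was generated.
date = datetime(2005, 7, 20, 8, 3, 22, tzinfo=timezone.utc)
last_modified = date - timedelta(days=100)

expires = estimate_expires(date, last_modified)
print("Estimated Expires:", expires.isoformat())
print("Fresh 12 hours later?", is_fresh(date + timedelta(hours=12), expires))
```

With a 100-day gap, 1% of the interval is one day, so the estimate falls one day after Date. A resource that has not changed for a long time is assumed to stay unchanged a little longer.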

Character Sets

Every document is represented by a string of integer values (code points)

The mapping from code points to characters is defined by a character set

In the simplest character sets, each character is represented by a single byte

Some header fields have character set values:

Accept-Charset: request header listing character sets that the client can
recognize

Ex: accept-charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7

Content-Type: can include character set used to represent the body of the
HTTP message

Ex: Content-Type: text/html; charset=UTF-8

US-ASCII: 7-bit integers; many characters, mathematical symbols, human-language characters, and graphical symbols are not contained in the US-ASCII character set

A single worldwide character set is needed: Unicode, an attempt to provide a single character set that encompasses every human language representation as well as all other commonly used symbols

The Unicode Standard's Basic Multilingual Plane (BMP), which covers most of the commonly used characters in every modern language, uses 16-bit character codes; the full standard extends code points to 21-bit integers

Technically, many character sets are actually character encodings

An encoding represents code points using variable-length byte strings

Most common examples are the Unicode-based encodings UTF-8 and UTF-16

IANA maintains a complete list of Internet-recognized character sets/encodings [IANA-CHARSET]

Character Sets

Typical US PC produces ASCII documents

US-ASCII character set can be used for such documents, but is not recommended

UTF-8 and ISO-8859-1 are supersets of US-ASCII and provide international compatibility

UTF-8 can represent all ASCII characters using a single byte each and
arbitrary Unicode characters using up to 4 bytes each

ISO-8859-1 is 1-byte code that has many characters common in Western


European languages, such as

If a fixed 21 bits (three bytes) were used for every character, the request and response messages between client and server would be long (three times longer than in ASCII)

Character sets

A character encoding is a bit string that must be decoded into a code-point integer, which is then mapped to a character according to the definition provided by some character set

Encodings represent characters using variable-length bit strings

Common characters are represented using shorter strings

Less common characters are represented using longer strings
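The variable-length idea can be observed directly with UTF-8. This is a small sketch; the four sample characters (a Latin letter, an accented letter, the euro sign, and a musical symbol outside the BMP) and the name utf8_length are chosen here for illustration.

```python
# Sample characters written as escapes: A, e-acute, euro sign, musical G clef.
SAMPLES = ["A", "\u00e9", "\u20ac", "\U0001d11e"]


def utf8_length(ch):
    """Number of bytes UTF-8 uses to encode a single character."""
    return len(ch.encode("utf-8"))


for ch in SAMPLES:
    encoded = ch.encode("utf-8")
    print(f"U+{ord(ch):04X}: {len(encoded)} byte(s), hex {encoded.hex()}")

# ISO-8859-1 is a 1-byte code: e-acute fits in a single byte there.
print(len("\u00e9".encode("iso-8859-1")))
```

ASCII characters keep their single-byte form, while less common characters take two, three or four bytes.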

Web Clients

A web client is software that accesses a web server by sending an HTTP request message and processing the resulting HTTP response

Many possible web clients:

Text-only browsers (lynx)

Browsers running on mobile phones

Browsers that speak a page

Web clients not designed for direct use by humans, e.g. software robots (software-only clients such as search engine crawlers)

Any web client that is designed to directly support user access to web servers is known as a user agent

Web Browsers - History

Text-based browsers ran on specialized platforms (Sun Microsystems)

First graphical browser running on general-purpose platforms: Mosaic (1993), from NCSA (National Center for Supercomputing Applications)

Netscape Communications Corporation (Netscape Navigator), Microsoft (Internet Explorer)

Browser war between Netscape Navigator and IE (IE victorious)

Mozilla (open source code), Firefox, Opera, Safari

All browsers support HTTP communication

Web Browsers

Browser Bars

Client area: primary region that displays a document

Title bar: title of the document currently displayed, as assigned by its author; also displays the browser name and standard window management controls

Menu bar: set of drop-down menus

Navigation toolbar: standard push-button controls (Back, Forward, Reload, Stop and Print)

Location bar: enter a URL to display the document located at that URL

Status bar: displays messages and icons related to the status of the browser (Resolving host, Connecting to, Waiting for, Transferring data from, Done)

Basic Browser Functions

Convert web addresses (URLs) to valid HTTP request messages

If the server is specified by host name, use DNS to convert it into an IP address

Establish a TCP connection using the IP address of the specified web server

Send the HTTP request over the TCP connection and wait for the server's response

Render documents returned by a server (position the text and graphics appropriately within the browser window and display them)

HTTP URLs

Browser uses authority to connect via TCP

Request-URI included in start line (/ used for path if none supplied)

Fragment identifier not sent to server (used to scroll browser client area)

http-scheme URL (http://)

Authority: address of the Internet web server. The portion of an http URL following the :// string and before the next slash (/). It consists of an FQDN and an optional port number

Path: the portion from the slash following the authority up to the question mark (?), or through the end of the URL if there is no ? (e.g. /a/b/c.txt)

Query string: following the path there may be a question mark followed by information (a string, e.g. to pass search terms to a web server), up to a number sign (#)

The browser forms the Request-URI portion of an HTTP request message from a URL by concatenating the path and query portions of the URL (/a/b/c.txt?t=win&s=chess)

Fragment: final optional part of a URL, following the number sign. The string contained in the fragment, excluding the number sign, is known as the fragment identifier; the browser uses it to scroll HTML documents

URL

When the user types a URL in the location bar, the HTTP request message begins:
GET /a/b/c.txt?t=win&s=chess HTTP/1.1
...
Host: www.example.org:56789

Host header field: authority portion of the URL

Fragment portion of the URL is not sent to the server but is used by the browser to scroll the HTML document in the HTTP response
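The way a browser derives the Host header and Request-URI from a URL can be sketched with the standard urllib.parse module. The function request_parts is hypothetical, and the full URL below is assembled from the pieces shown in these notes with a made-up fragment (#toc) added for illustration.

```python
from urllib.parse import urlsplit


def request_parts(url):
    """Derive the Host header value, Request-URI and fragment from an http URL."""
    parts = urlsplit(url)
    path = parts.path or "/"          # "/" is used if no path is supplied
    request_uri = path + ("?" + parts.query if parts.query else "")
    return {
        "host_header": parts.netloc,  # authority: host name plus optional port
        "request_uri": request_uri,   # path and query string, sent to the server
        "fragment": parts.fragment,   # kept by the browser, never sent
    }


url = "http://www.example.org:56789/a/b/c.txt?t=win&s=chess#toc"
info = request_parts(url)
print(f"GET {info['request_uri']} HTTP/1.1")
print(f"Host: {info['host_header']}")
print("Fragment kept by browser:", info["fragment"])
```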

User-controllable Features

Standard features

Save web page (including images) to disk: File|Save Page As

Find string in page, similar to a word processor: Edit|Find in This Page

Fill forms automatically with past data (passwords, credit card numbers, ...): in Mozilla, Edit|Save Form Info and Edit|Fill in Form save and retrieve form information; Tools|Form Manager is used to manage saved form information

Set preferences to customize browser functionality (Edit|Preferences): Accept-Language (Navigator|Languages -> Language For Web pages), character set (Navigator|Languages -> Character Coding), cache properties (Advanced -> Cache -> Set Cache Options box), and HTTP parameters (Advanced -> HTTP Networking -> Direct Connections Option Box)

User-controllable Features

Modify display style (e.g., increase font sizes) View -> Use style,View->Text
Zoom

Document meta information - Display raw HTML and HTTP header info (e.g.,
Last-Modified)

Choose browser themes (skins) View|Apply Theme|Get New Themes

View history of web addresses visited (Go->History)

Bookmark favorite pages for easy return

Web Browsers

Additional functionality:

Automatic URL Completion

Execution of scripts (e.g., form validation, drop-down menus)

Event handling (e.g., mouse clicks, mouse movement, and events not under user control, such as the browser finishing its rendering)

GUI for controls (e.g., buttons change)

Secure communication with servers (encrypting information)

Display of non-HTML documents (e.g., PDF) via plug-ins

Managing Cookies

Web Servers

Basic functionality:

Server calls TCP software and waits for connection requests on one or more ports

When a connection request is received, the server dedicates a subtask to handle this connection

Subtask establishes the TCP connection

Receive HTTP request via TCP

Map Host header to a specific virtual host (one of many host names sharing an IP address)

Map Request-URI to a specific resource associated with the virtual host

File: return the file in the HTTP response

Program: run the program and return its output in the HTTP response

Map the type of resource to an appropriate MIME type and use it to set the Content-Type header in the HTTP response

Log information about the request and response

If the TCP connection is kept alive, the server monitors it for further requests from the client until a specified length of time has elapsed

Web Servers-History

Early 1990s: NCSA developed the httpd web server

Mid-1990s: NCSA discontinued development of its web server

Several individuals running httpd created updates (called patches) to the open source httpd software; the result, "a patchy server", became the Apache server

First public release of Apache as free, open source software: April 1995; it became a widely used server

Microsoft IIS (Internet Information Server): offers the features of Apache; drawback: runs only on Windows

Apache runs on Windows, Linux, Macintosh

IIS vs. Apache

IIS runs programs written in VBScript (ASP)

Apache runs programs written in Perl and PHP (PHP: Hypertext Preprocessor)

Both run Java with the help of a servlet container

The servlet container provides a JVM (Java Virtual Machine) to run Java servlets and provides communication between the servlets and Apache/IIS

Apache Software Foundation Tomcat: free, open source servlet container that can also act as a standalone web server, communicating directly with web clients, serving documents stored in the server machine's file system, and running non-Java programs

Web Server Configuration & Tuning

Servers have a large number of configuration parameters

Server Configuration broken into two areas: External Communication and Internal
Processing

Tomcat consists of two Java packages:

Coyote: HTTP/1.1 communication

Catalina: the actual servlet container

Some Coyote communication parameters(affect external communication):

Allowed/blocked IP addresses

Max. simultaneous active TCP connections

Max. queued TCP connection requests

Keep-alive time for inactive TCP connections

These parameters determine the performance of a server; changing the values of these and similar parameters in order to optimize performance is often referred to as tuning the server

Modify parameters to tune server performance

Increasing the number of threads increases memory requirements, which can lead to slower responses to individual requests

Tuning is done by trial and error: change the parameter values, then use load generation and stress test tools to simulate requests to the web server and analyze the performance

Web Servers-Internal Processing

Some Catalina container parameters:

Which client machines may send HTTP requests to the server

Virtual host names and associated ports

Logging preferences

Mapping from Request-URIs to server resources

Password protection of resources

Use of server-side caching

Tomcat Web Server

HTML-based server administration

Browse to http://localhost:8080 and click on the Server Administration link

localhost is a special host name that means "this machine"

Tomcat administration tool

Server at default port 8080: http://localhost:8080

Tomcat is included in JWSDP (Java Web Services Developer Pack)

The service has five components:

Connector (Coyote): handles HTTP communication and port information

Host: virtual host (absolute and relative URL)

Logger: records information about server activity

Realm: access control (user creation and role assignment: Admin, Developer, End user)

Valve: allow and deny lists of clients

Connector Component

Handles HTTP communication and port information

After making a change, pressing the Save button makes it a temporary change

Clicking the Commit Changes button makes the change permanent

Fields are: Accept Count, Connection Timeout, IP Address, Port Number, Min Spare Threads, Max Threads, Max Spare Threads

Defining Virtual Host

Host component is used

Key fields: Name (FQDN), Application Base, Deploy on Startup, Auto Deploy

Web application: a collection of files and programs that work together to provide a particular function to web users

A web site may run two web applications:

1. One used by administrators, providing maintenance functionality

2. Another used by external clients, providing customer functionality

A Tomcat web application is represented by a Context component

Each host and context is associated with a directory in the server's file system

The directory associated with a host is the value of its Application Base field (relative or absolute path)

The directory associated with a context is specified by its Document Base field

Logging

Web server logs record information about server activity

The primary web server log recording normal activity is the access log, a file that records information about every HTTP request processed by the server

Logs may also contain debugging and other server information

Valve component is used

Fields are: Directory, Pattern, Prefix, Resolve Hosts, Rotatable, Suffix

Log entry syntax: %h %l %u %t "%r" %s %b

Example entry: www.example.org - admin [20/Jul/2005:08:03:22 -0500] "GET /admin/frameset.jsp HTTP/1.1" 200 920

Information in a log entry: host name, user name, date and time of the response plus the time zone, start line of the HTTP request, HTTP status code of the response, number of bytes sent in the body of the response
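A log analyzer starts by splitting each entry into its fields. The sketch below assumes the common access log layout, in which the request start line is enclosed in double quotes and a hyphen stands for a missing field; the sample line is the entry from these notes written in that layout. The names LOG_PATTERN and parse_log_line are hypothetical.

```python
import re

# One named group per field of the pattern: %h %l %u %t "%r" %s %b
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<logname>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
)


def parse_log_line(line):
    """Split one access log entry into a dictionary of fields."""
    match = LOG_PATTERN.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"unrecognized log entry: {line!r}")
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    # A hyphen in the byte-count field means no body bytes were sent.
    entry["bytes"] = 0 if entry["bytes"] == "-" else int(entry["bytes"])
    return entry


sample = ('www.example.org - admin [20/Jul/2005:08:03:22 -0500] '
          '"GET /admin/frameset.jsp HTTP/1.1" 200 920')
for field, value in parse_log_line(sample).items():
    print(f"{field:8} {value}")
```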

Advantage: log files are used by log analyzers, which produce reports on various aspects of site usage (number of accesses per day, percentage of requests that received an error status code, breakdown of accesses by domain)

Useful for server tuning, locating software problems, and modifying site content to satisfy the audience

Access Control

The server can provide automatic password protection for the resources that it serves

Two-stage process:

1. A database of user names is created. Each user name is associated with a password and a list of roles (Administrator, Developer, End User)

2. Resources are associated with required roles

Realm component is used for database-related services

Valve components of type Remote Host Valve and Remote Address Valve specify allow and deny lists of clients (e.g. allow list: *.example.org; deny list: baduser.example.org)
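The allow/deny idea can be illustrated with simple wildcard matching. This is a sketch of the concept only, not Tomcat's own implementation: the function is_allowed is hypothetical, patterns are treated as shell-style wildcards, and it is an assumption here that the deny list is checked before the allow list.

```python
from fnmatch import fnmatchcase


def is_allowed(host, allow, deny):
    """Accept a client host if it matches the allow list and not the deny list."""
    host = host.lower()
    # Assumption: a match on the deny list always wins.
    if any(fnmatchcase(host, pattern) for pattern in deny):
        return False
    return any(fnmatchcase(host, pattern) for pattern in allow)


allow_list = ["*.example.org"]
deny_list = ["baduser.example.org"]

for client in ("www.example.org", "baduser.example.org", "www.example.com"):
    print(client, "->", "allowed" if is_allowed(client, allow_list, deny_list) else "denied")
```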

Secure Server

HTTP request and response messages are simple text carried by TCP/IP; each message travels through a number of machines before reaching its destination

Eavesdropper: any machine other than the sender or receiver that extracts information from network messages

To prevent an eavesdropper from obtaining sensitive information, encrypt it before transmission over any public communication network

Use the https scheme

TLS: Transport Layer Security (TLS 1.0) [RFC-2246]

SSL: Secure Sockets Layer (SSL 3.0)

The client browser communicates securely with the server after a TLS handshake

During the handshake, client and server agree on various parameters used to encrypt messages

The server sends a certificate to the client; using the certificate helps avoid man-in-the-middle attacks and unauthorized access

Default port for HTTP communication is 80; for HTTP over TLS/SSL (https) it is 443

To access the root of a secure server on the local host at port 8443: https://localhost:8443/

Create a self-signed certificate:

keytool -genkey -alias tomcat -keyalg RSA

Case Study

BLOG: a simple tool for writing and reading a web log (blog).

A user (blogger) is able to add text entries to the blog. The most recent entry appears at the beginning of the web page, followed by the next most recent, and so on for all entries made during the current month.
