

IT2353 Web Technology
UNIT-I

1.0 Web Essentials: Clients, Servers, and Communication

1.1 The Internet:
The Internet is a system that lets computers all over the world communicate with each other. It is a network of networks that spans the world.

a) Internet is a Network: The computers on the Internet can talk to one another because they are networked: they are connected in some way so that they can exchange information with one another electronically. On the Internet, the connections take many different forms:
o They can be connected directly using wire or fibre-optic cables.
o Some are connected through local and long-distance telephone lines.
o Some even use wireless satellite communication.

b) Internet is a Loose Organisation: No single entity or organization controls it. It is like a neighbourhood where people communicate because they are able to, but have not established a formal organization with rights, rules, and leaders.

c) Internet is a Relaxing Place: Internet resources enable users to exchange information in an open, public way. These resources provide a forum where users can write and post messages for other users and read messages posted by others.

d) Internet is a Business Tool: Given the increasingly global business climate and the great extent to which big companies rely on computers, the Internet is a natural vehicle for national and international business communication. Nowadays, many organizations run their business over the Internet for purposes such as document exchange, product specifications, and payment.

e) Internet is a Mailbox: The most-used Internet facility is electronic mail, also known as e-mail. Everyone in the Internet world has a unique Internet name, the e-mail id. To send a message to an Internet user anywhere in the world, all the sender has to know is the address of the recipient. The sender types up a

UNIT-I

IT2353/Web Technology

IT/RMKEC

message in his or her favorite word processor or e-mail program, hops onto the Internet, types in the address, and sends the message on its way. The message should arrive at the recipient's computer almost instantly; after all, it travels through the wire at nearly the speed of light.

f) Internet is an Information Pool: Whatever information users need can be located and used on the Internet. Today the Internet is an inexpensive utility, available to all users, for finding out about things. It is an information pool that serves every kind of user, from children to the elderly, and from students to business professionals, researchers, and government organizations.

1.2 BASIC INTERNET PROTOCOLS:

A computer communication protocol is a detailed specification of how communication between two computers will be carried out in order to serve some purpose. The Internet protocols specify both the high-level behavior of software implementing the protocol and the low-level details of the specific fields of information in the communication message.

1.2.1 TCP/IP:
TCP and IP are actually two different protocols. They are often treated as a single protocol because the bulk of the services associated with the Internet (e-mail, web browsing, downloads, access to remote databases) are built on top of both TCP and IP.

Internet Protocol (IP) is fundamental to the definition of the Internet. A key element of IP is the IP address, which (in IP version 4, the version described here) is simply a 32-bit number. Each device on the Internet has one or more IP addresses associated with it. IP addresses are written as a sequence of four decimal numbers separated by periods (called dots), as in 192.0.34.166. Each decimal number represents one byte of the IP address.

The function of IP software is to transfer data from one computer to another.
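The relationship between the dotted notation and the underlying 32-bit number can be checked with a few lines of Python. This is only an illustrative sketch; the helper name dotted_to_int is made up for this example, and the address is the one used in the text.

```python
import ipaddress


def dotted_to_int(dotted):
    """Combine the four decimal bytes of a dotted address into one 32-bit integer."""
    value = 0
    for part in dotted.split("."):
        byte = int(part)
        if not 0 <= byte <= 255:
            raise ValueError(f"each number must fit in one byte: {part}")
        value = (value << 8) | byte  # shift earlier bytes left, append this one
    return value


address = "192.0.34.166"
as_int = dotted_to_int(address)
print(as_int)            # the address as a single number
print(f"{as_int:032b}")  # the same value written out as 32 bits

# Cross-check against the standard library's own IPv4 parser.
assert as_int == int(ipaddress.IPv4Address(address))
```

Each of the four decimal numbers contributes exactly eight bits, which is why no number in a dotted address can exceed 255.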


When an application on the source computer wants to send information to a destination, it calls the IP software and provides it with the data to be transferred along with the IP addresses of the source and destination computers. The IP software running on the source creates a packet, which is a sequence of bits representing the data to be transferred along with the source and destination IP addresses and header information such as the length of the data.

o If the destination computer is on the same local network: the IP software sends the packet to the destination directly via this network.
o If the destination computer is on another network: the IP software sends the packet to a gateway, which is a device connected to the source's network as well as to at least one other network. The gateway selects a computer on one of the other networks to which it is attached and sends the packet on to that computer. The packet may go through more hops until it reaches the destination.

The IP software on the destination receives the packet and passes its data up to an application that is waiting for the data.

Ex: A packet from a source on SURANET is first transmitted to a gateway connected to SURANET. Because the NSFNET gateway is not connected to SURANET, the packet is routed to another gateway, say the Telnet gateway, and from there to NSFNET. So the packet has to go through at least one other gateway on the NSFNET.

The sequence of computers that a packet travels through from source to destination is known as its route. The sequence of computers used for routing a packet between networks is chosen using the Border Gateway Protocol (BGP).

IP software also adds some error detection information to each packet it creates, so a corrupted packet can usually be detected by the recipient, which then discards it. IP does nothing to resend a discarded or lost packet, so IP-based communication is unreliable. IP alone is therefore not a particularly good form of communication for many Internet applications.
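The "same network or gateway" decision described above can be sketched in Python. This is a simplified illustration, not how a real IP stack is implemented; the function name next_hop, the network 192.168.1.0/24, and the gateway address 192.168.1.1 are all assumptions chosen for the example.

```python
import ipaddress


def next_hop(local_network, gateway, destination):
    """Return the address the source hands the packet to first."""
    network = ipaddress.ip_network(local_network)
    dest = ipaddress.ip_address(destination)
    if dest in network:
        return str(dest)  # same local network: deliver directly
    return gateway        # another network: let the gateway forward it


# Hypothetical local network and gateway.
print(next_hop("192.168.1.0/24", "192.168.1.1", "192.168.1.42"))
print(next_hop("192.168.1.0/24", "192.168.1.1", "192.0.34.166"))
```

The first call returns the destination itself, the second returns the gateway, which would then repeat a similar decision for its own attached networks.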


TCP: It is a higher-level protocol that extends IP to provide additional functionality based on the concept of a connection. Once a connection is established, TCP provides reliable data transmission by demanding an acknowledgement for each packet it sends via IP. The sender sets a timer after sending each packet, and the TCP software on the receiving side sends a packet containing an acknowledgement for every TCP-based packet it receives. If the sender does not receive an acknowledgement packet before its timer expires, it resends the packet and restarts the timer.

Another feature that TCP adds to IP is the concept of a port. The port concept allows TCP to be used to communicate with many different applications on one machine.

Ex: TCP[25] represents a TCP header containing 25 as the port number. A mail server asks TCP to listen for requests to port 25. When the machine running the mail server receives an IP message that contains a TCP message with port 25 indicated in its header, the data contained within the TCP message is passed to the mail server application.
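The acknowledge-or-resend behaviour described above can be imitated with a small simulation. This is a deliberately simplified sketch (one packet outstanding at a time, no real network or clock); the function send_reliably and the lost_attempts parameter are invented for illustration and are not part of any TCP API.

```python
def send_reliably(packets, lost_attempts):
    """Simulate acknowledgement and retransmission.

    lost_attempts is a set of (packet_index, attempt_number) pairs whose
    transmission is lost, so no acknowledgement arrives before the timer expires.
    """
    delivered = []
    log = []
    for index, packet in enumerate(packets):
        attempt = 1
        while True:
            if (index, attempt) in lost_attempts:
                # No acknowledgement: the timer expires and the packet is resent.
                log.append(f"packet {index} attempt {attempt}: timer expired, resending")
                attempt += 1
                continue
            delivered.append(packet)
            log.append(f"packet {index} attempt {attempt}: acknowledged")
            break
    return delivered, log


delivered, log = send_reliably(["a", "b", "c"], {(1, 1), (1, 2)})
print(delivered)
for line in log:
    print(line)
```

Even though the second packet is lost twice in this run, every packet is eventually delivered, which is the guarantee that plain IP does not give.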

1.2.2 UDP, DNS and DOMAIN NAMES: UDP is an alternative protocol to TCP that also builds on IP.

The main feature that UDP adds to IP is the port concept. UDP does not provide the two-way connection or guaranteed delivery of TCP, but it is faster than TCP for simple tasks. One Internet application that is often run using UDP rather than TCP is the Domain Name Service (DNS).

Every device on the Internet has an IP address, such as 192.0.34.166. But humans generally find it easier to refer to machines by names, such as www.example.org. DNS provides a mechanism for mapping back and forth between IP addresses and host names. Basically, there are a number of DNS servers on the Internet, each listening through UDP software to a port. Conversion takes place as follows:
o A computer on the Internet needs DNS service. It uses UDP software running on its system to send a UDP message to one of these DNS servers, requesting the IP address for a host name.
o The server then sends back a UDP message containing the IP address.

Internet host names consist of a sequence of labels separated by dots.
Ex: www.google.com
    www   World Wide Web service, used for information retrieval.
    .com  Commercial organization. This is the top-level domain. Top-level domains were traditionally two or three characters long, although longer ones such as .info also exist.

    Domain  Description
    .com    Commercial organization
    .net    ISPs and network-related organizations
    .org    Non-commercial organization
    .gov    Government agencies
    .mil    Military
    .edu    Educational institutions
    .web    Web-based company
    .tv     Television channel
    .info   Information site
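To make the "UDP message requesting the IP address" more concrete, the sketch below builds the bytes of a minimal DNS query for an address record. It only constructs the message; it does not send it. The function name build_dns_query and the query id 0x1234 are arbitrary choices for this example.

```python
import struct


def build_dns_query(host_name, query_id=0x1234):
    """Build a minimal DNS query asking for the address of host_name."""
    # Header: id, flags (0x0100 = recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: each label is prefixed by its length; a zero byte ends the name.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in host_name.split(".")
    ) + b"\x00"
    # Query type 1 (host address record), query class 1 (Internet).
    question = qname + struct.pack("!HH", 1, 1)
    return header + question


query = build_dns_query("www.example.org")
print(len(query), query)
```

In practice an application would hand these bytes to a UDP socket addressed to a DNS server on port 53 and wait for the reply; most programs simply call a resolver function such as socket.gethostbyname and let the operating system do this work.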


Normally two-letter domains are used to distinguish between countries.

    Domain  Country
    .au     Australia
    .us     U.S.
    .ca     Canada
    .in     India
    .uk     United Kingdom

1.2.3 HIGHER LEVEL PROTOCOLS
Higher-level protocols are used to communicate once a TCP connection has been established. SMTP and FTP are widely used higher-level protocols that communicate over TCP connections. SMTP supports the transfer of e-mail between different e-mail servers. FTP is used for transferring files between machines. Another protocol, Telnet, is used to execute commands typed into one computer on a remote computer. The primary TCP-based protocol used for communication between web servers and browsers is HTTP (Hypertext Transfer Protocol).

1.3 THE WORLD WIDE WEB
Public sharing of information has long been a part of the Internet. The Usenet newsgroup service began in 1979 and provided a means of posting information that could be read by users on other systems with the appropriate software. Large files were often shared by running an FTP server application. The first Internet chat software in widespread use, IRC, provided both public and private chat facilities.

As the amount of information publicly available on the Internet grew, the need to locate information also grew, and various technologies for supporting information management and search on the Internet were developed. Some of the more popular information management technologies of the early 1990s were:
o Gopher information servers, which provided a simple hierarchical view of documents.
o The Wide Area Information Server (WAIS) system, for indexing and retrieving information.
o The Archie tool, for searching online information archives accessible via FTP.

All of the above technologies involve two types of software:
(i) Server
(ii) Client

Server: An Internet-connected computer that wishes to provide information to other Internet systems must run server software.
Client: A system that wishes to access the information provided by servers must run client software.

The client and the server communicate over the Internet by following a communication protocol built on top of TCP/IP. The protocol used by the web is the Hypertext Transfer Protocol (HTTP), a generic protocol that supports a client requesting a document from a server and the server returning the requested document.

Most web pages are written using HTML (HyperText Markup Language). HTML pages can contain the familiar web links to other documents on the web. HTML also provides extensive page layout facilities, including support for online graphics, which has added significantly to the commercial appeal of the web.

The WWW can be informally defined as the collection of machines on the Internet that provide information via HTTP, and particularly those that provide HTML documents.

1.3.1 HYPERTEXT TRANSFER PROTOCOL:
HTTP is a communication protocol: a detailed specification of how web clients and servers should communicate. The basic structure of HTTP communication follows what is known as a request-response model: an HTTP interaction is initiated by a client sending a request message to the server, and the server then generates a response message. HTTP does not dictate the network protocol used to send these messages, but it does expect the request and response to be sent within a TCP-style connection between the client and the server.

Ex: When we type an address in the address bar and press the Enter key, the browser:
i) Creates a message conforming to the HTTP protocol.
ii) Uses DNS to obtain an IP address for the domain name in the address.
iii) Creates a TCP connection with the machine at the IP address obtained.
iv) Sends the HTTP message over this TCP connection.
v) Receives back a message containing the information that is displayed in the client area of the browser.
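The five steps can be traced in code with Python's standard socket module. So that the sketch runs without Internet access, it starts a throwaway web server on the local machine and fetches a page from it; HelloHandler, fetch, and demo are names invented for this example, and a real browser does considerably more than this.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler standing in for a real web site."""

    def do_GET(self):
        body = b"<html><body>Hello</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch(host, port, path="/"):
    # i) Create a message conforming to HTTP.
    request = f"GET {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
    # ii) Use the name service to obtain an IP address for the host name.
    ip_address = socket.gethostbyname(host)
    # iii) Create a TCP connection with the machine at that address.
    with socket.create_connection((ip_address, port), timeout=5) as conn:
        # iv) Send the HTTP message over the TCP connection.
        conn.sendall(request.encode("ascii"))
        # v) Receive the response until the server closes the connection.
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("iso-8859-1")


def demo():
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return fetch("localhost", server.server_address[1])
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

The printed text is the raw response: a status line, some header fields, a blank line, and then the HTML that a browser would render in its client area.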

HTTP request and response messages often consist of plain text in a fairly readable form, so a request can be entered by hand at a command prompt using Telnet. Port 80 is the standard port assigned by IANA to HTTP web servers.

    $ telnet www.example.org 80                     } Connect
    Trying 192.0.34.166...                          }
    Connected to www.example.com (192.0.34.166).    }
    Escape character is '^]'.                       }
    GET / HTTP/1.1                                  } HTTP request
    Host: www.example.org                           }

    HTTP/1.1 200 OK                                 } HTTP response
    Date: Thu, 09 Oct 2003 20:30:49 GMT             }

This is the basic structure of an HTTP exchange.

1.4 HTTP REQUEST MESSAGE

1.4.1 OVERALL STRUCTURE:
Every HTTP request message has the same basic structure:
    Start line
    Header field(s) (one or more)
    Blank line
    Message body (optional)

In the previous example, GET / HTTP/1.1 is the start line. Every start line consists of three parts:
i) Request method
ii) Request-URI portion of the web address
iii) HTTP version
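The request structure (start line, header fields, blank line, optional body) can be illustrated by assembling and taking apart a request as plain text. This is a minimal sketch with invented helper names (build_request, parse_request); it ignores many details a real HTTP library handles.

```python
def build_request(method, request_uri, headers, body=""):
    """Assemble start line, header fields, blank line, and optional body."""
    lines = [f"{method} {request_uri} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    return "\r\n".join(lines) + "\r\n\r\n" + body


def parse_request(message):
    """Split a request back into its parts."""
    head, _, body = message.partition("\r\n\r\n")
    start_line, *header_lines = head.split("\r\n")
    # The start line has three parts: method, Request-URI, HTTP version.
    method, request_uri, version = start_line.split(" ")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # field names ignore case
    return method, request_uri, version, headers, body


message = build_request("GET", "/", {"Host": "www.example.org"})
print(repr(message))
print(parse_request(message))
```

Printing the message with repr makes the line terminators visible, including the blank line that tells the server the header section is finished.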

1.4.2 HTTP VERSION:
The initial version of HTTP was HTTP/0.9. The first Internet RFC regarding HTTP described HTTP/1.0. In 1997, HTTP/1.1 was formally defined and, when these notes were written, was an Internet Draft Standard. Essentially all operational browsers and servers support HTTP/1.1.

1.4.3 REQUEST-URI:
The second part of the start line is known as the Request-URI. The concatenation of the string http://, the value of the Host header field (www.example.org), and the Request-URI (/) forms a string called a Uniform Resource Identifier (URI):
    http://www.example.org/
A URI is an identifier that is intended to be associated with a particular resource on the WWW. Every URI consists of two parts:
i) The scheme, which appears before the colon (:)
ii) Another part that depends on the scheme

Web addresses for the most part use the http scheme. In this scheme, the URI represents the location of a resource on the web. A URI of this type is said to be a Uniform Resource Locator (URL). A Uniform Resource Name (URN), in contrast, is designed to be a unique name for a resource rather than a location at which to find it.
Ex: An edition of War and Peace has an ISBN of 0-1404-4417-3 associated with it. This is the only book worldwide with this number, so the book has an associated URN, which can be written as follows:
    urn:ISBN:0-1404-4417-3
A URN-type URI always consists of three colon-separated parts:
o The first part is the scheme name, which is always urn for a URN-type URI.
o The second part is the namespace identifier (ISBN in the example).
o The third part is the namespace-specific string.
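The three colon-separated parts of a URN can be separated with a single string split. The helper name split_urn is hypothetical, and the sketch performs only the most basic validation.

```python
def split_urn(uri):
    """Split a URN-type URI into scheme, namespace identifier, and namespace-specific string."""
    # Split on the first two colons only; the last part may itself contain colons.
    scheme, namespace_id, namespace_string = uri.split(":", 2)
    if scheme.lower() != "urn":
        raise ValueError("not a URN-type URI")
    return scheme, namespace_id, namespace_string


print(split_urn("urn:ISBN:0-1404-4417-3"))
```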

1.4.4 REQUEST METHOD
The primary HTTP method is GET. This method is used when a URL is typed into the location bar of the browser. It is also used by default when a link is clicked in a document displayed in the browser and when the browser downloads images for display within an HTML document.

The POST method is typically used to send information collected from a form displayed within a browser, such as an order-entry form, back to the web server.

Other request methods are:

    Method    Use
    HEAD      Return the same HTTP header fields that would be returned if a GET method were used, but do not return the message body.
    OPTIONS   Return a list of HTTP methods that may be used to access the resource specified by the Request-URI.
    PUT       Store the body of this message on the server for future use.
    TRACE     Return a copy of the complete HTTP request message. Used primarily for test purposes.

1.4.5 HEADER FIELDS AND MIME TYPES
HTTP/1.1 defines a number of header fields that are commonly used by modern browsers. Each header field begins with a field name, such as Host, followed by a colon and then a field value.
Ex: An HTTP request sent by a browser, consisting of a start line, 10 header fields (eight of which are reproduced here), and a short message body:

    POST /Servlet/EchoHttpRequest HTTP/1.1              } Start line
    Host: www.example.org:56789                          } Header fields
    User-agent: Mozilla/5.0                              }
    Accept: text/xml                                     }
    Accept-language: en-us,en;                           }
    Connection: keep-alive                               }
    Keep-alive: 300                                      }
    Content-type: application/x-www-form-urlencoded      }
    Content-length: 13                                   }

    Doit=click+me                                        } Short message body

Some of the features of header fields:
o Header names are not case sensitive.

o A header field value may wrap onto several lines by preceding each continuation line with one or more spaces or tabs.
o MIME types are used in several header field values. MIME (Multipurpose Internet Mail Extensions) refers to a standard that can be used to pass a variety of types of information, including graphics and applications, through e-mail as well as through other Internet message protocols. The content of a MIME message is specified using a two-part, case-insensitive string known as the content type of the message.
o Header field values use so-called quality values to indicate preferences.
  Ex: q=num, where num is a decimal number between 0 and 1, with a higher number representing greater preference.

1.5 HTTP RESPONSE MESSAGE
An HTTP response message consists of a status line, header fields, and the body of the response, in the following format:
    Status line
    Header field(s)
    Blank line
    Message body (optional)

1.5.1 RESPONSE STATUS LINE:
Ex: HTTP/1.1 200 OK
Like the start line of a request, the status line of a response consists of three fields:
i) The HTTP version used by the server software when formatting the response.
ii) A numeric status code indicating the type of response.
iii) A text string (the reason phrase) that presents the information represented by the numeric status code in human-readable form.
In this example, the status code is 200 and the reason phrase is OK.

This status code indicates that no errors were detected by the server. The body of a response having this status code should contain the resource requested by the client. All status codes are three-digit decimal numbers. The first digit represents the general class of the status code.

    Digit  Class          Use
    1      Informational  Provides information to the client before request processing has been completed.
    2      Success        Request has been successfully processed.
    3      Redirection    Client needs to use a different resource to fulfill the request.
    4      Client error   Client's request is not valid.
    5      Server error   An error occurred during server processing of a valid client request.

Status codes:

    Code  Reason Phrase
    200   OK
    301   Moved Permanently
    307   Temporary Redirect
    401   Unauthorized
    403   Forbidden
    404   Not Found
    500   Internal Server Error
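The three fields of the status line, and the class implied by the first digit of the code, can be extracted as follows. This is an illustrative sketch; the names STATUS_CLASSES and parse_status_line are made up for the example.

```python
# Class names keyed by the first digit of the status code (from the table above).
STATUS_CLASSES = {
    "1": "Informational",
    "2": "Success",
    "3": "Redirection",
    "4": "Client error",
    "5": "Server error",
}


def parse_status_line(status_line):
    """Return (version, code, reason phrase, class) for a status line."""
    # Split on the first two spaces only: the reason phrase may contain spaces.
    version, code, reason = status_line.split(" ", 2)
    if len(code) != 3 or not code.isdigit():
        raise ValueError(f"status code must be three decimal digits: {code}")
    return version, int(code), reason, STATUS_CLASSES.get(code[0], "Unknown")


print(parse_status_line("HTTP/1.1 200 OK"))
print(parse_status_line("HTTP/1.1 404 Not Found"))
```

A client normally makes decisions from the numeric code, and the first digit alone is enough to choose a general course of action even for a code it has not seen before.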

The HTTP standard recommends a reason phrase for each status code.

1.5.2 RESPONSE HEADER FIELDS:
Some of the header fields used in HTTP request messages, including Connection, Content-Type, and Content-Length, are also valid in response messages. The content type of a response can be any one of the MIME type values specified by the Accept header field of the corresponding request. Some of the common response header fields are:

    Field Name      Use
    Date            Time at which the response was generated by the server.
    Server          Information identifying the server software generating this response.
    Last-Modified   Time at which the resource returned by this request was last modified.
    Expires         Time after which the client should check with the server before retrieving the returned resource from the client's cache.
    Location        Used in responses with a redirect status code to specify the new URI for the requested resource.

1.5.3 CACHE CONTROL
Most web browsers automatically cache on the client machine many of the resources that they request from servers via HTTP. For example, if an image such as a button icon is included in a web page, a copy of the image obtained from the server will typically be cached in the client's file system. If another page at the same site uses the same image, the image can be retrieved from the client's file system rather than by sending another request to the server. HTTP caching reduces network communication and reduces the load on the web server.

Key drawback of using a cache: information in a cache can become invalid. For example, if the image is modified on the server, a client may still access its cached copy of the older version of the image. This drawback can be avoided in several ways:


i) One way of guaranteeing that a cached copy of a resource is valid is for the client to ask the server whether or not the client's copy is valid. This can be done with relatively little communication by sending an HTTP request for the resource using the HEAD method, which returns only the status line and header portion of the response. The client can judge validity by comparing the Last-Modified time in the response with that of its cached copy.

ii) With an ETag: the server returns an ETag header field that identifies the current version of the resource. The client can then simply compare this ETag with the ETag stored with its cached copy. If they match, the cached copy is valid; otherwise it is not.
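Both validity checks can be sketched as a comparison between the header fields stored with the cached copy and the header fields returned by a HEAD request. The function name cached_copy_is_valid and the dictionary layout (lower-case field names) are assumptions made for this example; real browsers apply more elaborate rules.

```python
from email.utils import parsedate_to_datetime


def cached_copy_is_valid(cached, head_response):
    """Compare stored header fields with those returned by a HEAD request."""
    # Preferred check: the entity tags must match exactly.
    if "etag" in cached and "etag" in head_response:
        return cached["etag"] == head_response["etag"]
    # Fallback: the server's copy must not be newer than the cached one.
    if "last-modified" in cached and "last-modified" in head_response:
        cached_time = parsedate_to_datetime(cached["last-modified"])
        server_time = parsedate_to_datetime(head_response["last-modified"])
        return server_time <= cached_time
    return False  # nothing to compare: treat the cached copy as stale


stored = {"etag": '"v1"', "last-modified": "Thu, 09 Oct 2003 20:30:49 GMT"}
print(cached_copy_is_valid(stored, {"etag": '"v1"'}))
print(cached_copy_is_valid(stored, {"etag": '"v2"'}))
```

The entity tag values shown are hypothetical; a server is free to generate them in any way that changes whenever the resource changes.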

1.5.4 CHARACTER SETS:
Characters are represented by integer values within a computer. A character set defines the mapping between these integers, or code points, and characters. For example, US-ASCII is the character set used to represent the characters in HTTP header field names, and it is also used in key portions of many other Internet protocols. For web pages, which are meant to be viewed throughout the world, a single worldwide character set should be used. The Unicode standard is an attempt to provide a single character set that encompasses every human language representation as well as all other commonly used symbols.

In addition to character sets, browsers also accept certain character encodings. A character encoding is a bit string that must be decoded into a code-point integer, which is then mapped to a character according to the definition provided by some character set. A character encoding often represents characters using variable-length bit strings: UTF-8 uses a variable number of 8-bit values, and UTF-16 uses a variable number of 16-bit values.

The Accept-Charset header field is used by a client to tell a server the character sets and character encodings that it will accept, as well as its preferred character sets or encodings.
Ex: Accept-Charset: ISO-8859-1, utf-8;q=0.7, *;q=0.7
This says that the client would prefer to receive documents using the ISO-8859-1 character set (a value with no q parameter has the default quality of 1), but will also accept UTF-8 or any other character set, with a lower preference of 0.7.
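Quality values can be read out of a header field value with a short parser. The sketch below assumes the standard syntax in which each item may carry a ;q= parameter and a missing q means 1; the function name parse_preferences is invented for the example.

```python
def parse_preferences(field_value):
    """Return (name, quality) pairs, most preferred first."""
    preferences = []
    for item in field_value.split(","):
        name, _, params = item.strip().partition(";")
        quality = 1.0  # default when no q parameter is given
        for param in params.split(";"):
            key, _, value = param.strip().partition("=")
            if key == "q" and value:
                quality = float(value)
        preferences.append((name.strip(), quality))
    # Highest quality first; the sort is stable, so ties keep their listed order.
    return sorted(preferences, key=lambda pair: pair[1], reverse=True)


print(parse_preferences("ISO-8859-1, utf-8;q=0.7, *;q=0.7"))
```

The same parsing applies to other header fields that use quality values, such as Accept and Accept-Language.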

A web server can inform a client about the character set/encoding used in a returned document by adding a charset parameter to the value of the Content-Type header field.
Ex: Content-Type: text/html; charset=UTF-8
The content type indicates that the body of the message is an HTML document, and charset indicates that the body is written using the UTF-8 character encoding.

1.6 WEB CLIENTS
A web client is software that accesses a web server by sending an HTTP request message and processing the resulting HTTP response. Web browsers running on desktop or laptop computers are the most common form of web client software. Other clients include text-only browsers, browsers running on cell phones, and browsers that speak a page rather than displaying it. In general, any web client that is designed to directly support user access to web servers is known as a user agent.

1.6.1 BASIC BROWSER FUNCTIONS:
The window of a typical modern browser is split into several rectangular regions, most of which are known as bars.
o The primary region is the client area, which displays a document.
o The title bar displays a title assigned by the document author to the document currently displayed within the client area. The title bar also displays the browser name as well as standard window management controls.
o The menu bar contains a set of drop-down menus, much like most other applications that incorporate a GUI.
o The browser's navigation toolbar contains standard push-button controls that allow the user to return to a previously viewed web page (Back), reverse the effect of pressing Back (Forward), ask the server for an updated version of the page currently viewed (Reload), halt a page download currently in progress (Stop), and print the client area of the window (Print). The navigation toolbar also contains a text box, known as the Location bar, where a user can enter a URL and press the Enter key to request that the browser display the document located at that URL. If the Search button is pressed instead of the Enter key, the text is sent to a search engine. Clicking the down arrow at the right side of the Location bar produces a drop-down menu of recently used URLs that can be visited again with a single click.
o Finally, the status bar displays messages and icons related to the status of the browser. The messages displayed in the left portion of the status bar are normally information about the communication between client and server. The right portion of the status bar contains icons indicating that the browser is online and that it is communicating with the server over an insecure communication channel.

A primary task of any browser is to make HTTP requests on behalf of the browser user. To do so, the browser must perform a number of tasks:
i) Reformat the URL entered as a valid HTTP request message.
ii) If the server is specified using a host name, use DNS to convert it to the appropriate IP address.
iii) Establish a TCP connection using the IP address of the specified web server.
iv) Send the HTTP request over the TCP connection and wait for the server's response.
v) Display the document contained in the response.

1.6.2 URLS
An HTTP-scheme URL consists of a number of pieces.
Ex: http://www.example.org:56789/a/b/c.txt?t=win&s=chess#para5
www.example.org:56789 is the authority of the URL. The authority consists of either a fully qualified domain name or an IP address of an Internet web server, optionally followed by a colon (:) and a port number. If the port number is omitted, then a TCP connection to port 80 is implied.
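The pieces of the example URL can be pulled apart with Python's standard urllib.parse module. The wrapper function describe_url is an invention for this sketch; it also fills in port 80 when the URL gives no port.

```python
from urllib.parse import parse_qs, urlsplit


def describe_url(url):
    """Break an http-scheme URL into the pieces named in the text."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "authority": parts.netloc,
        "host": parts.hostname,
        # Port 80 is implied when the URL does not give a port.
        "port": parts.port if parts.port is not None else 80,
        "path": parts.path,
        "query": parse_qs(parts.query),
        "fragment": parts.fragment,
    }


for name, value in describe_url(
    "http://www.example.org:56789/a/b/c.txt?t=win&s=chess#para5"
).items():
    print(f"{name:10} {value}")
```

The authority, path, query, and fragment printed here correspond to the parts of the URL that the notes describe for this example.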


In the example, www.example.org is the fully qualified domain name and 56789 is the port number.
The portion following the authority, up to but not including the question mark (?), is called the path of the URL. The leading slash is part of the path, but the question mark is not. In the example, /a/b/c.txt is the path.
The information between, but not including, the ? and the number sign (#) is the query portion of the URL. In the example, t=win&s=chess is the query portion. The query portion was originally intended to pass search terms to a web server.
The final, optional part of the URL is the fragment: the text after the number sign (#). In the example, para5 is the fragment. It is used by browsers to scroll to a particular position in an HTML document.

1.6.3 USER CONTROLLABLE FEATURES
Graphical browsers provide many user-controllable features.
i) Save: Most documents can be saved by the user to the client machine's file system. If the document is an HTML page that contains other documents, such as images, then the browser will attempt to save all of these documents locally so that the entire page can be displayed from the local file system.
ii) Find in Page: Standard documents can be searched with a function similar to that provided by most word processors.
iii) Automatic form filling: The browser can remember information entered on certain forms, such as billing addresses and phone numbers.
iv) Preferences: Users can customize browser functionality in a wide variety of ways. Some preference settings related to HTTP are:
   o Accept language: The preferred language(s) for documents.
   o Default character set/encoding: The character set/encoding to be assumed for documents.
   o Cache properties: The amount of local storage allocated to the cache and the conditions controlling when a cached file will be validated.
   o HTTP settings: The version of HTTP used and whether or not the client will keep connections alive.

v) Style definition: The user can define certain aspects affecting how the browser renders HTML pages, such as font sizes and background and foreground colors.
vi) Document meta-information: Interested users can view information about the displayed document, such as the document's MIME type, character encoding, and size.
vii) Themes: The appearance of the navigation bar can be modified by applying a named theme.
viii) History: The browser will automatically maintain a list of all pages visited within the last several days.
ix) Bookmarks: Users can explicitly bookmark a web page, that is, save the URL for that page for an indefinite length of time.

1.6.4 ADDITIONAL FUNCTIONALITY:
Browsers perform a number of other functions, including:
i) Automatic URL completion: If the user has previously entered a URL in the Location bar and begins to type it again, the URL will be completed automatically by the browser.
ii) Script execution: Browsers can run programs (scripts), for example to validate data entered on a form before sending it to a web server.
iii) Event handling: When a user performs an action, such as clicking on a link or button in a web page, the browser treats this as the occurrence of an event. A variety of actions can take place in response to an event.
iv) Management of form GUI: If a web page contains a form with fill-in fields, the browser must allow the user to perform standard text-editing functions within these fields.
v) Secure communication: When a user sends sensitive information, such as a credit card number, to a server, the browser can encrypt this information in a way that prevents other machines from obtaining it.
vi) Plug-in execution: Most browsers support some form of plug-in protocol that allows the browser's capabilities to be extended by other software.


1.7 WEB SERVERS
This section covers the basic functionality found in most web servers, as well as some specific instructions for accessing and modifying the parameters of the Tomcat 5.0 web server.

1.7.1 SERVER FEATURES:
The primary function of a web server is to accept HTTP requests from web clients and return an appropriate resource in the HTTP response. The basic functionality involves a number of steps:
i) The server calls on TCP software and waits for connection requests to one or more ports.
ii) When a connection request is received, the server dedicates a subtask to handling this connection.
iii) The subtask establishes the TCP connection and receives an HTTP request.
iv) The subtask examines the Host header field of the request to determine which virtual host should receive this request and invokes software for this host.
v) The virtual host software maps the Request-URI field of the HTTP request start line to a resource on the server.
vi) If the resource is a file, the host software determines the MIME type of the file and creates an HTTP response that contains the file in the body of the response message.
vii) If the resource is a program, the host software runs the program, providing it with information from the request and returning the output from the program as the body of an HTTP response message.
viii) The server normally logs information about the request and response in a plain text file.
ix) If the TCP connection is kept alive, the server subtask continues to monitor the connection until a certain length of time has elapsed, the client sends another request, or the client initiates a connection close.
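Steps v) and vi), mapping the Request-URI to a resource and building a response with the right MIME type, can be sketched as follows. This is a toy illustration: DOCUMENTS is a hypothetical in-memory stand-in for the server's file system, and build_response is an invented name, not part of Tomcat or any other server.

```python
import mimetypes

# Hypothetical in-memory "file system" standing in for the server's documents.
DOCUMENTS = {
    "/index.html": b"<html><body>Welcome</body></html>",
    "/notes.txt": b"plain text notes",
}


def build_response(request_uri):
    """Map a Request-URI to a resource and wrap it in an HTTP response."""
    body = DOCUMENTS.get(request_uri)
    if body is None:
        body = b"Not found"
        status, content_type = "404 Not Found", "text/plain"
    else:
        # Guess the MIME type from the file name extension.
        guessed, _ = mimetypes.guess_type(request_uri)
        status = "200 OK"
        content_type = guessed or "application/octet-stream"
    head = (
        f"HTTP/1.1 {status}\r\n"
        f"Content-Type: {content_type}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + body


print(build_response("/index.html").decode("ascii"))
print(build_response("/missing").decode("ascii"))
```

A real server performs the same mapping against files on disk or against programs, and adds further header fields such as Date and Server.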


1.7.2 SERVER HISTORY:
NCSA's httpd web server was the starting point for server development. The Apache server, a free open-source server, was first released in April 1995. Microsoft's IIS provides essentially all of the features found in Apache. IIS runs only on the Windows OS, whereas Apache runs on Windows, Linux, and Macintosh systems. Tomcat is a popular, free, open-source servlet container developed and maintained by the Apache Software Foundation, the same organization that develops the Apache web server. Tomcat can also be run as a standalone web server that communicates directly with web clients.

1.7.3- SERVER CONFIGURATION AND TUNING Broadly server configuration can be broken into two areas: external communication and internal processing. In Tomcat, this corresponds to two separate Java packages; Coyote provides the HTTP/1.1 communication. Catalina is the actual servlet container. Some if the coyote parameters, affecting external communication include the following: IP addresses and TCP ports that may be used to connect to this server. Number of subtasks(thread) that will be created when the server is initialized. This many TCP connections can be established simultaneously with minimal overhead. Maximum number of threads that will be allowed to exist simultaneously. Maximum number of TCP connection requests that will be queued if the server is already running its maximum number of threads. Length of time the server will wait after serving an HTTP request.

The settings of these parameters can have a significant influence on the performance of a server. Changing the values of these and similar parameters in order to optimize performance is often referred to as tuning the server. As with all optimization problems, there are various trade-offs involved in attempting to tune a server. For example, increasing the maximum number of threads that may execute simultaneously increases memory requirements.

The internal Catalina portion of Tomcat also has a number of parameter settings that affect functionality:
i) Which client machines may send HTTP requests to the server.
ii) Which virtual hosts are listening for TCP connections on a given port.
iii) What logging will be performed.
iv) How the path portion of the Request-URI will be mapped to the server's file system or other resources.
v) Whether or not the server's resources will be password protected.
vi) Whether or not resources will be cached in the server's memory.

1.7.4 DEFINING VIRTUAL HOSTS:
The Host component is used to define a virtual host. The virtual host name should normally be a fully qualified domain name that visitors to the web site would use. The Host supplied as part of the Java Web Services Developer Pack (JWSDP) is given the unqualified name localhost. This is a special name that the DNS system treats as a reference to a special IP address, 127.0.0.1. If an IP message is sent to this address, the IP software causes the message to loop back to the sending machine for receipt. A virtual host named localhost is therefore normally accessible only if the browser runs on the server machine.
The value of the default host name field for this service is localhost. This means that if a user browses to this service using a URL with a host name other than localhost, the request will still be passed to the localhost virtual host. Suppose an additional Host component with the name www.example.org were added to this service. Then this new virtual host would handle requests containing a Host header field with value www.example.org, while requests with any other Host value would continue to be handled by the default localhost virtual host.
A web application is a collection of files and programs that work together to provide a particular function to web users. A web site might run two web applications:

o One for use by administrators of the site, providing maintenance functionality, and
o Another for use by external clients, providing customer functionality.
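Returning to the Host component, the default-host rule of Section 1.7.4 can be sketched as a small lookup. This is a minimal illustration, not Tomcat's actual code: the function name select_virtual_host and the set of configured host names are hypothetical.

```python
def select_virtual_host(host_header, hosts, default_host="localhost"):
    """Return the name of the virtual host that should handle a request."""
    # Drop an optional ":port" suffix and compare case-insensitively.
    name = (host_header or "").split(":")[0].strip().lower()
    # Any Host value that matches no configured host goes to the default host.
    return name if name in hosts else default_host


configured = {"localhost", "www.example.org"}
print(select_virtual_host("www.example.org", configured))
print(select_virtual_host("www.example.com:8080", configured))
```

With the two hosts configured above, the first call selects www.example.org and the second falls back to localhost.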

In Tomcat, a web application is represented by a Context component. Every host can contain a number of contexts. Each host and context is associated with a directory in the server's file system. The directory associated with a host is specified by the value of the application base field. If this value is a relative path name (a path name that does not begin with a / in Linux or with a drive specification such as c:/ in Windows), then it is taken as relative to the directory in which JWSDP 1.3 was installed. An absolute path name starts with / (or with a drive specification in Windows). For example, for localhost the application base is webapps; if JWSDP 1.3 is installed in /usr/java/jwsdp1.3, the corresponding absolute path is /usr/java/jwsdp1.3/webapps. Using a relative path name for the application base value is generally recommended.

1.7.5 LOGGING
Web server logs record information about server activity. The primary web server log recording normal activity is an access log, a file that records information about every HTTP request processed by the server. A web server may also produce one or more message logs containing a variety of debugging and other information generated by web applications as well as possibly by the web server itself. Finally, information written to the standard output and error streams by the web server or applications may also be logged. Each entry in the access log contains the following information:
i) Host name of the client machine making the request.
ii) User name used to log in.
iii) Date and time of the response, plus the time zone.
iv) Start line of the HTTP request.
v) HTTP status code of the response.
vi) Number of bytes sent in the body of the response.
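As a rough sketch of how such an entry can be read by a program, the code below parses one access log line into the six fields listed above. It assumes the widely used Common Log Format, which also carries an extra "-" identity field between the host and the user name; the sample lines and the names parse_entry and error_percentage are made up for illustration.

```python
import re

# One access log line: host, identity, user, [time], "start line", status, bytes.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_entry(line):
    """Return a dict of log fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry


def error_percentage(lines):
    """Percentage of parsed requests that received a 4xx or 5xx status code."""
    entries = [e for e in map(parse_entry, lines) if e is not None]
    if not entries:
        return 0.0
    errors = sum(1 for e in entries if e["status"] >= 400)
    return 100.0 * errors / len(entries)


# Hypothetical sample entries, not taken from a real server.
SAMPLE = [
    '127.0.0.1 - alice [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.1" 200 2326',
    '127.0.0.1 - - [10/Oct/2000:13:55:40 -0700] "GET /missing.html HTTP/1.1" 404 -',
]
```

The error_percentage function computes the kind of statistic a log analyzer would report.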


An advantage of using this log format is that a variety of log analyzers have been developed that can read logs in this format and produce reports on various aspects of a site's usage, for example the number of accesses per day or the percentage of requests that received an error status code. Such reports are useful for server tuning. Analog is one popular free log analyzer.

1.7.6 ACCESS CONTROL
Tomcat can provide automatic password protection for resources that it serves. This is a two-stage process.
I-stage:
i) A database of user names is created.
ii) Each user name is assigned a password and also one or more roles (roles may be administrator, developer, end user).
iii) Some users may be assigned to multiple roles.

II-stage:
i) Certain resources can only be accessed by users who belong to certain roles and who have authenticated themselves as belonging to one of these roles by logging in with an appropriate user name and password.

1.7.7 SECURE SERVERS
HTTP request and response messages are sent as plain text. Because these messages are carried by TCP/IP, each message may travel through a number of machines before reaching its destination. It is possible that some machine along the route will extract information from the IP messages it forwards for nefarious purposes. Furthermore, it is often possible for other machines sharing a local network with the sending or receiving machine to snoop the network and view messages associated with other machines as if they were sent to the snooper.


Eavesdropper: Any machine other than the sender or receiver that extracts information from network messages. To prevent an eavesdropper from obtaining sensitive information, such information should be encrypted before being transmitted over any public communication network. The standard means of indicating to a browser that it should encrypt an HTTP request is to use the https scheme in the URL for the request, for example:
    https://www.example.org
Tomcat supports TLS 1.0 and the earlier SSL protocols. To enable Tomcat's secure server feature:
i) Obtain and install a certificate.
ii) Configure the server to listen for TLS connections on some port.

1.8 MARKUP LANGUAGES: XHTML
For communication between browsers and servers, most documents are written using the Hypertext Markup Language (HTML). HTML is not a single language; it is the name for a family of related languages.
Difference between XHTML & HTML
HTML is an application of SGML and allows an author to omit certain tags and use attribute minimization. XHTML (Extensible HyperText Markup Language) is an application of XML; it does not permit the omission of any tags or the use of attribute minimization.
1.8.1 AN INTRODUCTION TO HTML
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
      "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml">
    <head> <title>HelloWorld.html</title> </head>
    <body>
    <p> Hello World! </p>

    </body>
    </html>
This document, like every HTML document, contains two types of information:
i) The markup information, which is contained in tags consisting of angle bracket tag delimiters (< and >) plus the text contained between these delimiters.
ii) The character data of the document, which is everything outside of the markup tags and is displayed by the browser. In this case, the character data consists of the two strings HelloWorld.html and Hello World! as well as some white space.
The markup beginning with <!DOCTYPE is special markup information called the document type declaration. It identifies the particular version of HTML used to write the document. In this case, the file is written in the XHTML 1.0 Strict version. Everything in the Hello World! document following the document type declaration (the text from <html down) is called the document instance.
Each of the tags in this example document instance is either a start tag or an end tag. A start tag and its corresponding end tag, along with all of the document between the tags, is called an element of the document. The portion between the tags (not including the tags themselves) is called the content of the element. The first tag in the document instance of an HTML file must be a start tag for the root element, and the root element can only occur once in the document.
In order to strictly conform with the XHTML 1.0 standard, the html start tag must also contain an xmlns="..." string. That string is an example of an attribute specification, which consists of an attribute name (xmlns in this case) and an attribute value (the string within quotes).
The document instance can be viewed as a tree (element tree):

    html
      head
        title
      body
        p

    Element Tree
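The element tree can also be recovered mechanically. The following sketch uses the HTMLParser class from Python's standard library to print each element indented by its nesting depth; the class name TreeBuilder and the function element_tree are hypothetical, and the sketch assumes every start tag has a matching end tag, as XHTML requires.

```python
from html.parser import HTMLParser

DOCUMENT = """<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
 "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head> <title>HelloWorld.html</title> </head>
<body> <p> Hello World! </p> </body>
</html>"""


class TreeBuilder(HTMLParser):
    """Record each element name, indented by its depth in the element tree."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.lines = []

    def handle_starttag(self, tag, attrs):
        self.lines.append("  " * self.depth + tag)
        self.depth += 1  # children of this element are one level deeper

    def handle_endtag(self, tag):
        self.depth -= 1  # the element is closed, go back up one level


def element_tree(document):
    builder = TreeBuilder()
    builder.feed(document)
    builder.close()
    return builder.lines


print("\n".join(element_tree(DOCUMENT)))
```

The printed lines mirror the tree shown above: html at the root, head and body as its children, and title and p one level below them.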

Any text contained in the head element does not appear directly in the primary window area (the client area) of the browser window. Instead, the head element is used for providing certain instructions to the browser. Here it contains the title element, which directs the browser to display the element content as the window title. The body element contains the information that is to be displayed in the client area of the browser. This document's body contains a single paragraph (p) element. A p element tells the browser that its content represents a single paragraph of text and should be displayed accordingly.

1.9 HTML'S HISTORY AND VERSIONS
HTML was initially defined by Tim Berners-Lee in 1990. It was developed for science and engineering professionals. Even after revision, the language still had few elements. The elements in use as of November 1992 included the title and paragraph elements and the elements for creating hyperlinks, headings, simple lists, glossaries, examples and address blocks. There was no facility for producing tables or fill-in forms, much less for including images within a document.
In February 1993, a preliminary version of the Mosaic browser was publicly released. By September 1993, Windows and Macintosh versions of the same browser were made available. In addition to displaying images within documents, the NCSA Mosaic browser could play video clips as well as sounds. Microsoft deployed a large development team to work on its Internet Explorer browser.
During the period from 1993 through 1997, HTML was being defined operationally by the elements that browser software developers chose to implement and the ways in which their browsers responded to these elements. In October of 1994, Tim Berners-Lee launched the World Wide Web Consortium (W3C), in part with the goal of producing standards for HTML as well as other web technologies.

HTML version 2.0 became a standard in November 1995 (published as RFC 1866). Version 3.2 was adopted as a standard by the W3C in January of 1997, and the W3C released its HTML 4 recommendation in December of 1997. The current version of this recommendation, HTML 4.01, is the standard.

1.10 BASIC XHTML SYNTAX & SEMANTICS

1.10.1 - DOCUMENT TYPE DECLARATION
Every XHTML document must begin with a document type declaration. Each HTML specification provides such a declaration that can be used at the beginning of documents intended to conform with the specification. There are three flavors of both the HTML 4.01 and XHTML 1.0 specifications, each with its own document type declaration. The three flavors are:
o Strict: Initial version.
o Transitional: A superset of Strict HTML. It adds elements and attributes that should not be used if possible because they will likely be eliminated from HTML recommendations at some future time.
o Frameset: A superset of the Transitional flavor that includes a feature allowing several subwindows (frames) to be displayed within a browser's client area.
Many documents on the Web today begin with a document type declaration for the HTML 4.01 Transitional flavor.

1.10.2 WHITE SPACE IN CHARACTER DATA:
Character data is the information that lies outside the markup of the document, and to a large extent it is the textual content of the web page produced by the document. When a browser displays a document, any string of white space characters within character data is replaced by a single blank. For example:
    <body>
    <p>
    Hello World!


    This is my second HTML paragraph.
    </p>
    </body>
Although the text within the p element is typed into the HTML document as two paragraphs (there is a blank line between the two pieces of text), the browser displays all of the text as a single paragraph with a single space between the two sentences. A simple way to tell the browser that the text in this example should be displayed as two paragraphs is to use two p elements instead of one:

    <p> Hello World! </p>
    <p> This is my second HTML paragraph. </p>

1.10.3 UNRECOGNISED ELEMENTS AND ATTRIBUTES

A second feature of HTML that sometimes confuses beginning web authors is that browsers don't complain if a document contains element or attribute names that the browser does not recognize. For attributes with unrecognized names, the browser acts as if the attribute is not present at all. For unrecognized element names, the browser displays the content of the element as if the markup were not present.

    <head> <titl> HelloWorldBadElt.html </title> </head>
In this example, the browser treats the content of the titl element as if it were text typed directly within the head element. Text is not supposed to appear here, and the HTML standard does not specify how a browser should display such information. Mozilla chooses to display the text as if it were the initial content of the body.


One implication of HTML's handling of unrecognized tag names is that an HTML page may display correctly in a browser but still have typographical errors in its markup. For example, consider the following document body:
    <p> Hello World! </p>
    <l> This is my second HTML paragraph. </p>
The second paragraph mistakenly begins with an l tag. Since l is not a valid element name in HTML, this tag will be ignored. Mozilla treats text that is contained directly in the body element as if it were the content of a p element, so the page still displays as two paragraphs.
One simple way to catch such errors is to use an HTML validator. This is a program that will analyze your document and not only catch typographical errors, but also help you to ensure that the HTML document conforms to the standards of the HTML version.

1.10.4 SPECIAL CHARACTERS
A few characters must be used carefully in HTML documents. For example, the less-than symbol (<) is the special symbol used to begin tags. To include this symbol in character data, it is written as the reference &lt; where & begins the reference and ; ends it.

    Character    Reference
    <            &lt;
    >            &gt;
    &            &amp;
    "            &quot;
    '            &apos;
    ñ            &ntilde;
    α            &alpha;
    ∀            &forall;

    Example Entity and Character References
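The references in the table can be checked with the html module of Python's standard library. This is a small sketch rather than part of any browser: the helper names to_markup and to_text are hypothetical, while html.escape and html.unescape are the standard library functions doing the work.

```python
import html


def to_markup(text):
    """Replace <, > and & in character data with entity references."""
    return html.escape(text, quote=False)


def to_text(markup):
    """Convert entity and character references back into characters."""
    return html.unescape(markup)


print(to_markup("a < b & c > d"))
print(to_text("&lt;p&gt; uses &quot;&ntilde;&quot;, &alpha; and &forall;"))
```

Writing text through to_markup before placing it in a document keeps a stray < or & from being read as markup.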


1.10.5 ATTRIBUTES
The html element of any XHTML 1.0 document must contain an xmlns attribute specification. Each HTML element has a set of associated attributes that can be specified for it. The values of an element's attributes typically influence how the element is displayed or how it behaves, or they may supply identifying information. For example, the xmlns attribute identifies the XML namespace for the document, which can be considered identifying information.
All XHTML attribute specifications have the form shown for xmlns: white space separates the attribute name from the element name in the start tag of the element. The attribute name is followed by an equals sign (=), optionally preceded and followed by white space; and the value of the attribute, enclosed in quotes, follows the equals sign. The value may not contain the quote character that encloses it. For example, the attribute specification
    value = "Ain't this grand!"
is legal, but
    value = "He said, "She said", then sighed."
is not. However, references may appear within an attribute value, so
    value = "He said, &quot;She said&quot;, then sighed."
is valid. The &quot; references will be converted to double quotes when the document is parsed.
Multiple attribute specifications can be included within a single tag by separating the specifications with white space:
    <html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
o This start tag is compatible with software that understands HTML 4.01, which does not contain the xml:lang attribute, as well as with software that understands XML, which defines the xml:lang attribute for use across arbitrary XML-based languages.
o The value en denotes English.
o Multiple attribute specifications can appear in any order.
Restrictions on attribute values:
o Newline characters should not appear within an attribute value.
o Avoid including any leading or trailing white space, and avoid having multiple adjacent white space characters within attribute values.
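These rules can be applied automatically when markup is generated by a program. The sketch below, with the hypothetical helper name attribute_spec, collapses white space in a value and replaces double quotes with &quot; references before enclosing the value in double quotes; it relies on html.escape from Python's standard library.

```python
import html


def attribute_spec(name, value):
    """Build name="value" with white space normalized and quotes escaped."""
    # Remove newlines, leading/trailing and repeated white space.
    cleaned = " ".join(value.split())
    # quote=True also replaces " (and ') so the value cannot end the quotes.
    return f'{name}="{html.escape(cleaned, quote=True)}"'


print(attribute_spec("value", 'He said, "She said", then sighed.'))
print(attribute_spec("title", "  two\n   lines  "))
```

The first call reproduces the valid form of the attribute specification discussed above.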

1.11 SOME FUNDAMENTAL HTML ELEMENTS:
Example 1: Headings
    <html>
    <body>
    <h1>My First Heading</h1>
    <p>My first paragraph.</p>
    </body>
    </html>
- Heading elements produce section headings for an HTML document.
- h1 represents a top-level heading.
- h2 represents a subheading, and so on.
Example 2: Link: The Anchor Element
    <html>
    <body>
    <a href="http://www.w3schools.com"> This is a link</a>
    </body>
    </html>
The a (anchor) element creates a clickable link within a document.
Example 3: Images: The img Element
    <html>
    <body>
    <img src="s1.jpg" width="104" height="142" />
    </body>
    </html>
The img element causes a graphics-capable browser to display an image.

Example 4: Nested Elements
    <html>
    <body>


    <p><b>This text is bold</b></p>
    <p><strong>This text is strong</strong></p>
    <p><big>This text is big</big></p>
    <p><em>This text is emphasized</em></p>
    <p><i>This text is italic</i></p>
    <p><small>This text is small</small></p>
    <p>This is<sub> subscript</sub> and <sup>superscript</sup></p>
    </body>
    </html>
This is an example of nesting elements: one element is included as part of the content of another element.
Example 5: pre and br Element
    <html>
    <body>
    <pre>
    This is preformatted text.
    It preserves both spaces
    and line breaks.
    </pre>
    <p>The pre tag is good for displaying computer code:</p>
    <pre>
    for i = 1 to 10
         print i
    next i
    </pre>
    </body>
    </html>

Output:
    This is preformatted text.
    It preserves both spaces
    and line breaks.
    The pre tag is good for displaying computer code:
    for i = 1 to 10
         print i
    next i
Example 6: Horizontal Rule Element
    <html>
    <body>
    <p>The hr tag defines a horizontal rule:</p>
    <hr />
    <p>This is a paragraph</p>
    <hr />
    <p>This is a paragraph</p>
    <hr />
    <p>This is a paragraph</p>
    </body>
    </html>
Output:
    The hr tag defines a horizontal rule:
    ----------------------------------------
    This is a paragraph
    ----------------------------------------
    This is a paragraph
    ----------------------------------------
    This is a paragraph
The hr element inserts a horizontal line into the document.
Example 7: Image Link Element
    <html>
    <body>
    <p>Create a link of an image:
    <a href="default.asp">
    <img src="smiley.gif" alt="HTML tutorial" width="32" height="32" />
    </a></p>
    <p>No border around the image, but still a link:
    <a href="default.asp">
    <img border="0" src="smiley.gif" alt="HTML tutorial" width="32" height="32" />

    </a></p>
    </body>
    </html>
Example 8: Comment Element

    <html>
    <body>
    <!--This comment will not be displayed-->
    <p>This is a regular paragraph</p>
    </body>
    </html>

Example 9: Break Element
    <html>
    <body>
    <p>This is<br />a para<br />graph with line breaks</p>
    </body>
    </html>

HTML Images - The <img> Tag and the src Attribute
In HTML, images are defined with the <img> tag. The <img> tag is empty, which means that it contains attributes only and has no closing tag. To display an image on a page, you need to use the src attribute. src stands for "source". The value of the src attribute is the URL of the image to display.
Syntax for defining an image:
    <img src="url" alt="some_text" />
The URL points to the location where the image is stored. An image named "boat.gif", located in the "images" directory on "www.example.com", has the URL:

http://www.example.com/images/boat.gif.

The browser displays the image where the <img> tag occurs in the document. If an image tag is between two paragraphs, the browser shows the first paragraph, then the image, and then the second paragraph.

HTML Images - The alt Attribute
The required alt attribute specifies an alternate text for an image, to be used if the image cannot be displayed. The value of the alt attribute is an author-defined text:
    <img src="boat.gif" alt="Big Boat" />
The alt attribute provides alternative information for an image if a user for some reason cannot view it (because of a slow connection, an error in the src attribute, or because the user uses a screen reader).

HTML Images - Set Height and Width of an Image
The height and width attributes are used to specify the height and width of an image. The attribute values are specified in pixels by default:
    <img src="pulpit.jpg" alt="Pulpit rock" width="304" height="228" />

HTML <map> Tag
    <img src="planets.gif" width="145" height="126" alt="Planets" usemap="#planetmap" />
    <map name="planetmap">
    <area shape="rect" coords="0,0,82,126" href="sun.htm" alt="Sun" />
    <area shape="circle" coords="90,58,3" href="mercur.htm" alt="Mercury" />
    <area shape="circle" coords="124,58,8" href="venus.htm" alt="Venus" />
    </map>
The <map> tag is used to define a client-side image-map. An image-map is an image with clickable areas. The name attribute of the <map> element is required; it is associated with the usemap attribute of the <img> element and creates a relationship between the image and the map. The <map> element contains a number of <area> elements that define the clickable areas in the image-map.
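To make the role of the coords values concrete, the sketch below decides which area of the planets image-map contains a clicked point. It is an illustration only: it assumes that rect coords are left, top, right, bottom, that circle coords are center x, center y, radius, and that the first matching area wins; the name hit_test is hypothetical.

```python
import math

# (shape, coords, href) for each <area> of the planets image-map above.
AREAS = [
    ("rect", (0, 0, 82, 126), "sun.htm"),
    ("circle", (90, 58, 3), "mercur.htm"),
    ("circle", (124, 58, 8), "venus.htm"),
]


def hit_test(x, y, areas=AREAS):
    """Return the href of the first area containing (x, y), or None."""
    for shape, coords, href in areas:
        if shape == "rect":
            left, top, right, bottom = coords
            if left <= x <= right and top <= y <= bottom:
                return href
        elif shape == "circle":
            center_x, center_y, radius = coords
            if math.hypot(x - center_x, y - center_y) <= radius:
                return href
    return None  # the click fell outside every clickable area


print(hit_test(40, 60))
print(hit_test(124, 60))
```

A click at (40, 60) falls in the rectangle linked to sun.htm, and a click at (124, 60) falls in the circle linked to venus.htm.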


HTML <area> Tag

The <area> tag defines an area inside an image-map (an image-map is an image with clickable areas). The <area> element is always nested inside a <map> tag. The coords attribute specifies the x and y coordinates of an area in an image-map. The coords attribute is used together with the shape attribute to specify the size, shape, and placement of an area. The coordinates of the top-left corner of the image are 0,0.

    coords value            Meaning
    x1,y1,x2,y2             Specifies the coordinates of the top-left (x1,y1) and bottom-right (x2,y2) corners of the rectangle (for shape="rect")
    x,y,radius              Specifies the coordinates of the circle center and the radius (for shape="circle")
    x1,y1,x2,y2,..,xn,yn    Specifies the coordinates of the corners of the polygon. If the first and last coordinate pairs are not the same, the browser closes the polygon by joining the last pair back to the first (for shape="poly")

1.12 RELATIVE URLS:
URLs are used as the values of some attributes in HTML documents. By default, the webapps/ROOT subdirectory within your JWSDP 1.3 installation directory is the document base for the root URL path / of your Tomcat web server. With the web server running on your machine, browsing to http://localhost:8080/MultiFile.html will cause your server to look for a file named MultiFile.html in the webapps/ROOT directory. For example, suppose this document contains
    <img src="valid-xhtml10.png" />
and is served as http://localhost:8080/MultiFile.html from webapps/ROOT. The src value is then resolved relative to the document's URL, so the server looks for the image file in the same directory, webapps/ROOT. A complete URL, which begins with a scheme, is called an absolute URL. Each relative URL is shorthand for some absolute URL.


A relative URL that looks like a file name, such as valid-xhtml10.png, is converted to an absolute URL by replacing any characters following the final slash (/) in the base URL with the relative URL. A relative URL consisting only of a fragment identifier is instead appended to the base URL. For example, if a document at http://www.example.org/PageWithAnchor.html contained the markup <a href="#section1">..., then the corresponding absolute URL would be http://www.example.org/PageWithAnchor.html#section1.

Absolute URLs Corresponding to Relative URLs When the Base URL is http://www.example.org/a/b/c.html

    Relative URL     Absolute URL
    d/e.html         http://www.example.org/a/b/d/e.html
    ../f.html        http://www.example.org/a/f.html
    ../../g.html     http://www.example.org/g.html
    ../h/i.html      http://www.example.org/a/h/i.html
    /j.html          http://www.example.org/j.html
    /k/l.html        http://www.example.org/k/l.html
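The conversions in the table can be reproduced with urljoin from Python's standard library, which resolves a relative URL against a base URL. The wrapper name absolute is hypothetical; the base URL and relative URLs are the ones used in the table.

```python
from urllib.parse import urljoin

BASE = "http://www.example.org/a/b/c.html"


def absolute(relative, base=BASE):
    """Resolve a relative URL against the base URL of the document."""
    return urljoin(base, relative)


for relative in ["d/e.html", "../f.html", "../../g.html",
                 "../h/i.html", "/j.html", "/k/l.html"]:
    print(relative, "->", absolute(relative))

# A fragment-only relative URL is appended to the document's own URL.
print(absolute("#section1", "http://www.example.org/PageWithAnchor.html"))
```

Each printed pair should match the corresponding row of the table.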

1.13 LISTS
The three types of lists supported by HTML:
o Unordered: A bullet list
o Ordered: A numbered list
o Definition: A list of terms and definitions for each
    <ul>
    <li>Bulleted list item</li>
    <li>Bulleted list item 2</li>
    </ul>
    <ol>
    <li>Numbered list item</li>
    <li>Numbered list item 2</li>
    </ol>
    <dl>
    <dt>Term</dt>
    <dd>Definition of term</dd>
    <dt>Term 2</dt>
    <dd>Definition of term 2</dd>

    </dl>
First, the type of list is indicated by using either a ul (unordered list), ol (ordered list), or dl (definition list) element. All three elements are block elements, so a list by default begins on a new line when displayed in the browser. Each item in an unordered or ordered list is made the content of an li (list item) element. For a definition list, each term is made the content of a dt (definition term) element, and each term description is made the content of a dd element that immediately follows the dt element being described. Lists can be nested to produce an outline layout. For example, the markup
    <ul>
    <li>Bulleted list item
    <ul>
    <li>Nested list item</li>
    <li>Nested list item 2</li>
    </ul>
    </li>
    <li>Bulleted list item 2</li>
    </ul>
places a second bulleted list inside the first item of the outer list.

1.14 TABLES
HTML provides a fairly sophisticated model for presenting data in tabular form. Columns and rows will automatically size to contain their data, although there are also various ways to specify column widths; individual table cells can span multiple rows and/or columns; header and/or footer rows can be supplied; and so on. For example, a table of student grades could be written as follows:

    <table border="5">
    <tr> <td>Kim</td><td>100</td><td>89</td> </tr>
    <tr> <td>Sandy</td><td>78</td><td>92</td> </tr>
    <tr> <td>Taylor</td><td>83</td><td>73</td>

    </tr>
    </table>
A tr (table row) element is used to contain each row. Within a row, a td (table data) element marks each element of the row. The number of rows is determined by the number of tr elements in the table, and the number of columns is determined by the maximum number of td elements contained within any row. The border attribute in the table start tag tells the browser to display the table using a 5-pixel-wide border and 1-pixel-wide rules. In general, if any positive integer n is used for the value of border, then an n-pixel-wide border and 1-pixel-wide rules will be displayed. A value of 0 for this attribute turns off both the border and the rules.
    <table border="5">
    <caption> COSC 400 Student Grades </caption>
    <tr> <td>&nbsp;</td><td>&nbsp;</td><th colspan="2">Grades</th> </tr>
    <tr> <td>&nbsp;</td><th>Student</th><th>Exam 1</th><th>Exam 2</th> </tr>
    <tr> <th rowspan="2">Undergraduates</th><td>Kim</td><td>100</td><td>89</td> </tr>
    <tr> <td>Sandy</td><td>78</td><td>92</td> </tr>
    <tr> <th>Graduates</th><td>Taylor</td><td>83</td><td>73</td> </tr>
    </table>
The caption element is used to define a caption for the table. If a caption element is used with a table, the caption start tag must appear immediately after the start tag of the table element.


The second new element, th (table header), is much like the td element, except that a typical browser will format the content of a th element in boldface and center it horizontally within the column. The markup <td>&nbsp;</td> is used to indicate that the first column of the second row should be left blank. The &nbsp; reference is included to ensure that the cell rules are displayed. The colspan attribute tells the browser that a table element should cover more than one column, and rowspan is used for an element that covers more than one row.
For performance and other reasons, if a large image is to be displayed on a web page, it will often be sliced into several smaller images that are downloaded separately and displayed next to one another to recreate the large image. Tables are frequently used to position the smaller images adjacent to one another so that they appear to be a single larger image.
The table element has two attributes that control spacing within the table: cellspacing and cellpadding. The cellspacing attribute determines the amount of space between two adjacent cells, or between a side of a cell and the border of the table, while cellpadding determines the amount of space between the content of a cell and the edge of the cell.
    <table border="1" cellpadding="5">
    <tr> <th>cellspacing</th><th>cellpadding</th><th>Example</th> </tr>
    <tr>
    <td>10</td><td>10</td>
    <td id="nested">
    <table border="1" cellspacing="10" cellpadding="10">
    <tr>
    <td> <img src="CFP1.png" style="display:block" alt="Cucumber and Flower Pot" height="86" width="67" /> </td>
    <td> <img src="CFP2.png" style="display:block" alt="Cucumber and Flower Pot" height="86" width="45" /> </td>
    </tr>

    </table>
    </td>
    </tr>
The td element with id value nested contains another table. Tables can be nested to any desired depth. Each img element assigns a value of display:block to its style attribute, which indicates that the image should be treated as a block element for display purposes.

1.15 FRAMES
HTML frames are essentially a means of having several browser windows open within a single larger window. Such a window is created by using one or more frameset elements after the head element, rather than using a body element as all of our previous pages have done. The document type declaration is also different for framed pages than it is for standard web pages.
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN"
      "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml">
    <head> <title>Java 2 Platform SE v1.4.2</title> </head>
    <frameset cols="20%,80%">
    <frameset rows="1*,2*">
    <frame src="overview-frame.html" id="upperLeftFrame" name="upperLeftFrame"></frame>
    <frame src="allclasses-frame.html" id="lowerLeftFrame" name="lowerLeftFrame"></frame>
    </frameset>
    <frame src="overview-summary.html" id="rightFrame" name="rightFrame"></frame>
    </frameset>
    </html>
The first frameset start tag says to create two rectangular subspaces, or views, within the browser window.


The first (and therefore leftmost) of these subspaces covers 20% of the width of the browser window, and the second covers the remaining 80% of the window. This top-level frameset element contains two child elements: another frameset and a frame. The child frameset specifies that its subspace (the left 20% of the browser window) is further divided into two views. Since these two views are specified using the rows attribute, they are stacked one on top of the other and both occupy the full width of the child frameset's subspace. The notation 1*,2* indicates that the vertical space should be allocated so that the second (and therefore lower) view is twice as tall as the first view. Framesets can be defined recursively, with one frameset defined within another.
Each frame is essentially a browser window, which occupies a subspace of the screen as defined by the frameset(s) containing the frame. In the example considered, the frame named upperLeftFrame occupies the upper left corner (20% wide by 1/3 high) of the browser window, the frame lowerLeftFrame the lower left corner (20% wide by 2/3 high), and the frame rightFrame the right side (80% wide by 100% high). The src attribute of a frame tells the browser the URL of a document to be loaded into the frame initially. If an HTML document is loaded into the frame and the user clicks a hyperlink within that frame, then the document named in the href attribute of the hyperlink's a element will be loaded into the frame. Other frames in the browser window will not be affected.
However, it is also possible for activity in one frame to cause a change in another frame. For example, the HTML contained in the frame named upperLeftFrame might contain an anchor such as the following:
    <a href="java/applet/package-frame.html" target="lowerLeftFrame">

This anchor specifies lowerLeftFrame as the value of its target attribute. If the user clicks the link corresponding to this anchor, the document at the URL specified by the anchor's href attribute will be loaded into lowerLeftFrame.
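The way the cols and rows values divide the window can be worked through numerically. The following Python sketch is only an illustration of the arithmetic, not how any browser is implemented: the function name split_lengths and the 1000 by 600 pixel window are assumptions made for the example, and only percentage and relative (n*) lengths are handled.

```python
def split_lengths(spec, total):
    """Divide `total` pixels among the comma-separated lengths in `spec`.

    Only percentages ("20%") and relative lengths ("1*", "2*", "*") are
    handled; absolute pixel values are omitted to keep the sketch short.
    """
    parts = [part.strip() for part in spec.split(",")]
    sizes = [0.0] * len(parts)
    weights = [0.0] * len(parts)
    for i, part in enumerate(parts):
        if part.endswith("%"):
            sizes[i] = total * float(part[:-1]) / 100
        elif part.endswith("*"):
            # A bare "*" counts as "1*".
            weights[i] = float(part[:-1] or 1)
        else:
            raise ValueError("unsupported length: " + part)
    # Relative lengths share whatever space the percentages left over.
    remaining = total - sum(sizes)
    weight_sum = sum(weights)
    for i, weight in enumerate(weights):
        if weight:
            sizes[i] = remaining * weight / weight_sum
    return sizes


# Assumed window size, chosen only to make the numbers easy to read.
WINDOW_WIDTH, WINDOW_HEIGHT = 1000, 600

left_width, right_width = split_lengths("20%,80%", WINDOW_WIDTH)
upper_height, lower_height = split_lengths("1*,2*", WINDOW_HEIGHT)

print("upperLeftFrame:", left_width, "x", upper_height)   # 200.0 x 200.0
print("lowerLeftFrame:", left_width, "x", lower_height)   # 200.0 x 400.0
print("rightFrame:    ", right_width, "x", WINDOW_HEIGHT)  # 800.0 x 600
```

With these assumed dimensions, the lower left frame is twice as tall as the upper left frame, which matches the 1/3 and 2/3 split described above.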


One use of frames, then, is to provide a relatively easy means of writing HTML documents that provide navigation tools. A navigation page can be loaded into its own frame, and a second frame can be specified as the target for each of the anchor elements contained in the navigation page. A nonframe alternative is to include the navigation links directly in every page of the collection.

Disadvantages of using frames:
i) Frames can cause confusion for end users.
ii) Frames assume that the document is going to be displayed on a monitor large enough to comfortably accommodate them.

A key exception is that frames are still used for certain specialized technical applications, such as documents produced using the Javadoc documentation system, in which the end users may be expected to have the expertise needed to deal with the vagaries of frames and to be viewing documents on standard monitors.

1.16 HTML FORMS

The <form> tag is used to create an HTML form for user input. The <form> element can contain one or more of the following form elements:
o <input>
o <textarea>
o <button>
o <select>
o <option>

An HTML form is used to pass data to a server.

Example: An HTML form with two input fields and one submit button:

<form action="form_action.asp" method="get">
  First name: <input type="text" name="fname" /><br />
  Last name: <input type="text" name="lname" /><br />
  <input type="submit" value="Submit" />
</form>
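To see what "passing data to a server" means for a form that uses method="get", the sketch below builds the query string that a browser would append to the action URL. It uses Python's standard urllib.parse module; the field values ("Mary Ann", "Lee") are made-up user input, and the sketch does not contact any server.

```python
from urllib.parse import urlencode, parse_qs

# Hypothetical values typed into the fname and lname text boxes.
fields = {"fname": "Mary Ann", "lname": "Lee"}

# With method="get", name=value pairs are joined with "&" and appended
# to the action URL after a "?". Spaces are encoded as "+".
query = urlencode(fields)
url = "form_action.asp?" + query
print(url)  # form_action.asp?fname=Mary+Ann&lname=Lee

# The server side decodes the same string back into names and values.
received = parse_qs(query)
print(received["fname"][0])  # Mary Ann
```

Each control contributes its name attribute as the key, which is why every input in the form needs a name.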

TextArea:

<textarea rows="2" cols="20">
from basic HTML to advanced XML, SQL, ASP, and PHP.
</textarea>

The <textarea> tag defines a multi-line text input control. A text area can hold an unlimited number of characters, and the text renders in a fixed-width font (usually Courier). The size of a text area can be specified by the cols and rows attributes or, even better, through the CSS height and width properties.

Select: Create a drop-down list with four options:

<select>
  <option value="volvo">Volvo</option>
  <option value="saab">Saab</option>
  <option value="mercedes">Mercedes</option>
  <option value="audi">Audi</option>
</select>

The <select> tag is used to create a drop-down list. The <option> tags inside the <select> element define the available options in the list.

Input: The <input> tag is used to collect user information. <input> elements are used within a <form> element to declare input controls that allow users to input data. An input field can vary in many ways, depending on the type attribute.

CheckBox: <input type="checkbox" name="c1" />

This value for type creates a checkbox-style control in the form.

Radio: <input type="radio" name="r1" value="p" />


This value for type creates a radio-button-style control in the form. It is designed to offer the user a choice from predetermined options. However, a group of radio buttons accepts only one response from among its options.

Text: The first possible value for the type attribute is text, which creates a single-line text box.

First name: <input type="text" name="fname" />

Password: The password option is nearly identical to the text option, except that it displays a bullet in place of each typed character.

Password: <input type="password" name="fname" />

Example: Form Validation

<HTML>
<HEAD> <TITLE> FORMS</TITLE> </HEAD>
<BODY><center><h2> APPLICATION FORM</h2>
<TABLE BORDER="5">
<FORM NAME="F1" METHOD="POST">
<TR>
<TD> NAME:</TD><TD><INPUT TYPE="text" NAME="name"> </TD>
</TR>
<TR>
<TD> EMAIL ID:</TD><TD><INPUT TYPE="text" NAME="emailid"> </TD>
</TR>
<TR>
<!-- Both radio buttons share one NAME so that only one of them can be selected. -->
<TD> EXPERIENCE:</TD><TD><INPUT TYPE="radio" NAME="experience" value="FRESH">FRESH </TD>
<TD><INPUT TYPE="radio" NAME="experience" value="EXPERIENCE">EXPERIENCE

</TD> </TR>
<TR>
<TD> DATE OF BIRTH:</TD>
<TD> <SELECT NAME="date">
<OPTION VALUE="1">1</OPTION> <OPTION VALUE="2">2</OPTION> <OPTION VALUE="3">3</OPTION>
<OPTION VALUE="4">4</OPTION> <OPTION VALUE="5">5</OPTION> <OPTION VALUE="6">6</OPTION>
<OPTION VALUE="7">7</OPTION> <OPTION VALUE="8">8</OPTION> <OPTION VALUE="9">9</OPTION>
<OPTION VALUE="10">10</OPTION> <OPTION VALUE="11">11</OPTION> <OPTION VALUE="12">12</OPTION>
<OPTION VALUE="13">13</OPTION> <OPTION VALUE="14">14</OPTION> <OPTION VALUE="15">15</OPTION>
<OPTION VALUE="16">16</OPTION> <OPTION VALUE="17">17</OPTION> <OPTION VALUE="18">18</OPTION>
</SELECT> </TD>
<TD> <SELECT NAME="month">
<OPTION VALUE="1">1</OPTION> <OPTION VALUE="2">2</OPTION> <OPTION VALUE="3">3</OPTION>
<OPTION VALUE="4">4</OPTION> <OPTION VALUE="5">5</OPTION> <OPTION VALUE="6">6</OPTION>
<OPTION VALUE="7">7</OPTION> <OPTION VALUE="8">8</OPTION> <OPTION VALUE="9">9</OPTION>
<OPTION VALUE="10">10</OPTION> <OPTION VALUE="11">11</OPTION> <OPTION VALUE="12">12</OPTION>
</SELECT> </TD>
<TD>

<SELECT NAME="year">
<OPTION VALUE="1991">1991</OPTION> <OPTION VALUE="1992">1992</OPTION> <OPTION VALUE="1993">1993</OPTION>
<OPTION VALUE="1994">1994</OPTION> <OPTION VALUE="1995">1995</OPTION> <OPTION VALUE="1996">1996</OPTION>
<OPTION VALUE="1997">1997</OPTION> <OPTION VALUE="1998">1998</OPTION> <OPTION VALUE="1999">1999</OPTION>
<OPTION VALUE="2000">2000</OPTION> <OPTION VALUE="2011">2011</OPTION> <OPTION VALUE="2012">2012</OPTION>
</SELECT> </TD>
</TR>
<TR>
<TD>EDUCATIONAL QUALIFICATION:</TD>
<TD> <SELECT NAME="edu">
<OPTION VALUE="B.TECH">B.TECH</OPTION>
<OPTION VALUE="B.E">B.E</OPTION>
<OPTION VALUE="Bsc">Bsc</OPTION>
</SELECT> </TD>
</TR>
<TR>
<TD>SPECIALISATION:</TD>
<!-- This list needs its own NAME and values that match the labels shown. -->
<TD> <SELECT NAME="specialisation">
<OPTION VALUE="JAVA">JAVA</OPTION>
<OPTION VALUE="C++">C++</OPTION>
<OPTION VALUE="ORACLE">ORACLE</OPTION>
</SELECT> </TD>
</TR>
<TR>
<TD>SKILLS:</TD>
<TD> <INPUT TYPE="checkbox" NAME="C1" value="C++">C++ </TD>
<TD>

<INPUT TYPE="checkbox" NAME="C2" value="JAVA">JAVA </TD> </TR>

<TR>
<TD>ENTER COMMENTS BELOW</TD>
<TD><TEXTAREA NAME="ta" rows="5" cols="30"></TEXTAREA></td>
</tr>
</FORM>
</table></center>
</body>
</html>

1.17 XML

o XML stands for EXtensible Markup Language.
o XML is a markup language much like HTML.
o XML was designed to describe data.
o XML tags are not predefined; users define their own tags.
o XML is a W3C Recommendation.

The Difference Between XML and HTML

XML is not a replacement for HTML. XML and HTML were designed with different goals:
o XML was designed to describe data, with the focus on what the data is.
o HTML was designed to display data, with the focus on how the data looks.

HTML is about displaying information, while XML is about carrying information.

Example:
i) <?xml version="1.0"?>
ii) <note>
iii) <to>Tove</to>
iv) <from>Jani</from>
v) <body>Hai!</body>
vi) </note>

Line (i) is the XML declaration, which specifies the XML version. Line (ii) is the start tag of the root element. Lines (iii), (iv) and (v) are child elements of the root. Line (vi) is the end tag of the root element.
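The root and child structure of the note example can be inspected with a parser. This is a minimal sketch using Python's standard xml.etree.ElementTree module, with the note document written out with the version value quoted as XML requires; the variable names are chosen only for this illustration.

```python
import xml.etree.ElementTree as ET

# The note document from the example above.
NOTE = """<?xml version="1.0"?>
<note>
<to>Tove</to>
<from>Jani</from>
<body>Hai!</body>
</note>"""

root = ET.fromstring(NOTE)
print("root element:", root.tag)  # root element: note

# Iterating over an element yields its child elements in document order.
children = [(child.tag, child.text) for child in root]
for tag, text in children:
    print("child element:", tag, "=", text)
```

Because the tags are defined by the author of the document rather than by the language, the program has to know in advance that a note contains to, from and body elements.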

1.17.1 - XML SYNTAX:

All XML Elements Must Have a Closing Tag

In HTML, some elements do not have to have a closing tag:

<p>This is a paragraph.
<br>

In XML, it is illegal to omit the closing tag. All elements must have a closing tag:

<p>This is a paragraph.</p>
<br />

XML Tags are Case Sensitive

XML tags are case sensitive. The tag <Letter> is different from the tag <letter>. Opening and closing tags must be written with the same case:

<Message>This is incorrect</message>
<message>This is correct</message>

XML Elements Must be Properly Nested

In HTML, you might see improperly nested elements:

<b><i>This text is bold and italic</b></i>

In XML, all elements must be properly nested within each other:

<b><i>This text is bold and italic</i></b>

XML Documents Must Have a Root Element

XML documents must contain one element that is the parent of all other elements. This element is called the root element.

<root>
  <child>
    <subchild>.....</subchild>
  </child>
</root>
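The rules listed so far can be checked mechanically, because an XML parser rejects any document that breaks them. The sketch below, which assumes Python's standard xml.etree.ElementTree parser and a helper name (is_well_formed) invented for this example, feeds the snippets above to the parser and reports which ones are accepted.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts `text` as an XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


samples = {
    "missing closing tag": "<p>This is a paragraph.",
    "closing tag present": "<p>This is a paragraph.</p>",
    "mismatched case": "<Message>This is incorrect</message>",
    "matching case": "<message>This is correct</message>",
    "improper nesting": "<b><i>This text is bold and italic</b></i>",
    "proper nesting": "<b><i>This text is bold and italic</i></b>",
    "two root elements": "<child>one</child><child>two</child>",
    "single root": "<root><child><subchild>.....</subchild></child></root>",
}

results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(label, "->", "accepted" if ok else "rejected")
```

An HTML browser would display most of the rejected snippets without complaint, which is the practical difference between the two languages that this section describes.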

XML Attribute Values Must be Quoted XML elements can have attributes in name/value pairs just like in HTML. In XML, the attribute values must always be quoted.


Study the two XML documents below. The first one is incorrect, the second is correct:

<note date=12/11/2007>
<to>Tove</to>
<from>Jani</from>
</note>

<note date="12/11/2007">
<to>Tove</to>
<from>Jani</from>
</note>

The error in the first document is that the value of the date attribute in the note element is not quoted.

Well-formed XML documents: A well-formed document is one that conforms to the XML syntax rules.

Valid XML documents: A valid XML document is a well-formed document that also conforms to the rules of a Document Type Definition (DTD).

1.17.2 DTD

XHTML 1.0 is defined by a set of text files known collectively as an XML document type definition (DTD). The following example shows the basic elements of a DTD:

<!ELEMENT html (head, body)>
<!ATTLIST html
  lang NMTOKEN #IMPLIED
  xml:lang NMTOKEN #IMPLIED
  dir (ltr|rtl) #IMPLIED
  id ID #IMPLIED
  xmlns CDATA #FIXED 'http://www.w3.org/1999/xhtml'>
<!ENTITY gt "&#62;">

The first of these lines is an example of an element type declaration. Element type declarations are used to specify the set of all valid elements in the language defined by the DTD. The html element must have two children, a head element followed by a body element. The second line begins a tag that represents an XML attribute list declaration.

This declaration says that the html element type has five attributes: lang, xml:lang, dir, id, and xmlns. It also provides information such as the valid set of values for each attribute and default value information. The final line is an example of an XML entity declaration. Such a tag is essentially a macro definition, associating the name gt (an entity) with the string &#62;.

1.17.3 ELEMENT TYPE DECLARATION

<!ELEMENT html (head, body)>

The string immediately following ELEMENT is the name of the element type being declared, in this case html. The XHTML DTD contains exactly one element type declaration for each element in the language.

<!ELEMENT select (optgroup|option)+>

The content specification in this declaration is a choice modified with the + iterator character. It says that a select element may contain any number of optgroup and option elements in any order, as long as at least one of these elements appears. Furthermore, sequence and choice specifications (optionally with iterator characters) can be nested, and an element name within a sequence or choice may have an iterator character suffixed to it.

1.17.4 - ATTRIBUTE LIST DECLARATIONS

An attribute list declaration is included in the DTD for each element that has attributes.

An attribute list declaration begins with the keyword ATTLIST followed by an element type name. It specifies the names of all attributes of the named element, the type of data that can be used as the value of each attribute, and default value information.

For example, consider the attribute list declaration shown earlier in Section 1.17.2:

<!ATTLIST html
  lang NMTOKEN #IMPLIED
  xml:lang NMTOKEN #IMPLIED
  dir (ltr|rtl) #IMPLIED
  id ID #IMPLIED
  xmlns CDATA #FIXED 'http://www.w3.org/1999/xhtml'>

Each line after the first is a sequence of three items: the attribute name, the attribute type, and the default declaration. The attribute type specifies the type of data that may be specified as the attribute value.

Attribute Type              Syntax                        Usage
Name token                  NMTOKEN                       Name (word)
Enumerated                  (string1 | string2 | ... )    List of all possible attribute values
Identifier                  ID                            Type for id attribute
Identifier reference        IDREF                         Reference to an id attribute value
Identifier reference list   IDREFS                        List of references to id attribute values
Character data              CDATA                         Arbitrary character data
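As a rough illustration of how a validator might use these attribute types, the Python sketch below checks a value against three of them. It is a simplification under stated assumptions: the function name is_valid is invented for the example, the NMTOKEN pattern covers only ASCII name characters (the XML specification allows many more), and ID, IDREF and IDREFS are left out because checking them requires looking at the whole document.

```python
import re

# Simplified, ASCII-only pattern for a name token (an assumption of
# this sketch; real XML name characters include non-ASCII letters).
NMTOKEN = re.compile(r"[A-Za-z0-9._:-]+")


def is_valid(value, attr_type):
    """Check `value` against an attribute type from the table above.

    An enumerated type is given as a tuple of its allowed strings.
    """
    if isinstance(attr_type, tuple):
        return value in attr_type
    if attr_type == "NMTOKEN":
        return NMTOKEN.fullmatch(value) is not None
    if attr_type == "CDATA":
        return True  # arbitrary character data is always acceptable
    raise ValueError("type not handled in this sketch: " + attr_type)


# Types taken from the ATTLIST declaration for the html element.
DIR_TYPE = ("ltr", "rtl")

print(is_valid("en-US", "NMTOKEN"))  # True
print(is_valid("en US", "NMTOKEN"))  # False: a name token has no spaces
print(is_valid("rtl", DIR_TYPE))     # True
print(is_valid("up", DIR_TYPE))      # False: not in the enumeration
```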

Default Declaration                   Meaning
#IMPLIED                              No default value provided by DTD; attribute optional
#FIXED followed by any valid value    Default value provided by DTD; cannot be changed
Any valid value                       Default value provided by DTD; may be overridden by user
#REQUIRED                             No default value provided by DTD; attribute required

1.17.5 - ENTITY DECLARATIONS

An XML DTD can contain entity declarations, each of which begins with the keyword ENTITY followed by an entity name and its replacement text, such as

<!ENTITY gt "&#62;">

This statement defines the value of the entity reference &gt; to be the string &#62;, which in turn is an XML character reference to the greater-than character (which corresponds to code point 62 in the Unicode Standard). An entity such as gt is known as a general entity and can only be referenced from within documents defined by the DTD. XML also provides for a different type of entity that can be referenced from within DTDs and not from documents. Such entities are called parameter entities.
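The replacement of general entity references can be observed with a parser. The sketch below assumes Python's standard xml.etree.ElementTree module; the entity name sender and the small note document are invented for the illustration, and the entity is declared in an internal DTD subset so that the example is self-contained.

```python
import xml.etree.ElementTree as ET

# "sender" is a hypothetical general entity declared in an internal DTD
# subset; &gt; and &#62; both stand for the greater-than character.
DOC = """<!DOCTYPE note [
  <!ENTITY sender "Jani">
]>
<note><from>&sender;</from><body>2 &gt; 1 and 2 &#62; 1</body></note>"""

root = ET.fromstring(DOC)
sender_text = root.find("from").text
body_text = root.find("body").text

print(sender_text)  # Jani
print(body_text)    # 2 > 1 and 2 > 1
print(chr(62))      # > (code point 62 is the greater-than character)
```

The parser substitutes the replacement text before the program sees the element content, which is why an entity declaration behaves like a macro definition.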


A parameter entity declaration is indicated in the DTD by following the ENTITY keyword with a percent sign (%), as in these two declarations taken from the XHTML 1.0 Strict DTD:

1.18 CREATING HTML DOCUMENT

HTML documents can be created in a number of ways. One way is to open a simple text editor, such as Notepad in Microsoft Windows, and simply type in the markup and content of the document. A slightly more automated approach is to use an editor such as Emacs that can be customized to provide certain HTML/XML-specific features, such as tag highlighting, smart tabbing, automated generation of closing tags, and much more.

Higher-level tools generate HTML from user input. For example, you can create a document using familiar word-processing features and have the word processor automatically convert the text formatting and other markup-type operations you applied to the document into an equivalent HTML document. Several applications are available that provide word-processing-type functionality. The Mozilla 1.4 package includes a Composer feature under the Window menu. Composer appears to be like many other word processors: it has WYSIWYG features for selecting text size and weight. It also incorporates many HTML-specific features:
i) Ability to insert a named anchor element.
ii) Ability to edit the HTML document directly.

HTML documents can be generated directly or using HTML tools.
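As a very small illustration of the tool-based approach, the Python sketch below generates an HTML document from plain text supplied by the user. It is not how any particular editor works: the function name make_page and the sample text are assumptions made for the example. The standard html.escape function is used so that characters such as < in the user's text are written as entity references rather than being mistaken for markup.

```python
from html import escape


def make_page(title, paragraphs):
    """Build a minimal HTML document from a title and plain-text paragraphs."""
    body = "\n".join("    <p>" + escape(text) + "</p>" for text in paragraphs)
    return (
        "<html>\n"
        "  <head>\n"
        "    <title>" + escape(title) + "</title>\n"
        "  </head>\n"
        "  <body>\n"
        + body + "\n"
        "  </body>\n"
        "</html>\n"
    )


# Hypothetical user input.
page = make_page("My First Page", ["Tags are written between < and >."])
print(page)
```

Writing the returned string to a file with an .html extension gives a document that can be opened in a browser.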
