
A SEMINAR REPORT ON CGI PROGRAMMING

Submitted By: BARAD DIPA B. (090210107048)
Guided By: PROF. R. P. SANDHANI

Department of Computer Engineering, Government Engineering College, Bhavnagar (GTU)


Oct-2011-12

GOVERNMENT ENGINEERING COLLEGE, BHAVNAGAR

This is to certify that Shri BARAD DIPA BHIKHUBHAI, Roll No. 48, B.E. (C.E.) Sem-V class, has satisfactorily completed the course in Seminar on COMMON GATEWAY INTERFACE within the four walls of Government Engineering College, Bhavnagar-364002.

Guide: ____________ PROF. R. P. SANDHANI

Head of the Department: PROF. G. M. CHAUHAN

Examiners:

ABSTRACT
The main theme of this project is the design and creation of virtual hypermedia documents. A few things to note: all of the CGI examples are in Perl, although some of the common modules are presented in several of the other languages discussed. The phrases "CGI programs" and "CGI scripts" are used interchangeably throughout the report. The report covers client-server interaction, including a look at environment variables, working with forms, and server side includes (SSI).

From there, we discuss CGI programs that return virtual documents, using C and Java.

Finally, the report covers techniques for debugging CGI programs and lists some common mistakes and methods for finding programming errors.

ACKNOWLEDGEMENT
"Vision without action is a dream; action without vision is passing time; but action with vision can change the world." Action is necessary to turn our dream into reality. Our dream is to develop this project on CGI. To complete a project successfully, one needs help, understanding, and coordination from all those who are directly or indirectly involved in it.

Many people have contributed to make this project a reality. I would like to express my gratitude to Mr. Bhikhubhai Barad and Mr. Rahul B. Barad for their guidance throughout the project.

CONTENTS

CERTIFICATE
ABSTRACT
ACKNOWLEDGEMENT

INTRODUCTION
    Introduction of CGI Programming
    Internal Workings of CGI
    Application of CGI
    Working CGI Applications

Programming in CGI
    Introduction
    C/C++ (UNIX, Windows, Macintosh)
    C Shell (UNIX only)
    Perl (UNIX, Windows, Macintosh)
    Tcl (UNIX only)
    Visual Basic (Windows only)
    AppleScript (Macintosh only)

Web Programming Languages
    Introduction
    Characteristics of Web Programming Languages
    HTML (HyperText Markup Language)
    Languages and Interfaces

CGI Programming in C
    Using a C Program as a CGI Script
    The I/O Libraries
    Code Structure
    CGI Environment Variables

CGI Programming in Java
    Server Side Input Handling in Java
    Java Output
    Java Compilation in UNIX

Chapter 1

INTRODUCTION

INTRODUCTION OF CGI PROGRAMMING

CGI (Common Gateway Interface) programs are programs that reside on, and are run by, a web server. They are normally invoked when a user clicks a button or link in a browser. CGI programs usually perform some task, such as running a search or storing information on the server, and normally generate a dynamic HTML page in response to the user's request.

As you traverse the vast frontier of the World Wide Web, you will come across documents that make you wonder, "How did they do this?" These documents could consist of, among other things, forms that ask for feedback or registration information, imagemaps that allow you to click on various parts of the image, counters that display the number of users that accessed the document, and utilities that allow you to search databases for particular information. In most cases, you'll find that these effects were achieved using the Common Gateway Interface, commonly known as CGI.

One of the Internet's worst-kept secrets is that CGI is astoundingly simple. That is, it's trivial in design, and anyone with an iota of programming experience can write rudimentary scripts that work.

It's only when your needs are more demanding that you have to master the more complex workings of the Web. In a way, CGI is easy the same way cooking is easy: anyone can toast a muffin or poach an egg. It's only when you want a Hollandaise sauce that things start to get complicated.

CGI is the part of the Web server that can communicate with other programs running on the server. With CGI, the Web server can call up a program, while passing user-specific data to the program (such as what host the user is connecting from, or input the user has supplied using HTML form syntax). The program then processes that data and the server passes the program's response back to the Web browser.

CGI isn't magic; it's just programming with some special types of input and a few strict rules on program output. Everything in between is just programming. Of course, there are special techniques that are particular to CGI, and that's what this report is mostly about. But underlying it all is the simple model shown in Figure 1.1.

Figure 1.1: Simple diagram of CGI

Internal Workings of CGI

So how does the whole interface work? Most servers expect CGI programs and scripts to reside in a special directory, usually called cgi-bin, and/or to have a certain file extension. (These configuration parameters are discussed in the Configuring the Server section in this chapter.) When a user opens a URL associated with a CGI program, the client sends a request to the server asking for the file.

For the most part, the request for a CGI program looks the same as it does for all Web documents. The difference is that when a server recognizes that the address being requested is a CGI program, the server does not return the file contents verbatim. Instead, the server tries to execute the program. Here is what a sample client request might look like:

    GET /cgi-bin/welcome.pl HTTP/1.0
    Accept: www/source
    Accept: text/html
    Accept: image/gif
    User-Agent: Lynx/2.4 libwww/2.14
    From: shishir@bu.edu

This GET request identifies the file to retrieve as /cgi-bin/welcome.pl. Since the server is configured to recognize all files in the cgi-bin directory tree as CGI programs, it understands that it should execute the program instead of relaying it directly to the browser. The string HTTP/1.0 identifies the communication protocol to use.

Once the CGI program starts running, it can either create and output a new document, or provide the URL to an existing one. On UNIX, programs send their output to standard output (STDOUT) as a data stream. The data stream consists of two parts. The first part is either a full or partial HTTP header that (at minimum) describes what format the returned data is in (e.g., HTML, plain text, GIF, etc.).

A blank line signifies the end of the header section. The second part is the body, which contains the data conforming to the format type reflected in the header. The body is not modified or interpreted by the server in any way.

A CGI program can choose to send the newly created data directly to the client or to send it indirectly through the server. If the output consists of a complete HTTP header, the data is sent directly to the client without server modification.


Application of CGI
Forms

One of the most prominent uses of CGI is in processing forms. Forms are a subset of HTML that allow the user to supply information. The forms interface makes web browsing an interactive process for the user. Generally, forms are used for two main purposes. At their simplest, forms can be used to collect information from the user. But they can also be used in a more complex manner to provide back-and-forth interaction. For example, the user can be presented with a form listing the various documents available on the server, as well as an option to search for particular information within these documents. A CGI program can process this information and return the document(s) that match the user's selection criteria.

Gateways

Web gateways are programs or scripts used to access information that is not directly readable by the client. For example, say you have an Oracle database that contains baseball statistics for all the players on your company team and you would like to provide this information on the web. How would you do it? You certainly cannot point your client to the database file (i.e., open the URL associated with the file) and expect to see any meaningful data. CGI provides a solution to the problem in the form of a gateway.

Virtual Documents

Virtual or dynamic document creation is at the heart of CGI. Virtual documents are created on the fly in response to a user's information request. You can create virtual HTML, plain text, image, and even audio documents.

Working CGI Applications


Lycos World Wide Web Search

Located at http://www.lycos.com, this server allows the user to search the Web for specific documents. Lycos returns a dynamic hypertext document containing the documents that match the user's search criteria.

Coloring Book

An entertaining application that displays an image for users to color.

ArchiePlex Gateway

A gateway to the Archie search server. It allows the user to search for a specific string and returns a virtual hypertext document.

Guestbook with World Map

A guestbook is a forms-based application that allows users to leave messages for everyone to see.

Japanese <-> English Dictionary

A sophisticated CGI program that queries the user for an English word and returns a virtual document with graphic images of an equivalent Japanese word, or vice versa.

Chapter 2

Programming in CGI

You might wonder, "Now that I know how CGI works, what programming language can I use?" The answer to that question is very simple: You can use whatever language you want, although certain languages are more suited for CGI programming than others. Before choosing a language, you must consider the following features:

I. Ease of text manipulation
II. Ability to interface with other software libraries and utilities
III. Ability to access environment variables (in UNIX)

The ability of a language to interface with other software, such as databases, is also very important. This greatly enhances the power of the Web by allowing you to write gateways to other information sources, such as database engines or graphic manipulation libraries. Some of the more popular languages for CGI programming include AppleScript, C/C++, C Shell, Perl, Tcl, and Visual Basic. Here is a quick review of the advantages and, in some cases, disadvantages of each one.
1) C/C++ (UNIX, Windows, Macintosh)

C and C++ are very popular with programmers, and some use them to do CGI programming. These languages are not recommended for the novice programmer; C and C++ impose strict rules for variable and memory declarations, and type checking.

In addition, these languages lack database extensions and inherent pattern-matching abilities, although modules and functions can be written to achieve these functions. However, C and C++ have a major advantage in that you can compile your CGI application to create a binary executable, which takes up fewer system resources than using interpreters (like Perl or Tcl) to run CGI scripts.

2) C Shell (UNIX Only)

C Shell lacks pattern-matching operators, and so other UNIX utilities, such as sed or awk, must be used whenever you want to manipulate string information. However, there is a software tool, called uncgi and written in C, that decodes form data and stores the information into shell environment variables, which can be accessed rather easily. Obviously, communicating with a database directly is impossible, unless it is done through a foreign application. Finally, the C Shell has some serious bugs and limitations that make using it a dangerous proposition for the beginner.
3) Perl (UNIX, Windows, Macintosh)

Perl is by far the most widely used language for CGI programming! It contains many powerful features, and is very easy for the novice programmer to learn. The advantages of Perl include:

- It is highly portable and readily available.
- It contains very simple and concise constructs.
- It contains extremely powerful string manipulation operators, as well as functions to deal with binary data.
- It makes calling shell commands very easy, and provides some useful equivalents of certain UNIX system functions.
- There are numerous extensions built on top of Perl for specialized functions; for example, there is oraperl (or the DBI extensions), which contains functions for interfacing with the Oracle database.

Because of these overwhelming advantages, Perl is the language used for most of the examples throughout this book.
4) Tcl (UNIX Only)

Tcl is gaining popularity as a CGI programming language. Tcl consists of a shell, tclsh, which can be used to execute your scripts. Like Perl, tclsh also contains simple constructs, but is a bit more difficult to learn and use for the novice programmer. Like Perl, Tcl contains extensions to databases and graphic libraries.
5) Visual Basic (Windows Only)

Visual Basic is to Windows what AppleScript is to the Macintosh OS as far as CGI programming is concerned.

With Visual Basic, you can communicate with other Windows applications such as databases and spreadsheets. This makes Visual Basic a very powerful tool for developing CGI applications on a PC, and it is very easy to learn. However, Visual Basic lacks powerful string manipulation operators.
6) AppleScript (Macintosh Only)

Since the advent of System 7.5, AppleScript has been an integral part of the Macintosh operating system (OS). Though AppleScript lacks inherent pattern-matching operators, certain extensions have been written to make it easy to handle various types of data. AppleScript also has the power to interface with other Macintosh applications through AppleEvents. For example, a Mac CGI programmer can write a program that presents a form to the user, decodes the contents of the form, and queries and searches a Microsoft FoxPro database directly through AppleScript.


Chapter 3

Web Programming Languages

Table of Contents
Introduction
Characteristics of Web Programming Languages
Languages and Interfaces

1) Introduction

This document surveys current and planned languages and interfaces for developing World Wide Web based applications, prefaced by a discussion of the characteristics of such languages. The principal goal of creating this document was to identify the various languages currently in use and to provide some insight into the context in which each language is used.

Secondarily, the authors sought some insight into the directions that Web programming was going, especially in the context of the intense publicity surrounding Sun's Java. This document does not attempt to provide in-depth tutorials on these languages and systems. It attempts to be complete in its listing of alternatives. References are provided to more information about each. Our intent is to keep this document current if it proves useful.

General purpose programming languages (e.g. C, C++, Objective-C, Pascal, COBOL, FORTRAN) have not been included in this survey unless there are specific uses of those languages for web programming other than conventional development of clients and servers. In most cases, only variants of such languages specialized for web programming are included here, and, in such cases, are generally listed by the variants' names.

2) Characteristics of Web Programming Languages


Just as there is a diversity of programming languages available and suitable for conventional programming tasks, there is a diversity of languages available and suitable for Web programming. There is no reason to believe that any one language will completely monopolize the Web programming scene, although the varying availability and suitability of the current offerings is likely to favor some over others. Java is both available and generally suitable, but not all application developers are likely to prefer it over languages more similar to what they currently use, or, in the case of non-programmers, over higher level languages and tools.

This is OK because there is no real reason why we must converge on a single programming language for the Web any more than we must converge on a single programming language in any other domain.

The Web does, however, place some specific constraints on our choices: the ability to deal with a variety of protocols and formats (e.g. graphics) and programming tasks; performance (both speed and size); safety; platform independence; protection of intellectual property; and the basic ability to deal with other Web tools and languages.

Formats and protocols

The wide variety of computing, display, and software platforms found among clients necessitates a strategy in which the client plays a major role in the decision about how to process and/or display retrieved information, or in which servers must be capable of driving these activities on all potential clients. Since the latter is not practical, a suite of Web protocols covering addressing conventions, presentation formats, and handling of foreign formats has been created to allow interoperability [Berners-Lee, CACM, Aug. 1994].

HTML (HyperText Markup Language)


HTML (HyperText Markup Language) is the basic language understood by all WWW (World Wide Web) clients. Unmodified HTML can execute on a PC under Windows or OS/2, on a Mac, or on a Unix workstation. HTML is simple enough that nearly anyone can write an HTML document, and it seems almost everyone is doing so. HTML is a markup language rather than a complete programming language. An HTML document (program) is ASCII text with embedded instructions (markups) which affect the way the text is displayed.


The basic model for HTML execution is to fetch a document by its name (e.g. URL), interpret the HTML and display the document, possibly fetching additional HTML documents in the process, and possibly leaving hot areas in the displayed document that, if selected by the user, can accept user input and/or cause additional HTML documents to be fetched by URL.

I. Power

HTML is limited in its computational power. This is intentional in its design, as it prevents the execution of dangerous programs on the client machine. However, Web programmers, as they have become more sophisticated in their applications, have increasingly been hamstrung by these limits. Tasks that cannot be coded in HTML must either be executed on the server in some other language, or on the client in a program in some other language downloaded from a server. Both solutions are awkward for the programmer, often produce a sub-optimal segmentation of an application across program modules, both client and server, and reintroduce safety considerations.


II. Performance

Because of an HTML program's limited functionality, and the resulting shift of computational load to the server, certain types of applications perform poorly, especially in the context of clients connected to the Internet with rather low bandwidth dialup communications (<=28.8Kbps).

The performance problems arise from two sources:

(a) an application which is highly interactive requires frequently hitting the server across this low bandwidth line, which can dramatically and, at times, unacceptably slow observed performance; and
(b) requiring all computation to be done on the server increases the load on the server, thereby reducing the observed performance of its clients.

When code is to be executed on a client, there are two main considerations: what gets shipped and what gets executed. There are three main alternatives for each of these: source code, a partially compiled intermediate format (e.g. byte code), and binary code. Because compilation can take place on the client, what is shipped is not necessarily what is executed.

Byte code, according to measurements presented at the JavaOne conference, can be 2-3x smaller than comparable binary code, so its transfer can be considerably faster; this is especially noticeable over low speed lines. Since transfer time is significant in the Web, this is a major advantage. Source code is also compact.
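To give a rough sense of scale for the size claim above, consider a hypothetical 100 KB native binary sent over the 28.8 Kbps dialup line mentioned earlier. The file size is an assumption chosen only for illustration, and protocol overhead and modem compression are ignored:

```latex
t_{\text{binary}} = \frac{100 \times 1024 \times 8\ \text{bits}}{28\,800\ \text{bits/s}} \approx 28.4\ \text{s},
\qquad
t_{\text{byte code}} \approx \frac{28.4\ \text{s}}{3}\ \text{to}\ \frac{28.4\ \text{s}}{2} \approx 9.5\ \text{s to}\ 14.2\ \text{s}
```

Under these assumptions, the 2-3x size reduction would save roughly 14 to 19 seconds of transfer time per download.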

Execution performance clearly favors binary code over byte code, and byte code over source code. In general, binary code executes 10-100 times faster than byte code. Most Java VM developers are developing JIT (Just In Time) compilers to get the benefits of byte code size and binary speed. Java byte codes are downloaded over the net and compiled to native binary on the local platform. The binary is then executed and, possibly, cached for later executions.

III. Platform Independence

Given the diversity of operating systems and hardware platforms currently in use on the Web, a great efficiency results from only dealing with a single form of an application. The success of HTML has proven this, and Java has seconded it.

The ability to deliver a platform-independent application is of great appeal to developers, who spend a large portion of their resources developing and maintaining versions of their products for the different hardware/software platform combinations. With Java, one set of sources, and one byte-compiled executable, can be maintained for all hardware/software platforms. While platform independence has long been a goal of language developers, the need to squeeze every last ounce of performance from software has often made this impractical to maintain, at least at the level of executable code. However, in the Web this concern becomes less important because transfer time is now a significant component of performance and can dominate execution time.

Platform independence can be achieved by shipping either byte code or source code. One advantage of shipping byte code over source code is that a plethora of source languages would require the client machines to maintain many compilers and/or interpreters for the source languages, while fewer byte code formats would require fewer virtual machines.


IV. Preserving intellectual property


Although not currently discussed much as an issue, the ability to download safe, portable applets in some form less than source code is an additional advantage to developers who wish to protect their intellectual property. Looking at someone else's script or source to see how they do something and just tweaking it a little or copying a piece of it to do the same thing in one's own program doesn't feel like stealing.

But if one has to go to the effort of reverse engineering byte or binary code, it becomes more obvious that this code is someone else's intellectual property. For the vast majority of honest people on the Web, this subtle reminder may be enough. For some of the minority, the effort involved in reverse engineering may serve as a sufficient deterrent.


V. Safety

Viruses have proven that executing binary code acquired from an untrusted, or even moderately trusted, source is dangerous. Code that is downloaded or uploaded from random sites on the web should not be allowed to damage the user's local environment. Downloading binary code compiled from conventional languages is clearly unsafe, due to the power of the languages. Even if such languages were constrained to some ostensibly safe subset, there is no way to verify that only the safe subset was used or that the compiler used was trustworthy (after all, it is under someone else's control).

VI. Conclusion

HTML is proving insufficient by itself to develop the myriad Web-based applications envisioned. As extended by server and client programs, the task is feasible, yet awkward and sub-optimal in terms of performance and safety. The ability to easily develop sophisticated Web-based applications, optimally segmented between client and server in the context of the heterogeneous and dynamic environment of the Web, while not compromising safety, performance, or intellectual property, is the goal of current efforts.

The first significant result of those efforts is Java, a C++-derived language with capabilities specialized for Web-based application development. Java is compiled by the developer to a platform-independent byte code format, with byte codes downloadable via HTML browsers to the client, and interpreted by a virtual machine which can guarantee its safety. Sun is working to improve the safety, performance, comprehensiveness, and ubiquity of Java, and the industry appears to be accepting their approach.

Safety is the biggest issue. The safety of a program is a function of the safety of the environment in which it executes, which is just another program. At some level, the user must acquire a potentially unsafe program from a trusted source. At present, we acquire Netscape, Java, and Windows from (relatively) trusted sources. Because there must be a trusted environment in which to execute safe, platform-independent programs, and because users are only likely to trust a limited number of big name sources for that trusted environment, there has been speculation that diversity, including diversity in Web programming language choices, would be reduced.


Languages and Interfaces

The languages and interfaces surveyed below represent various attempts to create the "ideal" Web programming language, usually by extending and restricting existing languages. Web programming languages have a variety of ancestors: scripting languages, shell languages, mark-up languages and conventional programming languages. Not all relevant languages are discussed. Some entries consist only of a link. They are languages we've seen mentioned as applicable to web programming in some way, but haven't investigated further. We hope to do so in the future.

i. AppleScript

AppleScript is Apple's object-oriented, English-like scripting language and development environment for the Macintosh. It is bundled with MacOS, and is used widely for all variety of scripting tasks on the Mac. Recently, it has been applied to web programming tasks. WebRunner enables AppleScript scripts embedded in HTML files to be executed on a client running Netscape.

ii. CCI (Common Client Interface)

NCSA Mosaic CCI (Common Client Interface) is an interface specification (protocol & API) that enables client-side applications to communicate with NCSA Mosaic, the original web browser, to control Mosaic or to obtain information off the web via Mosaic. Note that this is not for invoking client-side applications (applets) from Mosaic, but for controlling Mosaic from the application. Invocation of client-side applications from a browser is currently specific to the browser, but most support NCSA helpers. Once the application is running, it can communicate with the browser with CCI. CCI is not the only interface currently defined for this purpose, but it seems to be meeting with some acceptance, as Tcl and Perl now support it.

iii. Dylan

Dylan is a dynamic object-oriented programming language with a Pascal-ish syntax and Lisp-ish semantics. It was designed at Apple's Cambridge lab in cooperation with Carnegie-Mellon University and Harlequin, Inc., and reviewed by its potential user community, mostly former Common Lisp programmers disenchanted with C++.


The goal of the designers was to create a language with syntax, performance, and executable footprint acceptable to mainstream programmers (i.e. C/C++), but with many of the characteristics Lisp programmers value in Lisp (e.g. evolutionary development, optional type declarations, runtime safety, automatic storage management, and ease of maintenance).

iv. Icon

Icon is a full-featured programming language developed at the University of Arizona with a C-ish syntax and a SNOBOL heritage, making it particularly suitable for string processing, and, therefore, similar in this way to other languages being used for Internet programming. I've seen Icon mentioned in this context, but haven't come across any active efforts towards that end.

v. Java

Java is the leading contender for a full-featured programming language targeted at Internet applications. Its advantages are: familiarity (derived from C++), platform independence (will run on any platform which implements the Java Virtual Machine), performance (byte-code compiled, so faster than fully interpreted), and safety (downloaded applets are checked for integrity, and only interpreted by a trusted Virtual Machine).

Chapter 4

CGI Programming in C
This chapter explains how to code FastCGI applications in C and how to build them into executables.

Two important warnings:


1) To avoid wasting your time, please check (from applicable local documents or by contacting your local webmaster) whether you can install and run CGI scripts written in C on the server. At the same time, please check how to do that in detail, specifically where you need to put your CGI scripts.

2) This document was written to illustrate the idea of CGI scripting to C programmers. In practice, CGI programs are usually written in other languages, such as Perl, and for good reasons: except for very simple cases, CGI programming in C is clumsy and error-prone.


Using a C program as a CGI script


In order to set up a C program as a CGI script, it needs to be turned into a binary executable program. This is often problematic, since people largely work on Windows whereas servers often run some version of UNIX or Linux. The system where you develop your program and the server where it should be installed as a CGI script may have quite different architectures, so that the same executable does not run on both of them. This may create an unsolvable problem. If you are not allowed to log on to the server and you cannot use a binary-compatible system (or a cross-compiler) either, you are out of luck. Many servers, however, allow you to log on and use the server in interactive mode, as a shell user, and contain a C compiler.

Normally, you would proceed as follows:


1) Compile and test the C program in normal interactive use.

2) Make any changes that might be needed for use as a CGI script. The program should read its input according to the intended form submission method. Using the default GET method, the input is to be read from the environment variable QUERY_STRING. (The program may also read data from files, but these must then reside on the server.) It should generate output on the standard output stream (stdout) so that it starts with suitable HTTP headers. Often, the output is in HTML format.

3) Compile and test again. In this testing phase, you might set the environment variable QUERY_STRING so that it contains the test data as it will be sent as form data. E.g., if you intend to use a form where a field named foo contains the input data, you can give the command setenv QUERY_STRING "foo=42" (when using the tcsh shell) or export QUERY_STRING="foo=42" (when using the bash shell; the export is needed so that the variable is passed on to the program).

4) Check that the compiled version is in a format that works on the server. This may require a recompilation. You may need to log on to the server computer (using Telnet, SSH, or some other terminal emulator) so that you can use a compiler there.

5) Upload the compiled and loaded program, i.e. the executable binary program (and any data files needed), to the server.

6) Set up a simple HTML document that contains a form for testing the script, etc.


You need to put the executable into a suitable directory and name it according to server-specific conventions. Even the compilation commands needed here might differ from what you are used to on your workstation. For example, if the server runs some flavor of Unix and has the GNU C compiler available, you would typically use a compilation command like

    gcc -o mult.cgi mult.c

and then move (mv) mult.cgi to a directory with a name like cgi-bin. Instead of gcc, you might need to use cc. You really need to check local instructions for such issues. The filename extension .cgi has no fixed meaning in general. However, there can be server-dependent (and operating system dependent) rules for naming executable files. Typical extensions for executables are .cgi and .exe.

If you are converting a CGI application into a FastCGI application, in many cases you will only need to add a few lines of code. For more complex applications, you may also need to rearrange some code.


The I/O Libraries


The FastCGI Software Development Kit that accompanies Open Market Web Server 2.0 includes I/O libraries to simplify the job of converting existing CGI applications to FastCGI or writing new FastCGI applications. There are two libraries in the kit: fcgi_stdio and fcgiapp. You must include one of these header files in your program:
fcgi_stdio.h
fcgiapp.h
The fcgi_stdio library is a layer on top of the fcgiapp library, and we strongly recommend that you use it, both for converting existing CGI applications and for writing new FastCGI applications. The fcgi_stdio library offers several advantages:

Simplicity: there are only 3 new API calls to learn.
Familiarity: if you are converting a CGI application to FastCGI, you will find few changes between CGI and FastCGI. We designed our library to make the job of building a FastCGI application as similar as possible to that of building a CGI application: you use the same environment variables, the same techniques for parsing query strings, the same I/O routines, and so on.
Convenience: the library provides full binary compatibility between CGI and FastCGI. That is, you can run the same binary as either CGI or FastCGI.

Code Structure
To structure code for FastCGI, you separate your code into two sections:
Initialization section, which is executed only once.
Response loop section, which is executed every time the FastCGI script is called.
A response loop typically has the following format:

while (FCGI_Accept() >= 0) {
    /* body of response loop */
}

The FCGI_Accept call blocks until a client request comes in, and then returns 0. If there is a system failure, or the system administrator terminates the process, FCGI_Accept returns -1 and the loop ends.


CGI Environment Variables


Every CGI program has access to the following environment variables. These variables play an important role in writing any CGI program.

Variable Name      Description
CONTENT_TYPE       The data type of the content. Used when the client is sending attached content to the server, for example a file upload.
CONTENT_LENGTH     The length in bytes of the request body. It is available only for requests that carry a body, such as POST.
HTTP_COOKIE        Returns the set cookies in the form of key and value pairs.
HTTP_USER_AGENT    The User-Agent request-header field, which contains information about the user agent originating the request. It identifies the web browser.
PATH_INFO          The extra path information that follows the name of the CGI script in the URL.
QUERY_STRING       The URL-encoded information that is sent with a GET method request.
REMOTE_ADDR        The IP address of the remote host making the request. This can be useful for logging or for authentication purposes.
REMOTE_HOST        The fully qualified name of the host making the request. If this information is not available, REMOTE_ADDR can be used to get the IP address.
REQUEST_METHOD     The method used to make the request. The most common methods are GET and POST.
SCRIPT_FILENAME    The full path to the CGI script.
SCRIPT_NAME        The name of the CGI script.
SERVER_NAME        The server's hostname or IP address.
SERVER_SOFTWARE    The name and version of the software the server is running.

A basic example
The above-mentioned "How the web works: HTTP and CGI explained" is a great tutorial. The following introduction of mine is just another attempt to present the basics; please consult other sources if you get confused or need more information. Let us consider the following simple HTML form:
<form action="http://www.cs.tut.fi/cgi-bin/run/~jkorpela/mult.cgi">
  <div><label>Multiplicand 1: <input name="m" size="5"></label></div>
  <div><label>Multiplicand 2: <input name="n" size="5"></label></div>
  <div><input type="submit" value="Multiply!"></div>
</form>

In a browser, the form appears as two text fields labeled "Multiplicand 1:" and "Multiplicand 2:", followed by a button labeled "Multiply!".

If you enter the values 4 and 9 and submit the form, the server returns a page like the following:
Multiplication results
The product of 4 and 9 is 36.

Analysis of the example
We will now analyze how the example above works. Assume that you type 4 into one input field and 9 into the other and then invoke submission, typically by clicking on a submit button. Your browser will send, using the HTTP protocol, a request to the server at www.cs.tut.fi. The browser picks up this server name from the value of the ACTION attribute, where it occurs as the host name part of a URL. (Quite often, the ACTION attribute refers, using a relative URL, to a script on the same server as the document resides on, but this is not necessary, as this example shows.)

When sending the request, the browser provides additional information, specifying a relative URL, in this case
/cgi-bin/run/~jkorpela/mult.cgi?m=4&n=9
This was constructed from the part of the ACTION value that follows the host name, by appending a question mark (?) and the form data in a specifically encoded format.

The server to which the request was sent (in this case, www.cs.tut.fi) will then process it according to its own rules.


Chapter 5

CGI Programming in JAVA

These examples cover using Java for both the client and the server side of the CGI process. The client-side part covers using GET and POST from applets to talk to CGI programs (regardless of what language the CGI programs are written in). The server-side part covers implementing CGI programs in Java that handle GET and POST (regardless of whether the client uses HTML forms or applets), and also includes a URL decoder and CGI form parser in Java (and a similar parser for cookie values).

Server-Side Input Handling Java


Java handles GET and POST slightly differently. The parsing of the input is done for you by the servlet API, so you are completely separated from the actual format of the input data. Your program will be a subclass of HttpServlet, the generalized Java Servlet class for handling web services.

Servlet programs must override the doGet() or doPost() method, which is executed in response to the client's request. Each of these methods takes two arguments, HttpServletRequest request and HttpServletResponse response. Let's take a look at a very simple servlet program, the traditional HelloWorld (this time with a doGet method):


import java.io.*;
import java.text.*;
import java.util.*;
import javax.servlet.*;
import javax.servlet.http.*;

public class Hello extends HttpServlet {
    // Called for each HTTP GET request sent to this servlet.
    public void doGet(HttpServletRequest request, HttpServletResponse response)
            throws IOException, ServletException {
        // Declare the content type instead of printing the header by hand.
        response.setContentType("text/html");
        PrintWriter out = response.getWriter();
        out.println("<html>");
        out.println("<head>");
        String title = "Hello World";
        out.println("<title>" + title + "</title>");
        out.println("</head>");
        out.println("<body bgcolor=white>");
        out.println("<h1>" + title + "</h1>");
        // Value of the form field named "param", or null if it was not sent.
        // Note: real code should HTML-escape the value before echoing it.
        String param = request.getParameter("param");
        if (param != null)
            out.println("Thanks for the lovely param='" + param + "' binding.");
        out.println("</body>");
        out.println("</html>");
    }
}

The argument HttpServletRequest request represents the client request, and the values of the parameters passed from the HTML FORM can be retrieved by calling the HttpServletRequest getParameter method.

This method takes as its argument the name of the parameter (the name of the HTML INPUT object), and returns as a Java String the value assigned to the parameter. In cases where the parameter may have multiple bindings, the method getParameterValues can be used to retrieve the values in an array of Java Strings -- note that getParameter will return the first value of this array.

Java Output
Let's look back at our Java code example. You'll see a number of differences between the Servlet code and the CGI approach. Output is all handled by the HttpServletResponse object, which allows you to set the content type through the setContentType method. Instead of printing the HTTP header yourself, you tell the HttpServletResponse object that you want the content type to be "text/html" explicitly.

Java Compilation in Unix
Compiling Servlets in UNIX requires a few changes to your PATH and CLASSPATH environment variables. These changes have been made for you in the source file /afs/ir/class/cs145/all.env. They include the following additions:
setenv PATH /afs/ir/class/cs145/jsdk2.1:/usr/pubsw/apps/jdk1.2/bin:${PATH}
setenv CLASSPATH /afs/ir/class/cs145/jsdk2.1/servlet.jar:$CLASSPAT


If there are any difficulties, let us know. These have been tested on the elaine machines and are assumed to be operational on the leland Sparc machines (elaine, myth, epic, saga).

You also have to set up a specific directory structure to provide Servlets. The directory structure required by Servlets is essentially:
[anydir]
    [servletdir]
        webpages
            WEB-INF
                servlets

A shell script to build this hierarchy is provided at /afs/ir/class/cs145/code/bin/buildServletDirectory. After you run source /afs/ir/class/cs145/all.env (a line you should probably add to your .cshrc file), you can run buildServletDirectory simply by typing its name.

You can store .html documents in your webpages directory, and they will be accessible at your Servlet address (see below), while all Servlets you write have to be located in the servlets directory to be recognized.
