Monday, September 04, 2006

How CGI Data Is Handled

When writing a CGI application, the developer has the same two concerns that developers of all applications have: data input and data output. This section discusses how data is input into a CGI application. The closing section of the chapter ("Returning the Results to the Client") discusses how data is output by the application.

In the UNIX environment, CGI applications receive their data from command line arguments, environment variables, and the standard input (known as stdin to C programmers). By querying the value of one of a predefined set of these variables, the UNIX CGI application can determine the server context that it has been run under and the data that was entered at the client side. The application then returns its data to the server using the standard output (stdout).
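
As a rough illustration of the UNIX data flow described above, the following Python sketch reads the request from environment variables and the standard input and writes a reply to the standard output. The variable names REQUEST_METHOD, QUERY_STRING, and CONTENT_LENGTH are the conventional CGI names and are not listed in this section; the function names are hypothetical.

```python
def read_cgi_request(environ, stdin):
    """Collect the request data a UNIX-style CGI program receives."""
    method = environ.get("REQUEST_METHOD", "GET")
    query = environ.get("QUERY_STRING", "")
    body = b""
    if method == "POST":
        # The server reports the body size; read only that many bytes.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        body = stdin.read(length)
    return {"method": method, "query": query, "body": body}


def write_cgi_response(stdout, text):
    """Return a plain-text result to the server on the standard output."""
    # A header block, a blank line, then the content.
    stdout.write("Content-Type: text/plain\r\n\r\n")
    stdout.write(text)
```

In a real CGI program these functions would presumably be called with os.environ, sys.stdin.buffer, and sys.stdout; passing them in as parameters simply makes the sketch easy to try with fake data.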

In the various Windows environments (Windows 3.x, Windows 95, and Windows NT), the operating system makes the use of environment variables and the standard input/output difficult. For this reason, the Win/CGI interface uses a spooling paradigm for passing data between the server and the CGI application. Before executing the CGI application, the server creates a CGI data file. This file contains the same data fields that are used in UNIX CGI applications, along with fields specific to Win/CGI. The name of the data file is passed to the CGI application as a command line argument. Likewise, the CGI application places its output into a file whose location the server specifies in one of the fields of the CGI data file.
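
The spooling exchange can be sketched in Python as follows. This is only an illustration under several assumptions that this section does not confirm: that the CGI data file uses INI-style syntax, that the request method lives under a [CGI] section as "Request Method", and that the output location lives under a [System] section as "Output File". The function names are hypothetical.

```python
import configparser


def load_cgi_data_file(path):
    """Parse a Win/CGI data file, assumed here to use INI-style syntax."""
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    parser.optionxform = str  # keep key names exactly as the server wrote them
    with open(path, "r") as handle:
        parser.read_file(handle)
    return parser


def run_win_cgi(data_file_path):
    """Read the request fields and spool a response to the output file."""
    data = load_cgi_data_file(data_file_path)
    # Section and key names below are assumptions, not taken from the text.
    method = data.get("CGI", "Request Method", fallback="GET")
    output_path = data.get("System", "Output File")
    # newline="" writes the CRLF pairs unchanged on every platform.
    with open(output_path, "w", newline="") as out:
        out.write("Content-Type: text/plain\r\n\r\n")
        out.write("Handled a %s request\r\n" % method)
    return output_path
```

Because the server passes the name of the data file as a command line argument, a complete program would call run_win_cgi(sys.argv[1]).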

Decoding the HTML Form

As you learned in Chapter 2, the client is not limited to simply retrieving resources from HTTP servers. There are also two HTTP request methods that allow the client to interact in some way with a resource.

The first method uses a GET request message that includes search terms in the resource address:

GET http://myserver.com/cgi-win/search.exe?last=smith&first=jim

This request message could have been generated by clicking the Submit button of a form or by some sort of user agent. Likewise, the user of a Web browser such as Netscape could enter the address portion of this string into the text box provided for Web addresses. Any of these three actions causes the HTTP server to launch search.exe and pass it the search text that appears after the "?".
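
To make the handling of the search text concrete, here is a minimal Python sketch that extracts and decodes the fields following the "?" in a resource address. It assumes the usual form encoding, in which fields are separated by "&", a "+" stands for a space, and other special characters appear as %XX escapes; the function name search_terms is hypothetical.

```python
from urllib.parse import parse_qs, urlsplit


def search_terms(address):
    """Return the decoded form fields that follow the "?" in an address."""
    query = urlsplit(address).query
    # parse_qs splits fields on "&" and decodes "+" and %XX escapes.
    return parse_qs(query, keep_blank_values=True)
```

Each field maps to a list of values because a form may send the same field name more than once.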

The second method uses a POST request message. Unlike the GET message illustrated previously, a POST message transfers the form data in the entity body of the message rather than in the resource address. A POST message can be generated by the Submit button of a form or by a user agent. Because the form data entered on the HTML form is sent in the entity body, it cannot be entered into the address box of a Web browser.

With the second method, the Win/CGI interface requires that the server parse the HTML form data from the POST message it received. This data is then stored either in the CGI data file previously discussed or in an external file. In the latter case, the server places an entry in the CGI data file that specifies the filename and length of this external file. The client application uses the Content-Type entity header to specify how the data is encoded in the entity body. There are two Content-Types used: application/x-www-form-urlencoded and multipart/form-data. The first is the Content-Type used in most cases. The second Content-Type provides for uploading files from the client by using a multipart MIME message. To date, multipart MIME messages are not widely used in HTTP messages. The Content-Type header is passed to the CGI application as one of the fields in the CGI data file.
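
The role of the Content-Type header can be sketched as a small dispatch routine. This Python example is an illustration rather than part of the Win/CGI interface: it decodes a body of the registered media type application/x-www-form-urlencoded and deliberately leaves multipart/form-data unhandled. The function name decode_form_body is hypothetical.

```python
from urllib.parse import parse_qs


def decode_form_body(content_type, body):
    """Decode POSTed form data according to its Content-Type header."""
    # Drop parameters such as "; charset=..." or "; boundary=...".
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "application/x-www-form-urlencoded":
        return parse_qs(body.decode("ascii"), keep_blank_values=True)
    if media_type == "multipart/form-data":
        # File uploads need a MIME multipart parser, out of scope here.
        raise NotImplementedError("multipart/form-data is not handled in this sketch")
    raise ValueError("unexpected Content-Type: %s" % content_type)
```

Under Win/CGI the server performs this decoding on the application's behalf, so a routine like this would matter mainly to a UNIX-style CGI program that reads the raw body itself.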