Monday, September 04, 2006

HTTP Request Messages

The HTTP Request message is the mechanism used to retrieve a data resource from a server. In order to maintain backward compatibility with the previous version of the HTTP protocol, the HTTP/1.0 protocol provides for both a full request (for HTTP/1.0) and a simple request (for HTTP/0.9) style of message. If an HTTP/1.0 server receives a simple request message, it must respond with an HTTP/0.9-compatible simple response message. Likewise, an HTTP/1.0 client should always generate a full request message.

Request Methods

As mentioned in the previous section, the syntax of the full-request message includes an element named Method, which has the following rule:

Method = "GET" | "HEAD" | "POST" | extension-method

The Method element indicates what operation should be performed on the data resource specified by the Request-URI element. The acceptable methods for a given resource can change at any time. If a method is not allowed for a resource, the client is notified through the Status-Code and Reason-Phrase elements of the response message.

The following sections describe the named methods. The extension-method element allows for extensions to the HTTP/1.0 protocol. Both client and server must recognize these extended methods, or the server will likely return a Status-Code of 501 (not implemented).
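As a rough illustration of the rule above, the following Python sketch splits a request line and picks a status code based on the method. The function name parse_request_line and the choice to report a malformed line as 400 are assumptions made for this example, not part of the protocol text.

```python
# Methods named by HTTP/1.0; anything else counts as an extension method.
KNOWN_METHODS = {"GET", "HEAD", "POST"}


def parse_request_line(line, supported=KNOWN_METHODS):
    """Split a request line and choose a status code for its method.

    Returns (method, uri, version, status). A two-part line is treated
    as an HTTP/0.9 simple request, which only allows GET.
    """
    parts = line.strip().split()
    if len(parts) == 2:
        method, uri = parts
        version = "HTTP/0.9"
    elif len(parts) == 3:
        method, uri, version = parts
    else:
        # Assumption for this sketch: a malformed line is a bad request.
        return None, None, None, 400
    if version == "HTTP/0.9" and method != "GET":
        return method, uri, version, 400
    # Method names are case-sensitive, so "get" is not "GET".
    status = 200 if method in supported else 501
    return method, uri, version, status
```

A server that understands an extension method would pass a larger supported set; otherwise the unknown method falls through to 501.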

The GET Method

Perhaps the easiest method to understand, the GET method simply instructs the server to return to the client the resource indicated by the Request-URI element of the request message. If the Request-URI points to a server application, the server returns the data output by the application, not the application itself.

Also, a Request-Header field named If-Modified-Since creates a conditional GET request. If the resource has been modified since the time value specified in the header, the resource is returned. If it has not been modified since that time, the server responds with a status code of 304 (not modified) and with no entity in the response message. This header field is used to perform client-side caching and to reduce network load.

The HEAD Method

The HEAD method is nearly identical to the GET method. The very important difference, however, is that the server must return only HTTP header information related to the resource. The resource (entity) itself must never be returned in response to a HEAD request.

The HEAD method allows spiders and agents operating on the Web to retrieve only necessary header information about a particular resource. This can be useful when checking the validity of hypertext links or checking a resource to see whether it has been modified since a particular date.
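The difference between the two methods can be sketched in a few lines of Python. The helper build_response below is hypothetical and always answers 200; it only shows that a HEAD response carries the same header block as a GET response, minus the entity.

```python
def build_response(method, body, content_type="text/html"):
    """Return raw HTTP/1.0 response bytes for a GET or HEAD request.

    `body` is the entity as bytes. This sketch always reports 200 OK.
    """
    headers = [
        "HTTP/1.0 200 OK",
        f"Content-Type: {content_type}",
        f"Content-Length: {len(body)}",
    ]
    head = ("\r\n".join(headers) + "\r\n\r\n").encode("ascii")
    # HEAD gets the same headers as GET but never the entity itself.
    return head if method == "HEAD" else head + body
```

A link checker can therefore issue HEAD requests and inspect only the status line and headers without downloading each page.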

The POST Method

The POST method is used when sending entity information to a server. For instance, POST is used when filling out an HTML form on a Web page. The Submit button on the form typically performs a POST request and appends the form's field values to the request message as the Entity-Body element.

The POST method is usually performed on some type of application resource as opposed to a document resource. A successful POST request does not require the server to return an Entity-Body element in the response message. In some cases, the action may not produce a resource that can be identified by a URI. When no such resource is created, the server should indicate a Status-Code of 200 (okay) or 204 (no content), depending on whether the response includes an entity that describes the result. If the action does create a resource, the Status-Code should be 201 (created), and the response should contain an entity that describes the status of the request and refers to the new resource.

An entity header field called Content-Length is required on all POST messages. If it is invalid or missing, the server returns a Status-Code of 400 (bad request).
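A minimal Python sketch of both sides of this requirement follows. The names build_post and check_content_length, the /cgi-bin/order path in the usage note, and the form-encoded content type are assumptions chosen for illustration.

```python
from urllib.parse import urlencode


def build_post(uri, fields):
    """Build a raw HTTP/1.0 POST request carrying form field values."""
    # The form's field values become the entity body of the request.
    body = urlencode(fields).encode("ascii")
    head = (
        f"POST {uri} HTTP/1.0\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    ).encode("ascii")
    return head + body


def check_content_length(headers):
    """Return 400 if Content-Length is missing or invalid, else None."""
    value = headers.get("Content-Length", "").strip()
    if not (value.isascii() and value.isdigit()):
        return 400
    return None
```

For example, build_post("/cgi-bin/order", {"name": "Ann Lee", "qty": "2"}) yields a message whose body is name=Ann+Lee&qty=2 and whose Content-Length matches the byte length of that body.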

Request Message Header Fields

The full-request message can contain any number of header fields that can be used to qualify the request or to provide information about the client making the request. The syntax for the request header is

Request-Header = Authorization | From | If-Modified-Since | Referer | User-Agent

Additional field names can be added only if all applications involved in a conversation recognize them as request header fields. Otherwise, unrecognized fields are considered Entity-Headers.

Authorization

The Authorization request-header field is used by user agents that wish to present some sort of credentials to the server. The format of the field is

Authorization = "Authorization:" credentials

More on authentication appears in the last section of this chapter.

From

The From request-header field is sent by a user agent that wishes to provide the e-mail address of the person who is at the helm. The address should be a valid mailbox and should be sent only with the user's express knowledge and permission. This field should always be used by Web robots and crawlers to provide the e-mail address of the person who started the robot. The format of the field is as follows:

From = "From:" mailbox

If-Modified-Since

As mentioned in the section titled "The GET Method," the If-Modified-Since header field is used to produce a conditional GET request. The field uses this format:

If-Modified-Since = "If-Modified-Since:" HTTP-date

The resource is returned to the client only if the resource has been modified since the date specified in the HTTP-date element. If the HTTP-date element specifies an invalid date or if the date is later than the server's current date, the server essentially ignores the header field and returns the resource as though it is responding to a normal GET request.
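The decision described above might be sketched in Python as follows. The function conditional_get_status is hypothetical, and it relies on the standard library's email.utils.parsedate_to_datetime to read the date, which is a convenience chosen for this example rather than anything the protocol prescribes.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_get_status(if_modified_since, last_modified, now=None):
    """Return 200 or 304 for a GET that may carry If-Modified-Since.

    `last_modified` and `now` are timezone-aware datetimes.
    """
    now = now or datetime.now(timezone.utc)
    if if_modified_since is None:
        return 200  # ordinary, unconditional GET
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # invalid date: ignore the header field
    if since.tzinfo is None:
        # Assumption: treat a date without a zone as GMT.
        since = since.replace(tzinfo=timezone.utc)
    if since > now:
        return 200  # date later than the server's clock: ignore it
    return 200 if last_modified > since else 304
```

A 304 result means the server sends headers only, and the client reuses its cached copy.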

Referer

The Referer header field specifies the URI of the resource from which the request message's Request-URI element was obtained. This field may be sent only if the Request-URI was actually obtained from a source that has its own address. If the user generated the Request-URI value (by typing in the address or selecting from a bookmark list, for example), this field must not be sent. It uses this format:

Referer = "Referer:" ( absoluteURI | relativeURI )

User-Agent

User-Agent contains information about the user agent that generated the request message. This request-header field is useful to the server in logging server activity and also for creating responses that are specific to the given user agent. The field is not required but should be sent as a courtesy to the server, using this format:

User-Agent = "User-Agent:" 1*( product | comment )

The convention for the product element is to list the information in order of significance. Typically, this field's values include the product name of the user agent, the product version, and sometimes the operating system the user agent is running under.
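To make the field's layout concrete, here is a small Python sketch that splits a User-Agent value into product tokens and parenthesized comments. The function parse_user_agent and its regular expression are assumptions for illustration; the sketch does not handle nested comments.

```python
import re

# Either a parenthesized comment, or a product token with an
# optional "/version" suffix.
_ITEM = re.compile(r"\(([^()]*)\)|([^\s()/]+)(?:/([^\s()]+))?")


def parse_user_agent(value):
    """Split a User-Agent value into ordered (kind, text, version) items."""
    items = []
    for comment, name, version in _ITEM.findall(value):
        if name:
            items.append(("product", name, version or None))
        else:
            items.append(("comment", comment, None))
    return items
```

Because the items keep their original order, the first product in the returned list is the one the user agent considers most significant.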