How Java Web Servers Work
by Budi Kurniawan
04/23/2003
Editor's Note: this article is adapted from Budi's self-published book on Tomcat internals. You can find more information on his web site.
A web server is also called a Hypertext Transfer Protocol (HTTP) server because it uses HTTP to communicate with its clients, which are usually web browsers. A Java-based web server uses two important classes, java.net.Socket and java.net.ServerSocket, and communicates through HTTP messages. Therefore, this article starts by discussing HTTP and the two classes. Afterwards, I'll explain the simple web server application that accompanies this article.
The Hypertext Transfer Protocol (HTTP)
HTTP is the protocol that allows web servers and browsers to send and receive data over the Internet. It is a request and response protocol--the client makes a request and the server responds to the request. HTTP uses reliable TCP connections, by default on TCP port 80. The first version of HTTP was HTTP/0.9, which was then superseded by HTTP/1.0. The current version is HTTP/1.1, which is defined by RFC 2616.
This section covers HTTP 1.1 briefly, just enough for you to understand the messages sent by the web server application. If you are interested in more details, read RFC 2616.
In HTTP, the client always initiates a transaction by establishing a connection and sending an HTTP request. The server is in no position to contact a client or to make a callback connection to the client. Either the client or the server can prematurely terminate a connection. For example, when using a web browser, you can click the Stop button on your browser to stop the download process of a file, effectively closing the HTTP connection with the web server.
HTTP Requests
An HTTP request consists of three components:
- Method-URI-Protocol/Version
- Request headers
- Entity body
An example HTTP request is:
POST /servlet/default.jsp HTTP/1.1
Accept: text/plain, text/html
Accept-Language: en-gb
Connection: Keep-Alive
Host: localhost
Referer: http://localhost/ch8/SendDetails.htm
User-Agent: Mozilla/4.0 (compatible; MSIE 4.01; Windows 98)
Content-Length: 33
Content-Type: application/x-www-form-urlencoded
Accept-Encoding: gzip, deflate

LastName=Franks&FirstName=Michael
The method-URI-Protocol/Version appears as the first line of the request.
POST /servlet/default.jsp HTTP/1.1
where POST is the request method, /servlet/default.jsp represents the URI, and HTTP/1.1 is the Protocol/Version section.
Each HTTP request can use one of the request methods specified in the HTTP standards. HTTP 1.1 defines eight request methods: GET, POST, HEAD, OPTIONS, PUT, DELETE, TRACE, and CONNECT. GET and POST are the most commonly used in Internet applications.
The URI specifies an Internet resource completely. A URI is usually interpreted as being relative to the server's root directory. Thus, it should always begin with a forward slash (/). A URL is actually a type of URI. The protocol version represents the version of the HTTP protocol being used.
The request headers contain useful information about the client environment and the entity body of the request. For example, they could contain the language for which the browser is set, the length of the entity body, and so on. Headers are separated from one another by a carriage return/linefeed (CRLF) sequence.
A very important blank line (CRLF sequence) comes between the headers and the entity body. This line marks the beginning of the entity body. Some Internet programming books consider this CRLF the fourth component of an HTTP request.
In the previous HTTP request, the entity body is simply the following line:
LastName=Franks&FirstName=Michael
The entity body could easily become much longer in a typical HTTP request.
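As a rough illustration of the three components described above, the following sketch splits a raw request string at the blank line and then separates the first line from the headers. The class name HttpRequestParts and its fields are hypothetical and are not part of the application accompanying this article; the sketch also assumes the whole request is already available as one String.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of the article's application.
class HttpRequestParts {
    final String requestLine;                                   // Method-URI-Protocol/Version
    final Map<String, String> headers = new LinkedHashMap<>();  // request headers, in order
    final String body;                                          // entity body (may be empty)

    HttpRequestParts(String raw) {
        // The blank line (CRLF CRLF) separates the headers from the entity body.
        int split = raw.indexOf("\r\n\r\n");
        String head = split >= 0 ? raw.substring(0, split) : raw;
        body = split >= 0 ? raw.substring(split + 4) : "";

        String[] lines = head.split("\r\n");
        requestLine = lines[0];
        for (int i = 1; i < lines.length; i++) {
            int colon = lines[i].indexOf(':');
            if (colon > 0) {
                headers.put(lines[i].substring(0, colon).trim(),
                            lines[i].substring(colon + 1).trim());
            }
        }
    }

    public static void main(String[] args) {
        String raw = "POST /servlet/default.jsp HTTP/1.1\r\n"
                   + "Host: localhost\r\n"
                   + "Content-Length: 33\r\n"
                   + "Content-Type: application/x-www-form-urlencoded\r\n"
                   + "\r\n"
                   + "LastName=Franks&FirstName=Michael";
        HttpRequestParts parts = new HttpRequestParts(raw);
        System.out.println(parts.requestLine);
        System.out.println(parts.headers);
        System.out.println(parts.body);
    }
}
```

A real server would read the entity body according to the Content-Length header rather than assume that everything arrives in one piece.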
HTTP Responses
Similar to requests, an HTTP response also consists of three parts:
- Protocol-Status code-Description
- Response headers
- Entity body
The following is an example of an HTTP response:
HTTP/1.1 200 OK
Server: Microsoft-IIS/4.0
Date: Mon, 3 Jan 1998 13:13:33 GMT
Content-Type: text/html
Last-Modified: Mon, 11 Jan 1998 13:23:42 GMT
Content-Length: 112

<html>
<head>
<title>HTTP Response Example</title></head><body>
Welcome to Brainy Software
</body>
</html>
The first line of the response is similar to the first line of the request. It tells you that the protocol used is HTTP version 1.1 and that the request succeeded (status code 200, with the description OK).
The response headers contain useful information similar to the headers in the request. The entity body of the response is the HTML content of the response itself. The headers and the entity body are separated by a blank line (a CRLF sequence).
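To make the layout of a response concrete, here is a minimal sketch that assembles the status line, two headers, the blank line, and an HTML body into one byte array. The class HttpResponseBuilder and its build method are hypothetical names used only for illustration. The Content-Length value is computed from the encoded body, on the assumption that the body is sent as UTF-8.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of the article's application.
class HttpResponseBuilder {
    static byte[] build(int status, String description, String html) {
        byte[] body = html.getBytes(StandardCharsets.UTF_8);
        // Protocol-Status code-Description, then headers, then the blank line.
        String head = "HTTP/1.1 " + status + " " + description + "\r\n"
                    + "Content-Type: text/html\r\n"
                    + "Content-Length: " + body.length + "\r\n"
                    + "\r\n";
        byte[] headBytes = head.getBytes(StandardCharsets.US_ASCII);

        // Concatenate head and body into a single message.
        byte[] message = new byte[headBytes.length + body.length];
        System.arraycopy(headBytes, 0, message, 0, headBytes.length);
        System.arraycopy(body, 0, message, headBytes.length, body.length);
        return message;
    }

    public static void main(String[] args) {
        byte[] message = build(200, "OK",
            "<html><body>Welcome to Brainy Software</body></html>");
        System.out.print(new String(message, StandardCharsets.UTF_8));
    }
}
```

Computing the length from the bytes, rather than typing it by hand, keeps the header consistent with the body if the HTML changes.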
The Socket Class
A socket is an endpoint of a network connection. A socket enables an application to read from and write to the network. Two software applications residing on two different computers can communicate with each other by sending and receiving byte streams over a connection. To send a message to another application, you need to know its IP address, as well as the port number of its socket. In Java, a socket is represented by the java.net.Socket class.
To create a socket, you can use one of the many constructors of the Socket class. One of these constructors accepts the host name and the port number:
public Socket(String host, int port)
where host is the remote machine name or IP address, and port is the port number of the remote application. For example, to connect to yahoo.com at port 80, you would construct the following socket:
new Socket("yahoo.com", 80);
Once you create an instance of the Socket class successfully, you can use it to send and receive streams of bytes. To send byte streams, you must first call the Socket class' getOutputStream method to obtain a java.io.OutputStream object. To send text to a remote application, you often want to construct a java.io.PrintWriter object from the OutputStream object returned. To receive byte streams from the other end of the connection, you call the Socket class' getInputStream method, which returns a java.io.InputStream.
The following snippet creates a socket that can communicate with a local HTTP server (127.0.0.1 denotes a local host), sends an HTTP request, and receives the response from the server. It creates a StringBuffer object to hold the response, and prints it to the console.
// requires java.net.Socket and java.io.* to be imported
Socket socket = new Socket("127.0.0.1", 8080);
boolean autoflush = true;
PrintWriter out = new PrintWriter(socket.getOutputStream(), autoflush);
BufferedReader in = new BufferedReader(
    new InputStreamReader(socket.getInputStream()));
// send an HTTP request to the web server
out.println("GET /index.jsp HTTP/1.1");
out.println("Host: localhost:8080");
out.println("Connection: Close");
out.println();
// read the response
boolean loop = true;
StringBuffer sb = new StringBuffer(8096);
while (loop) {
    if (in.ready()) {
        // read until the server closes the connection (read returns -1)
        int i = in.read();
        while (i != -1) {
            sb.append((char) i);
            i = in.read();
        }
        loop = false;
    }
    Thread.sleep(50);
}
// display the response on the console
System.out.println(sb.toString());
socket.close();
Note that to get a proper response from the web server, you need to send an HTTP request that complies with the HTTP protocol. If you have read the previous section, "The Hypertext Transfer Protocol (HTTP)," you can understand the HTTP request in the code above.
The ServerSocket Class
The Socket class represents a "client" socket; a socket that you construct whenever you want to connect to a remote server application. If you want to implement a server application, such as an HTTP server or an FTP server, you need a different approach. This is because your server must stand by all the time, as it does not know when a client application will try to connect to it.
For this purpose, you need to use the java.net.ServerSocket class. This is an implementation of a server socket. A server socket waits for a connection request from a client. Once it receives a connection request, it creates a Socket instance to handle the communication with the client.
To create a server socket, you need to use one of the four constructors the ServerSocket class provides. You need to specify the IP address and port number on which the server socket will listen. Typically, the IP address will be 127.0.0.1, meaning that the server socket will be listening on the local machine. The IP address the server socket is listening on is referred to as the binding address. Another important property of a server socket is its backlog, which is the maximum queue length for incoming connection requests before the server socket starts to refuse incoming requests.
One of the constructors of the ServerSocket class has the following signature:
public ServerSocket(int port, int backLog, InetAddress bindingAddress);
For this constructor, the binding address must be an instance of java.net.InetAddress. An easy way to construct an InetAddress object is by calling its static method getByName, passing a String containing the host name:
InetAddress.getByName("127.0.0.1");
The following line of code constructs a ServerSocket that listens on port 8080 of the local machine with a backlog of 1.
new ServerSocket(8080, 1, InetAddress.getByName("127.0.0.1"));
Once you have a ServerSocket instance, you can tell it to wait for incoming connection requests by calling the accept method. This method will only return when there is a connection request. It returns an instance of the Socket class. This Socket object can then be used to send and receive byte streams from the client application, as explained in the section "The Socket Class." In practice, accept is the only ServerSocket method used in the application accompanying this article.
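To see the two classes working together without an external server, the following sketch runs a tiny one-shot echo server and a client in the same process. The class name LoopbackDemo and the "echo: " reply format are my own assumptions for illustration. Port 0 asks the operating system for any free port, so the sketch does not collide with a server already using 8080.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical demo for illustration only; not part of the article's application.
class LoopbackDemo {
    static String roundTrip(String message) throws Exception {
        // Port 0 lets the operating system pick a free port; the backlog is 1.
        try (ServerSocket server =
                 new ServerSocket(0, 1, InetAddress.getByName("127.0.0.1"))) {
            int port = server.getLocalPort();

            // Server side: accept blocks until the client connects.
            Thread serverThread = new Thread(() -> {
                try (Socket s = server.accept();
                     BufferedReader in = new BufferedReader(
                         new InputStreamReader(s.getInputStream()));
                     PrintWriter out = new PrintWriter(s.getOutputStream(), true)) {
                    out.println("echo: " + in.readLine());
                } catch (IOException e) {
                    e.printStackTrace();
                }
            });
            serverThread.start();

            // Client side: connect, send one line, read one line back.
            try (Socket client = new Socket("127.0.0.1", port);
                 PrintWriter out = new PrintWriter(client.getOutputStream(), true);
                 BufferedReader in = new BufferedReader(
                     new InputStreamReader(client.getInputStream()))) {
                out.println(message);
                String reply = in.readLine();
                serverThread.join();
                return reply;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip("hello"));
    }
}
```

The Socket returned by accept on the server side and the Socket constructed on the client side are the two endpoints of the same connection.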
Source Code: Download the HowWebServersWork.zip file for the example application.
The Application
Our web server application is part of the ex01.pyrmont package and consists of three classes:
- HttpServer
- Request
- Response
The entry point of this application (the static main method) is in the HttpServer class. It creates an instance of HttpServer and calls its await method. As the name implies, await waits for HTTP requests on a designated port, processes them, and sends responses back to the clients. It keeps waiting until a shutdown command is received. (The method name await is used instead of wait because wait is an important method in the java.lang.Object class for working with threads.)
The application only sends static resources, such as HTML and image files, from a specified directory. It supports no headers (such as dates or cookies).
We'll now take a look at the three classes in the following subsections.
The HttpServer Class
The HttpServer class represents a web server and can serve static resources found in the directory indicated by the public static final field WEB_ROOT and in all subdirectories under it. WEB_ROOT is initialized as follows:
public static final String WEB_ROOT =
System.getProperty("user.dir") + File.separator + "webroot";
The code listings include a directory called webroot that contains some static resources that you can use for testing this application. You can also find a servlet that will be used for my next article, "How Servlet Containers Work."
To request a static resource, type the following URL in your browser's Address or URL box:
http://machineName:port/staticResource
If you are sending a request from a machine other than the one running your application, machineName is the name or IP address of the computer running this application. (Note that the await method shown later binds the server socket to 127.0.0.1, so as written it accepts connections only from the local machine.) If your browser is on the same machine, you can use localhost for the machineName. port is 8080 and staticResource is the name of the file requested, which must reside in WEB_ROOT.
For instance, if you are using the same computer to test the application and you want to ask the HttpServer to send the index.html file, use the following URL:
http://localhost:8080/index.html
To stop the server, send a shutdown command from a web browser by typing the pre-defined string in the browser's Address or URL box, after the host:port section of the URL. The shutdown command is defined by the SHUTDOWN_COMMAND static final variable in the HttpServer class:
private static final String SHUTDOWN_COMMAND = "/SHUTDOWN";
Therefore, to stop the server, you can use:
http://localhost:8080/SHUTDOWN
Now, let's have a look at the await method, which is given in Listing 1.1. The code is explained right after the listing.
Listing 1.1. The HttpServer class' await method
public void await() {
    ServerSocket serverSocket = null;
    int port = 8080;
    try {
        serverSocket = new ServerSocket(port, 1,
            InetAddress.getByName("127.0.0.1"));
    }
    catch (IOException e) {
        e.printStackTrace();
        System.exit(1);
    }
    // Loop waiting for a request
    while (!shutdown) {
        Socket socket = null;
        InputStream input = null;
        OutputStream output = null;
        try {
            socket = serverSocket.accept();
            input = socket.getInputStream();
            output = socket.getOutputStream();
            // create Request object and parse
            Request request = new Request(input);
            request.parse();
            // create Response object
            Response response = new Response(output);
            response.setRequest(request);
            response.sendStaticResource();
            // Close the socket
            socket.close();
            // check if the previous URI is a shutdown command
            shutdown = request.getUri().equals(SHUTDOWN_COMMAND);
        }
        catch (Exception e) {
            e.printStackTrace();
            continue;
        }
    }
}
The await method starts by creating a ServerSocket instance and then going into a while loop.
serverSocket = new ServerSocket(
port, 1, InetAddress.getByName("127.0.0.1"));
...
// Loop waiting for a request
while (!shutdown) {
...
}
The code inside the while loop stops at the accept method of ServerSocket, which returns only when a client connects to port 8080 to send an HTTP request:
socket = serverSocket.accept();
Upon receiving a request, the await method obtains the java.io.InputStream and the java.io.OutputStream objects from the Socket instance returned by the accept method.
input = socket.getInputStream();
output = socket.getOutputStream();
The await method then creates a Request object and calls its parse method to parse the raw HTTP request.
// create Request object and parse
Request request = new Request(input);
request.parse();
Next, the await method creates a Response object, sets the Request object to it, and calls its sendStaticResource method.
// create Response object
Response response = new Response(output);
response.setRequest(request);
response.sendStaticResource();
Finally, the await method closes the Socket and calls the getUri method of Request to check if the URI of the HTTP request is a shutdown command. If it is, the shutdown variable is set to true and the program exits the while loop.
// Close the socket
socket.close();
//check if the previous URI is a shutdown command
shutdown = request.getUri().equals(SHUTDOWN_COMMAND);
The Request Class
The Request class represents an HTTP request. An instance of this class is constructed by passing the InputStream object obtained from a Socket that handles the communication with the client. Call one of the read methods of the InputStream object to obtain the HTTP request raw data.
The Request class has two public methods: parse and getUri. The parse method parses the raw data in the HTTP request. It doesn't do much--the only information it makes available is the URI of the HTTP request, which it obtains by calling the private method parseUri. The parse method stores the URI returned by parseUri in the uri variable. Invoke the public getUri method to return the URI of the HTTP request.
To understand how the parse and parseUri methods work, you need to know the structure of an HTTP request, which is defined in RFC 2616.
An HTTP request contains three parts:
- Request line
- Headers
- Message body
For now, we are only interested in the first part of the HTTP request, the request line. A request line begins with a method token, is followed by the request URI and the protocol version, and ends with carriage-return linefeed (CRLF) characters. Elements in the request line are separated by a space character. For instance, the request line for a request for the index.html file using the GET method is:
GET /index.html HTTP/1.1
The parse method reads up to 2,048 bytes from the socket's InputStream passed to the Request object and stores them in a byte array called buffer. It then populates a StringBuffer object called request using the bytes in the buffer array, and passes the String representation of the StringBuffer to the parseUri method.
The parse method is given in Listing 1.2.
Listing 1.2. The Request class' parse method
public void parse() {
    // Read a set of characters from the socket
    StringBuffer request = new StringBuffer(2048);
    int i;
    byte[] buffer = new byte[2048];
    try {
        i = input.read(buffer);
    }
    catch (IOException e) {
        e.printStackTrace();
        i = -1;
    }
    for (int j=0; j<i; j++) {
        request.append((char) buffer[j]);
    }
    System.out.print(request.toString());
    uri = parseUri(request.toString());
}
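The reading step in Listing 1.2 can be tried in isolation by feeding it a java.io.ByteArrayInputStream instead of a socket stream. The sketch below is a hypothetical stand-alone variant (the class RequestReadSketch and the method readRequest are not in the article's code). It decodes the bytes as ISO-8859-1, which for plain ASCII requests yields the same text as the character-by-character cast in the listing.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-alone sketch; the article's Request class keeps this logic in parse().
class RequestReadSketch {
    static String readRequest(InputStream input) throws IOException {
        byte[] buffer = new byte[2048];
        // A single read call: at most 2,048 bytes are returned, or -1 at end of stream.
        int n = input.read(buffer);
        if (n <= 0) {
            return "";
        }
        return new String(buffer, 0, n, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) throws IOException {
        byte[] raw = "GET /index.html HTTP/1.1\r\nHost: localhost:8080\r\n\r\n"
            .getBytes(StandardCharsets.ISO_8859_1);
        System.out.print(readRequest(new ByteArrayInputStream(raw)));
    }
}
```

Because only one read is performed, anything beyond the first 2,048 bytes is ignored. That is enough for the request line, which is all this application needs.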
The parseUri method then obtains the URI from the request line. Listing 1.3 shows the parseUri method, which searches for the first and the second spaces in the request and returns the text between them as the URI.
Listing 1.3. The Request class' parseUri method
private String parseUri(String requestString) {
    int index1, index2;
    index1 = requestString.indexOf(' ');
    if (index1 != -1) {
        index2 = requestString.indexOf(' ', index1 + 1);
        if (index2 > index1)
            return requestString.substring(index1 + 1, index2);
    }
    return null;
}
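To check how the two-space search behaves on concrete input, the logic of Listing 1.3 can be copied into a small class with a static method. The wrapper class RequestLineSketch is hypothetical; the method body follows the listing.

```java
// Hypothetical wrapper around the logic of Listing 1.3, for experimentation only.
class RequestLineSketch {
    static String parseUri(String requestString) {
        // The URI sits between the first and the second space of the request line.
        int index1 = requestString.indexOf(' ');
        if (index1 != -1) {
            int index2 = requestString.indexOf(' ', index1 + 1);
            if (index2 > index1) {
                return requestString.substring(index1 + 1, index2);
            }
        }
        return null; // fewer than two spaces: no URI could be found
    }

    public static void main(String[] args) {
        System.out.println(parseUri("GET /index.html HTTP/1.1\r\nHost: localhost:8080\r\n\r\n"));
        System.out.println(parseUri("GET /SHUTDOWN HTTP/1.1\r\n\r\n"));
        System.out.println(parseUri("GET"));
    }
}
```

Note the null return for a malformed request. In the await method, calling equals on that null URI throws a NullPointerException, which the catch block there prints before the loop continues.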
The Response Class
The Response class represents an HTTP response. Its constructor accepts an OutputStream object, as follows:
public Response(OutputStream output) {
    this.output = output;
}
A Response object is constructed by the HttpServer class' await method by passing the OutputStream object obtained from the socket.
The Response class has two public methods: setRequest and sendStaticResource. The setRequest method is used to pass a Request object to the Response object. It is as simple as the code in Listing 1.4.
Listing 1.4. The Response class' setRequest method
public void setRequest(Request request) {
    this.request = request;
}
The sendStaticResource method is used to send a static resource, such as an HTML file. Its implementation is given in Listing 1.5.
Listing 1.5. The Response class' sendStaticResource method
public void sendStaticResource() throws IOException {
    byte[] bytes = new byte[BUFFER_SIZE];
    FileInputStream fis = null;
    try {
        File file = new File(HttpServer.WEB_ROOT, request.getUri());
        if (file.exists()) {
            fis = new FileInputStream(file);
            int ch = fis.read(bytes, 0, BUFFER_SIZE);
            while (ch != -1) {
                output.write(bytes, 0, ch);
                ch = fis.read(bytes, 0, BUFFER_SIZE);
            }
        }
        else {
            // file not found
            String errorMessage = "HTTP/1.1 404 File Not Found\r\n" +
                "Content-Type: text/html\r\n" +
                "Content-Length: 23\r\n" +
                "\r\n" +
                "<h1>File Not Found</h1>";
            output.write(errorMessage.getBytes());
        }
    }
    catch (Exception e) {
        // thrown if the file cannot be opened or read
        System.out.println(e.toString());
    }
    finally {
        if (fis != null)
            fis.close();
    }
}
The sendStaticResource method is very simple. It first instantiates the java.io.File class by passing the parent and child paths to the File class' constructor.
File file = new File(HttpServer.WEB_ROOT, request.getUri());
It then checks if the file exists. If it does, the sendStaticResource method constructs a java.io.FileInputStream object by passing the File object. It then repeatedly invokes the read method of the FileInputStream and writes the bytes read to the OutputStream output. Note that in this case, the content of the static resource is sent to the browser as raw data, without a status line or headers.
if (file.exists()) {
    fis = new FileInputStream(file);
    int ch = fis.read(bytes, 0, BUFFER_SIZE);
    while (ch != -1) {
        output.write(bytes, 0, ch);
        ch = fis.read(bytes, 0, BUFFER_SIZE);
    }
}
If the file does not exist, the sendStaticResource method sends an error message to the browser.
String errorMessage = "HTTP/1.1 404 File Not Found\r\n" +
    "Content-Type: text/html\r\n" +
    "Content-Length: 23\r\n" +
    "\r\n" +
    "<h1>File Not Found</h1>";
output.write(errorMessage.getBytes());
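The same two branches can be exercised without a socket by writing to any OutputStream, for example a java.io.ByteArrayOutputStream. The following is a minimal sketch under stated assumptions: the class StaticResourceSketch, the send method, and a BUFFER_SIZE of 1024 are my own choices (the article does not show the value of BUFFER_SIZE). It uses try-with-resources in place of the finally block and computes Content-Length from the body instead of hard-coding 23.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical variant of sendStaticResource, for illustration only.
class StaticResourceSketch {
    static final int BUFFER_SIZE = 1024; // assumed value

    static void send(File file, OutputStream output) throws IOException {
        if (file.isFile()) {
            // Copy the file to the output in BUFFER_SIZE chunks, as raw data.
            try (InputStream in = new FileInputStream(file)) {
                byte[] bytes = new byte[BUFFER_SIZE];
                int n;
                while ((n = in.read(bytes, 0, BUFFER_SIZE)) != -1) {
                    output.write(bytes, 0, n);
                }
            }
        } else {
            // File not found: send a complete HTTP response with a 404 status.
            String body = "<h1>File Not Found</h1>";
            String message = "HTTP/1.1 404 File Not Found\r\n"
                + "Content-Type: text/html\r\n"
                + "Content-Length: " + body.length() + "\r\n"
                + "\r\n"
                + body;
            output.write(message.getBytes(StandardCharsets.US_ASCII));
        }
    }

    public static void main(String[] args) throws IOException {
        // With no argument, this prints the 404 response for a file that should not exist.
        File file = new File(args.length > 0 ? args[0] : "no-such-file.html");
        send(file, System.out);
        System.out.flush();
    }
}
```

Using body.length() for the header is safe here only because the body is pure ASCII; for other text, the length of the encoded byte array should be used.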
Compiling and Running the Application
To compile and run the application, you first need to extract the .zip file containing the application for this article. The directory you extract the .zip file into is called the working directory and will have three sub-directories: src/, classes/, and lib/. To compile the application, type the following from the working directory:
javac -d . src/ex01/pyrmont/*.java
The -d . option tells javac to write the compiled class files under the current directory rather than under src/.
To run the application, type the following from the working directory:
java ex01.pyrmont.HttpServer
To test the application, open your browser and type the following in the URL or Address box:
http://localhost:8080/index.html
You will see the index.html page displayed in your browser, as in Figure 1.
Figure 1. The output from the web server
On the console, you can see something like the following:
GET /index.html HTTP/1.1
Accept: */*
Accept-Language: en-us
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 4.01; Windows 98)
Host: localhost:8080
Connection: Keep-Alive
GET /images/logo.gif HTTP/1.1
Accept: */*
Referer: http://localhost:8080/index.html
Accept-Language: en-us
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 4.01; Windows 98)
Host: localhost:8080
Connection: Keep-Alive
Summary
In this article, you have seen how a simple web server works. The application accompanying this article consists of only three classes and is not fully functional. Nevertheless, it serves as a good learning tool.
Budi Kurniawan is a senior J2EE architect and author.