The HTTP/1.1 specification (RFC 2616) defines a number of methods: OPTIONS, GET, HEAD, POST, PUT, DELETE, TRACE and CONNECT.
Of these, the most familiar are GET and POST.
Web browsers rely on these two methods to exchange data with web servers, in compliance with the W3C HTML 4 recommendation.
GET is meant for retrieving content from a web server. The requests should be safe and idempotent: no critical state should change, and successive requests for an unchanged resource should return the same content. (This makes them ideal candidates for caching!)
POST, on the other hand, is typically used for operations that manipulate content on the server, such as adding, editing or removing content.
So, what about the other methods, you might ask?
Well, it turns out browsers are not the only clients talking to our web servers. The web of 2009 has two more essential infrastructure components that our servers frequently have to deal with: proxy servers and web crawlers.
And it turns out these two types of clients are very fond of a third HTTP method: HEAD.
HEAD is identical to GET except that only the HTTP headers are returned; the server must not send a body. It is primarily used for checking the validity of URLs. The load on the server will most likely remain the same, because the Content-Length header is expected to match what a GET would return and may therefore have to be calculated from the generated response body. Only bandwidth is saved.
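To see the difference from the client side, here is a minimal sketch that sends a GET and a HEAD to the same URL and compares the announced Content-Length with the number of body bytes actually received. It assumes the JDK's built-in com.sun.net.httpserver.HttpServer as a throwaway stand-in for a real web server; the class name HeadVersusGet, the /page path and the page content are made up for illustration.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class HeadVersusGet {

    // Hypothetical page content served by the demo server.
    static final byte[] PAGE = "<html><body>Hello</body></html>".getBytes(StandardCharsets.UTF_8);

    /** Answers GET with headers and body, and HEAD with the same headers but no body. */
    static void handle(HttpExchange exchange) throws IOException {
        exchange.getResponseHeaders().set("Content-Type", "text/html; charset=UTF-8");
        if ("HEAD".equals(exchange.getRequestMethod())) {
            // For HEAD the length header is set by hand; -1 means "no body follows".
            exchange.getResponseHeaders().set("Content-Length", String.valueOf(PAGE.length));
            exchange.sendResponseHeaders(200, -1);
        } else {
            exchange.sendResponseHeaders(200, PAGE.length);
            exchange.getResponseBody().write(PAGE);
        }
        exchange.close();
    }

    /** Sends one request and returns {announced Content-Length, body bytes received}. */
    static int[] probe(URL url, String method) throws IOException {
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setRequestMethod(method);
        int announced = connection.getContentLength();
        int received = 0;
        try (InputStream in = connection.getInputStream()) {
            byte[] buffer = new byte[1024];
            for (int n = in.read(buffer); n != -1; n = in.read(buffer)) {
                received += n;
            }
        } finally {
            connection.disconnect();
        }
        return new int[] {announced, received};
    }

    /** Starts the demo server on a free local port and probes it with GET, then HEAD. */
    static int[][] run() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/page", HeadVersusGet::handle);
        server.start();
        try {
            URL url = URI.create("http://127.0.0.1:" + server.getAddress().getPort() + "/page").toURL();
            return new int[][] {probe(url, "GET"), probe(url, "HEAD")};
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) throws IOException {
        int[][] results = run();
        System.out.println("GET:  Content-Length=" + results[0][0] + ", body bytes=" + results[0][1]);
        System.out.println("HEAD: Content-Length=" + results[1][0] + ", body bytes=" + results[1][1]);
    }
}
```

Both requests should announce the same length, but only the GET should actually deliver that many bytes. That is exactly what a crawler or proxy relies on when it validates a URL with HEAD.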
How do Java servlets deal with this?
Not too badly, it turns out! Deep inside the HttpServlet class (part of the Servlet API 2.5), we find the following code:
protected void doHead(HttpServletRequest req, HttpServletResponse resp)
        throws ServletException, IOException {
    NoBodyResponse response = new NoBodyResponse(resp);
    doGet(req, response);
    response.setContentLength();
}
Here is what the NoBodyResponse wrapper does (from the source code documentation):
A response that includes no body, for use in (dumb) "HEAD" support.
This just swallows that body, counting the bytes in order to set the content length appropriately. All other methods delegate directly to the HttpServletResponse object used to construct this one.
So the Servlet API's standard way of dealing with HEAD requests consists of:
1. Wrapping the response in a NoBodyResponse in order to suppress the body but preserve the headers.
2. Executing the GET functionality of the application (with the wrapped response object).
3. Setting the Content-Length header of the response.
4. Returning the response headers to the client (without the body).
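The byte-counting idea behind these steps can be tried outside a servlet container. The following is a rough sketch using only java.io; CountingSketch and CountingOutputStream are illustrative names, not classes from the Servlet API. It also shows a subtlety: when the body is produced through a writer, the encoder buffers characters, so the count is only reliable after the writer has been flushed.

```java
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;

public class CountingSketch {

    /** Discards every byte written to it and only remembers how many there were. */
    static class CountingOutputStream extends OutputStream {
        private int count;

        @Override
        public void write(int b) {
            count++;
        }

        @Override
        public void write(byte[] buf, int offset, int len) {
            count += len;
        }

        int getCount() {
            return count;
        }
    }

    /** Writes the body as UTF-8 and returns {count before flush, count after flush}. */
    static int[] countUtf8Bytes(String body) {
        CountingOutputStream sink = new CountingOutputStream();
        PrintWriter writer = new PrintWriter(new OutputStreamWriter(sink, StandardCharsets.UTF_8));
        writer.print(body);
        // The encoder may still be holding the characters in its own buffer here.
        int beforeFlush = sink.getCount();
        writer.flush();
        // After the flush every encoded byte has passed through the counting stream.
        int afterFlush = sink.getCount();
        return new int[] {beforeFlush, afterFlush};
    }

    public static void main(String[] args) {
        int[] counts = countUtf8Bytes("h\u00e9llo");
        System.out.println("before flush: " + counts[0] + ", after flush: " + counts[1]);
    }
}
```

For a short body the count before the flush is typically still zero, which is why any wrapper built this way needs to flush its writer before it sets the Content-Length header.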
The solution:
Add a servlet filter (registered in web.xml) that lies about the HTTP method and presents every HEAD request to the application as a GET, while swallowing the response body on the way out.
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletOutputStream;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletRequestWrapper;
import javax.servlet.http.HttpServletResponse;
import javax.servlet.http.HttpServletResponseWrapper;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UnsupportedEncodingException;
/**
 * Servlet filter that presents a HEAD request as a GET.
 * The application doesn't need to know the difference, as this filter handles all the details.
 */
public class HttpHeadFilter implements Filter {

    @Override
    public void init(FilterConfig filterConfig) throws ServletException {
        // Do nothing
    }

    @Override
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
            throws IOException, ServletException {
        HttpServletRequest httpServletRequest = (HttpServletRequest) request;
        if (isHttpHead(httpServletRequest)) {
            HttpServletResponse httpServletResponse = (HttpServletResponse) response;
            NoBodyResponseWrapper noBodyResponseWrapper = new NoBodyResponseWrapper(httpServletResponse);
            // Run the application as if this were a GET, then report the size of the swallowed body.
            chain.doFilter(new ForceGetRequestWrapper(httpServletRequest), noBodyResponseWrapper);
            noBodyResponseWrapper.setContentLength();
        } else {
            chain.doFilter(request, response);
        }
    }

    @Override
    public void destroy() {
        // Do nothing
    }

    /**
     * Checks whether the HTTP method of this request is HEAD,
     * either directly or through a {@code _method=HEAD} request parameter.
     *
     * @param request The request to check.
     * @return {@code true} if it is HEAD, {@code false} if it isn't.
     */
    private boolean isHttpHead(HttpServletRequest request) {
        return "HEAD".equalsIgnoreCase(request.getMethod())
                || "HEAD".equalsIgnoreCase(request.getParameter("_method"));
    }

    /**
     * Request wrapper that lies about the HTTP method and always returns GET.
     */
    private class ForceGetRequestWrapper extends HttpServletRequestWrapper {

        /**
         * Initializes the wrapper with this request.
         *
         * @param request The request to initialize the wrapper with.
         */
        public ForceGetRequestWrapper(HttpServletRequest request) {
            super(request);
        }

        /**
         * Lies about the HTTP method. Always returns GET.
         *
         * @return Always returns GET.
         */
        @Override
        public String getMethod() {
            return "GET";
        }
    }

    /**
     * Response wrapper that swallows the response body, leaving only the headers.
     */
    private class NoBodyResponseWrapper extends HttpServletResponseWrapper {

        /**
         * Output stream that discards the data written to it.
         */
        private final NoBodyOutputStream noBodyOutputStream = new NoBodyOutputStream();
        private PrintWriter writer;

        /**
         * Constructs a response adaptor wrapping the given response.
         *
         * @param response The response to wrap.
         */
        public NoBodyResponseWrapper(HttpServletResponse response) {
            super(response);
        }

        @Override
        public ServletOutputStream getOutputStream() throws IOException {
            return noBodyOutputStream;
        }

        @Override
        public PrintWriter getWriter() throws UnsupportedEncodingException {
            if (writer == null) {
                writer = new PrintWriter(new OutputStreamWriter(noBodyOutputStream, getCharacterEncoding()));
            }
            return writer;
        }

        /**
         * Sets the content length, based on what has been written to the output stream so far.
         */
        void setContentLength() {
            if (writer != null) {
                // Flush buffered characters first; otherwise they would not be counted.
                writer.flush();
            }
            super.setContentLength(noBodyOutputStream.getContentLength());
        }
    }

    /**
     * Output stream that only counts the length of what is being written to it while discarding the actual data.
     */
    private class NoBodyOutputStream extends ServletOutputStream {

        /**
         * The number of bytes written to this stream so far.
         */
        private int contentLength = 0;

        /**
         * @return The number of bytes written to this stream so far.
         */
        int getContentLength() {
            return contentLength;
        }

        @Override
        public void write(int b) {
            contentLength++;
        }

        @Override
        public void write(byte[] buf, int offset, int len) throws IOException {
            contentLength += len;
        }
    }
}
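Registering the filter in web.xml could look roughly like the fragment below. The package com.example and the filter name httpHeadFilter are placeholders, and the /* mapping assumes you want every request to pass through the filter; adjust them to your application.

```xml
<!-- com.example is a hypothetical package; use the one HttpHeadFilter actually lives in. -->
<filter>
    <filter-name>httpHeadFilter</filter-name>
    <filter-class>com.example.HttpHeadFilter</filter-class>
</filter>

<!-- Assumption: apply the filter to all URLs of the web application. -->
<filter-mapping>
    <filter-name>httpHeadFilter</filter-name>
    <url-pattern>/*</url-pattern>
</filter-mapping>
```

Filters run in the order of their filter-mapping entries, so placing this mapping before the others lets the rest of the chain see the request as a GET.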
Conclusion
We now have a drop-in solution that should work with any web framework and any servlet container. It allows us to transparently support HTTP HEAD requests in our applications and finally treat web crawlers and proxy servers as first-class citizens.