Inside the clickstream library you'll find a ClickstreamFilter class that captures request information, a Clickstream class that operates like a struct to hold data, and a ClickstreamLogger class that captures session and context events to glue everything together. There's also a BotChecker class that determines if a client is a robot (using simple logic, like "Did they request robots.txt?"). To view the data, the library provides a clickstreams.jsp visitor summary page and a supporting viewstream.jsp visitor detail page.
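The Clickstream and BotChecker sources are not shown here. As a rough illustration only, the following sketch shows what a struct-like holder and a robots.txt check could look like. The names SimpleClickstream and SimpleBotChecker are hypothetical, and the sketch takes plain strings instead of an HttpServletRequest so that it runs without a servlet container; the real classes may differ.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical, simplified stand-in for the library's Clickstream class.
// It records plain strings rather than HttpServletRequest data.
class SimpleClickstream {
    private final List<String> stream = new ArrayList<>();
    private String hostname;
    private boolean bot;

    // Record one request and update the bot flag.
    void addRequest(String remoteHost, String requestUri) {
        if (hostname == null) {
            hostname = remoteHost; // remember the first host seen
        }
        if (SimpleBotChecker.looksLikeBot(requestUri)) {
            bot = true; // once flagged, the visitor stays flagged
        }
        stream.add(requestUri);
    }

    String getHostname() { return hostname; }

    boolean isBot() { return bot; }

    // Read-only view of the requests recorded so far.
    List<String> getStream() { return Collections.unmodifiableList(stream); }
}

// Hypothetical bot check based on the rule quoted in the text:
// a client that asks for robots.txt is treated as a robot.
class SimpleBotChecker {
    static boolean looksLikeBot(String requestUri) {
        return requestUri != null && requestUri.endsWith("/robots.txt");
    }
}
```

A visitor that requests /index.html and then /robots.txt would end up with two recorded requests and isBot() returning true.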
We'll look first at the ClickstreamFilter class. All these examples are slightly modified from the original, for formatting and to fix portability issues, which I'll discuss later.
import java.io.IOException;
import javax.servlet.*;
import javax.servlet.http.*;

public class ClickstreamFilter implements Filter {
    protected FilterConfig filterConfig;
    private final static String FILTER_APPLIED = "_clickstream_filter_applied";

    public void init(FilterConfig config) throws ServletException {
        // Store the config passed in by the container.
        this.filterConfig = config;
    }

    public void doFilter(ServletRequest request, ServletResponse response,
                         FilterChain chain) throws IOException, ServletException {
        // Ensure that filter is only applied once per request.
        if (request.getAttribute(FILTER_APPLIED) == null) {
            request.setAttribute(FILTER_APPLIED, Boolean.TRUE);
            HttpSession session = ((HttpServletRequest)request).getSession();
            Clickstream stream = (Clickstream)session.getAttribute("clickstream");
            stream.addRequest(((HttpServletRequest)request));
        }
        // pass the request on
        chain.doFilter(request, response);
    }

    public void destroy() { }
}
The doFilter() method gets the user session, obtains the Clickstream from the session, and adds the current request data to the Clickstream. It uses a special FILTER_APPLIED marker attribute to note if the filter was already applied for this request (as might happen during request dispatching) and to ignore any follow-on filtering action. You might wonder how the filter knows that the clickstream attribute will be present in the session. That's because the ClickstreamLogger places it there when the session is created. Here is the ClickstreamLogger code:
import java.util.*;
import javax.servlet.*;
import javax.servlet.http.*;

public class ClickstreamLogger implements ServletContextListener,
                                          HttpSessionListener {
    Map clickstreams = new HashMap();

    public ClickstreamLogger() { }

    public void contextInitialized(ServletContextEvent sce) {
        sce.getServletContext().setAttribute("clickstreams", clickstreams);
    }

    public void contextDestroyed(ServletContextEvent sce) {
        sce.getServletContext().setAttribute("clickstreams", null);
    }

    public void sessionCreated(HttpSessionEvent hse) {
        HttpSession session = hse.getSession();
        Clickstream clickstream = new Clickstream();
        session.setAttribute("clickstream", clickstream);
        clickstreams.put(session.getId(), clickstream);
    }

    public void sessionDestroyed(HttpSessionEvent hse) {
        // Only the session ID is needed here, so no session attributes are read.
        HttpSession session = hse.getSession();
        clickstreams.remove(session.getId());
    }
}
The logger receives application events and uses them to bind everything together. On context creation, the logger places a shared map of streams into the context. This allows the clickstreams.jsp page to know what streams are currently active. On context destruction, the logger removes the map. When a new visitor creates a new session, the logger places a new Clickstream instance into the session and adds the Clickstream to the central map of streams. On session destruction, the logger removes the stream from the central map.
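One thing to keep in mind with a shared map like this: session events can fire on different threads while a page is iterating over the map, and a plain HashMap is not safe for that kind of concurrent use. The following is a minimal sketch of the same bookkeeping on top of a ConcurrentHashMap. The StreamRegistry class and its method names are hypothetical and not part of the library; it uses session IDs as plain strings so it can run outside a container.

```java
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical registry mirroring what ClickstreamLogger does with its map.
class StreamRegistry<T> {
    private final Map<String, T> streams = new ConcurrentHashMap<>();

    // Corresponds to sessionCreated(): register the stream under the session ID.
    void sessionCreated(String sessionId, T stream) {
        streams.put(sessionId, stream);
    }

    // Corresponds to sessionDestroyed(): drop the stream for that session.
    void sessionDestroyed(String sessionId) {
        streams.remove(sessionId);
    }

    // Read-only view for a summary page. Iterating it while sessions come
    // and go does not throw ConcurrentModificationException.
    Map<String, T> activeStreams() {
        return Collections.unmodifiableMap(streams);
    }
}
```

In the listener itself, the equivalent change would be to swap the HashMap for a concurrent map; whether that matters in practice depends on how busy the site is and how often the summary page is viewed.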
The following web.xml deployment-descriptor snippet wires everything together:
<filter>
  <filter-name>clickstreamFilter</filter-name>
  <filter-class>ClickstreamFilter</filter-class>
</filter>
<filter-mapping>
  <filter-name>clickstreamFilter</filter-name>
  <url-pattern>*.jsp</url-pattern>
</filter-mapping>
<filter-mapping>
  <filter-name>clickstreamFilter</filter-name>
  <url-pattern>*.html</url-pattern>
</filter-mapping>
<listener>
  <listener-class>ClickstreamLogger</listener-class>
</listener>
This registers the ClickstreamFilter and sets it up to handle *.jsp and *.html requests. This also registers the ClickstreamLogger as a listener to receive application events when they occur.
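To make the pattern rules concrete, here is a simplified sketch of how a container might match a url-pattern against a context-relative request path. It covers only extension mappings (*.ext), path mappings (/prefix/*), and exact matches, and it is not a full implementation of the servlet specification. The UrlPatternSketch class is hypothetical.

```java
// Hypothetical, simplified url-pattern matcher for illustration only.
class UrlPatternSketch {
    static boolean matches(String pattern, String path) {
        if (pattern.startsWith("*.")) {
            // Extension mapping: compare the suffix of the last path segment.
            String ext = pattern.substring(1); // e.g. ".jsp"
            String lastSegment = path.substring(path.lastIndexOf('/') + 1);
            return lastSegment.endsWith(ext);
        }
        if (pattern.startsWith("/") && pattern.endsWith("/*")) {
            // Path mapping: "/admin/*" matches "/admin" and anything below it.
            String prefix = pattern.substring(0, pattern.length() - 2);
            return path.equals(prefix) || path.startsWith(prefix + "/");
        }
        // Anything else is treated as an exact match.
        return pattern.equals(path);
    }
}
```

Under these rules a pattern such as /*.html is neither an extension mapping nor a path mapping, so a strict matcher falls through to the exact-match case and the pattern matches no ordinary page.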
The two JSP pages pull the clickstream data from the session and context objects and use an HTML interface to display the current status. The following clickstreams.jsp file shows the overall summary:
<%@ page import="java.util.*" %>
<%@ page import="Clickstream" %>
<%
  Map clickstreams = (Map)application.getAttribute("clickstreams");
  String showbots = "false";
  if (request.getParameter("showbots") != null) {
    if (request.getParameter("showbots").equals("true"))
      showbots = "true";
    else if (request.getParameter("showbots").equals("both"))
      showbots = "both";
  }
%>
<font face="Verdana" size="-1">
<h1>All Clickstreams</h1>
<a href="clickstreams.jsp?showbots=false">No Bots</a> |
<a href="clickstreams.jsp?showbots=true">All Bots</a> |
<a href="clickstreams.jsp?showbots=both">Both</a> <p>
<% if (clickstreams.keySet().size() == 0) { %>
  No clickstreams in progress
<% } %>
<%
  Iterator it = clickstreams.keySet().iterator();
  int count = 0;
  while (it.hasNext()) {
    String key = (String)it.next();
    Clickstream stream = (Clickstream)clickstreams.get(key);
    if (showbots.equals("false") && stream.isBot()) {
      continue;
    }
    else if (showbots.equals("true") && !stream.isBot()) {
      continue;
    }
    count++;
    try {
%>
<%= count %>.
<a href="viewstream.jsp?sid=<%= key %>"><b>
<%= (stream.getHostname() != null && !stream.getHostname().equals("") ?
     stream.getHostname() : "Stream") %>
</b></a> <font size="-1">[<%= stream.getStream().size() %> reqs]</font><br>
<%
    }
    catch (Exception e) {
%>
An error occurred - <%= e %><br>
<%
    }
  }
%>
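The showbots handling in the page above boils down to a three-way switch: "false" lists only human visitors, "true" lists only bots, and "both" lists everyone. As a small sketch, the same logic can be pulled out of the scriptlet into plain methods that are easier to test. The ShowBotsFilter class and its method names are hypothetical and not part of the library.

```java
// Hypothetical helper that mirrors the scriptlet's showbots logic.
class ShowBotsFilter {
    // Normalizes the raw request parameter the same way the scriptlet does:
    // anything other than "true" or "both" falls back to "false".
    static String normalize(String param) {
        if ("true".equals(param)) return "true";
        if ("both".equals(param)) return "both";
        return "false";
    }

    // Returns true if a stream should be listed for the given mode.
    static boolean shouldShow(String showbots, boolean isBot) {
        if ("false".equals(showbots)) return !isBot; // humans only
        if ("true".equals(showbots)) return isBot;   // bots only
        return true;                                 // "both"
    }
}
```

With a helper like this, the loop body in the page would reduce to a single check such as skipping the entry when shouldShow(showbots, stream.isBot()) is false.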
The package is fairly easy to download and install from the OpenSymphony Website. Place and compile the Java files in WEB-INF/classes, put the JSP files in the Web application root, and modify the web.xml file as instructed. To save you the hassle of even this much work, you can find a prepackaged WAR file available at http://www.javaworld.com/jw-06-2001/Filters/clickstream.war.
For the filter to work on Tomcat 4.0 beta 5, I found I had to make some slight portability modifications. The changes I made show some common pitfalls in servlet and filter portability, so I'll list them here:
- I had to add an extra import line to the JSP files: <%@ page import="Clickstream" %>. In Java you don't have to import classes within your own package, so on servers where JSPs compile into the default package, you don't need an import line like this. But on servers like Tomcat where JSPs compile into a custom package, you have to explicitly import classes in the default package.
- I had to move the <listener> element in the web.xml file after the <filter> and <filter-mapping> elements, as required by the deployment descriptor DTD. Not all servers require elements to be in the proper order, but Tomcat does.
- I had to change the web.xml mapping from /*.html and /*.jsp to the more correct *.html and *.jsp. Some servers are forgiving of the leading slash, but Tomcat rigidly enforces the rule that prohibits that slash.
- Finally, I brought the ClickstreamFilter class up to the latest lifecycle API, changing setFilterConfig() to the newer init() and destroy() methods.
The downloadable WAR contains all these modifications and should run out-of-the-box across servers, although I haven't tested it widely.