Apache HTTP Server Version 1.3
Apache 1.3 URL Rewriting Guide
Originally written by Ralf S. Engelschall <rse@apache.org>
December 1997
This document supplements the mod_rewrite reference documentation. It describes how one can use Apache's mod_rewrite to solve typical URL-based problems with which webmasters are often confronted. We give detailed descriptions on how to solve each problem by configuring URL rewriting rulesets.
Introduction to mod_rewrite
The Apache module mod_rewrite is a killer one, i.e. it is a really sophisticated module which provides a powerful way to do URL manipulations. With it you can do nearly all types of URL manipulations you ever dreamed about. The price you have to pay is to accept complexity, because mod_rewrite's major drawback is that it is not easy to understand and use for the beginner. And even Apache experts sometimes discover new aspects where mod_rewrite can help.
In other words: With mod_rewrite you either shoot yourself in the foot the first time and never use it again or love it for the rest of your life because of its power. This paper tries to give you a few initial success events to avoid the first case by presenting already invented solutions to you.
Practical Solutions
Here come a lot of practical solutions I've either invented myself or collected from other people's solutions in the past. Feel free to learn the black magic of URL rewriting from these examples.
ATTENTION: Depending on your server configuration it can be necessary to slightly change the examples for your situation, e.g. adding the [PT] flag when additionally using mod_alias and mod_userdir, etc. Or rewriting a ruleset to fit in .htaccess context instead of per-server context. Always try to understand what a particular ruleset really does before you use it, in order to avoid problems.
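To illustrate the kind of adaptation meant here, the following is a minimal sketch of one and the same redirect, written once for per-server context and once for .htaccess context. The /docs/ directory and the file names old.html and new.html are hypothetical and only serve the illustration. In per-directory context the pattern is matched against the path relative to the directory, so the leading directory prefix disappears from the pattern and RewriteBase supplies the URL prefix instead.

```apache
# Per-server context (httpd.conf): the pattern sees the full URL path.
RewriteEngine on
RewriteRule   ^/docs/old\.html$  /docs/new.html  [R]

# Per-directory context (.htaccess inside the directory served as /docs/):
# the pattern sees only the part below the directory, and RewriteBase
# names the URL prefix that is put back in front of the substitution.
RewriteEngine on
RewriteBase   /docs/
RewriteRule   ^old\.html$  new.html  [R]
```

Only one of the two variants would be used at a time; which one applies depends on where your ruleset lives.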
URL Layout
Canonical URLs
Description:
On some webservers there is more than one URL for a resource. Usually there are canonical URLs (which should be actually used and distributed) and those which are just shortcuts, internal ones, etc. Independent of which URL the user supplied with the request, he should finally see the canonical one only.
Solution:
We do an external HTTP redirect for all non-canonical URLs to fix them in the location view of the browser and for all subsequent requests. In the example ruleset below we replace /~user by the canonical /u/user and fix a missing trailing slash for /u/user.
RewriteRule ^/~([^/]+)/?(.*) /u/$1/$2 [R]
RewriteRule ^/([uge])/([^/]+)$ /$1/$2/ [R]
Canonical Hostnames
Description:
The goal of this rule is to force the use of a particular hostname, in preference to other hostnames which may be used to reach the same site. For example, if you wish to force the use of www.example.com instead of example.com, you might use a variant of the following recipe.
Solution:
# For sites running on a port other than 80
RewriteCond %{HTTP_HOST} !^fully\.qualified\.domain\.name [NC]
RewriteCond %{HTTP_HOST} !^$
RewriteCond %{SERVER_PORT} !^80$
RewriteRule ^/(.*) http://fully.qualified.domain.name:%{SERVER_PORT}/$1 [L,R]
# And for a site running on port 80
RewriteCond %{HTTP_HOST} !^fully\.qualified\.domain\.name [NC]
RewriteCond %{HTTP_HOST} !^$
RewriteRule ^/(.*) http://fully.qualified.domain.name/$1 [L,R]
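As a sketch of how the recipe might look for the hostnames named in the description, assume the site runs on the default port 80 and should always be reached as www.example.com. The hostname is a placeholder, and the ruleset is written for per-server context.

```apache
RewriteEngine on
# Act only when a Host header is present and it is not the canonical name.
RewriteCond %{HTTP_HOST}  !^www\.example\.com  [NC]
RewriteCond %{HTTP_HOST}  !^$
# Send the browser to the same path on the canonical hostname.
RewriteRule ^/(.*)        http://www.example.com/$1  [L,R]
```

With these rules a request for http://example.com/foo.html would be answered with a redirect to http://www.example.com/foo.html. The second condition skips requests that carry no Host header at all, since a redirect based on the hostname makes no sense for them.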
Trailing Slash Problem
Description:
Every webmaster can sing a song about the problem of the trailing slash on URLs referencing directories. If it is missing, the server dumps an error, because if you say /~quux/foo instead of /~quux/foo/ then the server searches for a file named foo. And because this file is a directory it complains. Actually it tries to fix it itself in most of the cases, but sometimes this mechanism needs to be emulated by you. For instance after you have done a lot of complicated URL rewriting to CGI scripts etc.
Solution:
The solution to this subtle problem is to let the server add the trailing slash automatically. To do this correctly we have to use an external redirect, so the browser correctly requests subsequent images etc. If we only did an internal rewrite, this would only work for the directory page, but would go wrong when any images are included into this page with relative URLs, because the browser would request an in-lined object. For instance, a request for image.gif in /~quux/foo/index.html would become /~quux/image.gif without the external redirect!
So, to do this trick we write:
RewriteEngine on
RewriteBase /~quux/
RewriteRule ^foo$ foo/ [R]
The crazy and lazy can even do the following in the top-level .htaccess file of their homedir. But notice that this creates some processing overhead:
RewriteEngine on
RewriteBase /~quux/
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^(.+[^/])$ $1/ [R]
Webcluster through Homogeneous URL Layout
Description:
We want to create a homogeneous and consistent URL layout over all WWW servers on an Intranet webcluster, i.e. all URLs (per definition server local and thus server dependent!) become actually server independent! What we want is to give the WWW namespace a consistent server-independent layout: no URL should have to include any physically correct target server. The cluster itself should drive us automatically to the physical target host.
Solution:
First, the knowledge of the target servers comes from (distributed) external maps which contain information where our users, groups and entities stay. They have the form
user1 server_of_user1
user2 server_of_user2
: :
We put them into files map.xxx-to-host. Second, we need to instruct all servers to redirect URLs of the forms
/u/user/anypath
/g/group/anypath
/e/entity/anypath
to
http://physical-host/u/user/anypath
http://physical-host/g/group/anypath
http://physical-host/e/entity/anypath