Barrett reduction is a method of reducing a number modulo another number. Used to reduce a single number, it is slower than an ordinary division algorithm. However, once some values that depend only on the modulus have been precomputed, it can be much faster than ordinary modular reduction.
Barrett reduction's benefits are most visible when it is used to reduce many different numbers modulo the same modulus, for example when doing modular exponentiation. It is not particularly useful with small numbers (32 or 64 bits); its benefits appear with numbers that are implemented by multiple precision arithmetic libraries, such as when implementing the RSA cryptosystem, which uses modular exponentiation with large (> 512 bit) numbers to encrypt and decrypt.
So, how do you do it? First, keep in mind the following restriction: Barrett reduction can only reduce numbers that are, at most, twice as long (in words) as the modulus. That means computer words; usually these are 32 bits long (for example on x86 and PowerPC machines), and sometimes 64 bits (like on the Alpha, UltraSPARC, and MIPS R10000).
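As a rough illustration of the length restriction, the check can be sketched in Python. The helper names word_length and fits_barrett are made up for this example, and 32-bit words are assumed by default:

```python
def word_length(n: int, word_bits: int = 32) -> int:
    """Number of word_bits-sized words needed to hold the non-negative integer n."""
    return max(1, (n.bit_length() + word_bits - 1) // word_bits)


def fits_barrett(x: int, m: int, word_bits: int = 32) -> bool:
    """True if x is non-negative and at most twice as long (in words) as m."""
    return x >= 0 and word_length(x, word_bits) <= 2 * word_length(m, word_bits)
```

With a one-word modulus, any x that fits in two words passes the check, and a three-word x does not.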
So you have some modulus, called m, which is k words long (numbered k-1...0, with 0 being the least significant word). First, pre-calculate the value:
mu = floor(b^(2k) / m)
where b is the "base" of the integers used. For example, if you represented the numbers as a sequence of 32-bit values, b is 2^32, or 0x100000000. You will keep this value mu across function calls (probably stored in a structure somewhere), so you can reuse it.
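The precomputation might look like the following in Python, whose built-in integers are already multiple precision. This is only a sketch: the function name barrett_mu is invented here, b is assumed to be 2^word_bits, and mu is taken with the exponent 2k, which is how the Handbook of Applied Cryptography states the constant.

```python
def barrett_mu(m: int, word_bits: int = 32):
    """Return (k, mu) for modulus m: k is m's length in words, mu = floor(b^(2k) / m)."""
    if m <= 0:
        raise ValueError("modulus must be positive")
    b = 1 << word_bits                                   # the base, e.g. 2^32
    k = max(1, (m.bit_length() + word_bits - 1) // word_bits)
    mu = b ** (2 * k) // m                               # the one real division
    return k, mu
```

This is the only step that needs a true multiple precision division, and it is done once per modulus rather than once per reduction.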
Now, given a number x, which is an arbitrary integer of size (at most) 2k words (2k-1...0), this procedure (in pseudocode) will return the value of x mod m:
q1 = floor(x / b^(k-1))
q2 = q1 * mu
q3 = floor(q2 / b^(k+1))
r1 = x mod b^(k+1)
r2 = (q3 * m) mod b^(k+1)
r = r1 - r2
if (r < 0)
    r = r + b^(k+1)
while (r >= m)
    r = r - m
return r
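The procedure can be transcribed almost line for line into Python. The sketch below assumes that mu was precomputed as floor(b^(2k) / m) (the constant as the Handbook of Applied Cryptography defines it), that 0 <= x < b^(2k), and that b > 3; the function name barrett_reduce is just for illustration.

```python
def barrett_reduce(x: int, m: int, k: int, mu: int, b: int) -> int:
    """Return x mod m, assuming 0 <= x < b**(2*k) and mu == b**(2*k) // m."""
    q1 = x // b ** (k - 1)           # drop the k-1 least significant words
    q2 = q1 * mu
    q3 = q2 // b ** (k + 1)          # estimate of floor(x / m), never too large
    r1 = x % b ** (k + 1)            # keep the k+1 least significant words
    r2 = (q3 * m) % b ** (k + 1)
    r = r1 - r2
    if r < 0:
        r += b ** (k + 1)
    while r >= m:                    # the estimate is low by at most 2
        r -= m
    return r


# Small worked example in base 10: m = 47 has k = 2 digits, mu = 10^4 // 47 = 212.
print(barrett_reduce(3561, 47, 2, 10**4 // 47, 10))  # prints 36
```

In the worked example q3 comes out as 75, which is exactly floor(3561 / 47), so the final loop does not run at all.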
Note that the divisions and modular reductions in this procedure are all by powers of b, so on integers stored as arrays of base-b words they amount to discarding low-order words or keeping only the low-order words. When b is a power of 2 (by far the most common choice), they can equally be written as right shifts and AND operations (respectively). The remaining operations are additions, subtractions, and multiplications, all of which are much cheaper than division for multiple precision integers.
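To make the shift-and-mask form concrete, here is a sketch in Python with b = 2^32, wrapped in a small class so that the precomputed values are kept between calls. The names BarrettReducer and mod_pow are hypothetical, and mu is again assumed to be floor(b^(2k) / m), as in the Handbook of Applied Cryptography.

```python
class BarrettReducer:
    """Holds the values precomputed for one modulus m, with b = 2**word_bits."""

    def __init__(self, m: int, word_bits: int = 32):
        self.m = m
        self.w = word_bits
        self.k = max(1, (m.bit_length() + word_bits - 1) // word_bits)
        self.mu = (1 << (2 * self.k * word_bits)) // m        # floor(b^(2k) / m)
        self.mask = (1 << ((self.k + 1) * word_bits)) - 1     # b^(k+1) - 1

    def reduce(self, x: int) -> int:
        """Return x mod m for 0 <= x < b^(2k), using only shifts, ANDs, * and +/-."""
        w, k = self.w, self.k
        q3 = ((x >> ((k - 1) * w)) * self.mu) >> ((k + 1) * w)
        r = (x & self.mask) - ((q3 * self.m) & self.mask)
        if r < 0:
            r += self.mask + 1
        while r >= self.m:
            r -= self.m
        return r


def mod_pow(base: int, exp: int, reducer: BarrettReducer) -> int:
    """Square-and-multiply exponentiation; every product is below m^2 < b^(2k)."""
    result = 1 % reducer.m
    base %= reducer.m                 # one ordinary reduction up front
    while exp:
        if exp & 1:
            result = reducer.reduce(result * base)
        base = reducer.reduce(base * base)
        exp >>= 1
    return result
```

The mod_pow helper shows the situation where the method is supposed to pay off: one reducer is built for the modulus, and every intermediate product is then reduced without any further division.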
This algorithm is also specified in the Handbook of Applied Cryptography (good book!), and is implemented by some crypto libraries. Another method of doing fast modular reductions is Montgomery Reduction.