We use a 100K-dimensional vector to represent a query.
But obviously, that is too large, and it still can't cover all the words.
So we propose a word hashing method.
Take the word "shirt" as an example.
First, we wrap the word in a pair of '#' boundary markers.
#shirt#
Then, we slide a 3-letter window over the result and take every letter trigram.
#sh, shi, hir, irt, rt#
Finally, the word is represented as a vector of letter n-grams (here, trigrams).
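
The steps above can be sketched in a few lines of Python. This is only an illustration, not a definitive implementation: the function names `letter_ngrams` and `hash_word`, the `vocab` mapping from trigram to vector index, and the choice to store counts (rather than 0/1 flags) are all assumptions made for the example.

```python
def letter_ngrams(word, n=3, marker="#"):
    """Wrap the word in boundary markers and return its letter n-grams."""
    padded = f"{marker}{word}{marker}"
    # Slide a window of n letters over the padded word.
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def hash_word(word, vocab, n=3):
    """Map a word to a count vector over a known n-gram vocabulary.

    `vocab` is a hypothetical dict from n-gram to vector index;
    n-grams missing from it are skipped.
    """
    vec = [0] * len(vocab)
    for gram in letter_ngrams(word, n):
        idx = vocab.get(gram)
        if idx is not None:
            vec[idx] += 1
    return vec


# Toy vocabulary built from the example word only (assumption for the demo).
vocab = {gram: i for i, gram in enumerate(letter_ngrams("shirt"))}
print(letter_ngrams("shirt"))    # ['#sh', 'shi', 'hir', 'irt', 'rt#']
print(hash_word("shirt", vocab))  # [1, 1, 1, 1, 1]
```

In practice the vocabulary would be collected from all the trigrams seen in the training data, so the vector length is the number of distinct trigrams rather than the number of distinct words.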