cite:https://www.metagenomics.wiki/tools/blast/default-word-size
Length of an exact sequence match, as start region for the final alignment
blastn -query genes.ffn -subject genome.fna -word_size 11
A BLAST search starts by finding a perfect sequence match of the length given by -word_size. This initial region of exact match is then extended in both directions, allowing gaps and substitutions according to the scoring thresholds.
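The seeding step described above can be sketched in a few lines of Python. This is only an illustration of the idea of exact word matches, not BLAST's actual implementation: the function name find_seeds and the toy sequences are made up for this example, and the sketch ignores the reverse strand and the extension step.

```python
from collections import defaultdict


def find_seeds(query, subject, word_size):
    """Return (query_pos, subject_pos) pairs where an exact word match starts.

    Hypothetical helper for illustration only; BLAST itself uses optimized
    lookup tables and also searches the reverse strand.
    """
    # Index the start positions of every word of length word_size in the subject.
    index = defaultdict(list)
    for j in range(len(subject) - word_size + 1):
        index[subject[j:j + word_size]].append(j)

    # Look up each query word in the index; every hit is a candidate seed.
    seeds = []
    for i in range(len(query) - word_size + 1):
        for j in index.get(query[i:i + word_size], []):
            seeds.append((i, j))
    return seeds


# Toy sequences (assumed for this example, not taken from real data).
query = "ACGTACGTTA"
subject = "TTACGTACCTTA"

for w in (4, 6, 7):
    print(w, find_seeds(query, subject, w))
```

With these toy sequences, a word size of 4 yields five seeds, a word size of 6 yields one, and a word size of 7 yields none, so no alignment would even be attempted at the largest setting.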
Changing the initial word size trades sensitivity for precision: a smaller value can help to find more, but less accurate, hits, while a larger value limits the results to almost perfect hits.
- Decreasing the word size will increase the number of detected homologous sequences, but the hits can include more fragmented alignments because of gaps and substitutions (example: searching for homologous genes between distant species; see also: -task blastn)
- Increasing the word-size