For example:
10 apples, 9 boxes, then at least one box have two apples.
Hash table's repeat problem is inevitable, because the number of Keys is always more than the number of indices.
Any loseless compression algorothm, the smaller the file, which increasing certain other documents to assist, otherwise some information is bound to be lost.