
From: http://stackoverflow.com/questions/1166385/how-many-times-can-a-file-be-compressed

For lossless compression, the only way to know how many times recompressing a file will still shrink it is to try. The answer depends on the compression algorithm and on the file you're compressing.

Two different files can never compress to the same output, because the decompressor would have no way to tell them apart, so you can't keep going down to one byte. How could one byte represent all the files you could decompress to?
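To put numbers on that question, here is a small counting sketch in Python. The function names are made up for illustration. It compares how many distinct files exist at a given length with how many exist at all shorter lengths combined.

```python
def files_of_length(n_bytes: int) -> int:
    """Number of distinct files that are exactly n_bytes long."""
    return 256 ** n_bytes


def files_shorter_than(n_bytes: int) -> int:
    """Number of distinct files with length 0 .. n_bytes - 1 (empty file included)."""
    return sum(256 ** k for k in range(n_bytes))


if __name__ == "__main__":
    for n in range(1, 5):
        print(n, files_of_length(n), files_shorter_than(n))
```

For example, there are 65,536 possible two-byte files but only 257 files shorter than that, so no lossless scheme can shrink every two-byte file. The same gap shows up at every length.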

The reason a second compression sometimes works is that no compression algorithm can do omniscient, perfect compression. There's a trade-off between how well it compresses and the time it takes to do it. Your file is being changed from all data to a combination of data about your data and the data itself.


Take run-length encoding (probably the simplest useful compression) as an example.

04 04 04 04 43 43 43 43 51 52    10 bytes

That series of bytes could be compressed as:

[4] 04 [4] 43 [-2] 51 52    7 bytes (I'm putting meta data in brackets)

Where a positive number in brackets is a repeat count for the byte that follows, and a negative number -n in brackets is a command to emit the next n bytes as they are found.
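As a rough illustration of how a decoder would read that format, here is a minimal Python sketch. It assumes each bracketed number is stored as one signed byte in two's complement, that a count of 0 is never used, and that the input is well formed. The name `rle_decode` is just for this example.

```python
def rle_decode(encoded: bytes) -> bytes:
    """Decode the toy run-length format: [count] byte, or [-n] followed by n raw bytes."""
    out = bytearray()
    i = 0
    while i < len(encoded):
        count = encoded[i]
        if count >= 0x80:
            count -= 256  # reinterpret the byte as a signed two's complement value
        i += 1
        if count > 0:
            # Positive count: repeat the next byte `count` times.
            out.extend(encoded[i:i + 1] * count)
            i += 1
        elif count < 0:
            # Negative count: copy the next -count bytes through unchanged.
            n = -count
            out.extend(encoded[i:i + n])
            i += n
        else:
            raise ValueError("a count of 0 is not used in this sketch")
    return bytes(out)


if __name__ == "__main__":
    packed = bytes.fromhex("04 04 04 43 fe 51 52")  # [4] 04 [4] 43 [-2] 51 52
    print(rle_decode(packed).hex(" "))
```

Feeding it the 7-byte stream above gives back the original ten bytes.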

In this case we could try one more compression:

[3] 04 [-4] 43 fe 51 52    7 bytes (fe is your -2 seen as two's complement data)

We gained nothing, and we'll start growing on the next iteration:

[-7] 03 04 fc 43 fe 51 52    8 bytes

We'll grow by one byte per iteration for a while, but it will actually get worse. One signed byte can only hold negative numbers down to -128, so a single literal command can cover at most 128 bytes. We'll start growing by two bytes per iteration once the file surpasses 128 bytes in length. The growth will get still worse as the file gets bigger.
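If you want to watch this happen, the iterations can be reproduced with a small encoder. This is only a sketch under my own assumptions: a run must be at least 3 bytes long before it is stored as a repeat (`min_run`), repeat counts are capped at 127, and literal blocks are capped at 128 bytes so the count fits in one signed byte. The names `rle_encode` and `recompress_sizes` are made up for the example.

```python
def rle_encode(data: bytes, min_run: int = 3) -> bytes:
    """Toy run-length encoder: [count] byte for runs, [-n] plus n raw bytes otherwise."""
    out = bytearray()
    literals = bytearray()

    def flush_literals() -> None:
        # Emit pending raw bytes in blocks of at most 128, each with a negative count.
        for start in range(0, len(literals), 128):
            block = literals[start:start + 128]
            out.append((-len(block)) & 0xFF)  # two's complement of the block length
            out.extend(block)
        literals.clear()

    i = 0
    while i < len(data):
        j = i
        while j < len(data) and data[j] == data[i]:
            j += 1
        run = j - i
        if run >= min_run:
            flush_literals()
            while run > 0:
                n = min(run, 127)  # largest positive value of a signed byte
                out.append(n)
                out.append(data[i])
                run -= n
        else:
            literals.extend(data[i:j])  # too short to be worth a repeat command
        i = j
    flush_literals()
    return bytes(out)


def recompress_sizes(data: bytes, rounds: int) -> list:
    """Return the size before compression and after each successive round."""
    sizes = [len(data)]
    for _ in range(rounds):
        data = rle_encode(data)
        sizes.append(len(data))
    return sizes


if __name__ == "__main__":
    original = bytes.fromhex("04 04 04 04 43 43 43 43 51 52")
    print(recompress_sizes(original, 5))  # prints [10, 7, 7, 8, 9, 10]
```

The first three rounds match the byte sequences worked out by hand above. Raising `rounds` well past 128 lets you see what happens once a single literal block no longer covers the whole file.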

There's a headwind blowing against the compression program: the meta data. And, for real compressors, there's also the header tacked on to the beginning of the file. That means that eventually the file will start growing with each additional compression.
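The same effect is easy to check with a real compressor. Here is a hedged sketch using Python's standard `zlib` module. The sample input and the number of rounds are arbitrary choices of mine, and the exact sizes will vary with the input and the zlib version.

```python
import zlib


def zlib_recompress_sizes(data: bytes, rounds: int = 10) -> list:
    """Compress the previous output again and again, recording the size each time."""
    sizes = [len(data)]
    for _ in range(rounds):
        data = zlib.compress(data, 9)  # level 9 = best compression
        sizes.append(len(data))
    return sizes


if __name__ == "__main__":
    sample = b"abc" * 10000  # highly repetitive, so the first round shrinks it a lot
    print(zlib_recompress_sizes(sample))
```

On a repetitive input like this the first round shrinks the data dramatically. After that the output is close to incompressible, and the zlib header and checksum added on every round start to dominate.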

