File Compression

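  Improving the compression ratio of files has long been a goal. In recent years an algorithm has been proposed that only rearranges a file and does not compress it itself, yet in most cases a file rearranged by this algorithm compresses better than the original.

  The algorithm works as follows. For a string S of length n, first construct n strings from it, where the i-th string is obtained by moving the first i-1 characters of S to the end. Then sort these n strings by their first character in ascending order; if two strings have the same first character, order them by their position in S (that is, by i), ascending. The last characters of the sorted strings form a new string S', which also has length n and contains every character of S. Finally, output S' and p, the position in S' of the first character of S. Example: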

  S: example

  1. Construct the n strings

  example
  xamplee
  ampleex
  mpleexa
  pleexam
  leexamp
  eexampl

  2. Sort the strings

  ampleex
  example
  eexampl
  leexamp
  mpleexa
  pleexam
  xamplee

  3. Output
  xelpame  S'
  7        p
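
The three steps above can be followed literally in a few lines. The sketch below is a minimal Python illustration and not part of the original task, which asks for a Pascal program; the function name `transform_naive` is hypothetical. It assumes n >= 1, as the task guarantees, and it builds all n rotations in memory, so it is meant for reading rather than for large inputs.

```python
def transform_naive(s):
    """Build the rotations, sort them by first character, read the last characters.

    Assumes len(s) >= 1. Returns (s_prime, p) with p counted from 1.
    """
    n = len(s)
    # Step 1: rotation i moves the first i characters of s to the end
    # (i is 0-based here, so it corresponds to string i+1 in the text).
    rotations = [s[i:] + s[:i] for i in range(n)]
    # Step 2: sort by first character only. Python's sort is stable, so
    # rotations with the same first character stay in their original order.
    order = sorted(range(n), key=lambda i: rotations[i][0])
    # Step 3: the last characters of the sorted rotations form S'.
    s_prime = "".join(rotations[i][-1] for i in order)
    # The rotation that starts at the second character of s ends with the
    # first character of s, so its row (1-based) is p. For n == 1 the only
    # rotation is s itself.
    p = order.index(1 % n) + 1
    return s_prime, p


print(transform_naive("example"))
```

Tracking row indices in `order` instead of sorting the strings themselves is what makes p easy to find afterwards.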

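  Because of the way English words are built, certain letter pairs occur very frequently, so identical letters are likely to end up next to each other in S', which improves the compression ratio of S'. Although the algorithm exploits a property of English words, in practice it has been found to work for compressing almost any file.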

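  Task 1: zip1.pas (zip1.exe)
  Read the string S and output S' and p.
  The input file zip1.in contains two lines: line 1 is an integer n (1 <= n <= 10000), the length of S; line 2 is the string S.
  The output file zip1.out contains two lines: line 1 is S', line 2 is the integer p.

  Task 2: zip2.pas (zip2.exe)
  Read S' and p and output the string S.
  The input file zip2.in contains three lines: line 1 is an integer n (1 <= n <= 10000), the length of S'; line 2 is the string S'; line 3 is the integer p.
  The output file zip2.out contains a single line, S.

  Sample input 1:
  7
  example

  Sample output 1:
  xelpame
  7

  Sample input 2:
  7
  xelpame
  7

  Sample output 2:
  example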

Solution:
1. S --> S'
Figure 1 shows the main process for getting S' from S.

                 ---------           ---------
                 |example|           |ampleex|
                 |xamplee|           |example|
            (1)  |ampleex|    (2)    |eexampl|   (3)
example(S) ====> |mpleexa|  =======> |leexamp| ======> xelpame (S')
                 |pleexam|           |mpleexa|
                 |leexamp|           |pleexam|
                 |eexampl|           |xamplee|
                 ---------           ---------
                    <A>                 <B>

                          Figure 1. S-S' Process
                       
If you look at list <A> carefully, you will see that the first characters of its words, read from top to bottom, spell exactly S, i.e. 'example'. Likewise, the first characters of the words in list <B> spell S with its characters sorted: 'example' --sort--> 'aeelmpx'.

Now look at the first and last character of each word in list <B>, keeping Figure 2 below in mind. The secret is that, for every word in <B>, the last character of the word is the character that precedes the word's first character on the circle in Figure 2, and that last character is what the word contributes to S'.
Take the first word in <B>, 'ampleex': its first character is 'a', and in Figure 2 the character before 'a' is 'x', so the last character of 'ampleex' is 'x', which becomes the first character of S'. In the same way, the second word 'example' starts with the first 'e' of S, whose predecessor on the circle is the final 'e', so the second character of S' is 'e'.

          |--------->---------------->-----------|
          |--[e]->[x]->[a]->[m]->[p]->[l]->[e]<--|
                         
                          Figure 2. Circle
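
The claim behind Figure 2 can be checked mechanically. The helper below is a small Python sketch with a hypothetical name, `last_char_is_predecessor`; it only tests the property for one string and does not perform the transform. Sorting plays no part in the check, because the property holds for every rotation regardless of where it ends up in list <B>.

```python
def last_char_is_predecessor(s):
    """Return True if every rotation of s ends with the character that
    cyclically precedes its first character in s."""
    n = len(s)
    for i in range(n):
        rotation = s[i:] + s[:i]   # rotation that starts at index i of s
        predecessor = s[i - 1]     # for i == 0 this wraps to the last character
        if rotation[-1] != predecessor:
            return False
    return True


print(last_char_is_predecessor("example"))
```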

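With this observation, the process simplifies to Figure 3. How, then, do we get the position in S' of the first character of S? P equals the position that the second character of S takes in <C>, with equal characters kept in their original order. The reason is that the character at position P of S' is the predecessor of the character at position P of <C>, and the predecessor of the second character of S is its first character. See Figure 3.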

             (1) sort each char           (2) get previous
                 in S                         of each char
example(S) =====================> aeelmpx ================> xelpame (S')
                                    <C>

                          Figure 3. Simple S-S' Process
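
The simplified process in Figure 3 needs no rotations at all. The following Python sketch is one possible reading of it, under the assumptions that n >= 1 and that ties are broken by original position, as the problem statement requires; the name `transform` is hypothetical.

```python
def transform(s):
    """Compute (s_prime, p) by sorting character positions only.

    Assumes len(s) >= 1. p is counted from 1.
    """
    n = len(s)
    # <C>: positions of s ordered by character. The sort is stable, so
    # equal characters keep their original order.
    order = sorted(range(n), key=lambda i: s[i])
    # For each position in <C>, take the character just before it in s.
    # Index -1 wraps around to the last character, which closes the circle.
    s_prime = "".join(s[i - 1] for i in order)
    # p is where the second character of s (index 1) lands in <C>.
    # For n == 1 the second character wraps back to index 0.
    p = order.index(1 % n) + 1
    return s_prime, p


print(transform("example"))
```

Sorting positions instead of characters keeps track of where each character came from, which is needed both for the predecessor lookup and for p.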

2. S' --> S

How do we recover S from S' and P? See Figure 4 below. We know that P is 7, so it points to the last 'e' in S', which is the first character of S. We then build S as follows. First write down that 'e' from <D>. Next look at the character on the same row in <E>: it is 'x', so we append 'x' and the result is 'ex'. Then find 'x' in <D> and read the character on its row in <E>: it is 'a', so the result becomes 'exa'. Continue in the same way until n characters have been written. Figure 5 shows the whole process; the characters in square brackets, read from top to bottom, spell S.

                  -----       -----
                  | x |       | a |
                  | e |       | e |
                  | l | sort  | e |
                  | p | ====> | l |
                  | a |       | m |
 start point      | m |       | p |
P ------------->  | e |       | x |
                  -----       -----
                   <D>         <E>
             
                Figure 4. S' - S process


            [e] ---- x -|           
                    |-----<----|
                    V 
            [x] ---- a -|
                    |-----<----|
                    V 
            [a] ---- m -|
                    |-----<----|
                    V 
            [m] ---- p -|
                    |-----<----|
                    V 
            [p] ---- l -|
                    |-----<----|
                    V 
            [l] ---- e -|
                    |-----<----|
                    V 
            [e] ---- e
        
        Figure 5. S' - S process
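
The walk described above does not say which row to pick when a character appears more than once in <D>. The Python sketch below avoids that question by rebuilding S from its last character backwards instead of forwards. This is a swapped-in variant, not the method shown in the figures, and the name `inverse_transform` is hypothetical. It relies on two facts from the problem statement: rows with the same first character are ordered by position in S, and the character of S' on a row is the predecessor of that row's first character. It assumes the input is a valid (S', p) pair with n >= 1.

```python
def inverse_transform(s_prime, p):
    """Rebuild S from (s_prime, p); p is counted from 1.

    Assumes the pair was produced by the transform and len(s_prime) >= 1.
    """
    n = len(s_prime)
    first_col = sorted(s_prime)          # column <E>: first characters of the rows
    lo, hi = {}, {}                      # first and last row of each character's group
    for row, c in enumerate(first_col):
        lo.setdefault(c, row)
        hi[c] = row
    out = [""] * n
    out[0] = s_prime[p - 1]              # row p ends with the first character of S
    # The row that starts at position 0 of S is the first row of its group,
    # because no occurrence of that character has a smaller position.
    row = lo[out[0]]
    for k in range(n - 1, 0, -1):
        c = s_prime[row]                 # predecessor of the row's first character
        out[k] = c
        # Position k is the largest position not yet placed, so its row is
        # the last unused row in the group of c.
        row = hi[c]
        hi[c] -= 1
    return "".join(out)


print(inverse_transform("xelpame", 7))
```

Each step consumes one row, so the whole reconstruction is linear after the initial sort.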
3. Code
