Three Problems a Day, Day 4, Problem C (HDU 1159 Common Subsequence: an O(n log n) approach to the longest common subsequence)

Article link: https://blog.csdn.net/lulu11235813/article/details/70941105

Common Subsequence

Time Limit: 2000/1000 MS (Java/Others) Memory Limit: 65536/32768 K (Java/Others)

Total Submission(s): 38185 Accepted Submission(s): 17506

Problem Description

A subsequence of a given sequence is the given sequence with some elements (possible none) left out. Given a sequence X = <x1, x2, ..., xm> another sequence Z = <z1, z2, ..., zk> is a subsequence of X if there exists a strictly increasing sequence <i1, i2, ..., ik> of indices of X such that for all j = 1,2,...,k, xij = zj. For example, Z = <a, b, f, c> is a subsequence of X = <a, b, c, f, b, c> with index sequence <1, 2, 4, 6>. Given two sequences X and Y the problem is to find the length of the maximum-length common subsequence of X and Y.
The program input is from a text file. Each data set in the file contains two strings representing the given sequences. The sequences are separated by any number of white spaces. The input data are correct. For each set of data the program prints on the standard output the length of the maximum-length common subsequence from the beginning of a separate line.

Sample Input

abcfbc abfcab
programming contest
abcd mnp

Sample Output

4
2
0

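Problem:

Given two strings, find the length of their longest common subsequence.

Approach:

I have solved longest common subsequence many times, so this time I wanted to try the O(n log n) approach.

[An aside: many people, when they meet a problem they have solved before or a similar one, either skip it or paste their old code to get it accepted. I don't think that is a good habit. A better approach is to solve it again with a different idea or method, to improve the complexity of your previous solution, or to write it from scratch without a template and get it accepted on the first try without debugging. Any of these deepens your understanding of the algorithm and your ability to apply and implement it.]

The O(n log n) algorithm for longest common subsequence converts the problem into a longest increasing subsequence (LIS) problem and then solves that with the O(n log n) LIS algorithm. The n here is the length of the converted sequence, which can exceed the lengths of both strings. Its drawback is the drawback of that LIS algorithm as used here: it yields only the length and does not directly give the subsequence itself (here, the longest common subsequence).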

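How to convert longest common subsequence into longest increasing subsequence:

Take two sequences A and B.

Suppose

A = abcdeac

B = acdabc

First number the characters of A in order:

0123456

abcdeac

Then, for each character of B, list the positions at which it occurs in A in descending order, and concatenate these lists:

a c d a b c

[5 0][6 2][3][5 0][1][6 2]

The length of the longest strictly increasing subsequence of this new sequence is the length of the longest common subsequence. Here it is 5 (for example 0 2 3 5 6, which corresponds to acdac).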
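The construction above can be sketched as a small function. This is only an illustration under assumptions: the name build_positions is hypothetical and not part of the accepted code, and it buckets positions by raw byte value rather than by lowercase letter.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not from the accepted code): for each character of b,
// append the positions at which it occurs in a, largest position first.
std::vector<int> build_positions(const std::string& a, const std::string& b) {
    // occ[ch] holds the positions of ch in a in descending order,
    // because a is scanned from right to left.
    std::vector<std::vector<int>> occ(256);
    for (int j = static_cast<int>(a.size()) - 1; j >= 0; --j)
        occ[static_cast<unsigned char>(a[j])].push_back(j);

    std::vector<int> seq;
    for (char ch : b) {
        const std::vector<int>& group = occ[static_cast<unsigned char>(ch)];
        seq.insert(seq.end(), group.begin(), group.end());
    }
    return seq;
}
```

Calling build_positions("abcdeac", "acdabc") gives 5 0 6 2 3 5 0 1 6 2, the same sequence as in the example.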

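Why this works:

For every character of B we list the positions at which it occurs in A.

An increasing subsequence of the new sequence then picks characters of B, in B's order, whose positions in A are also increasing. That is exactly a common subsequence, so the longest increasing subsequence has the length of the longest common subsequence.

The positions of each character of B must be listed in descending order so that each character of B is used at most once in an increasing subsequence.

Take the a in the example above:

[5 0]

Because the positions are in descending order, a strictly increasing subsequence cannot contain both 5 and 0 from this group (if it did, this single a would be used twice, and the result would not be a common subsequence).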
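To see the effect of the ordering, here is a small checking sketch. It is an assumption-laden illustration, not the accepted solution: lis_quadratic and lcs_by_positions are hypothetical names, and the LIS is computed with a plain quadratic loop to keep the focus on the ordering.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Simple O(k^2) length of the longest strictly increasing subsequence,
// used here only for checking small cases.
int lis_quadratic(const std::vector<int>& s) {
    std::vector<int> best(s.size(), 1);  // best[i]: LIS length ending at i
    int ans = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        for (std::size_t j = 0; j < i; ++j)
            if (s[j] < s[i]) best[i] = std::max(best[i], best[j] + 1);
        ans = std::max(ans, best[i]);
    }
    return ans;
}

// Hypothetical helper: LIS length of the position sequence, with each group
// listed in descending (correct) or ascending (incorrect) order.
int lcs_by_positions(const std::string& a, const std::string& b, bool descending) {
    std::vector<int> seq;
    for (char ch : b) {
        std::vector<int> group;
        for (int j = 0; j < static_cast<int>(a.size()); ++j)
            if (a[j] == ch) group.push_back(j);  // collected in ascending order
        if (descending) std::reverse(group.begin(), group.end());
        seq.insert(seq.end(), group.begin(), group.end());
    }
    return lis_quadratic(seq);
}
```

With A = aa and B = a, the descending order gives 1, which is the correct answer. The ascending order gives 2, because the single a of B gets matched to both positions of A.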

For the O(n log n) longest increasing subsequence algorithm, see my other article:

http://blog.csdn.net/lulu11235813/article/details/70312914
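For quick reference, a minimal sketch of that LIS length computation follows. It assumes strictly increasing subsequences and uses std::lower_bound in place of a hand-written binary search. The name lis_length is hypothetical, and the sketch is not taken from the linked article.

```cpp
#include <algorithm>
#include <vector>

// Length of the longest strictly increasing subsequence in O(k log k).
// tails[len - 1] is the smallest possible last value of an increasing
// subsequence of length len among the elements processed so far.
int lis_length(const std::vector<int>& seq) {
    std::vector<int> tails;
    for (int x : seq) {
        // First tail that is >= x; replacing it keeps tails sorted.
        auto it = std::lower_bound(tails.begin(), tails.end(), x);
        if (it == tails.end())
            tails.push_back(x);  // x extends the longest subsequence so far
        else
            *it = x;             // x becomes a smaller tail for that length
    }
    return static_cast<int>(tails.size());
}
```

On the example sequence 5 0 6 2 3 5 0 1 6 2 this returns 5.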

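When to use it & caveats:

1. The O(n log n) LIS solution as written here computes only the length, so this algorithm is not a good fit for problems that ask you to output the longest common subsequence itself.

2. The sequence of positions of B's characters in A (the input to the LIS step) can be longer than both A and B; in the worst case its length reaches |A| * |B|. Size the arrays accordingly and check that they fit within the memory limit. In that worst case the running time is also no better than the classic O(|A| * |B|) dynamic programming.

3. The alphabet of A and B should be finite and small (preferably not sequences of arbitrary integers).

4. The binary search is easy to get wrong; check it carefully.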
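For caveat 2, the length of the position sequence can be computed up front from character counts, before any large array is filled. The sketch below is an illustration only; matching_pairs is a hypothetical name, and it counts by raw byte value.

```cpp
#include <string>

// Hypothetical helper: number of pairs (i, j) with a[i] == b[j], which is
// exactly the length of the position sequence fed to the LIS step.
long long matching_pairs(const std::string& a, const std::string& b) {
    long long cnt_a[256] = {0};
    long long cnt_b[256] = {0};
    for (unsigned char ch : a) ++cnt_a[ch];
    for (unsigned char ch : b) ++cnt_b[ch];

    long long total = 0;
    for (int c = 0; c < 256; ++c) total += cnt_a[c] * cnt_b[c];
    return total;
}
```

For the example strings abcdeac and acdabc this gives 10. For two strings of 1000 identical characters it gives 1000000, which is the worst case for 1000-character inputs.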

Accepted code:

#include<stdio.h>
#include<stdlib.h>
#include<iostream>
#include<string.h>
#include<algorithm>
using namespace std;

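char a[1010];        // first string A
char b[1010];        // second string B
int c[30][500000];   // c[i][0]: count of 'a'+i in A; c[i][1..]: its positions in A, descending
int d[10000100];     // positions in A of every character of B, concatenated
int e[10000100];     // e[k]: smallest last value of an increasing subsequence of length k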

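// Returns the first index p in [l, r] with e[p] >= n (e[l..r] is increasing).
// The caller guarantees e[r] >= n, so such an index always exists.
int bsear(int l,int r,int n)
{
    if(l==r)return l;
    int mid = (l+r)/2;
    if(e[mid]>=n) return bsear(l,mid,n);
    else return bsear(mid+1,r,n);
}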

int main()
{
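    while(scanf("%s%s",a,b)==2)
    {
        // Cache the lengths instead of calling strlen in every loop test.
        int lena=strlen(a), lenb=strlen(b);
        int num=0;
        // For each letter, record its positions in A from right to left (descending).
        // This assumes both strings contain only lowercase letters.
        for(int i=0;i<26;i++)
        {
            c[i][0]=0;
            for(int j=lena-1;j>=0;j--)
            {
                if(a[j]=='a'+i)
                {
                    c[i][0]++;
                    c[i][c[i][0]]=j;
                }
            }
        }
        // For each character of B, append its positions in A to d.
        for(int i=0;i<lenb;i++)
        {
            for(int j=1;j<=c[b[i]-'a'][0];j++)
            {
                d[num++]=c[b[i]-'a'][j];
            }
        }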
        if(num==0)
        {
            printf("0\n");
            continue;
        }
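        // LIS over d[0..num-1]: e[1..ans] holds the smallest tail for each length.
        e[1]=d[0];
        int ans=1;
        for(int i=1;i<num;i++)
        {
            if(d[i]>e[ans])
            {
                // d[i] extends the longest increasing subsequence found so far.
                ans++;
                e[ans]=d[i];
            }
            else
            {
                // Otherwise lower the tail of the first length whose tail is >= d[i].
                int p = bsear(1,ans,d[i]);
                e[p] = d[i];
            }
        }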
        printf("%d\n",ans);
    }
    return 0;
}