POJ - 3267 The Cow Lexicon: String Matching (DP)



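The problem gives a main string and a dictionary, and asks for the minimum number of letters that must be deleted from the main string so that what remains is a sequence of dictionary words.

PS: the result has to match a sequence of words, not just a single word.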



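To my delight, this one was accepted on the first submission.

 

After getting AC I looked at the solutions posted online and found none that were particularly detailed, so here is mine.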

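In the online discussions, some people claim that running the DP from back to front is faster. I don't see why it would be. Compared with front to back, the recurrence for the optimal value runs in the opposite direction, and the scan over each word when computing remove is reversed as well (explained in detail below), but the underlying idea is the same and the complexity should be no different.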
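To make the comparison concrete, here is a minimal sketch of the back-to-front variant. It is not the program submitted for this post: the function name `minDeletionsBackward`, the 0-indexed `std::string` interface, and the array name `f` are assumptions made for illustration. Each word is still matched with one linear scan per position, only forward instead of backward, so the worst-case work is the same order as in the front-to-back version.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Back-to-front sketch (hypothetical helper, not the submitted code).
// f[i] = minimum deletions so that the suffix S[i..L-1] becomes a sequence of words.
int minDeletionsBackward(const std::string& S, const std::vector<std::string>& words) {
    const int L = static_cast<int>(S.size());
    std::vector<int> f(L + 1, 0);                // f[L] = 0: the empty suffix needs no deletions
    for (int i = L - 1; i >= 0; --i) {
        f[i] = f[i + 1] + 1;                     // fallback: delete S[i]
        for (const std::string& w : words) {
            std::size_t p = 0;                   // next character of w to match
            int e = i;                           // scan forward from i
            while (e < L && p < w.size()) {
                if (S[e] == w[p]) ++p;
                ++e;
            }
            if (p == w.size()) {
                // w fits inside S[i..e-1]; every other character there is deleted
                int removed = (e - i) - static_cast<int>(w.size());
                f[i] = std::min(f[i], f[e] + removed);
            }
        }
    }
    return f[0];
}
```

Calling it with the sample message `browndcodw` and the six sample words gives the same answer as the front-to-back program.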

 

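Let S be the original string of length l, and let d[i] (0 <= i <= l-1) be the minimum number of characters that must be removed from the prefix S[0..i] (position i included).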

 

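d[i] = min( d[j] + remove,   // for every j < i such that the substring S[j+1..i] contains a word (taking d[-1] = 0)
            d[i-1] + 1 )     // fallback: delete S[i]; always a candidate, and the only one if no such substring exists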

 

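So the main step becomes how to compute remove.

Suppose all the words are stored in w. For one word w[k], set now = len[k] and start from j = i, looking for a j >= 0 such that S[j..i] contains w[k]. "Contains" here means that the LCS of w[k] and S[j..i] is w[k] itself, i.e. w[k] is a subsequence of S[j..i]; we do not actually run an LCS algorithm, because that would be slow. Instead, scan backward from i and match the characters of w[k] from last to first. If S[j..i] contains w[k], then remove = i-j+1-len[k]. If no such j exists, the whole prefix has to be deleted (remove = i in the 1-indexed program below). Note that this j is the start of the window, so it is one larger than the j used in the recurrence.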
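The backward scan for a single word can be sketched as a small helper. This is an illustration only: the name `removeCount`, the 0-indexed `std::string` interface, and the `{-1, i + 1}` return value for the "does not fit" case are assumptions, not part of the submitted program. Matching greedily from the end of the word yields the largest possible j, i.e. the shortest window S[j..i] that contains the word.

```cpp
#include <string>
#include <utility>

// Hypothetical helper: scan S backward from index i, matching word w from its
// last character. Returns {j, remove}, where S[j..i] is the shortest window
// ending at i that contains w as a subsequence and
// remove = (i - j + 1) - w.size(). Returns {-1, i + 1} if w does not fit.
std::pair<int, int> removeCount(const std::string& S, int i, const std::string& w) {
    int p = static_cast<int>(w.size()) - 1;  // next character of w to match
    int j = i;
    while (j >= 0 && p >= 0) {
        if (S[j] == w[p]) --p;
        --j;
    }
    if (p >= 0) return {-1, i + 1};          // ran off the front: all of S[0..i] is deleted
    ++j;                                     // the loop stepped one position past the window start
    return {j, (i - j + 1) - static_cast<int>(w.size())};
}
```

For the sample message `browndcodw`, scanning back from the last position for `cow` stops at the `c` (index 6) and reports one removed character, the `d` between `o` and `w`.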

 

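My program adds one more optimization here. While scanning for remove, if the cost accumulated so far (in the code, dp[now]+i-now-len) already exceeds the d[i] found earlier, the scan stops and j is not decremented any further, because any j found after that point would give a value larger than the current d[i]. In that case it is enough to set remove = i-j.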

 

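Some people online also suggest preprocessing. I am not sure I understand them correctly, but the idea seems to be to build a 300*300*600 table, where T[i][j][k] is the minimum number of characters to delete from substring i..j so that it contains word w[k]. The table has only 54,000,000 entries, but it is completely unnecessary. The on-the-fly algorithm never needs every entry of the table, so part of it would be computed for nothing, which only adds time.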



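#include <cstdio>
#include <cstring>

char s[666], a[666][666];   // message and dictionary words, both stored 1-indexed
int dp[666];                // dp[i]: min deletions for the prefix s[1..i]; dp[0] = 0

int main()
{
    int i, j, n, m, now, len, k, p;
    scanf("%d%d", &m, &n);              // m = W (number of words), n = L (message length)
    scanf("%s", s + 1);
    for (i = 1; i <= m; i++) scanf("%s", a[i] + 1);
    for (i = 1; i <= n; i++) {
        dp[i] = dp[i - 1] + 1;          // fallback: delete s[i]
        for (j = 1; j <= m; j++) {
            now = i;
            p = len = strlen(a[j] + 1);
            // Scan backward, matching a[j] from its last character.
            // Stop early once this word can no longer beat dp[i].
            while (now > 0 && p > 0 && dp[now] + i - now - len <= dp[i]) {
                if (s[now] == a[j][p]) p--;
                now--;
            }
            if (p == 0) k = i - now - len;  // word matched inside s[now+1..i]
            else k = i - now;               // no match: delete all of s[now+1..i]
            dp[i] = dp[i] < dp[now] + k ? dp[i] : dp[now] + k;
        }
    }
    printf("%d\n", dp[n]);
    return 0;
}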


Few know that the cows have their own dictionary with W (1 ≤ W ≤ 600) words, each containing no more than 25 of the characters 'a'..'z'. Their cowmunication system, based on mooing, is not very accurate; sometimes they hear words that do not make any sense. For instance, Bessie once received a message that said "browndcodw". As it turns out, the intended message was "browncow" and the two letter "d"s were noise from other parts of the barnyard.

The cows want you to help them decipher a received message (also containing only characters in the range 'a'..'z') of length L (2 ≤ L ≤ 300) characters that is a bit garbled. In particular, they know that the message has some extra letters, and they want you to determine the smallest number of letters that must be removed to make the message a sequence of words from the dictionary.

Input
Line 1: Two space-separated integers, respectively: W and L
Line 2: L characters (followed by a newline, of course): the received message
Lines 3..W+2: The cows' dictionary, one word per line
Output
Line 1: a single integer that is the smallest number of characters that need to be removed to make the message a sequence of dictionary words.
Sample Input
6 10
browndcodw
cow
milk
white
black
brown
farmer
Sample Output
2


