(LA 3942) Remember the Word -- DP + Trie

STILLxjy

Categories: Dynamic Programming; 算法竞赛入门经典; Data Structures

Article link: https://blog.csdn.net/STILLxjy/article/details/52132145

Remember the Word

Time Limit: 3000MS 64bit IO Format: %lld & %llu

Neal is very curious about combinatorial problems, and now here comes a problem about words. Knowing
that Ray has a photographic memory and this may not trouble him, Neal gives it to Jiejie.
Since Jiejie can’t remember numbers clearly, he just uses sticks to help himself. Allowing for Jiejie’s
only 20071027 sticks, he can only record the remainders of the numbers divided by total amount of
sticks.
The problem is as follows: a word needs to be divided into small pieces in such a way that each
piece is from some given set of words. Given a word and the set of words, Jiejie should calculate the
number of ways the given word can be divided, using the words in the set.
Input
The input file contains multiple test cases. For each test case: the first line contains the given word
whose length is no more than 300 000.
The second line contains an integer S, 1 ≤ S ≤ 4000.
Each of the following S lines contains one word from the set. Each word will be at most 100
characters long. There will be no two identical words and all letters in the words will be lowercase.
There is a blank line between consecutive test cases.
You should proceed to the end of file.
Output
For each test case, output the number, as described above, from the task description modulo 20071027.
Sample Input
abcd
4
a
b
cd
ab
Sample Output
Case 1: 2

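Problem summary:
Given a dictionary of S distinct words and one long string, split the string into a concatenation of dictionary words (a word may be used more than once). How many ways are there to do this? For example, with the four words a, b, cd, ab, the string abcd has two decompositions: a+b+cd and ab+cd.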

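Analysis:
The hardest part of this problem is defining the state and deriving the state transition.
DP:
Let L be the length of the long string s (0-indexed), and let d[i] be the number of ways to decompose the suffix s[i..L-1] that starts at character i. The transition is d[i] = sum{ d[i+len(x)] }, taken over every dictionary word x that is a prefix of s[i..L-1]. The base case is d[L] = 1 (the empty suffix has exactly one decomposition), and the answer is d[0] modulo 20071027.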
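
To make the recurrence concrete, here is a minimal sketch that evaluates it directly by testing every dictionary word at every position. The function name countWaysNaive and the use of std::string are illustrative assumptions, not part of the original solution, and this version is far too slow for the full limits; it only illustrates the state and the transition.

```cpp
#include <cassert>
#include <string>
#include <vector>

const int MOD = 20071027;

// Hypothetical helper: counts the ways to split `text` into words from `dict`,
// modulo MOD, by trying every word at every position.
// d[i] = number of ways to split the suffix text[i..L-1]; d[L] = 1 (empty suffix).
int countWaysNaive(const std::string& text, const std::vector<std::string>& dict) {
    const int L = static_cast<int>(text.size());
    std::vector<int> d(L + 1, 0);
    d[L] = 1;
    for (int i = L - 1; i >= 0; --i) {
        for (const std::string& w : dict) {
            assert(!w.empty());  // an empty word would make d[i] depend on itself
            // w is a prefix of text[i..] exactly when this comparison returns 0
            if (text.compare(i, w.size(), w) == 0) {
                d[i] = (d[i] + d[i + w.size()]) % MOD;
            }
        }
    }
    return d[0];
}
```

Calling countWaysNaive("abcd", {"a", "b", "cd", "ab"}) walks i from 3 down to 0 and fills d in the same order as the recurrence requires, since d[i] only depends on entries with a larger index.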

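Finding the prefix words x:
If we enumerate every word x at each position i, that is up to 4000 string comparisons per position, which is too slow for a string of length up to 300 000. Instead, we insert all the words into a Trie. For each i, a single walk down the Trie along s[i], s[i+1], ... then finds all matching prefixes: every word node passed during the walk is one x. Since each word has at most 100 characters, one walk takes at most 100 steps.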
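
The walk described above can be sketched as follows. This is a small illustrative Trie, not the array-based one used in the accepted code: the names TrieNode, SmallTrie and prefixLengths are assumptions made for this example, and the walk returns the lengths of the matching words instead of their indices.

```cpp
#include <array>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical node type: next[c] == 0 means "no child" (node 0 is the root).
struct TrieNode {
    std::array<int, 26> next{};
    bool isWord = false;
};

struct SmallTrie {
    std::vector<TrieNode> nodes;
    SmallTrie() : nodes(1) {}  // start with the root only

    void insert(const std::string& w) {
        int u = 0;
        for (char ch : w) {
            assert(ch >= 'a' && ch <= 'z');  // lowercase alphabet assumed
            int c = ch - 'a';
            if (nodes[u].next[c] == 0) {
                nodes[u].next[c] = static_cast<int>(nodes.size());
                nodes.emplace_back();
            }
            u = nodes[u].next[c];
        }
        nodes[u].isWord = true;
    }

    // Lengths of all dictionary words that are prefixes of text[start..].
    std::vector<int> prefixLengths(const std::string& text, int start) const {
        std::vector<int> out;
        int u = 0;
        for (int i = start; i < static_cast<int>(text.size()); ++i) {
            int c = text[i] - 'a';
            if (c < 0 || c >= 26) break;   // not a lowercase letter
            u = nodes[u].next[c];
            if (u == 0) break;             // no word continues along this path
            if (nodes[u].isWord) out.push_back(i - start + 1);
        }
        return out;
    }
};
```

One walk reports every matching word at once, so the cost per position depends on the longest word (at most 100 steps here) rather than on the number of words in the dictionary.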

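Working through the sample:
Number of decompositions of abcd:
With i at a, the dictionary words that are prefixes of abcd are a and ab, so d[0] = d[1] + d[2].
With i at b, the only prefix word of bcd is b, and the only decomposition of bcd is b+cd, so d[1] = d[2] = 1.
With i at c, the only decomposition of cd is cd itself, so d[2] = d[4] = 1. (d alone is not a word, so d[3] = 0.)
Therefore d[0] = d[1] + d[2] = 1 + 1 = 2.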

Accepted code:

#include<cstdio>
#include<cstring>
#include<vector>
using namespace std;

const int maxnode = 4000 * 100 + 10;
const int sigma_size = 26;

// Trie over the lowercase alphabet
struct Trie {
  int ch[maxnode][sigma_size];
  int val[maxnode];
  int sz; // total number of nodes
  void clear() { sz = 1; memset(ch[0], 0, sizeof(ch[0])); } // initially only the root exists
  int idx(char c) { return c - 'a'; } // index of character c

  // Insert string s with attached value v. v must be non-zero, because 0 means "this node is not a word node"
  void insert(const char *s, int v) {
    int u = 0, n = strlen(s);
    for(int i = 0; i < n; i++) {
      int c = idx(s[i]);
      if(!ch[u][c]) { // the node does not exist yet
        memset(ch[sz], 0, sizeof(ch[sz]));
        val[sz] = 0;  // intermediate nodes carry the value 0
        ch[u][c] = sz++; // create the new node
      }
      u = ch[u][c]; // move down
    }
    val[u] = v; // the node of the last character carries the value v
  }

  // Find the words that are prefixes of s, looking at no more than len characters
  void find_prefixes(const char *s, int len, vector<int>& ans) {
    int u = 0;
    for(int i = 0; i < len; i++) {
      if(s[i] == '\0') break;
      int c = idx(s[i]);
      if(!ch[u][c]) break;
      u = ch[u][c];
      if(val[u] != 0) ans.push_back(val[u]); // found a prefix word
    }
  }
};

const int maxl = 300000 + 10; // maximum length of the text
const int maxw = 4000 + 10;   // maximum number of words
const int maxwl = 100 + 10;   // maximum length of one word
const int MOD = 20071027;

int d[maxl], len[maxw], S;
char text[maxl], word[maxwl];
Trie trie;

int main() {
  int kase = 1;
  while(scanf("%s%d", text, &S) == 2) {
    trie.clear();
    for(int i = 1; i <= S; i++) {
      scanf("%s", word);
      len[i] = strlen(word);
      trie.insert(word, i); // the value is the 1-based word index, so it is never 0
    }
    memset(d, 0, sizeof(d));
    int L = strlen(text);
    d[L] = 1; // the empty suffix has exactly one decomposition
    for(int i = L-1; i >= 0; i--) {
      vector<int> p; // indices of the words that are prefixes of text[i..]
      trie.find_prefixes(text+i, L-i, p);
      for(int j = 0; j < (int)p.size(); j++)
        d[i] = (d[i] + d[i+len[p[j]]]) % MOD;
    }
    printf("Case %d: %d\n", kase++, d[0]);
  }
  return 0;
}
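
As a possible variation, the Trie walk and the DP update can be fused, so that no temporary vector of matches and no len array are needed: the depth reached in the Trie already gives the word length. The sketch below is one way to do this under that assumption. The function name countWaysFused and the vector-based node storage are illustrative choices, not the code that was submitted.

```cpp
#include <array>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical fused version: walks the Trie from each position i and adds
// d[j+1] whenever the walk passes a word node after consuming text[i..j].
int countWaysFused(const std::string& text, const std::vector<std::string>& dict) {
    const int MOD = 20071027;
    std::vector<std::array<int, 26>> next(1, std::array<int, 26>{});  // node 0 is the root
    std::vector<char> isWord(1, 0);

    // Build the Trie; a child value of 0 means "no child".
    for (const std::string& w : dict) {
        int u = 0;
        for (char ch : w) {
            assert(ch >= 'a' && ch <= 'z');  // lowercase alphabet assumed
            int c = ch - 'a';
            if (next[u][c] == 0) {
                next[u][c] = static_cast<int>(next.size());
                next.push_back(std::array<int, 26>{});
                isWord.push_back(0);
            }
            u = next[u][c];
        }
        isWord[u] = 1;
    }

    const int L = static_cast<int>(text.size());
    std::vector<int> d(L + 1, 0);
    d[L] = 1;  // the empty suffix has exactly one decomposition
    for (int i = L - 1; i >= 0; --i) {
        int u = 0;
        for (int j = i; j < L; ++j) {
            int c = text[j] - 'a';
            if (c < 0 || c >= 26) break;   // not a lowercase letter
            u = next[u][c];
            if (u == 0) break;             // no dictionary word continues here
            if (isWord[u]) d[i] = (d[i] + d[j + 1]) % MOD;
        }
    }
    return d[0];
}
```

The inner loop stops as soon as the Trie has no matching child, so it runs for at most as many steps as the longest word, which matches the bound of the original find_prefixes walk.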
