POJ 1200: Hashing, Karp-Rabin, Discretization (Time to Review)

Crazy Search
Time Limit: 1000MS Memory Limit: 65536K
Total Submissions: 22715 Accepted: 6379

Description

Many people like to solve hard puzzles some of which may lead them to madness. One such puzzle could be finding a hidden prime number in a given text. Such number could be the number of different substrings of a given size that exist in the text. As you soon will discover, you really need the help of a computer and a good algorithm to solve such a puzzle. 
Your task is to write a program that given the size, N, of the substring, the number of different characters that may occur in the text, NC, and the text itself, determines the number of different substrings of size N that appear in the text. 

As an example, consider N=3, NC=4 and the text "daababac". The different substrings of size 3 that can be found in this text are: "daa"; "aab"; "aba"; "bab"; "bac". Therefore, the answer should be 5. 

Input

The first line of input consists of two numbers, N and NC, separated by exactly one space. This is followed by the text where the search takes place. You may assume that the maximum number of substrings formed by the possible set of characters does not exceed 16 Millions.

Output

The program should output just an integer corresponding to the number of different substrings of size N found in the given text.

Sample Input

3 4
daababac

Sample Output

5

Hint

Huge input,scanf is recommended.

Source

Southwestern Europe 2002



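Problem summary:

Given a string, count its distinct substrings of a fixed length. Two integers, n and nc, are also given,

where n is the required substring length and nc is the number of distinct letters that appear in the string.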
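As a baseline for small inputs, the task can be checked with a brute-force set of substrings. This is only a sketch for cross-checking, not the approach used in this post, and the function name `countDistinctBruteForce` is made up for illustration. It is too slow and memory-hungry for the full input size.

```cpp
#include <cstddef>
#include <set>
#include <string>

// Brute-force reference: store every window of length n in an ordered set.
// The set keeps one copy of each distinct substring, so its size is the answer.
std::size_t countDistinctBruteForce(const std::string& text, std::size_t n) {
    std::set<std::string> seen;
    if (n == 0 || text.size() < n) {
        return 0;  // no window of length n fits in the text
    }
    for (std::size_t i = 0; i + n <= text.size(); ++i) {
        seen.insert(text.substr(i, n));
    }
    return seen.size();
}
```

On the sample text "daababac" with n = 3, the windows are daa, aab, aba, bab, aba, bac, of which five are distinct.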


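Approach:

(1) Discretization: the number of distinct letters appearing in the text is fixed, and it can be smaller than 26,

so we can first map each letter to a small integer code, which reduces both memory use and running time.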

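scanf("%s",a);   // read the text
int sz=strlen(a),t=0;
for(int i=0;i<sz;i+=1){    // discretization
    if(name[a[i]-'a']==-1){
        // number each character the first time it appears; a[i]-'a' serves directly
        // as the array index (this assumes the text contains only lowercase letters)
        name[a[i]-'a']=t++;
    }
}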
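The snippet above indexes the table with a[i]-'a', which only works for lowercase letters. A minimal sketch of the same discretization that accepts any byte value is shown here. The function name `encode` and the 256-entry table are assumptions for illustration, not part of the original solution.

```cpp
#include <string>
#include <vector>

// Map each character to a code 0, 1, 2, ... in order of first appearance
// and return the text as a sequence of those codes.
std::vector<int> encode(const std::string& text) {
    std::vector<int> code(256, -1);  // -1 marks a byte value not seen yet
    std::vector<int> digits;
    digits.reserve(text.size());
    int next = 0;                    // next unused code
    for (unsigned char c : text) {
        if (code[c] == -1) {
            code[c] = next++;
        }
        digits.push_back(code[c]);
    }
    return digits;
}
```

For "daababac" the codes are d=0, a=1, b=2, c=3, so the encoded text is 0 1 1 2 1 2 1 3.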


(2) Hashing:

Here we use the Karp-Rabin hash function. Below is my own understanding of the method.


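The figure above shows the general idea, but the direction is not fixed: the digits can be accumulated from left to right or from right to left.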


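The underlying principle is:

Written in base x, the value Hash(i,L) from the figure is a number of at most L digits (exactly L digits if leading zeros are kept).

In other words, every substring of length L corresponds to one L-digit number in base x, and different substrings give different numbers, so the value lies in the range 0 to x^L - 1 and can be used directly as an array index.

Here L is the substring length, and x is equal to the number of different characters that can appear at each position.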
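The idea can be sketched as a small function that treats each window as a base-nc number and marks it in a table. This is an illustrative sketch, not the accepted code itself: the name `countDistinct` is hypothetical, and it assumes the text has at most nc distinct characters and that nc^n is small enough to allocate a table of that size (the problem statement bounds it by 16 million).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Count distinct substrings of length n by reading each window
// as an n-digit number in base nc and marking it in a table.
int countDistinct(const std::string& text, int n, int nc) {
    assert(n >= 1 && nc >= 1);
    const std::size_t len = static_cast<std::size_t>(n);
    if (text.size() < len) {
        return 0;
    }

    // Discretize: codes 0, 1, 2, ... in order of first appearance.
    std::vector<int> code(256, -1);
    int next = 0;
    for (unsigned char c : text) {
        if (code[c] == -1) code[c] = next++;
    }

    // limit = nc^n, the number of possible n-digit values in base nc.
    long long limit = 1;
    for (int i = 0; i < n; ++i) limit *= nc;

    std::vector<bool> seen(static_cast<std::size_t>(limit), false);
    long long value = 0;
    int count = 0;
    for (std::size_t i = 0; i < text.size(); ++i) {
        // Append the new digit; the modulo drops the digit that left the window.
        value = (value * nc + code[static_cast<unsigned char>(text[i])]) % limit;
        if (i + 1 >= len && !seen[static_cast<std::size_t>(value)]) {
            seen[static_cast<std::size_t>(value)] = true;
            ++count;
        }
    }
    return count;
}
```

With the sample (n = 3, nc = 4, codes d=0, a=1, b=2, c=3) the windows of "daababac" map to 5, 22, 25, 38, 25, 39, which are five distinct values.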


Here is the accepted (AC) code:

#include <iostream>
#include <string.h>
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <string>
#include <vector>
#include <list>
#include <map>
#include <queue>
#include <stack>
#include <bitset>
#include <algorithm>
#include <numeric>
#include <functional>
#define maxn 16000005
#define mod 100000007
#define cons 2

using namespace std;
typedef long long ll;   // portable replacement for the compiler-specific __int64
char a[maxn];
char name[30];
bool hashv[maxn];

int main()
{
    int n,nc;
    while(scanf("%d %d",&n,&nc)!=EOF){
        memset(name,-1,sizeof(name));
        memset(hashv,false,sizeof(hashv));
        getchar();
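        scanf("%s",a);   // read the text
        int sz=strlen(a),t=0;
        for(int i=0;i<sz;i+=1){    // discretization
            if(name[a[i]-'a']==-1){
            // number each character the first time it appears; a[i]-'a' serves directly
            // as the array index (this assumes the text contains only lowercase letters)
                name[a[i]-'a']=t++;
            }
        }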

        int tmp=0;
        t=nc;
        for(int i=0;i<n-1;i+=1){       // first fold the leading n-1 characters into tmp; t ends up as nc^n
            tmp=tmp*nc+name[a[i]-'a'];
            t*=nc;
        }

        int countt=0;
        for(int i=n-1;i<sz;i+=1){
            tmp=(tmp*nc+name[a[i]-'a'])%t;
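            // for i>=n this is equivalent to tmp=tmp*nc-name[a[i-n]-'a']*t+name[a[i]-'a'];
            // the modulo removes the contribution of the character that left the window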
            if(!hashv[tmp]){
                hashv[tmp]=true;
                countt+=1;
                /*string str(a,i-n+1,n);
                cout<<str<<'\n';*/
            }
        }
        printf("%d\n",countt);
    }
    return 0;
}
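The comment in the main loop says that taking the result modulo t is equivalent to subtracting the outgoing character's contribution. A small sketch that checks this on a sequence of character codes is given here. The function name `updatesAgree` is made up for illustration, and the equivalence relies on every code being in the range 0 to nc-1.

```cpp
#include <cstddef>
#include <vector>

// Run both window-update rules side by side over a sequence of codes
// and report whether they produce the same value at every step.
bool updatesAgree(const std::vector<int>& digits, int n, int nc) {
    if (n < 1 || nc < 1) {
        return true;  // nothing to compare
    }
    const std::size_t len = static_cast<std::size_t>(n);
    long long top = 1;  // nc^n, the weight of the digit that leaves the window
    for (int i = 0; i < n; ++i) top *= nc;

    long long byMod = 0;  // update used in the code above
    long long bySub = 0;  // explicit subtraction of the outgoing digit
    for (std::size_t i = 0; i < digits.size(); ++i) {
        byMod = (byMod * nc + digits[i]) % top;
        bySub = bySub * nc + digits[i];
        if (i >= len) {
            bySub -= digits[i - len] * top;
        }
        if (byMod != bySub) {
            return false;
        }
    }
    return true;
}
```

The two rules agree on the encoded sample text 0 1 1 2 1 2 1 3 with n = 3 and nc = 4. They diverge once a code is at least nc, which is why the discretization step matters.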


