UPC-Typo (String Hashing)

There are no hopeless situations,
only people who have lost hope in their situation.

UPC-Typo

Problem Description

It is now far into the future and human civilization is ancient history. Archaeologists from a distant planet have recently discovered Earth. Among many other things, they want to decipher the English language.
They have collected many printed documents to form a dictionary, but are aware that sometimes words are not spelled correctly (typos are a universal problem). They want to classify each word in the dictionary as either correct or a typo. Naïvely, they do this using a simple rule: a typo is any word in the dictionary such that deleting a single character from that word produces another word in the dictionary.
Help these alien archaeologists out! Given a dictionary of words, determine which words are typos. That is, which words result in another word in the dictionary after deleting a single character.
For example, if our dictionary is {hoose, hose, nose, noises}, then hoose is a typo because we can obtain hose by deleting a single 'o' from hoose. But noises is not a typo because deleting any single character does not result in another word in the dictionary.
However, if our dictionary is {hoose, hose, nose, noises, noise} then the typos are hoose, noises, and noise.

Problem Summary

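Given n strings: if deleting exactly one character from a string makes it identical to another string in the input, that string is a typo and must be printed.
e.g.

  ILOVEYOU and LOVEYOU: removing the I from the first string makes it identical to the second one, so the first string is printed.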

Input

The first line of input contains a single integer n, indicating the number of words in the dictionary.
The next n lines describe the dictionary. The ith of which contains the ith word in the dictionary. Each word consists only of lowercase English letters. All words are unique.
The total length of all strings is at most 1 000 000.

Output

Display the words that are typos in the dictionary. These should be output in the same order they appear in the input. If there are no typos, simply display the phrase NO TYPOS.

Sample

InputⅠ

5
hoose
hose
nose
noises
noise

OutputⅠ

hoose
noises
noise

InputⅡ

4
hose
hoose
oose
moose

OutputⅡ

hoose
moose

InputⅢ

5
banana
bananana
bannanaa
orange
orangers

OutputⅢ

NO TYPOS

Approach

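When I first saw this problem, I naively took it for an easy one: just brute-force it by storing every word in an unordered_map and looking things up. But~~~
I forgot one very important thing: I had not accounted for the cost of slicing the strings. Building each shortened copy of a word of length L takes O(L), and there are L of them, so a single long word already costs O(L^2). If this approach gets a TLE, this is where it gets stuck.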
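To make the cost concrete, the brute-force idea can be sketched roughly as follows. This is only an illustration of the naive approach described above, not the accepted solution; the function name `naive_typos` is made up for this sketch, and it uses an `unordered_set` rather than an `unordered_map` since only membership matters here.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Naive check (hypothetical helper): build every one-character deletion
// as a brand-new string and look it up in the dictionary.
// Each shortened copy costs O(L) to build, so a word of length L costs O(L^2).
std::vector<std::string> naive_typos(const std::vector<std::string>& words) {
    std::unordered_set<std::string> dict(words.begin(), words.end());
    std::vector<std::string> typos;
    for (const std::string& w : words) {
        for (std::size_t j = 0; j < w.size(); ++j) {
            // Copy everything except the character at position j.
            std::string shorter = w.substr(0, j) + w.substr(j + 1);
            if (dict.count(shorter)) {
                typos.push_back(w);
                break; // report each word at most once
            }
        }
    }
    return typos;
}
```

This is fine for short words, but with a total length of up to 1 000 000 a single very long word makes the quadratic copying far too slow.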
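Some experts might see the TLE coming right away and plan to precompute and store every string produced by a deletion instead, only to sigh that the string lengths are enough to make one go bald.
So is there a way to get the effect of joining the two pieces of a string without allocating that much memory? Yes: hashing.
A hash algorithm maps a pile of data to a single value, which then serves as a compact lookup key, much like the keys in the red-black tree behind map or the hash table behind unordered_map.
String hashing treats each letter as one digit of a number in some base larger than 26 and stores the result in an ull (unsigned long long). Choosing a prime base makes duplicate IDs unlikely, although it cannot rule them out completely, because the value wraps around modulo 2^64 on overflow.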
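A minimal sketch of such a polynomial string hash is shown below. The name `poly_hash` and the default base 211 are choices made for this illustration (211 happens to be the base used in the final code); nothing here is a library function.

```cpp
#include <string>

typedef unsigned long long ull;

// Polynomial hash (illustrative helper): read the string as a number whose
// digits are the character codes, in base `base`.
// Arithmetic on ull wraps modulo 2^64, so two different strings can still
// collide; a prime base larger than the alphabet only makes that unlikely.
ull poly_hash(const std::string& s, ull base = 211) {
    ull h = 0;
    for (char c : s)
        h = h * base + static_cast<unsigned char>(c);
    return h;
}
```

For example, `poly_hash("ab")` is `97 * 211 + 98`, since 'a' is 97 and 'b' is 98 in ASCII.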
Here is the idea behind this problem, demonstrated by analogy with decimal numbers.
To store the number 147852369,
we need to store
1
14
147
1478
14785
147852
1478523
14785236
147852369
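these 9 prefixes. To obtain 147(8)52369, i.e. the string with the 8 deleted, we shift the prefix 147 left by the 5 remaining digits and add the suffix 52369, which is the whole number minus the prefix 1478 shifted left by 5 digits:
$147 \times 10^5 + 147852369 - 1478 \times 10^5 = 14752369$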
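The same computation can be written as a small sketch on top of a prefix array. The helpers `prefix_values` and `value_without` are hypothetical names introduced here; the `zero` parameter exists only so the decimal example can map '0'..'9' to 0..9, and the caller is assumed to pass the correct power of the base.

```cpp
#include <cassert>
#include <string>
#include <vector>

typedef unsigned long long ull;

// prefix[i] = value of the first i characters read as a base-`base` number.
// Each character c contributes the digit (c - zero).
std::vector<ull> prefix_values(const std::string& s, ull base, char zero) {
    std::vector<ull> prefix(s.size() + 1, 0);
    for (std::size_t i = 0; i < s.size(); ++i)
        prefix[i + 1] = prefix[i] * base + static_cast<ull>(s[i] - zero);
    return prefix;
}

// Value of the string with the character at 0-based position j removed,
// computed in O(1). pw must equal base^(n - j - 1), where n is the length.
ull value_without(const std::vector<ull>& prefix, std::size_t j, ull pw) {
    std::size_t n = prefix.size() - 1;
    assert(j < n);
    // prefix[j] * pw                 : kept prefix, shifted over the suffix
    // prefix[n] - prefix[j + 1] * pw : the suffix after position j
    return prefix[j] * pw + prefix[n] - prefix[j + 1] * pw;
}
```

With `prefix_values("147852369", 10, '0')`, calling `value_without(p, 3, 100000)` reproduces the 14752369 from the formula above. Replacing base 10 by the hash base turns this into the hash of a word with one character deleted.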

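OK, that takes care of the memory. What remains is the unordered_map lookup, which is straightforward. One thing to watch out for: the number of strings and their lengths are not known in advance, so the storage is allocated dynamically (with vector).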
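Putting the pieces together, the whole idea can be sketched as a single function. This is an illustrative variant under a few assumptions, not the submitted code: `find_typos` is a made-up name, it stores hashes in an `unordered_set` instead of an `unordered_map`, and it rebuilds the prefix hashes of one word at a time so that only one prefix array is alive at once.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

typedef unsigned long long ull;

// Returns the typos in input order (hypothetical helper, base 211 assumed).
std::vector<std::string> find_typos(const std::vector<std::string>& words, ull base = 211) {
    // The longest word decides how many powers of the base are needed.
    std::size_t max_len = 0;
    for (const std::string& w : words)
        if (w.size() > max_len) max_len = w.size();
    std::vector<ull> pw(max_len + 1, 1);
    for (std::size_t i = 1; i <= max_len; ++i)
        pw[i] = pw[i - 1] * base;

    // Pass 1: record the full hash of every dictionary word.
    std::unordered_set<ull> seen;
    for (const std::string& w : words) {
        ull h = 0;
        for (char c : w) h = h * base + static_cast<unsigned char>(c);
        seen.insert(h);
    }

    // Pass 2: for each word, try every single-character deletion in O(1) each.
    std::vector<std::string> typos;
    std::vector<ull> prefix; // reused buffer, sized per word
    for (const std::string& w : words) {
        std::size_t n = w.size();
        prefix.assign(n + 1, 0);
        for (std::size_t i = 0; i < n; ++i)
            prefix[i + 1] = prefix[i] * base + static_cast<unsigned char>(w[i]);
        for (std::size_t j = 0; j < n; ++j) {
            ull p = pw[n - j - 1];
            // hash of w without w[j]: shifted prefix plus untouched suffix
            ull h = prefix[j] * p + prefix[n] - prefix[j + 1] * p;
            if (seen.count(h)) {
                typos.push_back(w);
                break;
            }
        }
    }
    return typos;
}
```

An empty result corresponds to printing NO TYPOS. As with any single 64-bit hash, a collision could in principle produce a wrong answer; this sketch does not try to guard against that.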

Time for the AC code

#include<algorithm>
#include<iostream>
#include<string.h>
#include <iomanip>
#include<stdio.h>
#include<utility>
#include<vector>
#include<string>
#include<math.h>
#include<cmath>
#include<queue>
#include<stack>
#include<deque>
#include<map>
#include<set>
#pragma warning(disable:4244)
#define PI 3.141592653589793
#pragma GCC optimize(2)
#define accelerate cin.tie(NULL);cout.tie(NULL);ios::sync_with_stdio(false);
#define EPS 1.0e-8
using namespace std;
typedef long long ll;
typedef unsigned long long ull;
const ll ll_inf = 9223372036854775807;
const int int_inf = 2147483647;
const short short_inf = 32767;
const char char_inf = 127;
ll gcd(ll a, ll b) { return b ? gcd(b, a % b) : a; }
inline ll read() {
	ll c = getchar(), Nig = 1, x = 0;
	while (!isdigit(c) && c != '-')c = getchar();
	if (c == '-')Nig = -1, c = getchar();
	while (isdigit(c))x = ((x << 1) + (x << 3)) + (c ^ '0'), c = getchar();
	return Nig * x;
}
inline void out(ll a) {
	if (a < 0)putchar('-'), a = -a;
	if (a >= 10)out(a / 10);
	putchar(a % 10 + '0');
}
ll phi(ll n)
{
	ll ans = n, mark = n;
	for (ll i = 2; i * i <= mark; i++)
		if (n % i == 0) { ans = ans * (i - 1) / i; while (n % i == 0)n /= i; }
	if (n > 1)ans = ans * (n - 1) / n; return ans;
}
ll qpow(ll x, ll n, ll mod) {
	ll res = 1;
	while (n > 0) {
		if (n & 1)res = (res * x) % mod;
		x = (x * x) % mod;
		n >>= 1;
	}
	return res;
}
ll mat_mod;
struct Mat {
	ll m[10][10];
};
Mat Mul(Mat A, Mat B, ll mat_size) {
	Mat res;
	memset(res.m, 0, sizeof(res.m));
	for (int i = 0; i < mat_size; i++)for (int j = 0; j < mat_size; j++)for (int k = 0; k < mat_size; k++)
		res.m[i][j] = (res.m[i][j] + (A.m[i][k] * B.m[k][j]) % mat_mod) % mat_mod;
	return res;
}
Mat mat_qpow(Mat data, ll power, ll mat_size) {
	Mat res;
	memset(res.m, 0, sizeof(res.m));
	for (int i = 0; i < mat_size; i++)res.m[i][i] = 1;
	while (power) {
		if (power & 1)res = Mul(res, data, mat_size);
		data = Mul(data, data, mat_size), power >>= 1;
	}
	return res;
}
#define Floyd for(int k = 1; k <= n; k++)for(int i=1;i<=n;i++)for(int j=1;j<=n;j++)
#define read read()
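ull base = 211; // hash base: a prime larger than the alphabet size
ull hash_pow[1000001]; // hash_pow[i] = base^i, modulo 2^64 via unsigned overflow
vector<string>save; // the words, in input order
#include<unordered_map>
unordered_map<ull, bool>mp; // hashes of all dictionary words
vector<vector<ull>>information; // information[i][j] = hash of the first j characters of word i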
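int main()
{
	accelerate;
	int n;
	cin >> n;
	// precompute the powers of the base up to the maximum total length
	hash_pow[0] = 1;
	for (int i = 1; i < 1000001; i++)
		hash_pow[i] = hash_pow[i - 1] * base;
	for (int i = 0; i < n; i++)
	{
		string temp;
		cin >> temp;
		save.push_back(temp);
		// hash_id[j] is the hash of the first j characters; hash_id[0] = 0 is the empty prefix
		vector<ull>hash_id;
		hash_id.push_back(0);
		int l = temp.length();
		for (int j = 1; j <= l; j++)
			hash_id.push_back(hash_id[j - 1] * base + temp[j - 1]);
		mp[hash_id[l]] = true; // register the hash of the whole word
		information.push_back(hash_id);
	}
	bool sw = true; // stays true as long as no typo has been printed
	for (int i = 0; i < n; i++)
	{
		int k = save[i].size();
		for (int j = 1; j <= k; j++)
		{
			// delete the j-th character (1-indexed): keep prefix [1, j - 1] and suffix [l, r]
			int r = k, l = j + 1;
			ull ans = information[i][j - 1] * hash_pow[r - l + 1] + information[i][r] - information[i][l - 1] * hash_pow[r - l + 1];
			// count() only looks the key up; operator[] would insert every missing hash
			if (mp.count(ans))
			{
				sw = false;
				cout << save[i] << '\n';
				break;
			}
		}
	}
	if (sw)
		cout << "NO TYPOS" << '\n';
	return 0;
}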

By-轮月
