XVII Open Cup named after E.V. Pankratiev. Eastern Grand Prix. Problem G. Gmoogle (simulation, string processing, text search)

XVII Open Cup named after E.V. Pankratiev. Eastern Grand Prix.

Problem G. Gmoogle
Input file: standard input
Output file: standard output
Time limit: 1 second
Memory limit: 256 megabytes


You are hired to create an alpha version of a new search engine named GMoogle. The alpha version should
work with content represented as a database of sentences:
 • Content is merged into a line S, consisting of characters 'a'-'z', 'A'-'Z', spaces, punctuation marks ".!?"
(quotes are not counted) and decimal digits.
 • If one of the characters ".!?" is present in S, it denotes the end of a sentence, except for
one special case: if the first non-space character after '.' is a lowercase English letter, then it is an
abbreviation sign and not the end of the sentence; for example, the string "I like tea in a 500 ml. cup"
contains one sentence, but the strings "Cup is 500 ml. I want it" and
"Cup is 500 ml. 500 ml is great for me" contain two sentences each.
 • The first non-space character after the end of a sentence is considered the first character of the new
sentence.
 • A word is a contiguous sequence of characters 'a'-'z', 'A'-'Z', delimited by spaces, punctuation marks or
the beginning/end of the sentence/string. It is guaranteed that digits cannot be neighbors of
letters, i.e. sequences like "10ml" or "R2D2" are illegal.
 • S may contain sentences with no words. It is guaranteed that S does not contain two or
more characters ".!?" in a row.
After the content is indexed, users make requests. Each request can be represented as a string q, consisting
of one or more words (the definition of a word is given above). Words are separated by an arbitrary number
of spaces (1 or more); leading and trailing spaces are possible.
Your program has to print all sentences from S in which all words from q are present (in any order).
Words are considered equal if all the letters at the corresponding positions are the same (case insensitive,
i.e. 'B' and 'b' are considered the same).


Input

The first line of the input contains a non-empty line S, consisting of no more than 1000 characters. The next line
contains one integer n (1 ≤ n ≤ 100), the number of requests. Then n requests q1, ..., qn follow, each
on a separate line in the format described above. Note that in S and qi leading and trailing spaces are
allowed.


Output
For each request q1, q2, ..., qn print the request on a separate line. Then print the list of found sentences
in the same order as they appear in S, one sentence per line. Requests and answers are printed in quotes;
answers are preceded by a single '-' and a single space; leading and trailing spaces must be eliminated.
See the sample for clarification.

Example

standard input 
Hello everyone. I want 2 coffee if you have it. I like coffee very much.
4
HELLO
Coffee
much coffee
VoDka


standard output
Search results for "HELLO":
- "Hello everyone."
Search results for "Coffee":
- "I want 2 coffee if you have it."
- "I like coffee very much."
Search results for "much coffee":
- "I like coffee very much."
Search results for "VoDka":


Source

XVII Open Cup named after E.V. Pankratiev. Eastern Grand Prix.


My Solution

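Problem summary: simulate a search system. A text is given, and each query lists several words; print every sentence that contains all of the queried words.


Simulation, string processing, text search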

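First preprocess the text into individual sentences, numbered 0, 1, 2, ..., and use a map<string, vector<int>> to build a mapping from each word to the sentences that contain it.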
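The indexing step described above can be sketched in isolation as follows. This is a minimal illustration and not the full solution further down; it assumes the sentences have already been split, and the helper names `splitWords` and `buildIndex` are hypothetical.

```cpp
#include <cctype>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: extract the words (maximal runs of letters) of a string, lowercased.
std::vector<std::string> splitWords(const std::string& text) {
    std::vector<std::string> words;
    std::string word;
    for (unsigned char c : text) {
        if (std::isalpha(c)) {
            word += static_cast<char>(std::tolower(c));
        } else if (!word.empty()) {
            words.push_back(word);
            word.clear();
        }
    }
    if (!word.empty()) words.push_back(word);
    return words;
}

// Hypothetical helper: map each word to the ascending list of sentence indices containing it.
std::map<std::string, std::vector<int>> buildIndex(const std::vector<std::string>& sentences) {
    std::map<std::string, std::vector<int>> index;
    for (int id = 0; id < static_cast<int>(sentences.size()); ++id) {
        for (const std::string& w : splitWords(sentences[id])) {
            std::vector<int>& ids = index[w];
            // Sentences are visited in order, so a repeated word in the
            // same sentence shows up as ids.back() == id and is skipped.
            if (ids.empty() || ids.back() != id) ids.push_back(id);
        }
    }
    return index;
}
```

Because sentence indices are appended in increasing order, checking only the last stored index is enough to keep each list free of duplicates.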

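Each word of a query then has its own set of sentences, and the intersection of these sets is the answer.

The intersection is computed with a map<int, int> check that counts, for every sentence, how many of these sets it appears in. After all words are processed, check is traversed once: the sentences whose count equals the number of words in the query form the intersection.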
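The counting approach to the intersection can be sketched on its own as follows. This is an illustrative sketch under the assumption that the word index stores each sentence at most once per word; the function name `matchAll` and its parameters are hypothetical.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: return the indices of sentences containing every query word.
// `index` maps a lowercase word to the list of sentence indices containing it,
// with each sentence listed at most once per word.
std::vector<int> matchAll(const std::map<std::string, std::vector<int>>& index,
                          const std::vector<std::string>& queryWords) {
    std::map<int, int> check;  // sentence index -> number of query words found in it
    for (const std::string& w : queryWords) {
        auto it = index.find(w);
        if (it == index.end()) continue;  // unknown word: no count can reach the total
        for (int id : it->second) ++check[id];
    }
    std::vector<int> result;
    const int total = static_cast<int>(queryWords.size());
    for (const auto& entry : check) {
        // std::map iterates in ascending key order, so results keep sentence order.
        if (entry.second == total) result.push_back(entry.first);
    }
    return result;
}
```

A word repeated within one query is counted once per occurrence on both sides (in `total` and in `check`), so a query such as "coffee coffee" still matches exactly the sentences that contain "coffee".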

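Notes: 1. A sentence may contain the same word several times; when building the mapping, map each word to that sentence only once.

       2. When the first non-space character after '.' is a lowercase letter, the '.' does not end the sentence.

       3. The last sentence of the text may have no punctuation mark and may be followed by many spaces; handle this case explicitly.

       4. The text is deliberately processed one sentence at a time: first accumulate the current sentence, and only once it is confirmed to end at this position build the mapping from its words to this sentence.

       5. For both the word mapping and the queries, convert everything to lowercase with isupper and tolower from cctype before comparing.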
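Notes 2 to 4 concern sentence splitting, which can be sketched separately as follows. This is a simplified illustration rather than the code used in the solution below; the function name `splitSentences` is hypothetical, and unlike the solution it also keeps sentences that consist of a punctuation mark alone (these contain no words, so they can never match a query).

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper: split the content S into sentences following the statement's rules.
std::vector<std::string> splitSentences(const std::string& s) {
    std::vector<std::string> sentences;
    std::string cur;
    const std::size_t n = s.size();
    for (std::size_t i = 0; i < n; ++i) {
        const char c = s[i];
        if (cur.empty() && c == ' ') continue;  // skip spaces before a sentence starts
        cur += c;
        if (c != '.' && c != '!' && c != '?') continue;
        if (c == '.') {
            std::size_t j = i + 1;
            while (j < n && s[j] == ' ') ++j;
            // A lowercase letter after '.' marks an abbreviation, not a sentence end.
            if (j < n && std::islower(static_cast<unsigned char>(s[j]))) continue;
        }
        sentences.push_back(cur);
        cur.clear();
    }
    // The last sentence may lack a terminator; drop its trailing spaces.
    while (!cur.empty() && cur.back() == ' ') cur.pop_back();
    if (!cur.empty()) sentences.push_back(cur);
    return sentences;
}
```

On the three example strings from the statement this yields one, two and two sentences respectively.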

Time complexity: O(nlogn + k*qlogn)

Space complexity: O(n)


#include <bits/stdc++.h>
using namespace std;
string s, word, line;
vector<string> senc;
map<string, vector<int>> mp;
map<int, int> check;
int main () {
    #ifdef LOCAL
    freopen("g.txt", "r", stdin);
    #endif // LOCAL

    getline(cin, s);
    int n, sz = s.size(), i, j, len, cnt = 0, k;
    // Drop trailing spaces; the sz > 0 guard prevents reading s[-1] on an all-space line.
    while(sz > 0 && s[sz-1] == ' '){
        sz--;
    }
    bool flag;
    for(i = 0; i < sz; i++){
        if(s[i] == '.' || s[i] == '!' || s[i] == '?'){
            flag = true;
            if(s[i] == '.'){

                for(j = i + 1; j < sz; j++){
                    if(islower(s[j])){
                        flag = false;//cout <<"?"<<endl;
                        break;
                    }
                    else if(s[j] != ' ' && s[j] != '\0'){
                            //cout << s[j] << " ? \n";
                        break;
                    }

                }

            }
            if(!flag){
                line += s[i];
                continue;
            }

            len = line.size();
            if(len != 0){
                //cout << line << endl;
                for(j = 0; j < len; j++){
                    if(islower(line[j])){
                        word += line[j];
                    }
                    else if(isupper(line[j])){
                        word += tolower(line[j]);
                    }
                    else if(!word.empty()){
                        if(mp[word].empty() || (!mp[word].empty() && mp[word].back() != cnt))
                            mp[word].push_back(cnt);
                        //cout << word << " " << cnt << endl;
                        word.clear();
                    }
                }
                if(!word.empty()){
                    if(mp[word].empty() || (!mp[word].empty() && mp[word].back() != cnt))
                        mp[word].push_back(cnt);
                    word.clear();
                }
                line += s[i];
                senc.push_back(line);
                line.clear();
                cnt++;
            }
        }
        else{
            if(line.size() == 0 && (s[i] == ' ' || s[i] == '\0')){ //!
                    ;
            }
            else{
                line += s[i];
            }
        }
    }
    // Flush the last sentence, which may have no terminating punctuation mark.
    len = line.size();
    if(len != 0){
        for(j = 0; j < len; j++){
            if(islower(line[j])){
                word += line[j];
            }
            else if(isupper(line[j])){
                word += tolower(line[j]);
            }
            else if(!word.empty()){
                if(mp[word].empty() || (!mp[word].empty() && mp[word].back() != cnt))
                    mp[word].push_back(cnt);
                word.clear();
            }
        }
        if(!word.empty()){
            if(mp[word].empty() || (!mp[word].empty() && mp[word].back() != cnt))
                mp[word].push_back(cnt);
            word.clear();
        }
        senc.push_back(line);
        line.clear();
        cnt++;
    }
    /*
    for(auto x = mp.begin(); x != mp.end(); x++){
        cout << (x->first) << endl;
        sz = (x->second).size();
        for(i = 0; i < sz; i++){
            cout << " " << (x->second)[i] ;
        }
        cout << endl;
    }
    cout << endl;
    */

    cin >> n;
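    // Discard the rest of the line holding n, so the first getline reads q1.
    cin.ignore(numeric_limits<streamsize>::max(), '\n');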
    for(i = 0; i < n; i++){
        getline(cin, s);
        cout << "Search results for \"" << s << "\":\n";
        len = s.size();
        cnt = 0;
        for(j = 0; j < len; j++){
            if(islower(s[j])){
                word += s[j];
            }
            else if(isupper(s[j])){
                word += tolower(s[j]);
            }
            else if(!word.empty()){
                if(mp.find(word) != mp.end()){
                    sz = mp[word].size();
                    for(k = 0; k < sz; k++){
                        check[mp[word][k]]++;
                    }
                }
                cnt++;
                word.clear();
            }
        }
        if(!word.empty()){
            if(mp.find(word) != mp.end()){
                sz = mp[word].size();
                for(k = 0; k < sz; k++){
                    check[mp[word][k]]++;
                }
            }
            cnt++;
            word.clear();
        }

        for(auto x = check.begin(); x != check.end(); x++){
            if((x->second) == cnt){
                cout << "- \"" << senc[x->first] << "\"\n";
            }
        }
        check.clear();
    }
}


  Thank you!

                                                                                                                                             ------from ProLights
