SWERC13 Problem I: Trending Topic


Brute force with std::map.


Imagine you are in the hiring process for a company whose principal activity is the analysis
of information on the Web. One of the tests consists of writing a program that keeps a set of
trending topics up to date. You will be hired depending on the efficiency of your solution.
They provide you with text from the most active blogs. The text is organised daily and you
have to provide the sorted list of the N most frequent words during the last 7 days, when asked.
INPUT
Each input file contains one test case. The text corresponding to a day is delimited by the
tags <text> and </text>. Queries for the top N words can appear between the texts of two
different days. A top N query appears as a tag like <top 10 />. To make the input easier to
read, the number will always be delimited by white space, as in the sample.
Notes:
• All words consist only of lowercase letters and are at most 20 characters long.
• The maximum number of different words that can appear is 20000.
• The maximum number of words per day is 20000.
• Words of length less than four characters are considered of no interest.
• The number of days will be at most 1000.
• 1 ≤ N ≤ 20
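
The input format and the length rule above can be sketched as a small tokenizer. This is only an illustration: the `Event` struct and the `parseEvents` function are hypothetical names that do not appear in the solution further down, and the sketch assumes the input is well formed, with every tag separated by white space as in the sample.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical event type: one entry per day of text or per query.
struct Event {
    bool isQuery;                    // true for <top N />, false for a <text> block
    int n;                           // N for a query, 0 otherwise
    std::vector<std::string> words;  // kept words of a day, empty for a query
};

// Split the whole input into events, dropping words shorter than four letters.
std::vector<Event> parseEvents(std::istream& in) {
    std::vector<Event> events;
    std::string tok;
    while (in >> tok) {
        if (tok == "<text>") {
            Event day{false, 0, {}};
            while (in >> tok && tok != "</text>") {
                if (tok.size() >= 4) day.words.push_back(tok);
            }
            events.push_back(day);
        } else if (tok == "<top") {
            Event query{true, 0, {}};
            in >> query.n >> tok;  // tok swallows the closing "/>"
            events.push_back(query);
        }
    }
    return events;
}
```

Calling `parseEvents(std::cin)` would turn a whole test case into a list of days and queries that can then be processed in order.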
OUTPUT
Given a query, the list of the N most frequent words during the last 7 days must be shown. Words
must appear in decreasing order of frequency, and in alphabetical order when frequencies are equal.
All words whose number of appearances equals that of the word at position N must be shown,
even if the number of words shown then exceeds N.
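
The tie rule is the easiest part to get wrong, so here is a minimal sketch of it in isolation. `topWithTies` is a hypothetical helper, not part of the solution below. It assumes the 7-day counts are already available in a `std::map`, and it assumes that words with a count of zero should never be listed, which the statement does not spell out.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

typedef std::pair<std::string, int> WordCount;

// Return the top n words by count, extended with every word tied with the n-th one.
std::vector<WordCount> topWithTies(const std::map<std::string, int>& counts, std::size_t n) {
    std::vector<WordCount> ranked;
    for (const auto& kv : counts) {
        if (kv.second > 0) ranked.push_back(kv);  // assumption: skip words with no occurrences
    }
    // Higher count first, alphabetical order among equal counts.
    std::sort(ranked.begin(), ranked.end(), [](const WordCount& a, const WordCount& b) {
        if (a.second != b.second) return a.second > b.second;
        return a.first < b.first;
    });
    if (n == 0) {
        ranked.clear();
        return ranked;
    }
    if (ranked.size() > n) {
        int cutoff = ranked[n - 1].second;  // count of the word at position n
        std::size_t end = n;
        while (end < ranked.size() && ranked[end].second == cutoff) ++end;
        ranked.resize(end);
    }
    return ranked;
}
```

With the counts of the last sample query (blah 9, text 4, corresponding 3, please 3) and n = 3, the result has four entries, because "please" ties with "corresponding" at position 3.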


SAMPLE INPUT
<text>
imagine you are in the hiring process of a company whose
main business is analyzing the information that appears
in the web
</text>
<text>
a simple test consists in writing a program for
maintaining up to date a set of trending topics
</text>
<text>
you will be hired depending on the efficiency of your solution
</text>
<top 5 />
<text>
they provide you with a file containing the text
corresponding to a highly active blog
</text>
<text>
the text is organized daily and you have to provide the
sorted list of the n most frequent words during last week
when asked
</text>
<text>
each input file contains one test case the text corresponding
to a day is delimited by tag text
</text>
<text>
the query of top n words can appear between texts corresponding
to two different days
</text>
<top 3 />
<text>
blah blah blah blah blah blah blah blah blah
please please please
</text>
<top 3 />
SAMPLE OUTPUT
<top 5>
analyzing 1
appears 1
business 1
company 1
consists 1
date 1
depending 1
efficiency 1
hired 1
hiring 1
imagine 1
information 1
main 1
maintaining 1
process 1
program 1
simple 1
solution 1
test 1
that 1
topics 1
trending 1
whose 1
will 1
writing 1
your 1
</top>
<top 3>
text 4
corresponding 3
file 2
provide 2
test 2
words 2
</top>
<top 3>
blah 9
text 4
corresponding 3
please 3
</top>



#include <iostream>
#include <cstdio>
#include <cstring>
#include <algorithm>
#include <string>
#include <map>
#include <vector>

using namespace std;

typedef pair<int,int> pII;

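map<string,int> Hash;      // word -> id (ids start at 1, 0 means "not seen yet")
vector<int> dy[11];        // dy[d]: ids of all words seen up to day slot d (may hold duplicates)
string rHash[20200];       // id -> word
int day_sum[11][20200];    // day_sum[d][id]: cumulative count up to day slot d (10-slot ring buffer)
char cache[30];            // current input token
int now=9,pre=0,id=1;      // current slot, previous slot, next free id
int arr[20020],na;         // 7-day counts, used to find the cut-off frequency
string rss[20020];         // unused
bool vis[20020];           // ids already handled inside get_top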

void DEBUG(int x)
{
    int sz=dy[x].size();
    for(int i=0;i<sz;i++)
    {
        cout<<"ID: "<<dy[x][i]<<" : "<<rHash[dy[x][i]]<<endl;
        cout<<"sum: "<<day_sum[x][dy[x][i]]<<endl;
    }
}

struct RSP
{
    int times;
    string word;
}rsp[20020];

bool cmpRSP(const RSP& a,const RSP& b)
{
    if(a.times!=b.times)
        return a.times>b.times;
    else
        return a.word<b.word;
}

/// print every word whose 7-day count is at least the k-th largest count
void get_top(int now,int k)
{
    int sz=dy[now].size();
    na=0;
    /// day_sum holds prefix sums in a 10-slot ring, so (now+3)%10 is the slot 7 days back
    int _7dayago=(now+3)%10;
    memset(vis,false,sizeof(vis));
    for(int i=0;i<sz;i++)
    {
        int ID=dy[now][i];
        if(vis[ID]) continue;
        vis[ID]=true;
        int times=day_sum[now][ID]-day_sum[_7dayago][ID];
        /// words that do not occur inside the window are not ranked
        if(times>0) arr[na++]=times;
    }
    sort(arr,arr+na);
    /// cut-off: the k-th largest count, or the smallest one when fewer than k words qualify
    int sig=na>0?arr[max(0,na-k)]:1;
    int rn=0;
    memset(vis,false,sizeof(vis));
    for(int i=0;i<sz;i++)
    {
        int ID=dy[now][i];
        int times=day_sum[now][ID]-day_sum[_7dayago][ID];
        if(times>=sig&&vis[ID]==false)
        {
            rsp[rn].times=times;
            rsp[rn].word=rHash[ID];
            rn++;
            vis[ID]=true;
        }
    }
    sort(rsp,rsp+rn,cmpRSP);
    printf("<top %d>\n",k);
    for(int i=0;i<rn;i++)
    {
        printf("%s %d\n",rsp[i].word.c_str(),rsp[i].times);
    }
    printf("</top>\n");
}

int main()
{
    while(scanf("%29s",cache)==1)
    {
        if(strcmp(cache,"<text>")==0)
        {
            /// new day: advance the ring buffer and carry the prefix sums forward
            pre=now;
            now=(now+1)%10;
            dy[now]=dy[pre];
            memcpy(day_sum[now],day_sum[pre],sizeof(day_sum[0]));
            /// read words until the closing tag; ==1 also stops at end of input
            while(scanf("%29s",cache)==1)
            {
                if(cache[0]=='<') break;
                if(strlen(cache)<4) continue;
                string word=cache;
                if(Hash[word]==0)
                {
                    rHash[id]=word;
                    Hash[word]=id++;
                }
                int ID=Hash[word];
                /// first appearance overall: remember the id for later queries
                if(day_sum[pre][ID]==0)
                    dy[now].push_back(ID);
                day_sum[now][ID]++;
            }
        }
        else if(strcmp(cache,"<top")==0)
        {
            /// a query looks like "<top N />": read N, then consume "/>"
            int top;
            if(scanf("%d",&top)!=1) break;
            scanf("%29s",cache);
            get_top(now,top);
        }
    }
    return 0;
}
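
The solution above gets the 7-day counts by subtracting two prefix-sum rows of a ring buffer. A different way to keep the same window, shown here only as a sketch, is to store one map per day in a deque and subtract the oldest day when it falls out of the window. `WindowCounter` and its members are hypothetical names, and this is a swapped-in technique rather than what the code above does.

```cpp
#include <cstddef>
#include <deque>
#include <map>
#include <string>
#include <vector>

// Keeps word counts over the most recent `days` days.
class WindowCounter {
public:
    explicit WindowCounter(std::size_t days = 7) : maxDays_(days) {}

    // Add one day of (already filtered) words, evicting the oldest day if needed.
    void addDay(const std::vector<std::string>& words) {
        std::map<std::string, int> today;
        for (const std::string& w : words) {
            ++today[w];
            ++total_[w];
        }
        days_.push_back(today);
        if (days_.size() > maxDays_) {
            // Subtract the counts of the day that leaves the window.
            for (const auto& kv : days_.front()) {
                auto it = total_.find(kv.first);
                it->second -= kv.second;
                if (it->second == 0) total_.erase(it);
            }
            days_.pop_front();
        }
    }

    // Occurrences of a word inside the current window.
    int count(const std::string& w) const {
        auto it = total_.find(w);
        return it == total_.end() ? 0 : it->second;
    }

private:
    std::size_t maxDays_;
    std::deque<std::map<std::string, int>> days_;
    std::map<std::string, int> total_;
};
```

Each day costs time proportional to the words added plus the words evicted, and a query only has to look at words that actually occur inside the window.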

