2013 NetEase Games Summer Internship Interview Questions


jiaowopan


Article link: https://blog.csdn.net/jiaowopan/article/details/8844236

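I recently interviewed at NetEase Games and did poorly: there were two questions I could not answer, so I looked up solutions online afterwards.

1. English word segmentation: given a long string that is a sentence with no spaces between its words, and a word list, how can the sentence be split into words?

The analysis and program below come from http://www.cnblogs.com/speedmancs/archive/2011/06/05/2073339.html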

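/*

Given a string without separators, "thisisasentence", split it into the sentence "this is a sentence".

A function is provided to test whether a string is a word: bool dic(char* w);

Complete the following function. It should be as efficient as possible.

bool Detect(char* str)
{

}

Write out the full approach where possible, ideally with pseudocode.

Hint: recursion and backtracking. The solution here uses longest-word-first matching plus depth-first search with backtracking.

The test data comes from an ordinary English text of a few thousand words. It is preprocessed first

to obtain the long string and the word dictionary. Because the implementation uses the STL string, its interface differs somewhat from the one given in the problem,

but that does not affect the solution. In this program, Go(str,startIdx) means segmenting str(startIdx:),

and bool dic(char * w) corresponds to the find operation on the dictionary map.

*/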


#include <iostream>
#include <string>
#include <vector>
#include <fstream>
#include <map>
using namespace std;
 

int maxWordLen = 0;
char puncs[] = {'.',',','-',')','(','[',']','\"'};
string splitResult[1000000];
bool bSuc = false;
//number of words split off so far
int splittedWordNum = 0;
map<string,int> wordDic;

 

//check whether a character is a punctuation mark

bool isPunc(char ch)
{
    for (int i = 0;i < sizeof(puncs);i++)
    {
        if (ch == puncs[i])
        {
            return true;
        }
    }
    return false;
}

 

//build the long string and the dictionary from a file

void ReadFromFile(const string & filePath,string& strSentence,
                  map<string,int>& wordDic)

{
    ifstream fin(filePath.c_str());
    string str;
    string word;
    int wordOccured = 0;
    while (fin >> str)
    {
        int firstIdx = 0;
        while(firstIdx < str.size() && isPunc(str[firstIdx]))
        {
            firstIdx++;
        }

        int secIdx = str.size() - 1;
        while(secIdx >=0 && isPunc(str[secIdx]))
        {
            secIdx --;
        }
        if (secIdx >= firstIdx)
        {
            word = str.substr(firstIdx,secIdx - firstIdx + 1);
            wordDic[word] = 1;
            strSentence = strSentence + word;
            if (secIdx - firstIdx + 1 > maxWordLen)
            {
                maxWordLen = secIdx - firstIdx + 1;
            }
            wordOccured++;
            //cout << word << " ";
        }
    }
    cout << wordOccured << endl;
    fin.close();
}

void PrintSplitResult()
{
    for (int i = 0;i<splittedWordNum;i++)
    {
        cout << splitResult[i] << " ";
    }
}

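void Go(string & strSentence,int startIdx)
{
    //stop as soon as one segmentation has been found
    if (bSuc)
    {
        return;
    }
    //reached the end of the sentence: segmentation is complete
    if (startIdx == (int)strSentence.size())
    {
        PrintSplitResult();
        //cout << endl;
        //cout << splittedWordNum << endl;
        bSuc = true;
        return;
    }
    //otherwise try candidates starting from the longest possible word
    int maxLen = strSentence.size() - startIdx;
    if (maxLen > maxWordLen)
    {
        maxLen = maxWordLen;
    }
    //the !bSuc test skips the remaining shorter candidates once a result exists
    for (int len = maxLen;len >0 && !bSuc;len--)
    {
        string candidateWord = strSentence.substr(startIdx,len);
        //the candidate is in the dictionary
        if (wordDic.find(candidateWord) != wordDic.end())
        {
            splittedWordNum ++;
            splitResult[splittedWordNum - 1] = candidateWord;
            //recursively segment the substring that starts at startIdx + len
            Go(strSentence,startIdx + len);
            splittedWordNum --; // backtrack: drop this word before trying a shorter one
        }
    }
}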
int main(int argc, char* argv[])
{
    string strSentence;
    string filePath = "in.txt";
    ReadFromFile(filePath,strSentence,wordDic);
    Go(strSentence,0);
    //cout << wordDic.size() << endl;
    //cout << splittedWordNum << endl;
    if (!bSuc)
    {
        cout << "Segmentation failed!" << endl;
    }
    return 0;
}
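
The depth-first search above can revisit the same start position many times when a prefix can be split in several ways but the rest of the sentence cannot (for example, a long run of "a" followed by "b" with the dictionary {a, aa, aaa}). A minimal sketch of one common remedy, remembering start positions already known to be dead ends, is shown below. `SegmentMemo` and its parameters are hypothetical names chosen for this illustration; they are not part of the original program.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper (not from the original program): splits `s` into
// dictionary words, longest candidate first, and records start positions
// that cannot lead to a full segmentation so they are never retried.
// Returns true and fills `out` on success; `out` is left empty on failure.
bool SegmentMemo(const std::string& s,
                 const std::unordered_set<std::string>& dict,
                 std::vector<std::string>& out) {
    // Longest dictionary word bounds the candidate length, like maxWordLen.
    std::size_t maxLen = 0;
    for (const std::string& w : dict) maxLen = std::max(maxLen, w.size());

    std::vector<char> dead(s.size() + 1, 0);  // dead[i] != 0: s[i:] cannot be split
    out.clear();

    std::function<bool(std::size_t)> go = [&](std::size_t start) -> bool {
        if (start == s.size()) return true;   // consumed the whole sentence
        if (dead[start]) return false;        // already known to fail from here
        std::size_t limit = std::min(maxLen, s.size() - start);
        for (std::size_t len = limit; len > 0; --len) {
            std::string cand = s.substr(start, len);
            if (dict.count(cand) == 0) continue;
            out.push_back(cand);
            if (go(start + len)) return true;
            out.pop_back();                   // backtrack
        }
        dead[start] = 1;                      // every candidate failed
        return false;
    };
    return go(0);
}
```

With this memo each start position is fully explored at most once, so the number of dictionary lookups is bounded by the sentence length times the longest word length, at the cost of one extra flag per character.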

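2. Given a function that generates random numbers between 1 and 7, design a function that generates random numbers between 1 and 10.

The analysis and solution below come from http://blog.csdn.net/ssjhust123/article/details/7753012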

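Problem:

Given a function rand7() that generates uniform random integers from 1 to 7, write a function that generates uniform random integers from 1 to 10.

Idea:

Suppose we already had a function that generates random integers from 1 to 49. How could it be used to generate random integers from 1 to 10?

Solution:

The solution is based on a technique called rejection sampling. The main idea is to return a sample immediately if it falls inside the target range, and otherwise to discard it and sample again. Because every value inside the target range is equally likely to be chosen, the result is a uniform distribution.

Clearly rand7 has to be called at least twice, since a single call can produce only 7 distinct values. Two calls give a uniform integer from 1 to 49. In the table below, the row is the first call, the column is the second call, each cell is the value returned, and * marks a rejected pair: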

   1  2  3  4  5  6  7
1  1  2  3  4  5  6  7
2  8  9 10  1  2  3  4
3  5  6  7  8  9 10  1
4  2  3  4  5  6  7  8
5  9 10  1  2  3  4  5
6  6  7  8  9 10  *  *
7  *  *  *  *  *  *  *

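Because 49 is not a multiple of 10, some values have to be discarded. We keep only indices in the range 1-40; anything outside it is discarded and resampled.

Code: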

int rand10() {
  int row, col, idx;
  do {
    row = rand7();
    col = rand7();
    idx = col + (row-1)*7;
  } while (idx > 40);
  return 1 + (idx-1)%10;
}
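
To try the routine above, rand7() has to come from somewhere. The sketch below supplies an assumed stand-in built on `std::mt19937` with a fixed seed (the problem treats rand7() as given, so this implementation is purely illustrative), repeats rand10() so the block is self-contained, and adds a hypothetical `Histogram` helper that counts how often each value appears.

```cpp
#include <array>
#include <random>

// Assumed stand-in for the rand7() the problem takes as given:
// a uniform integer in [1, 7] from a fixed-seed Mersenne Twister.
int rand7() {
    static std::mt19937 gen(12345);
    static std::uniform_int_distribution<int> dist(1, 7);
    return dist(gen);
}

// Rejection sampling as in the listing above: build an index in 1..49,
// retry while it is above 40, then fold 1..40 onto 1..10.
int rand10() {
    int row, col, idx;
    do {
        row = rand7();
        col = rand7();
        idx = col + (row - 1) * 7;
    } while (idx > 40);
    return 1 + (idx - 1) % 10;
}

// Hypothetical helper: draws n samples and counts each value.
// counts[0] is unused so that counts[v] is the count for value v.
std::array<long, 11> Histogram(long n) {
    std::array<long, 11> counts{};
    for (long i = 0; i < n; ++i) {
        ++counts[rand10()];
    }
    return counts;
}
```

For a large n, each of counts[1] through counts[10] should land close to n/10 if the output is uniform.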

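Since row and col both range over 1-7, idx ranges over 1-49. Values above 40 are discarded, which leaves the range 1-40, and the result is obtained by taking the index modulo 10. The expected number of rand7 calls needed to obtain a value in the range 1-40 is: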

E(# calls to rand7) = 2 * (40/49) +
                      4 * (9/49) * (40/49) +
                      6 * (9/49)² * (40/49) +
                      ...

                    = ∑_{k=1}^{∞} 2k * (9/49)^(k-1) * (40/49)

                    = (80/49) / (1 - 9/49)²
                    = 2.45
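
The step from the infinite series to the closed form relies on the standard derivative-of-a-geometric-series identity. Spelled out as a worked derivation (nothing here goes beyond the quantities already used above):

```latex
\sum_{k=1}^{\infty} k\,x^{k-1}
  = \frac{d}{dx}\sum_{k=0}^{\infty} x^{k}
  = \frac{d}{dx}\,\frac{1}{1-x}
  = \frac{1}{(1-x)^{2}}, \qquad |x| < 1

E = 2\cdot\frac{40}{49}\sum_{k=1}^{\infty} k\left(\frac{9}{49}\right)^{k-1}
  = \frac{80/49}{\left(1-\frac{9}{49}\right)^{2}}
  = \frac{80}{49}\cdot\frac{49^{2}}{40^{2}}
  = \frac{49}{20} = 2.45
```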

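Optimization:

The method above needs about 2.45 calls to rand7 to produce one number in the range 1-10. It can be optimized further.

A value above 40 does not have to be thrown away immediately. Subtracting 40 from a value in 41-49 gives a uniform random number in 1-9, and combining it with one more rand7 call (1-7) gives a uniform random number in 1-63. Values in 1-60 can be used directly and only 61-63 are rejected, so just 3 values are discarded instead of the earlier 9. For a value in 61-63, subtracting 60 gives 1-3, and one more rand7 call turns that into a uniform number in 1-21; values in 1-20 are used directly and only 21 is rejected. At that point only a single value is discarded, which improves things further. Of course, each extra stage also costs one more call to rand7. The code is as follows: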

int rand10Imp() {
  int a, b, idx;
  while (true) {
    a = rand7();
    b = rand7();
    idx = b + (a-1)*7;
    if (idx <= 40)
      return 1 + (idx-1)%10;
    a = idx-40;
    b = rand7();
    // get uniform dist from 1 - 63
    idx = b + (a-1)*7;
    if (idx <= 60)
      return 1 + (idx-1)%10;
    a = idx-60;
    b = rand7();
    // get uniform dist from 1-21
    idx = b + (a-1)*7;
    if (idx <= 20)
      return 1 + (idx-1)%10;
  }
}
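
One way to check the cost of the three-stage routine empirically is to count rand7() calls while drawing many samples. The sketch below does that under stated assumptions: rand7() is again an illustrative fixed-seed stand-in, and the counter `g_rand7Calls` and the helper `AverageCalls` are hypothetical additions that do not appear in the original solution.

```cpp
#include <random>

// Call counter added for this sketch only.
long g_rand7Calls = 0;

// Assumed stand-in for the given rand7(): uniform on [1, 7], fixed seed.
int rand7() {
    static std::mt19937 gen(2013);
    static std::uniform_int_distribution<int> dist(1, 7);
    ++g_rand7Calls;
    return dist(gen);
}

// Same three-stage routine as in the listing above.
int rand10Imp() {
    int a, b, idx;
    while (true) {
        a = rand7();
        b = rand7();
        idx = b + (a - 1) * 7;          // uniform on 1..49
        if (idx <= 40) return 1 + (idx - 1) % 10;
        a = idx - 40;                   // uniform on 1..9
        b = rand7();
        idx = b + (a - 1) * 7;          // uniform on 1..63
        if (idx <= 60) return 1 + (idx - 1) % 10;
        a = idx - 60;                   // uniform on 1..3
        b = rand7();
        idx = b + (a - 1) * 7;          // uniform on 1..21
        if (idx <= 20) return 1 + (idx - 1) % 10;
    }
}

// Hypothetical helper: average rand7() calls per rand10Imp() result.
double AverageCalls(long n) {
    g_rand7Calls = 0;
    for (long i = 0; i < n; ++i) {
        rand10Imp();
    }
    return static_cast<double>(g_rand7Calls) / static_cast<double>(n);
}
```

Running `AverageCalls` with a large sample count gives a measured figure to compare against the analytical expectation.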

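The expected number of rand7 calls for the optimized method is:

E(# calls to rand7) = 2 * (40/49) +
                      3 * (9/49) * (60/63) +
                      4 * (9/49) * (3/63) * (20/21) +

                      (9/49) * (3/63) * (1/21) *
                      [ 6 * (40/49) +
                        7 * (9/49) * (60/63) +
                        8 * (9/49) * (3/63) * (20/21) ] +

                      ((9/49) * (3/63) * (1/21))² *
                      [ 10 * (40/49) +
                        11 * (9/49) * (60/63) +
                        12 * (9/49) * (3/63) * (20/21) ] +
                      ...

                    = (2 + 9/49 + (9/49) * (3/63)) / (1 - (9/49) * (3/63) * (1/21))
                    = 329/150
                    = 2.1933

The closed form follows because one pass through the loop always makes 2 calls, makes a third with probability 9/49 and a fourth with probability (9/49) * (3/63), and restarts with probability (9/49) * (3/63) * (1/21). The expected count is therefore about 2.19, roughly 10% fewer calls than the 2.45 of the unoptimized version.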
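
The series above can be cross-checked without summing it term by term: one pass through the loop has a fixed expected cost, and with a small probability the whole process restarts, so the expectation satisfies E = c + p * E. The sketch below evaluates that relation directly. The function names `ExpectedCallsSimple` and `ExpectedCallsThreeStage` are hypothetical, chosen for this illustration.

```cpp
// Expected rand7() calls for the plain rejection loop: every round costs
// 2 calls and succeeds with probability 40/49, so E = 2 / (40/49).
double ExpectedCallsSimple() {
    return 2.0 / (40.0 / 49.0);
}

// Expected rand7() calls for the three-stage version, from E = c + p * E,
// where c is the expected cost of one pass and p the restart probability.
double ExpectedCallsThreeStage() {
    const double reachThird = 9.0 / 49.0;                 // first stage rejected
    const double reachFourth = reachThird * 3.0 / 63.0;   // second stage rejected too
    const double restart = reachFourth * 1.0 / 21.0;      // all three stages rejected
    const double costPerPass = 2.0 + reachThird + reachFourth;
    return costPerPass / (1.0 - restart);
}
```

Comparing the two return values shows how much the extra stages save per generated number.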
