Optimizing a cheat solver for a letter-grid word game (goal: millisecond level, initially achieved)

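   As the figure shows, there is a class of games that asks the player to spell every possible word by chaining adjacent letters, never using the same cell twice.

   Recast as an ACM-style problem it is simple: an eight-direction backtracking traversal produces the answer. The only question is how long it takes.

  

   I ran a very simple test: plain backtracking that enumerates every possible combination without checking whether it is a word. The test case is as follows.

(Intel(R) Core(TM)2 Duo E7500, RAM: 2.00 GB, Windows 7 64-bit Enterprise Service Pack 1. The machine itself is slow and sluggish, so the timings may not be very accurate.)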

   5
   a b c d e
   f g h i j
   k l m n o
   p q r s t
   u v w x y

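   After one hour the backtracking was still working on paths that start with the letter a, that is, less than 4% complete (1 of 25 starting cells), with roughly 140,000 entries. That clearly won't do, so I made the first optimization: only allow words of at most six letters (optimization 1).

   Including reading the input and writing the output to a txt file, the enumeration then finished in 155.1 s, with between 60,000 and 70,000 entries. Not great, but tolerable.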
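
   To make the effect of the length cap concrete, here is a minimal sketch of a depth-capped eight-direction backtracking that only counts candidate paths. It is not the code used for the measurements above; `PathCounter`, `Walk` and `CountAll` are names invented for this illustration, and `max_len` plays the role of the six-letter cap.

```cpp
#include <vector>

// Counts every simple path (no cell reused) of length 1..max_len on an
// n x n grid with eight-direction moves. Illustrative only.
struct PathCounter {
    int n;
    int max_len;
    std::vector<bool> used;
    long long count = 0;

    PathCounter(int size, int cap)
        : n(size), max_len(cap), used(size * size, false) {}

    void Walk(int row, int col, int len) {
        ++count;                     // the path ending at this cell is one candidate
        if (len == max_len) return;  // depth cap: do not extend any further
        used[row * n + col] = true;
        for (int dr = -1; dr <= 1; ++dr) {
            for (int dc = -1; dc <= 1; ++dc) {
                if (dr == 0 && dc == 0) continue;
                int r = row + dr, c = col + dc;
                if (r < 0 || r >= n || c < 0 || c >= n) continue;
                if (used[r * n + c]) continue;
                Walk(r, c, len + 1);
            }
        }
        used[row * n + col] = false;  // backtrack
    }

    // Starts a walk from every cell and returns the total number of paths.
    long long CountAll() {
        count = 0;
        for (int r = 0; r < n; ++r)
            for (int c = 0; c < n; ++c)
                Walk(r, c, 1);
        return count;
    }
};
```

   Calling `PathCounter(5, 6).CountAll()` gives the number of candidates for the 5x5 test case under the six-letter cap; raising the cap shows how quickly the search space grows.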

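   Since we want words, invalid combinations have to be filtered out. I used a MySQL database of 13,000 words and checked the combinations one by one, with the following test result.

    

    Nearly 30 minutes, which is absolutely unbearable. Optimization 1 is clearly not enough.

    My first idea was this: with more than 60,000 candidate strings and a dictionary of 13,000 words, querying in the reverse direction (checking each dictionary word against the board) would directly improve efficiency, to roughly 1/6 of the original time. But that still works out to about five minutes, so I gave up on it.

    So how to optimize?

    Recall that optimization 1 trades accuracy for speed. Where is most of the time spent? In the backtracking, so the fix is to end hopeless branches as early as possible. Hence optimization 2: on top of optimization 1, use a fuzzy match (a SQL LIKE prefix query) to decide whether the current combination can still become a word. That gives two checks: 1. Is the string longer than six characters? 2. If not, can it still be extended into a word?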
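
    The prefix check itself does not depend on MySQL. As a hedged illustration, the sketch below does the same test in memory against a sorted word list using `std::lower_bound`. This is a swapped-in technique, not what the measured program does, and `HasWordWithPrefix` and `IsWord` are hypothetical helper names.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Returns true if at least one word in sorted_words starts with prefix.
// Assumption: sorted_words is sorted in ascending order.
bool HasWordWithPrefix(const std::vector<std::string>& sorted_words,
                       const std::string& prefix) {
    // First word that is not less than prefix. If any word starts with
    // prefix, this one does, because such words sort directly after prefix.
    auto it = std::lower_bound(sorted_words.begin(), sorted_words.end(), prefix);
    return it != sorted_words.end() &&
           it->compare(0, prefix.size(), prefix) == 0;
}

// Exact lookup, the in-memory counterpart of the final word verification.
bool IsWord(const std::vector<std::string>& sorted_words,
            const std::string& candidate) {
    return std::binary_search(sorted_words.begin(), sorted_words.end(), candidate);
}
```

    With the list {"tab", "tap", "tape", "trip"}, the prefix "ta" keeps the branch alive while "tz" ends it immediately, which is exactly the early exit optimization 2 is after.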

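     Taking the board in the figure above as the test case, the actual timings are:

       

       Unlike the timings above, these are more detailed and exclude manual input and the initial database connection. (The first line is the backtracking time; the second is the time to verify words against the database, including file output; the third is the time from database connection to program exit. The same applies below.)

       The efficiency clearly improved by a lot. Can we go further?

       Both optimization 1 and optimization 2 largely exist at the expense of accuracy, which is not really recommended in a real project. So what to do?

       My answer: drop the character-count check.

       The length check costs extra resources and also introduces inaccuracy, which leaves optimization 3 with the prefix check alone. Let's see how it does in practice:

       

      Surprisingly, it got slower?! This is the actual measurement taken while writing this article; in earlier tests optimization 2 was about one second slower than optimization 3. The result depends on the state of the machine at run time and on how quickly the database responds, so it is not very precise.

       But what about a 5x5 board? Using the 25 letters from the beginning as the test case:

        Optimization 2                                                                         Optimization 3

            

       In this test both runs started under comparable conditions, and the gap is so large that there is no need to re-run. Optimization 3 clearly wins.

       So we ask again: can it be faster? Clearly yes, for example with my original idea of going in reverse and checking the 13,000 words against the board. But how to design it? I don't have a good design yet, so this is shelved for now.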
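
       One possible shape for that reverse check, offered only as a sketch and not as a design the article arrives at: for each dictionary word, try to trace it on the board with a small backtracking search. `BoardMatcher` and `CanSpell` are hypothetical names, and the board is assumed to be n strings of n lowercase letters each.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Answers: can `word` be traced on the board through adjacent cells
// (eight directions) without reusing a cell? Illustrative only.
class BoardMatcher {
public:
    // Assumption: board holds n rows, each exactly n characters long.
    explicit BoardMatcher(const std::vector<std::string>& board)
        : board_(board), n_(static_cast<int>(board.size())),
          used_(board.size() * board.size(), false) {}

    bool CanSpell(const std::string& word) {
        if (word.empty()) return false;
        for (int r = 0; r < n_; ++r)
            for (int c = 0; c < n_; ++c)
                if (Match(word, 0, r, c)) return true;
        return false;
    }

private:
    // Tries to match word[pos..] starting at cell (r, c).
    bool Match(const std::string& word, std::size_t pos, int r, int c) {
        if (r < 0 || r >= n_ || c < 0 || c >= n_) return false;
        if (used_[r * n_ + c] || board_[r][c] != word[pos]) return false;
        if (pos + 1 == word.size()) return true;
        used_[r * n_ + c] = true;
        bool found = false;
        for (int dr = -1; dr <= 1 && !found; ++dr)
            for (int dc = -1; dc <= 1 && !found; ++dc)
                if (dr != 0 || dc != 0)
                    found = Match(word, pos + 1, r + dr, c + dc);
        used_[r * n_ + c] = false;  // backtrack before returning
        return found;
    }

    std::vector<std::string> board_;
    int n_;
    std::vector<bool> used_;
};
```

       Looping `CanSpell` over the whole word list needs no database round trip per candidate; whether that beats the forward search on a given board would have to be measured.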

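       How fast should this problem be solved? My target is millisecond level, within 50 ms. I can't get there yet, so, dear readers, any good ideas?

       Below is my code. Better algorithms are welcome.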

#include "stdafx.h"
#include <iostream>
#include <vector>
#include <string>
#include <algorithm>
#include <fstream>
#include <time.h>
#include <WinSock2.h>
#include <stdlib.h>
#include <stdio.h>
#include "mysql.h"

using namespace std;

const long long MAXN = 100;
static bool judge_flag[MAXN][MAXN];
static long long ActualN = 5;
static vector<string> results;
static char original_data[MAXN][MAXN];
static MYSQL mysql;
static MYSQL_RES *resultset;
static MYSQL_ROW row;

// CLOCKS_PER_SEC is already provided by <time.h>; it must not be redefined here.

void CleanFlag()
{
	for (long long i = 0; i < ActualN; i++)
	{
		for (long long j = 0; j < ActualN; j++)
		{
			judge_flag[i][j] = true;
		}
	}
}

inline bool CheckForBoundary(long long value_line, long long value_column)
{
	if ((value_line >= ActualN) || (value_line < 0))
	{
		return false;
	}
	else
	{
		if ((value_column >= ActualN) || (value_column < 0))
		{
			return false;
		}
		else
		{
			return true;
		}
	}
}

void Explore(long long line, long long column, string current_result);
bool JudgeWord(string judge_str);

inline void Handle(long long cur_line, long long cur_col, string cur_result)
{
		if (judge_flag[cur_line][cur_col])
		{
			string temp = cur_result;
			temp += original_data[cur_line][cur_col];
		    Explore(cur_line, cur_col, temp);
			return;
		}
}

void Explore(long long line, long long column, string current_result)
{
	if (!JudgeWord(current_result))
			return;
	vector<string>::iterator iter = std::find(results.begin(), results.end(), current_result);


	if (iter == results.end())
	{
		results.insert(results.end(), current_result);
	}


	judge_flag[line][column] = false;
	//Eight Directions
	long long current_line, current_column;
	//Left Up
	current_line = line;
	current_column = column;
	if (CheckForBoundary(--current_line, --current_column))
	{
		Handle(current_line, current_column, current_result);
	}
	//UP
	current_line = line;
	current_column = column;
	if (CheckForBoundary(--current_line, current_column))
	{
		Handle(current_line, current_column, current_result);
	}
	//Right UP
	current_line = line;
	current_column = column;
	if (CheckForBoundary(--current_line, ++current_column))
	{
		Handle(current_line, current_column, current_result);
	}
	//Left
	current_line = line;
	current_column = column;
	if (CheckForBoundary(current_line, --current_column))
	{
		Handle(current_line, current_column, current_result);
	}
	//Right
	current_line = line;
	current_column = column;
	if (CheckForBoundary(current_line, ++current_column))
	{
		Handle(current_line, current_column, current_result);
	}
	//Left Down
	current_line = line;
	current_column = column;
	if (CheckForBoundary(++current_line, --current_column))
	{
		Handle(current_line, current_column, current_result);
	}
	//Down
	current_line = line;
	current_column = column;
	if (CheckForBoundary(++current_line, current_column))
	{
		Handle(current_line, current_column, current_result);
	}
	//Right Down
	current_line = line;
	current_column = column;
	if (CheckForBoundary(++current_line, ++current_column))
	{
		Handle(current_line, current_column, current_result);
	}
	judge_flag[line][column] = true;
}


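// Exact match: returns true if judge_str is a word in the cetsix table.
bool RemoveNoWord(string judge_str)
{
		string sqlstr="select words from cetsix where words='"+judge_str+"'";
		if (!mysql_query(&mysql, sqlstr.c_str()))
		{
			resultset = mysql_store_result(&mysql);
			if (resultset == NULL) return false;
			bool found = mysql_num_rows(resultset) > 0;
			mysql_free_result(resultset);//release the result set; otherwise every query leaks one
			return found;
		}
		return false;
}


// Prefix match: returns true if at least one word in cetsix starts with judge_str.
bool JudgeWord(string judge_str)
{
		string sqlstr="select words from cetsix where words like '"+judge_str+"%'";
		if (!mysql_query(&mysql, sqlstr.c_str()))
		{
			resultset = mysql_store_result(&mysql);
			if (resultset == NULL) return false;
			bool found = mysql_num_rows(resultset) > 0;
			mysql_free_result(resultset);//release the result set; otherwise every query leaks one
			return found;
		}
		return false;
}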


int _tmain(int argc, _TCHAR* argv[])
{
	
	clock_t start_main, end_main, time_1, time_1e;
	
	//Input
	long long n;
	cin >> n;
	ActualN = n;
	for (long long i = 0; i < ActualN; i++)
	{
		for (long long j = 0; j < ActualN; j++)
		{
			cin >> original_data[i][j];
		}
	}
	//Handle
	results.clear();
	mysql_init(&mysql);//Initial MySQL
    if (!mysql_real_connect(&mysql, "localhost", "root","", "test", 3306, NULL, 0))	
	{
		cout << "Failed to connect to MySQL" << endl;
		return 1;//every later query would fail without a connection
	}
	else
	{
		cout << "Connected to MySQL" << endl;
	}


	start_main = clock();
	cout << "Explore now" << endl;
	time_1 = clock();
	for (long long i = 0; i < ActualN; i++)
	{
		for (long long j = 0; j < ActualN; j++)
		{
			CleanFlag();
			string temp;
			temp += original_data[i][j];
			Explore(i, j, temp);
		}
	}
	time_1e = clock();
	cout << (double)(time_1e - time_1)/CLOCKS_PER_SEC << "S" << endl;
	//Output into document
	time_1 = clock();
	ofstream out;
	out.open("D:\\Word.txt",ios::out|ios::app);
	vector<string>::iterator outiter;
	int countNum = 0;
	for (outiter = results.begin(); outiter != results.end(); outiter++)
	{
		if (RemoveNoWord(*outiter))
		{
		   out << *outiter << "; ";
		   countNum++;
		}
		if (countNum > 10) 
		{
			out << endl;
			countNum = 0;
		}
	}
	time_1e = clock();
	cout << (double)(time_1e - time_1)/CLOCKS_PER_SEC << "S" << endl;
	out.close();
	mysql_close(&mysql);
	end_main = clock();
	cout << (double)(end_main - start_main)/CLOCKS_PER_SEC << "S" << endl;
	system("pause");
	return 0;
}

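2014-04-24

      Teng already gave a brand-new solution (optimization 4) on the 19th, but I have been busy looking for a new job and got delayed. I have not written a detailed analysis; readers can analyze it themselves.

      Note that this program was developed mainly in Xcode. Under Apple's G++ it takes about 0.0004 s, and in a plain G++ environment it is even faster at only 0.0002 s; results to follow.

      The results below were measured on the original platform.

      First, the database is no longer used: the dictionary was exported to a txt file and read in.

      Timing first. Under the same conditions it is more than a hundred times faster:

      

        What is shown above is the execution time. No de-duplication was done, but the results are consistent.

        If the time to read the original dictionary is included,

        

         it is still much faster, and still excellent.

         So what about 5x5? Again using the earlier test case, where the previous execution time was 0.7 s.

         On the first run the computer visibly stuttered and the total time was 0.43 s; the second run was smoother, so the second run is used here.

         

         The actual execution time is as follows. Prepare to be amazed~

         

         Still a hundred times faster than optimization 3. Optimization 4 wins outright.

         With this, the 50 ms target is fully met. The Mac test results follow later and are even more impressive.

         Thanks again to Devil (Cao Teng) for the outstanding contribution. Here is the code, and that closes the thread~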

//
//  trie.h
//  word_game
//
//  Created by 曹 腾 on 14-4-19.
//  Copyright (c) 2014 zju. All rights reserved.
//

#ifndef word_game_trie_h
#define word_game_trie_h

/*
 Name: Basic Trie implementation
 Author: MaiK
 Description: Basic Trie implementation, including find, insert and delete operations
 */

#include <algorithm>
#include <iostream>
using namespace std;

const int sonnum=26,base='a';
struct Trie
{
    int num;//how many inserted words pass through this node, i.e. share this prefix
    bool terminal;//true if an inserted word ends at this node (the node may still have children)
    struct Trie *son[sonnum];//the following point
};

// create a new node
Trie *NewTrie()
{
    Trie *temp=new Trie;
    temp->num=1;temp->terminal=false;
    for(int i=0;i<sonnum;++i)temp->son[i]=NULL;
    return temp;
}

// insert a new word to Trie tree
void Insert(Trie *pnt,const char *s,int len)
{
    Trie *temp=pnt;
    for(int i=0;i<len;++i)
    {
        if(temp->son[s[i]-base]==NULL)temp->son[s[i]-base]=NewTrie();
        else temp->son[s[i]-base]->num++;
        temp=temp->son[s[i]-base];
    }
    temp->terminal=true;
}

// delete the whole tree
void Delete(Trie *pnt)
{
    if(pnt!=NULL)
    {
        for(int i=0;i<sonnum;++i)if(pnt->son[i]!=NULL)Delete(pnt->son[i]);
        delete pnt;
        pnt=NULL;
    }
}

//trie to find the current word
Trie* Find(Trie *pnt,char *s,int len)
{
    Trie *temp=pnt;
    for(int i=0;i<len;++i)
        if(temp->son[s[i]-base]!=NULL)temp=temp->son[s[i]-base];
        else return NULL;
    return temp;
}


#endif

//
//  main.cpp
//  word_game
//
//  Created by 曹 腾 on 14-4-19.
//  Copyright (c) 2014 zju. All rights reserved.
//

#include <iostream>
#include <fstream>
#include <string>   // std::string
#include <cstdio>   // printf
#include <ctype.h>
#include <ctime>
#include "trie.h"

using namespace std;

Trie* db;
int total_find = 0;
int total_search = 0;

#define GAME_SIZE 4
#define WORD_MAX  128
char game[GAME_SIZE][GAME_SIZE] = {'b', 'a', 't', 's',
    't', 'r', 'p', 'g',
    'v', 'e', 'i', 'p',
    'n', 'm', 's', 'i'};
bool flag[GAME_SIZE][GAME_SIZE];

void ResetFlag()
{
    for (int i=0; i<GAME_SIZE; ++i)
        for (int j=0; j<GAME_SIZE; ++j)
            flag[i][j] = false;
}

Trie* ConstructTrie(const char* file)
{
    Trie* db = NewTrie();
    
    ifstream infile(file);
    string line;
    char word[WORD_MAX];
    int trie_size = 0;
    int total_lines = 0;
    int invalid_lines = 0;
    // Loop on the extraction itself; testing eof() before reading can process the last word twice.
    while (infile >> line) {
        ++total_lines;
        
        // check for validity
        int len = (int)line.length();
        if (len < 1 || len > WORD_MAX-1) {
            ++invalid_lines;
            continue;
        }
        
        bool check_fail = false;
        for (int i=0; i<len; ++i) {
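            // Cast to unsigned char: passing a negative char to isalpha/tolower is undefined behavior.
            unsigned char ch = (unsigned char)line[i];
            if (!isalpha(ch)) {
                check_fail = true;
                break;
            }
            word[i] = (char)tolower(ch);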
        }
        if (check_fail) {
            ++invalid_lines;
            continue;
        }
        
        Insert(db, word, len);
        ++trie_size;
    }
    infile.close();
    cout << "total lines: " << total_lines << endl;
    cout << "invalid lines: " << invalid_lines << endl;
    cout << "trie size: " << trie_size << endl;
    return db;
}

void DFS(int x, int y, char* word, int len)
{
    Trie* leaf = Find(db, word, len);
    if (!leaf)
        return;
    ++total_search;
    if (leaf->terminal) {
        ++total_find;
        for (int i=0; i<len; ++i)
            printf("%c", word[i]);
        printf(" ");
    }
    
    flag[y][x] = true;
    static int dx[] = {-1, 0, 1, -1, 1, -1, 0, 1};
    static int dy[] = {-1, -1, -1, 0, 0, 1, 1, 1};
    for (int i=0; i<8; ++i) {
        int nx = x+dx[i];
        int ny = y+dy[i];
        // Check bounds first so that flag[][] is never indexed out of range.
        if (nx < 0 || nx >= GAME_SIZE || ny < 0 || ny >= GAME_SIZE)
            continue;
        if (flag[ny][nx])
            continue;
        word[len] = game[ny][nx];
        DFS(nx, ny, word, len+1);
    }
    flag[y][x] = false;
}

int main(int argc, const char * argv[])
{
    db = ConstructTrie("/Users/caoteng/Documents/word_game/English.txt");
    
    clock_t t = clock();
    char word[WORD_MAX];
    for (int y=0; y<GAME_SIZE; ++y) {
        for (int x=0; x<GAME_SIZE; ++x) {
            ResetFlag();
            word[0] = game[y][x];
            int len = 1;
            DFS(x, y, word, len);
        }
    }
    if (total_find > 0)
        cout << endl;
    cout << "total search: " << total_search << endl;
    cout << "total find: " << total_find << endl;
    
    t = clock() - t;
    printf ("%f seconds\n",((float)t)/CLOCKS_PER_SEC);
    
    Delete(db);
    return 0;
}

      The code above was debugged successfully under Xcode 5.1; please credit the original author when using it.
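
      The listing above looks up the whole prefix from the trie root at every DFS step and prints a word once per path that spells it. The following is a minimal sketch of one variation, not the original author's code: it carries the current trie position down the recursion and collects results in a `std::set` so each word appears once. `WordFinder`, `AddWord`, `Solve` and the index-based node layout are all assumptions made for this illustration.

```cpp
#include <array>
#include <set>
#include <string>
#include <vector>

// Index-based trie node for lowercase a-z words.
struct Node {
    std::array<int, 26> next;
    bool terminal = false;  // true if a word ends at this node
    Node() { next.fill(-1); }
};

class WordFinder {
public:
    WordFinder() : nodes_(1) {}  // node 0 is the root

    void AddWord(const std::string& w) {
        for (char ch : w)
            if (ch < 'a' || ch > 'z') return;  // skip words with other characters
        int cur = 0;
        for (char ch : w) {
            int k = ch - 'a';
            if (nodes_[cur].next[k] < 0) {
                nodes_[cur].next[k] = static_cast<int>(nodes_.size());
                nodes_.emplace_back();
            }
            cur = nodes_[cur].next[k];
        }
        if (!w.empty()) nodes_[cur].terminal = true;
    }

    // Assumption: board holds n rows of exactly n letters each.
    std::set<std::string> Solve(const std::vector<std::string>& board) const {
        std::set<std::string> found;
        const int n = static_cast<int>(board.size());
        std::vector<bool> used(n * n, false);
        std::string word;
        for (int r = 0; r < n; ++r)
            for (int c = 0; c < n; ++c)
                Dfs(board, n, r, c, 0, used, word, found);
        return found;
    }

private:
    void Dfs(const std::vector<std::string>& board, int n, int r, int c, int node,
             std::vector<bool>& used, std::string& word,
             std::set<std::string>& found) const {
        if (r < 0 || r >= n || c < 0 || c >= n || used[r * n + c]) return;
        int k = board[r][c] - 'a';
        if (k < 0 || k >= 26) return;
        int child = nodes_[node].next[k];
        if (child < 0) return;  // no dictionary word continues with this letter
        word.push_back(board[r][c]);
        used[r * n + c] = true;
        if (nodes_[child].terminal) found.insert(word);  // the set drops duplicates
        for (int dr = -1; dr <= 1; ++dr)
            for (int dc = -1; dc <= 1; ++dc)
                if (dr != 0 || dc != 0)
                    Dfs(board, n, r + dr, c + dc, child, used, word, found);
        used[r * n + c] = false;  // backtrack
        word.pop_back();
    }

    std::vector<Node> nodes_;
};
```

      Each step then costs one child lookup instead of a walk from the root, and `Solve` returns the words in sorted order without repeats such as the doubled "bat" in the output below.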

      Finally, time to be amazed on the Mac~

      In Xcode, total running time (Output):

total lines: 13654

invalid lines: 130

trie size: 13524

bat bat bar bare brim a art art ape tab tart tare tap tape taper tapir trap trip trip trim stab state star start starve stare strap strip stripe strip spa spat spat spate spar sprig spire tb tab tart tare tap tape taper tapir trap trip trip trim ten rat rat rate rap rapt rape re rep rem rip ripe ripen rig rip rim rise risen pat pat pate par part part paris ps prime prism pe per pert pert pen pirate pig pie pier pip g verb i is pirate pip pipe piper pig pie pier ps n net m me men mire ms se set sept semi sire siren sip sip spire sip i is 

total search: 449

total find: 118

0.021821 seconds

Program ended with exit code: 0

  Execution time only (Output):

total lines: 13654

invalid lines: 130

trie size: 13524

bat bat bar bare brim a art art ape tab tart tare tap tape taper tapir trap trip trip trim stab state star start starve stare strap strip stripe strip spa spat spat spate spar sprig spire tb tab tart tare tap tape taper tapir trap trip trip trim ten rat rat rate rap rapt rape re rep rem rip ripe ripen rig rip rim rise risen pat pat pate par part part paris ps prime prism pe per pert pert pen pirate pig pie pier pip g verb i is pirate pip pipe piper pig pie pier ps n net m me men mire ms se set sept semi sire siren sip sip spire sip i is 

total search: 449

total find: 118

0.000477 seconds

Program ended with exit code: 0

    For the 5x5 board:

total lines: 13654

invalid lines: 130

trie size: 13524

a abc bag cg chi chin de dim din dint g hi him i ic id in ins into ion join joint lf m mid mint mr mrs ms n no not ns on os snide so son sonic stoic to ton tonic up 

total search: 164

total find: 44

0.000172 seconds

Program ended with exit code: 0

     Results when compiled directly with G++

     5x5

total lines: 13654

invalid lines: 130

trie size: 13524

a abc bag cg chi chin de dim din dint g hi him i ic id in ins into ion join joint lf m mid mint mr mrs ms n no not ns on os snide so son sonic stoic to ton tonic up 

total search: 164

total find: 44

0.000110 seconds

     4x4

total lines: 13654

invalid lines: 130

trie size: 13524

bat bat bar bare brim a art art ape tab tart tare tap tape taper tapir trap trip trip trim stab state star start starve stare strap strip stripe strip spa spat spat spate spar sprig spire tb tab tart tare tap tape taper tapir trap trip trip trim ten rat rat rate rap rapt rape re rep rem rip ripe ripen rig rip rim rise risen pat pat pate par part part paris ps prime prism pe per pert pert pen pirate pig pie pier pip g verb i is pirate pip pipe piper pig pie pier ps n net m me men mire ms se set sept semi sire siren sip sip spire sip i is 

total search: 449

total find: 118

0.000338 seconds


A genuine sub-1 ms result.

The End

Please credit the source when reposting.
