Programming Pearls, Chapter 2, Problem C: Finding Anagrams


JohnnyHu90

Article link: https://blog.csdn.net/JohnnyHu90/article/details/42565353

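Problem statement:
C. Given a dictionary of English words, find all sets of anagrams. For example, "pots", "stop", and "tops" are anagrams of one another, because each can be formed by rearranging the letters of the others.
Problem analysis:
1. Anagrams have the same length and contain exactly the same letters; they differ only in the order in which those letters are arranged. If we can find a way to uniquely identify that shared collection of letters, the problem becomes easy to solve.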
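
To make the idea of a unique identifier concrete, here is a minimal sketch of one possible signature: the word's letters, lowercased and sorted. The helper name `signature` is hypothetical and is not part of the programs in this article; it only illustrates that all anagrams of a word map to the same string.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper (illustration only): returns the letters of `word`,
// lowercased and sorted, so that all anagrams share the same result.
std::string signature(const std::string& word) {
    std::string sig = word;
    for (char& c : sig) {
        // Cast to unsigned char first: passing a negative char to tolower is undefined.
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }
    std::sort(sig.begin(), sig.end());
    return sig;
}
```

Under this definition, `signature("pots")`, `signature("stop")`, and `signature("tops")` all yield `"opst"`, while a word with a different letter count, such as "posts", yields a different string.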

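Solution:

Approach 1: Give each word a signature by sorting its letters alphabetically, then collect the words that share a signature.
(1) Compute a signature for every word in the input file and write the results to another file. The code is as follows: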


#include <cstdio>
#include <cstdlib>       // qsort, exit
#include <cctype>        // tolower
#include <cstring>       // strlen, strcpy
#include <cassert>       // assert

#define WORDMAX 100
#define fatal_error( str )   ( fprintf( stderr, "%s\n", str ), exit( 1 ) )

// Comparison function for qsort: orders characters by their code value.
int charcomp(const void* x, const void* y) { return *(const char*)x - *(const char*)y; }

/************************************************************************/
// Function:     mytolower
// Purpose:      copy word into lword, converting uppercase letters to lowercase
// Parameters:   lword: destination buffer; word: the string to convert
// Returns:      pointer to the start of the converted string
// Precondition: lword has room for strlen(word) + 1 characters
/************************************************************************/
char* mytolower(char* lword, const char* word)
{
    char* start = lword;   // remember the start; lword is advanced below
    while (*word != '\0') {
        // tolower leaves non-uppercase characters unchanged
        *lword++ = (char)tolower((unsigned char)*word++);
    }
    *lword = '\0';         // terminate the converted string

    return start;
}

/************************************************************************/
// Function:     add_sign
// Purpose:      compute the signature of each word and write it to a file
// Parameters:   rfile: file to read from; wfile: file to write to
// Returns:      nothing
// Precondition: every input word is shorter than 100 characters
/************************************************************************/
void add_sign(FILE* rfile, FILE* wfile)
{
    assert(rfile != NULL && wfile != NULL);

    char word[WORDMAX], lword[WORDMAX], sign[WORDMAX];

    // %99s limits each read to WORDMAX - 1 characters to avoid overflowing word
    while (fscanf(rfile, "%99s", word) == 1) {
        mytolower(lword, word);
        strcpy(sign, lword);
        qsort(sign, strlen(sign), sizeof(char), charcomp);

        fprintf(wfile, "%s\t%s\n", sign, word);
    }
}

int main()
{
    FILE* rfile = fopen("dictionary.txt", "r");
    if (NULL == rfile) { fatal_error("Cannot open dictionary.txt!"); }

    FILE* wfile = fopen("sign_dictionary.txt", "w");
    if (NULL == wfile) { fatal_error("Cannot open sign_dictionary.txt!"); }

    add_sign(rfile, wfile);

    fclose(rfile);
    fclose(wfile);

    printf("Generation complete!\n");
    return 0;
}

A simple test file, dictionary.txt, and the generated sign_dictionary.txt are available at:

http://download.csdn.net/detail/johnnyhu90/8346745

(2) Load every (signature, word) pair from the signed output file into memory. This is done here with the C++ multimap and set containers. The code is as follows:


#include <cstdio>
#include <cstdlib>       // exit
#include <iostream>
#include <map>
#include <set>
#include <string>
using namespace std;

#define WORDMAX 100
#define fatal_error( str )   ( fprintf( stderr, "%s\n", str ), exit( 1 ) )

/************************************************************************/
// Function:     print_anagrams
// Purpose:      print all anagram sets, one set per line
// Parameters:   rfile: file to read from
// Returns:      nothing
// Precondition: rfile contains (signature, word) pairs, one pair per line
/************************************************************************/
void print_anagrams(FILE* rfile)
{
    char word[WORDMAX], sign[WORDMAX];
    multimap<string, string> anagrams;   // signature -> every word with that signature
    set<string> signs;                   // the distinct signatures

    // Stop as soon as a complete pair cannot be read; %99s guards the buffers
    while (fscanf(rfile, "%99s\t%99s", sign, word) == 2) {
        signs.insert(sign);
        anagrams.insert(make_pair(string(sign), string(word)));
    }

    typedef multimap<string, string>::iterator map_iter;
    for (set<string>::iterator iter = signs.begin(); iter != signs.end(); ++iter) {
        // Look up the range once instead of on every loop iteration
        pair<map_iter, map_iter> range = anagrams.equal_range(*iter);
        for (map_iter it = range.first; it != range.second; ++it) {
            cout << ' ' << it->second;
        }
        cout << endl;
    }
}

int main()
{
    FILE* rfile = fopen("sign_dictionary.txt", "r");
    if (NULL == rfile) { fatal_error("Cannot open sign_dictionary.txt!"); }

    print_anagrams(rfile);

    fclose(rfile);
    printf("Done!\n");
    return 0;
}
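
When the whole dictionary fits in memory, steps (1) and (2) could also be combined without the intermediate file. The following is only a sketch of that alternative, not the approach used above; the function name `anagram_classes` is hypothetical. It keys a `std::map` by signature and, unlike the program above, keeps only groups that contain at least two words.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper (illustration only): groups words by their sorted,
// lowercased letters and returns the groups that contain two or more words.
// Groups come back ordered by signature; words keep their input order.
std::vector<std::vector<std::string>> anagram_classes(const std::vector<std::string>& words) {
    std::map<std::string, std::vector<std::string>> groups;
    for (const std::string& w : words) {
        std::string sig = w;
        for (char& c : sig) {
            c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
        }
        std::sort(sig.begin(), sig.end());
        groups[sig].push_back(w);
    }

    std::vector<std::vector<std::string>> result;
    for (const auto& entry : groups) {
        if (entry.second.size() > 1) {   // a single word is not an anagram set
            result.push_back(entry.second);
        }
    }
    return result;
}
```

The trade-off is memory: this version holds every word at once, which is exactly what the on-disk signature file avoids.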

The output is as follows:

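Reflections and open questions:
1. Once every word has been given a signature and the (signature, word) pairs have been written to a file on disk, how can that file be sorted by signature, assuming the data cannot all be loaded into memory at once?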
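
One common answer to this question is an external merge sort: read the file in pieces small enough to fit in memory, sort each piece into a "run", then merge the runs while holding only one line per run in memory. The sketch below assumes each line has the form signature, tab, word, so that ordinary string comparison places equal signatures next to each other. The names `make_sorted_runs` and `merge_runs` are hypothetical, and the runs are kept as in-memory strings purely as stand-ins for the temporary files a real implementation would write.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <istream>
#include <ostream>
#include <queue>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Phase 1 (sketch): read at most max_lines lines at a time (assumed >= 1),
// sort them, and emit each sorted chunk as one run. A real implementation
// would write each run to its own temporary file instead of a string.
std::vector<std::string> make_sorted_runs(std::istream& in, std::size_t max_lines) {
    std::vector<std::string> runs;
    std::vector<std::string> chunk;
    std::string line;
    auto flush = [&]() {
        if (chunk.empty()) return;
        std::sort(chunk.begin(), chunk.end());
        std::string run;
        for (const std::string& l : chunk) { run += l; run += '\n'; }
        runs.push_back(run);
        chunk.clear();
    };
    while (std::getline(in, line)) {
        chunk.push_back(line);
        if (chunk.size() == max_lines) flush();
    }
    flush();   // the last chunk may be shorter than max_lines
    return runs;
}

// Phase 2 (sketch): k-way merge of already sorted runs. Only the current
// line of each run is held in memory, in a min-heap ordered by line text.
void merge_runs(const std::vector<std::istream*>& runs, std::ostream& out) {
    using Item = std::pair<std::string, std::size_t>;   // (line, index of its run)
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> heap;
    std::string line;
    for (std::size_t i = 0; i < runs.size(); ++i) {
        if (std::getline(*runs[i], line)) heap.push({line, i});
    }
    while (!heap.empty()) {
        Item smallest = heap.top();
        heap.pop();
        out << smallest.first << '\n';
        // Refill the heap from the run that just supplied the smallest line.
        if (std::getline(*runs[smallest.second], line)) heap.push({line, smallest.second});
    }
}
```

Because both functions take plain stream references, the same code should work with `std::ifstream` and `std::ofstream` objects opened on temporary run files. Once the file is sorted, equal signatures sit on adjacent lines, so the anagram sets can be printed in a single pass that compares each signature with the previous one.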
