Programming Pearls: The Anagram Algorithm

Latest recommended article published 2022-07-25 15:25:44

gochenguowei

Reposted from: https://blog.csdn.net/workformywork/article/details/16963613

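Problem Description
Given an English dictionary, find all sets of anagrams in it. For example, "pots", "stop", and "tops" are anagrams of one another, because each word can be formed by rearranging the letters of the others.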

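Approach
The first method that comes to mind is a two-level loop over the dictionary that checks every pair of words to see whether they are anagrams. Let n be the number of words and m the length of the longest word. To compare two words, we can sort the letters of each word and then compare the results, which gives a total time complexity of O(n^2 * m * log m). Alternatively, we can use a 26-element array of letter counts as a hash for the comparison, which gives a total time complexity of O(n^2 * m). Either way, the nested loop over the dictionary is quadratic in the number of words, so it is not efficient and another algorithm is worth considering.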

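A better approach is to attach an identifier to each word and then sort the words by that identifier. After sorting, words with the same identifier sit next to each other, which gives us every anagram set. The method can be divided into the following three steps: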

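1. Attach an identifier to each word. The key question is how to choose the identifier so that all anagrams of a word share it. This amounts to finding a hash function that gives every anagram of a word the same hash value. A suitable choice is to sort the letters of the word. For example, "pots", "stop", and "tops" are anagrams, and sorting the letters of any of them alphabetically gives the identifier "opst".
2. Sort the words in the dictionary by identifier. Step 1 produces an (identifier, word) pair for each word. Treat the pair as a single unit, which can be thought of as a struct (or object). For example, the struct could be defined as:
/* the tag name entry is illustrative */
struct entry {
    char identity[MAX_SIZE];
    char word[MAX_SIZE];
};
After step 1 we have an array of identifier/word pairs, which can then be sorted by identifier with quicksort.
3. Collect the anagram sets. After step 2, words with the same identifier are adjacent, so one pass over the sorted array gathers each set.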
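
The three steps above can also be sketched directly on an array of structs, using the standard library `qsort` instead of a hand-written quicksort. This is only a minimal sketch: the names `struct entry`, `sign_and_sort`, and `count_groups` are made up for this example, and the value of `MAX_SIZE` is assumed to be 100.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_SIZE 100 /* assumed upper bound on word length, including '\0' */

/* One dictionary entry: the word plus its sorted-letter identifier. */
struct entry {
    char identity[MAX_SIZE];
    char word[MAX_SIZE];
};

/* Orders single characters; used to sort the letters of a word. */
static int cmp_char(const void *a, const void *b) {
    return *(const unsigned char *)a - *(const unsigned char *)b;
}

/* Orders entries by identifier only. */
static int cmp_entry(const void *a, const void *b) {
    const struct entry *x = a;
    const struct entry *y = b;
    return strcmp(x->identity, y->identity);
}

/* Steps 1 and 2: compute each identifier, then sort entries by it. */
void sign_and_sort(struct entry *entries, size_t n) {
    size_t i;
    if (n == 0) return;
    assert(entries != NULL);
    for (i = 0; i < n; i++) {
        strcpy(entries[i].identity, entries[i].word);
        qsort(entries[i].identity, strlen(entries[i].identity), 1, cmp_char);
    }
    qsort(entries, n, sizeof entries[0], cmp_entry);
}

/* Step 3 (simplified): count the anagram sets in a sorted array by
 * counting how often the identifier changes between neighbours. */
size_t count_groups(const struct entry *entries, size_t n) {
    size_t i, groups;
    if (n == 0) return 0;
    assert(entries != NULL);
    groups = 1;
    for (i = 1; i < n; i++) {
        if (strcmp(entries[i].identity, entries[i - 1].identity) != 0) groups++;
    }
    return groups;
}
```

Keeping the word and its identifier in one struct means a single `qsort` call moves both together. Note that `qsort` is not guaranteed to be stable, so the order of words inside one anagram set is unspecified.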
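Implementation
The code below implements the algorithm described above. For step 2, it does not define a struct. Instead it uses a word array and a parallel identifier array, where the identifier of the i-th word in the word array is stored in the i-th element of the identifier array.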
#define MAX_SIZE 100

#include<stdio.h>
#include<string.h>
#include<stdlib.h>
#include<ctype.h>

void str_to_lower(char *str);
void sign(char words[][MAX_SIZE], int length, char sig[][MAX_SIZE]);
int char_compare(const void *c1, const void *c2);
int str_compare(const void *s1, const void *s2);
int partition(char words[][MAX_SIZE], char sig[][MAX_SIZE], int left, int right);
void qsort_str(char words[][MAX_SIZE], char sig[][MAX_SIZE], int left, int right);
void sort(char words[][MAX_SIZE], char sig[][MAX_SIZE], int length);

/**
 * Compare two characters.
 */
int char_compare(const void *c1, const void *c2) {
    return *(const char *)c1 - *(const char *)c2;
}
/**
 * All words are stored in the words array. Compute the identifier of every
 * word in words, where the identifier is the word's letters arranged in
 * alphabetical order.
 */
void sign(char words[][MAX_SIZE], int length, char sig[][MAX_SIZE]) {
    int i;
    for (i = 0; i < length; i++) {
        strcpy(sig[i], words[i]);
        qsort(sig[i], strlen(sig[i]), sizeof(char), char_compare);
    }
}

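/* Convert a string to lowercase in place. */
void str_to_lower(char *str) {
    size_t i;
    size_t len = strlen(str);
    for (i = 0; i < len; i++) {
        /* tolower expects a value representable as unsigned char */
        str[i] = (char)tolower((unsigned char)str[i]);
    }
}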

/**
 * Compare two strings. Wrapped this way so that it can be passed to qsort
 * to sort strings.
 */
int str_compare(const void *s1, const void *s2) {
    return strcmp((const char *)s1, (const char *)s2);
}

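int partition(char words[][MAX_SIZE], char sig[][MAX_SIZE], int left, int right) {
    char temp_word[MAX_SIZE];
    char temp_sig[MAX_SIZE];

    /* Take the leftmost element as the pivot, leaving a hole at left. */
    strcpy(temp_sig, sig[left]);
    strcpy(temp_word, words[left]);

    while (left < right) {
        /* Scan from the right for an identifier not greater than the pivot. */
        while (left < right && str_compare(temp_sig, sig[right]) < 0) right--;
        if (left < right) { /* strcpy must not copy a string onto itself */
            strcpy(sig[left], sig[right]);
            strcpy(words[left], words[right]);
        }

        /* Scan from the left for an identifier greater than the pivot. */
        while (left < right && str_compare(temp_sig, sig[left]) >= 0) left++;
        if (left < right) {
            strcpy(sig[right], sig[left]);
            strcpy(words[right], words[left]);
        }
    }
    /* left == right here: put the pivot into its final position. */
    strcpy(words[right], temp_word);
    strcpy(sig[right], temp_sig);
    return right;
}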

void qsort_str(char words[][MAX_SIZE], char sig[][MAX_SIZE], int left, int right) {
    int part;
    if (left >= right) return;
    part = partition(words, sig, left, right);
    qsort_str(words, sig, left, part - 1);
    qsort_str(words, sig, part + 1, right);
}
/**
 * Sort the string arrays using quicksort.
 */
void sort(char words[][MAX_SIZE], char sig[][MAX_SIZE], int length) {
    qsort_str(words, sig, 0, length - 1);
}
/* Collect the anagram sets. */
void squash(char words[][MAX_SIZE], char sig[][MAX_SIZE], int length) {
    int i;
    char oldsig[MAX_SIZE];
    if (length <= 0) return; /* nothing to print, and sig[0] would be invalid */
    strcpy(oldsig, sig[0]);
    printf("%-6s:", oldsig);
    for (i = 0; i < length; i++) {
        if (strcmp(oldsig, sig[i]) != 0) {
            /* A new identifier starts a new output line. */
            strcpy(oldsig, sig[i]);
            printf("\n");
            printf("%-6s:", oldsig);
        }
        printf("%-6s", words[i]);
    }
    printf("\n");
}

/* Print each identifier next to its word. */
void print(char words[][MAX_SIZE], char sig[][MAX_SIZE], int length) {
    int i;
    for (i = 0; i < length; i++) {
        printf("%s %s\n", sig[i], words[i]);
    }
}

int main(void) {
    char words[][MAX_SIZE] = {"pans", "pots", "opt", "snap", "stop", "tops"};
    char sig[6][MAX_SIZE];
    sign(words, 6, sig); // attach an identifier to each word
    print(words, sig, 6);
    sort(words, sig, 6); // sort by identifier
    printf("****************************************\n");
    print(words, sig, 6);
    printf("----------------------------------------\n");
    squash(words, sig, 6); // collect the anagram sets
    return 0;
}
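The implementation above uses six words as an example. If the dictionary contains many words and does not fit in memory, the intermediate results of the three steps can be saved in files.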
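
One way the file-based variant might look is to make step 1 a filter that reads words from one stream and writes "identifier word" lines to another, so that step 2 can be handed to an external sort over the resulting file. The sketch below covers only that first stage. The function name `sign_stream`, the one-word-per-token input format, and the `MAX_SIZE` limit of 100 are all assumptions of this example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_SIZE 100 /* assumed buffer size; words are limited to 99 characters */

/* Orders single characters; used to sort the letters of a word. */
static int cmp_char(const void *a, const void *b) {
    return *(const unsigned char *)a - *(const unsigned char *)b;
}

/* Reads whitespace-separated words from in and writes one
 * "identifier word" line per word to out. Returns the number of words
 * processed. A token longer than 99 characters is split by fscanf into
 * several pieces, which this sketch does not try to detect. */
long sign_stream(FILE *in, FILE *out) {
    char word[MAX_SIZE];
    char sig[MAX_SIZE];
    long count = 0;

    assert(in != NULL && out != NULL);

    /* The width 99 in the format must stay in sync with MAX_SIZE - 1. */
    while (fscanf(in, "%99s", word) == 1) {
        strcpy(sig, word);
        qsort(sig, strlen(sig), 1, cmp_char);
        fprintf(out, "%s %s\n", sig, word);
        count++;
    }
    return count;
}
```

Because every output line starts with the identifier, sorting the output file line by line brings anagrams together without holding the whole dictionary in memory, and the collection step can then read the sorted file one line at a time.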

---------------------
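Author: 我的十六亩三分地
Source: CSDN
Original article: https://blog.csdn.net/workformywork/article/details/16963613
Copyright notice: This is an original article by the blogger. Please include a link to the original post when reposting.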
