Introduction to Common GLib Data Structures (4)

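Queues

Concepts

Queues are another handy data structure. A queue holds a list of items, and the usual access pattern is to add items at the back and remove them from the front. This is useful when items need to be processed in order of arrival. A variant of the standard queue is the double-ended queue, or deque, which supports adding and removing at both ends.

In many cases, though, a queue is best avoided. Searching a queue isn't particularly fast (it's an O(n) operation), so if you need to search often, a hash table or a tree may be more practical. The same applies when you need to access arbitrary elements in the queue; doing so means many linear scans of the queue.

GLib provides a deque implementation in GQueue; it supports the standard queue operations. It's backed by a doubly linked list (GList), so it also supports many other operations, such as inserting into and removing from the middle of the queue. If you find yourself using those features a lot, though, you may want to reconsider your choice of container; another one may be a better fit.

Basic operations

Here are some basic GQueue operations, modeled on a ticket line: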

//ex-gqueue-1.c
#include <glib.h>
#include <glib/gprintf.h>  /* declares g_printf */
int main(int argc, char** argv) {
    GQueue* q = g_queue_new();
    g_printf("Is the queue empty? %s, adding folks\n", g_queue_is_empty(q) ? "Yes" : "No");
    g_queue_push_tail(q, "Alice");
    g_queue_push_tail(q, "Bob");
    g_queue_push_tail(q, "Fred");
    /* peek and pop return a gpointer, so cast it for the %s format */
    g_printf("First in line is %s\n", (char*)g_queue_peek_head(q));
    g_printf("Last in line is %s\n", (char*)g_queue_peek_tail(q));
    g_printf("The queue is %u people long\n", g_queue_get_length(q));
    g_printf("%s just bought a ticket\n", (char*)g_queue_pop_head(q));
    g_printf("Now %s is first in line\n", (char*)g_queue_peek_head(q));
    g_printf("Someone's cutting to the front of the line\n");
    g_queue_push_head(q, "Big Jim");
    g_printf("Now %s is first in line\n", (char*)g_queue_peek_head(q));
    g_queue_free(q);
    return 0;
}

***** Output *****

Is the queue empty? Yes, adding folks
First in line is Alice
Last in line is Fred
The queue is 3 people long
Alice just bought a ticket
Now Bob is first in line
Someone's cutting to the front of the line
Now Big Jim is first in line

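Most of the function names are self-explanatory, but there are a few finer points:

    * The push operations don't return anything (the pop operations return the item that was removed, not the queue), so to keep using the queue you need to hold on to the pointer returned by g_queue_new.
    * Both ends of the queue can be used for adding and removing. If you want to model someone at the back of the ticket line leaving to buy from another line, that's entirely doable.
    * There are non-destructive peek operations for inspecting the item at the head or the tail of the queue.
    * g_queue_free doesn't accept a function to help free each item, so you need to do that by hand; this is the same as with GSList.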

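Removing and inserting items

Although a queue is usually modified only by adding or removing items at its ends, GQueue lets you remove arbitrary items and insert items at arbitrary positions. Here's an example: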

//ex-gqueue-2.c
#include <glib.h>
#include <glib/gprintf.h>  /* declares g_printf */
int main(int argc, char** argv) {
    /* g_queue_index and g_queue_remove compare pointers, not string contents,
       so reuse the same pointers instead of relying on identical literals */
    char *bob = "Bob", *fred = "Fred";
    GQueue* q = g_queue_new();
    g_queue_push_tail(q, "Alice");
    g_queue_push_tail(q, bob);
    g_queue_push_tail(q, fred);
    g_printf("Queue is Alice, Bob, and Fred; removing Bob\n");
    int fred_pos = g_queue_index(q, fred);
    g_queue_remove(q, bob);
    g_printf("Fred moved from %d to %d\n", fred_pos, g_queue_index(q, fred));
    g_printf("Bill is cutting in line\n");
    GList* fred_ptr = g_queue_peek_tail_link(q);
    g_queue_insert_before(q, fred_ptr, "Bill");
    g_printf("Middle person is now %s\n", (char*)g_queue_peek_nth(q, 1));
    g_printf("%s is still at the end\n", (char*)g_queue_peek_tail(q));
    g_queue_free(q);
    return 0;
}

***** Output *****

Queue is Alice, Bob, and Fred; removing Bob
Fred moved from 2 to 1
Bill is cutting in line
Middle person is now Bill
Fred is still at the end

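There are several new functions here:

    * g_queue_index scans the queue for an item and returns its index; if it can't find the item, it returns -1. The comparison is by pointer, not by content.
    * To insert a new item into the middle of the queue, you need a pointer to the place where you want to insert it. As you can see, you can get one by calling one of the "peek link" functions: g_queue_peek_tail_link, g_queue_peek_head_link, and g_queue_peek_nth_link, each of which returns a GList. You can then insert an item before or after that GList.
    * g_queue_remove lets you remove an item from anywhere in the queue. To continue the ticket-line model, this means people can leave the line; they aren't stuck in it once they've joined.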

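Finding items

The previous examples showed how to get at an item when you have a pointer to its data or know its index. Like other GLib containers, though, GQueue also includes some find functions: g_queue_find and g_queue_find_custom: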

//ex-gqueue-3.c
#include <glib.h>
#include <glib/gprintf.h>  /* declares g_printf */
#include <string.h>        /* strcmp */
gint finder(gconstpointer a, gconstpointer b) {
    return strcmp(a, b);
}
int main(int argc, char** argv) {
    char* fred = "Fred";  /* g_queue_find compares pointers, so keep this one */
    GQueue* q = g_queue_new();
    g_queue_push_tail(q, "Alice");
    g_queue_push_tail(q, "Bob");
    g_queue_push_tail(q, fred);
    g_queue_push_tail(q, "Jim");
    GList* fred_link = g_queue_find(q, fred);
    g_printf("The fred node indeed contains %s\n", (char*)fred_link->data);
    GList* joe_link = g_queue_find(q, "Joe");
    g_printf("Finding 'Joe' yields a %s link\n", joe_link ? "good" : "null");
    GList* bob = g_queue_find_custom(q, "Bob", finder);
    g_printf("Custom finder found %s\n", (char*)bob->data);
    bob = g_queue_find_custom(q, "Bob", (GCompareFunc)g_ascii_strcasecmp);
    g_printf("g_ascii_strcasecmp also found %s\n", (char*)bob->data);
    g_queue_free(q);
    return 0;
}

***** Output *****

The fred node indeed contains Fred
Finding 'Joe' yields a null link
Custom finder found Bob
g_ascii_strcasecmp also found Bob

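Note that g_queue_find returns NULL if it can't find the item. Also note that, as in the example above, you can pass either a library function (such as g_ascii_strcasecmp) or a custom function (such as finder) as the GCompareFunc argument to g_queue_find_custom.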

Working with queues: copy, reverse, and foreach

Because GQueue is backed by GList, it supports some list-processing operations. Here's an example of how to use g_queue_copy, g_queue_reverse, and g_queue_foreach:

//ex-gqueue-4.c
#include <glib.h>
#include <glib/gprintf.h>  /* declares g_printf */
/* GFunc callback that prints one item; used instead of casting g_printf to GFunc */
void print_item(gpointer item, gpointer user_data) {
    g_printf("%s", (char*)item);
}
int main(int argc, char** argv) {
    GQueue* q = g_queue_new();
    g_queue_push_tail(q, "Alice ");
    g_queue_push_tail(q, "Bob ");
    g_queue_push_tail(q, "Fred ");
    g_printf("Starting out, the queue is: ");
    g_queue_foreach(q, print_item, NULL);
    g_queue_reverse(q);
    g_printf("\nAfter reversal, it's: ");
    g_queue_foreach(q, print_item, NULL);
    GQueue* new_q = g_queue_copy(q);
    g_queue_reverse(new_q);
    g_printf("\nNewly copied and re-reversed queue is: ");
    g_queue_foreach(new_q, print_item, NULL);
    g_queue_free(q);
    g_queue_free(new_q);
    return 0;
}

***** Output *****

Starting out, the queue is: Alice Bob Fred
After reversal, it's: Fred Bob Alice
Newly copied and re-reversed queue is: Alice Bob Fred

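g_queue_reverse and g_queue_foreach are straightforward; you've already seen them used with various other ordered collections. g_queue_copy needs a bit more care, though, because it copies the pointers, not the data. So when you free the data, be sure not to free it twice.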

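More fun with links

You've already seen a few examples of links; here are some handy link-removal functions. Don't forget that each item in a GQueue is actually a GList struct, with the data stored in its "data" member: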

//ex-gqueue-5.c
#include <glib.h>
#include <glib/gprintf.h>  /* declares g_printf */
/* GFunc callback that prints one item; used instead of casting g_printf to GFunc */
void print_item(gpointer item, gpointer user_data) {
    g_printf("%s", (char*)item);
}
int main(int argc, char** argv) {
    GQueue* q = g_queue_new();
    g_queue_push_tail(q, "Alice ");
    g_queue_push_tail(q, "Bob ");
    g_queue_push_tail(q, "Fred ");
    g_queue_push_tail(q, "Jim ");
    g_printf("Starting out, the queue is: ");
    g_queue_foreach(q, print_item, NULL);
    GList* fred_link = g_queue_peek_nth_link(q, 2);
    g_printf("\nThe link at index 2 contains %s\n", (char*)fred_link->data);
    g_queue_unlink(q, fred_link);   /* detaches the link but does not free it */
    g_list_free(fred_link);
    GList* jim_link = g_queue_peek_nth_link(q, 2);
    g_printf("Now index 2 contains %s\n", (char*)jim_link->data);
    g_queue_delete_link(q, jim_link);  /* detaches and frees the link */
    g_printf("Now the queue is: ");
    g_queue_foreach(q, print_item, NULL);
    g_queue_free(q);
    return 0;
}

***** Output *****

Starting out, the queue is: Alice Bob Fred Jim
The link at index 2 contains Fred
Now index 2 contains Jim
Now the queue is: Alice Bob

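Note that g_queue_unlink doesn't free the unlinked GList struct, so you need to do that yourself. And because it's a GList struct, you need to free it with g_list_free rather than a plain g_free. Of course, it's simpler to call g_queue_delete_link and let it free the memory for you.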

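Sorting

Sorting a queue may seem a bit unusual, but since various other linked-list operations are supported (such as insert and remove), this one is too. It can also come in handy if you happen to want to reorder a queue so that high-priority items move to the front. Here's an example: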

//ex-gqueue-6.c
#include <glib.h>
#include <glib/gprintf.h>  /* declares g_printf */
typedef struct {
    char* name;
    int priority;
} Task;
Task* make_task(char* name, int priority) {
    Task* t = g_new(Task, 1);
    t->name = name;
    t->priority = priority;
    return t;
}
/* GFunc callback: prints the name of one task */
void prt(gpointer item, gpointer user_data) {
    g_printf("%s   ", ((Task*)item)->name);
}
/* GCompareDataFunc: lower priority numbers sort first */
gint sorter(gconstpointer a, gconstpointer b, gpointer data) {
    return ((const Task*)a)->priority - ((const Task*)b)->priority;
}
int main(int argc, char** argv) {
    GQueue* q = g_queue_new();
    g_queue_push_tail(q, make_task("Reboot server", 2));
    g_queue_push_tail(q, make_task("Pull cable", 2));
    g_queue_push_tail(q, make_task("Nethack", 1));
    g_queue_push_tail(q, make_task("New monitor", 3));
    g_printf("Original queue: ");
    g_queue_foreach(q, prt, NULL);
    g_queue_sort(q, sorter, NULL);
    g_printf("\nSorted queue: ");
    g_queue_foreach(q, prt, NULL);
    g_queue_free_full(q, g_free);  /* also frees each Task (GLib 2.32+) */
    return 0;
}

***** Output *****

Original queue: Reboot server   Pull cable   Nethack   New monitor
Sorted queue: Nethack   Reboot server   Pull cable   New monitor

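Now you have a GQueue that models your workload, you can sort it from time to time, and you can be pleased to find that Nethack has been promoted to its rightful place at the very front of the queue!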

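Real-world usage

GQueue isn't used in Evolution, but GIMP and Gaim do use it.

GIMP:

    * gimp-2.2.4/app/core/gimpimage-contiguous-region.c uses a GQueue to store a series of coordinates in a utility function that finds contiguous segments. As long as the segments remain contiguous, new points are pushed onto the tail of the queue, then popped and checked in the next loop iteration.
    * gimp-2.2.4/app/vectors/gimpvectors-import.c uses a GQueue as part of a Scalable Vector Graphics (SVG) parser. It's used as a stack, with items pushed onto and popped off the head of the queue.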

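Gaim:

    * gaim-1.2.1/src/protocols/msn/switchboard.c uses a GQueue to keep track of outgoing messages. New messages are pushed onto the tail of the queue and popped off the head once they've been sent.
    * gaim-1.2.1/src/proxy.c uses a GQueue to keep track of DNS lookup requests. It uses the queue as a temporary holding area between the application code and the DNS child process.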

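Relations

Concepts

A GRelation is similar to a simple database table; it contains a series of records, or tuples, each of which contains several fields. Every tuple must have the same number of fields, and any field can be indexed to support lookups on that field.

As an example, you could use a series of tuples to hold names, with the first name in one field and the last name in the second. Both fields could be indexed so that fast lookups are possible by either first or last name.

GRelation has one drawback: each tuple can contain at most two fields. So you can't use it as an in-memory cache of a database table unless the table has very few columns. I searched the gtk-app-devel-list mailing list for notes on this and found that a patch extending it to four fields was discussed back in February 2000, but it never seems to have made it into a release.

GRelation seems to be a little-known struct; none of the open source applications studied in this tutorial currently use it. Browsing the Web turned up one open source e-mail client (Sylpheed-claws) that uses it for a variety of purposes, including keeping track of IMAP folders and message threads. Perhaps all it needs is a little publicity!

Basic operations

Here's an example that creates a new GRelation with two indexed fields, then inserts some records and runs some basic informational queries: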

//ex-grelation-1.c
#include <glib.h>
#include <glib/gprintf.h>  /* declares g_printf */
int main(int argc, char** argv) {
    GRelation* r = g_relation_new(2);
    /* index both fields before inserting any tuples */
    g_relation_index(r, 0, g_str_hash, g_str_equal);
    g_relation_index(r, 1, g_str_hash, g_str_equal);
    g_relation_insert(r, "Virginia", "Richmond");
    g_relation_insert(r, "New Jersey", "Trenton");
    g_relation_insert(r, "New York", "Albany");
    g_relation_insert(r, "Virginia", "Farmville");
    g_relation_insert(r, "Wisconsin", "Madison");
    g_relation_insert(r, "Virginia", "Keysville");
    gboolean found = g_relation_exists(r, "New York", "Albany");
    g_printf("New York %s found in the relation\n", found ? "was" : "was not");
    gint count = g_relation_count(r, "Virginia", 0);
    g_printf("Virginia appears in the relation %d times\n", count);
    g_relation_destroy(r);
    return 0;
}

***** Output *****

New York was found in the relation
Virginia appears in the relation 3 times

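Note that the indexes are added right after the call to g_relation_new and before any call to g_relation_insert. That's because other GRelation functions, such as g_relation_count, depend on an existing index and will fail at run time if it isn't there.

The code above includes a call to g_relation_exists to see whether "New York" is in the GRelation. This request matches every field in the relation exactly; g_relation_count, by contrast, can match on any single indexed field.

You've already met g_str_hash and g_str_equal in the earlier GHashTable section; here they're used for fast lookups on the indexed fields of the GRelation.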

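Selecting tuples

Once data is in a GRelation, you can get it back out with the g_relation_select function. The result is a pointer to a GTuples struct, which you can query further to get the actual data. Here's how to use it: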

//ex-grelation-2.c
#include <glib.h>
#include <glib/gprintf.h>  /* declares g_printf */
int main(int argc, char** argv) {
    GRelation* r = g_relation_new(2);
    g_relation_index(r, 0, g_str_hash, g_str_equal);
    g_relation_index(r, 1, g_str_hash, g_str_equal);
    g_relation_insert(r, "Virginia", "Richmond");
    g_relation_insert(r, "New Jersey", "Trenton");
    g_relation_insert(r, "New York", "Albany");
    g_relation_insert(r, "Virginia", "Farmville");
    g_relation_insert(r, "Wisconsin", "Madison");
    g_relation_insert(r, "Virginia", "Keysville");
    GTuples* t = g_relation_select(r, "Virginia", 0);
    g_printf("Some cities in Virginia:\n");
    guint i;  /* t->len is a guint */
    for (i = 0; i < t->len; i++) {
        g_printf("%u) %s\n", i, (char*)g_tuples_index(t, i, 1));
    }
    g_tuples_destroy(t);
    t = g_relation_select(r, "Vermont", 0);
    g_printf("Number of Vermont cities in the GRelation: %u\n", t->len);
    g_tuples_destroy(t);
    g_relation_destroy(r);
    return 0;
}

***** Output *****

Some cities in Virginia:
0) Farmville
1) Keysville
2) Richmond
Number of Vermont cities in the GRelation: 0

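A few notes on selecting and iterating over tuples:

    * The records in the GTuples struct returned by g_relation_select are in no particular order. To find out how many records were returned, use the len member of the GTuples struct.
    * g_tuples_index takes three arguments:
          o the GTuples struct
          o the index of the record you're querying
          o the index of the field you want to retrieve
    * Note that you need to call g_tuples_destroy to properly free the memory allocated during g_relation_select. This holds even when the GTuples object doesn't actually reference any records.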

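Summary

Wrap-up

This tutorial has looked at how to use the data structures in the GLib library. It has shown how these containers can be used to manage a program's data effectively, and how they're used in several popular open source projects. Along the way it has introduced a number of GLib types, macros, and string-handling functions.

GLib includes many other fine features: it has a threading-abstraction layer, a portable-sockets layer, message logging utilities, date and time functions, file utilities, random number generation, and much more. These modules are well worth exploring. And if you have the interest and the ability, you could even improve some of the documentation; for example, the documentation for the lexical scanner contains a comment saying that it needs some example code and further elaboration. If you benefit from open source code, please don't forget to help improve it!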